Foxtrot - Real Time Analytics
Santanu Sinha, Architect, Flipkart
• Why did we build Foxtrot?
• What features does Foxtrot support?
• How is it built?
User buys a product on Flipkart
The Buying Process (Checkout)
• Provide delivery details
• Fine-tune order components and delivery SLAs
• Select payment mode
• Will be redirected to bank site
• Make payment at bank
• Get redirected back to Flipkart site
• Order confirmed and confirmation page rendered
Need for real-time metrics
• Checkout process is divided into data-generation steps
• User interacts with multiple services at almost every step
• We need to make sure none of the services are failing or latent over a reasonable amount of time
• Be proactive about failures and gracefully degrade some parts while keeping the system up
• Some payment mode is failing, turn it off/
Sample step monitoring widget
Need for real-time rich events
• Multiple interactions across different sites during checkout
• Many places and reasons where things can go wrong
• What do we tell a customer who calls customer care immediately afterwards?
• Need a system where we can query by order ID/customer ID
• Get all steps taken by the user
• Get all service calls in every step
• Check reconciliation status for the order
Other similar cases
• Search indexing
  • At what rate are we indexing the signals?
  • When was the last price update for a listing indexed?
• Notification
  • Is everything fine in the notification pipe?
  • When was the last notification sent to a device?
• …
Requirements
• Has to work with raw events
• Must provide both metrics-type aggregations and search
• Aggregations and queries needed for a few weeks
• Must have interactive response times for queries
• Access to all events over a longer period
• Access to long-term events is very ad hoc and doesn't share common characteristics
Requirements (contd)
• Ingestion needs to be fast, scalable and easy to integrate with
• Has to support multi-tenancy as an intrinsic feature
• Clients shouldn't have to build consoles for regular monitoring
• Must be easy for analysts to work with
• Must support downloads of data
• If required, clients should be able to easily build consoles using APIs
FOXTROT
Introducing Foxtrot
• Data store for real-time event data
• Works on raw events with multiple attributes
• Searchable on all fields and dimensions
• Built-in support for SQL-like queries to summarize and view data
• Built-in console builder to build and share monitoring consoles
• JSON query language
• Based on REST APIs for easy client integration
• Dynamic node discovery for writing smart clients
Constructs
• Table
  • A logical namespace for events coming from an app
• Document
  • An event coming into Foxtrot
  • Consists of:
    • ID - A unique ID for the event
    • Timestamp - Epoch timestamp in milliseconds
    • Data - The fields for an event
Table
{
  "name": "test",
  "ttl": 7
}
• Name - Logical name of the table
• TTL - Number of days table data can be queried from front-end
Document
{
  "id": "569b3782-255b-48d7-a53f-7897d605db0b",
  "timestamp": 1401650819000,
  "data": {
    "event": "APP_LOAD",
    "os": "android",
    "device": "XperiaZ",
    "appVersion": {
      "major": "1",
      "minor": "2"
    }
  }
}
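As a rough illustration of the document structure above, a client might wrap its event fields like this before sending them. This is only a sketch: the helper name make_document is hypothetical, and using a random UUID for the ID is an assumption based on the sample document, not a stated requirement.

```python
import json
import time
import uuid


def make_document(data, timestamp_ms=None):
    """Wrap event fields in the id/timestamp/data envelope.

    make_document is a hypothetical helper; a random UUID is assumed
    to be an acceptable unique ID.
    """
    if timestamp_ms is None:
        # Epoch timestamp in milliseconds, as the Document construct requires.
        timestamp_ms = int(time.time() * 1000)
    return {
        "id": str(uuid.uuid4()),
        "timestamp": timestamp_ms,
        "data": data,
    }


doc = make_document(
    {
        "event": "APP_LOAD",
        "os": "android",
        "device": "XperiaZ",
        "appVersion": {"major": "1", "minor": "2"},
    },
    timestamp_ms=1401650819000,
)
print(json.dumps(doc, indent=2))
```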
Console
FQL
select * from test
where eventType in ('APP_LOAD', 'APP_CRASH') and last('1h')
group by eventType, os
+-----------+---------+---------+
| eventType | os      | count   |
+-----------+---------+---------+
| APP_CRASH | android | 38330   |
| APP_CRASH | ios     | 2888    |
| APP_LOAD  | android | 2749803 |
| APP_LOAD  | ios     | 35380   |
+-----------+---------+---------+
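To make the semantics of the FQL query above concrete, here is a small in-memory sketch of the same filter-then-group-and-count logic. It is not how Foxtrot evaluates queries; the function group_count, the sample events and the APP_EXIT event type are all made up for illustration.

```python
from collections import Counter


def group_count(events, event_types, since_ms, fields=("eventType", "os")):
    """Count events per combination of `fields`, after filtering by
    event type and by a lower bound on the timestamp (the last('1h') part)."""
    counts = Counter()
    for event in events:
        if event["eventType"] in event_types and event["timestamp"] >= since_ms:
            counts[tuple(event[f] for f in fields)] += 1
    return counts


now_ms = 1401650819000
events = [  # hypothetical sample data
    {"eventType": "APP_LOAD", "os": "android", "timestamp": now_ms - 1_000},
    {"eventType": "APP_LOAD", "os": "android", "timestamp": now_ms - 2_000},
    {"eventType": "APP_CRASH", "os": "ios", "timestamp": now_ms - 3_000},
    {"eventType": "APP_EXIT", "os": "ios", "timestamp": now_ms - 4_000},      # dropped: event type
    {"eventType": "APP_LOAD", "os": "ios", "timestamp": now_ms - 7_200_000},  # dropped: older than 1h
]

counts = group_count(events, {"APP_LOAD", "APP_CRASH"}, since_ms=now_ms - 3_600_000)
for (event_type, os_name), count in sorted(counts.items()):
    print(event_type, os_name, count)
```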
JSON Query
{
  "opcode": "group",
  "table": "test",
  "filters": [
    {
      "field": "event",
      "operator": "in",
      "value": "APP_LOAD"
    },
    {
      "operator": "last",
      "duration": "1h"
    }
  ],
  "nesting": ["os", "version"]
}

{
  "opcode": "group",
  "result": {
    "android": {
      "3.2": 2019,
      "4.3": 299430,
      "4.4": 100000032023
    },
    "ios": {
      "6": 12281,
      "7": 23383773637
    }
  }
}
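The group result above is nested one level per entry in "nesting". A client that wants table-like rows could flatten it roughly as follows. This is a sketch that assumes the response shape shown on the slide; the function name flatten is hypothetical.

```python
def flatten(result, path=()):
    """Yield (key_1, ..., key_n, count) rows from a nested group result."""
    for key, value in result.items():
        if isinstance(value, dict):
            # Descend one nesting level, remembering the keys seen so far.
            yield from flatten(value, path + (key,))
        else:
            yield path + (key, value)


response = {
    "opcode": "group",
    "result": {
        "android": {"3.2": 2019, "4.3": 299430},
        "ios": {"6": 12281},
    },
}

rows = list(flatten(response["result"]))
for row in rows:
    print(row)
```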
APIs
• Everything done through REST APIs
• Ingestion:
  • POST to: /foxtrot/v1/document/bulk
• Console:
  • GET: /
• FQL:
  • POST: /foxtrot/v1/fql
• JSON Analytics:
  • POST to: /foxtrot/v1/analytics
• Other admin and discovery APIs available
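For example, a JSON query could be posted to the analytics endpoint listed above along these lines. This is a sketch using only the Python standard library; the base URL (host and port) and the Content-Type header are assumptions, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:17000"  # hypothetical host and port


def analytics_request(query, base_url=BASE_URL):
    """Build (but do not send) a POST request for the JSON analytics API."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/foxtrot/v1/analytics",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


query = {
    "opcode": "group",
    "table": "test",
    "filters": [{"operator": "last", "duration": "1h"}],
    "nesting": ["os", "version"],
}
req = analytics_request(query)
print(req.get_method(), req.full_url)
# To actually send it against a running node:
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```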
Architecture
[Architecture diagram; components shown: Ingestion API, Analytics Engine, FQL, JSON, Console, Caching, Query Store, Long Term Data Store]
Long Term Store
• Apache HBase
• Known for scalability
• Native MapReduce support
• Supports compression over column families
• Extensively used at Flipkart
• Used primarily as a key-value store
• Never accessed directly
• Used for event downloads and select * queries
• Currently stores about 9 TB of compressed data
Query Store
• Elasticsearch
• Very rich interface over Apache Lucene
• Provides both search and aggregation
• Built for scale
• Rich community and active development
• Indexes divided by table and day
• Optimized template-based index mapping used
• Does not store full events as source, only fields
• Stored data kept for the table's configured TTL
Cache
• Hazelcast
• Simple, easy-to-integrate model
• Works well in embedded mode
• Provides a distributed map that can be used as a cache
• Provides distributed executors that can be used for analytics
• Used as cache for query results
• Used as metadata store for tables
• Used for discovery of Foxtrot nodes by clients
• JSON queries analyzed to generate unique keys
• Same query generates a different key every 30 seconds
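One way to get a key that is stable for identical queries but changes every 30 seconds is to hash a canonical form of the query together with a 30-second time bucket. The sketch below illustrates that idea only; the canonicalization, the SHA-256 hash and the function name cache_key are assumptions, not Foxtrot's actual key analysis.

```python
import hashlib
import json
import time

BUCKET_SECONDS = 30  # the key rotates every 30 seconds


def cache_key(query, now=None):
    """Derive a cache key from the query and the current 30-second bucket."""
    if now is None:
        now = time.time()
    bucket = int(now // BUCKET_SECONDS)
    # Sorting keys makes logically identical queries serialize identically.
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{canonical}|{bucket}".encode("utf-8")).hexdigest()


q1 = {"opcode": "group", "table": "test", "nesting": ["os", "version"]}
q2 = {"table": "test", "nesting": ["os", "version"], "opcode": "group"}

print(cache_key(q1, now=60) == cache_key(q2, now=89))  # same bucket, same query
print(cache_key(q1, now=60) == cache_key(q1, now=90))  # next bucket, new key
```

Rotating the key bounds how stale a cached result can be without needing explicit invalidation.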
Ingestion
• Document saved to the long-term store and the query store
• Row key of HBase becomes the document ID in Elasticsearch
• API returns success only if the save succeeds in both stores
• Always use bulk if possible
• Clients use the node discovery API to call Foxtrot nodes directly
• Client can use a disk-based or in-memory queue
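The dual-write rule above (success only if both stores accept the document) can be sketched with two stand-in stores. Everything here is illustrative: InMemoryStore and ingest are hypothetical names, the document ID is used directly as the row key for simplicity, and the sketch makes no attempt to roll back a partial write.

```python
class InMemoryStore:
    """Stand-in for HBase or Elasticsearch; can be told to fail."""

    def __init__(self, fail=False):
        self.rows = {}
        self.fail = fail

    def put(self, key, value):
        if self.fail:
            raise IOError("store unavailable")
        self.rows[key] = value


def ingest(doc, long_term, query_store):
    """Return True only if the document was saved to both stores."""
    row_key = doc["id"]  # simplification: document ID doubles as the row key
    try:
        long_term.put(row_key, doc)  # full event goes to the long-term store
        # The query store indexes fields only, under the same key.
        query_store.put(row_key, {"timestamp": doc["timestamp"], **doc["data"]})
    except IOError:
        return False
    return True


doc = {"id": "doc-1", "timestamp": 1401650819000, "data": {"event": "APP_LOAD"}}
ok = ingest(doc, InMemoryStore(), InMemoryStore())
failed = ingest(doc, InMemoryStore(), InMemoryStore(fail=True))
print(ok, failed)
```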
Analytics/Queries
• Simple JSON-based DSL
• FQL is translated to JSON
• An analysis runs to generate the cache key
• Cache key changes every 30 seconds for the same query
• Results returned if found in cache
• A basic analysis runs on the query to figure out its time range
• Indexes are selected and queries forwarded to the query store
• The query store might use the key-value store and Hazelcast distributed executors to return results
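Since indexes are divided by table and day, selecting indexes for a query amounts to listing the days its time range touches. A minimal sketch follows; the "table-YYYY-MM-DD" naming scheme and the function name indexes_for_range are assumptions made for illustration, as is the use of UTC day boundaries.

```python
from datetime import datetime, timedelta, timezone


def indexes_for_range(table, start_ms, end_ms):
    """List hypothetical per-table, per-day index names covering the range."""
    start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc).date()
    end = datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc).date()
    names = []
    day = start
    while day <= end:
        names.append(f"{table}-{day.isoformat()}")  # assumed naming scheme
        day += timedelta(days=1)
    return names


end_ms = 1401650819000              # 2014-06-01 19:26:59 UTC
one_hour = indexes_for_range("test", end_ms - 3_600_000, end_ms)
one_day = indexes_for_range("test", end_ms - 86_400_000, end_ms)
print(one_hour)
print(one_day)
```

A query over the last hour usually touches a single daily index, which helps keep short-range queries interactive.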
Console
• Built out of customizable widgets
• Consoles can be saved and shared
• Supports filtering in configuration
• Supports ad hoc, on-the-fly filtering
• Uses Flot for lightweight charts
• Easily extensible design
• Embedded in the Foxtrot jar
Foxtrot at Flipkart
• Many systems hooked into Foxtrot
• Used by devs to monitor system health
• Devs build and share consoles on Foxtrot
• Custom consoles built easily using JSON queries
• Used by analysts to run ad hoc queries
• Provides analysis and monitoring over billions of events across scores of tables
• Active development with more than 400 commits
• Being used and evaluated at other companies
Show me the Code!!
• Released under the Apache 2 License at: https://github.com/Flipkart/foxtrot
• Java smart client available at: https://github.com/flipkart-incubator/foxtrot-client
• Please refer to the wiki on GitHub for:
  • Introduction
  • Installation and configuration
  • Usage
• Please use GitHub issues to report bugs and ask for new features