This document discusses Pulsar, an open-source real-time analytics platform used by eBay to collect, process, and analyze user event data at scale in real time. Pulsar enables eBay to gain insights from billions of daily events and terabytes of cross-platform data transfers to power uses like personalization, fraud detection, and marketing. The document outlines key customer demands on scalability, latency, processing, and availability that Pulsar addresses and lessons learned around connecting diverse data sources to provide a complete view of the customer experience.
Pulsar - Real Time Analytics at Scale - eMetrics SF 2016
1. Pulsar - Real Time Analytics At
Scale
Dror Engel
Product Lead - eBay
http://www.linkedin.com/in/drorengel
@drorengel
eMetrics Summit, SF, April 5, 2016
2. Global
Connected Commerce
$32B GMV VIA MOBILE (2015)
304M MOBILE DOWNLOADS GLOBALLY
1.4B LISTINGS CREATED VIA MOBILE
162M ACTIVE BUYERS
25M ACTIVE SELLERS
800M ACTIVE LISTINGS
9.2M MOBILE LISTINGS EVERY WEEK
3. Every 7 seconds
(U.S.A.)
Every 2 hours
(Korea)
Every 2 mins
(UK)
Global Commerce Velocity
4. COMMERCE IS AT AN INFLECTION
POINT
Offline / online lines are collapsing
Customer expectations are changing
6. TECHNOLOGY TRENDS
• Customer-centric continuous intelligence
• Faster analysis (Daily -> Hourly -> Minutes -> Seconds)
• Bigger data volume and processing
• Big data technologies shift from POC to production use cases
• More data points: IoT services; link data quickly and conveniently
• More data sources
• Fast data exploration capabilities - OLAP
7. CONNECTING WITH USER
BEHAVIOR DATA
User
Behavior
Data
Real-time
reporting
Business
activity
monitoring
Personalization
Advertising
Marketing
Fraud & Bot
Detection
8. ENABLING DATA INSIGHTS AT SCALE
Pulsar is an open-source (2015), real-time analytics platform that
includes stream processing, metrics store, and reporting
frameworks. Pulsar is used to collect and process user and business
events in real time, provide key insights using custom dashboards,
and enable systems to react to user activities within seconds.
9. Key Customer Demands
Millions of events
per second
SCALABILITY
< 1 second from
source to end-user
LATENCY
Enrichment from 1st- and
3rd-party data sources
Filtering (bots)
Grouping (customer
level)
Ordering
PROCESSING
99.99% Uptime
No Downtime
During Upgrades
Self Healing &
Distributed
AVAILABILITY
Integrations with
other sources &
channels
FLEXIBILITY
10. Pulsar Lessons Learned
Connecting data points is the present, not the future
By connecting new data points, you bring many new insights. Always seek to
add new dots to draw the full picture
A complete view of the customer journey is essential, but leveraging real-time signals
is the future
Real-time insights must be aggregated at the customer level to deliver actionable
insights
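As a rough illustration of what customer-level aggregation could look like, the sketch below filters bot traffic, orders events by time, and groups them per customer, mirroring the filtering, grouping, and ordering demands listed on slide 9. It is plain Python over an in-memory list, not Pulsar's API; the function name and the field names (customer_id, ts, type, is_bot) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def aggregate_by_customer(events):
    """Drop bot events, order the rest by timestamp, and group by customer.

    Field names (customer_id, ts, type, is_bot) are assumptions made for
    this sketch; they do not come from Pulsar.
    """
    per_customer = defaultdict(list)
    # Ordering: process events in timestamp order.
    for event in sorted(events, key=lambda e: e["ts"]):
        # Filtering: skip anything flagged as bot traffic.
        if event.get("is_bot"):
            continue
        # Grouping: collect event types under each customer.
        per_customer[event["customer_id"]].append(event["type"])
    return dict(per_customer)


events = [
    {"customer_id": "c1", "ts": 3, "type": "purchase"},
    {"customer_id": "c2", "ts": 2, "type": "view", "is_bot": True},
    {"customer_id": "c1", "ts": 1, "type": "view"},
]

print(aggregate_by_customer(events))  # {'c1': ['view', 'purchase']}
```

A real streaming system would do this incrementally over unbounded input rather than sorting a finished list; the batch form is used here only to keep the three steps visible.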