Reelbid
Real-time ad bidding DSP (programmatic)
MUDIT UPPAL
INSIGHT DATA ENGINEERING, SV
What is Real-Time Bidding (RTB)?
Image source:
Demand side platform - Reelbid
Mashable.com (publisher with inventory)
Sell Side Platform
OpenRTB server
Auction service
Ad exchange
OpenRTB client
Bidder service
Data management platform
Data broker
Some advertiser
Motivation
- Really interesting and new* data engineering challenges
Current state of the art
- Google's open-bidder is still in alpha (not public)
- In 2014, according to Business Insider Intelligence, ad revenue topped $15 billion. Real-time bidding, in particular mobile and video real-time bidding, led the way for that growth. Business Insider Intelligence estimates that RTB revenue will pass $26 billion by the end of 2020, far surpassing the $11.7 billion from this year.
(*Disclaimer: no background in ad tech)
Latency (100 ms)
Queries per second (~2-3 million bids/second) + infrequent bids
Some Sample
AdExchange
Reelbid
1. Hey, do you want to bid on this user's impression?
2. Yes! Here's my bid:
100 ms
"id" : "32a69c6ba388f110487f9d1e63f77b22d86e916b",
"banner": {
"h": 250,
"w": 300,
"name": "mashable.com",
"domain": "http://www.example.com",
..
"page": "http://easy.example.com/easy?cu=13824;cre=mu;target=_blank",
"ref" : "http://refer+url",
"publisher": {
"id": "qqwer1234xgfd",
"name": "site_name",
"domain": "my.site.com"
"device": {
"ua": "Mozilla/5.0(KHTML, like Gecko) Version/5.1.7 Safari/534.57.2",
"ip": "192.168.5.5",
"geo": {
"lat": 37.789,
"lon": -122.394,
"country": "USA",
"city": "San Francisco",
"region": "CA",
"zip" : "94105",
"type": 2
}
"user": {
"buyeruid": "89776897686798fwe87rtryt8976fsd7869678",
"id": "55816b39711f9b5acf3b90e313ed29e51665623f",
"gender": "M",
"yob": 1975,
"customdata": "Data-asdfdwerewr",
"data": [{
"id": "pub-demographics",
"name": "data_name",
"segment": [{
"id" : "345qw245wfrtgwertrt56765wert",
"name" : "segment_name",
"value": "segment_value"
}
{
"id": "eb85349d-03c3-44f4-a77b-824f7221d116",
"seatbid": [{
"bid": [{
"id": "bid1",
"impid": "eb85349d-03c3-44f4-a77b-824f7221d116",
"price": 0.1,
"adm": "<div>Ad Creative</div>",
"adomain": [
"http://www.example.com/clickthrough"
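The exchange shown in the sample above can be sketched in a few lines of Python. This is only an illustration, not Reelbid's actual bidder: the function name `build_bid_response` and its parameters are hypothetical, and the request is assumed to carry an `imp` array (elided in the sample) as in OpenRTB 2.x, where the response `id` echoes the request `id` and each bid's `impid` echoes the impression it bids on.

```python
import json


def build_bid_response(bid_request, price, creative_html, advertiser_domain):
    """Build a minimal OpenRTB-style bid response for the first impression.

    Assumption: bid_request contains an "imp" list whose entries have an "id".
    """
    impression = bid_request["imp"][0]
    return {
        # The response id echoes the id of the bid request.
        "id": bid_request["id"],
        "seatbid": [{
            "bid": [{
                "id": "bid1",                 # bidder-generated bid id
                "impid": impression["id"],    # impression being bid on
                "price": price,               # bid price (CPM)
                "adm": creative_html,         # ad markup returned on a win
                "adomain": [advertiser_domain],
            }]
        }],
    }


if __name__ == "__main__":
    # Hypothetical, heavily trimmed request; real requests carry far more fields.
    request = {"id": "req-1", "imp": [{"id": "imp-1"}]}
    response = build_bid_response(
        request, 0.1, "<div>Ad Creative</div>", "example.com"
    )
    print(json.dumps(response, indent=2))
```

Everything in this round trip, including parsing, the bid decision, and serialization, has to fit inside the 100 ms window.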
Goals
- Create a more efficient bidding framework using sampling and hierarchical techniques (as described in the paper)
- Find ways to scale without adding nodes, and so save costs
Demo
A different approach to RTB
- A paper published in IEEE for Data Mining (source of truth)
- Instead of comparing bid requests with the user profile <U,P>, we look at bid requests only.
- Trade computational resources off against price (do not accept and bid on everything)
- ROI with Gaussian/Gamma models is much better
Architecture 1.0
Architecture 2.0
How do we handle 2-3 million QPS?
(efficient scaling)
Sampling!
Ziggurat algorithm for sampling random values within milliseconds (also the most memory efficient)
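The Ziggurat algorithm is an accelerated form of rejection sampling. As a rough illustration of the underlying idea only, the sketch below uses plain rejection sampling, not Ziggurat itself: draw a candidate x and a uniform y, and accept x when y < f(x). The function names, the truncation bound of 6, and the choice of a standard normal target are assumptions made for the example.

```python
import math
import random


def unnormalized_normal_pdf(x):
    """Standard normal density without its constant factor; maximum is 1 at x = 0."""
    return math.exp(-0.5 * x * x)


def sample_standard_normal(rng, bound=6.0):
    """Plain rejection sampling on [-bound, bound].

    Draw x uniformly from the interval and y uniformly from [0, 1),
    then accept x when y < f(x). Tails beyond `bound` are ignored,
    which is a negligible truncation for bound = 6.
    """
    while True:
        x = rng.uniform(-bound, bound)
        y = rng.random()
        if y < unnormalized_normal_pdf(x):
            return x


if __name__ == "__main__":
    rng = random.Random(42)
    draws = [sample_standard_normal(rng) for _ in range(10_000)]
    mean = sum(draws) / len(draws)
    print(f"sample mean is close to 0: {mean:.3f}")
```

Plain rejection sampling discards most candidates here. Ziggurat avoids that waste by covering the density with precomputed horizontal layers, so most draws are accepted without evaluating f(x) at all.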
How do we decide the bid price?
Y < f(x)
Cost(Budget) Resources
1. Looking at overall compute resources
2. User input filtering
Algorithm
Request selection based on utilization
Selection policy based on reward models
Filter 1
Filter 2
Filter 3
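The slide does not say what the three filters are, so the sketch below is only one plausible reading of a request-selection pipeline. All names (`BidRequest`, `utilization_filter`, `reward_filter`, `budget_filter`, `should_bid`) and the specific rules are assumptions for illustration: requests are sampled more aggressively as server utilization rises, then screened by a hypothetical reward estimate and by remaining budget.

```python
import random
from dataclasses import dataclass


@dataclass
class BidRequest:
    request_id: str
    expected_reward: float  # hypothetical output of a reward model


def utilization_filter(utilization, rng):
    """Accept with probability (1 - utilization): a busy server samples fewer requests."""
    return rng.random() < max(0.0, 1.0 - utilization)


def reward_filter(request, min_reward):
    """Keep only requests whose estimated reward clears a threshold."""
    return request.expected_reward >= min_reward


def budget_filter(price, remaining_budget):
    """Never bid more than the remaining budget."""
    return price <= remaining_budget


def should_bid(request, utilization, min_reward, price, remaining_budget, rng):
    """Apply the three filters in order; short-circuits on the first rejection."""
    return (
        utilization_filter(utilization, rng)
        and reward_filter(request, min_reward)
        and budget_filter(price, remaining_budget)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    request = BidRequest("req-1", expected_reward=0.8)
    decision = should_bid(request, utilization=0.5, min_reward=0.5,
                          price=0.1, remaining_budget=10.0, rng=rng)
    print("bid" if decision else "skip")
```

Putting the cheapest check first means most dropped requests cost almost nothing, which is the point of sampling rather than evaluating every request.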
Collecting server system metrics
- Graphite
- Carbon | StatsD | Collectd
- Redis
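As a small illustration of how a bidder process might report metrics into this stack, the sketch below formats and sends StatsD lines over UDP. StatsD's plain-text line format is `name:value|type`, with `c` for counters, `g` for gauges and `ms` for timers. The metric names, host and helper function names are hypothetical; 8125 is the conventional StatsD port.

```python
import socket


def format_statsd_metric(name, value, metric_type):
    """Return a StatsD line such as 'bids.sent:1|c'."""
    return f"{name}:{value}|{metric_type}"


def send_statsd_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; StatsD does not acknowledge packets."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names for a bidder process.
    send_statsd_metric(format_statsd_metric("reelbid.bids.sent", 1, "c"))
    send_statsd_metric(format_statsd_metric("reelbid.bid.latency", 42, "ms"))
    send_statsd_metric(format_statsd_metric("reelbid.cpu.utilization", 0.73, "g"))
```

UDP keeps metric reporting off the critical path: a slow or missing metrics server cannot delay a bid.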
Cost + ROI
Testing
- Event-sim
- Parallec.io (eBay) [8K async HTTP requests per second]
- Smaato AdX
Challenges
- To process that file in 80 ms
- Throttling issues
- Networking challenges
- Programming/debugging/testing in a distributed environment is a very time-consuming task
About me
Surf, Football and micro-controllers [ Building products ]
Data Scientist, Planetary/Fusion Network Ltd (NY)
MS Comp Science, Media, Business Mgmt - University of California, Santa Barbara
MUDIT UPPAL
Learning at Insight: Learning by doing
Before Insight: Batch processing, Data Science Analytics
After Insight: Stream processing [Spark Streaming, Storm, Kafka, Cassandra] + learning from feedback, the right tool for the right job, more debugging/testing in distributed computing with a polyglot approach + thinking asynchronously
