DISCOVER REAL TIME
KNOWLEDGE CLUSTERS
DevsNearMe
Priorities
• Separate signal from noise
• Can we at least predict better than others?
• Ideally, a probability distribution model that gives us best-case, expected-value, and worst-case behavior
Model/Algorithm
• Prediction based on:
  • Absolute number + Prediction (context-based information + learning from the user's past behavior)
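
As a minimal sketch of the formula above, the three parts can be combined by simple addition. The function and parameter names are illustrative assumptions, and reading the "+" signs as literal addition is one plausible interpretation of the slide, not a confirmed implementation.

```python
def predict_attendance(absolute_count, context_prediction, user_prediction):
    """Combine the three signals named on the slide by plain addition.

    absolute_count: observed, unweighted counts (e.g. check-ins).
    context_prediction: estimate from context-weighted RSVP counts.
    user_prediction: estimate learned from users' past behavior.
    All three names are hypothetical.
    """
    return absolute_count + (context_prediction + user_prediction)


# Hypothetical inputs, for illustration only.
total = predict_attendance(40, 72.0, 15.5)
```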
Absolute numbers
• Foursquare check-ins are fairly reliable, as are MTA and TSA swipes
• These counts are added directly to the prediction. No weighting is applied currently, but this may be modified if a trend of fake data emerges
Context-based prediction
• Use decision tree learning to generate weights to apply to event RSVP counts from Eventbrite, Meetup, and Facebook
• E.g., a Meetup event RSVP gets a higher weight if it is a paid event, has free giveaways, and the weather is nice
• Weights are in the range 0-1. We multiply each source's RSVP count by its weight, sum the results, and divide by 3 (the number of sources) to get the weighted average RSVP count
• Events of a similar nature within a nearby radius will lower the potential attendance
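
The weighted-average step can be sketched as follows. The weights here are hand-picked, hypothetical values standing in for whatever a trained decision tree would produce, and the function name and dictionary layout are assumptions made for illustration.

```python
SOURCES = ("eventbrite", "meetup", "facebook")


def weighted_rsvp_count(rsvps, weights):
    """Multiply each source's RSVP count by its weight, then average.

    rsvps and weights are dicts keyed by source name. Every weight
    must lie in the range 0-1, as stated on the slide.
    """
    for source in SOURCES:
        if not 0.0 <= weights[source] <= 1.0:
            raise ValueError(f"weight for {source} must be in [0, 1]")
    weighted_sum = sum(rsvps[s] * weights[s] for s in SOURCES)
    return weighted_sum / len(SOURCES)  # divide by 3, the number of sources


# Hypothetical counts and weights, for illustration only.
rsvps = {"eventbrite": 120, "meetup": 80, "facebook": 300}
weights = {"eventbrite": 0.9, "meetup": 0.6, "facebook": 0.2}
estimate = weighted_rsvp_count(rsvps, weights)  # (108 + 48 + 60) / 3
```

Because every weight is at most 1, this average can never exceed the plain mean of the three raw RSVP counts.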
User Learning based prediction
• A person's likelihood of attending an event can be modeled in a Bayesian manner
• Evidence includes the past attendance/RSVP ratio and any history of attending a series of events of a particular nature
• Item-based classification is another factor: e.g., if persons a, b, c, and d attended events X and Y, and we know that b, c, and d are attending event Z, there is a higher chance that a will attend event Z
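
One way to make these two ideas concrete is sketched below. It assumes a Beta prior over a person's attendance rate (a Beta-Binomial model) and a simple co-attendance fraction for the peer effect. Both choices, along with all function names and the prior parameters, are illustrative assumptions; the slides do not specify a model.

```python
def attendance_probability(attended, rsvped, prior_a=1.0, prior_b=1.0):
    """Posterior mean attendance rate under a Beta(prior_a, prior_b) prior.

    attended: past events the person RSVPed to and actually attended.
    rsvped: past events the person RSVPed to.
    """
    if attended > rsvped:
        raise ValueError("attended cannot exceed rsvped")
    return (attended + prior_a) / (rsvped + prior_a + prior_b)


def co_attendance_score(person, event, history):
    """Fraction of the person's past co-attendees who are going to `event`.

    history maps an event name to the set of people attending it.
    """
    peers = set()
    for name, attendees in history.items():
        if name != event and person in attendees:
            peers |= attendees
    peers.discard(person)
    if not peers:
        return 0.0
    return len(peers & history.get(event, set())) / len(peers)


# The a, b, c, d / X, Y, Z example from the slide.
history = {
    "X": {"a", "b", "c", "d"},
    "Y": {"a", "b", "c", "d"},
    "Z": {"b", "c", "d"},
}
peer_score = co_attendance_score("a", "Z", history)
rate = attendance_probability(attended=3, rsvped=4)
```

With a uniform prior, a person with no history gets a rate of 0.5, and the estimate moves toward the observed attendance/RSVP ratio as evidence accumulates.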
Age based classification
• Sharing peaks in the teenage years and early adulthood, then declines
• The influence of social data needs to be inversely weighted to infer the total count of people at an event
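
A minimal sketch of inverse weighting, similar in spirit to inverse-probability weighting: each observed sharer is scaled up by the reciprocal of their age group's sharing rate. The age groups and rates below are hypothetical placeholders, not figures from the chart data.

```python
# Hypothetical share rates: the assumed fraction of attendees in each
# age group who post about an event on social media.
SHARE_RATE = {"teen": 0.5, "young_adult": 0.4, "adult": 0.2, "senior": 0.1}


def estimate_headcount(sharers_by_age, share_rate=SHARE_RATE):
    """Scale each group's observed sharers by 1 / share rate.

    Groups that share less are weighted more heavily, so a few posts
    from a low-sharing group imply many attendees.
    """
    return sum(count / share_rate[group]
               for group, count in sharers_by_age.items())


# 10 teen sharers and 2 senior sharers each imply about 20 attendees.
headcount = estimate_headcount({"teen": 10, "senior": 2})
```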
Gender based classification
• Social-sharing signals can also be weighted by gender for better classification
Miscellaneous
• Chart data sources: AppData, Beevolve, Quora, Statista
