- The document presents an approach called HypTrails that uses Bayesian inference to compare hypotheses about mechanisms that produce human trails on the web.
- HypTrails expresses hypotheses as priors in a Markov chain model and compares the marginal likelihood of the data under different hypotheses to obtain a partial ordering of hypothesis plausibility.
- It demonstrates applying HypTrails to compare hypotheses on human navigation trails using data from Wikigame, human song trails from Last.fm, and human review trails from Yelp.
HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web
1. GESIS - Leibniz Institute for the Social Sciences
Philipp Singer, Denis Helic, Andreas Hotho
and Markus Strohmaier
2. Vannevar Bush
27.05.2015 HypTrails - Philipp Singer 2
image courtesy of brucesterling on Flickr
Bush, V. (1945). As we may think. The Atlantic Monthly, 176(1):101-108.
[The human brain] operates by association.
With one item in its grasp, it snaps instantly to the
next that is suggested by the association of thoughts.
3. Human trails on the Web
image courtesy of user Mmxx on Wikipedia
What are the mechanisms
producing human trails on
the Web?
5. Example: Human navigational trails
Humans prefer to navigate
H1: over semantically similar websites
H2: via self-loops (e.g., refreshing)
H3: by using the structural link network
H4: by preferring similar categories
H5: by utilizing structural properties
H6: by information scent
[West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]
What is the relative
plausibility of these
hypotheses given data?
7. HypTrails in a nutshell
Goal: Express and compare hypotheses about human trails
in a coherent research approach
First-order Markov chain model
Bayesian inference
Incorporate hypotheses as priors
Utilize sensitivity of marginal likelihood on the prior
Outcome: Partial ordering of hypotheses
8. Markov chain model
Stochastic model
Transition probabilities between states
[Figure: state diagram of a Markov chain; the transitions to S2 and S3 each have probability 1/2]
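To make the Markov chain model concrete, the following minimal sketch counts transitions in a set of trails and normalizes them into transition probabilities. The toy trails, the state name S1, and the function names are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict


def transition_counts(trails):
    """Count observed transitions between consecutive states in each trail."""
    counts = defaultdict(lambda: defaultdict(int))
    for trail in trails:
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1
    return counts


def transition_probabilities(counts):
    """Normalize each row of counts into maximum likelihood transition probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical toy trails over three states (made up for illustration).
trails = [["S1", "S2", "S1", "S3"], ["S1", "S3", "S1", "S2"]]
counts = transition_counts(trails)
probs = transition_probabilities(counts)
```

In this toy data, S1 is followed by S2 twice and by S3 twice, so both transition probabilities out of S1 come out as 1/2.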
16. Bayesian model comparison:
Marginal likelihood
Probability of data given hypothesis
= Model evidence
Parameters are marginalized out
Probability of observing data given parameters and hypothesis
Probability of parameters before observing data
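The formula that these labels annotate did not survive in the text. As a reminder, the standard definition of the marginal likelihood can be written as follows, where the symbols D (data), H (hypothesis), and θ (Markov chain parameters) are notation assumed here rather than taken from the slides:

```latex
P(D \mid H) = \int P(D \mid \theta, H)\, P(\theta \mid H)\, d\theta
```

The left-hand side is the model evidence, the first factor under the integral is the probability of the data given parameters and hypothesis, and the second factor is the prior over the parameters.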
20. Structure of HypTrails
MC Model
Belief in parameters
Prior (H1)
Data (Trails)
likelihood (H1)
21. How to elicit priors from hypotheses?
24. (Trial) roulette method
Prior distribution
Eliciting priors
25. Conjugate Dirichlet prior
Hyperparameters act as pseudo counts
[Equation labels: MC parameters; Dirichlet hyperparameters]
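Because the Dirichlet prior is conjugate to the Markov chain likelihood, the marginal likelihood has a closed form in terms of gamma functions. The sketch below computes it in log space for a transition count matrix and a matrix of hyperparameters. The function name and the toy inputs are assumptions for illustration; this is not the authors' implementation.

```python
from math import exp, lgamma


def log_marginal_likelihood(counts, alphas):
    """Log evidence of a first-order Markov chain with one Dirichlet prior per row.

    counts[i][j]: number of observed transitions from state i to state j.
    alphas[i][j]: Dirichlet hyperparameters (pseudo counts), all > 0.
    """
    total = 0.0
    for n_row, a_row in zip(counts, alphas):
        # Normalizing constant of the prior for this row.
        total += lgamma(sum(a_row)) - sum(lgamma(a) for a in a_row)
        # Normalizing constant of the posterior for this row.
        total += sum(lgamma(n + a) for n, a in zip(n_row, a_row))
        total -= lgamma(sum(n_row) + sum(a_row))
    return total


# Toy check: two states, uniform prior, a single observed transition 0 -> 0.
log_ev = log_marginal_likelihood([[1, 0], [0, 0]], [[1, 1], [1, 1]])
evidence = exp(log_ev)
```

With a uniform prior over two possible next states, a single observed transition has evidence 1/2, which is what the toy check evaluates to. Working in log space avoids overflow for realistic trail counts.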
26. Eliciting priors from hypotheses
about human trails
Adaptation of (trial) roulette method
#Chips = β
Strength of hypothesis
β = 18
Dirichlet hyperparameters
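One way to picture the adapted roulette method is as distributing β chips over the possible transitions in proportion to how strongly a hypothesis believes in each of them. The sketch below makes two assumptions that the slides do not state: leftover chips are handed out by largest fractional remainder, and every transition starts from a uniform pseudo count of 1. The function name and belief values are hypothetical.

```python
def elicit_dirichlet_row(beliefs, chips):
    """Turn hypothesis beliefs for one state into Dirichlet hyperparameters.

    beliefs: non-negative belief weights, one per possible next state.
    chips:   number of chips (beta) expressing the strength of the hypothesis.
    Returns 1 (assumed uniform base) plus the chips assigned to each transition.
    """
    total = sum(beliefs)
    if total == 0:
        # No belief expressed for this state: keep the uniform base only.
        return [1] * len(beliefs)
    shares = [chips * b / total for b in beliefs]
    assigned = [int(s) for s in shares]  # whole chips first
    leftover = chips - sum(assigned)
    # Assumption: remaining chips go to the largest fractional remainders.
    order = sorted(range(len(beliefs)),
                   key=lambda j: shares[j] - assigned[j],
                   reverse=True)
    for j in order[:leftover]:
        assigned[j] += 1
    return [1 + a for a in assigned]


# Hypothetical beliefs for three next states, with beta = 18 chips.
alphas = elicit_dirichlet_row([5, 3, 2], chips=18)
```

A larger β concentrates the prior more tightly around the hypothesis, so comparing hypotheses at several values of β shows how the evidence changes as belief in each hypothesis grows.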
33. Structure of HypTrails
MC Model
Prior (H1)
Data (Trails)
likelihood (H1)
Prior (H2)
likelihood (H2)
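Once each hypothesis has a marginal likelihood, comparing them amounts to ordering those values; the difference of two log marginal likelihoods is the log Bayes factor. The sketch below shows that last step only. The numbers are made-up placeholders, not results from the talk, and the function names are hypothetical.

```python
def rank_hypotheses(log_evidence):
    """Order hypothesis names from most to least plausible."""
    return sorted(log_evidence, key=log_evidence.get, reverse=True)


def log_bayes_factor(log_evidence, h_a, h_b):
    """Log Bayes factor of h_a over h_b; positive values favor h_a."""
    return log_evidence[h_a] - log_evidence[h_b]


# Placeholder log marginal likelihoods (illustrative values only).
log_evidence = {"H1": -1520.4, "H2": -1610.9}
ranking = rank_hypotheses(log_evidence)
lbf = log_bayes_factor(log_evidence, "H1", "H2")
```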
34. Demonstration of general applicability
Synthetic data
Human song trails (Last.fm)
Human review trails (Yelp)
Human navigation trails (Wikigame)
37. Summary
Studying mechanisms producing human trails
HypTrails: A coherent approach for expressing and
comparing hypotheses about human trails
Can be applied to all kinds of human trails
Implementations: www.philippsinger.info/hyptrails
38. GESIS - Leibniz Institute for the Social Sciences
Thank you for your attention!
39. References 1/2
[West et al. WWW 2015]
Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from Human
Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference
(WWW'15), Florence, Italy, 2015.
[De Choudhury et al. HT 2010]
De Choudhury, Munmun and Feldman, Moran and Amer-Yahia, Sihem and Golbandi, Nadav and
Lempel, Ronny and Yu, Cong: Automatic construction of travel itineraries using social breadcrumbs.
21st ACM conference on Hypertext and hypermedia, 2010.
[Bestavros CIKM 1995]
Bestavros, Azer: Using speculation to reduce server load and service time on the WWW. 4th International conference
on Information and knowledge management. 1995.
[Perkowitz IJCAI 1997]
Perkowitz, Mike, and Oren Etzioni: Adaptive web sites: an AI challenge. 15th international joint
conference on Artificial intelligence. 1997.
[West et al. IJCAI 2009]
West, Robert, Joelle Pineau, and Doina Precup. "Wikispeedia: An Online Game for Inferring Semantic
Distances between Concepts." IJCAI. 2009.
40. References 2/2
[Singer et al. IJSWIS 2013]
Philipp Singer, Thomas Niebler, Markus Strohmaier and Andreas Hotho, Computing Semantic
Relatedness from Human Navigational Paths: A Case Study on Wikipedia, International Journal on
Semantic Web and Information Systems (IJSWIS), vol 9(4), 41-70, 2013
[West & Leskovec WWW 2012]
Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st International
World Wide Web Conference (WWW'12), pp. 619-628, Lyon, France, 2012.
[Chi et al. CHI 2001]
Chi, Ed H., et al. "Using information scent to model user information needs and actions on the
Web." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2001.