Statistical Physics, Network
theory & Big data
An approach to human mobility

Oleguer Sagarra
Dept. Física Fonamental,
University of Barcelona
1
A killing combination...
Statistical Physics
&
Big Data
New Social Sciences
2
Why?
We want to study Human Mobility

Mobility has deep implications for many processes
(contagion, spread of ideas...)
The development of GPS/mobile phone technologies
makes gathering data cheap and possible at large
scale.
3
What?
(Human) Mobility is a rather complex process
Different scales (Micro/Meso/Macro)
Society is heterogeneous (Humans are not
monkeys in principle!)

But we are physicists! So we will try to
model it anyway
4
But we don't need
modelling
"Computers are useless, they can only
give you answers" (P. Picasso)
This talk is about questions instead
"Models push the boundaries of our
understanding"
5
How?
Theoretical
Physics
Mathematics

Empirical
Real (big) Data

Network Science

6
The data... (has problems)
a) How to get it?
Private companies
(Social Media)

Citizens

7
Getting the data... Experiments
Smartphones give lots of sensing opportunities
Citizen science aims to involve people in data
collection, sharing and processing

BeePath: Experiments on
human mobility
http://bee-path.net
(By the way: a very interesting project, but there is no time for it today)

8
Getting the data...
Social Media
b) Is it biased?
(Big data can also mean big errors)

9
Social media data
Social media data is geolocalized, so we can extract
trajectories from it.
But first, is the data representative of the population?

(We want info about people, not just about the few people who tweet a lot)

We can compare with the census
Analysis must be done at user level!
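
A minimal sketch of the census comparison, under stated assumptions: the record layout `(user_id, region)`, the region labels and the census figures are all hypothetical, and each user is simply counted once in every region where they appear. It only illustrates why counting distinct users, rather than tweets, keeps heavy tweeters from dominating.

```python
from collections import defaultdict
from math import sqrt


def users_per_region(records):
    """Count distinct users per region, so a heavy tweeter counts only once."""
    seen = defaultdict(set)
    for user_id, region in records:
        seen[region].add(user_id)
    return {region: len(users) for region, users in seen.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical geolocated messages: user "u1" tweets three times from region A.
records = [
    ("u1", "A"), ("u1", "A"), ("u1", "A"), ("u2", "A"),
    ("u3", "B"),
    ("u4", "C"), ("u5", "C"), ("u6", "C"),
]
# Hypothetical census population per region.
census = {"A": 2000, "B": 1000, "C": 3000}

counts = users_per_region(records)
regions = sorted(census)
r = pearson([counts.get(k, 0) for k in regions], [census[k] for k in regions])
print(counts, r)
```

A correlation close to 1 would suggest the users sample the population roughly proportionally; counting tweets instead of users would inflate region A here.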
10
The data... is geolocalized,
and (too) big!
c) Continuous vs discrete data
From points to a network?
(We want only the flows: where people come from and where they go, on average)

11
The network approach
Data
Filtering
Aggregation (grid)
Network
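
The four steps above can be sketched as follows. This is an illustration under assumptions, not the pipeline used in the talk: the point layout `(user_id, timestamp, lat, lon)`, the coordinate-range filter, the cell size and the rule "one trip per change of cell" are all hypothetical choices.

```python
from collections import Counter, defaultdict
from math import floor


def to_cell(lat, lon, cell):
    """Aggregation: map a coordinate to the index of its square grid cell."""
    return (floor(lat / cell), floor(lon / cell))


def build_flow_network(points, cell=0.5):
    """Turn (user_id, timestamp, lat, lon) points into weighted cell-to-cell flows."""
    # Filtering (hypothetical rule): drop points with impossible coordinates.
    valid = [p for p in points if -90 <= p[2] <= 90 and -180 <= p[3] <= 180]

    # Group each user's visits so that trips are built at user level.
    by_user = defaultdict(list)
    for user, t, lat, lon in valid:
        by_user[user].append((t, to_cell(lat, lon, cell)))

    # Network: one directed trip for every change of cell between
    # consecutive points of the same user.
    flows = Counter()
    for visits in by_user.values():
        visits.sort()
        for (_, origin), (_, dest) in zip(visits, visits[1:]):
            if origin != dest:
                flows[(origin, dest)] += 1
    return flows


points = [
    ("u1", 1, 41.2, 2.1),
    ("u1", 2, 41.3, 2.2),   # same cell as before: no trip
    ("u1", 3, 41.7, 2.1),
    ("u2", 1, 41.2, 2.1),
    ("u2", 2, 41.8, 2.3),
    ("u3", 1, 999.0, 2.1),  # invalid latitude: filtered out
]
flows = build_flow_network(points)
print(flows)
```

The result is a weighted directed network whose nodes are grid cells and whose edge weights count trips, which is the object the later slides work with.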

12
Network data

(We can now apply network metrics,
and the data is normalized!)
Sagarra, O. Master Thesis. http://upcommons.upc.edu/pfc/handle/2099.1/13134

13
Now we know how to deal
with the data...

We want to detect abnormal patterns...
What is chance, what is not?
What is important, what is not?
14
Modeling as a physicist
Take all trivial elements out
Keep just the basic factors in mobility

- Distance / Cost (a.k.a. laziness)
- Population density (a.k.a. opportunities)

(We look for causality, not correlation)
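
As an illustration only: one classic way to combine these two factors is a gravity-type law, where flow grows with the populations at both ends and decays with distance. The slide does not name this model, so the functional form, the exponential decay, `scale_km` and the constant `k` below are all assumptions.

```python
from math import exp


def gravity_flow(pop_origin, pop_dest, distance_km, scale_km=10.0, k=1e-6):
    """Gravity-type estimate: opportunities (population) times a distance penalty."""
    # Assumed form: k * P_i * P_j * exp(-d / scale); not taken from the talk.
    return k * pop_origin * pop_dest * exp(-distance_km / scale_km)


near = gravity_flow(50_000, 20_000, distance_km=5.0)
far = gravity_flow(50_000, 20_000, distance_km=50.0)
bigger = gravity_flow(100_000, 20_000, distance_km=5.0)
print(near, far, bigger)
```

With these assumptions, the same pair of populations exchanges fewer trips when farther apart ("laziness"), and doubling one population doubles the flow ("opportunities").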
15
Macro/Meso level:
(urban/regional/national)

We need a general model for mobility networks

Taking inspiration from Statistical Mechanics
and Network Theory, one can define flexible
null models.

16
We need a null model for the
data...
Procedure:
1. Fix some hypotheses
The population leaving or entering each cell is given
(quite a lot of maths)*

2. Generate predictions
How do the flows organize?

3. Compare
Data vs Prediction
Sagarra, O. et al. Phys. Rev. E 88, 062806 (2013)
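
The three steps can be sketched with the simplest null model compatible with the stated hypothesis. The assumption here is that, once the totals leaving and entering each cell are fixed, trips are otherwise distributed at random, so the expected flow from i to j is (trips leaving i) x (trips entering j) / (total trips). The cited paper develops the formalism in full; this sketch does not claim to reproduce it, and the cell labels and trip counts are hypothetical.

```python
from collections import defaultdict


def null_model_expectation(flows):
    """Expected flow per pair when only the out- and in-totals of each cell are fixed."""
    # Step 1 (hypothesis): the totals leaving and entering each cell are given.
    out_total = defaultdict(int)
    in_total = defaultdict(int)
    for (origin, dest), trips in flows.items():
        out_total[origin] += trips
        in_total[dest] += trips
    total = sum(flows.values())

    # Step 2 (prediction): trips are otherwise random, so the expectation
    # factorizes. Self-loops (origin == dest) are allowed in this sketch.
    return {
        (o, d): out_total[o] * in_total[d] / total
        for o in out_total
        for d in in_total
    }


# Hypothetical observed trips between three cells.
observed = {
    ("A", "B"): 30, ("A", "C"): 10,
    ("B", "A"): 20, ("B", "C"): 20,
    ("C", "A"): 20,
}
expected = null_model_expectation(observed)

# Step 3 (compare): ratios far from 1 flag flows the hypothesis does not explain.
ratio = {pair: observed[pair] / expected[pair] for pair in observed}
print(ratio)
```

By construction the expected flows reproduce each cell's outgoing and incoming totals, so any deviation between data and prediction comes from how the flows are organized, not from how busy each cell is.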

17
Roadmap
Raw data
Experiments, Databases...

Prediction
(Product)

Data treatment tools

Statistical Validation
Hypothesis...
Modelling

(We are here)

Clean data
Null Model predictions

Data features
Visualizations
18
What's the goal of all this?
Understand what drives human mobility
Discriminate important factors from negligible ones
(population density, distance, cost...)
Create tools to study data in an unbiased manner

19
osagarra@ub.edu
@usagarra

Thanks for your attention...

20
