This document discusses how big data is impacting tourism. It notes that data comes from tourists, attractions, hotels, and restaurants through sensors and online activities. This data is owned by both the government and private sector, but the private sector data is more integrated with user experiences. It also discusses challenges with scraping data from major travel sites which frequently change their structures. Potential applications of big data in tourism include analyzing user behaviors, automated storytelling for destinations, and tailoring experiences based on user profiles.
Big data, big tourism
1. Big Data, Big Tourism
Tourism and Mechanics
3. What are «Big Data»?
Excel gets stuck working on a dataset? => «medium» data
Stata/R struggle working on a dataset? => «big» data
4. Where do we get the data?
Tourists
Have sensors
Are sensors
Are actors
Attractions
Are sensors
Are actors
Hotels, restaurants
Are sensors
Have sensors
5. Can we access the data?
Tourists
Have sensors
Are sensors
Are actors
Attractions
Are sensors
Are actors
Hotels, restaurants
Are sensors
Have sensors
8. Can we access the data?
Government
Tourists
Have sensors
Are sensors
Are actors
Attractions
Are sensors
Are actors
Hotels, restaurants
Are sensors
Have sensors
Private Sector
9. Can we access the data?
Tourists
Have sensors
Are sensors
Are actors
Attractions
Are sensors
Are actors
Hotels, restaurants
Are sensors
Have sensors
Private Sector
Government
Open(able/ish) data
Almost always
10. Ok so who owns that data?
Government
Bureaucracy-driven data
Incoherent
Inconsistent
Irregular production
Private Sector
Deeply integrated with user experience
Very «behavioral», and as such very «real»
Very business-oriented metrics
13. Scraping
Time consuming
Power consuming
Illegal (up to a certain point)
Unavoidable (up to a certain
point)
15. Scraping
It relies on the fact that (most of) the web is based on HTML
And HTML is text
And JavaScript is text
And CSS is text
Everything can be read before
the render
Or after the render
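Because the page source is plain text, it can be parsed without a browser. The sketch below shows the "before the render" case using only Python's standard library. The sample HTML, the class name LinkAndTitleParser and the function parse_page are hypothetical and exist only for illustration; reading "after the render" would instead need a browser driver that executes the JavaScript first.

```python
from html.parser import HTMLParser


class LinkAndTitleParser(HTMLParser):
    """Collects the page title and every link target from raw HTML text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text inside <script> or <body> also arrives here; only title text is kept.
        if self._in_title:
            self.title += data


# Hypothetical page source, as it would arrive before any rendering.
SAMPLE_HTML = """
<html><head><title>Hotels in Example City</title></head>
<body>
  <a href="/hotel/alpha">Hotel Alpha</a>
  <a href="/hotel/beta">Hotel Beta</a>
  <script>var loaded = false;</script>
</body></html>
"""


def parse_page(html_text):
    """Return (title, list of hrefs) found in the raw HTML text."""
    parser = LinkAndTitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links


if __name__ == "__main__":
    print(parse_page(SAMPLE_HTML))
```

Note that the script content is visible as text too, but it is never executed: anything the JavaScript would add to the page is missing from this view.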
16. Tools
Not easy for «complex» sites
Some cases come up
Some tools help
Knowledge of an XML query language (e.g. XPath) or CSS selectors may be required
Some tools are very advanced
Selenium browser driver
«headless» browsers
Chrome
https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd?hl=en
https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn?hl=en
https://chrome.google.com/webstore/detail/advanced-web-scraper/gpolcofcjjiooogejfbaamdgmgfehgff
Firefox
https://addons.mozilla.org/en-US/firefox/addon/datascraper/
Web
https://www.import.io/
https://scrapinghub.com/portia/
17. Cases and issues of scraping
Booking.com
Amazing website
Easy navigation for the user
Issues
They know!!!
The website gets a complete structural overhaul every 6-9 months
They tend to hate scrapers
The page is empty when first loaded
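One way to cope with periodic overhauls is to keep one extraction rule per known layout and to fail loudly when none of them matches. The sketch below is only an illustration of that idea: the two patterns, the markup they match and the names PRICE_PATTERNS, LayoutChanged and extract_price are all hypothetical and do not describe any real site. Regular expressions keep the example self-contained; a real scraper would more likely use CSS selectors or XPath.

```python
import re

# Hypothetical patterns: each one matches the price markup of one past layout.
PRICE_PATTERNS = [
    re.compile(r'<span class="price">\s*([\d.]+)\s*</span>'),
    re.compile(r'data-price="([\d.]+)"'),
]


class LayoutChanged(Exception):
    """Raised when no known pattern matches the page."""


def extract_price(html_text):
    """Try every known layout in order and return the first price found."""
    for pattern in PRICE_PATTERNS:
        match = pattern.search(html_text)
        if match:
            return float(match.group(1))
    # Reached when the structure was overhauled, or when the page source
    # is still empty because the content has not been loaded yet.
    raise LayoutChanged("no known price pattern matched")
```

Raising an exception instead of returning an empty value makes an overhaul visible on the first failed run, rather than silently producing gaps in the collected data.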
19. Cases and issues of scraping
AirBnB
Nice navigation
Full overhaul every 3 months
Issues
The page closely tracks what kind of user is accessing it
Only 13 pages are visible
They are randomly generated every day for the major areas
20. Cases and issues of scraping
Weather
Many sources
Many formats
Issues
Normalization of vocabulary
Bad weather == Rain == Rainy == Cloud Icon == ???
Normalization of ranges
Normalization of numbers
Normalization of periodicity
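The normalization issues above can be sketched as a small mapping step applied to each source before the data is merged. Everything in this example is an assumption made for illustration: the vocabulary table, the field names condition, temp and unit, and the choice of Celsius as the common unit. The label "bad weather" is deliberately left unmapped to show how ambiguous terms end up as "unknown" until someone decides what they mean.

```python
# Hypothetical vocabulary map: source-specific labels -> one canonical label.
CANONICAL_CONDITION = {
    "rain": "rain",
    "rainy": "rain",
    "showers": "rain",
    "cloudy": "cloudy",
    "cloud icon": "cloudy",
    "overcast": "cloudy",
}


def normalize_condition(label):
    """Map a source label to a canonical one, or 'unknown' if unmapped."""
    return CANONICAL_CONDITION.get(label.strip().lower(), "unknown")


def parse_number(text):
    """Accept both '12.5' and the decimal-comma form '12,5'.

    Thousands separators are not handled in this sketch.
    """
    return float(text.strip().replace(",", "."))


def to_celsius(value, unit):
    """Convert a temperature to Celsius; only C and F are supported here."""
    unit = unit.strip().upper()
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown temperature unit: {unit!r}")


def normalize_reading(raw):
    """Turn one raw source record into the common vocabulary and unit."""
    return {
        "condition": normalize_condition(raw["condition"]),
        "temp_c": round(to_celsius(parse_number(raw["temp"]), raw["unit"]), 1),
    }


if __name__ == "__main__":
    print(normalize_reading({"condition": " Rainy ", "temp": "12,5", "unit": "c"}))
    print(normalize_reading({"condition": "Cloud Icon", "temp": "59", "unit": "F"}))
```

Normalization of periodicity (hourly versus daily sources, for example) is not covered here; it would need a resampling step on top of these per-record fixes.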
21. Apps
Questionnaire to get user to explicitly give data
Information driven application to track user data
Gamification and/or information platform to elaborate and give data back
22. Explicit data
Relies on the user's deliberate actions
Requires a genuine willingness to share information
Stops at political correctness
Implies (almost always) anonymity
Questionnaire
In-place review
In-place comment
Bureaucracy
25. Behavioral data
Almost always true
Difficult to get
Easily contextualizable
Interactive
Interconnected
Application
Platform
Social Media integration
Gamification
Social Media involvement
26. Cool, so what can be done?
Getting Data
Municipalities are setting up open wireless networks.
Users can be tracked.
Services can be offered (and instrumented)
Museums can track users within their premises
Social Media interactions
Using Data
Analysis of context of specific behaviours
Automated storytelling for city visits
Pricing methodologies
Destination brand analysis
27. Big and Big-ish Data Tools
The problem is computational power
Lots of work on AI
Classification
Generation
Machine Learning
Correlations
DataWarehouses
Mondrian - http://community.pentaho.com/projects/mondrian/
Big Data DBs
Cassandra - http://cassandra.apache.org/
Hadoop - http://hadoop.apache.org/
Big Data Search
BigQuery - https://cloud.google.com/bigquery/
GraphQL - http://graphql.org/
Big Data AI/ML
TensorFlow - https://www.tensorflow.org/
SciPy - https://www.scipy.org/
28. A few open questions
Impact of crowdfunding on tourism-bound projects
Impact of meta-search engines on pricing
Impact (or lack thereof) of destination information websites on user decisions
How can the user be «vetted» in order to tailor the tourist experience around her?
Would such a vetting process affect customer return decisions?