Theory is nice but data is heaven. Most market researchers have heard a lot of theory about big data, but few have seen the data and worked with it themselves. And we all know that the best way to truly understand and internalise something is to see the raw data for yourself. In this presentation, we'll blast ten big data myths using stories that many researchers can actually relate to - survey panel data. With millions of panellists, millions of profiles, millions of survey clicks, and millions of incentives, market researchers have been sitting on pretty big data for nearly 20 years. See how easy it is for trained scientists like yourselves to learn some SAS, R, or SQL, and dig into that big data on your own.
Blasting 10 Big Data Myths with 10 Panel Data Examples
1. Please tweet! #ESOMAR @LoveStats
Big Data Myths
Presented by Annie Pettit
Chief Research Officer at Peanut Labs, a Research Now Group Company.
2. What is Big Data?
Volume Velocity Variety
Research panel data
Shopper/Loyalty/Transactional data
Web tracking data
Text, video, audio, date, time, $,
coupons, loyalty card, SKU
url, click, save, download, lat/long
Eye motion, brain wave, electrical pulse
Every picosecond
http://giphy.com/search/gotta-go-fast
4. Myth: Big Data is New
2015 Supercomputer
Take on the biggest jobs, tasks other computer systems simply can't handle
Clock speed 173 petaflops
http://www.forbes.com/sites/sungardas/2015/04/14/the-amazing-super-powers-of-a-supercomputer/
5. Is 1985 New?
Supercomputer 30 years ago
Large memory and performance allows users to solve problems that cannot be solved with any other computer
Clock cycle 4.1 nanoseconds
http://archive.computerhistory.org/resources/text/Cray/Cray.Cray2.1985.102646185.pdf
6. Big Data is only new to some
2002: drugstore transactional database
2005: research panel database
2010: social media database
2015: research panel database
Just Me
1979: Texas International Airlines loyalty program
2002: Target advertising, Andrew Pole
2004: Walmart stocks stores for hurricanes
MRX
7. Myth: Big Data Is Better
Emotions Attitudes
Beliefs
8. Myth: Volume trumps knowledge
PL's average survey minutes:
Total: 4.7
Only surveys: 5.1
Only completes: 15.0
Only USA: 15.8
Only recent: 15.6
No test links: 16.5
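The effect of the filter steps on this slide can be sketched in a few lines of Python. This is only an illustration: the records, field names, and resulting averages below are hypothetical and are not the Peanut Labs panel data. It shows how each added filter changes the average even though the underlying table stays the same.

```python
from statistics import mean

# Hypothetical click records (NOT real panel data); field names are assumptions.
records = [
    {"kind": "profile", "complete": False, "country": "US", "minutes": 1.0},
    {"kind": "survey", "complete": False, "country": "US", "minutes": 2.0},
    {"kind": "survey", "complete": True, "country": "US", "minutes": 16.0},
    {"kind": "survey", "complete": True, "country": "CA", "minutes": 12.0},
    {"kind": "survey", "complete": True, "country": "US", "minutes": 18.0},
]

# Each step is applied on top of the previous ones, as on the slide.
filters = [
    ("Total", lambda r: True),
    ("Only surveys", lambda r: r["kind"] == "survey"),
    ("Only completes", lambda r: r["complete"]),
    ("Only USA", lambda r: r["country"] == "US"),
]

def cumulative_averages(rows, steps):
    """Apply the filters cumulatively and return (label, mean minutes) pairs."""
    results = []
    for label, keep in steps:
        rows = [r for r in rows if keep(r)]
        results.append((label, round(mean(r["minutes"] for r in rows), 1)))
    return results

averages = cumulative_averages(records, filters)
for label, value in averages:
    print(f"{label:15s}{value:5.1f}")
```

Knowing which filters belong in the analysis is subject knowledge; the volume of rows alone does not supply it.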
9. Myth: Big data is clean
One SQL Table
N=75 million
Variables = 1012
Missing values = 53 million
11. Myth: Big data is the population
On March 25 at 12:10:16
One second later, it was missing 3 records
One minute later, it was missing 180 records
One day later, it was missing 260,000 records
Today, it is missing 15 million records.
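The arithmetic behind these figures is easy to check. The sketch below assumes a constant arrival rate of 3 new records per second, which is what the first three figures imply; the function names are made up for illustration, and the constant rate is an assumption rather than something the slide states.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60

def records_missed(seconds_elapsed, new_records_per_second=3):
    """Records that arrived after the snapshot, assuming a constant rate."""
    return seconds_elapsed * new_records_per_second

def days_until_missing(target_records, new_records_per_second=3):
    """Days needed to fall behind by target_records at a constant rate."""
    return target_records / records_missed(SECONDS_PER_DAY, new_records_per_second)

per_second = records_missed(1)                   # 3
per_minute = records_missed(SECONDS_PER_MINUTE)  # 180
per_day = records_missed(SECONDS_PER_DAY)        # 259,200, roughly 260,000
days_for_15_million = days_until_missing(15_000_000)

print(per_second, per_minute, per_day, round(days_for_15_million, 1))
```

At that assumed rate, a snapshot falls 15 million records behind in about two months, so any extract is a sample of a moving target and not the population.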
13. Myth: Data speed is everything
Completion Rate (Per Second)
14. Speed is awesome!
If you don't care about
Coding Errors
Outliers
Accuracy
Exceptions
Interactions
Validity
Generalizability
Reliability
Comprehension
15. Myth: Big Data Renders Science Obsolete
Incomplete data
Miscoded data
Misplaced data
Remember non-random
+/- 5, 19 times out of 20
p-values
Type 1 and Type 2 errors
Remember random
Total Research Error
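"+/- 5, 19 times out of 20" is the familiar way of stating a margin of error of 5 percentage points at 95% confidence. As a reminder of where it comes from, here is a minimal sketch assuming a simple random sample, the normal approximation, and the worst-case proportion p = 0.5. The function names are illustrative only.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value ("19 times out of 20")

def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=Z_95):
    """Smallest simple random sample that achieves the given margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_needed = sample_size_for(0.05)
print(n_needed, round(margin_of_error(n_needed), 4))
```

The formula covers random sampling error only. The non-random problems listed on this slide (incomplete, miscoded, and misplaced data) do not shrink as the number of records grows.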
17. Name That Software!
* Flag likely Higgins voters among people who voted before and are voting today.
Compute VoteForHiggins=999.
If (Q2votepastYes=1 and Q4votetodayYes=1 and Q6voteHigginsLikely=1) VoteForHiggins=1.
If (Q2votepastYes=1 and Q4votetodayYes=1 and Q6voteHigginsUnlikely=1) VoteForHiggins=0.
If (Q2votepastYes=1 and Q4votetodayYes=1 and Q6voteHigginsUnsure=1) VoteForHiggins=69.
MISSING VALUES VoteForHiggins (69, 999).
Execute.
18. Name That Software!
* Correlate the key indicators, using only rows with no missing values;
PROC CORR DATA=ResearchData.Client243 OUTP=ClientOutput NOMISS;
VAR PurchaseIntent Recommend Different New Value;
TITLE2 'Correlations of Key Indicators';
RUN;
19. Name That Software!
-- Average number of completes per panellist, by recruitment date
Select RecruitDate, avg(CompletesPerPerson)
From
(select UserID, RecruitDate, count(*) as CompletesPerPerson
from CompleteDataBase
group by UserID, RecruitDate) RecruitData
Group by RecruitDate
Order by RecruitDate
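To try the query above without a database server, one option is Python's built-in sqlite3 module. The sketch below is hypothetical: the table layout (one row per completed survey, with a made-up SurveyID column) and the six toy rows are assumptions for illustration, not the real panel schema. The inner query groups by both UserID and RecruitDate so that it is valid standard SQL.

```python
import sqlite3

# In-memory database with a toy table; the schema is assumed, one row per complete.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CompleteDataBase (UserID INTEGER, RecruitDate TEXT, SurveyID INTEGER)"
)
conn.executemany(
    "INSERT INTO CompleteDataBase VALUES (?, ?, ?)",
    [
        (1, "2015-01-05", 101), (1, "2015-01-05", 102), (1, "2015-01-05", 103),
        (2, "2015-01-05", 101),
        (3, "2015-02-10", 102), (3, "2015-02-10", 104),
    ],
)

# Inner query: completes per panellist. Outer query: average per recruitment date.
query = """
SELECT RecruitDate, AVG(CompletesPerPerson)
FROM (SELECT UserID, RecruitDate, COUNT(*) AS CompletesPerPerson
      FROM CompleteDataBase
      GROUP BY UserID, RecruitDate) RecruitData
GROUP BY RecruitDate
ORDER BY RecruitDate
"""
rows = conn.execute(query).fetchall()
for recruit_date, avg_completes in rows:
    print(recruit_date, avg_completes)
conn.close()
```

The same statement should run largely unchanged on other SQL engines, since it uses only standard grouping and aggregation.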
20. Myth: Big Data is for the IT Department
1. RapidMiner
2. R
3. Excel
4. SQL
5. Python
6. Weka
7. KNIME
8. Hadoop
9. SAS base
10. SQL Server
http://www.kdnuggets.com/polls/2014/analytics-data-mining-data-science-software-used.html
21. http://www.anlytcs.com/2014/01/data-science-venn-diagram-v20.html
Math and Statistics
Subject Matter Expertise
Computer Science
BIG DATA is YOU and ME!
22. Myth: Big data requires a big budget
24. People Costs
Marketing Manager: $60 000
IT Product Manager: $80 000
Research Scientist: $61 000
Software Engineer: $60 000
Statistician: $57 000
http://www.payscale.com/research/CA/Job=Data_Scientist,_IT/Salary
Data Scientist: $70 000
25. Big Data
Is not new
Is not clean nor complete
Does not trump knowledge
Does not render science obsolete
Is not just for IT
Doesn't win because of speed
Does not require a huge budget
Is not by definition better
26. What is Big Data Really?
Fast Actionable Relevant
Your products
Your clients
Your key metrics
Definable
Measurable
Changeable
Awesomeable
Already fielded
Already awesome sample sizes
Already in a dataset
27. Thank you!
Annie Pettit
Chief Research Officer
annie@peanutlabs.com
ca.linkedin.com/in/AnniePettit/
facebook.com/AnniePettit
twitter.com/LoveStats
Jonathan Cheriff
Director of Sales & Marketing
jonathan.cheriff@peanutlabs.com
Find PeanutLabs on
LinkedIn Facebook Twitter YouTube