This deck covers analyzing streaming data with streaming algorithms. It shows how to compute the mean incrementally as new data points arrive, by keeping a running sum and a running count of items, and asks whether the median can be updated the same way. It illustrates how pre-aggregated counters answer range queries under the hood, sketches the realtime tradeoffs between high volume, high velocity, and ad-hoc queries, and concludes that big data is also about doing small things quickly and making the results accessible.
7. Get ahead of the curve
[Figure: stream of data points labeled Normal vs Noise — J Gama, University of Porto]
8. Get ahead of the curve
[Figure: stream of data points labeled Normal, Noise, Concept drift, New concept — J Gama, University of Porto]
"Big Data is much more likely to catch the black swan as it swoops in" - Norman Nie, Revolution Analytics
18. Under the hood
Query: where time 21:00 - 23:00, count(*)

Under the hood, the query is answered from pre-aggregated counters:

By hour (with per-minute buckets):
21:00: all = 1345, :00 = 45, :01 = 62, ...
22:00: all = 3221, :00 = 22, :01 = 19, ...
...

By country (with per-user buckets):
UK: all = 228, user01 = 1, user14 = 12, ...
US: all = 354, user01 = 15, user14 = 0, ...
MY: all = 28, user01 = 0, user02 = 0, ...
...
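A minimal sketch of how such counters could be maintained, assuming events arrive one at a time; the names (ingest, count_between, the bucket keys) are hypothetical, not from the deck. Each event bumps every counter it belongs to, so a range count becomes a few lookups instead of a scan:

from collections import defaultdict

# hour -> {"all": n, minute: n, ...} and country -> {"all": n, user: n, ...}
by_hour = defaultdict(lambda: defaultdict(int))
by_country = defaultdict(lambda: defaultdict(int))

def ingest(hour, minute, country, user):
    # bump every pre-aggregated counter this event belongs to
    by_hour[hour]["all"] += 1
    by_hour[hour][minute] += 1
    by_country[country]["all"] += 1
    by_country[country][user] += 1

def count_between(start_hour, end_hour):
    # count(*) where time start_hour - end_hour, answered from stored totals
    return sum(by_hour[h]["all"] for h in range(start_hour, end_hour))

# e.g. ingest(21, 0, "UK", "user14"); then count_between(21, 23)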
19. Streaming algorithms
A = [a1, a2, a3, a4, a5]
mean(A) = sum it up / number of things
20. Streaming algorithms
A = [a1, a2, a3, a4, a5]
mean(A) = sum it up / number of things
now add another item a6...???
21. Streaming algorithms
A = [a1, a2, a3, a4, a5]
mean(A) = sum it up / number of things
now add another item a6...???
sum = sum + a6
inc(number of things)
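The update on slide 21 translates directly to code. A minimal sketch (the RunningMean name is my own):

class RunningMean:
    # incremental mean: O(1) memory, one update per new item
    def __init__(self):
        self.sum = 0.0
        self.count = 0

    def add(self, x):
        self.sum += x     # sum = sum + a6
        self.count += 1   # inc(number of things)

    def mean(self):
        return self.sum / self.count

m = RunningMean()
for a in [1, 2, 3, 4, 5]:
    m.add(a)
m.add(6)          # now add another item a6
print(m.mean())   # 3.5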
22. Streaming algorithms
A = [a1, a2, a3, a4, a5]
mean(A) = sum it up / number of things
now add another item a6...???
sum = sum + a6
inc(number of things)
try this with median?
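The deck leaves the median as an open question, and for good reason: it has no exact constant-space update like the mean's. One standard answer (a two-heap technique, not from the slides) keeps the lower half of the items in a max-heap and the upper half in a min-heap:

import heapq

class RunningMedian:
    # streaming median via two heaps: lo is a max-heap (stored negated)
    # holding the lower half, hi is a min-heap holding the upper half
    def __init__(self):
        self.lo = []
        self.hi = []

    def add(self, x):
        heapq.heappush(self.lo, -x)
        # keep max(lo) <= min(hi)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # lo holds the extra element when the count is odd
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

This still stores every item; in big-data settings the usual compromise is an approximate quantile sketch rather than an exact streaming median.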
23. Realtime tradeoffs
[Diagram: tradeoffs between High-velocity, Ad-hoc, and High-volume]
24. Conclusion
Big Data is also about the Little Things, done fast.
The devil is in the details.
Make it accessible.