Distributed Computing in JavaScript!
Viktor Turskyi
CTO at WebbyLab
Kyiv.js 2015
Business task
Processing a large data set (billions of records) from social networks.
Example analysis
Calculating how often groups of keywords are mentioned.
Data source: Twitter Public Stream API
Data volume: +15 GB daily (5 TB per year)
Example queries
#nike vs #adidas
#nike & nba vs #adidas & nba
(仗仂|仆亳) & -弍仂仍
于仂仍从亳 & 从仂仄亠亳从舒
MapReduce, or how they do it at Google
MapReduce is a distributed computing model introduced by Google and used for parallel computation over very large data sets, up to several petabytes, on computer clusters. (Wikipedia)
How does MapReduce work?
MapReduce phases
map: mapper(line) -> (k1, v1)
shuffle: sort by k1
reduce: reducer(k1, [v1, v2, v3])
Word count (the hello world of the MR world)
Word count
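The three phases can be tried out in a single Node.js process. The sketch below is a toy model, not how Hadoop or Google implement it: the function names (`mapPhase`, `shufflePhase`, `reducePhase`, `mapReduce`) are made up for illustration, and everything runs in memory on one machine. Word count serves as the job.

```javascript
// Toy single-process model of map -> shuffle -> reduce.
// All names here are illustrative assumptions, not a real library API.

// map: mapper(line) -> list of [k1, v1] pairs
function mapPhase(lines, mapper) {
  return lines.flatMap((line) => mapper(line));
}

// shuffle: sort by k1 and group the values of equal keys together
function shufflePhase(pairs) {
  const sorted = [...pairs].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const groups = [];
  for (const [key, value] of sorted) {
    const last = groups[groups.length - 1];
    if (last && last[0] === key) {
      last[1].push(value);
    } else {
      groups.push([key, [value]]);
    }
  }
  return groups;
}

// reduce: reducer(k1, [v1, v2, v3]) -> one result per key
function reducePhase(groups, reducer) {
  return groups.map(([key, values]) => reducer(key, values));
}

function mapReduce(lines, mapper, reducer) {
  return reducePhase(shufflePhase(mapPhase(lines, mapper)), reducer);
}

// Word count: emit [word, 1] for every word, then sum the ones per word.
const wordCountMapper = (line) =>
  line.split(/\s+/).filter(Boolean).map((word) => [word, 1]);
const wordCountReducer = (word, counts) =>
  [word, counts.reduce((sum, n) => sum + n, 0)];

const result = mapReduce(
  ['hello world', 'hello mapreduce'],
  wordCountMapper,
  wordCountReducer
);
console.log(result); // [ [ 'hello', 2 ], [ 'mapreduce', 1 ], [ 'world', 1 ] ]
```

In a real cluster the mapper and reducer stay this small; the framework takes over splitting the input, sorting, and moving data between machines.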
仂亞亟舒 亢亠 亳仆 仆亠 仄仂亢亠 亰舒仆,
仂仆 仄舒仗-亠亟ム亳 仂于亠亠从
The Hadoop ecosystem
Google MapReduce -> Hadoop MapReduce
Google File System -> HDFS
Google Bigtable -> HBase
How to use JS with Hadoop
(Hadoop Streaming)
Word count on Hadoop Streaming
Testing locally
cat data | ./mapper.js | sort -k1,1 | ./reducer.js
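Hadoop Streaming talks to the mapper and reducer through stdin and stdout: each record is a line, and key and value are separated by a tab. The reducer receives lines already sorted by key, so it has to notice where one key ends and the next begins. The sketch below assumes both roles live in one hypothetical file, `wordcount.js`, selected by a command-line argument; the slides use separate `./mapper.js` and `./reducer.js` scripts, and this is not their code.

```javascript
// Sketch of a word-count mapper and reducer for Hadoop Streaming.
// File name, role argument and function names are assumptions.
const readline = require('readline');

// Mapper: one input line in, zero or more "word<TAB>1" lines out.
function mapLine(line) {
  return line
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => `${word}\t1`);
}

// Reducer: consumes lines sorted by key and sums each run of equal keys.
function createReducer(emit) {
  let currentKey = null;
  let total = 0;
  const flush = () => {
    if (currentKey !== null) emit(`${currentKey}\t${total}`);
  };
  return {
    push(line) {
      const [key, value] = line.split('\t');
      if (key !== currentKey) {
        flush(); // the previous key is complete, write its total
        currentKey = key;
        total = 0;
      }
      total += Number(value);
    },
    end: flush, // write the total of the last key
  };
}

// Connect the chosen role to stdin and stdout.
function run(role) {
  const emit = (line) => process.stdout.write(line + '\n');
  const rl = readline.createInterface({
    input: process.stdin,
    crlfDelay: Infinity,
  });
  if (role === 'map') {
    rl.on('line', (line) => mapLine(line).forEach(emit));
  } else {
    const reducer = createReducer(emit);
    rl.on('line', (line) => reducer.push(line));
    rl.on('close', () => reducer.end());
  }
}

// Only touch stdin when a role is given explicitly.
const role = process.argv[2];
if (role === 'map' || role === 'reduce') run(role);
```

With this layout the local pipeline above would become `cat data | node wordcount.js map | sort -k1,1 | node wordcount.js reduce`. The `sort` step stands in for the shuffle phase that Hadoop performs on the cluster.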
Boilerplate for Hadoop tasks
https://github.com/koorchik/node-hadoop-boilerplate
Creating a cluster on AWS EMR
(demo)
1) Installing Node.js on the cluster
2) Handling dependencies
From hello world to a real task
Task: comparing how often groups of keywords are mentioned.
Input: data from Twitter
Output: a chart of keyword-group mentions by day
Processing time: up to 10 seconds
Inverted index
Building an inverted index
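An inverted index maps each term to the list of documents that contain it (its posting list), which is what lets a keyword query be answered without rescanning every tweet. The sketch below builds one in memory in the MapReduce style: the map step emits `[term, docId]` pairs and the reduce step collects them per term. The sample tweets, their ids, and the function names are invented for illustration.

```javascript
// Toy documents standing in for tweets; ids and texts are made up.
const tweets = [
  { id: 1, text: 'nike nba finals' },
  { id: 2, text: 'adidas nba' },
  { id: 3, text: 'nike running' },
];

// map: emit one [term, docId] pair per distinct term in a document
function mapTweet(tweet) {
  const terms = new Set(
    tweet.text.toLowerCase().split(/\s+/).filter(Boolean)
  );
  return [...terms].map((term) => [term, tweet.id]);
}

// shuffle + reduce: gather the ids for every term into a sorted posting list
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const [term, id] of mapTweet(doc)) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(id);
    }
  }
  for (const postings of index.values()) {
    postings.sort((a, b) => a - b);
  }
  return index;
}

const index = buildInvertedIndex(tweets);
console.log(index.get('nike')); // [ 1, 3 ]
console.log(index.get('nba'));  // [ 1, 2 ]
```

Keeping posting lists sorted pays off later: sorted lists can be intersected in a single pass and compress well.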
Problems
Mapper synchronicity
Stop words
Word inflection (stemming, lemmatization)
Tokenization (links, hashtags, usernames)
Index compression
Computing intersections in the index
Document ranking
Handling multi-word phrases
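Two of the items above are small enough to sketch. The first function tokenizes a tweet while keeping links, hashtags, and usernames as single tokens and dropping stop words. The second intersects two sorted posting lists, which is how a query such as `#nike & nba` could be answered from an inverted index. The stop-word list, the regular expression, and the function names are assumptions made for this example, and stemming or lemmatization is left out.

```javascript
// Hypothetical stop-word list; a real one is much longer and per language.
const STOP_WORDS = new Set(['a', 'an', 'the', 'and', 'of', 'in', 'to', 'is']);

// Links first, then #hashtags and @usernames, then plain words.
const TOKEN_PATTERN = /https?:\/\/\S+|[#@][\p{L}\p{N}_]+|[\p{L}\p{N}]+/giu;

function tokenize(text) {
  const tokens = text.match(TOKEN_PATTERN) || [];
  return tokens
    // URLs can be case-sensitive, so only other tokens are lowercased.
    .map((t) => (/^https?:\/\//i.test(t) ? t : t.toLowerCase()))
    .filter((t) => !STOP_WORDS.has(t));
}

// Intersection of two posting lists sorted in ascending order.
// Each list is walked once, so the cost is linear in their combined length.
function intersect(a, b) {
  const result = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      result.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return result;
}

console.log(tokenize('The #Nike shoes of @koorchik: https://example.com/A?b=1'));
// [ '#nike', 'shoes', '@koorchik', 'https://example.com/A?b=1' ]
console.log(intersect([1, 3, 5, 8], [2, 3, 8, 9])); // [ 3, 8 ]
```

The link pattern is deliberately naive: it takes everything up to the next whitespace, so trailing punctuation stuck to a URL ends up inside the token.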
Live demonstration
Links
Hadoop streaming utils for NodeJS https://www.npmjs.com/package/hadoop-streaming-utils
Node Hadoop boilerplate https://github.com/koorchik/node-hadoop-boilerplate
NodeJS Mystem3 - https://www.npmjs.com/package/mystem3
MapReduce: Simplified Data Processing on Large Clusters http://research.google.com/archive/mapreduce.html
Amazon Elastic MapReduce http://aws.amazon.com/elasticmapreduce/
Viktor Turskyi
viktor@webbylab.com
https://twitter.com/koorchik
https://github.com/koorchik
WebbyLab
http://webbylab.com
