Radu Pastia: I began working with Hadoop two years ago, when I started the Big Data Team at Avira. At first I focused on the operations side: sizing and setting up our new Hadoop cluster so that it would run smoothly. As our setup stabilized, I started delving deeper into data science and machine learning. I have been coding ever since I had my first home computer running BASIC, and my background before Hadoop is in backend scripting for web-based applications.
Radu Pastia - Couchdoop - Connecting Hadoop with Couchbase
6. Building a connector The Right Way
Mapper
Partitioner
Reducer
Input Format
Input Split
Record Reader
Output Format
Record Writer
10. The InputFormat: From Input to Mapper
--range 2014-09-01;2014-09-20
--number_of_mappers 4
Input dates: 2014-09-01, 2014-09-02, 2014-09-03, 2014-09-04, 2014-09-05, 2014-09-06, ..., 2014-09-20
Input Split 1: 2014-09-01, 2014-09-02, ..., 2014-09-05
Record Reader 1:
(2014-09-01-A; record A)
(2014-09-01-B; record B)
(2014-09-01-; record )
(2014-09-02-A; record A)
(2014-09-02-B; record B)
(2014-09-02-; record )
(2014-09-05-A; record A)
(2014-09-05-B; record B)
(2014-09-05-; record )
Mapper
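The slide's split step can be sketched in plain Java. This is a minimal illustration, not Couchdoop's actual code: the class `DateRangeSplitter` and its `split` method are hypothetical names, and the even, contiguous division of days is an assumption that happens to match the slide (20 days over 4 mappers, with Input Split 1 covering 2014-09-01 through 2014-09-05). A real InputFormat would wrap each chunk in an InputSplit and return the list from `getSplits`.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Couchdoop or Hadoop. */
class DateRangeSplitter {

    /**
     * Splits the inclusive range [start, end] into at most numSplits
     * contiguous chunks of days, as evenly as possible.
     */
    static List<List<LocalDate>> split(LocalDate start, LocalDate end, int numSplits) {
        if (numSplits < 1 || end.isBefore(start)) {
            throw new IllegalArgumentException("need numSplits >= 1 and start <= end");
        }
        long totalDays = ChronoUnit.DAYS.between(start, end) + 1; // range is inclusive
        long base = totalDays / numSplits;       // minimum number of days per split
        long remainder = totalDays % numSplits;  // the first `remainder` splits get one extra day

        List<List<LocalDate>> splits = new ArrayList<>();
        LocalDate cursor = start;
        for (int i = 0; i < numSplits; i++) {
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // more mappers requested than days available
            }
            List<LocalDate> days = new ArrayList<>();
            for (long d = 0; d < size; d++) {
                days.add(cursor);
                cursor = cursor.plusDays(1);
            }
            splits.add(days);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Same arguments as on the slide: --range 2014-09-01;2014-09-20 --number_of_mappers 4
        List<List<LocalDate>> splits =
                split(LocalDate.parse("2014-09-01"), LocalDate.parse("2014-09-20"), 4);
        for (int i = 0; i < splits.size(); i++) {
            List<LocalDate> days = splits.get(i);
            System.out.println("Input Split " + (i + 1) + ": "
                    + days.get(0) + " .. " + days.get(days.size() - 1));
        }
    }
}
```

Under these assumptions each mapper receives five consecutive days, and a record reader for a split would then turn every day into the key/value pairs shown above.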