Distributed Processing of 8 Million People's "I Want to Eat" with Hadoop
- id:sasata299
- Ruby, Perl
- http://blog.livedoor.jp/sasata299/
1. Hadoop

2. Hadoop

3.

4.

5.
Hadoop
816
30         3   1
(   )
(   )
?
? GROUP BY        (
        (   `)

?                     7000   (
    )
!!
Hadoop
Hadoop
- Google MapReduce
- HDFS
[Figure: Mapper and Reducer diagram]
- Hadoop Streaming
- Ruby
- EC2, Hadoop (50)
- HDFS, S3 (s3fs)
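
Hadoop Streaming runs any executable that reads records on standard input and writes tab-separated key/value lines to standard output, so a Ruby script can act as a Mapper or a Reducer. The following is a minimal sketch of a Streaming-style Reducer, not the job from this talk: it assumes the Mapper emitted `key<TAB>count` lines and that the input arrives sorted by key, as Streaming delivers it. The method name `reduce_counts` and the count-summing task are hypothetical.

```ruby
# Minimal sketch of a Hadoop Streaming reducer (hypothetical example).
# Assumes sorted "key<TAB>count" lines on the input stream.
def reduce_counts(input, output)
  current_key = nil
  total = 0

  input.each_line do |line|
    key, count = line.chomp.split("\t", 2)
    next if key.nil? || count.nil? # skip blank or malformed lines

    if key != current_key
      # The key changed, so the previous group is complete: emit it.
      output.puts("#{current_key}\t#{total}") unless current_key.nil?
      current_key = key
      total = 0
    end
    total += count.to_i
  end

  # Emit the final group, if any input was seen.
  output.puts("#{current_key}\t#{total}") unless current_key.nil?
end
```

As a Streaming script this would be called as `reduce_counts($stdin, $stdout)`. Taking the streams as arguments keeps the logic testable without a cluster.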
[Figure: Mapper diagram]
HDFS
Mapper, Reducer
Hadoop            cat



`hadoop dfs -cat s3://xxx/user/root/in/hoge`
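require 'csv'

# Side data read through the Hadoop filesystem layer
path = 's3://xxx/user/root/in/user_info'
user_info = `hadoop dfs -cat #{path}`

ARGF.each_line do |line| # one input record per line on standard input
  line.chomp!
  csv = CSV.parse_line(line) # parse_line returns a single row as an array

  # process the record here, using user_info
end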
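
The pattern above (load side data once with `hadoop dfs -cat`, then process each input line) can be sketched more fully as follows. This is an illustration under stated assumptions, not the code from the talk: the column layouts (`user_id,attribute` for the side data and `user_id,keyword` for the input) and the names `build_lookup`, `map_line`, and `run_mapper` are all hypothetical.

```ruby
require 'csv'

# Build a user_id => attribute lookup from CSV text.
# Assumed side-data layout (hypothetical): user_id,attribute
def build_lookup(csv_text)
  CSV.parse(csv_text).each_with_object({}) do |(user_id, attribute), table|
    table[user_id] = attribute
  end
end

# Turn one input record into a "key<TAB>value" line, or nil to skip it.
# Assumed input layout (hypothetical): user_id,keyword
def map_line(line, lookup)
  user_id, keyword = CSV.parse_line(line.chomp)
  attribute = lookup[user_id]
  return nil if attribute.nil? || keyword.nil?

  "#{attribute}\t#{keyword}"
end

# Read records from input and write the mapped lines to output.
def run_mapper(input, output, lookup)
  input.each_line do |line|
    record = map_line(line, lookup)
    output.puts(record) if record
  end
end
```

In a Streaming job the lookup would come from the side file, for example `build_lookup(user_info)` with `user_info` read as in the slide, followed by `run_mapper(ARGF, $stdout, lookup)`. Loading the side data once per Mapper process, rather than once per record, keeps the number of `hadoop dfs -cat` calls small.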
7000   (   )
7000   (   )

30
Hadoop   !!
? Mapper, Reducer   HDFS
               (Hadoop     cat)

?
? DB