Elastic MapReduce

Hadoop EMR
• (@sasata299)
• NoSQL
• http://blog.livedoor.jp/sasata299/
Hadoop
• Hadoop on EC2 & S3
• Cloudera (CDH1)
• Hadoop Streaming (Ruby)
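
The slide above mentions Hadoop Streaming with Ruby. As a rough sketch of what such a job can look like (not the author's actual scripts), a word-count mapper and reducer might be written as follows. The method names `run_mapper`, `run_reducer`, and `simulate` are hypothetical; `simulate` only imitates Hadoop's sort between map and reduce so the pair can be tried locally.

```ruby
require "stringio"

# Mapper: emit "word<TAB>1" for every whitespace-separated token.
def run_mapper(input, output)
  input.each_line do |line|
    line.split.each { |word| output.puts "#{word}\t1" }
  end
end

# Reducer: Hadoop sorts mapper output by key before the reducer sees it,
# so equal keys arrive on adjacent lines and can be summed in one pass.
def run_reducer(input, output)
  current_key = nil
  count = 0
  input.each_line do |line|
    key, value = line.chomp.split("\t", 2)
    if key != current_key
      output.puts "#{current_key}\t#{count}" unless current_key.nil?
      current_key = key
      count = 0
    end
    count += value.to_i
  end
  output.puts "#{current_key}\t#{count}" unless current_key.nil?
end

# Local dry run (assumption: a plain sort stands in for Hadoop's shuffle).
def simulate(text)
  mapped = StringIO.new
  run_mapper(StringIO.new(text), mapped)
  shuffled = mapped.string.lines.sort.join
  reduced = StringIO.new
  run_reducer(StringIO.new(shuffled), reduced)
  reduced.string
end
```

In a real streaming job the mapper and reducer would live in separate scripts, each calling its method with `STDIN` and `STDOUT`.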
• master ssh
• Hadoop (HADOOP-6254)
    • S3 cpu
    • S3
SocketTimeoutException
HADOOP-6254
Elastic MapReduce !!
https://issues.apache.org/jira/browse/HADOOP-6254
HADOOP-6254
Cloudera (CDH2) !!
http://archive.cloudera.com/cdh/2/hadoop-0.20.1+169.88.releasenotes.html
Elastic MapReduce (EMR)
• EC2, S3
• GUI, CUI
EMR CDH2
[Table: rows EMR and CDH2; column headers UP and AMI (Amazon Machine Image)]
EMR !!
(eHarmony)
EMR
    Bootstrap Action
    Step (Hadoop Job)
  Job Flow
elastic-mapreduce
  --create
  --num-instances 10 # master: 1, slave: 9
  --bootstrap-action s3n://xxx/hoge.sh
  --alive

Created job flow j-8IXS98OW1WEE (job flow ID)
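
The options listed above can be put together into a single command line. The following is only a sketch, assuming the options are passed exactly as the slide lists them; the helper names `create_job_flow_args` and `parse_job_flow_id` are hypothetical, and the `s3n://xxx/hoge.sh` path is the placeholder from the slide.

```ruby
require "shellwords"

# Build the argument list for creating a job flow, using only the
# options shown on the slide. Helper name and keywords are illustrative.
def create_job_flow_args(num_instances:, bootstrap_action:, alive: true)
  args = ["elastic-mapreduce", "--create",
          "--num-instances", num_instances.to_s,
          "--bootstrap-action", bootstrap_action]
  args << "--alive" if alive
  args
end

# Pull the job flow ID out of a "Created job flow j-..." line.
# Returns nil when the line does not match.
def parse_job_flow_id(cli_output)
  cli_output[/Created job flow (j-\w+)/, 1]
end

args = create_job_flow_args(num_instances: 10,
                            bootstrap_action: "s3n://xxx/hoge.sh")
puts Shellwords.join(args)
```

Keeping the arguments as an array until the last moment avoids quoting problems if a path ever contains spaces.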
elastic-mapreduce
  --stream # Hadoop streaming
  --input, --output, --mapper, --reducer
  --cache s3n://xxx/fuga.rb
  --jobconf xxx=yyy
  --jobflow j-xxxxx # job flow ID
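
As a sketch of how the streaming options above might combine into one step submission, the helper below assembles the argument list for an existing job flow. The name `streaming_step_args` and all S3 paths other than `s3n://xxx/fuga.rb` are hypothetical, and the exact value format each option expects is an assumption taken from the slide rather than from the CLI documentation.

```ruby
# Build the argument list for adding a Hadoop Streaming step to a
# running job flow. Only options shown on the slide are used.
def streaming_step_args(job_flow_id:, input:, output:, mapper:, reducer:,
                        cache: nil, jobconf: {})
  args = ["elastic-mapreduce", "--jobflow", job_flow_id, "--stream",
          "--input", input, "--output", output,
          "--mapper", mapper, "--reducer", reducer]
  # Optional file to distribute to the task nodes.
  args.push("--cache", cache) if cache
  # Each jobconf entry becomes its own "--jobconf key=value" pair.
  jobconf.each { |key, value| args.push("--jobconf", "#{key}=#{value}") }
  args
end

step = streaming_step_args(
  job_flow_id: "j-xxxxx",
  input: "s3n://xxx/input",        # hypothetical path
  output: "s3n://xxx/output",      # hypothetical path
  mapper: "s3n://xxx/mapper.rb",   # hypothetical path
  reducer: "s3n://xxx/reducer.rb", # hypothetical path
  cache: "s3n://xxx/fuga.rb",
  jobconf: { "xxx" => "yyy" }
)
puts step.join(" ")
```

Because the job flow was created with `--alive`, further steps can be submitted against the same ID in the same way.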
• --alive
• AMI
    • Cloudera AMI
    • Bootstrap Action
• mapred.child.java.opts
• Java
• Streaming
• ElasticMapReduce-master 5100
• EMR Hadoop
• EMR
• --alive
