Method for Monitoring and Profiling of Hadoop using AspectJ

Yusuke Shimizu, Kouhei Sakurai, Satoshi Yamane
Graduate School of Natural Science & Technology, Kanazawa University

PRDC2012@TOKIMESSE

Introduction
The use of large-scale distributed systems is increasing.

A large-scale distributed system is ...

"Flexible and available architecture for large
scale computation and data processing on a
network of commodity hardware"
[-- P. Julio, 2009]

- e.g. Apache Hadoop
For Dependable Distributed Systems ...
We have to consider and deal with ...
- Non-deterministic network
- Fault tolerance
- Incomprehensible users

Relying only on advance, static analysis or verification is difficult.

We also need runtime monitoring and analysis.
How to monitor and debug

The general methods of debugging or monitoring Hadoop are ...
- logging text messages
- checking metrics via Web interfaces, Ganglia, etc.
There are difficulties and requirements

The general methods of debugging or monitoring Hadoop are ...
- logging text messages
  → Difficult with a huge number of nodes
- checking metrics via Web interfaces, Ganglia, etc.
  → Aimed at operators; not enough for developers
Introduction

 Proposal

   1. The Method Level Monitor

   2. The Adaptive Profiling

- Provide effective information for development
- Help developers to understand system behaviors
  and specifications
Outline of Talk

Introduction
- Distributed system's difficulty
Proposal
- Monitor
- Profile Method
Experimental Results & Conclusion
2. PROPOSALS

The Runtime Monitor & The Adaptive Profiling Method
Outline of Proposed System
- Hadoop: MapReduce, HDFS, RPC
- Monitor: record Trace using AspectJ
- Profile: count up frequency of instructions
Monitor

- observe the system behavior at runtime
- log executed instructions passively = make a "Trace"
  - using AspectJ
    - AspectJ is an implementation of "Aspect Oriented Programming" using Java
  - no modification of applications is needed
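The idea of recording a trace without touching application code can be sketched in plain Java. This is not AspectJ and not the authors' monitor: it swaps in a JDK dynamic proxy (java.lang.reflect.Proxy) as a rough stand-in for an AspectJ advice. The names BlockService, monitor, and TRACE are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class TraceSketch {
    // Hypothetical stand-in for a service interface of the monitored system.
    interface BlockService {
        String getBlockLocations(String path);
    }

    // The "Trace": one line per intercepted call, in call order.
    static final List<String> TRACE = new ArrayList<>();

    // Wraps a target so that every interface call is logged before it runs.
    // The target class itself is not modified.
    @SuppressWarnings("unchecked")
    static <T> T monitor(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            TRACE.add(System.currentTimeMillis() + " "
                    + iface.getSimpleName() + "." + method.getName());
            return method.invoke(target, args);
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        BlockService plain = path -> "blocks-of:" + path;
        BlockService monitored = monitor(BlockService.class, plain);
        System.out.println(monitored.getBlockLocations("/data/input"));
        System.out.println(TRACE);
    }
}
```

A proxy only sees calls made through an interface. AspectJ weaving can also reach concrete classes and internal calls, which is presumably why the authors chose it.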
Architecture of Hadoop & Monitor

[Figure: the Master runs the NameNode and the JobTracker; each Slave runs a DataNode (holding data blocks) and a TaskTracker (running Map and Reduce tasks). Nodes communicate via RPC. A Monitor is attached to the Master and to each Slave.]

Master's Trace
- NameNode Trace
- JobTracker Trace
- RPC Trace

Slaves' Trace
- DataNode Trace
- TaskTracker Trace
- RPC Trace
Method of Profiling

- based on frequency of instructions
- count up instructions involved in the "Trace"
- count up at each grain
  - each node
    - each process
      - each method
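Counting at each grain can be sketched as follows. This is a minimal illustration, assuming each trace entry carries a node name, a process (daemon) name, and a method name. The Entry layout, the Grain enum, and the "/"-joined keys are assumptions made for the example, not the authors' data format.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class GrainCounter {
    // Hypothetical trace entry: which node, which process, which method.
    static final class Entry {
        final String node;
        final String process;
        final String method;

        Entry(String node, String process, String method) {
            this.node = node;
            this.process = process;
            this.method = method;
        }
    }

    enum Grain { NODE, PROCESS, METHOD }

    // Builds the counting key for one entry at the requested grain.
    static String key(Entry e, Grain grain) {
        switch (grain) {
            case NODE:
                return e.node;
            case PROCESS:
                return e.node + "/" + e.process;
            default:
                return e.node + "/" + e.process + "/" + e.method;
        }
    }

    // Counts how many trace entries fall under each key.
    static Map<String, Integer> count(List<Entry> trace, Grain grain) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Entry e : trace) {
            counts.merge(key(e, grain), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Entry> trace = Arrays.asList(
                new Entry("sirius", "namenodetrace", "NameNode.sendHeartbeat"),
                new Entry("sirius", "namenodetrace", "NameNode.getFileInfo"),
                new Entry("sirius", "rpctrace", "ipc.Client.call"));
        for (Grain g : Grain.values()) {
            System.out.println(g + " " + count(trace, g));
        }
    }
}
```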
3. EXPERIMENT

Benchmark on the impact of the Monitor & do Profiling & Visualize the profiling results
Benchmark - the impact of Monitor

Throughput [MB/sec] = Data size / Elapsed time

Data size [GB]  Monitor  Elapsed time            Throughput [MB/sec]  Throughput vs. no monitor  Trace size [MB]
1               on       2m 25s (145 sec)        6.9                  84.1%                      2.4
1               off      2m 2s (122 sec)         8.2                  -                          0
10              on       8m 45s (525 sec)        19.0                 88.3%                      3.6
10              off      7m 45s (465 sec)        21.5                 -                          0
100             on       1h 21m 54s (4,914 sec)  20.4                 96.2%                      31.6
100             off      1h 18m 37s (4,717 sec)  21.2                 -                          0

Benchmark program: "terasort", a sample sorting program using MapReduce.
Trace size increases by 6.43 KB/sec.
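The arithmetic behind these figures can be checked with a short sketch. It assumes 1 GB = 1000 MB and 1 MB = 1000 KB, the convention that reproduces the reported values. The class and method names are invented for the example. Percentages computed from the raw elapsed times can differ by a few tenths from the slide, which appears to use the rounded throughputs.

```java
final class BenchmarkMath {
    // Throughput in MB/sec, assuming 1 GB = 1000 MB.
    static double throughputMBps(double dataGB, double elapsedSec) {
        return dataGB * 1000.0 / elapsedSec;
    }

    // Throughput with the monitor as a percentage of throughput without it.
    static double relativePercent(double withMonitor, double withoutMonitor) {
        return 100.0 * withMonitor / withoutMonitor;
    }

    // Average trace growth in KB/sec, assuming 1 MB = 1000 KB.
    static double traceGrowthKBps(double traceMB, double elapsedSec) {
        return traceMB * 1000.0 / elapsedSec;
    }

    public static void main(String[] args) {
        double on = throughputMBps(1, 145);   // 1 GB run with the monitor
        double off = throughputMBps(1, 122);  // 1 GB run without the monitor
        System.out.println(String.format("1 GB: %.1f vs %.1f MB/sec (%.1f%%)",
                on, off, relativePercent(on, off)));
        // Trace growth over the 100 GB monitored run (31.6 MB in 4,914 sec).
        System.out.println(String.format("trace growth: %.2f KB/sec",
                traceGrowthKBps(31.6, 4914)));
    }
}
```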
A Part of Profiling
Statistics of the last 10 seconds on the master:
   Tue Nov 13 12:30:08 JST 2012
from 1352777408766 until 10000 after
HOSTNAME ::> DAEMON & PROCESS = { METHODS }
--------------------------
sirius:177 ::>>
  [namenodetrace : 23, jobtrackertrace : 41, datanodetrace : 0,
tasktrackertrace : 0, rpctrace : 113]
 ={
! hdfs.server.namenode.CorruptReplicasMap.numCorruptReplicas=5
! hdfs.server.namenode.FSNamesystem.getBlockLocations=3
! hdfs.server.namenode.FSNamesystem.getDatanode=1
! hdfs.server.namenode.NameNode.getBlockLocations=4
! hdfs.server.namenode.NameNode.getFileInfo=2
! hdfs.server.namenode.NameNode.sendHeartbeat=2
! hdfs.server.namenode.NameNode.verifyVersion=3
! hdfs.server.namenode.UnderReplicatedBlocks.BlockIterator.hasNext=2
! hdfs.server.namenode.UnderReplicatedBlocks.BlockIterator.next=1
! ipc.Client.Connection.PingInputStream.read=4
! ipc.Client.Connection.sendParam=2
! ipc.Client.call=1
! ipc.ConnectionHeader.readFields=4
Node Level Profiling
Node Level Profiling is
-- profiling by aggregating the frequencies of instructions within each node per unit time.

[Figure: number of occurrences (axis 0 to 800) versus time(s) (axis up to 6420), one series per node: 192.168.1.10, 192.168.1.11, 192.168.1.12, 192.168.1.13, 192.168.1.14, 192.168.1.15]
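Aggregating per node and per unit time can be sketched as a simple bucketing step. This is an illustration only: the Event type, the millisecond timestamps, and the fixed-width buckets are assumptions, and timestamps are taken to be non-negative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class NodeLevelProfile {
    // Hypothetical trace event: the node it came from and when it happened.
    static final class Event {
        final String node;
        final long timeMillis;

        Event(String node, long timeMillis) {
            this.node = node;
            this.timeMillis = timeMillis;
        }
    }

    // Returns node -> (bucket start in ms -> number of events in that bucket).
    static Map<String, TreeMap<Long, Integer>> profile(List<Event> events, long unitMillis) {
        Map<String, TreeMap<Long, Integer>> result = new TreeMap<>();
        for (Event e : events) {
            // Round the timestamp down to the start of its time unit.
            long bucketStart = (e.timeMillis / unitMillis) * unitMillis;
            result.computeIfAbsent(e.node, k -> new TreeMap<>())
                  .merge(bucketStart, 1, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("192.168.1.10", 1_000L),
                new Event("192.168.1.10", 9_500L),
                new Event("192.168.1.10", 12_000L),
                new Event("192.168.1.11", 3_000L));
        // 10-second units, matching the 10-second statistics window in the sample output.
        System.out.println(profile(events, 10_000L));
    }
}
```

Each inner map is one series of the kind plotted in the figure: time on the x axis, number of occurrences on the y axis.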
Process Level Profiling about MASTER

Process Level Profiling is
-- profiling by aggregating the frequencies of instructions of each process within each node per unit time.

[Figure: Master. Number of occurrences (axis 0 to 400) versus time(s) (axis up to 6420), with series rpc, jobtracker, namenode]
Process Level Profiling about Slaves

[Figure: one panel per slave (192.168.1.11, 192.168.1.12, 192.168.1.13, 192.168.1.14, 192.168.1.15). Number of occurrences versus time(s) (axis up to 6420), with series rpctrace, tasktrackertrace, datanodetrace. The time axis is divided into a Map phase and a Reduce phase.]

Annotations on the figure:
- There are free resources; speculative executions should be done.
- Imbalance of RPC
Conclusion
Summary
- Proposal
  - the lightweight method-level monitor using AspectJ
  - the profiling method based on frequency of instructions
- Provide effective information for development
- Help developers to understand system behaviors and specifications
Future work
- Create an algorithm that determines the degree of deviation from the profiling results, to indicate the possibility of failure.
Thank you for your kind attention
