Off the Grid
Introduction to Grid Computing with GridGain



                   QJUG
                February 2007


    Tom Adams            Nick Partridge
    Workingmouse       Veitch Lister Consulting
Why are we here?
Large distributed application
Grid-based solution worked
Flow
Grid?
  Multiple independent computing clusters which act like a "grid" (Wikipedia)
  Many nodes, each node is indistinguishable from other nodes
Complete machines over co-located CPUs?
Multiple processes?
Commodity hardware?
Homogeneous machines?
A tale of two grids
Partition data across grid
Partition processing across grid
http://www.jroller.com/nivanov/entry/grid_computing_compute_grid_data
Selection
Requirements
  Callable from a Rails webapp
Real-time - synchronous responses less than 30 seconds
Large dataset - 100 GB (computation runs across all data)
Rails webapp
   Simple document-literal web service
 Ruby - soap4r
 Java - GlassFish, Spring-WS
Not really interesting for this talk... see Brisbane.rb
Data
  Read-only
Full control
45 TB (became 100 GB with pre-processing)
SQL? 3 tables, one query w/ 2 joins
Don't want to roll our own
(Row) database good enough
And we can federate them
Result?
http://battellemedia.com/archives/2007_01.php
What about BigTable?
Column database
Result?
http://failblog.wordpress.com/2008/01/29/satellite/
Where are we?
Progress
  Don't need to distribute data, so no data grid
No off-the-shelf solutions that scale/go fast
Understand data better, so happy to roll our own as fallback
Data solution
Data
  CSV files on filesystem (now binary)
Directories form indices
Data files broken up into chunks
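A minimal sketch of how "directories form indices" could work, assuming a layout the slides do not spell out: each index key is a directory level, and the leaf directory holds the chunk files. The class name `ChunkIndex`, the keys `brisbane`/`2007` and the `chunk-NNNN.csv` file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical layout: <root>/<key1>/<key2>/.../chunk-0000.csv, chunk-0001.csv, ...
final class ChunkIndex {

    // Walk one directory level per key, then return the chunk files found
    // in the leaf directory, sorted by path. No match means an empty list.
    static List<Path> chunksFor(Path root, String... keys) {
        Path dir = root;
        for (String key : keys) {
            dir = dir.resolve(key);
        }
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        List<Path> chunks = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile).sorted().forEach(chunks::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway layout and look it up by key.
        Path root = Files.createTempDirectory("grid-data");
        Path leaf = Files.createDirectories(root.resolve("brisbane").resolve("2007"));
        Files.write(leaf.resolve("chunk-0001.csv"), "c,d\n".getBytes(StandardCharsets.UTF_8));
        Files.write(leaf.resolve("chunk-0000.csv"), "a,b\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(chunksFor(root, "brisbane", "2007"));
    }
}
```

The lookup is just path resolution, so a read-only dataset needs no separate index structure; each chunk file is also a natural unit of work to hand to a node.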
What about the code?
http://giapet.net/wp-content/uploads/2007/05/luluwtf.gif
Need to distribute the
    computation
Options?
Erlang
Scala
Java
Java frameworks
  Hadoop
GridGain
Oracle Coherence
GigaSpaces
Terracotta
JavaSpaces/Jini
Shoal
GridGain
GridGain
  fully open source full-stack grid computing platform for Java
Map/reduce-based computation
Easy to setup and use
Can be extended via SPI implementations
Just works
≒Scalable (we've had it up to 32 nodes)
Map/reduce
When does it work
  When data is independent (pure/referentially transparent)
When data can be combined (reduce) based solely on input
[Figure: word count as split, map, reduce]
Input:  foo bar | bar baz | quux bar | baz bar
split:  foo, bar, bar, baz, quux, bar, baz, bar
map:    foo:1, bar:1, bar:1, baz:1, quux:1, bar:1, baz:1, bar:1
reduce: foo: 1, bar: 4, baz: 2, quux: 1
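The split, map and reduce steps of this word count can be sketched in plain Java on a single machine. This is an illustration only: the class `WordCount` and its method names are made up here and use no GridGain classes.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: single-machine word count in three explicit steps.
final class WordCount {

    // split: break every input line into its words.
    static List<String> split(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            words.addAll(Arrays.asList(line.trim().split("\\s+")));
        }
        return words;
    }

    // map: turn each word into a (word, 1) pair, independently of the others.
    static List<Map.Entry<String, Integer>> map(List<String> words) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : words) {
            pairs.add(new AbstractMap.SimpleImmutableEntry<>(word, 1));
        }
        return pairs;
    }

    // reduce: combine the pairs into one total per word,
    // using nothing but the pairs themselves.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("foo bar", "bar baz", "quux bar", "baz bar");
        // prints {foo=1, bar=4, baz=2, quux=1}
        System.out.println(reduce(map(split(lines))));
    }
}
```

Because `map` looks at one word at a time and `reduce` depends only on its input, the map step can run anywhere without coordination, which is the property the "When does it work" slide asks for.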
GridGain grid
[Figure: the four input lines (foo bar, bar baz, quux bar, baz bar) go into the Grid; the counts foo: 1, bar: 4, baz: 2, quux: 1 come out.]
[Figure: the same input and output with the grid shown as "?". One Node turns foo bar, bar baz into foo: 1, bar: 2, baz: 1; another Node turns quux bar, baz bar into bar: 2, baz: 1, quux: 1.]
[Figure: the same picture with a Master Node between the input and the output, above the two Nodes.]
[Figure: two Master Nodes. One holds foo bar, bar baz and produces foo: 1, bar: 2, baz: 1; the other holds quux bar, baz bar and produces bar: 2, baz: 1, quux: 1. The four lines are spread across the two Nodes below.]
Did you say map/reduce?
[Figure: the same grid with the steps labelled. Each Node runs map on its lines (foo bar, bar baz gives foo: 1, bar: 2, baz: 1; quux bar, baz bar gives bar: 2, baz: 1, quux: 1). The Master Node runs reduce on the partial counts to give foo: 1, bar: 4, baz: 2, quux: 1.]
Show me the types!
[Figure: the same grid annotated with types]
Node:        map[A, B](List[A], A => B) => List[B]
Master Node: reduce[B, C](List[B], C, (C, B) => C) => List[C]
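The two signatures on this slide can be written as generic Java methods. This is a sketch, not part of any library: `MapReduceTypes` is a made-up name, and `reduce` here returns a single `C` (the usual fold shape) where the slide's signature ends in `List[C]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Illustrative only: the slide's type signatures as plain Java generics.
final class MapReduceTypes {

    // map[A, B](List[A], A => B) => List[B]
    static <A, B> List<B> map(List<A> input, Function<A, B> f) {
        List<B> output = new ArrayList<>();
        for (A a : input) {
            output.add(f.apply(a));
        }
        return output;
    }

    // reduce[B, C](List[B], C, (C, B) => C), returning one C in this sketch.
    static <B, C> C reduce(List<B> input, C initial, BiFunction<C, B, C> combine) {
        C accumulator = initial;
        for (B b : input) {
            accumulator = combine.apply(accumulator, b);
        }
        return accumulator;
    }

    public static void main(String[] args) {
        List<Integer> lengths = map(Arrays.asList("foo", "bar", "quux"), String::length);
        int total = reduce(lengths, 0, Integer::sum);
        // prints [3, 3, 4] -> 10
        System.out.println(lengths + " -> " + total);
    }
}
```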
Terminology
[Figure: the input lines are the Task and the final counts are the Result. The Master Node sits between them; each Node's share of the lines is a Job (foo bar, bar baz on one Node; quux bar, baz bar on the other).]
[Figure: the same Task split into four Jobs, one per line, two Jobs on each of the two Nodes.]
[Figure: the same four Jobs (bar baz, foo bar, quux bar, baz bar) on four Nodes, one Job each.]
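The Task, Job and Result vocabulary can be sketched in plain Java, with a thread pool standing in for the nodes. None of this is the GridGain API: `TaskSketch`, `job` and `runTask` are hypothetical names, and a real grid would ship the jobs to other machines rather than to threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a Task is split into Jobs, Jobs run on "nodes"
// (threads here), and the master reduces the job results into the Result.
final class TaskSketch {

    // One Job: count the words of a single line.
    static Callable<Map<String, Integer>> job(String line) {
        return () -> {
            Map<String, Integer> counts = new TreeMap<>();
            for (String word : line.trim().split("\\s+")) {
                counts.merge(word, 1, Integer::sum);
            }
            return counts;
        };
    }

    // The Task: one Job per line, executed on `nodes` workers, then reduced.
    static Map<String, Integer> runTask(List<String> lines, int nodes) {
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Callable<Map<String, Integer>>> jobs = new ArrayList<>();
            for (String line : lines) {
                jobs.add(job(line));
            }
            Map<String, Integer> result = new TreeMap<>();
            for (Future<Map<String, Integer>> done : pool.invokeAll(jobs)) {
                done.get().forEach((word, n) -> result.merge(word, n, Integer::sum));
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("task interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> task = Arrays.asList("foo bar", "bar baz", "quux bar", "baz bar");
        // prints {bar=4, baz=2, foo=1, quux=1}
        System.out.println(runTask(task, 4));
    }
}
```

Changing the `nodes` argument from 2 to 4 mirrors the move from two Jobs per Node to one Job per Node; the Result is the same either way.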
What defines a grid?
[Figure: two separate grids, one per IP multicast group. Three Nodes share IP MCast 228.1.2.4; three other Nodes share IP MCast 228.1.2.5.]
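The idea that nodes on the same multicast group form one grid can be modelled in a few lines of Java. This is only a model under stated assumptions: it groups hypothetical node names by their configured group address and does no networking or discovery, and `GridMembership` is not a GridGain class.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: nodes configured with the same multicast group
// are treated as members of the same grid.
final class GridMembership {

    // True if the text is an IP literal in the multicast range.
    static boolean isMulticast(String address) {
        try {
            return InetAddress.getByName(address).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // nodeToGroup maps a node name to its multicast group address;
    // the result maps each group address to the nodes that share it.
    static Map<String, List<String>> grids(Map<String, String> nodeToGroup) {
        Map<String, List<String>> grids = new TreeMap<>();
        for (Map.Entry<String, String> entry : nodeToGroup.entrySet()) {
            String group = entry.getValue();
            if (!isMulticast(group)) {
                throw new IllegalArgumentException("not a multicast address: " + group);
            }
            grids.computeIfAbsent(group, g -> new ArrayList<>()).add(entry.getKey());
        }
        return grids;
    }

    public static void main(String[] args) {
        Map<String, String> nodes = new LinkedHashMap<>();
        nodes.put("a", "228.1.2.4");
        nodes.put("b", "228.1.2.4");
        nodes.put("c", "228.1.2.4");
        nodes.put("d", "228.1.2.5");
        nodes.put("e", "228.1.2.5");
        nodes.put("f", "228.1.2.5");
        // prints {228.1.2.4=[a, b, c], 228.1.2.5=[d, e, f]}
        System.out.println(grids(nodes));
    }
}
```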
Failover
[Figure: the Master Node has split the Task into four Jobs (bar baz, foo bar, quux bar, baz bar), one on each of four Nodes.]
[Figure: the second Node fails (X).]
[Figure: its Job, foo bar, moves to the first Node, which now holds bar baz and foo bar.]
[Figure: the first Node fails as well (X X). The two surviving Nodes hold all four Jobs between them, and the output is still foo: 1, bar: 4, baz: 2, quux: 1.]
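The failover sequence in these figures can be sketched as a reassignment of Jobs from a failed node to the survivors. The placement policy below (least-loaded surviving node, first one wins a tie) is an assumption made for the sketch; the slides do not say how GridGain chooses the target, and `FailoverSketch` is a made-up name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: when a node fails, its jobs are handed to the
// surviving nodes so the task can still complete.
final class FailoverSketch {

    // assignment maps node name -> jobs on that node. Returns a new
    // assignment without failedNode, with its jobs moved elsewhere.
    static Map<String, List<String>> failover(Map<String, List<String>> assignment, String failedNode) {
        Map<String, List<String>> next = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : assignment.entrySet()) {
            if (!entry.getKey().equals(failedNode)) {
                next.put(entry.getKey(), new ArrayList<>(entry.getValue()));
            }
        }
        List<String> orphaned = assignment.getOrDefault(failedNode, Collections.emptyList());
        if (next.isEmpty() && !orphaned.isEmpty()) {
            throw new IllegalStateException("no surviving node to take over");
        }
        for (String job : orphaned) {
            // Assumed policy: pick the surviving node with the fewest jobs.
            String target = null;
            for (String node : next.keySet()) {
                if (target == null || next.get(node).size() < next.get(target).size()) {
                    target = node;
                }
            }
            next.get(target).add(job);
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, List<String>> grid = new LinkedHashMap<>();
        grid.put("node1", Collections.singletonList("bar baz"));
        grid.put("node2", Collections.singletonList("foo bar"));
        grid.put("node3", Collections.singletonList("quux bar"));
        grid.put("node4", Collections.singletonList("baz bar"));

        Map<String, List<String>> afterFirst = failover(grid, "node2");
        // prints {node1=[bar baz, foo bar], node3=[quux bar], node4=[baz bar]}
        System.out.println(afterFirst);

        Map<String, List<String>> afterSecond = failover(afterFirst, "node1");
        // prints {node3=[quux bar, bar baz], node4=[baz bar, foo bar]}
        System.out.println(afterSecond);
    }
}
```

No Job is lost at either step, so the reduce on the master still sees every line and produces the same counts.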
Task execution
http://www.gridgain.com/javadoc/org/gridgain/grid/GridTask.html
GridGain demo
The good, the bad, the ugly
Good: just works, fast, easy, extensible, scalable
Bad: error messages, doco, code quality, coupling, odd APIs, management overview
Ugly: nomenclature, JMS?
References
  http://wiki.workingmouse.com/
http://www.gridgain.com/
http://labs.google.com/papers/mapreduce.html
