



      Powering the Britain's Got Talent
      buzzer*
       *And Big Data



                        Big Data Meetup, London 25/5/2011


Thursday, 26 May 2011




      What we do








      Me




            Malcolm Box, Co-founder & CTO

            boxm@livetalkback.com

            @malcolmbox








      The Buzzer




                        BIG DATA








      The challenge



            10 Million+ viewers

            Design goal of 50,000 requests/s, 10,000 buzzes/second

                  Equivalent to ~130 billion requests/month if sustained

            But just on Saturday night

            And four weeks to build
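
As a rough check of the monthly figure, the design goal can be multiplied out. This is only a sketch: it assumes a 30-day month and a constant 50,000 requests/s, and the slide does not say how the figure was actually derived.

```python
# Back-of-envelope check of the "130 billion requests/month" figure.
REQUESTS_PER_SECOND = 50_000      # design goal stated on the slide
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400
DAYS_PER_MONTH = 30               # assumption, not stated in the talk

requests_per_month = REQUESTS_PER_SECOND * SECONDS_PER_DAY * DAYS_PER_MONTH
print(f"{requests_per_month:,} requests/month")  # 129,600,000,000 requests/month
```

In other words, the peak rate is what a site serving roughly 130 billion requests a month would see on average, even though the real load only arrives on Saturday night.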








      The challenge




                                      Where do 130
                                     billion requests fit?




                 Source: http://www.google.com/adplanner/static/top1000/#





      Where we started...

            app.livetalkback.com: ELB -> 2 x Webserver (Django on Ubuntu) -> MySQL

            cdn.livetalkback.com: CloudFront -> S3

            Control plane: Zabbix








      Step 1: Testing




            Started with a platform with a previous peak of 100 requests/s

            No idea where it would break

            Load testing with Tsung! http://tsung.erlang-projects.org/








      Step 2: ELB



            Amazon Elastic Load Balancer

            Infinite capacity

            BUT slow to scale up under a sudden spike (very long impulse response) and NO controls :(


            HAProxy to the rescue

                  5K requests/s per node
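
A back-of-envelope sizing for the HAProxy tier, using the 5K requests/s per-node figure above and the 50,000 requests/s design goal. The function name and the `headroom` parameter are illustrative assumptions, not something from the talk.

```python
import math


def nodes_needed(target_rps: int, per_node_rps: int, headroom: float = 0.0) -> int:
    """Return how many load-balancer nodes cover target_rps.

    headroom is an optional safety margin (0.5 means 50% spare capacity);
    it is a hypothetical knob added for illustration.
    """
    return math.ceil(target_rps * (1 + headroom) / per_node_rps)


print(nodes_needed(50_000, 5_000))                # 10
print(nodes_needed(50_000, 5_000, headroom=0.5))  # 15
```

The result for the design goal is 10 nodes before any headroom is added.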








      Step 3: Avoid the DB



            MySQL was never going to handle 10,000 writes/s, nor 50,000 reads/s

            Hey, Django does memcached. Problem solved

            Help, our memcached server I/O is maxed out :(

            Two-layer cache: https://gist.github.com/953524

            Write-behind data
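
The two ideas at the end of this slide can be sketched as follows. This is not the code from the linked gist, which is not reproduced here; it is a minimal illustration under stated assumptions. All class and parameter names are hypothetical, and a plain dictionary stands in for the memcached client so the example runs on its own. The first layer is a short-lived in-process cache that absorbs repeated reads before they reach the shared cache; the write-behind buffer queues records in memory and hands them to the slow store in batches.

```python
import time


class CountingDictCache:
    """Stand-in for a memcached client (assumption); counts reads that reach it."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class TwoLayerCache:
    """Short-lived in-process cache in front of a shared cache."""

    def __init__(self, shared, local_ttl=1.0, clock=time.monotonic):
        self.shared = shared        # any object with get(key) and set(key, value)
        self.local_ttl = local_ttl  # seconds an entry may be served locally
        self.clock = clock
        self.local = {}             # key -> (expires_at, value)

    def get(self, key):
        entry = self.local.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]                   # layer 1 hit: no network I/O
        value = self.shared.get(key)          # layer 2: shared cache
        if value is not None:
            self.local[key] = (self.clock() + self.local_ttl, value)
        return value

    def set(self, key, value):
        self.shared.set(key, value)
        self.local[key] = (self.clock() + self.local_ttl, value)


class WriteBehindBuffer:
    """Queue writes in memory and flush them to the slow store in batches."""

    def __init__(self, flush_batch, batch_size=1000):
        self.flush_batch = flush_batch  # callable that persists a list of records
        self.batch_size = batch_size
        self.pending = []

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, []
            self.flush_batch(batch)
```

The trade-offs are the usual ones: the local layer can serve data that is up to `local_ttl` seconds stale, and records still sitting in a write-behind buffer are lost if the process dies before a flush.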








      But we want analytics!




            Now 10K things to write to disk every second

            Logging? Database?

            This is starting to look like BIG DATA








      Step 4: Baby








      Step 5: Cassandra




            Deployed Cassandra cluster on EC2 to handle buzz records

            Tested to > 10K writes/s

            All good!

            So how many users did we have last night?








      Where we ended...

            app.livetalkback.com: HAProxy (10 nodes) -> Webservers (Django on Ubuntu, 100+ nodes) -> Memcached, Cassandra, RDS Master

            cdn.livetalkback.com: CloudFront -> S3

            Control plane: Chef, Zabbix








      Scaling up - and down


            Configuring 100+ servers by hand each week would have been a pain

            Used Chef to automate

            Also builds the test swarm

            http://wiki.opscode.com/display/chef/Home








      Now what?




            Still challenges with analytics & ad-hoc queries

                  Looking at Brisk and Hadoop

            We're sucking the Twitter firehose for Tellybug

                  MySQL is coping so far, but only just








      Questions?
       boxm@livetalkback.com

       @malcolmbox




