How to handle
                                 cloud failure


Grzegorz Kochan | grzegorz@adtaily.com   1
How to cope with failures in the cloud



AdTaily
   on Amazon AWS


   1.5 billion widget pageviews monthly
   100,000 ad clicks daily
   35 thousand registered publishers
   15 thousand advertisers
   over 1,500 requests per second
   over 150 Mbit of data per second

Startup in the cloud
   why?




   Availability
   Pricing
   API
   Scalability
   Simplicity


But things
   can go wrong
   TechCrunch:

   "Amazon EC2 goes down, taking with it Reddit,
   Foursquare and Quora" - April 2011

   "Down Goes The Internet... Again. Amazon EC2
   Outage Takes Down Foursquare, Instagram,
   Quora, Reddit, Etc" - August 2011


Design for failure


Amazon AWS
   Geographical map

   US - Oregon (us-west-2)
   Availability Zones: us-west-2a, us-west-2b, us-west-2c

   EU - Ireland (eu-west-1)
   Availability Zones: eu-west-1a, eu-west-1b, eu-west-1c

   Asia Pacific - Tokyo (ap-northeast-1)
   Availability Zones: ap-northeast-1a, ap-northeast-1b

   US - N. California (us-west-1)
   Availability Zones: us-west-1a, us-west-1b

   US - Virginia (us-east-1)
   Availability Zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d

   Asia Pacific - Singapore (ap-southeast-1)
   Availability Zones: ap-southeast-1a, ap-southeast-1b


Replicate,
   duplicate and balance

   Single server setup
   Multi server setup (Availability Zone us-east-1a)
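
The "balance" part of this slide can be illustrated with a tiny round-robin balancer that skips servers marked as down. This is only a sketch: the class name `RoundRobinBalancer` and the server names are hypothetical, and a real setup would more likely rely on a managed load balancer with its own health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def next_server(self):
        # Look at each server at most once per call, then give up.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy servers left")


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
balancer.mark_down("web-2")
picks = [balancer.next_server() for _ in range(4)]
print(picks)
```

With a single server, one failure takes the service down; with several replicas behind a balancer, traffic simply flows to the remaining ones.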




Distribute
   multi A-Zone architecture

    Availability Zone                        Availability Zone
    us-east-1a                                     us-east-1b




Distribute more
   multi Region architecture


   US East Region                             US West Region
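
One way to picture the step from multiple zones to multiple regions is a selection routine that prefers a healthy zone in the home region and only then falls back to another region. The sketch below assumes a hypothetical `TOPOLOGY` table and a `failed_zones` set supplied by some external health check; neither comes from the slides.

```python
# Hypothetical topology: region name -> Availability Zones in use.
TOPOLOGY = {
    "us-east-1": ["us-east-1a", "us-east-1b"],
    "us-west-1": ["us-west-1a", "us-west-1b"],
}


def pick_zone(topology, preferred_region, failed_zones):
    """Return (region, zone), trying the preferred region first."""
    regions = [preferred_region] + [r for r in topology if r != preferred_region]
    for region in regions:
        for zone in topology[region]:
            if zone not in failed_zones:
                return region, zone
    raise RuntimeError("no healthy zone in any region")


# A single zone failure stays inside the region...
print(pick_zone(TOPOLOGY, "us-east-1", {"us-east-1a"}))
# ...while a whole-region outage moves traffic to the other region.
print(pick_zone(TOPOLOGY, "us-east-1", {"us-east-1a", "us-east-1b"}))
```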




Design for failure
   application decoupling


                  Stateless services
                  Graceful degradation
                  Die fast and alone
                  Auto recover
                  Backup and scale independently
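
"Graceful degradation" and "die fast" can be combined in a small wrapper: give a dependency a short time budget and serve a default answer if it is slow or broken. The helper `call_with_fallback` and the ad-related function names are hypothetical, chosen only to illustrate the idea.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(func, fallback, timeout_s):
    """Run func with a time budget; return fallback if it is slow or fails."""
    future = _pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both the timeout and any error raised by func itself.
        return fallback


def slow_backend():
    time.sleep(0.5)  # simulates a dependency that has stopped responding
    return ["personalised ad"]


def broken_backend():
    raise ConnectionError("backend down")


DEFAULT = ["default ad"]
result_slow = call_with_fallback(slow_backend, DEFAULT, timeout_s=0.1)
result_broken = call_with_fallback(broken_backend, DEFAULT, timeout_s=0.1)
result_ok = call_with_fallback(lambda: ["personalised ad"], DEFAULT, timeout_s=0.1)
print(result_slow, result_broken, result_ok)
```

The caller stays responsive in all three cases, and only the quality of the answer degrades when the dependency misbehaves.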


Decoupling
   example

   Before: a single Ecommerce Application
    - shopping cart
    - process orders
    - process payments
    - generate invoices
    - send emails

   After: independent services, each running several instances
    - Product Catalog (#2, #3)
    - Order Processor (#2, #3)
    - Payment Processor (#2)
    - Invoice generator (#2)
    - Email Sender (#2)
    - ... (#2, #3, #4, #5)

   Around the services
    - Messaging (SQS)
    - Monitoring & Alerting (CloudWatch & SNS)
    - AutoScaling
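
The role of the message queue in this diagram can be sketched with Python's standard `queue` module standing in for SQS. This is an in-process illustration under that assumption, not SQS client code: several "Order Processor" workers pull from one queue, so any of them can disappear without the producer noticing.

```python
import queue
import threading

orders = queue.Queue()  # in-process stand-in for an SQS queue
processed = []
lock = threading.Lock()


def order_processor(worker_id):
    """Consume orders until a None sentinel arrives."""
    while True:
        order = orders.get()
        if order is None:
            orders.task_done()
            break
        with lock:
            processed.append((worker_id, order))
        orders.task_done()


# Three independent instances, like "Order Processor #1 #2 #3" on the slide.
workers = [threading.Thread(target=order_processor, args=(i,)) for i in (1, 2, 3)]
for worker in workers:
    worker.start()

for order_id in range(10):  # the producer only knows about the queue
    orders.put(order_id)
for _ in workers:           # one shutdown sentinel per worker
    orders.put(None)

orders.join()
for worker in workers:
    worker.join()

handled = sorted(order for _, order in processed)
print(handled)
```

Because the producer and the consumers share nothing but the queue, each side can be scaled, restarted, or replaced on its own.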
Automate
   everything



                  Infrastructure - custom AMIs, Chef, Puppet
                  Monitoring - CloudWatch
                  Scaling and recovery - AutoScaling
                  Fail and recover constantly - Chaos Monkey
                  by Netflix
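
The "fail and recover constantly" loop can be simulated in a few lines: something kills a random instance, and a reconciler launches replacements until the fleet is back at its desired size. The `Fleet` class and the instance names are hypothetical and purely in-memory; this is neither Chaos Monkey nor the AutoScaling API, only the shape of the idea.

```python
import random


class Fleet:
    """In-memory model of a group that should always hold `desired` instances."""

    def __init__(self, desired):
        self.desired = desired
        self.instances = set()
        self.launched_total = 0
        self.reconcile()

    def reconcile(self):
        """Launch replacements until the fleet is back at the desired size."""
        launched = 0
        while len(self.instances) < self.desired:
            self.launched_total += 1
            # Made-up naming scheme, not a real instance ID format.
            self.instances.add(f"node-{self.launched_total:03d}")
            launched += 1
        return launched

    def kill_random(self, rng):
        """Terminate one randomly chosen instance, as a chaos tool would."""
        victim = rng.choice(sorted(self.instances))
        self.instances.remove(victim)
        return victim


rng = random.Random(42)  # seeded so repeated runs behave the same
fleet = Fleet(desired=3)
for _ in range(5):
    victim = fleet.kill_random(rng)
    replaced = fleet.reconcile()
    print(f"killed {victim}, launched {replaced} replacement(s)")
```

Running failures on purpose like this is what keeps the recovery path exercised, rather than discovered during a real outage.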


Be rational
   weigh the risks and costs
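
A rough way to weigh redundancy against its cost is to estimate how much downtime each extra copy removes. The sketch below assumes copies fail independently (which correlated outages violate) and uses made-up figures: 99% availability per copy, a loss of 1,000 per hour of downtime, and 20,000 per year for each additional copy.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(single, copies):
    """Availability of N independent copies when one survivor is enough."""
    return 1 - (1 - single) ** copies


def yearly_downtime_hours(availability):
    return (1 - availability) * HOURS_PER_YEAR


def downtime_saved_value(single, copies, hourly_loss):
    """Yearly loss avoided by going from copies - 1 to copies."""
    before = yearly_downtime_hours(combined_availability(single, copies - 1))
    after = yearly_downtime_hours(combined_availability(single, copies))
    return (before - after) * hourly_loss


SINGLE = 0.99            # assumed availability of one copy
HOURLY_LOSS = 1_000      # assumed cost of one hour of downtime
EXTRA_COPY_COST = 20_000 # assumed yearly cost of one more copy

for copies in (2, 3):
    saved = downtime_saved_value(SINGLE, copies, HOURLY_LOSS)
    verdict = "worth it" if saved > EXTRA_COPY_COST else "not worth it"
    print(f"copy #{copies}: saves about {saved:,.0f} per year, {verdict}")
```

Under these assumptions the second copy pays for itself many times over, while the third does not, which is the kind of trade-off the slide asks for.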




More info
   on Amazon AWS




  http://aws.amazon.com/architecture




Questions?
                  Grzegorz Kochan
                  CTO & VP of Products at AdTaily
                  email: grzegorz@adtaily.com
                  www.adtaily.pl
                  facebook.com/adtaily




                     http://adtai.ly/TechCamp1
