Continuous Deployment and DevOps: Deprecating Silos - JAOO 2010


JAOO 2010: In this session, we’ll run a retrospective on our efforts to break down organizational barriers with continuous deployment and other DevOps goodness. We’ll talk about what we have done with tools and practices like CI and build pipelines, Puppet and Yum. We’ll also address some puzzles we have encountered, such as massive data deployments to many global data centres, and replacing silos with cross-functional teams in a complex, evolving environment.
http://jaoo.dk/aarhus-2010/presentation/Continuous%20Deployment%20and%20DevOps:%20Deprecating%20Silos

CONTINUOUS DEPLOYMENT
AND DEVOPS
DEPRECATING SILOS

JOSH DEVINS, NOKIA
TOM SULSTON, THOUGHTWORKS
JAOO 2010, ÅRHUS, DENMARK

WHO ARE WE AND WHERE
ARE WE FROM?
• Josh Devins, Nokia Berlin

• Software architect, Location Services

• Sysadmin of honour

• Tom Sulston, ThoughtWorks

• Lead consultant

• DevOps, build & deploy



1. CONTINUOUS DEPLOYMENT AND DEVOPS: DEPRECATING SILOS. JOSH DEVINS, NOKIA; TOM SULSTON, THOUGHTWORKS. JAOO 2010, ÅRHUS, DENMARK

2. WHO ARE WE AND WHERE ARE WE FROM? • Josh Devins, Nokia Berlin • Software architect, Location Services • Sysadmin of honour • Tom Sulston, ThoughtWorks • Lead consultant • DevOps, build & deploy

3. PROBLEM SITUATION

4. DEVELOPMENT AND OPERATIONS SILOS

5. MANY SEPARATE TEAMS

6. TOO MUCH MANUAL WORK

7. DIFFICULT DEPLOYMENTS

8. AD-HOC INFRASTRUCTURE MANAGEMENT

9. MAKING IT BETTER

10. CONTINUOUS DELIVERY

11. CONTINUOUS DELIVERY More!

12. CONTINUOUS INTEGRATION AND BUILD PIPELINE

13. More! CONTINUOUS INTEGRATION AND BUILD PIPELINE

14. A DIVERSION INTO MAVEN PAIN

15. Less! A DIVERSION INTO MAVEN PAIN

16. CDC TESTING

17. More! CDC TESTING

18. PACKAGING: RPM & YUM

19. Keep doing! PACKAGING: RPM & YUM

20. RDBMS, NOSQL, DATA DEPLOYMENT

21. ??? RDBMS, NOSQL, DATA DEPLOYMENT

22. PUPPET

23. More! PUPPET

24. BDD

25. More! BDD

26. APPLICATION CONFIGURATION

27. Less! APPLICATION CONFIGURATION

28. PRE-FLIGHT TESTING

29. More! PRE-FLIGHT TESTING

30. MONITORING

31. More! MONITORING

32. ITIL, DEVOPS AND YOU

33. More automation ITIL, DEVOPS AND YOU

34. More automation Less administration ITIL, DEVOPS AND YOU

35. WHERE ARE WE?

36. JOIN US! • Nokia is hiring in Berlin! • www.nokia.com/careers • ThoughtWorks is hiring in London, Hamburg and further abroad. • www.thoughtworks.com/jobs

37. THANKS! JOSH DEVINS www.joshdevins.net @joshdevins; TOM SULSTON www.thoughtworks.com @tomsulston

Editor's Notes

#3: Flip to Ovi Maps, describe what the product is (kind of)
#4: A few words of introduction on what the “before” state was
- web and device
- growth from startup to millions of devices/mo
- free navigation earlier this year increased usage
- rapid feature and team growth
#5: http://www.flickr.com/photos/tonyjcase/4092410854/sizes/l/in/photostream/
- Developers and operations teams separated both organisationally and physically
- Whole different organisational structure: need to go to C-level (VP-level?) to find a common reporting line
- Started as a hardware company, and really bolted on services at the beginning
- Poor alignment of technology choices (base OS, packaging, monitoring)
- Very little common ground, because...
#6:
- lots of technology/approach divergence caused by:
  - many ops teams: “operations”, “transitions”, “development support”
  - many development teams: frontend, backend, backend function x/y/z
- Conway’s Law
- short term scaled well and fast
- right intention of giving small teams autonomy but... balance needed
- Lots of integration points
- more complexity than necessary
- lots of inventory
- Integration is v. painful
#7:
- lots of things done by hand, non-repeatable QA, almost nothing automated (except where really necessary: perf tests)
- Baroque configuration process
- Releases take a long time and a lot of manual testing/verification
- Cycle time is very slow
- Right intentions, did not scale
  - change management process (?)
  - carrying knowledge/understanding across silos has a cost (x4)
- Frequent rework: fixing the same problem again and again, usually at the last minute
#8: http://www.flickr.com/photos/14608834@N00/2260818367/sizes/o/in/photostream/
- reality: about one and a half people knew how the whole thing worked end-to-end
- reality: ~10 days to build a new image with Java, 5 Tomcat instances, as many war files, nothing else!
- worse: the "image system" was not used anywhere except staging and production, so failures came very late
- maintenance: in dev/QA, regular Debian systems with DEB packaging were used, so we essentially had to maintain two complete distribution mechanisms
- change management process is heavyweight
  - ITIL++, multi-tab Excel spreadsheets, CABs in other countries, not directly involved
  - often circumvented
- communication gaps between ops teams
- package and config structure (ISO + rsync)
  - it worked, but was slow and cryptic
  - building whole OS images in a very slow and non-parallelisable (4 hrs?) CI
  - multi-phased approach requiring first a custom packaging system and description language (VERY cryptic and bespoke)
  - using PXE Linux to boot images from a central control server for configuration rsync
  - any booted server can act as a peer to boot other machines
#9: http://www.flickr.com/photos/14608834@N00/2260818367/sizes/o/in/photostream/
- lots of things done by hand, non-repeatable
- “We don’t have time to do it right”
- time-to-recovery is slow
- monitoring is:
  - inconsistent (lots of false alarms)
  - unclear (multiple tools, teams)
  - too coarse (the site is down!)
- hard to triage infrastructure or code issues
- inventory management is weak
- many data centres
- not enough knowledge kept in-house
#10:
- Any questions on describing the problem?
- has anyone got similar problems?
- What actions did we take to address these issues?
Time check: 20 mins
#11: http://www.flickr.com/photos/snogging/4688579468/sizes/l/
- what is continuous delivery?
- Continuous Delivery: every SCM commit results in releasable software
  - that is, from a purely infrastructural and "binary-level" perspective, the software is always releasable
  - This includes layers of testing, not just releasing anything that compiles!
  - features may be incomplete, etc., so in practice you might not actually release every commit (releasing every commit would be Continuous Deployment)
- “If something hurts, do it more often”
- You should have gone to Jez’s session this morning!
#12: http://www.uvm.edu/~wbowden/Image_files/Pipeline_at_Kuparuk.jpg
- how do we get from an SCM commit to something that is deployable and tested enough?
- Building the ‘conveyor belt’
- Turn up existing CI practices to 11
- Each team already did “build & unit test”
  - no deployable package (WARs to Nexus)
- Automated integration of various teams’ work
- Automated integration testing
- Testing deployments: same method on all environments
- Currently using Hudson & Ant; this works OK.
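The "conveyor belt" in the note above can be pictured as an ordered list of stages, where a commit only counts as releasable if every stage passes. The following is a minimal sketch of that idea, not the Hudson setup from the talk; `run_pipeline` and the stage names in the usage note are hypothetical.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (releasable, names of the stages that passed).
    """
    passed: List[str] = []
    for name, step in stages:
        if not step():
            # A failed stage stops the belt: later stages never run.
            return False, passed
        passed.append(name)
    return True, passed
```

A caller might pass stages such as `("build-and-unit-test", ...)`, `("package", ...)`, `("integration-test", ...)` and `("deploy-to-test-env", ...)`, each wrapping a real build or deployment command.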
#13: http://www.petsincasts.com/?p=162
- workaround: don't use the Maven "release" process, or just live with it and do Maven "releases" as often as possible
- lesson learned: don't try to mess with "the Maven way"; it gets very hairy and is a huge time suck
- lesson learned: don't depend on SNAPSHOT dependencies unless they are under your own control (you can't safely release your module with SNAPSHOT deps, meaning you will have to wait for someone else to release their module)
- standard Maven versioning lifecycle:
  - start at 1.0.0-SNAPSHOT, pull down dependencies (some SNAPSHOTs themselves) from some repository (usually one that is not integrated with your source code repository)
  - working away on 1.0.0-SNAPSHOT and I'm ready to release, so I do a Maven "release", tagging SCM, and I get version 1.0.0
  - crap, we found a bug, so we keep working, now on version 1.0.1-SNAPSHOT
  - okay, ready to release again, so I get version 1.0.1
  - do some testing and everything is happy, so I drop my 1.0.1 war into my production Tomcat
- what's wrong with this picture?
- key: we "release" software BEFORE we are satisfied with its quality
- like we said before, continuous delivery is all about the possibility of releasing to production at all times, from all commits
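The version lifecycle walked through in the note above can be written down as a few lines of string handling. This is only a model of the sequence described (1.0.0-SNAPSHOT, 1.0.0, 1.0.1-SNAPSHOT, 1.0.1), not the Maven release plugin; the function names are hypothetical, and bumping the last digit is an assumption taken from the note's own example.

```python
from typing import Iterable

SNAPSHOT_SUFFIX = "-SNAPSHOT"


def release_version(snapshot: str) -> str:
    """1.0.0-SNAPSHOT -> 1.0.0 (what a "release" step produces)."""
    if not snapshot.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"not a snapshot version: {snapshot}")
    return snapshot[: -len(SNAPSHOT_SUFFIX)]


def next_snapshot(released: str) -> str:
    """1.0.0 -> 1.0.1-SNAPSHOT (development continues after a release)."""
    major, minor, patch = released.split(".")
    return f"{major}.{minor}.{int(patch) + 1}{SNAPSHOT_SUFFIX}"


def can_release(dependency_versions: Iterable[str]) -> bool:
    """A module with SNAPSHOT dependencies cannot be safely released."""
    return not any(v.endswith(SNAPSHOT_SUFFIX) for v in dependency_versions)
```

The point of the slide shows up in the model: the version number is fixed at `release_version` time, before any of the later testing has happened.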
#14: CDC - Consumer-Driven Contract http://www.martinfowler.com/articles/consumerDrivenContracts.html
- Each service/team provides tests to those teams whose services they consume. (ie: If I use your service, I write you a test that expresses how I am using it. You can then run that test in your build.)
- Lets us do quick integration-type testing at the unit/functional level.
- Much easier than maintaining stubs.
- Designed to catch integration failures earlier (typical failure mode is for clients/servers to diverge while still passing their own tests, only to be caught at manual QA stages)
- Ceremony for giving tests to another team
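As a rough illustration of the consumer-driven contract idea above: the consuming team writes down which fields it reads and what types it expects, and the providing team runs that check in its own build. This is a sketch under assumptions; the contract format, the field names and `provider_get_route` are all invented for the example and are not from the talk.

```python
from typing import Any, Dict, List

# Written by the CONSUMER team: the fields it reads and the types it expects.
# Field names are hypothetical.
ROUTE_CONSUMER_CONTRACT: Dict[str, type] = {"route_id": str, "distance_m": int}


def provider_get_route() -> Dict[str, Any]:
    """Stand-in for the PROVIDER team's real handler."""
    return {"route_id": "r-42", "distance_m": 1200, "eta_s": 300}


def contract_violations(response: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """Return a list of ways the response breaks the consumer's expectations."""
    problems: List[str] = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems
```

Extra fields in the response (here `eta_s`) are deliberately ignored: the provider stays free to evolve anything that no consumer has declared a dependency on.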
#15: http://www.flickr.com/photos/delgrossodotcom/2553424895/
- Build once!
- passing deployable packages (RPMs) up the value chain
- Categorically 100% sure that you’re testing what you’re going to deploy
- Can wrap up all sorts of useful things in OS packages
  - reference data
  - hook scripts
  - dependencies on tiered applications
- build pipeline of repositories
  - Each repo means “X level of testing has been done on these packages”
- gotcha: createrepo caching
- gotcha: no concurrent running of createrepo
- gotcha: using metapackages to join versions (Might re-introduce in future)
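The "build once, promote through repositories" idea above can be sketched as copying the same file from one repository directory to the next and checking that its checksum has not changed. This is a simplified model, assuming repositories are plain directories; `promote` and the directory names are hypothetical, and regenerating yum metadata (the createrepo step mentioned in the note) is left out.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path: Path) -> str:
    """Checksum used to show that the promoted file is the tested file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def promote(package: Path, target_repo: Path) -> Path:
    """Copy an already-built package, unchanged, into the next repository."""
    target_repo.mkdir(parents=True, exist_ok=True)
    promoted = target_repo / package.name
    shutil.copy2(package, promoted)
    if sha256(promoted) != sha256(package):
        raise RuntimeError("promoted package differs from the tested one")
    # A real yum repository would need its metadata regenerated at this
    # point; that step is intentionally omitted from this sketch.
    return promoted
```

Because nothing is rebuilt between repositories, a package sitting in, say, an "integration-tested" directory is byte-for-byte the one that passed the earlier stage.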
#16:
- not doing this yet, but here are some ideas
- Currently using MySQL: is there a need to change to a key/value store?
- RDBMS: check out ???
- NoSQL: big, huge question mark and little tooling support, so consider this seriously if considering NoSQL
- some teams are using BitTorrent to distribute large (GB and TB) datasets around the world
  - Lucene indices, map files, etc.
  - similar to the idea that Twitter uses to deploy stuff with their Murder tool
- can we use dbdeploy?
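The note above only asks whether dbdeploy could be used, so the following is not something the team did. It is a small sketch of the general technique behind that kind of tool: numbered schema deltas applied in order, with a changelog table recording what has already run. SQLite is used only to keep the example self-contained; the table and column names are chosen for illustration, and `apply_deltas` is hypothetical.

```python
import sqlite3
from typing import Dict, List


def apply_deltas(conn: sqlite3.Connection, deltas: Dict[int, str]) -> List[int]:
    """Apply not-yet-applied numbered SQL deltas in ascending order.

    Returns the numbers of the deltas applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (change_number INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_number FROM changelog")}
    newly_applied: List[int] = []
    for number in sorted(deltas):
        if number in applied:
            continue  # already run against this database
        conn.executescript(deltas[number])
        conn.execute("INSERT INTO changelog (change_number) VALUES (?)", (number,))
        conn.commit()
        newly_applied.append(number)
    return newly_applied
```

Running the same set of deltas twice is a no-op the second time, which is what makes the approach safe to wire into an automated deployment.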
#17:
- Puppet overview & alternatives (Chef, CFEngine, hand-rolled tools)
- manifests
- modules and inheritance
- passing puppet configs with deployable code + configs
- Driven from developer-facing sysadmins
#18:
- infrastructure testing with cucumber-puppet
- applying good development practices to the Ops world
- absolutely crucial to having a refactorable infrastructure
- how unchanging are your systems?
- can we start doing Behaviour-driven releases?
- This is alpha software!
- Does not catch all errors
#19:
- Configurations passed up from development team through Subversion
- Deployed with puppet
- Tested with cucumber-puppet
- Tested on application start for missing values
- Bundling application deployments simplifies configuration
- TODO: review architecture of all apps and simplify (easier now that deployment tech debt is reduced)
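"Tested on application start for missing values" could look something like the check below: before doing any work, the application compares its loaded configuration against the keys it needs and refuses to start if any are absent or blank. A minimal sketch; the function names and the example keys in the usage note are hypothetical.

```python
from typing import Iterable, List, Mapping, Optional


def missing_config_keys(
    config: Mapping[str, Optional[str]], required: Iterable[str]
) -> List[str]:
    """Return the required keys that are absent or blank, in the order given."""
    missing: List[str] = []
    for key in required:
        value = config.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


def check_config_or_exit(
    config: Mapping[str, Optional[str]], required: Iterable[str]
) -> None:
    """Fail fast at startup instead of failing later on first use."""
    missing = missing_config_keys(config, required)
    if missing:
        raise SystemExit("missing configuration values: " + ", ".join(missing))
```

For example, calling `check_config_or_exit(config, ["db.url", "cache.ttl"])` as the first line of startup turns a misconfigured deployment into an immediate, clearly labelled failure.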
#20: http://www.flickr.com/photos/jimbl/2881681649/sizes/o/
- scripted checks before anything even happens
- ensure that the stage is set and all known pre-requisites are tested and monitored
- application health-check on startup (are all my config values set?)
- check_http through nrpe
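One way to read "scripted checks before anything even happens" is a small runner that evaluates every known prerequisite and reports all failures at once, so a deployment is never started on a host that is not ready. The sketch below is an assumption about how that could be structured; `run_preflight`, `enough_disk` and the check names in the usage note are hypothetical.

```python
import shutil
from typing import Callable, Dict, List


def enough_disk(path: str, min_free_bytes: int) -> bool:
    """True if the filesystem holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def run_preflight(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of those that failed.

    A check that raises is counted as a failure, not a crash, so that
    one broken check does not hide the results of the others.
    """
    failures: List[str] = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures
```

A deployment script could build a dictionary such as `{"disk space": lambda: enough_disk("/", 10**9)}` and abort if `run_preflight` returns anything. Unlike a build pipeline, all checks run even after one fails, since the aim is a complete picture of what is missing.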
#21: http://www.flickr.com/photos/kylesteeddesign/4395772305/sizes/o/
- speaking of monitoring...
- Nagios, nrpe
- cucumber-nagios
- Monitoring-driven deployments? Would like developers to push up monitors alongside features.
- developers and engineers gaining common understanding around monitoring and system behaviour
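If developers are to "push up monitors alongside features", one low-friction form is a check that follows the Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The sketch below shows only the classification step; `classify_response_time` and its thresholds are hypothetical, and how the response time is measured is left to the caller.

```python
from typing import Optional, Tuple

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_response_time(
    seconds: Optional[float], warn_after: float, crit_after: float
) -> Tuple[int, str]:
    """Map a measured response time to a Nagios status code and message."""
    if seconds is None:
        return UNKNOWN, "UNKNOWN - no response time measured"
    if seconds >= crit_after:
        return CRITICAL, f"CRITICAL - response took {seconds:.2f}s"
    if seconds >= warn_after:
        return WARNING, f"WARNING - response took {seconds:.2f}s"
    return OK, f"OK - response took {seconds:.2f}s"
```

A plugin built on this would print the message on one line and exit with the returned code, which is all Nagios needs in order to raise or clear an alert.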
#22:
- ITIL is a framework. DevOps is a series of practices.
- While you could have lightweight ITIL implementations, they tend to be process-heavy.
- DevOps is about doing all the good technical diligence in a way that marries with Agile practices and values
  - not dependent on tool choice
- Build up shared understanding by automation
- Jez: A document proves nothing. But a script is real proof that you have done what is in the script.
#23: Same notes as #22.
#24:
- not doing continuous deployment, but are making ready
- it takes time for large organisations to catch up to technical change
- addressing cultural issues
- building common understanding and shared ownership
#26: “Stock photos are the bullet points of the twenty-first century” - Martin Fowler