Testing Infrastructure as Code
Best Practices and Common Mistakes
Given by Derek C. Ashmore
STAREAST 2022
April 24-29, 2022
©2022 Derek C. Ashmore, All Rights Reserved
Who am I?
 Professional Geek since 1987
 AWS since 2010
 Azure since 2017
 Specialties
 Application Transformation
 Infrastructure Automation
 Yes, I still code!
Discussion Resources
 This slide deck
 /derekashmore/presentations
 Slide deck has hyperlinks!
 Don't bother writing down URLs
 Put questions in the chat; we'll address them at the end
Agenda
Intro and Level Set
Testing Challenges
Best Practices and Anti-Patterns
Summary / Q&A
Infrastructure as Code (IaC)
 Manual changes
 Increase errors
 Increase unwanted differences between environments
 Increase admin workload
 Scripted/Coded changes
 Larger upfront cost, but...
 Less busywork
 Leverage others' work
 Decreases Errors
 Errors fixed in one place
 Eliminates unwanted differences
 Change history (with source control)
Infrastructure Code Categories
Network / non-application specific infrastructure
 Virtual Networks/VPCs and subnets
 Route tables, Network peering
 Security groups / NSGs
Application infrastructure
 Relational databases
 Serverless constructs
Security privileges and policies
 IAM Roles and privilege grants
Virtual machine image production
 Produce machine images for teams to use
Infrastructure Code Testing
IaC is code!
 Housed in source control
 Often changed and released
 Needs testing like any other code
IaC changes can have a negative impact
 Environment outages
 End-user internet connectivity outage
 Application outages
 Testing team delayed for four days
Testing IaC can minimize negative impact
Infrastructure Code Testing Differences
IaC != Application Code
 IaC requires external resources (e.g. Cloud) to run
 In-process unit testing often not possible
Limited localized (in-process) testing
 Generally limited to syntax checks
 Terraform validation
 Ansible Dry Runs
 IDE syntax checks
Most testing is integration testing
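
The local checks listed above can be wired into a single pre-commit or pipeline step. What follows is a minimal sketch, not the speaker's tooling: the directory `modules/vnet` and the playbook `site.yml` are hypothetical names, and the sketch assumes the `terraform` and `ansible-playbook` CLIs are on the PATH. It uses `terraform validate` and `ansible-playbook --syntax-check`; a fuller Ansible dry run would use `--check` instead, which needs an inventory to run against.

```python
import shutil
import subprocess
from typing import List


def static_check_commands(terraform_dir: str, playbook: str) -> List[List[str]]:
    """Build the local checks for one Terraform directory and one playbook."""
    return [
        # Skip backend setup so validation does not need remote state access.
        ["terraform", f"-chdir={terraform_dir}", "init", "-backend=false", "-input=false"],
        ["terraform", f"-chdir={terraform_dir}", "validate"],
        ["ansible-playbook", "--syntax-check", playbook],
    ]


def run_static_checks(commands: List[List[str]]) -> List[str]:
    """Run each command; return one description per missing tool or failed check."""
    failures = []
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            failures.append(f"{cmd[0]} not installed")
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"{' '.join(cmd)} failed: {result.stderr.strip()}")
    return failures


if __name__ == "__main__":
    # Hypothetical paths; replace with the module and playbook under test.
    for failure in run_static_checks(static_check_commands("modules/vnet", "site.yml")):
        print(failure)
```

These checks only catch syntax and internal consistency errors, which is why most of the real testing still has to happen against deployed resources.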
Infrastructure Code Testing Challenges
Friction
Dependencies
Testing costs
Manual intervention requirements
Shifting Sand Problem
IaC Testing Friction
Larger effort to write tests
 Tooling varies widely
 RSpec-style testing [awspec (AWS), InSpec (Azure), etc.]
 Terratest
 Tooling requires additional expertise
 Ruby for RSpec
 Go for Terratest
Hard to make comprehensive
 Sheer number of conditions and attributes to check
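
One way to keep the number of attribute checks manageable is to drive them from a small table of expected values and ignore everything else. The sketch below is an illustration in Python rather than RSpec or Terratest. It assumes the module exposes the attributes of interest as Terraform outputs and that the test has captured the text of `terraform output -json`; the output names `subnet_count`, `encrypted`, and `region` are hypothetical.

```python
import json
from typing import Any, Dict, List


def parse_terraform_outputs(raw_json: str) -> Dict[str, Any]:
    """Flatten the text of `terraform output -json` into {name: value}."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def find_mismatches(actual: Dict[str, Any], expected: Dict[str, Any]) -> List[str]:
    """Compare only the attributes listed in `expected`; ignore all others."""
    problems = []
    for name, want in expected.items():
        if name not in actual:
            problems.append(f"{name}: missing")
        elif actual[name] != want:
            problems.append(f"{name}: expected {want!r}, got {actual[name]!r}")
    return problems
```

Adding a row to the expected table each time a defect is found keeps the checks focused on attributes that have actually caused problems.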
Friction Mitigation
Make IaC code easier to test
 Code in smaller reusable modules
 Small number of dependencies
 Self-contained
 Allows testing in Sandbox environment
 Re-usable Terraform module example
Focus testing on greatest areas of risk
 Let defects drive test code activity
 Watch testing scope
 No substitute for app teams/testers for changes
 Must follow SDLC with changes
Reusable Module Test Cases
Example reusable modules
 Kubernetes Cluster
 Virtual Machine
 Virtual Networks and Subnets
 S3/Storage accounts
 Serverless services/functions
70+ Modules in all
 Used in 200+ pipelines
Tested on merge to master
IaC Testing Dependencies
Dependencies are the heaviest type of friction
 Examples include
 Active Directory
 DNS services
 Virtual networking
IaC Testing Costs
All tests for IaC cost money
 Test in small, disposable units
 We always test in a Sandbox
 All resources deallocated after each test
 Labor costs money, but most of it is up front
 Don't strive for 100% coverage
 Spend testing labor time wisely
 Limit testing to our code, not infrastructure tech stack functionality
Not testing doesn't save money
 The business impact of deployed defects is usually greater than testing runtime costs
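
The "small, disposable units" idea can be sketched as an apply, verify, destroy cycle in which the teardown always runs. This is a minimal illustration under stated assumptions, not the speaker's harness: it assumes Terraform is the IaC tool, that credentials already point at a sandbox account, and that `verify` is a caller-supplied function (hypothetical) holding the actual assertions.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def sandbox_test(module_dir: str, verify: Callable[[], None], run: Runner = _run) -> None:
    """Apply a module in the sandbox, run the checks, and always tear it down."""
    tf = ["terraform", f"-chdir={module_dir}"]
    run(tf + ["init", "-input=false"])
    try:
        run(tf + ["apply", "-auto-approve"])
        verify()  # caller-supplied assertions on the created resources
    finally:
        # Runs even when apply or verify fails, so no resources are left billing.
        run(tf + ["destroy", "-auto-approve"])
```

Passing the runner in as a parameter is a design choice made here so the cycle itself can be exercised without touching a cloud account.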
Manual Intervention Requirements
Some companies require manual intervention
 Often dictated by company policy
 Examples include
 Requiring DNS entries to be manually entered
 Separate group allocates security privileges
 Cloud functionality changes for Azure storage groups
 On-premises connectivity
IaC that depends on manual intervention cannot have automated tests
 Localize the manual intervention requirements
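
One possible reading of "localize" is to gather every manually produced value behind a single input boundary, so the remaining IaC stays automatable and a test can supply sandbox values. The sketch below is an assumption about how that might look, not something the slides specify; the field names `dns_name` and `role_id` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical names for values that people, not IaC, must provide.
REQUIRED_MANUAL_KEYS = ("dns_name", "role_id")


@dataclass(frozen=True)
class ManualInputs:
    """Values produced by manual processes and handed to the automated code."""
    dns_name: str
    role_id: str


def missing_manual_inputs(raw: Dict[str, str]) -> List[str]:
    """List required manual values that are absent or blank."""
    return [key for key in REQUIRED_MANUAL_KEYS if not raw.get(key, "").strip()]


def load_manual_inputs(raw: Dict[str, str]) -> ManualInputs:
    """Fail fast with a clear message when a manual step has not been done."""
    missing = missing_manual_inputs(raw)
    if missing:
        raise ValueError("manual steps not completed: " + ", ".join(missing))
    return ManualInputs(dns_name=raw["dns_name"], role_id=raw["role_id"])
```

With this shape, an automated test passes sandbox values directly, and only the step that gathers the real values remains untestable.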
Accommodating Manual Processes
Shifting Sand Problem
Changes occur outside of IaC
 Common Sources
 Automatic dependency updates
 Cloud backend updates
 Right-hand/Left-hand issues
 Managing cloud assets from multiple sources
Makes IaC code not repeatable!
 Causes IaC to break in future runs
 Recipe for unplanned work!
Further reading
Avoid Automatic Updates!
Perceived as a convenience
 Avoids upgrade work
 Causes unplanned work
Examples include
 Latest dependencies
 Terraform cloud provider version
 Common IaC Code version
 VM automatic updates
 Security updates are not avoidable!
Convert unplanned work to planned work
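
A simple guard against accidental automatic updates is to flag any version constraint that is not an exact pin. The sketch below assumes Terraform-style constraint strings, where a bare version or `= x.y.z` is exact and operators such as `~>` or `>=` allow newer releases. The provider names and version numbers in the usage note are hypothetical, and the pattern deliberately treats pre-release suffixes as unpinned.

```python
import re
from typing import Dict, List

# Matches an exact pin such as "3.74.0" or "= 3.74.0" and nothing looser.
_EXACT_PIN = re.compile(r"^\s*=?\s*\d+\.\d+\.\d+\s*$")


def is_pinned(constraint: str) -> bool:
    """Return True only when the constraint names exactly one version."""
    return bool(_EXACT_PIN.match(constraint))


def floating_constraints(constraints: Dict[str, str]) -> List[str]:
    """Return the sorted names whose constraints allow automatic upgrades."""
    return sorted(name for name, value in constraints.items() if not is_pinned(value))
```

For example, `floating_constraints({"azurerm": "~> 3.0", "aws": "= 4.67.0"})` reports only `azurerm`, which can then be upgraded deliberately as planned work.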
Early Detection Strategy
Scheduled testing in a sandbox environment
 Detect/fix shifting sand problems before users do!
Addresses
 Required automatic updates
 Cloud back-end changes
 Uncontrollable actions by other groups
Disadvantages
 Work is still unplanned
 Scheduled testing costs
 Labor to set up testing costs
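
One way to approximate scheduled detection, offered here as an assumption rather than the speaker's stated approach, is to run a read-only `terraform plan -detailed-exitcode` against the sandbox on a timer. With that flag, exit code 0 means no changes, 1 means the plan failed, and 2 means the plan found differences. The scheduler itself and the module path are left to the caller.

```python
import subprocess
from enum import Enum


class PlanResult(Enum):
    NO_CHANGES = 0
    ERROR = 1
    DRIFT = 2


def classify_plan_exit(code: int) -> PlanResult:
    """Map the -detailed-exitcode values; any other code is treated as an error."""
    try:
        return PlanResult(code)
    except ValueError:
        return PlanResult.ERROR


def check_for_drift(module_dir: str) -> PlanResult:
    """Run a read-only plan; intended to be called from a scheduled job."""
    proc = subprocess.run(
        ["terraform", f"-chdir={module_dir}", "plan", "-detailed-exitcode", "-input=false"],
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(proc.returncode)
```

A `DRIFT` or `ERROR` result from an unchanged module is the signal that something moved outside the IaC code and should be investigated before the next real deployment.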
Infrastructure Code Testing Best Practices
Don't expect perfection
 IaC output is much more complex than most application code
Concentrate automated testing on reused modules
 Tests cover the most widely used infrastructure code
 Defects in widely used modules will have the most negative impact
 Example: Some common modules get used by dozens of pipelines
 Completely self-contained
 Tests create all dependencies the reused code needs
 Easier because of smaller blast radius
 Less for the test to set up in the sandbox
 Easy to make completely disposable
 Use sandbox environment
 Drop after test (save $)
 Coding mistakes have less impact
Infrastructure Code Testing Best Practices (cont)
Let defects guide automated test writing
 Focuses test-writing labor on the most important areas
Watch the blast radius
 Blast radius correlates with test code complexity
Make IaC more easily testable
 Keep small and targeted
 Reduce dependencies
Infrastructure Code Testing Anti-Patterns
No testing in Sandbox
 Need a place to test/edit without affecting others
 Environment must be disposable
No peer reviews
 IaC is complex
 IaC requires large breadth of skills
Inadequate testing in lower environments
No substitute for real world contact with others
Automated testing of IaC is not comprehensive
Summary
Infrastructure code must have automated tests too
 Infrastructure code has defects too
 Defects can cause outages
 Untested code cannot be trusted
Automated testing needs to be funded
 You will get a positive ROI
 You don't need to be perfect
 Don't let perfection prevent progress!
Discipline is Key
 Changes starting out will take longer
 Will be faster after developers are up to speed
Thank you!
 Derek Ashmore:
 Blog: www.derekashmore.com
 LinkedIn: www.linkedin.com/in/derekashmore
 Connect Invites from attendees welcome
 Twitter: https://twitter.com/Derek_Ashmore
 GitHub: https://github.com/Derek-Ashmore
 Book: http://dvtpress.com/
 Please fill out the evaluation form!