Tactics for Testing DevOps Infrastructure Code
Given by Derek C. Ashmore
STAREAST 2023
May 3, 2023
©2022 Derek C. Ashmore, All Rights Reserved
Who am I?
• Professional Geek since 1987
• AWS since 2010
• Azure since 2017
• Specialties
  - Application Transformation
  - Infrastructure Automation
• Yes - I still code!
Discussion Resources
• This slide deck
  - /derekashmore/presentations
• Slide deck has hyperlinks!
  - Don't bother writing down URLs
• Put questions in the chat - we'll address them at the end
Agenda
• Intro and Level Set
• Testing Tactics
• Testing Challenges
• Best Practices and Anti-Patterns
• Summary / Q&A
Infrastructure as Code (IaC)
• Manual changes
  - Increase errors
  - Increase unwanted differences between environments
  - Increase admin workload
• Scripted/Coded changes
  - Larger upfront cost, but...
  - Less busywork
  - Leverage others' work
  - Decrease errors
  - Errors fixed in one place
  - Eliminate unwanted differences
  - Change history (with source control)
Infrastructure Code Categories
• Network / non-application specific infrastructure
  - Virtual Networks/VPCs and subnets
  - Route tables, Network peering
  - Security groups / NSGs
• Application infrastructure
  - Relational databases
  - Serverless constructs
• Security privileges and policies
  - IAM Roles and privilege grants
• Virtual machine image production
  - Produce machine images for teams to use
Infrastructure Code Testing
• IaC is code!
  - Housed in source control
  - Often changed and released
  - Needs testing like any other code
• IaC changes can have a negative impact
  - Environment outages
  - End-user internet connectivity outage
  - Application outages
  - Testing team delayed for four days
• Testing IaC can minimize negative impact
Infrastructure Code Testing Differences
• IaC != Application Code
  - IaC requires external resources (e.g. Cloud) to run
  - In-process unit testing often not possible
• Limited localized (in-process) testing
  - Generally limited to syntax checks
    - Terraform validation
    - Ansible Dry Runs
    - IDE syntax checks
• Most testing is "integration" testing
Static Testing
• Dry Run Testing
  - Validates syntax
  - Terraform validation
  - Ansible Dry Runs
  - IDE syntax checks
• Linter validation (e.g. tflint)
  - Naming standards
  - Deprecated syntax
  - Simple best practice violations
• Is simplest and easiest to write
• Is not comprehensive
• Is relatively cheap to run
Static Testing Getting Started
• Start here - quickest way to get testing benefits
• Execute Dry Runs and Linter Validation on change check-in
  - Prevent merges with invalid syntax
  - Prevent merges violating defined standards checks
• Automation is your friend
  - No need to manually spend time with this
• This can be implemented in a DRY fashion
  - Easy to write once and configure/activate for multiple repos
• Will catch some percentage of errors
• Is relatively cheap to run
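
One way to wire these static checks into a check-in gate is sketched below in Python. This is an illustration under assumptions, not the tooling used in the talk: the playbook name `site.yml` and the function names are hypothetical, and the command runner is injectable so the logic can be exercised without the CLIs installed.

```python
import subprocess
from typing import Callable, List, Sequence

# Static checks to run on every check-in. The commands and flags are standard
# Terraform, tflint, and Ansible CLI options; "site.yml" is a hypothetical
# playbook name used only for illustration.
STATIC_CHECKS: List[List[str]] = [
    ["terraform", "fmt", "-check", "-recursive"],
    ["terraform", "init", "-backend=false", "-input=false"],
    ["terraform", "validate"],
    ["tflint"],
    ["ansible-playbook", "--syntax-check", "site.yml"],
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_static_checks(
    checks: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run every check (no early exit) and return the commands that failed."""
    failures = []
    for cmd in checks:
        if runner(cmd) != 0:
            failures.append(" ".join(cmd))
    return failures
```

A pipeline step could call `run_static_checks(STATIC_CHECKS)` and block the merge when the returned list is non-empty. Keeping the check list as data is what makes it easy to write once and activate for multiple repos.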
Dynamic Environment Testing
• Sandbox Testing
  - Run in a sandbox environment
  - Automated tests to validate infrastructure is correct
  - This is complicated and tedious
  - Tools usually in the "RSpec" family or cloud CLI scripts
  - More comprehensive than static testing, but not 100%
  - Limited to the checks you script
• Configuration Drift Testing
  - Executed against the existing environments
  - Outputs differences between IaC and what exists
  - No intent to change
  - Detects changes from other sources (e.g. manual)
  - Capabilities differ by IaC platform
Sandbox Testing Getting Started
• Sandbox Testing
  - Start simple - just run the IaC in a sandbox environment
    - Does the IaC complete without error?
    - Forces IaC to be configuration driven
    - Discourages environment-specific IaC
    - Encourages writing IaC so it's easy to test
  - Prioritize testing common code
    - Has a larger usage base
    - Has you prioritizing IaC that is executed most frequently
    - Terraform modules and Ansible playbooks are good examples
  - Add Interrogation Testing
    - Run scripts after the IaC run to validate that it did what is expected
    - Scripts interrogate the cloud to validate the existence and configuration of resources
    - More complex and takes longer to write
    - Adds test coverage at higher cost
  - Tear down after test to conserve costs
  - Has challenges
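
An interrogation test boils down to comparing the attributes you expect against what the cloud reports after the IaC run. The Python sketch below shows only that comparison step. The attribute names and values are hypothetical sample data; in a real test the reported dictionary would be parsed from cloud CLI or SDK output.

```python
from typing import Any, Dict, List


def find_mismatches(expected: Dict[str, Any], actual: Dict[str, Any]) -> List[str]:
    """Check only the attributes listed in `expected`; extra attributes are ignored."""
    problems = []
    for key, want in expected.items():
        if key not in actual:
            problems.append(f"{key}: missing")
        elif actual[key] != want:
            problems.append(f"{key}: expected {want!r}, found {actual[key]!r}")
    return problems


# Hypothetical attributes for a virtual network created by the IaC under test.
expected_vnet = {"CidrBlock": "10.0.0.0/16", "State": "available"}

# Stand-in for what the cloud reports; a real test would query the provider.
reported_vnet = {"CidrBlock": "10.0.0.0/24", "State": "available", "VpcId": "vpc-123"}

print(find_mismatches(expected_vnet, reported_vnet))
# ["CidrBlock: expected '10.0.0.0/16', found '10.0.0.0/24'"]
```

The test is only as thorough as the `expected` dictionary, which is why coverage stays limited to the checks you script.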
Configuration Drift Detection
• Detection capabilities differ per IaC tool
  - Terraform plans can show changes
  - Ansible dry runs
• Detection should be automated
  - Tag the current version of IaC that's in an environment
  - Run drift detection on a schedule
  - Alert if drift is detected
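
For Terraform, a scheduled drift check can rely on `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 1 on error, and 2 when the plan contains changes. The Python sketch below is one possible shape for such a job. The function names are hypothetical, and it assumes the working directory already holds an initialized checkout of the IaC version tagged for that environment.

```python
import subprocess


def classify_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return "in_sync"  # plan succeeded with no changes
    if code == 2:
        return "drift"    # plan succeeded and found differences
    return "error"        # exit code 1 (or anything unexpected)


def detect_drift(working_dir: str) -> str:
    """Run a read-only plan in `working_dir` and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=working_dir,
    )
    return classify_plan_exit_code(result.returncode)
```

A scheduler could call `detect_drift` for each environment and raise an alert for any status other than `"in_sync"`. Because the plan is never applied, the check has no intent to change the environment.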
Infrastructure Code Testing Challenges
• Friction
• Dependencies
• Testing costs
• Manual intervention requirements
• Shifting Sand Problem
IaC Testing Friction
• Larger effort to write tests
  - Tooling varies widely
    - RSpec style testing [awspec (AWS), inspec (Azure), etc.]
    - Terratest
  - Tooling requires additional expertise
    - Ruby for RSpec
    - GoLang for Terratest
• Hard to make comprehensive
  - Sheer number of conditions and attributes to check
Friction Mitigation
• Make IaC code easier to test
  - Code in smaller reusable modules
  - Small number of dependencies
  - Self-contained
  - Allows testing in Sandbox environment
  - Re-usable Terraform module example
• Focus testing on greatest areas of risk
  - Let defects drive test code activity
  - Watch testing scope
    - No substitute for app teams/testers for changes
    - Must follow SDLC with changes
Reusable Module Test Cases
• Example reusable modules
  - Kubernetes Cluster
  - Virtual Machine
  - Virtual Networks and Subnets
  - S3/Storage accounts
  - Serverless services/functions
• 70+ Modules in all
  - Used in 200+ pipelines
• Tested in merge to master
IaC Testing Dependencies
• Dependencies are the heaviest type of friction
  - Examples include
    - Active Directory
    - DNS services
    - Virtual networking
  - Labor costs money, but most of it is up front
    - Don't strive for 100% coverage
    - Spend testing labor time wisely
    - Limit testing to our code, not infrastructure tech stack functionality
• Not testing doesn't save money
  - The business impact of deployed defects is usually greater than testing runtime costs
IaC Testing Costs
• All tests for IaC cost money
  - Test in small, disposable units
    - We always test in a Sandbox
    - All resources deallocated after each test
  - Labor costs money, but most of it is up front
    - Don't strive for 100% coverage
    - Spend testing labor time wisely
    - Limit testing to our code, not infrastructure tech stack functionality
• Not testing doesn't save money
  - The business impact of deployed defects is usually greater than testing runtime costs
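
Deallocating resources after each test only controls cost if the teardown also runs when the test fails. The Python sketch below shows one way to guarantee that with `try`/`finally`. The `apply`, `verify`, and `destroy` callables are hypothetical placeholders for whatever runs the IaC, interrogates the sandbox, and tears it down.

```python
from typing import Callable, List


def run_sandbox_test(
    apply: Callable[[], None],
    verify: Callable[[], List[str]],
    destroy: Callable[[], None],
) -> List[str]:
    """Create resources, check them, and always tear them down.

    Returns the list of problems reported by `verify` (empty means pass).
    `destroy` runs even if `apply` or `verify` raises, so a failed test
    does not leave billable resources behind.
    """
    try:
        apply()
        return verify()
    finally:
        destroy()
```

With Terraform, for example, `apply` and `destroy` might wrap `terraform apply -auto-approve` and `terraform destroy -auto-approve` run against a sandbox subscription or account.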
Manual Intervention Requirements
• Some companies require manual intervention
  - Often dictated by company policy
  - Examples include
    - Requiring DNS entries to be manually entered
    - Separate group allocates security privileges
    - Cloud functionality changes for Azure storage groups
    - On-premises connectivity
• IaC depending on manual intervention cannot have automated tests
  - Localize the manual intervention requirements
Accommodating Manual Processes
Shifting Sand Problem
• Changes occur outside of IaC
  - Common Sources
    - Automatic dependency updates
    - Cloud backend updates
  - Right-hand/Left-hand issues
    - Managing cloud assets from multiple sources
• Makes IaC Code not Repeatable!
  - Causes IaC to break in future runs
  - Recipe for unplanned work!
• Further reading
Avoid Automatic Updates!
• Perceived as a "convenience"
  - Avoids upgrade work
  - Causes unplanned work
• Examples include
  - "Latest" dependencies
    - Terraform cloud provider version
    - Common IaC Code version
  - VM automatic updates
    - Security updates not avoidable!
• Convert "unplanned" work to "planned" work
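
One way to keep "latest" dependencies out of a code base is to flag any version constraint that is not an exact pin. The Python sketch below checks Terraform-style constraint strings. The provider names and versions are made-up sample data, and the rule is deliberately simplified: it accepts only a plain `X.Y.Z` version, optionally prefixed with `=`, and treats everything else (ranges, `~>`, pre-release suffixes) as unpinned.

```python
import re
from typing import Dict, List

# Exact pin: "4.67.0" or "= 4.67.0". Range operators such as ">=" or "~>"
# allow the resolved version to change between runs.
EXACT_VERSION = re.compile(r"^\s*=?\s*\d+\.\d+\.\d+\s*$")


def is_pinned(constraint: str) -> bool:
    """Return True when the constraint resolves to exactly one version."""
    return bool(EXACT_VERSION.match(constraint))


def unpinned(constraints: Dict[str, str]) -> List[str]:
    """Return the sorted names of dependencies whose version can float."""
    return sorted(name for name, c in constraints.items() if not is_pinned(c))


# Hypothetical provider constraints, as they might appear in required_providers.
providers = {
    "aws": "= 4.67.0",
    "azurerm": "~> 3.0",
    "random": ">= 3.1.0",
    "tls": "4.0.4",
}

print(unpinned(providers))  # ['azurerm', 'random']
```

Run as a merge check, this turns a surprise provider upgrade into a deliberate version bump, which is the "unplanned" to "planned" conversion described above.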
Early Detection Strategy
• Scheduled testing in a sandbox environment
  - Detect/fix shifting sand problems before users do!
• Addresses
  - Required automatic updates
  - Cloud back-end changes
  - Uncontrollable actions by other groups
• Disadvantages
  - Work is still unplanned
  - Scheduled testing costs
  - Labor to set up testing costs
Infrastructure Code Testing Best Practices
• Don't expect perfection
  - IaC output is much more complex than most application code
• Concentrate Automated Testing on reused modules
  - Tests cover the most widely used infrastructure code
    - Defects in widely used modules will have the most negative impact
    - Example: Some common modules get used by dozens of pipelines
  - Completely self-contained
    - Tests create all dependencies the reused code needs
  - Easier because of smaller blast radius
    - Less for the test to set up in the sandbox
  - Easy to make completely disposable
    - Use sandbox environment
    - Drop after test (save $)
    - Coding mistakes have less impact
Infrastructure Code Testing Best Practices (cont'd)
• Let defects guide automated test writing
  - Focuses test writing labor on the most important areas
• Watch the blast radius
  - Blast radius correlates with test code complexity
• Make IaC more easily testable
  - Keep small and targeted
  - Reduce dependencies
Infrastructure Code Testing Anti-Patterns
• No testing in Sandbox
  - Need a place to test/edit without affecting others
  - Environment must be disposable
• Inadequate testing in lower environments
• No substitute for real world contact with others
• Automated testing of IaC is not comprehensive
Summary
• Infrastructure code must have automated tests too
  - Infrastructure code has defects too
  - Defects can cause outages
  - Untested code cannot be trusted
• Automated testing needs to be funded
  - You will get a positive ROI
  - You don't need to be perfect
  - Don't let perfection prevent progress!
• Discipline is Key
  - Changes starting out will take longer
  - Will be faster after developers are up to speed
Thank you!
• Derek Ashmore:
  - Blog: www.derekashmore.com
  - LinkedIn: www.linkedin.com/in/derekashmore
    - Connect Invites from attendees welcome
  - Twitter: https://twitter.com/Derek_Ashmore
  - GitHub: https://github.com/Derek-Ashmore
  - Book: http://dvtpress.com/
• Please fill out the evaluation form!