Kicking off any new software project is scary. Kicking off an agile software project can be even scarier.
However, rather than being afraid of risk, we can embrace it and learn from it. The lessons we learn will help us guide our teams to an overall stronger product.
18. As a customer,
I want to review my transactions,
so I can see where I'm spending my money.
There are too many transactions to pull quickly.
Some transactions are delayed.
19. Some transactions are delayed.
Probability = 3
Impact = 1
Exposure = Probability × Impact = 3
20. There are too many transactions to pull quickly.
Probability = 2
Impact = 3
Exposure = Probability × Impact = 6
21. There are too many transactions to pull quickly.
Invest time in performance tuning.
Pre-aggregate the transactions.
23. As a customer,
I want to view how my current spending
compares against my budget,
so I can see how I'm pacing for the month.
The graph doesn't work in IE6.
The UI Guy gets sick mid-sprint.
24. The graph doesn't work in IE6.
Probability = 3
Impact = 1
Exposure = 3
25. The UI Guy gets sick mid-sprint.
Probability = 3
Impact = 3
Exposure = 9
26. The UI Guy gets sick mid-sprint.
Pair with another developer.
Peer code reviews.
#3: This talk is about how to mitigate risk in agile projects
But first, let's think about what risk is
#4: Software projects are fraught with risk
Wrong thing
Not what the customer wants
Not working as a team
Duplicated work
Incompatible work
Thrashing
Festering problems
No clear vision from the product owner
Engineering problems
Broken builds
Bad code
#5: Ever heard this?
Agile is sometimes considered high risk because it moves fast and defers much of the up-front planning
#6: But, much of agile is designed to shake out this risk
Frequent checkpoints between the team and stakeholders (demos)
Frequent interactions between teams (stand-ups)
Frequently bringing problems to the forefront (retrospectives)
Every project has risk
The difference is that agile identifies and reacts to risks as they emerge,
Rather than simply trying to plan them out at the beginning
#7: Iterative methodologies, like scrum, are often represented by two overlapping circles
Agile projects are often made up of a large iterative process (project or releases)
Made up of smaller iterative processes (sprints)
#8: We can think of risk management in the same way
High level (project) risk: What would make or break the project as a whole?
Low level (iteration) risk: What day-to-day risks will we encounter while building the features that make up the app?
The two are related
If we can get a feel for the high level risks,
Then that will help us plan for and react to the lower level risks
So, we'll start with the high level
#9: Risks at the project level can best be thought of as constraints
It's important to know these constraints, since they tell us what levers we can pull when things go wrong
#10: Constraints are often captured as the Iron Triangle of software development
Features (scope)
Deadline (schedule)
Team (budget)
The idea is that these constraints control the outcome of a project, and that you can always improve one factor by tweaking the others
Need to hit a deadline? Reduce features and increase the team.
Need more features? Push the deadline and increase the team.
The problem with this is that it implies the Team is also a lever that can be pulled at any point during the project.
We know better than this
Don't ever let resources enter the discussion
#11: Instead, think of this as a fulcrum between two constraints
We balance between Features and Deadline.
If we increase features, then the deadline must slip
Or, we can accelerate the deadline by pulling back features
These are the only two constraints that you have any semblance of control over in a project
But they're incredibly powerful ones to have
What do we do with this?
We need to establish which of these two constraints is the most critical early in the project
Priorities change
Market
Political
Periodically revisit them throughout the project
Make sure your hand is still on the right lever in case you need to pull it
This isn't always as easy as asking your business owner every 30 days, "Things are going great, but if something happens, do you want to ship less or ship late?"
Instead, you often just need to listen to what's happening inside of your company
What pressures do people seem the most sensitive to?
What's happening in the industry, with the customer base or competitors, that could influence a decision?
#12: The iteration level is where we really tackle risk
#13: Risk analysis is hard
It's a very big subject
Complex theories involving probability and statistics
Lots of books devoted to it
Entire sections of project management training devoted to it
Risk analysis doesn't need to be hard
Most of our risk analysis techniques are borrowed from less forgiving fields of engineering
e.g., designing a bridge that must stand for 200 years
#14: It really boils down to two questions:
What could happen?
How likely is it to happen?
#16: PIE is a very simple risk analysis technique, which lets us quickly identify and rank risks that could affect the work a team tackles during an iteration
This helps us quickly identify which risks are important enough to start thinking about how to mitigate
#17: Let's look at PIE from a high level
We start with a story
Then we think of any risks which may affect that story
Not necessarily security risks, but risks to completing the story to the customer's satisfaction
For the risk, we think about how likely it is to occur
And we give it a score between 1 and 3
Then we think about how severe the impact would be, if it occurred
And we give that a score between 1 and 3
Then, we multiply those two numbers together, and if the score is 6 or higher
We think of at least one option to mitigate the risk, should it happen
Note that we don't need to solve the problem here, we just want to think of some options to have on the table if the risk occurs
Then we repeat for as many risks as we can think of, for that story
Note that the scores for probability don't include anything that could never happen or that's guaranteed.
If it could never happen, then we don't need to waste time thinking about it.
If it's guaranteed, then it should be built into the story at the outset.
This is designed to focus our efforts.
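As a rough illustration of the scoring steps above, here is a minimal Python sketch. The `Risk` class, its field names, and `VALID_SCORES` are hypothetical names chosen for this example, not part of the talk. It only encodes what the notes describe: probability and impact are each scored from 1 to 3, and exposure is their product.

```python
from dataclasses import dataclass

# Assumed labels for the 1-3 scale; the talk only gives the numbers.
VALID_SCORES = (1, 2, 3)


@dataclass(frozen=True)
class Risk:
    description: str
    probability: int  # how likely the risk is to occur (1-3)
    impact: int       # how severe it would be if it occurred (1-3)

    def __post_init__(self):
        # Scores outside 1-3 (e.g. "never" or "guaranteed") are rejected,
        # mirroring the note that those cases are not scored at all.
        for name in ("probability", "impact"):
            value = getattr(self, name)
            if value not in VALID_SCORES:
                raise ValueError(f"{name} must be 1, 2, or 3, got {value!r}")

    @property
    def exposure(self) -> int:
        # Exposure = Probability x Impact
        return self.probability * self.impact
```

For example, `Risk("Some transactions are delayed.", 3, 1).exposure` evaluates to 3, matching the score worked out on the slide.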
#19: Imagine we're building a banking app, like Mint.com
Let's try this with some stories we may encounter
Such as being able to view recent transactions
Risks
Some of my transactions may be delayed up to 24 hours.
There are too many transactions to pull in a reasonable time.
#20: At least some transactions are likely to be delayed, so we score this as probability of 3
But those transactions are likely from smaller providers that the average Mint user isn't concerned about, so we score the impact as a 1
Multiplying these together gives us an exposure score of 3, so we can consider this risk low
#21: We can't predict whether the transactions will pull quickly or not; it's a flip of a coin. So we give this risk a probability score of 2.
The impact of this would be significant since it prevents the story from being usable. So we give it an impact score of 3.
This gives us an exposure score of 6.
I like to use 6 as the threshold: any risk that scores 6 or higher should have some strategies ready.
Use a threshold that makes sense for you.
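The threshold check for this story can be sketched in a few lines of Python. The scores are the ones given on the slides; the names `MITIGATION_THRESHOLD` and `needs_mitigation` are hypothetical, and the comparison is assumed to be inclusive because the talk prepares strategies for a risk that scores exactly 6.

```python
# Assumed inclusive threshold; pick whatever value makes sense for your team.
MITIGATION_THRESHOLD = 6

# (description, probability, impact) as scored in the talk
transaction_risks = [
    ("Some transactions are delayed.", 3, 1),
    ("There are too many transactions to pull quickly.", 2, 3),
]


def needs_mitigation(probability, impact, threshold=MITIGATION_THRESHOLD):
    """Return True when exposure (probability x impact) reaches the threshold."""
    return probability * impact >= threshold


# Keep only the risks worth preparing mitigation options for.
flagged = [desc for desc, p, i in transaction_risks if needs_mitigation(p, i)]
print(flagged)
# ['There are too many transactions to pull quickly.']
```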
#22: How do we handle this risk?
This is basically a performance issue. How do we handle performance issues?
Either by addressing the issue directly
Or working around it with tricks like pre-aggregating or caching
Remember the constraints from the project?
These will influence which strategy is right for you.
If real-time data is an important feature, then you'll probably want to be ready to invest additional time to fix the problem.
But, if deadlines are more important, then simply pre-aggregating and caching the data may be the right choice.
Remember that you're not trying to come up with the best solution for mitigation strategies, just starting points.
You don't need to solve the problem here, you just want some options on the table in the event something happens.
It's easier to plan your fire exits before your house is on fire.
#23: Imagine we want to display a graph that shows how our spending is pacing against our budget for the month
#24: The story would capture that we want to view how our current spending is pacing against our monthly budget.
Risks
The graph doesn't work in certain browsers.
Only one developer is familiar with the UI (The UI Guy), and he gets sick in the middle of the sprint.
#25: The likelihood of the graph not working in IE6 is almost a certainty, so we score its probability at 3
IE6 has very few users though, so we score its impact as a 1
This gives us an exposure score of 3, which means a risk that originally seemed important isn't that important at all
#26: The UI Guy getting sick mid-sprint isn't guaranteed, but it wouldn't be surprising, so we give it a probability score of 3
The impact of this would be significant as no one else can do this work, so the impact score is also 3
This gives us an exposure score of 9, which tells us that this is a very serious risk to our story
#27: How do we handle this? This is really an issue of cross-pollination across the team, or a low bus factor.
How do we improve a bus factor?
Pair programming.
Peer code reviews.
Not formal code reviews, but simply pull requests between members of the team.
Note that these mitigation strategies are different than those for our previous risk
The previous mitigation strategies were options to have ready on the table, but not to use until the risk actually occurred
These are options we would need to start using immediately to get out in front of the risk before it occurs
If we wait until the UI Guy gets sick to start doing pair programming, then it's too late
Some mitigation strategies will be options in your pocket that you don't pull the trigger on until you need them
Others will be strategies you start immediately to get out in front of risks before they occur
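One way to keep the two kinds of mitigation strategy apart is to tag each one with when it should start. The sketch below is only an illustration: the `Timing` enum, its member names, and `start_now` are hypothetical, while the four strategies are the ones listed on the slides.

```python
from enum import Enum


class Timing(Enum):
    # Hypothetical labels for the two kinds of strategy described above.
    CONTINGENT = "hold in reserve until the risk occurs"
    PROACTIVE = "start now, before the risk occurs"


mitigations = [
    ("Invest time in performance tuning.", Timing.CONTINGENT),
    ("Pre-aggregate the transactions.", Timing.CONTINGENT),
    ("Pair with another developer.", Timing.PROACTIVE),
    ("Peer code reviews.", Timing.PROACTIVE),
]


def start_now(items):
    """Return the strategies that only help if they begin before the risk hits."""
    return [text for text, timing in items if timing is Timing.PROACTIVE]


print(start_now(mitigations))
# ['Pair with another developer.', 'Peer code reviews.']
```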
#28: PIE is iterative
Put this together at the beginning of an iteration
Review it each morning before the standup
Are there any risks to the stories currently in flight that you need to listen for?
Update it after the standup
Add new risks as they emerge
Remove old risks as they fall off
Rescore it at any time.
Like most things in agile, the power in this is in the iterative nature
This only works if it's kept up to date; otherwise, it becomes stale, and then ignored
#30: We can take the information we learn from risk analysis, and let it influence our planning
Tackle high risk stories first
If a story scores high risk it should be moved to the top of the sprint
This will let you get out in front of these stories early
You don't want to wait until the last few days of the sprint to find out you were right
If a story has a lot of risks, especially diverse risks, this may be a sign it needs to be broken down further
At the very least, this will let you spread your risk out rather than having it concentrated in a single chunk
You'll likely see trends and patterns emerge that accompany certain types of stories
I call these risk archetypes
Things like "UI is always risky because only one developer knows UI," or "every time we touch the budget data the system gets really slow"
Once you start to recognize these trends, think about what you can do to get out in front of them
These may be candidates for investing some TLC in the system
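The advice to tackle high-risk stories first can be sketched as a simple sort. This is an assumption-laden illustration: ranking a story by its single highest exposure score is one possible rule, not something the talk prescribes, and the story labels and the `max_exposure` helper are shorthand invented for this example. The (probability, impact) pairs are the scores from the slides.

```python
# Story label -> list of (probability, impact) pairs for its risks.
stories = {
    "Review my transactions": [(3, 1), (2, 3)],
    "Compare spending against budget": [(3, 1), (3, 3)],
}


def max_exposure(risks):
    """Highest exposure (probability x impact) among a story's risks, 0 if none."""
    return max((p * i for p, i in risks), default=0)


# Riskiest story first, so it can be moved to the top of the sprint.
ordered = sorted(stories, key=lambda name: max_exposure(stories[name]), reverse=True)
print(ordered)
# ['Compare spending against budget', 'Review my transactions']
```

A team might prefer to sum exposures, or count the risks at or above the threshold, if it wants stories with many moderate risks to rise as well.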
#31: I love this quote, because it really captures how we should think of risk.
Risk will always be there, and we shouldn't be afraid of it.
It's not something we can plan out. Instead, we need to:
Recognize it
Embrace it
And learn from it
And then, we can ship awesome products.