Story Point Comparison Retro
I’ve been blogging a lot about commitment and estimating lately both on this blog and my work blog. In a discussion with a Product Owner the other day, we were talking about looking at story sizes after the sprint to determine if they were sized correctly. I think this is a great idea and I’d like to plan it into a retrospective.
Remember that we estimate stories based on effort, which is why we use points instead of days*. A good way to compare story sizes is to take a story from the current or previous sprint that everyone is familiar with, talk about the effort that went into it, and compare it with a similarly sized story from an older sprint that folks also know. Were they about the same amount of effort? If not, why not? How can this help keep our estimates consistent and size stories more accurately in the future? Two people may put in the same amount of effort while one takes longer, which is again why we do NOT use days or convert points to days; time is very subjective. If we all agree on what each point size means in terms of effort, we will become more consistent and predictable in our estimating and commitments.
This is not meant to point out shortcomings or assign blame when estimates are “wrong”. It’s simply to figure out whether the team is on the same page when judging amounts of effort, and to identify discrepancies that hinder our estimating and therefore our predictability. It’s important to ensure that our team scale has not shifted and that people are not thinking about points (or shirt sizes, or dinosaurs, etc.) differently from one another.
So how should the retro be structured and facilitated?
1. Compile two to three stories with the same point value.
2. Do this for a few different point values (e.g. two 3-point stories, two 5-point stories…).
3. Read out the description and acceptance criteria for the same-sized stories.
4. Debrief on whether or not the actual effort in each story is similar to the other. Things to consider:
- Was the story broken down well?
- Were there tasks/subtasks?
- How much risk was accounted for in estimating?
- Did we find out more details later that changed the effort/estimate?
- Did we consider testing effort as well as development effort?
- What made the stories similarly sized?
- If you have a time tracking feature in your tool, consider glancing at the hours recorded (if they are accurate and used). Don’t harp on this but it could provide some insight or drive more conversation.
- How many people were working on the story? Was there pairing?
5. Conclude if the stories were sized correctly and took similar amounts of effort according to the team’s scale.
6. Repeat for other story size categories.
7. Debrief with the team on whether or not the team is on the same page when sizing stories.
8. Size some un-sized stories soon.
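If your tracking tool can export completed stories, steps 1 and 2 can be prepared before the retro with a few lines of script. The following is only a minimal sketch under assumptions: the `Story` fields, the function names, and the idea that the input list is ordered most recent first are all hypothetical and not tied to any particular tool. The hours spread is meant as a conversation starter, not a verdict on the estimate.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Story:
    key: str                        # hypothetical story identifier
    points: int                     # the size the team gave the story
    hours: Optional[float] = None   # tracked hours, if recorded at all


def pick_comparison_sets(stories, per_size=2):
    """Group stories by point value and keep up to `per_size` from each group.

    Assumes `stories` is ordered most recent first, so the picks are the
    ones the team is most likely to remember.
    """
    by_points = defaultdict(list)
    for story in stories:
        by_points[story.points].append(story)

    picks = {}
    for points, group in sorted(by_points.items()):
        if len(group) < 2:
            continue  # a single story has nothing to be compared against
        picks[points] = group[:per_size]
    return picks


def hours_spread(group):
    """Return max minus min tracked hours, or None if any story lacks hours."""
    hours = [story.hours for story in group]
    if any(h is None for h in hours):
        return None
    return max(hours) - min(hours)
```

Calling `pick_comparison_sets(stories, per_size=3)` gives up to three stories per point value to read out in step 3. The discussion in step 4 is still where the real work happens.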
Many teams get into ruts where someone says “it feels like a 5” but no one knows what a 5 is anymore. Or, during sizing or planning poker, team members concede and agree just to get done faster, avoiding the sizing conversations that can drive out risks and hidden effort. This exercise forces the team to look at similarly sized stories to see whether they really took the same amount of effort and whether the team’s point scale is consistent and agreed upon by everyone.
Many people will say this behavior isn’t seen in mature teams, but I would disagree. Any team can improve its sizing, especially if there is a lot of carry-over in a certain sprint. This can also identify how the POs, team, and ScrumMasters can help gather more information to get a better-defined story that is more straightforward to estimate. All these things tie together toward the common goal of increasing our predictability as a team and determining what we can commit to getting to done within a sprint, release, and version.
*Note – I’m not going to get into a debate about points versus days. I’ve been there, done that. I’m also not going to get into #noEstimates here. In my opinion at this time, points are preferable to days for estimating, so that effort, not subjective time, is what gets estimated. While #noEstimates is sexy, I do not believe it is viable in all organizations or appropriate for all teams. In general, do what works; if it’s not working, inspect and adapt.