Metrics have always been a key element of any project management method. As industry adoption of Agile principles increased, both the metrics themselves and the approach toward them began to change. In Agile projects today, velocity is often treated as the measure of a team’s progress, and hence as the most significant, or even the only, way to measure productivity. This write-up attempts to explain why this can go wrong and what alternatives can bridge the gap.
Evolution of metrics
Traditional organization-related metrics:
When teams were smaller, the focus was on the actual code or software delivered. However, as organizations scaled up, there was a need to have KPIs and metrics in place. The underlying thought was that “if you can’t measure it, you can’t improve it.” This led to a host of metrics in project management to allow managers to measure the progress. The focus of this paper is on metrics related to a team’s performance. Some of the traditional metrics used for this are:
- Resource utilization: Utilization % = (Total effort spent by the resource / Total budgeted effort for the resource) × 100
- Percentage completion
- Defect density, defect age, defect resolution rate, etc.
Agile approach to metrics:
With the advent of Agile, the approach toward metrics changed. Agile evangelists started coaching teams that it is more important to understand the purpose and actual use of any metric than to measure for measurement’s sake.
Some of the objections to traditional metrics are explained below.
Resource utilization:
Agile principles indicate that measuring a person’s effort is not an indication of the value delivered to customers. The estimate itself might have been incorrect, which is acceptable, since an estimate is by definition an approximation. Holding a person responsible only for hours burned in the office cannot in any way ensure good software delivery.
Percentage completion:
We have often seen stories or parent work items show up as 90% complete because 9 out of 10 tasks are done, or 90% of the estimated hours have been spent, or the remaining time is estimated at 10%. However, that does not mean that the customer has received 90% of that feature. Until the feature or functionality is completely delivered, the customer has received zero value.
Hence Agile encourages tracking status only as Done or In Progress. The Agile principles state, “Working software is the primary measure of progress.” This means that a story is either Done or not Done, so percentage measures are not relevant.
Defect density, defect age, defect resolution rate, etc.:
Defect-related metrics are tricky. A high number of defects can be interpreted as meaning that the code was poor to begin with, or it can mean that the QA team has worked well. Similarly, a low number of defects can mean the code was good, or it can mean that the QA team has not been thorough.
Agile says (in effect) that since the customer has not asked for defects, any defects identified need to be resolved as part of the development work itself. Defects can still be tracked for root cause analysis (RCA), but the above metrics are not conclusive.
Also, if a story is not Done in the true sense, a “defect” is really a work-in-progress task. Hence defects should be counted only in higher environments, such as UAT or pre-production. For example, the number of defects per story point found in UAT or a higher environment can be a metric.
Current state: Agile metrics
Since Agile projects have fewer metrics than traditional ones, the primary attention often goes to velocity.
Velocity is the number of story points delivered in a sprint. Teams normally use rolling average velocity to account for sprint-specific aberrations. Velocity can give a good idea for forecasting the end date of planned work (provided those user stories have been estimated).
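The forecasting use of velocity described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the three-sprint window, the function names, and the sprint history are all assumptions made for the example.

```python
import math


def rolling_average_velocity(velocities, window=3):
    """Average of the most recent `window` sprint velocities (window size is an assumption)."""
    if not velocities:
        raise ValueError("need at least one completed sprint")
    recent = velocities[-window:]
    return sum(recent) / len(recent)


def sprints_remaining(remaining_points, velocities, window=3):
    """Whole sprints needed to finish the estimated backlog at the rolling average."""
    average = rolling_average_velocity(velocities, window)
    if average <= 0:
        raise ValueError("rolling average velocity must be positive")
    return math.ceil(remaining_points / average)


# Hypothetical history: story points delivered in the last five sprints.
history = [21, 26, 24, 18, 27]

average = rolling_average_velocity(history)   # (24 + 18 + 27) / 3 = 23.0
forecast = sprints_remaining(120, history)    # ceil(120 / 23.0) = 6
print(average, forecast)
```

The forecast only covers stories that have already been estimated; unestimated backlog items are invisible to it.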
It should be noted that story points should not be equated to a time measurement; a conversion such as x story points = y hours should be avoided. Such a conversion has a drawback: we lose the advantage of the relative estimation on which velocity calculation relies.
Story points normally correspond to a range of time, so the hours taken for stories of a given size tend to follow a normal distribution rather than a fixed linear conversion.
That means that a story of 1 story point will have a normal distribution of hours taken. A second story of 2 story points will have a normal distribution that partially overlaps with that of the smaller story. The conclusion is that, since story points cover a range, it is not a good idea to equate story points to hours.
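To make the overlap concrete, the sketch below models the two story sizes with Python’s standard `statistics.NormalDist`. The means and standard deviations are purely hypothetical numbers chosen for illustration, and treating the two efforts as independent is also an assumption.

```python
from statistics import NormalDist

# Hypothetical effort distributions, in hours, for two story sizes.
one_point = NormalDist(mu=6, sigma=2)
two_point = NormalDist(mu=10, sigma=3)

# Overlap coefficient: shared area under the two density curves
# (0 means disjoint, 1 means identical distributions).
overlap = one_point.overlap(two_point)

# Distribution of (2-point hours minus 1-point hours), assuming independence.
difference = two_point - one_point

# Probability that a 1-point story ends up taking longer than a 2-point story.
p_reversed = difference.cdf(0)

print(f"overlap coefficient: {overlap:.2f}")
print(f"P(1-point story takes longer than 2-point story): {p_reversed:.2f}")
```

With these assumed numbers, a noticeable share of 1-point stories take longer than some 2-point stories, which is why a fixed points-to-hours conversion misleads.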
Velocity, when observed over a few sprints, automatically flattens out the ranges, and once the team, platform, and technology are stable, we begin to see fair uniformity in the velocity. This, coupled with the rolling average, is sufficient to forecast without the need to measure everything at the granular hour level.
Problems in using velocity as a productivity metric
As projects move to Agile, top management or program management still needs to measure team productivity, and teams are often asked to “prove” productivity improvement through an improvement in velocity.
While this is a fair expectation for a team working in a completely stable environment, that is rarely the actual case. New impediments come up in various forms, such as:
- Impediment due to a third-party team with which we are integrating
- Changes in infrastructure at the wider level needing work to be done by the team
- Changes in requirements, which are freely accepted in Agile, e.g., there could be minor changes to the UI suggested as part of feedback, or there could be minor data encryption requirements. If these are small in nature, it is recommended to make them part of the actual user story that they impact. However, it is still additional work that the team is doing.
- Reusable code pieces that are estimated with fewer points when they are reused, which leads to a drop in velocity, e.g., 2 components of 13 story points each, delivered in sprint 1, give a velocity of 26 points in that sprint. In the subsequent sprint, if the team is creating similar components, reuse might lead them to estimate these at 8 points each. So instead of 2, the team might deliver 3 components in the sprint, but the velocity would be 24 points. That is, velocity drops from 26 to 24 even though 3 components were delivered instead of 2.
The above points are either factors beyond a team’s remit, or things that actually lead to better and faster delivery to the end customer but are not reflected in the metrics if velocity is used as the sole yardstick for measurement.
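The reuse example above can be checked with a short sketch. The `Sprint` class and its field names are hypothetical helpers introduced only for this illustration; the point values are the ones used in the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sprint:
    name: str
    component_points: List[int]  # story points estimated for each delivered component

    @property
    def velocity(self) -> int:
        """Story points delivered in the sprint."""
        return sum(self.component_points)

    @property
    def components_delivered(self) -> int:
        """Number of components delivered, regardless of their size."""
        return len(self.component_points)


sprint_1 = Sprint("Sprint 1", [13, 13])    # two new components
sprint_2 = Sprint("Sprint 2", [8, 8, 8])   # three similar components, cheaper due to reuse

for sprint in (sprint_1, sprint_2):
    print(sprint.name, sprint.velocity, sprint.components_delivered)
# Sprint 1 26 2
# Sprint 2 24 3
```

Velocity falls while delivered output rises, so a velocity-only view would report the second sprint as less productive.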
How to measure team productivity
Since the main intent of an Agile team is to deliver value to the product owner and ultimately to the actual users, there are a few aspects that actually tell how productive a team has been.
- Continuous improvement — There can be an agreed set of KPIs for tracking improvements, e.g., code coverage, time spent in avoiding/reducing technical debt (such as refactoring for better code quality), innovations, etc.
- Process improvements — Moving toward true continuous delivery, or Agile maturity improvements. Agile maturity steps can be seen in terms of how the team takes ownership of the sprint backlog, how the stories are groomed, tools used for collaboration, etc.
To understand what should be measured to accurately reflect a team’s performance, we need to understand which items are within the team’s ownership and control. As discussed earlier, velocity in story points is affected by other teams: for a user story to be truly Done, it might have to be cross-checked by other teams, such as Risk and Compliance.
The one thing that the team fully owns is the set of tasks within a sprint. Hence it makes sense to track tasks. Below are some parameters of tasks in a sprint backlog that can be tracked, and the pros and cons of each.
Task hours:
Teams can very well estimate tasks in hours. Since each task is often owned by an individual developer, we don’t face the problem that arises when story points are estimated in hours, where multiple people provide different hour estimates for the same story. In addition, task-level estimates tend to be fairly accurate.
However, task hours can end up being monitored for the wrong reasons, such as time tracking to monitor resource utilization. Managers also differ in their view of which hours to fill in and track at the task level: actual effort spent, percentage completion, or both.
Whether the team is using task hours or task points, a burn-down chart is effective in checking whether progress is being made daily in the sprint. Any impediment will result in a deviation between the expected and actual lines, as the actual line runs flat while work is impeded. The ScrumMaster will keep a watch on the trend line, and more flat phases in the trend line indicate frequent blockers. That is, the burn-down trend is not really a separate metric but rather a way to use the task hours or task points effectively.
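Spotting flat phases in the actual line can also be automated. The sketch below is one possible approach, under the assumption that the team records remaining task hours or task points at the end of each day; the function name, the two-day threshold, and the sample data are all hypothetical.

```python
def flat_phases(remaining_by_day, min_days=2):
    """Find stretches where remaining work did not decrease.

    remaining_by_day[i] is the remaining work recorded at the end of day i.
    Returns (first_day, last_day) index pairs for every run of at least
    `min_days` consecutive days without a decrease.
    """
    phases = []
    run_start = None
    for day in range(1, len(remaining_by_day)):
        stalled = remaining_by_day[day] >= remaining_by_day[day - 1]
        if stalled and run_start is None:
            run_start = day
        elif not stalled and run_start is not None:
            if day - run_start >= min_days:
                phases.append((run_start, day - 1))
            run_start = None
    # Close a run that lasts until the final recorded day.
    if run_start is not None and len(remaining_by_day) - run_start >= min_days:
        phases.append((run_start, len(remaining_by_day) - 1))
    return phases


# Hypothetical sprint: remaining task points at the end of each day (day 0 = sprint start).
remaining = [40, 36, 36, 36, 30, 24, 24, 18, 10, 4, 0]

print(flat_phases(remaining))              # [(2, 3)]
print(flat_phases(remaining, min_days=1))  # [(2, 3), (6, 6)]
```

A recurring pattern of such phases across sprints is a prompt for a conversation about blockers, not a score for the team.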
Task points:
Task points can be used to overcome some of the problems with hours. It is also faster to estimate in task points, as is the case with story points. However, teams or management might confuse task points with story points.
Task points work well only once all stakeholders understand that there is no relation between story points and task points. Task points are purely for the team to assess their own tasks. The burn-down of task points will still be similar to the burn-down of hours, with the small difference that hour burn-downs tend to be more linear, whereas point burn-downs show dips.
The team can have a rule of thumb about what constitutes 1 task point, what constitutes 2 or 3, etc. It is advisable to have only three levels of task points for easy estimation, corresponding to simple, medium, and complex tasks. The points should be given based on gut feel rather than by mapping hours to points, or the benefit of relative estimation will be lost.
Using task points makes estimation faster than using hours, and yet it serves the purpose of having items within a team’s control, and the completion trend can be viewed against an ideal burn-down.
Percent completion in each sprint:
Since the ultimate aim is the overall release, the team can track the percent of the total release estimated points completed every sprint.
The drawback is that if there is a scope increase, the percent completion in a sprint will show a drop. So this metric can be useful only in case of fairly stable scope or at least a stable identified MVP for a release.
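The effect of a scope increase on this metric is easy to see with numbers. In the sketch below, the function name and the release figures are hypothetical: the team completes the same 25 points in each sprint, but the release scope grows from 200 to 250 points before the third sprint.

```python
def percent_of_release(points_completed_in_sprint, release_scope_points):
    """Share of the total estimated release scope completed in one sprint."""
    if release_scope_points <= 0:
        raise ValueError("release scope must be positive")
    return 100 * points_completed_in_sprint / release_scope_points


# Hypothetical data: (points completed in the sprint, total release scope at that time).
sprints = [(25, 200), (25, 200), (25, 250)]

progress = [round(percent_of_release(done, scope), 1) for done, scope in sprints]
print(progress)  # [12.5, 12.5, 10.0]
```

The third sprint looks weaker purely because the denominator changed, even though the team delivered exactly the same amount.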
Ratio of points committed vs. completed in the sprint:
If the product owner is in sync with the team and is setting sprint goals that lead toward the product release goal, then the team can measure the ratio of points committed versus completed.
However, this does not measure a team’s productivity as such, only the improvement in sprint planning. It is therefore useful mainly for making forecasts more predictable.
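As a small illustration of how this ratio might be tracked, the sketch below computes it for three sprints. The function name and the committed and completed figures are invented for the example.

```python
def completion_ratio(committed_points, completed_points):
    """Ratio of points completed to points committed in a sprint."""
    if committed_points <= 0:
        raise ValueError("committed points must be positive")
    return completed_points / committed_points


# Hypothetical data: (points committed, points completed) per sprint.
sprints = [(30, 21), (28, 24), (26, 25)]

ratios = [round(completion_ratio(committed, completed), 2)
          for committed, completed in sprints]
print(ratios)  # [0.7, 0.86, 0.96]
```

Here the ratio improves while the committed points shrink, which shows better planning rather than higher output.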
When using any productivity metric, the purpose of the metric and the maturity and stability of the engagement need to be considered. As the team matures, the tendency to focus on effort tracking and hours reduces. This is because, in time, the velocity stabilizes enough that effort tracking is no longer needed for measuring productivity. The focus shifts toward the nonmeasurable parameters, or a combination of measurable and nonmeasurable ones.
By using the burn-down trend effectively, and a task unit such as task points, the team can perform with agility. Also, productivity can be tracked at multiple levels.
- At the Scrum Team level: If a team’s productivity is to be tracked, the focus should be on how effectively the tasks are being completed. For this, the task points burn-down with the burn-down trend should be used, as the tasks are in the team’s control. Along with this, parameters such as quality, innovation, etc., can be kept track of. The Agile maturity of the team can also be assessed periodically. There are a number of assessment questionnaires available.
- At the project level: Since the project includes other teams that are integration points with the Agile team, the success indicator of the project is the velocity measured in story points. A stable work environment, with fewer impediments caused by integration points or third-party dependencies, is also a sign of stable, mature processes.
- At the organization/engagement level: The success of an Agile engagement is the actual product launch itself. Here the “product” may not be an actual product but rather any deliverable that was the key outcome required. The indicator of a stable, mature Agile engagement is the number of releases, or the frequency of release. This is because one of the purposes of Agile itself is incremental development, which includes frequent releases.