To improve a software development project, it is critical to measure its current performance to analyze the project's health, research the root cause of problems, and set up improvement plans. A consistent measurement system is required to ensure a standardized measurement that makes sense at all levels of the organization.
A commonly misunderstood point is that the tenets of the Agile Manifesto are guidelines rather than rules. One of its values is "individuals and interactions over processes and tools." The key word is "over": the items on the left are valued more, but the items on the right still have value, a nuance that often gets lost.
Scrum itself prescribes few metrics, and metrics such as velocity are often not comparable between projects. From a project management perspective, comparing velocity across multiple projects is misleading, because individual teams may tend to overestimate or underestimate. This leads to inaccurate conclusions about a project's health.
The actual focus of measurement should therefore be on the number of working features in the software being developed at any given time. Running tested features (RTF) is an empirical metric that can be measured week in and week out for the project's entire duration.
Management often asks teams to provide metrics that reflect the progress of a project. When metrics are used to compare performance across projects, they often become counterproductive; the "invisible gun" effect leads teams to game the metrics system. This prevents teams from getting the full value out of the empirical Agile process and keeps them from achieving their best.
The RTF metric upholds the Agile Manifesto by valuing individuals and interactions over processes and tools. We measure, at any moment in a project, how many features are passing all acceptance tests. To show a linear RTF progression, the entire team must learn to be Agile and do Agile.
The primary focus of software development is to have a working product that delivers the most working features per dollar of investment. This idea is called running tested features (RTF), an empirical way to determine, at any point in the project cycle, the number of features that are:
- Running. Fully developed and tested features are added to the integrated product increment.
- Tested. The features are passing the automated acceptance tests.
Although it is a very powerful tool, the RTF ratio (the number of running tested features divided by the number of features developed so far) does not, on its own, completely reflect project health or support the comparison of progress among similar projects.
To measure projects, we propose two additional metrics to be used in conjunction with velocity and the RTF ratio:
- RTF completion ratio: Tracks project completion at a high level, as the number of features that pass automated tests relative to the total number of features in the project.
- RTF velocity ratio: Tracks the volume of story points that pass automated tests relative to the number of story points developed. This gives an accurate picture of the complexity of the features that do not pass automated regression tests.
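The three ratios can be written compactly using the letters from the Project X example in this article: A is the number of features developed so far, B the story points developed so far, C the number of RTF, D the story points belonging to those RTF, and N the total number of features planned for the project. This is a restatement of the definitions in the text, not an additional metric.

```latex
\begin{align*}
\text{RTF ratio} &= \frac{C}{A} \\
\text{RTF velocity ratio} &= \frac{D}{B} \\
\text{RTF completion ratio} &= \frac{C}{N}
\end{align*}
```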
The flowchart above shows how to incorporate the RTF metrics into a sprint workflow.
Understanding trends in RTF metrics
Let us take the example of Project X, which has ten features (N = 10). Assuming that the metrics are calculated at the end of each sprint, the calculations look something like this at the end of three sprints:
The RTF ratio and RTF velocity ratio should ideally remain constant at 1, assuming all completed features are working as expected. In real life, however, the RTF ratio and RTF velocity ratio fluctuate between 0 (all features failing) and 1 (all features passing), depending on how many completed features are failing automated regression tests.
For example, in Sprint 2 of Project X, the number of RTF is four (C) out of five developed features (A). The RTF Ratio is 4/5 = 0.80. The total number of story points developed up to Sprint 2 is 17 (B), of which 12 story points (D) correspond to the 4 RTF. The RTF Velocity Ratio is 12/17 = 0.71.
The RTF completion ratio should be linear, ideally increasing from project initiation, and reaching peak value at the end of the project. It may have values ranging from 0 (“No features are running and/or tested”) to 1 (“All features are running and tested”).
Project X has a total number of ten features, represented by N. In Sprint 2, in which there are four RTFs, the RTF completion ratio = 4 / 10 = 0.40.
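The Sprint 2 arithmetic for Project X can be reproduced with a short Python sketch. The class and method names below are illustrative and not part of any tool named in the article, and returning 0.0 when nothing has been developed yet is an assumption made here to avoid division by zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintSnapshot:
    """Cumulative counts at the end of a sprint (letters follow the text)."""

    features_developed: int      # A
    story_points_developed: int  # B
    rtf_count: int               # C
    rtf_story_points: int        # D
    total_features: int          # N

    @staticmethod
    def _ratio(numerator: int, denominator: int) -> float:
        # Assumption: report 0.0 rather than fail when the denominator is 0.
        return numerator / denominator if denominator else 0.0

    def rtf_ratio(self) -> float:
        return self._ratio(self.rtf_count, self.features_developed)

    def rtf_velocity_ratio(self) -> float:
        return self._ratio(self.rtf_story_points, self.story_points_developed)

    def rtf_completion_ratio(self) -> float:
        return self._ratio(self.rtf_count, self.total_features)


# Project X at the end of Sprint 2, using the figures quoted in the text.
sprint2 = SprintSnapshot(
    features_developed=5,
    story_points_developed=17,
    rtf_count=4,
    rtf_story_points=12,
    total_features=10,
)

print(round(sprint2.rtf_ratio(), 2))             # 0.8
print(round(sprint2.rtf_velocity_ratio(), 2))    # 0.71
print(round(sprint2.rtf_completion_ratio(), 2))  # 0.4
```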
Action plan during a downward trend
Any of the RTF trends may dip during the lifetime of a project when features are not working as expected. A dip requires immediate inspection and a Plan–Do–Check–Act (PDCA) cycle, also known as the Deming cycle, to kick off an analysis of the root cause. This fosters agility in Scrum Teams and encourages them to consistently deliver completely tested features.
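As a minimal sketch of how a dip might be flagged automatically, the function below compares each sprint's ratio with the previous one. The function name, the `tolerance` parameter, and the sample series are all hypothetical; the article does not prescribe a detection rule.

```python
def find_dips(ratios, tolerance=0.0):
    """Return 1-based sprint numbers whose ratio fell below the previous sprint's.

    `tolerance` (an assumed parameter) ignores drops smaller than the given amount.
    """
    dips = []
    for index in range(1, len(ratios)):
        if ratios[index] < ratios[index - 1] - tolerance:
            dips.append(index + 1)  # sprint numbers start at 1
    return dips


# Hypothetical RTF ratio per sprint, not data from the article.
rtf_ratio_by_sprint = [1.0, 0.8, 1.0, 1.0, 0.75, 0.9]

print(find_dips(rtf_ratio_by_sprint))  # [2, 5]
```

Each flagged sprint would be a candidate for starting a PDCA cycle; the same check can be run on the RTF velocity ratio and the RTF completion ratio.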
Measuring the velocity of a project is a general approach to assessing the performance of an Agile project. The velocity trend, along with RTF metrics, helps provide a correct representation of the team’s progress and is an effective tool to compare multiple Scrum projects running in an organization.
Case study A
The first case study compares Projects A and B, which have widely dissimilar velocities but a nearly equal number of features, over a period of ten sprints.
From Sprints 1 to 10:
- Project A has developed 254 story points for 55 features.
- Project B has developed 411 story points for 58 features.
- Project A has a total of 58 features (NA = 58) to develop for the project.
- Project B has 60 features (NB = 60).
Looking at the velocity trend alone, it appears that Project B is better performing than Project A, as it has a greater number of story points. Let us study the RTF trends below to get a better picture.
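One quick check, using only the totals quoted above, is the average number of story points per developed feature. The dictionary layout is illustrative; the calculation is a rough sketch and cannot tell whether the gap comes from different estimation scales or from genuinely larger features.

```python
# Totals quoted in the text for Sprints 1 to 10 of case study A.
projects = {
    "A": {"story_points": 254, "features": 55},
    "B": {"story_points": 411, "features": 58},
}

# Average story points per developed feature, rounded to two decimals.
points_per_feature = {
    name: round(totals["story_points"] / totals["features"], 2)
    for name, totals in projects.items()
}

print(points_per_feature)  # {'A': 4.62, 'B': 7.09}
```

Project B records roughly 50 percent more story points per feature than Project A, which is consistent with the caution that raw velocity is not directly comparable between teams.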
Comparing RTF ratios:
Even with widely different velocity trends, we can see that the RTF ratio trends for Project A and Project B are closely aligned. The sharp dips in Sprint 2 and Sprint 5 indicate that there were considerable regression test failures at those points, calling for inspection of what went wrong. The root cause was found and corrected, as is evidenced by the increase in the RTF ratio in the following sprint for both projects.
Comparing RTF velocity ratios:
Although Projects A and B have similar RTF trends, we can see here in Sprint 5 that Project B shows a steep dip in the velocity ratio; this indicates that stories with a higher number of story points have failed regression tests. Stories with higher story points that fail regression tests are more likely to impact other functional areas and may even be showstoppers.
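The effect described here can be illustrated with a small Python sketch. The two backlogs below are invented for illustration only: both have one failing feature out of five, so their RTF ratios match, but the RTF velocity ratio drops much further when the failing story is the large one.

```python
def rtf_ratios(stories):
    """Return (RTF ratio, RTF velocity ratio) for a list of
    (story_points, passes_regression) pairs."""
    total_features = len(stories)
    total_points = sum(points for points, _ in stories)
    passing_features = sum(1 for _, passed in stories if passed)
    passing_points = sum(points for points, passed in stories if passed)
    return passing_features / total_features, passing_points / total_points


# Hypothetical backlogs: same story sizes, different story failing.
small_story_fails = [(3, True), (5, True), (8, True), (13, True), (2, False)]
large_story_fails = [(3, True), (5, True), (8, True), (2, True), (13, False)]

for label, stories in [("small", small_story_fails), ("large", large_story_fails)]:
    ratio, velocity_ratio = rtf_ratios(stories)
    print(label, round(ratio, 2), round(velocity_ratio, 2))
# small 0.8 0.94
# large 0.8 0.58
```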
This metric allows us to incorporate quality feedback earlier in the application development life cycle, triggering quicker fixes, extensive root cause analysis, and helping in the creation of effective defect-prevention plans.
Comparing RTF completion ratios:
The RTF completion ratio trend is very similar to the release burn-up chart. One of the additional features of the RTF completion ratio is that it helps mark that point in the project timeline where there was a failure event (e.g., code breakage). Usually, if the failure event is remediated, the project will be back on the linear track the next time the metric is measured.
Case study B
The second case study compares Projects A and B, which have nearly identical velocities and numbers of features, over a period of ten sprints.
From Sprints 1 to 10:
- Project A has developed 458 story points for 60 user stories or features.
- Project B has developed 457 story points for 60 user stories.
- Project A has a total of 60 features (NA = 60) to be developed in the project.
- Project B has a total of 60 features (NB = 60).
Looking at the velocity trend alone, it appears that Projects A and B have comparable project complexity and are making similar progress, as they have a similar number of story points.
Comparing RTF ratios:
Even with nearly identical velocity trends and the same number of features, we can see that the RTF ratio trends for Project A and Project B are widely different. Project B shows a sharp dip in RTF from Sprint 5 through Sprint 7, indicating a major cause of failure at that point and calling for urgent inspection of what is going wrong.
Even though the RTF ratio increases after Sprint 6, it is still far below the standard value of 1. Such a trend may be attributed to showstopper defects or severe code breakage due to lack of unit testing, insufficient integration testing, incorrect automated regression test cases, etc. In any case, without urgent action, projects may slip beyond stipulated timelines, which might ultimately hurt the organization’s business.
Comparing RTF velocity ratios:
Referring to the RTF ratio comparison in the previous plot, we can see that the RTF ratio for Project B improves after Sprint 6. The RTF velocity ratio for Project B, however, actually declines by 9 percent in Sprint 7; this indicates that a greater number of story points have failed regression tests. Careful root cause analysis is required to fix the cause of the variation. The trend shows that the cause of variation was progressively reduced and was eliminated completely in Sprint 10, when all features are "running" and "tested."
Comparing RTF completion ratios:
The RTF completion ratio trends for both Projects A and B are linear and increase progressively. Compared to Project A, however, Project B experienced a dip from Sprint 6 through Sprint 8.
Solution business benefits
To completely understand the benefits of RTF trends, we need to see how the RTF trend looks for a typical Waterfall project.
Waterfall projects "deliver" all features in one go, without testing. Features that are not tested often do not work, thus pulling down the RTF. There is considerable overhead when rework is required after defects are discovered during the testing cycle. Also, not all features can be tested in one cycle, because the total number of features might be too high. The RTF for non-Agile projects builds from the point of code delivery and often dips because of new defects. For this reason, non-Agile projects just cannot deliver RTF consistently.
RTF in Agile projects not only reflects the project's health accurately and empirically but it also:
- Demands that the number of RTF grows from day one — the focus being on features, not only on design or the infrastructure.
- Ensures that the RTF grows consistently; teams must constantly refactor and integrate as often as possible.
- Requires automated user acceptance tests, which call for more interaction with the business and provide independent and comprehensive test scenarios.
- Relies on automated tests that minimize human error, such as missed test runs, and save effort that can be put to use elsewhere.
- Publishes RTF at regular intervals, which results in thorough rounds of regression, aiding in early detection of issues.
- Fosters agility by incorporating the PDCA cycle whenever there is a dip in RTF.
For organizations that need to constantly measure and analyze the progress of Agile projects, no metric paints a better picture than RTF. Velocity trends are often used to compare projects; however, it is important to understand that no two projects are alike, and teams having different sets of people often estimate differently. Some teams estimate higher than others, which might lead to misconceptions that a team with lower velocity is underperforming.
RTF eliminates such anomalies; teams are measured on how many features are running and tested to create a product that has value. And this is what we care about most.
- Raghu Angara, "Agile Metrics: RTF." Scrum Alliance, June 9, 2014, https://www.scrumalliance.org/community/articles/2014/june/agile-metrics-(1).
- Ron Jeffries, "A Metric Leading to Agility," June 14, 2004, http://ronjeffries.com/xprog/articles/jatrtsmetric/.