Technical Debt for PMs

18 September 2012

Victor Szalvay
CollabNet

 

This article was originally featured at ProjectsAtWork and is reprinted here with permission.

 

If you are a software project manager or product owner, the term “technical debt” may not ring a bell. Over the last several years managing Agile teams, I’ve come to realize that lurking technical debt introduces risk and represents unplanned work that impacts schedules and, ultimately, the long-term health of projects. I believe that project managers and product owners should make technical debt a priority and factor a “repayment” program into project schedules as a regular part of planning.

Technical debt is a term coined by Ward Cunningham to describe the cumulative consequences of corners being cut throughout a software project’s design and development. The basic idea is that a development team, over time, cuts corners due to lack of skill, laziness or pressure, which then manifests itself as a large backlog of technical inefficiencies that needs “paying off” in the future. Technical debt encompasses issues related to the code base, development environment, platform and libraries, architecture and design, and test automation, among others. The impact on the project includes reduced velocity, recruiting difficulties, and high defect rates.

Consider this hypothetical scenario. Bill the PM has a strong track record of successfully delivering on promised schedules or features. In order to make this happen under pressure, Bill’s team has neglected the code base to the point where only certain people understand parts of the code. The code base is so convoluted and difficult to understand that few new recruits are willing to work on it. Worse still, adding new features takes longer and longer as time passes. Bill has seemingly won schedule-related battles; however, the product has slowly deteriorated and the war is nearly (and surprisingly) lost.

What had Bill missed? Monitoring the technical health of a software system is just as critical as monitoring scope and schedule. Not the PM’s role, you say? With the popularity of Agile methods booming, the product owner often unwittingly takes on this responsibility as the “single wring-able neck.” Even in traditional shops, the PM often tracks risks and estimates their impact on milestone schedules. What’s more, engineers aren’t always forthcoming about the technical debt situation for reasons of pride or job security. So it often falls on the PM to ask the right questions to ascertain the real state of technical affairs.

There are several areas of technical debt that need examination. Below I focus on the four that I’ve found most impactful. For purposes of brevity I assume an object-oriented programming language (like Ruby, Java, C#, C++, etc.).

Code base

Sure, you have some defects here and there, but your software works for the most part. What else is there to know? Actually, quite a bit. If your code is a Big Ball of Mud, chances are the speed at which you deliver features (or velocity) is declining. The code is also likely to be brittle when modified, leading to regression defects. One of the biggest impacts, though, is that most talented software engineers aren’t going to be interested in working on it. And even in today’s brutal job market, talented software engineers have their pick of the lot when it comes to workplaces. To get a sense for the quality of the code base, sit down with trusted engineers and ask the following questions:

 

  • Is our current code base neatly separated into layers, like the MVC pattern suggests? In other words, is business logic separated from the data access and presentation layers, etc.? You’re asking whether coders have discipline in terms of where certain things live in the code. If there is little or no separation of layers, you are likely sitting on a big ball of mud.
  • How many lines of code is our average class/method? What is our longest class/method? The longer the worse. There is no firm rule here but long, complicated classes and methods point to lack of discipline in terms of maintainability.
  • Is the current code base well documented? If not, new recruits will have a tough time getting up to speed and current team members will be hesitant to work on certain areas of the code.
  • Do most functions have a “single source of truth” or is there duplicate code everywhere? It is risky if there is an unknown number of places in the code that must be updated to make a single enhancement.
  • I want to visually inspect the longer methods. You may be out of your depth here, but you should be able to spot overly complex code without understanding it. For instance, look for conditional statements that nest to a ridiculous degree; more than a couple of layers of indenting should rarely be necessary.
  • What are the project’s code style standards and how are they enforced? You’re looking to see that there are standards at all and that they are widely known throughout the team. Better still is if the SCM or build process has automated checks in place to catch violations of the standards. Standards make it easier to spot defects and make it easier for team members to work on each other’s code.
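
Some of these questions can be answered by a script rather than by opinion. What follows is a minimal sketch, not a substitute for a real static analysis tool. It uses Python and its standard ast module purely for illustration; the names function_report, max_nesting and SAMPLE are hypothetical, and end_lineno assumes Python 3.8 or later. It lists each function with its length in lines and its deepest level of nested control flow, longest first.

```python
import ast

# Constructs that add one level of indentation to the code inside them.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def max_nesting(node, depth=0):
    """Return the deepest level of nested control-flow blocks under node.

    Caveat: Python parses "elif" as an If inside the previous If's else
    branch, so long elif chains are counted as extra nesting here.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def function_report(source):
    """Return (name, line_count, max_nesting_depth) tuples, longest first."""
    report = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            report.append((node.name, length, max_nesting(node)))
    return sorted(report, key=lambda row: row[1], reverse=True)


# Hypothetical code to analyze; in practice you would read real source files.
SAMPLE = '''
def tidy(x):
    return x + 1

def tangled(orders):
    total = 0
    for order in orders:
        if order.paid:
            for item in order.items:
                if item.taxable:
                    total += item.price
    return total
'''

if __name__ == "__main__":
    for name, length, depth in function_report(SAMPLE):
        print(f"{name}: {length} lines, nesting depth {depth}")
```

Run against the sample, the report puts tangled (8 lines, nesting depth 4) ahead of tidy (2 lines, depth 0). The numbers matter less than the conversation they start: ask the team why the functions at the top of the list look the way they do.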

 

Platform/Architecture

Most software today is built on the shoulders of internal or third party libraries, application servers, and development frameworks aimed at speeding development.  

 

  • Do we rely on a third party application or database server and when was the last time we upgraded those components? Falling too far behind the latest version can lead to support problems or force the team to spend time reintegrating the application.
  • Do we rely on any outdated libraries? The risk is that while these libraries may work now, they may no longer be actively developed, or upgrading to a current version may be time-consuming or risky.
  • What data persistence technology do we use and how current is it? Do we use a custom database query generator or do we use something standard (like Hibernate in the Java world)? Falling behind here can impact application performance as well as developer productivity. A custom query generator shouldn’t be necessary in today’s world of ORM frameworks.

 

Build & Development Environment

 

Unless you’re doing something very simple, it’s likely that your developers operate in a complex development environment. They may be using IDEs and build tools that have tremendous impact on their daily productivity and happiness.  

  • Is there a standard IDE configuration across the group, and are you happy with what you’re using? Even if your team uses different IDEs, the configuration can likely be shared. A shared configuration means developers can spend less time fiddling with their environment and standards can be more easily applied.
  • How long does it take to build, and are you happy with that? Imagine having to wait 10 or 20 minutes or more to see a single code change; that is actually quite common (and frustrating). If you had to wait that long, wouldn’t you be tempted to make blind changes without testing them yourself? Build times can usually be improved, but doing so requires dedicated time set aside for it. Long build times also impact developer focus. If the build takes “too long,” developers can lose focus or wander off to browse the web or even have a sword fight.
  • What’s involved in building the software? Can you walk me through a build on my laptop? What you’re looking for here is cumbersome and complex build requirements that can’t be satisfied by a developer’s workstation alone. Developers are sometimes forced to build on separate machines, which are often in short supply, and there is time and patience lost in these obtuse configurations.

 

Test Automation

This is the Achilles’ heel of most legacy systems: lots of functionality but very little test automation to catch regression failures. If you haven’t noticed, many of these areas are intertwined; e.g., you need good separation of code into layers in order to unit test code (a type of automated test that yields quick results). 

  • Can average QA/test team members build for themselves? If not, your QA team members are probably operating too far behind mainline development. QA should be able to build, run automated tests, and do exploratory testing of the freshest check-ins.
  • What automated testing tools do we use? You’re looking to make sure there isn’t an over-reliance on the old record/playback tools. Those tools are too restrictive on who can run the tests and how they can be run. More importantly, with those tools developers typically don’t get involved in writing tests. You want to hear about tools that are based on the major “xUnit” frameworks like JUnit, NUnit, and the like. When developers write tests, they also tend to write code that can be easily tested, which matters even more.
  • Approximately what percentage of the code is covered by automated tests? And what percentage of those tests are unit vs. integration/system tests? You are looking for a fair balance, weighted more toward unit tests. Unit tests run in milliseconds while integration tests run in minutes/hours. Too many integration/system tests lead to slow feedback after developer check-ins. The bigger the gap between check-in and test feedback, the more likely regression failures and bugs will slip through the cracks. Use the percentage as a pointer to code that is not covered by any tests. 100% test coverage does not mean there are no bugs, though.
  • Describe our continuous integration process in detail. Once you have a suite of tests (unit and integration/system), running them frequently is important. Better still if they run automatically, and best of all if they run automatically every time there is a check-in. Follow-on questions: How are developers alerted of broken tests? On average, how long before they are fixed (there are some web-based tools that aggregate this data if you want to review it)?
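
If you have never seen an xUnit-style test, here is roughly what one looks like. This is a sketch using Python’s built-in unittest module (a member of the xUnit family) only for illustration; the order_total function, its tax rule, and the use of integer cents are hypothetical assumptions. Notice that the function under test touches no database or screen, which is what makes it quick to test in isolation.

```python
import unittest


def order_total(prices_in_cents, tax_rate_percent):
    """Hypothetical business logic: no database or UI access, only arithmetic."""
    if tax_rate_percent < 0:
        raise ValueError("tax rate cannot be negative")
    subtotal = sum(prices_in_cents)
    # Integer arithmetic on cents avoids floating-point rounding surprises.
    return subtotal + subtotal * tax_rate_percent // 100


class OrderTotalTest(unittest.TestCase):
    def test_adds_tax_to_subtotal(self):
        self.assertEqual(order_total([1000, 500], 10), 1650)

    def test_empty_order_costs_nothing(self):
        self.assertEqual(order_total([], 10), 0)

    def test_rejects_negative_tax_rate(self):
        with self.assertRaises(ValueError):
            order_total([1000], -5)


if __name__ == "__main__":
    unittest.main()
```

Each test states one expectation in a few lines and needs no special environment to run. A continuous integration server can run hundreds of tests like these on every check-in.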

Without deliberate effort (read: huge amounts of time) aimed at slowly improving these problems, it is likely that they will persist and get worse. Leading indicators are performance problems, inability to hire developers, and slowing development velocity. Of course, you may be working on a throw-away project where longevity is unimportant. However, most talented engineers prefer building things with quality in mind; call it the “craftsman ethos.”

For systems with a lot of technical debt problems, it’s not uncommon to spend 20-50% of ongoing development time repaying that debt. Obviously that has an impact on project schedules. The risk of ignoring these problems, however, is sudden and impactful failures, perhaps in production. I work with my technical team to prioritize this work and make it an everyday part of the project plan. By chipping away at technical debt in a long-term fashion, it is possible to reduce the risk of catastrophic production failures, improve the morale of developers, and ultimately speed up development on legacy systems.
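
To make the schedule impact concrete, here is a rough sketch of the arithmetic. The figures are hypothetical assumptions, not data from a real project: a 120-point feature backlog and a team velocity of 30 points per sprint. The helper name sprints_needed is also made up for this example.

```python
def sprints_needed(backlog_points, velocity, debt_percent):
    """Sprints required to finish the feature backlog when debt_percent
    of every sprint's capacity is reserved for repaying technical debt."""
    if not 0 <= debt_percent < 100:
        raise ValueError("debt_percent must be between 0 and 99")
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    # Integer ceiling division, scaled by 100 to avoid floating-point math.
    numerator = backlog_points * 100
    denominator = velocity * (100 - debt_percent)
    return -(-numerator // denominator)


if __name__ == "__main__":
    for percent in (0, 20, 50):
        sprints = sprints_needed(120, 30, percent)
        print(f"{percent}% on debt repayment: {sprints} sprints")
```

With these assumptions the backlog takes 4 sprints with no repayment, 5 sprints at 20%, and 8 sprints at 50%. The sketch deliberately holds velocity constant; it ignores both the slowdown that unpaid debt causes and the speed-up that repayment eventually buys, so treat it as a starting point for planning rather than a forecast.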

The decision to incur technical debt should be a conscious one, made in the full light of day. However, unrealistic timetables and feature pressure often force engineers to cut corners without the full knowledge of the management team. There may be valid reasons to take technical shortcuts, but that decision should reside with the PM or product owner, not the engineers building the product. Decisions about product quality should always sit with the individuals making scope, schedule and resource decisions.


Opinions represent those of the author and not of Scrum Alliance. The sharing of member-contributed content on this site does not imply endorsement of specific Scrum methods or practices beyond those taught by Scrum Alliance Certified Trainers and Coaches.



Comments

Bill Rinko-Gay, CSM, 9/24/2012 9:59:13 AM
This is an excellent summary. Scrum Masters and Project Managers often have a difficult time deciding between "pay me now" and "pay me later" because there aren't good ways to estimate the "NPV" of the technical debt. (If you know of good ways, please share.) You have to be in a position to ask, and trust, your best technical gurus as to when the scale has tipped. And, as you so rightly imply, your best technical gurus need to know it's safe to call attention to a problem.
Vaidhyanathan Radhakrishnan, CSM, 10/16/2012 6:41:15 AM
Very nice
Carlos Aquiles Ortega Flores, CSM, 11/6/2012 12:44:10 PM
Question...
In case you want to inject that kind of activity into your backlog...
a) Would you create new User Stories?
if yes ...
... could you suggest some example names/titles ?
... could you suggest some info or reference to estimate their size?
... what could be their acceptance criteria ?

b) If no (if you don't include and express those activities in the backlog as user stories but as activities), would you consider it good practice to include them as part of a stabilization/tuning or pre-release phase?
Prasanna A, CSM, 11/15/2012 8:27:17 PM
Indeed this is an excellent re-post, very true and realistic reasoning all along. Thanks.
Celine Goulfault, CSM, 12/17/2012 2:51:22 AM
Hi, like Carlos I am interested in defining potential User Stories and acceptance criteria for that.
Thank you very much for the article, really interesting.
Alan Hamilton, CSM, 1/28/2013 2:12:33 PM
Thank you, Victor, for an excellent article!

Carlos and Celine: Here are some user stories for paying off Technical Debt, taken mostly from the excellent article “The Land that Scrum Forgot” by Robert C. Martin (Uncle Bob).

As a Project Manager,
I want to extend the Definition of Done to include
“fix any messy, unclear, or defective code found during other work,”
to steadily improve our code base.

As a Project Manager,
I want to extend the Definition of Done to include:
“Before changing a section of code, be sure there are unit tests around it that succeed.
After ensuring that there are successful unit tests around the code and before changing the code, write unit tests for the changed functionality that fail; then write code to make the tests succeed.”
to increase confidence that our changes haven't broken that code.
(Note: This will automatically increase the number of unit tests around sections of code that you change most, which usually includes the code with the highest concentration of bugs that you have to fix.)

As a Project Manager,
I want to identify our least-well-structured code and add unit tests around it, worst code first,
to increase confidence that old, poorly-structured code does not break when we make changes elsewhere.

As a Project Manager,
I want to isolate the UI layer so that all functionality accessible from the UI can be run and tested without the UI,
to make testing of non-UI functionality far faster.

As a Project Manager,
I want to assign an id to every button and field on every UI web page, rather than letting it default,
so that robust UI tests can be written using these persistent ids instead of brittle screen locations.

As a Project Manager,
I want an automated measure of test breakage so that if large numbers of tests break we can make the test design less brittle,
so that we spend less time updating brittle tests, or worse ignoring them.

As a Project Manager,
I want a daily graph of the amount of test code and production code,
so that we can address the problem if the amount of test code does not trend toward the amount of production code.

As a Project Manager,
I want a daily automated verification that new code is at least 80% unit test code,
to ensure that we steadily pay back technical debt of unit tests.

As a Project Manager,
I want a daily report of the average number of lines per function/method,
to ensure that the average trends toward 10 lines or less.

As a Project Manager,
I want to rewrite most functions/methods > 20 lines,
to reduce the complexity to something more reasonable.

As a Project Manager,
I want to rewrite most classes that have more than 500 lines (or split into 2 or more classes)
to make each class a more reasonable size.

As a Project Manager,
I want all automated unit and acceptance (story) tests to pass BEFORE AND AFTER checkin,
to maintain confidence that our continuous integration works correctly and smoothly.

As a Project Manager,
I want to use automated code quality metrics (such as those produced by Sonar)
to monitor code quality and rewrite sections of code that need it.
http://www.sonarsource.org/


And all the following metrics should trend down.
For each X in
Defects reported after each sprint.
Function points per method/function.
# commits / day > # developers (encourage frequent commits).
# manual tests.
# & list of methods/functions least covered by unit tests.
# & list of methods/functions not covered by acceptance tests.
# & list of methods and functions with Cyclomatic Complexity > 6(ish) (Crap4J).
Speed of unit and acceptance (story) tests.
Integration build errors and time to fix each.
Dependency metrics (Dependency Inversion[4], Stable Abstractions[5]).
Static analysis problems and duplicate code (e.g. FindBugs[6] or Checkstyle[7]).
Acceptance (story) test failures ( FitNesse[11] or Cucumber[12]).
Site load time. (Gomez, http://www.webpagetest.org/, Keynote, Webmetrics)[15]
do
As a Project Manager,
I want to graph X each day and address it any time this measure trends upward,
to increase confidence that our code base improves consistently.
end


[4] http://en.wikipedia.org/wiki/Dependency_inversion_principle
[5] http://c2.com/cgi/wiki?StableAbstractionsPrinciple
[6] http://findbugs.sourceforge.net/
[7] http://checkstyle.sourceforge.net/index.html
[11] http://fitnesse.org/
[12] https://github.com/aslakhellesoy/cucumber/wiki
[15] http://www.kitchensoap.com/2010/06/24/ops-meta-metrics-velocity-2010-slides/
Scott Grussing, CSM, 3/1/2013 6:43:16 AM
Great article, thanks.
Karim Harbott, CSP,CSM, 3/27/2013 5:13:37 AM
Great article, Victor. I find it helpful to visualise technical debt over time. This can demonstrate not only the amount of hidden work outstanding, but also the trend. Teams can then make a plan to tackle the debt sprint-by-sprint until the trend is downwards. The velocity will change accordingly and expectations about scope can be set.

It should be the exception to incur technical debt, and those making the decision should understand that faster is slower over the medium and long term. It is not a free way to get the team to deliver faster. The hard part is convincing non-technical managers of that.
