If you’re a software project manager or product owner, the term “technical debt” may not ring a bell. Over the last several years managing Agile teams, I’ve come to realize that lurking technical debt introduces risk and represents unplanned work that impacts schedules and, ultimately, the long-term health of projects. I believe that project managers and product owners should make technical debt a priority and factor a “repayment” program into project schedules as a regular part of planning.
Technical debt is a term coined by Ward Cunningham to describe the cumulative consequences of corners being cut throughout a software project’s design and development. The basic idea is that a development team, over time, cuts corners due to lack of skill, laziness or pressure, which then manifests itself as a large backlog of technical inefficiencies that needs “paying off” in the future. Technical debt encompasses issues related to the code base, development environment, platform and libraries, architecture and design, and test automation, among others. The impact on the project includes reduced velocity, recruiting difficulties, and high defect rates.
Consider this hypothetical scenario. Bill the PM has a strong track record of successfully delivering on promised schedules or features. In order to make this happen under pressure, Bill’s team has neglected the code base to the point where only certain people understand parts of the code. The code base is so convoluted and difficult to understand that few new recruits are willing to work on it. Worse still, adding new features takes longer and longer as time passes. Bill has seemingly won schedule-related battles; however, the product has slowly deteriorated and the war is nearly (and surprisingly) lost.
What did Bill miss? Monitoring the technical health of a software system is just as critical as monitoring scope and schedule. Not the PM’s role, you say? With the popularity of Agile methods booming, the product owner often unwittingly takes on this responsibility as the “single wringable neck.” Even in traditional shops, the PM often tracks risks and estimates their impact on milestone schedules. What’s more, engineers aren’t always forthcoming about the technical debt situation, for reasons of pride or job security. So it often falls on the PM to ask the right questions to ascertain the real state of technical affairs.
There are several areas of technical debt that need examination. Below I focus on the four that I’ve found most impactful. For purposes of brevity I assume an object-oriented programming language (like Ruby, Java, C#, C++, etc.).
Code Base
Sure, you have some defects here and there, but your software works for the most part. What else is there to know? Actually, quite a bit. If your code is a Big Ball of Mud, chances are the speed at which you deliver features (your velocity) is declining. It’s also likely to be brittle when modified, leading to regression defects. One of the biggest impacts, though, is that most talented software engineers aren’t going to be interested in working on it. And even in today’s brutal job market, talented software engineers have their pick of workplaces. To get a sense for the quality of the code base, sit down with trusted engineers and ask the following questions:
- Is our current code base neatly separated into layers, as the MVC pattern suggests? In other words, is business logic separated from the data access and presentation layers? You’re asking whether coders have discipline in terms of where certain things live in the code. If there is little or no separation of layers, you are likely sitting on a Big Ball of Mud.
- How many lines of code are in our average class/method? What is our longest class/method? The longer, the worse. There is no firm rule here, but long, complicated classes and methods point to a lack of discipline in terms of maintainability.
- Is the current code base well documented? If not, new recruits will have a tough time getting up to speed and current team members will be hesitant to work on certain areas of the code.
- Do most functions have a “single source of truth” or is there duplicate code everywhere? It is risky if there is an unknown number of places in the code that must be updated to make a single enhancement.
- I want to visually inspect the longer methods. You may feel out of your depth here, but you should be able to spot overly complex code without fully understanding it. For instance, look for conditional statements that nest to a ridiculous degree; more than a couple of levels of indentation is rarely necessary.
- What are the project’s code style standards and how are they enforced? You’re looking to see that there are standards at all and that they are widely known throughout the team. Better still is if the SCM or build process has automated checks in place to catch violations of the standards. Standards make it easier to spot defects and make it easier for team members to work on each other’s code.
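To make the nesting point concrete, here is a sketch in Python (the same idea applies to any object-oriented language): a hypothetical, deeply nested conditional of the kind worth flagging during a visual inspection, next to an equivalent version flattened with guard clauses. The function and field names are invented for illustration.

```python
def discount_nested(customer):
    # Hard to follow: three levels of nesting for one small decision.
    if customer is not None:
        if customer.get("active"):
            if customer.get("orders", 0) > 10:
                return 0.15
            else:
                return 0.05
        else:
            return 0.0
    else:
        return 0.0

def discount_flat(customer):
    # Same logic, expressed with guard clauses: each early return
    # handles one condition, so the "happy path" reads top to bottom.
    if customer is None or not customer.get("active"):
        return 0.0
    if customer.get("orders", 0) > 10:
        return 0.15
    return 0.05
```

Even a non-programmer can see that the second version has less indentation; that visual cue is exactly what you are looking for when scanning long methods.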
Platform & Libraries
Most software today is built on the shoulders of internal or third-party libraries, application servers, and development frameworks aimed at speeding development.
- Do we rely on a third party application or database server and when was the last time we upgraded those components? Falling too far behind the latest version can lead to support problems or force the team to spend time reintegrating the application.
- Do we rely on any outdated libraries? The risk is that while these libraries may work now, they may no longer be actively developed or upgrading to a current version may be time consuming or risky.
- What data persistence technology do we use and how current is it? Do we use a custom database query generator or do we use something standard (like Hibernate in the Java world)? Falling behind here can impact application performance as well as developer productivity. A custom query generator shouldn’t be necessary in today’s world of ORM frameworks.
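As an illustration of why hand-rolled query generation is risky, here is a sketch using Python’s built-in sqlite3 module as a stand-in for whatever database your team uses; the table and data are invented. ORM frameworks like Hibernate ultimately build on parameterized queries like the second form below.

```python
import sqlite3

# Hypothetical in-memory database for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))

name = "Alice"

# Risky: hand-rolled string concatenation, which is what a custom
# query generator often amounts to. Fragile, and open to SQL injection
# if `name` ever comes from user input.
risky_sql = "SELECT id FROM users WHERE name = '" + name + "'"

# Standard: a parameterized query, the mechanism ORM frameworks
# generate under the hood.
row = conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchone()
# `row` now holds the matching user's id.
```

The point of the question isn’t that engineers must use one specific tool, but that rolling your own data access layer is debt you will keep paying interest on.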
Build & Development Environment
Unless you’re doing something very simple, it’s likely that your developers operate in a complex development environment. They may be using IDEs and build tools that have tremendous impact on their daily productivity and happiness.
- Is there a standard IDE configuration across the group, and are you happy with what you’re using? Even if your team uses different IDEs, the configuration can likely be shared. A shared configuration means developers can spend less time fiddling with their environment and standards can be more easily applied.
- How long does it take to build, and are you happy with that? Imagine having to wait 10 or 20 minutes or more to see a single code change; that is actually quite common (and frustrating). If you had to wait that long, wouldn’t you be tempted to make blind changes without testing them yourself? Build times can usually be improved, but doing so requires dedicated time set aside for the work. Long build times also impact developer focus: if the build takes “too long,” developers can lose focus, wander off to browse the web, or even have a sword fight.
- What’s involved in building the software? Can you walk me through a build on my laptop? What you’re looking for here is cumbersome and complex build requirements that can’t be satisfied by a developer’s workstation alone. Developers are sometimes forced to build on separate machines, which are often in short supply, and there is time and patience lost in these obtuse configurations.
Test Automation
This is the Achilles’ heel of most legacy systems: lots of functionality but very little test automation to catch regression failures. If you haven’t noticed, many of these areas are intertwined; e.g., you need good separation of code into layers in order to unit test code (a type of automated test that yields quick results).
- Can average QA/test team members build the software for themselves? If not, your QA team members are probably operating too far behind mainline development. QA should be able to build, run automated tests, and do exploratory testing of the freshest check-ins.
- What automated testing tools do we use? You’re looking to make sure there isn’t an over-reliance on the old record/playback tools. Those tools are too restrictive in who can run the tests and how they can be run, and developers typically don’t get involved in writing them. You want to hear about tools based on the major “xUnit” frameworks, like JUnit and NUnit. When developers write their own tests, they also, more importantly, end up writing code that can be easily tested.
- Approximately what percentage of the code is covered by automated tests? And what percentage of those tests are unit vs. integration/system tests? You are looking for a fair balance, weighted toward unit tests. Unit tests run in milliseconds, while integration tests run in minutes or hours. Too many integration/system tests lead to slow feedback after developer check-ins, and the bigger the gap between check-in and test feedback, the more likely regression failures and bugs will slip through the cracks. Use the percentage as a pointer to code that is not covered by any tests, but remember that 100% test coverage does not mean there are no bugs.
- Describe our continuous integration process in detail. Once you have a suite of tests (unit and integration/system), running them frequently is important. Better still if they run automatically; better still if they run automatically every time there is a check-in. Follow-on questions: How are developers alerted to broken tests? On average, how long before they are fixed? (There are web-based tools that aggregate this data if you want to review it.)
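To show what an xUnit-style test looks like, here is a minimal sketch using Python’s built-in unittest module as a stand-in for JUnit or NUnit. The function under test, `parse_version`, is invented for illustration; the shape of the test class is what carries over across the xUnit family.

```python
import unittest

def parse_version(text):
    """Parse a 'major.minor' version string into a tuple of ints."""
    major, minor = text.split(".")
    return (int(major), int(minor))

class ParseVersionTest(unittest.TestCase):
    # Each test method checks one behavior; any xUnit runner can
    # discover and run these automatically on every check-in.
    def test_parses_major_and_minor(self):
        self.assertEqual(parse_version("3.11"), (3, 11))

    def test_rejects_garbage(self):
        with self.assertRaises(ValueError):
            parse_version("not-a-version")

if __name__ == "__main__":
    unittest.main(exit=False)
```

Because tests like these run in milliseconds, a CI server can execute the whole suite after every check-in and alert the team the moment something breaks.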
Without deliberate effort (read: large amounts of time) aimed at slowly improving these problems, it is likely that they will persist and get worse. Leading indicators are performance problems, inability to hire developers, and slowing development velocity. Of course, you may be working on a throw-away project where longevity is unimportant. However, most talented engineers prefer building things with quality in mind — call it the “craftsman ethos.”
For systems with a lot of technical debt, it’s not uncommon to spend 20–50% of ongoing development time repaying that debt. Obviously, that has an impact on project schedules. The risk of ignoring these problems, however, is sudden and impactful failures, perhaps in production. I work with my technical team to prioritize this work and make it an everyday part of the project plan. By chipping away at technical debt in a long-term fashion, it is possible to reduce the risk of catastrophic production failures, improve developer morale, and ultimately speed up development on legacy systems.
The decision to incur technical debt should be a conscious one, made in the full light of day. However, unrealistic timetables and feature pressure often force engineers to cut corners without the full knowledge of the management team. There may be valid reasons to take technical shortcuts, but that decision should reside with the PM or product owner, not the engineers building the product. Decisions about product quality should always sit with the individuals making scope, schedule, and resource decisions.