Foreword by Mike Cohn
Scrum is a starting point. In fact, it’s a great starting point. But, as a framework rather than a full-blown methodology, Scrum is deliberately incomplete. Some things—such as the best technical practices to use—are left for individual teams to determine. This allows a team to create the best fit between their project and environment and an assortment of technical practices.
While that selection of practices should belong to the team or organization rather than to a group of methodologists, the benefits of some practices are becoming so compellingly obvious that they warrant consideration by any Scrum team. But, too many Scrum teams become complacent after achieving some early productivity gains with Scrum. And they stop seeking ways to improve. Many fail to try the technical practices necessary for long-term success. In the following article, Robert Martin (perhaps better known as "Uncle Bob") tells us why so many Scrum teams fail to sustain the promise of their early successes.
Bob’s article is the first in a series of articles we will publish in this newsletter. Because Scrum is a starting point with deliberate gaps to be filled by knowledgeable teams, we are looking to those outside the core Scrum community to provide advice to those of us within it. What do leading agile thinkers from outside the Scrum world think our teams need to know or do? I’ve asked a few to share their thoughts with us. And who better to start with than Bob Martin?
The Land that Scrum Forgot
By Bob Martin
What goes wrong with so many Scrum projects? Why does the velocity start out high, but then precipitously decline? Why do some Scrum teams eventually give up on Scrum? What’s going wrong?
As someone who has been called in to rescue Scrum teams from this kind of demise, I can tell you that the problem is not that the teams are losing motivation. Often the problem is that the software that the teams are producing is getting harder and harder to work with.
Scrum makes you go fast! That’s a good thing. Often the first sprint concludes with some working features. Managers and customers are happy. The team has worked a miracle and is happy too. Everybody is happy, and Scrum is seen as a great success.
The same thing happens the next sprint, and the sprint after that. Velocity is high. The system is coming together. Feature after feature is working. Expectations have been set. Plans are made. Enthusiasm for Scrum soars. Hyper-productivity has been achieved!
One of the reasons for this hyper-productivity is the smallness of the code base. Small code bases are easy to manage. Changes are easy to make; and new features are easy to add.
But the code is growing fast; and when a code base gets large, it can be very difficult to maintain. Programmers can be significantly slowed down by bad code. Teams can be reduced to near immobility by the sheer weight of a badly written system. If care is not taken soon, the hyper-productive Scrum team will succumb to the disease that kills so many software projects. They will have made a mess.
“Wait,” I hear you say. “I thought Scrum was supposed to empower the team? I thought the team would take all necessary measures to ensure quality. I thought an empowered Scrum team would not make a mess.”
That’s certainly the goal. The problem is that empowered teams are still human; they do what they are incented to do. Are they being rewarded for quality? Or are they being rewarded for productivity? How much recognition is the team getting for good code quality? How much are they getting for delivering working features?
There’s your answer. The reason Scrum teams make messes is that they have been empowered and incented to make one. And a Scrum team can make a mess really, really fast! A Scrum team is hyper-productive at making messes. Before you know it the mess will be “so big and so deep and so tall, you can not clean it up. There is no way at all.”
And when that happens, productivity declines. Morale goes down. Customers and managers get angry. Life is bad.
So how can you incent a Scrum team to not make a mess? Can you simply tell them not to make a mess? We’ve tried that. It doesn’t work. The incentives for going fast are based on tangible deliverables. But it’s hard to reward a team for good code if we don’t have some way to objectively measure it. Without an unambiguous way to measure the mess, the mess will be made.
We need to go fast, and we need to stay clean so we can keep going fast. How can we incent the team to achieve both goals? Simple. We measure both and reward them equally. If the team goes fast but makes a mess, then there is no reward. If the team stays clean but goes slow, then again, there is no reward. If the team goes fast and stays clean, then there is a reward!
We can measure messes by implementing engineering disciplines and practices like Test Driven Development (TDD), Continuous Integration, Pair Programming, Collective Ownership, and Refactoring; i.e. the engineering practices of eXtreme Programming (XP).
It is usually best to start with TDD simply because a code base without tests is a mess no matter how clean it might otherwise be. This is a bold claim, but it is based on a solid rationale from a much older and more respected discipline: accounting. It is just as easy for an accountant to make a mistake on a spreadsheet as it is for a programmer to make a mistake in a program. So how do accountants prevent errors? They enter everything twice.
Accountants practice Dual Entry Bookkeeping as part of the GAAP (Generally Accepted Accounting Principles). Accountants who don’t hold to the GAAP tend to wind up in another profession, or behind bars. Dual Entry Bookkeeping is the simple practice of entering every transaction twice; once on the debit side, and once on the credit side. These two entries follow separate mathematical pathways until a final subtraction on the balance sheet yields a zero. A set of books that is not supported with dual entries would be considered a mess by the accounting community, no matter how accurate and clean those books were.
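The mechanism behind the analogy can be sketched in a few lines. This toy `Ledger` class (a hypothetical illustration, not from the article) records every transaction on both sides and checks the books by verifying that the final subtraction yields zero:

```python
from dataclasses import dataclass, field

@dataclass
class Ledger:
    """A toy dual-entry ledger: every transaction posts a debit and a credit."""
    debits: list = field(default_factory=list)
    credits: list = field(default_factory=list)

    def post(self, amount: float) -> None:
        # Each transaction is entered twice -- once on each side.
        self.debits.append(amount)
        self.credits.append(amount)

    def balanced(self) -> bool:
        # The two sides follow separate pathways; the final
        # subtraction must yield zero, or a mistake was made.
        return sum(self.debits) - sum(self.credits) == 0
```

A single-sided entry, like a change to production code with no corresponding test, throws the books out of balance and is immediately detectable.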
TDD is Dual Entry Bookkeeping for software, and it ought to be part of the GAPP (Generally Accepted Programming Practices). The symbols manipulated by the accountants are no less important to the company than the symbols manipulated by the programmers. So how can programmers do less than accountants to safeguard those symbols?
Programmers who practice TDD create a vast number of automated tests that they keep together and run as a regression suite. This is something you can measure! Measure the coverage. Measure the number of tests. Measure the number of new tests per sprint. Measure the number of defects reported after each sprint, and use that to determine the adequacy of the test coverage.
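The discipline being measured looks something like this minimal red-green example using Python’s built-in `unittest`; the `leap_year` function and its tests are hypothetical, not from the article. The tests are written first, and the production function exists only to make them pass:

```python
import unittest

# Production code: written only after the tests below demanded it.
def leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Tests written first; together with thousands like them they form
# the regression suite whose size, coverage, and growth you measure.
class LeapYearTest(unittest.TestCase):
    def test_divisible_by_four_is_leap(self):
        self.assertTrue(leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(leap_year(1900))

    def test_four_hundredth_year_is_leap(self):
        self.assertTrue(leap_year(2000))

# Run the suite with: python -m unittest <this_file>
```

Each such test is one more entry in the software’s second set of books.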
The goal is to increase your trust in that suite of tests until you can deploy the product based solely on whether that suite passes. So measure the number of “other” tests you feel you need to perform, and make shrinking that number a priority, especially if they are manual tests!
A suite of tests that you trust that much gives you an immense amount of power. With it, you can refactor the code without fear. You can make changes to the code without worrying about breaking it. If someone sees something they think is unclear or messy, they can clean it up on the spot without worrying about unintended consequences.
Undocumented systems, or systems where the documentation has gotten out-of-sync with the production code, are messy. The unit tests produced by TDD are documents that describe the low level design of the system. Any programmer needing to know how some part of the system works can reliably read the unit tests for an unambiguous and accurate description. These documents can never get out of sync so long as they are passing.
Measure the size of your tests. Test methods should be on the order of five to twenty lines of code. The total amount of test code should be roughly the same as the amount of production code.
Measure test speed. The tests should run quickly; in minutes, not hours. Reward fast test times.
Measure test breakage. Tests should be designed so that changes to the production code have a small impact on the tests. If a large fraction of the tests break when the production code is changed, the test design needs improving.
Measure Cyclomatic Complexity. Functions that are too complex (e.g. cc > 6 or so) should be refactored. Use a tool like Crap4J to pinpoint the methods and functions that are the worst offenders and that have the least test coverage.
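Cyclomatic complexity is essentially one plus the number of decision points in a function. A rough sketch using Python’s `ast` module shows the idea; it counts only the common branch nodes, whereas tools like Crap4J are far more thorough:

```python
import ast

# Node types that add a decision point. A simplification: real tools
# also weigh things like chained comparisons and switch-style dispatch.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of branch points."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))
```

A straight-line function scores 1; every `if`, loop, or exception handler adds one, so flagging anything above 6 or so is a one-line filter over this number.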
Measure function and class size. Average function size should be less than 10 lines. Functions longer than 20 lines should be shrunk. Classes longer than about 500 lines should be split into two or more classes. Measure your Braithwaite Correlation; you’d like it to be greater than 2.
Measure dependency metrics. Ensure there are no dependency cycles. Ensure that dependencies flow in the direction of abstraction according to the Dependency Inversion Principle and the Stable Abstractions Principle.
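Checking for dependency cycles is a standard depth-first search over the module graph. A minimal sketch (the module names in the usage below are invented): a back edge to a module still on the current search path means a cycle.

```python
def has_cycle(deps: dict) -> bool:
    """Detect a cycle in a dependency graph.

    `deps` maps each module name to the modules it depends on.
    Classic three-color DFS: GRAY marks modules on the current
    path, so reaching a GRAY module again means a cycle.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    # Include modules that appear only as dependency targets.
    nodes = set(deps) | {d for targets in deps.values() for d in targets}
    color = {m: WHITE for m in nodes}

    def visit(module):
        color[module] = GRAY
        for dep in deps.get(module, ()):
            if color[dep] == GRAY or (color[dep] == WHITE and visit(dep)):
                return True  # back edge, or a cycle found deeper in
        color[module] = BLACK
        return False

    return any(color[m] == WHITE and visit(m) for m in nodes)
```

Run it over the import graph in the continuous build and fail the build when it returns true, the same way a failing test does.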
Use a static analysis tool like FindBugs or Checkstyle to locate obvious programming flaws and weaknesses. These tools can also find and measure the amount of duplicate code.
Implement Continuous Integration. Set up a build server like Hudson, TeamCity, or Bamboo. Have that server build the system every time a developer commits some code. Run all the tests on that build and address any failures immediately.
Measure the number of commits per day. This number should be larger than the number of developers on the team. Encourage frequent commits.
Measure the number of days per month that the continuous build fails. Reward months with no failures. Measure the amount of time failures remain unaddressed.
Story tests are high level documents written by business analysts and testers. They describe the behavior of the system from the customer’s point of view. These tests, written in a tool like FitNesse or Cucumber, are requirements that execute. When these tests pass, the team knows that they are done with the stories that they describe.
Measure done-ness by running story tests in your Continuous Integration system and keeping track of the story tests that pass and fail. Use that as the basis for velocity and the progress of the team. Enforce the rule that stories are not done until their corresponding story tests are passing. And never let passing story tests break.
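Deriving done-ness and velocity from story-test results amounts to a small roll-up over the continuous-build output. A sketch, where the data shapes are assumptions for illustration:

```python
def sprint_summary(results: dict) -> dict:
    """Roll up story-test results from a CI run.

    `results` maps each story to {"points": int, "tests": [bool, ...]}.
    A story is done only when it has story tests and every one of
    them passes; velocity counts the points of done stories only.
    """
    done = [story for story, r in results.items()
            if r["tests"] and all(r["tests"])]
    velocity = sum(results[story]["points"] for story in done)
    return {"done": sorted(done), "velocity": velocity}
```

Because a story with any failing (or missing) story test contributes zero points, partially working features can never inflate the velocity chart.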
Practice Pair Programming. Measure the time spent pairing vs. the time spent programming alone. Teams that pair stay cleaner. They make fewer messes. They are able to cover for each other because they know each other’s domains. They communicate with each other about designs and implementations. They learn from each other.
And after all this measuring, how do you reward? You post big visible charts of the metrics in the lunchroom, or in the lobby, or in the project room. You show the charts to customers and executives, and boast about the team’s focus on quality and productivity. You have team parties to celebrate milestones. You give little trophies or awards. For example, one manager I know gave shirts to everyone on the team when they passed 1000 unit tests. The shirts had the name of the project and the words “1,000 Unit Tests” embroidered on them.
How do you keep a Scrum Team from losing productivity? How do you make sure that hyper-productivity doesn’t veer headlong into a quagmire? You make sure that the team is not hyper-productively making a mess! You make sure they are practicing the disciplines that produce data that can be measured. You use that data to measure the quality of the code they are producing; and you provide incentives for keeping that code clean.