
The Code Is the Documentation

An Experience Summary

08/13/2013 by Richard Obermeier

If you talk to other Scrum practitioners, you'll quickly find that everyone has a different opinion about the "right" amount of development documentation.

Let's start at the beginning, namely with the Agile Manifesto phrase "Working software over comprehensive documentation." One can translate this to "code is more important than comprehensive documentation" and conclude that there is a "right" amount greater than zero.

Before we tackle the question of the right amount, let's look at the main types of documentation that a developer working in Scrum creates or updates:

Requirements documentation
During backlog grooming, when the product owner and developers discuss and refine the user story, a common understanding is established of what is to be achieved, and acceptance criteria are defined that drive acceptance testing. All the information is held on the story card. The right amount emerges as a result of discussion between the PO and developers.

Inline comments to the code
This is an obvious one and an accepted best practice. Nevertheless, there is value in repeating it here: "Code should be accompanied by comments that provide information about why the code was written in a particular way."
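
As a minimal sketch of such a comment, the hypothetical Python function below explains why the code is written the way it is instead of restating what it does. The function name, the ledger scenario, and the rounding rule are assumptions made purely for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def invoice_total(line_amounts):
    """Return the invoice total rounded to two decimal places."""
    # Why Decimal and not float: binary floats cannot represent most
    # decimal fractions exactly (0.1 + 0.2 != 0.3), and in this
    # hypothetical scenario totals must match a ledger to the cent.
    # Why ROUND_HALF_UP: assumed here to be the rule the business expects;
    # Python's built-in round() rounds exact halves to the nearest even
    # digit instead.
    total = sum((Decimal(str(a)) for a in line_amounts), Decimal("0"))
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

A comment such as "add up the amounts" would tell the reader nothing the code does not already say; the comments above record the decisions a later reader could not reconstruct from the code alone.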

Code-related documentation
Developers create artifacts such as UML diagrams and flow sketches, often on whiteboards, scratch paper, etc., because these help create the code in the first place.

Why code-related documentation is needed

When looking at code that is serious about explaining the "why," you'll invariably find that comments in one part of the code need to refer to other code or express relationships between pieces of code. Another important aspect of the code is its control and data flow. Both relationships and flows are rather difficult to cover in textual inline comments. Because they are easier to digest in a diagram, some people even try to generate UML diagrams from their code, for example using the UMLGraph doclet with the Maven Javadoc plug-in. That approach assumes that only inline documentation will be kept up to date, and it builds a complicated process around this assumption.

We have to make a decision here: Do we want developers to do a good job of documenting why the code is written the way it is?

To quote an Agile principle: "The most efficient and effective method of conveying information to and within a development team is face-to-face conversation." A conversation or an explanation is far more effective if it can be done using a diagram. I have often seen very similar diagrams drawn on a whiteboard again and again, each time taking considerable time, because this kind of development documentation was considered anti-Scrum and so a proper diagram was never filed in a document repository.

Humans -- even developers -- forget. Spending an hour on a diagram that captures the main relationships or flows will pay off in the next sprint, when the memory is no longer fresh. Below is a typical graph that shows how memory degrades. Just think how much more is forgotten if the code is revisited only in the next release.


If we accept the reasoning above, the answer is "yes." We want to create some code-related documentation artifacts (abbreviated below as CDAs) outside of the code and keep them up to date.

Providing developers with a repository for storing CDAs at the beginning of a release cycle is a minimal first step. Even this small step -- with the repository ideally prepopulated with one example diagram and a few simple rules on content -- fosters the creation and reuse of CDAs, and you might be astonished at how self-organizing their creation can be.

How much code-related documentation

Before considering criteria, we should note that when the code is refactored and finalized, some CDAs should be considered waste (in the XP sense) and therefore be deleted from the repository rather than updated.

Some questions can help with the decision about which artifacts to keep up to date:
  • Is the subject matter reasonably complex and not obvious? (For example, if a particular design pattern was used, just mention it in the inline comment; do not draw your own version of that pattern.)
  • Will it help you explain the code to a team member?
  • Can several CDAs be made into one, reducing redundancy?
  • Have you used the CDA within the last two sprints to look something up or to help with an explanation? If not, delete it.
Some rules that we found useful:
  • Favor diagrams over written text -- remember that we do not want this to be a replacement for verbal communication.
  • Put in only as much as is necessary to convey the specific concepts; the detail is to be in the inline comments and code.
  • This is written by developers, for developers, so do not put in things you can expect other developers to know.
  • Write it up early, and no later than the end of the sprint.
  • Either update an outdated CDA or delete it from the repository. Nothing is more frustrating than an outdated CDA.
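
The two-sprint question and the update-or-delete rule above could be turned into a small housekeeping check. The Python sketch below is only one way to do it. The `Cda` record, its `last_used` and `outdated` fields, and the two-week sprint length are all assumptions made for illustration; nothing here reflects an existing tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SPRINT_LENGTH = timedelta(days=14)  # assumption: two-week sprints
MAX_IDLE_SPRINTS = 2                # "within two sprints" from the checklist


@dataclass
class Cda:
    """Hypothetical record describing one code-related documentation artifact."""
    name: str
    last_used: date         # last time someone consulted it
    outdated: bool = False  # True if the code has changed since the last update


def review(cdas, today):
    """Map each CDA name to 'keep', 'update or delete', or 'delete'."""
    decisions = {}
    for cda in cdas:
        idle = today - cda.last_used
        if idle > SPRINT_LENGTH * MAX_IDLE_SPRINTS:
            # Not consulted within two sprints: treat it as waste.
            decisions[cda.name] = "delete"
        elif cda.outdated:
            # Still in use but stale: the team must pick one of the two.
            decisions[cda.name] = "update or delete"
        else:
            decisions[cda.name] = "keep"
    return decisions
```

Calling `review(cdas, date.today())` at the end of a sprint would give the team a short list to discuss; the decision itself stays with the developers.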

Example implementation

Over the past three product releases, this approach emerged and worked well for our organization, which uses IBM Rational Team Concert (RTC) as its Agile project tool.

RTC holds all story-related information, including acceptance criteria and testing information. It also integrates source code control, so code change sets are associated with the originating user story. Because RTC allows files to be attached to stories, it seemed that this was all we needed for code-related documentation.

But this did not work out. Because several stories can share the same CDA, a reference rather than an attachment must be used. Also, RTC does not support versioning of attached documents. So we put the CDAs, such as concept papers and Visio or UML diagrams, on a wiki or SharePoint site and then added a link to the document to each story in RTC.
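
The difference between attaching and referencing can be shown with a small data model. The following Python sketch is illustrative only: the `CdaLink` and `Story` classes, the story IDs, and the URL are hypothetical and do not represent RTC's API or data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CdaLink:
    """A reference to a CDA that lives (and is versioned) outside the story."""
    title: str
    url: str  # hypothetical wiki or SharePoint address


@dataclass
class Story:
    story_id: str
    cda_links: list = field(default_factory=list)  # links, not file copies


def stories_referencing(stories, link):
    """Return the IDs of all stories that point at the given CDA."""
    return [s.story_id for s in stories if link in s.cda_links]


# One diagram, stored once, referenced from two stories.
order_flow = CdaLink("Order flow", "https://wiki.example.com/cda/order-flow")
stories = [
    Story("S-101", [order_flow]),
    Story("S-102", [order_flow]),
    Story("S-103"),
]
```

Because every story holds only a link, updating the diagram on the wiki updates it for all stories at once, and the wiki's own history supplies the versioning that attachments lack.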