How to dive into Legacy Code

Diving into legacy code written some time ago can be a daunting task. It doesn’t even matter much whether we wrote it ourselves or somebody else did: code rots faster than we’d like to admit.

Currently faced with such a task, I tried to approach it in a systematic and repeatable manner.

My steps

First, try to find the modules and their dependencies. I used IntelliJ IDEA for my current Java project. Since the project also uses Maven, finding the dependencies was easy.
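Outside the IDE, the module list can also be read straight from the parent pom.xml. This is only a minimal sketch, assuming a multi-module Maven build whose parent POM lists its children in <module> elements. PomModules and moduleNames are names made up for illustration, and the POM in main is invented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of Maven or IntelliJ IDEA. */
final class PomModules {

    /** Returns the text of every <module> element in the given POM content, in document order. */
    static List<String> moduleNames(String pomXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("module");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent().trim());
            }
            return names;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Not a readable POM", e);
        }
    }

    public static void main(String[] args) {
        // Invented parent POM; a real one would be read from disk.
        String pom = "<project><modules>"
                + "<module>model</module><module>dao</module><module>web</module>"
                + "</modules></project>";
        System.out.println(moduleNames(pom)); // prints [model, dao, web]
    }
}
```

For the dependencies themselves, Maven’s own mvn dependency:tree goal prints the resolved tree for each module.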

Create a graph of module interdependencies. Which modules depend on which? Find the “edges” of the system (modules that do not depend on other modules). I found them to be the best starting point for a more detailed analysis.
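Once the dependencies are known, finding the edges is a small graph query. Here is a minimal sketch, assuming the dependencies are already available as a map from each module name to the project-internal modules it depends on. ModuleGraph, leafModules, and the sample module names are hypothetical, and Set.of needs Java 9 or newer.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper for illustration only. */
final class ModuleGraph {

    /** Returns the modules that depend on no other module, sorted by name. */
    static Set<String> leafModules(Map<String, Set<String>> dependencies) {
        Set<String> leaves = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            if (entry.getValue().isEmpty()) {
                leaves.add(entry.getKey());
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        // Invented example: module -> modules it depends on.
        Map<String, Set<String>> deps = new TreeMap<>();
        deps.put("web", Set.of("service", "model"));
        deps.put("service", Set.of("dao", "model"));
        deps.put("dao", Set.of("model"));
        deps.put("model", Set.of());
        System.out.println(leafModules(deps)); // prints [model]
    }
}
```

The map should only contain modules of the project itself. With third-party libraries included, almost nothing would count as an edge.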

Find out what the purpose of each module is. Is it a layer in the system (like a DAO module)? Is it a cross-cutting concern (like model classes)?

The next step is to analyse each module by itself. For this step I recommend using doxygen. It can generate very good documentation of the software at hand, even if no doxygen markup (or any other kind) was used, by analysing the dependencies, class hierarchies, and call graphs of the program. doxygen supports many languages; chances are high that yours is among them.

To get the most out of doxygen, I’ve used the following configuration file, which enables many of the advanced analysis features (such as call graphs): doxygen.config.

You have to edit the file and provide, at the very least, the input and output directories. After that, simply run doxygen doxygen.config.

To generate this kind of documentation easily from Maven, here is a similar doxygen-maven-plugin configuration:

<build>
    <plugins>
        <plugin>
            <groupId>com.soebes.maven.plugins.dmg</groupId>
            <artifactId>doxygen-maven-plugin</artifactId>
            <configuration>
                <projectName>${project.artifactId}</projectName>
                <projectNumber>${project.version}</projectNumber>
                <optimizeOutputJava>true</optimizeOutputJava>
                <extractAll>true</extractAll>
                <extractStatic>true</extractStatic>
                <recursive>true</recursive>
                <exclude>.git</exclude>
                <excludePatterns>*/test/*</excludePatterns>
                <inlineSources>true</inlineSources>
                <referencedByRelation>true</referencedByRelation>
                <referencesRelation>true</referencesRelation>
                <hideUndocRelations>false</hideUndocRelations>
                <umlLook>true</umlLook>
                <callGraph>true</callGraph>
                <callerGraph>true</callerGraph>
                <generateLatex>true</generateLatex>
            </configuration>
        </plugin>
    </plugins>
</build>

The generateLatex option is nice if you wish to produce PDF files (for viewing on a Kindle for example).

With this plugin configured in your pom.xml, mvn doxygen:report is your workhorse.

If you’re unsure if the generated documentation is worth it, take a look at the doxygen documentation of JUnit 4.8.2 (zip file, 7.6MB).

Everybody writes legacy code.
— Eric Ries