For my first article on refactoring, I'm going to focus on mastering the basic refactoring techniques outlined in Martin Fowler's Refactoring. I'll provide some thoughts and comments on Fowler's book to the point that I've read (about half way) and also perform some basic refactorings on an open source project.
Refactoring, Improving the Design of Existing Code
This book is a classic; it's also pretty old. Fowler published this in 1999. What this book does is amaze me; despite it being very old for the computer industry (10 years!) and despite the fact that Fowler, at the time of publishing, didn't consider himself an expert on refactoring, every single word rings true.
I remember (re-)reading Code Complete 2 over the summer. Code Complete 2 is one of those books that some industry leaders think every developer should read; I am not inclined to disagree. However, despite all of the good parts in Code Complete, there were several out dated, misinformed or just flat out wrong sections of the book (defensive programming comes to mind). When I was reading the book, I just chalked it up to "Code Complete is an ancient book. It was originally published in 1994 and hasn't been updated in about 8 years." I have never had to say anything like that in Fowler's book. Either his book was far ahead of its time (and we are just now catching up) or the refactoring principles discussed in the book are timeless. I'm rooting for the latter.
Here are some quotes, comments and notes that I took while reading the first part of the book:
Refactoring is simple!
I think people see refactoring as a difficult process. Design is hard; so improving design of existing code must be hard, as well, right? According to Martin Fowler, refactoring can be broken down into a catalog of small, simple and "even simplistic" changes. These changes include renaming variables, or extracting methods, or moving methods to more logical locations. From these basic type of refactorings are all other refactorings composed--even a system-wide refactoring of some core part of the software.
Refactoring is continuous design!
By continually refactoring your code, you are practicing continuous design. This means that your design is continuously refined as you learn more of constraints, user needs and the problem domain. This reduces the overhead of "getting it right" up front.
"Before you start refactoring, check that you have a solid suite of tests."
In this, Fowler argues that you cannot do effective refactorings until you have a good set of unit tests. Unit tests reduce the mental overhead for software engineers. When I change one line of code, I don't want to have to think about the repercussions in every other line of code in the program. Instead, I'll let an automated test suite handle that for me.
"Any fool can write code that a computer can understand. Good programmers write code the humans can understand."
Pretty self explanatory; a large part of refactoring is making the code more approachable from a developer's standpoint--not from a compiler's or a computer's point of view.
Refactoring and Performance
Often, you are given a choice between a good factoring or good performance. According to Martin Fowler, invariably you should choose the good factoring. Often, the "good performance" benefit is actually going to hurt the performance of your system because you cannot optimize at higher level of abstractions until you refactor.
Functions belong in classes of whose data they use.
This means that if Function A is in Class Foo but is always using data from Class Bar, then Function A really belongs in Class Bar.
What about composite functions?
What about functions that use data from Class Foo and Class Bar? Fowler suggests that when facing this problem, put the function in the class that is more likely to change. That is, go with the higher volatility class
The Goals of Refactoring
Refactoring, according to Martin Fowler, has several goals and benefits. First, refactoring should improve the design of the code. Fowler alludes to what we now call the DRY (Don't Repeat Yourself) principle and the SRP (Single Responsibility Principle). Recently, Robert Martin has argued that these two principles are the core of good software design. Refactoring improves both!
Refactoring also attempts to make programs easier to understand. Whether it is by good naming conventions, smaller methods, descriptive method names, higher levels of abstraction or refactoring to common patterns, refactoring will increase the readability of code. This means that engineers spend less time figuring out what code is doing and more time adding functionality.
Martin Fowler suggests that refactoring helps you find bugs. In part, this is due to the higher readability of the code. However, the process of refactoring allows the developer to more fully understand the code and its affects.
Finally, Fowler argues that all of this is moot if you can't create functionality faster. He argues that the higher readability, better design and less bugs allow for developers to create software at a faster pace despite spending time refactoring code.<
Putting it to practice
So this is all well and good. I could talk all day about how much of a Good Thing(TM) refactorings are but I need to practice this. So I picked the coolest open source Java project I could think of, Android, and went to work.
First, I need to talk about how I decided where to even start! I've never seen the Android source code, nor even used Android in any significant fashion. So, at the point, I'm clueless. In this situation, its good to lean on some tried and true tools. I turned to JavaNCSS (NCSS stands for Non-Commenting Source Statements) to get me some metrics. In particular, I was interesting in the Cyclomatic Complexity metric.
I got back over 17,000 lines of data for just the built-in Android apps section of the code. That was too much for me to reliably parse myself. So I wrote a quick Ruby script (it was almost identical to the Weather kata [Kata4]!) to parse this data and give me back some meaningful results.
Cyclomatic complexity is important to me because I am looking for complex methods. These complex methods are usually long and have ugly conditional logic. In order to reduce complexity, I can use the Extract Method technique along with other Composing Methods techniques that are outlined in the Refactoring book.
So, I picked the most complex method in the entire Android apps section. When I finally got to looking at the code, I was lost. The name "deleteInternal" implied that I was deleting something. So, in order to better understand what I was doing, I applied the Extract Method technique on a small, but complexity inducing piece of code. I extracted the code and my tool (IntelliJ IDEA) automatically told me that the code was duplicated in 3 other places. I replaced those areas with calls to my new extracted method (getWhereClause) and with one step I improved the design of not only my target method but of the entire class! Cool!
In the next step, I saw the variable tableToChange2 and red flags immediately started to go off. While there was a useful comment near the declaration, tableToChange2 wasn't a good name. I used the Rename Temp Variable to rename tableToChange2->additionalTableToChange. This small change paid dividends in readability.
Next, I noticed the huge switch statement. I quickly attempted to figure out what it was doing and decided that this should be extracted. I tried to apply Extract Method on the code but realized that there were too many inputs and outputs. So, I got out the big gun and used Extract Method Object on the code.
After performing the Extract Method Object, there was still some code that could be extracted in order to increase the understanding of the deleteInternal method. I used the Extract Method technique on those, as well. Those methods also had implementations elsewhere in the code and I was able to reuse the methods.
The final large chunk of code could be extracted as a method, as well. This had the unfortunate side effect of a long list of parameters, but I address this later.
In this step, I noticed that there were still a lot of temporary variables in the code. I used Replace Temp with Query in conjunction with the Method Object that I created earlier. This made the code much more readable and reduced long parameter lists.
Step 7 and 8
In these steps I noticed that parameter assignment was happening in deleteRecordsFromAdditionalTableToChange and setIdColumnIfNull. I used the Remove Assignment from Parameters technique and pulled the assignment logic back into deleteInternal.
Nr. NCSS CCN JVDC Function 35 18 4 0 com.android.im.provider.ImpsProvider.deleteInternal(Uri,String,String) 36 21 14 0 com.android.im.provider.ImpsProvider.notifyOfDeletedRows(UrlMatch,int,int) 37 5 2 0 com.android.im.provider.ImpsProvider.deleteRecordsFromAdditionTableToChange(String,String,StringBuilder,SQLiteDatabase) 38 3 2 0 com.android.im.provider.ImpsProvider.appendIfChangedItemIdIsNotNull(String,String,StringBuilder) 39 5 2 0 com.android.im.provider.ImpsProvider.setIdColumnNameIfNull(String)
40 5 2 0 com.android.im.provider.ImpsProvider.getWhereClause(String) 47 5 1 0 com.android.im.provider.ImpsProvider.UrlMatch.UrlMatch(Uri,StringBuilder,int,SQLiteDatabase) 48 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getTableToChange() 49 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getAdditionalTableToChange() 50 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getIdColumnName() 51 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getChangedItemId() 52 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getAccount() 53 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getContact() 54 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getThreadId() 55 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isNotifyMessagesContentUri() 56 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isNotifyMessagesByContactContentUri() 57 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isNotifyMessagesByThreadIdContentUri() 58 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isNotifyContactListContentUri() 59 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isNotifyProviderAccountContentUri() 60 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isContactDeleted() 61 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.getDeletedContactId()
62 2 1 0 com.android.im.provider.ImpsProvider.UrlMatch.isBackfillQuickSwitchSlots() 63 238 65 0 com.android.im.provider.ImpsProvider.UrlMatch.invoke()
What the result above shows is that the refactoring did improve the design of the existing code. The invoke method is still pretty bad (and the way to fix that is through polymorphism) but the overall complexity has been diminished and it is now easier to update, read and change the code.
This is the actual code. This is very long and I apologize for that.