Function Hell

Recently I bought the book “Clean Code” by Uncle Bob which discusses, well …, clean code. I’m very picky when it comes to clean code, which is why I wanted to read opinions of others to perhaps broaden or refine some of the practices I use myself.

I just read through a chapter on functions, of which a presentation by the author himself can be found online. So far I generally agreed to most statements made in the book, but the extent to which the following rule is applied bothers me, … a lot, … and then some more.

Functions should do one thing. They should do it well. They should do it only. – Uncle Bob

To demonstrate, consider the following “good practice” code taken from his blog post about the subject.

class SymbolReplacer {
    protected String stringToReplace;
    protected List alreadyReplaced = new ArrayList();
    private Matcher symbolMatcher;
    private final Pattern symbolPattern = Pattern.compile("\\$([a-zA-Z]\\w*)");

    SymbolReplacer(String s) {
      this.stringToReplace = s;
      symbolMatcher = symbolPattern.matcher(s);
    }

    String replace() {
      replaceAllSymbols();
      return stringToReplace;
    }

    private void replaceAllSymbols() {
      for (String symbolName = nextSymbol(); symbolName != null; symbolName = nextSymbol())
        replaceAllInstances(symbolName);
    }

    private String nextSymbol() {
      return symbolMatcher.find() ? symbolMatcher.group(1) : null;
    }

    private void replaceAllInstances(String symbolName) {
      if (shouldReplaceSymbol(symbolName))
        replaceSymbol(symbolName);
    }

    private boolean shouldReplaceSymbol(String symbolName) {
      return getSymbol(symbolName) != null && !alreadyReplaced.contains(symbolName);
    }

    private void replaceSymbol(String symbolName) {
      alreadyReplaced.add(symbolName);
      stringToReplace = stringToReplace.replace(
        symbolExpression(symbolName),
        translate(symbolName));
    }

    private String symbolExpression(String symbolName) {
      return "$" + symbolName;
    }

    protected String translate(String symbolName) {
      return getSymbol(symbolName);
    }
  }

This code explains the title of this post. Due to the amount of functions, the overview of this trivial procedural task is lost. The irony is that the decomposition of the code into so many functions is supposed to make it more readable.

A lot of the replies on his post share my sentiment that over extracting like this isn’t favorable. The replies of “Angry Man”, albeit harsh, describe the problems perfectly, and provide for a fun read in case you don’t get the point I’m trying to get across. The question that arises is: “When should a separate function be created?“.

I’ll try to define my approach by a few rules of thumb, where the emphasis is on reusability, encapsulation and global readability, instead of Uncle Bob’s specification which sadly seems solely based on creating small methods as a means to achieve local readability. What I mean exactly by this is explained later, along with the argumentation.

  • Prevent creating new methods that are guaranteed to only be called from one location.
  • Group strongly associated code together (cohesion/coupling principle), and put a comment above it mentioning its intent. Separate these blocks of code from other blocks by using e.g. newlines.
  • For every ‘block’ of code written, think whether it might be used somewhere else. When it’s reusable, encapsulate it inside a suitable function/class.

Every experienced programmer probably follows similar guidelines. Notice how instead of blindly following a rule, you actually think about how the code will be used later on. I would specify this as a design principle:

Always develop towards an API.

Argumentation

One of the main design principles I’d like to use as an argument is “encapsulation“. Creating a new function within a class makes it available to the entire class, and possibly extending classes. When there is no reuse of a certain piece of code, other scopes shouldn’t be bothered by it.

By following the rule of placing code in logical blocks, and documenting it’s intent, the entire block can be interpreted just as easily, and even better than by extracting it into a function. It’s intent can be described in plain english. This is what I defined as local readability. Additionally, global readability is maintained, as you immediately see where, and how often a block of code is used. Thanks to encapsulation, you know only about the definitions relevant to the current scope.

It’s a well-known fact that, when a method becomes too long, this is a strong indication that it should be refactored, following the Separation of Concerns principle. By blindly following the “extract till you drop” rule of Uncle Bob, you attempt to solve a problem, without reflecting what you are actually trying to solve. Often it just moves the problem to another scope, as too many methods in a class also indicate the need for refactoring.

UPDATE

While continuing reading the book I came across a perfect example to demonstrate this sad fact. Consider the following extracted function which can be found in Chapter 5 on Formatting, page 83.

private String getPageNameOrDefault(Request request, String defaultPageName)
{
    String pageName = request.getResource();
    if (StringUtil.isBlank(pageName))
        pageName = defaultPageName;

    return pageName;
}

Instead of immediately extracting this code into a function, he might have realised this behavior is pretty common. You could consider writing a “StringUtil.ensureValue(string value, string default)” function. Actually a quick google check brings up “StringUtils.defaultIfEmpty()” in the apache.commons library. You don’t want to write a new function for every string needing a default value. Actually this behavior is even more common, as you might also need default values for other types than strings. For small conditional checks like this some languages (including java) empower you with conditional operators. All you really need is the following:

String pageName = request.getResource();
pageName = StringUtil.isBlank(pageName) ? defaultPageName : pageName;

The thing that amazes me is that his trigger happy extraction rule violates so many good rules which are mentioned in the book as well. Besides the already mentioned encapsulation problem, it’s not hard to see how the above mentioned example violates the “Don’t Repeat Yourself” principle. If you would argue you might need to call this code more than once, don’t. This should be called exactly once, and stored in a variable. Which of the following two is more readable to be used after the first call: “getPageNameOrDefault(request, defaultName)” or “pageName“.

As far as I’ve read through the book (chapter 5), a better name for it would be “Readable Code” instead of “Clean Code“. I do hope the remainder contains more valuable tips. I am afraid however of the second section, full with code samples, which probably apply this nonsense “extract till you drop” rule.

Author: Steven Jeuris

I have a PhD in Human-Computer Interaction and am currently working both as a software engineer at iMotions and as a postdoc at the Technical University of Denmark (DTU). This blend of research and development is the type of work which motivates and excites me the most. Currently, I am working on a distributed platform which enables researchers to conduct biometric research 'in the wild' (outside of the lab environment). I have almost 10 years of professional software development experience. Prior to academia, I worked for several years as a professional full-stack software developer at a game development company in Belgium: AIM Productions. I liked the work and colleagues at the company too much to give up entirely for further studies, so I decided to combine the two. In 2009 I started studying for my master in Game and Media Technology at the University of Utrecht in the Netherlands, from which I graduated in 2012.

13 thoughts on “Function Hell”

  1. I like the introduction of Local Readability and Global Readability; these terms describe some things I’ve been working to explain to other developers for years. And I do think local readability can be achieved with the introduction of logical blocks (“Code Paragraphs”, as I’ve called them) without turning them into separate methods. Being “trigger happy” on turning logical blocks into functions can have the opposite effect and hinder readability.

    But, mastering classes, functions, blocks, etc., is one of those things that come with experience. I don’t think there are concise guidelines and definitive rules on when or how to achieve good/clean code.

    1. I guess I agree on that as my guidline contains “thinking about reuse” in it. This makes it not that much of a definitive rule, but makes you rely more on your experience. 🙂

  2. It seems to me that the problem here isn’t that he’s dividing the methods too much, but that he is doing it wrong. What he should do is express the methods in terms of a generality rather then multiple specifics.

    Take the first example. We are basically replacing instances of a regular expression with something else. After all of that extracting he spends quite a lot of code to do it. However, if this was python it would be a one-liner.

    return re.sub('$([A-Za-z]\w*)', self.string, self.translate)

    The problem of replacing instances of regular expressions has been seen before and solved. The function should be made much shorter, but it should do so by leveraging code that is already written. Sadly, the lack of first-class functions makes the equivalent interface in Java more verbose, but that is what you get for using Java.

    The second example, as you mentioned, should have been written using something like StringUtil.defaultIfEmpty. Again, he should be exploiting already written code not writing additional code himself.

  3. I completely agree. Fragmenting the whole story that code tells into countless trivial small methods makes it hard, if not impossible, to see the big picture. The fact that the whole source code became twice as long (LOC) doesn’t exactly make it easier. It just gives you a false sense of understanding, because you understand every method, so you think you understand the program.

    But when asked “what exactly does replace() do”, you have to go down 4 levels of indirection to find out that it also makes sure that it doesn’t replace a name more than once. The interaction between “replaceAllInstances”, “shouldReplaceSymbol” and “replaceSymbol” is not at all obvious!

    My rule of thumb is: Every method or function should provide a NEW reasonable level of abstraction. To do that, it usually has to do something non-trivial. String.replaceAll() already exists, so don’t just invent a fancy new name for it.

  4. But where is your version of SymbolReplacer? It is not fair to criticize some code and does not provide the one which you like.

  5. ammoQ failed to consider that understanding “the program” isn’t an appropriate goal for anyone except in trivial, not real world cases. Most systems are so large and so complex you can’t “understand” them all at once anyway. The purpose of classes is to decompose useful behaviors. Functions doing one thing does not suddenly negate the Single Responsibility Princple for a class. The SymbolReplacer has a single concern. It replaces symbols. If there’s a bug related to symbol replacement, you know exactly where to go – and then it is a much more readable class. Let’s not also forget that there must be unit and integration tests aplenty. If the SymbolReplacer tests are passing, you can start off with confidence that the problem is not in SymbolReplacer. Look at integration tests with SymbolReplacer. There may still be a bug – something not tested, which you should of course add the test for – but that’s how you cain the “understanding”. Appeals to LOC and “understanding the program” are invalid.

  6. I must agree that, sometimes, Uncle Bob goes to far. However, I don’t agree with the rules of thumb here presented when it comes to testing. The problem is that if you have several “groups of strongly associated code together with a comment above it” in the same function, it is a lot harder to maintain a set of (unit) tests that covers all the possibilities.

    1. Agreed, and that is what the final rule of thumb is attempting to address. If the logical ‘block’ of code is reuseable, it should be abstracted away somewhere at the right level of abstraction. This includes adding tests for it. If the code is not reuseable, I do not feel there is a need to open up the logic of a method (break encapsulation) just for testing purposes. I’m simply arguing for a more thought-through process of splitting up code which ‘does more than one thing’, and do not see the merit of an intermediate step involving splitting big methods up into smaller methods located within the same class.

    2. Since most of the methods have visibility private, unit-testing them isn’t any easier, since you unit-test the public interface of your class.
      In this case, unit-testing could even become more complex, since your conditionals might be spread over multiple functions.

      I think it would be nice, to have an easy way to unit-test private functions in Java, which would make this function extractions indeed easier to test…

Leave a comment