This article is part number 13 of the Readability series.


Single assignment says that a variable should only be assigned a value once; i.e. a variable should only be initialized and never modified later. This could be said to be a property of functional programming—never mind that it’s used in compiler optimization a damn lot—so it may sound a bit out of scope in a readability post. Not really.

When you force yourself to avoid modifying already-defined variables, you need to find other ways to express your intent. In general, you will do one of three things: use a higher-order construct instead of a loop; define intermediate variables, each with its own descriptive name; or move pieces of the code to an auxiliary function, also with a descriptive name.

As a result, you will get clearer code: an identifier within a code block will be used for only a single purpose. Once readers have determined what the identifier is and what it refers to, they can hold onto that assumption for the whole code block.

Let’s take a look at the previously-described cases one by one.

Use higher-order constructs instead of accumulators

Higher-order constructs such as list comprehensions, maps, reduces, filters, and sums can replace loops in many cases. Use them to build the value of a variable in one go, instead of accumulating it over a loop.

In other words, given this code:

adults = []
for person in people:
    if person.age >= 21:
        adults.append(person)

Change it to one of the following:

adults = [person for person in people if person.age >= 21]
# OR (in Python 3, filter returns a lazy iterator, hence the list() call)
adults = list(filter(lambda person: person.age >= 21, people))

Reasoning over a loop is hard: loops can do “anything”, so you must read the whole structure to understand what is going on and what the side-effects are. On the other hand, these two one-liners show exactly what your intent is. (Additionally, because these alternative idioms are at a higher semantic level, your compiler or runtime engine could do a much better job at optimizing—or, mind you, parallelizing—them.)
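The same idea applies to accumulators other than lists. As a minimal sketch, assuming a hypothetical Person record with name, age and city fields (the snippets above never define one, and the sample data is made up), a running total, a flag and a lookup table can each be built in a single assignment:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record, used only for illustration.
    name: str
    age: int
    city: str


people = [
    Person("Ana", 34, "Dublin"),
    Person("Ben", 17, "Dublin"),
    Person("Cho", 52, "Tokyo"),
]

# Instead of starting at 0 and adding each age inside a loop.
total_age = sum(person.age for person in people)

# Instead of a boolean flag that gets flipped inside a loop.
has_minors = any(person.age < 21 for person in people)

# Instead of filling an empty dict entry by entry.
age_by_name = {person.name: person.age for person in people}
```

Each name is bound exactly once, so none of them ever holds a half-built value that a reader has to keep track of.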

Break up long expressions into various intermediate values

Given this code:

people = load_people(file)
everybody_widget.update(people)

people = [person for person in people if person.age > 21]
adults_widget.update(people)

people = [person for person in people if person.city == here]
nearby_widget.update(people)

You can see that the people variable is assigned a value three times. Each time, a new filter is applied to construct the next value. Note that to understand what value people holds at any point of the snippet, you must know everything that has happened up to that point. If you do this instead:

people = load_people(file)
everybody_widget.update(people)

adults = [person for person in people if person.age > 21]
adults_widget.update(adults)

nearby_adults = [person for person in adults
                 if person.city == here]
nearby_widget.update(nearby_adults)

Then it’s obvious what every variable holds from the point where it is defined to the end of the code block. The names of the intermediate computations are clear, and the original variables always hold their initial values.

Move auxiliary computations to a separate function

This is a followup to the previous code snippet. We could go one step further and rewrite it as follows:

def FilterAdults(people):
    return [person for person in people if person.age > 21]

def FilterNearby(people):
    return [person for person in people if person.city == here]

people = load_people(file)
everybody_widget.update(people)

adults = FilterAdults(people)
adults_widget.update(adults)

nearby_adults = FilterNearby(adults)
nearby_widget.update(nearby_adults)

This is my preferred alternative when the computation of the intermediate values is more complex than a single line (not the case above). Note that by splitting every step into its own function, you give each step a meaningful name and, as a bonus, you can unit-test the various components more easily. Note also that you avoid the need for comments, because the code itself states what the various steps are.
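To illustrate the unit-testing point, here is a minimal sketch using the standard unittest module. The Person tuple, the value of here and the sample data are assumptions made for the example; the snippets above leave all of them undefined.

```python
import unittest
from collections import namedtuple

# Hypothetical stand-ins: the original snippets never define Person or here.
Person = namedtuple("Person", ["name", "age", "city"])
here = "Dublin"


def FilterAdults(people):
    return [person for person in people if person.age > 21]


def FilterNearby(people):
    return [person for person in people if person.city == here]


class FilterTest(unittest.TestCase):
    def setUp(self):
        self.ana = Person("Ana", 34, "Dublin")
        self.ben = Person("Ben", 17, "Dublin")
        self.cho = Person("Cho", 52, "Tokyo")
        self.people = [self.ana, self.ben, self.cho]

    def test_filter_adults_drops_minors(self):
        self.assertEqual(FilterAdults(self.people), [self.ana, self.cho])

    def test_filter_nearby_keeps_local_people(self):
        self.assertEqual(FilterNearby(self.people), [self.ana, self.ben])

    def test_input_list_is_left_untouched(self):
        FilterAdults(self.people)
        self.assertEqual(self.people, [self.ana, self.ben, self.cho])
```

Saved to a file, these tests can be run with python -m unittest. Because each filter returns a new list instead of modifying its argument, every test can check one step in isolation.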

Let me conclude by saying that the advice in this post is not a set of golden rules. There will be cases where modifying an existing variable is the right thing to do, either because it’s unavoidable or because it’s just clearer that way. But if you stop for a moment to consider single assignment every time you write code, you will generally realize that there is a more expressive, self-documenting way to do the same thing.
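One case where reassignment is arguably the clearer choice is a binary search, sketched below. The function name find_index is hypothetical and the example is only an illustration: the two bounds have to narrow on every iteration, and forcing them into single assignments would mostly obscure that.

```python
def find_index(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    low = 0
    high = len(sorted_values) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            return middle
        if sorted_values[middle] < target:
            low = middle + 1   # Discard the lower half.
        else:
            high = middle - 1  # Discard the upper half.
    return -1
```

Even here, low and high each keep a single meaning for the whole function, which is the property that single assignment is meant to protect.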
