Why Types Are Not Documentation

Tue 16 June 2015

Type annotations have been oozing into Python and getting all over everything this week. One of my open-source libs is getting a big update, and we’re playing with Sphinx’s type hints for arguments. Elsewhere, friends on Twitter question the need for comprehensive documentation in general, proposing to hoist much of it up into type systems. And I continue to tinker away on an experimental new language, which prods me to think about types as both an implementor and a prospective user.

I keep returning to three distinct things that often get mixed up:

  • Constraints
  • Behavior
  • and Intent

With these ideas in mind, the roles and limitations of types, tests, and documentation become clear.

The Coming Naming Mess

First, how will explicit constraints, in the form of type annotations, change the Python idiom?

Any programming community is the sum of its language’s specification and its accumulated usage patterns. You can derive one from the other no more than you can reconstitute the English vulgate from a dictionary. As type annotations gradually infiltrate Python culture, I feel myself pulling back. This isn’t because it’s bad to let machines reason about constraints; they love that stuff, and it would be nice to catch some mistakes more automatically. But type information is already woven into the Python idiom, and simply slapping the type fish on top makes for an unappetizing concoction.

Specifically, Python nouns—like the function arguments that type annotations describe—idiomatically include both semantic information and structural hints.

def frob(file_path, should_flush):
    ...

In the above, file_path is clearly a string, and should_flush is obviously a boolean. Just adding type annotations does not improve clarity at all; in fact, it adds noise for the native reader:

def frob(file_path: str, should_flush: bool):
    ...

Note the redundancy—not even beneficial by dint of being separated in space—of “path” and “str”, “should” and “bool”.

To add type annotations without sacrificing readability, we must revise our idiom, lifting out the type information rather than just repeating it:

def frob(file: str, flush: bool):
    ...

This is fine in isolation. We did lose some resolution as “path” became “str”, but in this case it isn’t too bad. However, think of what happens as codebases collide: mixtures of Python 2 with Sphinx type hints, pure Python 3 with annotations, and code that targets 2 and 3. Naming conventions will necessarily be either redundant on 3, uninformatively terse on 2, or an inconsistent mixture. I don’t look forward to that transitional period. And given that we’re already 7 years into the adoption of Python 3, it may be a very long one. Indeed, we can’t go around blithely renaming kwargs (lest we break call sites), so it may be permanent.
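To make the kwarg problem concrete, here is a minimal sketch. It reuses the two frob signatures from above under hypothetical names (frob_old and frob_new), with placeholder bodies that are purely my own assumption. Any call site that passes the arguments by keyword stops working once the names change:

```python
def frob_old(file_path, should_flush):
    """Original spelling; the body is a placeholder for illustration only."""
    return (file_path, should_flush)


def frob_new(file: str, flush: bool):
    """Renamed, annotated spelling; same placeholder body."""
    return (file, flush)


# A call site written against the original parameter names.
call_kwargs = {'file_path': 'notes.txt', 'should_flush': True}

old_result = frob_old(**call_kwargs)  # works as before

try:
    frob_new(**call_kwargs)
except TypeError as exc:
    # The old keyword names no longer exist, so the call fails at runtime.
    rename_error = exc
```

Positional callers are unaffected, which is exactly why this kind of breakage tends to surface late and piecemeal.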

Can We Extract Intent From Constraints?

The siren song of type systems is that a sufficiently advanced one can substitute for documentation. This, in my experience, is true of only the most mechanical statements: this arg is a string, only one worker can access this value at a time, this function performs IO. I have yet to encounter generated documentation that makes newcomers shriek with joy at its elysian clarity. Readers of C++ or Java-derived autodocs—even on the level of individual subroutines—generally shriek with other emotions entirely.

The reason is that types—at least as we have them today—are only constraints. They specify invariants that hold as we move through a program, and, from a human perspective, they tend to be fairly low-level trivialities. (In fact, “type” is an unfortunate word, a carryover from early languages where all they guarded was the difference between floats and chars. We could more usefully call them “invariants” if that weren’t already taken.) Types convey structure and allow us to mechanically enforce rules for keeping it intact. But structure alone is not enough to convey meaning.

For example, what can you know about this well-typed function?

def f(x: str, y: int) -> str:
    ...

It has some constraints in effect, but, as long as it meets those, it can do anything, even just returning "foo" all the time. It could be a string-repetition function that repeats its first argument a certain number of times, but we can’t deduce that from only the types. Let’s fill in an implementation and see if that helps:

def f(x: str, y: int) -> str:
    return '{}#{}'.format(x, y)

Now our function has behavior. It’s clear what it does, but it could still mean a great many things; we could not yet write a test for it. What is it trying to do? Is it succeeding at it? Let’s add naming and see intent begin to filter in:

def source_url_with_line_anchor(path: str, line: int) -> str:
    return '{}#{}'.format(path, line)

A lot of meaning comes flying out of the names, basically for free, just by choosing good ones rather than bad. But we’re still not quite done.

def source_url_with_line_anchor(path: str, line: int) -> str:
    """Return a URL to a source code file at a given line.

    :arg path: The checkout-relative path to the source file
    :arg line: The line number to point to

    """
    return '{}#{}'.format(path, line)

With actual documentation in place, the intent of the function, not just its actual behavior, finally becomes clear. At this point, we could write tests, whose purpose is always to map intents to (testable) behaviors. We would notice that the function is buggy: it should URL-escape any weird chars that make their way into path. Of course, we could define a Url class and lift some of the documented intent up into the type annotations, but that gets us only a baby step closer: there’s still no telling which URL we intend to return. If type systems went that far, we would have no need of any other facet of programming. Though, at that point, I suspect the required constraint declarations would look a lot like Prolog and be so incomprehensible as to send us scurrying back to handcrafted documentation. We’ve been playing with constraint solvers for over 40 years, and even the most mature ones are not shy about expressing intent in English. For that matter, it’s been shown that C++’s template system is Turing-complete, but I wouldn’t want to trade my docs for it.
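As a rough sketch of what such a test might look like, here is one way to turn the documented intent into a checkable behavior. The escaped variant and the anchor_is_unambiguous check are hypothetical names of my own, and using urllib.parse.quote is just one plausible way to do the escaping, not necessarily the right one for every URL scheme:

```python
from urllib.parse import quote


def source_url_with_line_anchor(path: str, line: int) -> str:
    """The version from above, unchanged."""
    return '{}#{}'.format(path, line)


def escaped_source_url_with_line_anchor(path: str, line: int) -> str:
    """Hypothetical fix: percent-escape the path before adding the anchor."""
    # quote() leaves '/' alone by default, so directory separators survive.
    return '{}#{}'.format(quote(path), line)


def anchor_is_unambiguous(url: str) -> bool:
    """Intent under test: the only '#' in the URL introduces the line anchor."""
    return url.count('#') == 1


# A path with "weird chars": spaces and a literal '#'.
weird_path = 'docs/issue #12 notes.txt'

original_ok = anchor_is_unambiguous(
    source_url_with_line_anchor(weird_path, 7))
escaped_ok = anchor_is_unambiguous(
    escaped_source_url_with_line_anchor(weird_path, 7))
```

Both functions satisfy exactly the same type signature, yet only one passes the check. The annotations had nothing to say about the difference; the docstring is what told us which behavior to test for.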

It’s intent that humans need in order to understand a system. We grope for intent even when it’s not there, in all kinds of complex systems: casting viruses as wanting to replicate and gasses as wanting to bubble up out of solution. Perhaps we all make the best use of our wetware by overzealously anthropomorphizing everything. But as programmers, we should recognize that constraints are for proof machines, behaviors are for tests, and intents are for each other.