Static & Dynamic Typing

I've written and been paid to write code in a wide variety of languages - functional and imperative, statically and dynamically typed, verbose and concise. Over time, I came to appreciate the benefits of certain language features.

The one I want to talk about today is static typing. My programs in statically typed languages seem to be systematically more likely to work correctly on the first try, more robust against bugs introduced by refactoring or adding new features, and easier to reason about. In fact, the benefits are so large that I now use pytype even for Python scripts with just a hundred lines.

Others seem to share this experience. From an interview with the engineering lead for TypeScript:

I think my favorite thing that I see is people on the Internet saying, 'I did this huge refactoring in TypeScript and I was refactoring for three hours. And then I ran my code and it worked the first time.' In a dynamic language, that would just never, ever happen....
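One reason such refactorings tend to work on the first run can be sketched in Rust. The `PaymentStatus` enum and the `is_final` function below are hypothetical names invented for illustration. The sketch assumes that `Refunded` was added during a refactoring: because the `match` has no catch-all arm, the compiler would reject any `match` that had not been updated to handle the new variant.

```rust
// Hypothetical domain type; imagine `Refunded` was added during a refactor.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PaymentStatus {
    Pending,
    Settled,
    Refunded,
}

fn is_final(status: PaymentStatus) -> bool {
    // No wildcard arm: if a variant is added and not handled here, the
    // compiler reports a non-exhaustive match before the program can run.
    match status {
        PaymentStatus::Pending => false,
        PaymentStatus::Settled => true,
        PaymentStatus::Refunded => true,
    }
}

fn main() {
    let all = [
        PaymentStatus::Pending,
        PaymentStatus::Settled,
        PaymentStatus::Refunded,
    ];
    for status in all {
        println!("{:?} final: {}", status, is_final(status));
    }
}
```

Deleting the `Refunded` arm turns this into a compile error rather than a bug found later in testing or production.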

First, let's clear up a common confusion: all commonly used programming languages have values that are typed in some way (a number vs a string). The difference is whether these types are checked ahead of time by a compiler, or only at run-time, if and when needed, by an interpreter or JIT. In fact, a major way in which JITs can speed up code is by discovering and caching the types used by dynamic code^[1].
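To make the distinction concrete, here is a minimal Rust sketch. The `Value` enum and both `add_*` functions are hypothetical names invented for illustration. `Value` mimics how a dynamic language tags each value with its type and checks the tag only when an operation actually runs, while `add_static` relies on the compiler to check the types before the program starts.

```rust
// Hypothetical stand-in for a dynamically typed value: the type is a tag
// carried around at run time.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

// Dynamic style: the type check happens at run time, only when this runs.
fn add_dynamic(a: &Value, b: &Value) -> Result<Value, String> {
    match (a, b) {
        (Value::Number(x), Value::Number(y)) => Ok(Value::Number(x + y)),
        (Value::Text(x), Value::Text(y)) => Ok(Value::Text(format!("{x}{y}"))),
        _ => Err(format!("type error: cannot add {a:?} and {b:?}")),
    }
}

// Static style: a mismatched call is rejected before the program runs.
fn add_static(a: f64, b: f64) -> f64 {
    a + b
}

fn main() {
    let n = Value::Number(1.0);
    let s = Value::Text("one".to_string());

    // Compiles fine; the mistake only surfaces when this line executes.
    println!("{:?}", add_dynamic(&n, &s));

    // The equivalent static mistake does not compile:
    // add_static(1.0, "one"); // error[E0308]: mismatched types
    println!("{}", add_static(1.0, 2.0));
}
```

In both styles the values have types. What changes is when a mismatch is reported, and whether the faulty line has to be executed for anyone to notice.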

Verbosity

While it is certainly true that some statically typed languages, such as C++ and Java, can be very verbose, this is not an inherent property of static typing. For example, Rust and Scala show that conciseness does not have to come at the cost of performance or safety.

The main way in which statically typed languages avoid verbosity due to types is type inference: the programmer specifies types in a few strategic places, say function signatures, and the compiler uses them to infer the types in the rest of the program. These are locations that would be informally annotated with types even in dynamically typed programs - think of Python docstrings.
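As a minimal sketch of this idea, the hypothetical `word_counts` function below (a name invented for illustration) annotates only its signature. Every local variable in the body is left for the compiler to infer.

```rust
use std::collections::HashMap;

// Only the signature is annotated; it doubles as documentation.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    // No annotation needed: because `counts` is returned, the signature
    // tells the compiler it must be a `HashMap<&str, usize>`.
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `word` is inferred as `&str`, and the literals `0` and `1`
        // are inferred as `usize`.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("the quick brown the lazy the");
    println!("{:?}", counts.get("the"));
}
```

The body reads much like the equivalent script in a dynamic language, yet changing the return type to, say, `HashMap<&str, String>` would be flagged at compile time.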

Type inference isn't restricted to a simple linear or top-down process either - as long as the necessary information is present somewhere, the compiler can figure it out. From the Rust documentation:

fn main() {
    // Because of the annotation, the compiler knows that `elem` has type u8.
    let elem = 5u8;

    // Create an empty vector (a growable array).
    let mut vec = Vec::new();
    // At this point the compiler doesn't know the exact type of `vec`, it
    // just knows that it's a vector of something (`Vec<_>`).

    // Insert `elem` in the vector.
    vec.push(elem);
    // Aha! Now the compiler knows that `vec` is a vector of `u8`s (`Vec<u8>`)
}

Type inference also works in complex situations (example from Reddit, interactive playground):

use rand::Rng;

fn main() {
    let n = 20;
    
    // Generate a random graph on n vertices in adjacency list format.
    let mut rng = rand::thread_rng();
    let graph: Vec<Vec<_>> = (0..n)
        .map(|_| (0..n).filter(|_| rng.gen()).collect())
        .collect();
        
    for (vertex, neighbours) in graph.iter().enumerate() {
        println!("vertex {:2} connected to: {:?}", vertex, neighbours);
    }
}

Small vs Large

Engineering is the art of trade-offs, and it should come as no surprise that different tasks call for radically different solutions.

Software engineering can be thought of as "programming integrated over time". What practices can we introduce to our code to make it sustainable - able to react to necessary change - over its life cycle, from conception to introduction to maintenance to deprecation?

-- Software Engineering at Google

When your time-scale is very short, as when writing a single-use analytics script, the maintenance benefits of static typing will be less important than when building a system that is expected to be maintained for decades. Static types will still be useful to avoid bugs, but when your entire program fits in a few hundred lines you are less likely to miss subtle edge cases to begin with.

For very small, single-use programs, in my experience the most important factors are ease of writing and conciseness, something that dynamic languages often excel at. However, be careful: one-off scripts have a nasty habit of staying in use much longer than you thought!

Types Everywhere

Not long ago, web programmers scoffed at types, yet this year TypeScript edged out Python to become the second most loved programming language in the annual Stack Overflow developer survey. The #1 spot has been claimed by Rust for five years in a row. Kotlin is bringing concise typing to the JVM, and even Python is adding static type checking.

Life is too short to manually look for bugs a computer can catch for you.

[1] Introduced in the paper Optimizing Dynamically-Typed Object-Oriented Languages With Polymorphic Inline Caches (PDF).

Tags: programming