Every bug is a type error

DrBearhands

Since I started keeping track two years ago, every single bug I’ve encountered, whether directly or through a friend’s complaints, was some form of type error. As you might already expect, I’m not talking about a type mismatch error at compile time, which is a prevented type error.

We are somewhat conditioned by C-style programming languages to think of a type as essentially not much more than a memory layout for the compiler to generate fast code. Instead, I like to think about the type of a value as defining everything you can do with that value; what information can be extracted from it, and what changes it can undergo (if any). A type error occurs when this collection of “things we can do” is different in our head than it is in the program; the type does not correctly model the domain. Ideally, the type of a function should tell you everything you need to know about it, making documentation obsolete.

I find that the languages we commonly use sadly do not offer the necessary tools to create domain-accurate types. And so we mentally translate the domain into patterns we are familiar with, often focusing on similarities between domain and representation and forgetting the “edge cases” where they differ.

If we could properly represent all our domains in types, we could avoid these type errors as the compiler could type-check our program for us. Like I said, this includes every single bug I’ve encountered in the past two years, so the exercise seems worthwhile. We would need quite different programming languages with certain features that are currently quite exotic.

I compiled a list of common language design issues. All of these are problems I’ve encountered in real life applications. The list is not complete, and each individual case is not elaborated to the full, but this post is long enough as it is. I will probably make an attempt at creating a language addressing these issues at some point in the future.

Sum types

The textbook example of a type error is Java’s Optional, where the get() method returns a value that might not exist. There is no way in Java, or many other imperative languages for that matter, to specify that data can be either one thing or another, only that it is both.

We call a type that is either one thing or another a sum type (or union type). A type that is both one thing and another are product types. Classes and structs are the latter. Defining states of data without sum types is like printing without magenta. You can handle some cases correctly, but not all.

Lets consider, for instance, a connection status. We might either be connected to a peer, or not connected to anybody. It would be wrong to specify who we’re not connected to. Without sum types, like in Java or Go, we’re forced to declare it like this:

class ConnectionStatus {
  boolean connected;
  Peer peer;
}

Whereas in e.g. Haskell we can say:

data ConnectionStatus = NotConnected | Connected Peer

The former can be a source for bugs, the latter cannot. If you’d like to know more about this topic I suggest looking up algebraic data types.

Ints and Floats

There has never been a programmer who needed to use an int or float as a type. Maybe N, Z, Q, or R (natural numbers, integers, rationals and real numbers, respectively), but not a 32² sized ring of integers or IEEE 754 Floating-Point Arithmetic. int and float are the stuff of hardware that has no business being in our domain representation. We’re all familiar with some kind of error resulting from their use: overflows, divide by 0, loss of precision…

Ideally we would let the compiler figure out a mapping between domain types and hardware types. Sometimes this is going to result in a slower implementation; an overflowing int with fixed length is simpler than an unbound natural, but for most business needs correctness is more important than the slight speed increase given by a reduction of operations (as opposed to e.g. cache coherency and branch misspredictions).

In a type-safety conscious language, int and float should be fringe types, hidden in exotic imports, if at all available. We would instead have naturals, integers, rationals, fixed-precision and scientific notation as primitive number types. Variants with custom bounds could potentially be used for efficiency.

But there’s a second issues with numbers. A natural, as a type, does not have any useful engineering context. It only tells us which values something might have, not what it is. A port is not just a natural, an age is not just a natural, adding an age to a port is not a sensible operation. Rather than using number sets as types, it would be better if we could specify them as shorthand for values. Such as:

data Port = 1 to 65535 from Nat

data Age = 0 to infinity from Nat

Abstract data types

Algebraic data types are not always enough. Let’s assume we’ve taken the point from the previous section to heart and have created a type for a counter:

data NumberOfVisitors = 0 to infinity from Nat

while we have correctly specified the potential values data with type NumberOfVisitors might have, we have not specified how those states can be reached. We likely don’t want to use constructors directly. We’re only interested in the value 0, and an increment operation.

Fortunately many languages allow us to restrict which functions to export from a module. Private and public functions are one way of doing it, but no OOP is required.

How you specify imports can strongly encourage of discourage the proper use of abstract data types. Elm vs Haskell is a great example of this, where Elm code tends to be much neater on average in this regard.

Friend modules

C++ has something called friend classes. A class may specify another class to be its “friend”, granting it access to private functions. This niche functionality becomes more important as you take abstract data types farther.

Suppose we have 2 datatypes, type A and type B, in modules Ma and Mb, respectively. We want to specify a function f : A -> B, but to do so we need hidden constructors from both Ma and Mb. We have 2 options: specify an “MaInternal” or “MbInternal” module that exposes all hidden constructors, or merge Ma and Mb into a single module. Neither is ideal.

Friend modules may not necessarily be the best solution, but languages should have some method of implementing abstract data types that is not entirely reliant on matching file structures and asymmetric dependencies.

Error handling

Functions that can throw errors are, of course, a prime source of bugs when error handling is lacking. Using sum types we can already make errors explicit in the type, like so:

someFunction :: SomeInput -> Result SomeError SomeReturnType

Where Result X Y is either an error of type X or “a success” with type Y. This is great, because it forces us to handle any potential errors (assuming a language that requires handling every case in a sum type).

Unfortunately this method is not always great for composition. First, errors should be transformed across abstraction layers. An EOF (End Of File) error is only meaningful if we know the execution state of the program. What file did we open? What did we expect to see rather than an EOF? That information should be added as the error is propagated through the stack / abstraction layers and is eventually handled. Programming languages should incorporate error transformers that do not hinder readability of a function flow. IMO, the common solution is not great in this respect, from Elm:

mapError someErrorTransformer (someFunction input)

The main control flow is hidden inside the error transformation. It’s possible that this simply cannot be done (well) using mere text. Admittedly we can do a little bit better using infix operators ▷ (flipped function application) and >> (flipped function composition):

someFunction input ▷ mapError someErrorTransformer

Another problem occurs when we have multiple error-throwing functions:

f1 :: A -> Result Err1 B
f2 :: B -> Result Err2 C
f3 :: C -> Result Err3 D

data SomeError = Err1Ctor Err1 | Err2Ctor Err2 | Err3Ctor Err3

someFunction :: A -> Result SomeError D
someFunction a =
  (f1 a ▷ mapError Err1Ctor)
  >> andThen (f2 >> mapError Err2Ctor)
  >> andThen (f3 >> mapError Err3Ctor)

There’s a lot of control-flow clutter that makes this code harder to read.

The neatest we can write this is using the do-notation from Haskell, the sensible composition operators from Elm, and some indentation:

someFunction :: A -> Result SomeError D
someFunction a = do
  b <- f1 a ▷ mapError Err1Ctor
  c <- f2 b ▷ mapError Err2Ctor
  f3 c      ▷ mapError Err3Ctor

This is a lot easier to read, but we can take it on step further by having open sums in our programming language. In open sums are the counterpart to tuples: generic sum-types where cases can be specified, added, and remove dynamically.

Assuming we have a special do that creates open sums, we could write:

someFunction :: A -> Result (Err1 | Err2 | Err3) D
someFunction a = do
  b <- f1 a
  c <- f2 b
  f3 c

We can’t do this with regular do-notation because Result Err1 and Result Err2 are not the same monad. A Haskell extension that lets programmers redefine do does exists and it can be used to achieve the above.

Without something like open sums, programmers may be tempted to use non-specific error types like Maybe or Result String, just to more easily compose errors and avoid filling the screen with edge case handling. On the other hand, using open sums as an error rather than using a specific error goes against my previous point about making errors more meaningful.

Implicit Kleisli categories

Take the following c-style function type:

Foo bar(Baz);

Looking at the type, we expect this function takes a Baz and produces a Foo, as that is what the type denotes. However, terms and conditions apply. The function will produce a if there are no errors, if there are no infinite loops, it also may cause side-effects, it may use data beyond the Baz we feed to it…

Formally, we usually write function in a specification that is a Kleisli category of the category we reason in. We implicitly put monads around our types.

Ideally, the default for functions would be to be total: no side-effects, no infinite execution. We know by the halting problem that not every function can be expressed in total languages. In practice, most functions in business can trivially be proven to terminate, and bottoms can easily be added to the type if that is not the case:

terminates :: Function -> (Bool | Bottom)

This is a safety feature just as much as it is clean code dogma. Subjecting 3rd party code to tight restrictions, statelessness in particular, makes it hard for attackers to hide unsavory things in functions.

As a bonus, removing all these terms and conditions makes code easier to reason about, creating new optimization opportunities.

Linear types

I’ve mentioned absence of side-effects as a prerequisite above, but sometimes we just want to mutate something, be it for effectiveness or because something is inherently mutating and doing anything else would also be a type error.

We can achieve the above using linear types. Unlike regular types, linear types are consumed when used, e.g.:

regularF :: A -> B --regular function

linearF :: A -* B --linear function

someFunction :: (A, B)
someFunction =
  let
    a = makeA
  in
    (a, regularF a) --Ok!

someBadFunction :: (A, B)
someBadFunction =
  let
    a = makeA
  in
    (a, linearF a) -- Not ok!

someBadFunction will result in an error, because linearF consumes a, so a cannot also be used to create a tuple.

Haskell has recently introduced linear types, with some caveats, but they operate on functions, just like the example above. I.e. you can specify that a function consumes a value, rather than specifying that a type is inherently linear. I don’t like this approach, because every example for linear types I know of is represented more accurately with the latter approach. Money, connections, pointers, these are all linear by the nature of what they represent.

I’d like to point our that I’m not criticizing the developers who created Haskell’s linear extension, their approach is a good fit for Haskell. For a type safety-zealous language like I’m proposing it isn’t ideal.

Global state tracking (OpenGL examples)

Okay, so we can safely modify local-ish data using linear types. But what about tracking global state? I found this desirable when working with OpenGL, and I expect this will come up more commonly when working closer to hardware in general.

When using OpenGL, we often create a shader program, set some values in that program and run the program. We might write the following code (it’s inefficient and unrealistic, but illustrates the point well):

void someFunction(GLuint program) {
  GLint uniformLocation = glGetUniformLocation(program, "myVariable");
  glUseProgram(program);
  glUniform1f(uniformLocation, 0.42);
}

glGetUniformLocation(program, "myVariable") fetches the location of the variable called myVariable inside program. glUseProgram tells OpenGL to use a specific program on the next draw command and glUniform1f writes data to a location.

Now let’s look at what is missing from the function types:

void glUniform1f( GLint location,
                  GLfloat v0);

Of course GLint is already a bad start to type-safety, but that’s not the point I want to make. What this function does is set a floating point variable, at location, in a shader to a value v0. That part is reasonably clear from the type. However, the location must also be from the currently active program (among other things). So if we had forgotten glUseProgram(program) in our example above we might have caused some undefined behavior (depending on what happens outside this function).

To make this type-safe, each of these functions should set pre- and post-conditions (Hoare logic, essentially) in their type to keep track of how global state changes throughout each function. Illustrated here with some hastily made-up pragma notation.

#pragma post: BoundProgram = @var program
void glUseProgram( GLuint program);


#pragma post: BelongsTo (@var program, @return)
GLint glGetUniformLocation( GLuint program,
                            const GLchar *name);


#pragma pre: BelongsTo (@exist program, @var location)
#pragma pre: BoundProgram = @var program
void glUniform1f( GLint location,
                  GLfloat v0);

With this, the compiler could check adherence to the OpenGL documentation automatically and save us from a lot off potentially undefined behavior and spaghetti execution-order dependencies.

You can make something that comes pretty close in Haskell using indexed monads (and existential types, and type level programming).

Essentially indexed monads have a variable “index” in their type that can denote changes to the type of the state before and after a monadic action. That index can be a phantom type, such as a type denoting “program A is bound”. The problem here, beyond the fact that is becomes a mathematical nightmare for a lot of people, is that we cannot automatically forget things.

In a rendering engine we might draw many objects in a loop, each with their own shader program. We do not know at compile-time which shader program will be bound at the end of the loop:

// What post condition should this function have for BoundProgram?
void someFunction(vector<Object3D*> drawQueue) {
  for (Object3D* obj : drawQueue) {
    glUseProgram(obj->program);
    otherStuff(obj);
  }
}

For indexed monads, which require having explicit, known types for its indices, we would need to add an operation after otherStuff(obj) that does nothing at run time but forgets the bound program at type level:

#pragma post: BoundProgram = @unknown
void forgetProgram();

We’d be polluting terms with types just to hammer an ill-fitting solution into shape. There are even more complex situations were prolog-style nondeterminism and backtracking comes into play.