Union Types in Flow & Reason

Union types are powerful yet often overlooked. At work, I’ve been using Flow which thankfully supports union types. But as I’ve refactored more of our code to use union types, I’ve noticed that our bundle size has been steadily increasing!

In this post, we’re going to explore why that’s the case. We’ll start with a problem which union types can solve, flesh out the problem to motivate why union types are definitely the solution, then examine the resulting cost of introducing them. In the end, we’ll compare Flow to other compile-to-JS languages on the basis of how they represent union types in the compiled output. I’m especially excited about Reason, so we’ll talk about it the most.

Setup: Union Types in a React Component

Let’s say we’re writing a simple React 2FA (two-factor authentication) modal. We’ll be using Flow, but you can pretend it’s TypeScript if you want. The mockup we were given looks like this:

A sample mockup for a two-factor authentication modal

In this mockup, there are three screens: a loading screen, a screen for entering the code, and a success screen.

We’ll need some way for our component to know which of the three screens is visible. Let’s use a union type in Flow:

type Screen =
  | 'LoadingScreen'
  | 'CodeEntryScreen'
  | 'SuccessScreen';

Union types are a perfect fit! 🎉 Union types document intent and can help guard against mistakes. Fellow developers and our compiler can know “these are all the cases.” In particular, Flow can warn us when we’ve forgotten a case.
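As a small illustration of “these are all the cases,” here is a sketch in TypeScript, whose unions of string literals behave like Flow’s for this purpose. The type is named `Screen_` because TypeScript’s DOM typings already define a global `Screen`; `screenTitles`, `titleFor`, and the title strings are made up for the example.

```typescript
// Same union as above, written in TypeScript.
type Screen_ =
  | 'LoadingScreen'
  | 'CodeEntryScreen'
  | 'SuccessScreen';

// Hypothetical lookup table. Record<Screen_, string> requires exactly
// one entry per case, so a missing or misspelled key is a type error.
const screenTitles: Record<Screen_, string> = {
  LoadingScreen: 'Sending your code',
  CodeEntryScreen: 'Enter your code',
  SuccessScreen: 'All set',
};

// Looks up the (made-up) title for whichever screen is visible.
const titleFor = (screen: Screen_): string => screenTitles[screen];
```

Deleting the `SuccessScreen` entry should make the type checker reject the object literal, which is the kind of warning we want from a union type.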

Our initial implementation is working great. After sharing it with the team, someone suggests adding a “cancel” button in the top corner. It doesn’t make sense to cancel when the flow has already succeeded, so we’ll exclude it from the last screen:

Adding a close button to our modal

No problem: let’s write a function called needsCancelButton to determine if we need to put a cancel button in the header of a particular screen:

const needsCancelButton = (screen: Screen): boolean => {
  // Recall: 'SuccessScreen' is the last screen,
  // so it shouldn't have a cancel button.
  return screen !== 'SuccessScreen';
};

Short and sweet. 👌 Everything seems to be working great, until…

switch: Optimizing for Exhaustiveness

The next day, we get some updated mocks from the design team. This time, they’ve also drawn up a “failure” screen for when the customer has entered the wrong code too many times:

The failure screen for our modal

We can handle this—we’ll just add a case to our Screen type:

type Screen =
  | 'LoadingScreen'
  | 'CodeEntryScreen'
  | 'SuccessScreen'
  // New case to handle too many wrong attempts:
  | 'FailureScreen';

But now there’s a bug in our needsCancelButton function. 😧 We should only show a cancel button on screens where it makes sense, and 'FailureScreen' is not one of those screens. Our first reaction after discovering the bug might be to just exclude 'FailureScreen' too:

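const needsCancelButton = (screen: Screen): boolean => {
  // Both checks must hold, so they are joined with &&.
  // (With ||, the expression would be true for every screen.)
  return (
    screen !== 'SuccessScreen' &&
    screen !== 'FailureScreen'
  );
};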

But we can do better than just fixing the current bug. We should write code so that when we add a new case to a union type, our type checker alerts us before a future bug even happens. What if instead of a silent bug, we got this cheery message from our type checker?

Hey, you forgot to add a case to needsCancelButton for the new screen you added. 🙂

— your friendly, neighborhood type checker

Let’s go back and rewrite needsCancelButton so that it will tell us this when adding new cases. We’ll use a switch statement with something special in the default case:

const impossible = <T>(x: empty): T => {
  throw new Error('This case is impossible.');
}

const needsCancelButton = (screen: Screen): boolean => {
  switch (screen) {
    case 'LoadingScreen':
      return true;
    case 'CodeEntryScreen':
      return true;
    case 'SuccessScreen':
      return false;
    default:
      // (I named this function 'absurd' in my earlier post:
      // https://blog.jez.io/flow-exhaustiveness/)
      // This function asks Flow to check for exhaustiveness.
      //
      // [flow]: Error: Cannot call `impossible` with `screen` bound to `x` because string literal `FailureScreen` [1] is incompatible with empty [2].
      return impossible(screen);
  }
}

(Play with it on Try Flow →)

Now Flow is smart enough to give us an error! Making our code safer, one switch statement at a time. 😅 Union types in Flow are a powerful way to use types to guarantee correctness. But to get the most out of union types, always access them through a switch statement. Every time we use a union type without an exhaustive switch statement, we make it harder for Flow to tell us where we’ve missed something.

(“Always” is a very strong statement. Please use your best judgement. But know that if you’re not using a switch, you’re shifting the burden of exhaustiveness & correctness from the type checker to the programmer!)
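For readers following along in TypeScript, the same pattern can be sketched with `never` standing in for Flow’s `empty`. This version already handles `'FailureScreen'`, so it type-checks as written. The type is named `Screen_` here to avoid the global `Screen` in TypeScript’s DOM typings.

```typescript
type Screen_ =
  | 'LoadingScreen'
  | 'CodeEntryScreen'
  | 'SuccessScreen'
  | 'FailureScreen';

// `never` plays the role of Flow's `empty`: no value has this type,
// so the call only type-checks once every case has been handled.
const impossible = (_x: never): never => {
  throw new Error('This case is impossible.');
};

const needsCancelButton = (screen: Screen_): boolean => {
  switch (screen) {
    case 'LoadingScreen':
    case 'CodeEntryScreen':
      return true;
    case 'SuccessScreen':
    case 'FailureScreen':
      return false;
    default:
      // Removing any case above turns this line into a type error.
      return impossible(screen);
  }
};
```

Removing the `'FailureScreen'` case should reproduce the same kind of error Flow reported above, pointing at the `impossible(screen)` call.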

Correctness, but at what cost?

You might not have noticed, but we paid a subtle cost in rewriting our needsCancelButton function. Let’s compare our two functions:

// ----- before: 62 bytes (minified) -----

const needsCancelButton = (screen) => {
  return screen !== 'SuccessScreen';
};

// ----- after: 240 bytes (minified) -----

const impossible = (x) => {
  throw new Error('This case is impossible.');
};

const needsCancelButton = (screen) => {
  switch (screen) {
    case 'LoadingScreen':
      return true;
    case 'CodeEntryScreen':
      return true;
    case 'SuccessScreen':
      return false;
    default:
      return impossible(screen);
  }
};

With just an equality check, our function was small: 62 bytes minified. But when we refactored to use a switch statement, its size shot up to 240 bytes! That’s nearly a 4x increase, just to get exhaustiveness. Admittedly, needsCancelButton is a bit of a pathological case. But in general: as we make our code bases safer using Flow’s union types of string literals, our bundle size bloats!

Types and Optimizing Compilers

One of the many overlooked promises of types is the claim that by writing our code with higher-level abstractions, we give more information to the compiler. The compiler can then generate code that captures our original intent, but as efficiently as possible.

Flow is decidedly not a compiler: it’s only a type checker. To run JavaScript annotated with Flow types, we first strip the types (with something like Babel). All information about the types vanishes when we run the code.

(Even though TypeScript defines both a language and a compiler for that language, in practice it’s not much different from Flow here. A goal of the TypeScript compiler is to generate JavaScript that closely resembles the original TypeScript, so it doesn’t do compile-time optimizations based on the types.)
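To make “the types vanish” concrete, here is a small TypeScript sketch; TypeScript erases its types at compile time much as Babel strips Flow’s. The names `untypedInput`, `claimedScreen`, and `knownScreens` are hypothetical, and the JSON string stands in for any untyped data, such as a server response.

```typescript
type Screen_ = 'LoadingScreen' | 'CodeEntryScreen' | 'SuccessScreen';

// Hypothetical untyped input, such as a parsed JSON payload.
const untypedInput: unknown = JSON.parse('"NotAScreen"');

// The cast satisfies the type checker but compiles to nothing:
// no run-time check is emitted for it.
const claimedScreen = untypedInput as Screen_;

// At run time the value is a plain string; the union no longer exists.
const runtimeType: string = typeof claimedScreen;

// Checking membership therefore has to be done by hand.
const knownScreens: string[] = ['LoadingScreen', 'CodeEntryScreen', 'SuccessScreen'];
const isReallyAScreen: boolean = knownScreens.indexOf(claimedScreen) !== -1;
```

Here `runtimeType` is `'string'` and `isReallyAScreen` is `false`, even though the type checker believes `claimedScreen` is one of the three cases. A compiler that erases types has nothing left to optimize with.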

What can we achieve if we were to keep the types around all the way through compilation?

Reason (i.e., ReasonML) is an exciting effort to bring all the benefits of the OCaml tool chain to the web. In particular, Reason works using OCaml’s mature optimizing compiler alongside BuckleScript (which turns OCaml to JavaScript) to emit great code.

To see what I mean, let’s re-implement our Screen type and needsCancelButton function, this time in Reason:

type screen =
  | LoadingScreen
  | CodeEntryScreen
  | SuccessScreen;

let needsCancelButton = (screen: screen): bool => {
  switch (screen) {
  | LoadingScreen => true;
  | CodeEntryScreen => true;
  | SuccessScreen => false;
  }
};

Looks pretty close to JavaScript with Flow types, doesn’t it? The biggest difference is that the case keyword was replaced with the | character. Making the way we define and use union types look the same is a subtle reminder to always pair union types with switch statements! More than being a nice reminder, it makes it easy to copy / paste our type definition as boilerplate to start writing a new function!

Another difference: Reason handles exhaustiveness checking out of the box. 🙂

What does the Reason output look like?

// Generated by BUCKLESCRIPT VERSION 3.0.1, PLEASE EDIT WITH CARE
'use strict';

function needsCancelButton(screen) {
  if (screen >= 2) {
    return false;
  } else {
    return true;
  }
}

(Play with it on Try Reason →)

Not bad! Telling Reason that our function was exhaustive let it optimize the entire switch statement back down to a single if statement. In fact, it gets even better: when we run this through uglifyjs, it removes the redundant true / false:

"use strict";
function needsCancelButton(n){
  return !(n>=2)
}

Wow! This is actually better than our initial, hand-written equality check. Reason compiled what used to be a string literal 'SuccessScreen' to just the number 2. Reason can do this safely because custom-defined types in Reason aren’t strings, so it doesn’t matter if the names get mangled.
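To make the integer representation concrete, here is a hand-written TypeScript approximation. It is not BuckleScript’s actual output: the constant names and the `ScreenTag` type are only for illustration, and the numbering assumes constructors are tagged in declaration order, which is consistent with the `>= 2` check in the generated code.

```typescript
// Assumed tags: one small integer per constructor, in declaration order.
const LoadingScreen = 0;
const CodeEntryScreen = 1;
const SuccessScreen = 2;

// Illustrative type: a value is one of the three tags and nothing else.
type ScreenTag =
  | typeof LoadingScreen
  | typeof CodeEntryScreen
  | typeof SuccessScreen;

// Same shape as the minified output: one numeric comparison, no strings.
const needsCancelButton = (tag: ScreenTag): boolean => !(tag >= 2);
```

Nothing in this version depends on the constructor names as strings, so none of them has to survive into the shipped code.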

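Taking a step back, Reason’s type system delivered on the promise of types in a way Flow couldn’t: we wrote high-level code, the type checker verified that our switch was exhaustive, and the compiler turned it into output smaller than what we wrote by hand.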

I’m really excited about Reason. 😄 It has a delightful type system and is backed by a decades-old optimizing compiler tool chain. I’d love to see more people take advantage of improvements in type systems to write better code!


Appendix: Other Compile-to-JS Runtimes

The above analysis only considered Flow + Babel and Reason. But then I got curious about how other typed languages that compile to JavaScript compare on the optimizations front:

TypeScript

Despite being a language and compiler, TypeScript maintains a goal of compiling to JavaScript that closely resembles the source TypeScript code. TypeScript has three language constructs for working with exhaustiveness:

  1. union types (identical to the Flow unions that we’ve been talking about),
  2. enums, which are sort of like defining a group of constants all at once, and
  3. const enums which are like enums except that they’re represented more succinctly in the compiled output.

TypeScript’s union types over string literals are represented the same way as in Flow, so I’m going to skip (1) and focus instead on (2) and (3).

TypeScript’s enum and const enum are subtly different. Not having used the language much, I’ll refer you to the TypeScript documentation to learn more about the differences. But for sure, const enums compile much better than normal enums.

Here’s what normal enums look like in TypeScript—they’re even worse than unions of string literals:

var Screen_;
(function (Screen_) {
    Screen_[Screen_["LoadingScreen"] = 0] = "LoadingScreen";
    Screen_[Screen_["CodeEntryScreen"] = 1] = "CodeEntryScreen";
    Screen_[Screen_["SuccessScreen"] = 2] = "SuccessScreen";
})(Screen_ || (Screen_ = {}));
var impossible = function (x) {
    throw new Error('This case is impossible.');
};
var needsCancelButton = function (screen) {
    switch (screen) {
        case Screen_.LoadingScreen:
            return true;
        case Screen_.CodeEntryScreen:
            return true;
        case Screen_.SuccessScreen:
            return false;
        default:
            return impossible(screen);
    }
};

TypeScript Playground →
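The TypeScript source isn’t shown above. A reconstruction that should compile to roughly that output might look like the following; treat it as a sketch rather than the original. The enum is called `Screen_` to match the output, presumably because `Screen` is already a global in TypeScript’s DOM typings.

```typescript
// A normal (non-const) enum: members are numbered 0, 1, 2 by default.
enum Screen_ {
  LoadingScreen,
  CodeEntryScreen,
  SuccessScreen,
}

// Only type-checks when called on a value with no remaining cases.
const impossible = (_x: never): never => {
  throw new Error('This case is impossible.');
};

const needsCancelButton = (screen: Screen_): boolean => {
  switch (screen) {
    case Screen_.LoadingScreen:
      return true;
    case Screen_.CodeEntryScreen:
      return true;
    case Screen_.SuccessScreen:
      return false;
    default:
      return impossible(screen);
  }
};
```

Changing `enum` to `const enum` is the only source edit needed to get the const enum variant.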

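So for normal enums, TypeScript keeps a Screen_ object around at run time and leaves the switch statement and the impossible call in place.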

And then here’s what const enums look like—you can see that TypeScript represents them under the hood without any sort of Screen_ object:

var impossible = function (x) {
    throw new Error('This case is impossible.');
};
var needsCancelButton = function (screen) {
    switch (screen) {
        case 0 /* LoadingScreen */:
            return true;
        case 1 /* CodeEntryScreen */:
            return true;
        case 2 /* SuccessScreen */:
            return false;
        default:
            return impossible(screen);
    }
};

TypeScript Playground →

PureScript

PureScript is another high-level language like Reason. Both Reason and PureScript have data types where we can define unions with custom constructor names. Despite that, PureScript’s generated code is significantly worse than Reason’s.

"use strict";
var LoadingScreen = (function () {
    function LoadingScreen() {};
    LoadingScreen.value = new LoadingScreen();
    return LoadingScreen;
})();
var CodeEntryScreen = (function () {
    function CodeEntryScreen() {};
    CodeEntryScreen.value = new CodeEntryScreen();
    return CodeEntryScreen;
})();
var SuccessScreen = (function () {
    function SuccessScreen() {};
    SuccessScreen.value = new SuccessScreen();
    return SuccessScreen;
})();
var needsCancelButton = function (v) {
    if (v instanceof LoadingScreen) {
        return true;
    };
    if (v instanceof CodeEntryScreen) {
        return true;
    };
    if (v instanceof SuccessScreen) {
        return false;
    };
    throw new Error("Failed pattern match at Main line 10, column 1 - line 10, column 39: " + [ v.constructor.name ]);
};

Admittedly, I didn’t try that hard to turn on optimizations in the compiler. Maybe there’s a flag I can pass to get this Error to go away. But that’s pretty disappointing, compared to how small Reason’s generated code was!

Elm

I list Elm in the same class as Reason and PureScript. Like the other two, it lets us define custom data types, and will automatically warn us when pattern matches aren’t exhaustive. Here’s the code Elm generates:

var _user$project$Main$needsCancelButton = function (page) {
  var _p0 = page;
  switch (_p0.ctor) {
    case 'LoadingScreen':
      return true;
    case 'CodeEntryScreen':
      return true;
    default:
      return false;
  }
};
var _user$project$Main$SuccessScreen = {ctor: 'SuccessScreen'};
var _user$project$Main$CodeEntryScreen = {ctor: 'CodeEntryScreen'};
var _user$project$Main$LoadingScreen = {ctor: 'LoadingScreen'};

It’s interesting to see that even though Reason, PureScript, and Elm all have ML-style datatypes, Reason is the only one that uses an integer representation for the constructor tags.