Conventional knowledge is that union types are common and intersection types are rare. But actually that’s not the case—intersection types show up in nearly every program Sorbet type checks thanks to control flow.
Union types in
Sorbet are incredibly common, which should be no surprise. In Sorbet,
T.nilable(...)
is sugar for
T.any(NilClass, ...)
. T.nilable
shows up all
over the place and probably catches more bugs than any other feature in
Sorbet.
Sorbet also has intersection
types. While union types mean “either this or that,” intersection
types mean “both this and that.” On first glance, intersection types
seem like some super niche feature which only benefits a handful of
programs. In Stripe’s Ruby monorepo, the strings T.any
and
T.nilable
occur nearly 300 times more than
T.all
does.
But those numbers hide something critical: intersection types power Sorbet’s control flow-sensitive typing. They’re actually present in every Ruby program, but just a little hard to spot. Let’s look at how pervasive they are with a few examples:
# typed: strict
extend T::Sig
class Parent; end
class Child < Parent; end
{params(x: Parent).void}
sig def example1(x)
case x
when Child
.reveal_type(x) # Revealed type: `Child`
Tend
end
Here x
starts out having type Parent
, but
inside the case
statement Sorbet treats x
as
having the more specific type Child
. There’s no
T.all
in sight, but that’s because it’s hiding. Sorbet
doesn’t just throw away the fact that it knew
x <: Parent
. Instead, it uses T.all
to
update its type for x
to
T.all(Parent, Child)
.
T.all(Parent, Child)
is equivalent to Child
because Child
is a subtype of Parent
. If types
represent sets of values, then the set of values represented by
Child
is a subset of the set of values represented by
Parent
, so the intersection of those two sets would just
leave Child
.If you’re not convinced, consider: with
T.all(Parent, Child)
we should be able to call all the
methods on Parent
and all the methods on
Child
. But Child
inherits
Parent
’s methods, so any method Parent
has
will already be on Child
. So Child
is
equivalently good as T.all(Parent, Child)
.
Sorbet attempts to simplify a large type to a smaller, equivalent type when it can for two reasons:
Usability – Most users don’t know that
T.all
means “intersection type” or even what intersection types are. (And even those who do still end up drawing Venn diagrams from time to time!) It only gets more complicated whenT.all
s andT.any
s nest inside each other.Meanwhile,
Child
is a super easy type to understand, and leads to nice, short error messages.Performance – Checking whether one type is a subtype of another is a super common operation, so it has to be fast. By collapsing
T.all(Parent, Child)
toChild
, Sorbet does at least half as much work when checking subtyping (probably more, because of some common path optimizations).When this simplification happens, Sorbet even skips an allocation entirely. Cnstructing
T.all(Parent, Child)
in Sorbet short circuits and returns a reference to the already allocatedChild
type that was passed as an argument (with ownership tracked viastd::shared_ptr
If you’re looking for a good intro to modern C++ things likeshared_ptr
, I can’t recommend this blog post series enough.
).
Let’s look at another example of control flow:
# typed: strict
extend T::Sig
class A; end
class B; end
{params(a_or_b: T.any(A, B)).void}
sig def example2(a_or_b)
case a_or_b
when A
.reveal_type(a_or_b) # Revealed type: `A`
Tend
end
This example method accepts either A
or B
(T.any(A, B)
) and then branches on whether
a_or_b
is an instance of A
. Again: Sorbet
doesn’t throw away that it knows a_or_b <: T.any(A, B)
.
Instead it updates its knowledge of the type of a_or_b
using T.all
to get T.all(T.any(A, B), A)
.
Realizing that this is equivalent to A
is a bit
trickier:
.all(T.any(A, B), A)
T
# Distribute
.any( T.all(A, A) , T.all(B, A) )
T
# T.all(A, A) is just A (idempotence)
.any( A , T.all(B, A) )
T
# A and B are classes (not mixins) and neither inherits the other.
# It's impossible to have a value of that type, so it's bottom:
.any( A , T.noreturn )
T
# bottom is the identity of union
A
You can start to see how usability and performance and might get a little out of hand if Sorbet didn’t keep attempting to simplify things! The cumulative effect of all the control flow in a program would result in huge, unweidly types.
Until now you could claim that I’ve been hyping up intersection types as the solution to problems that were self-imposed. That if we just invented some other method for modeling control flow, it would have been naturally usable or naturally performant, and we wouldn’t have had problems in the first place. So next let’s look at some examples to see why intersection types really are the most natural solution:
module I1
def foo1; end
end
module I2
def foo2; end
end
{params(x: I1).void}
sig def example3(x)
.foo1 # Works outside
xcase x
when I2
.foo1 # Should (and does) still work inside
x.foo2
xend
end
Unlike in the other examples, this is the first example where had we
tried to implement control-flow-sensitive typing by throwing away the
old type and using the new type instead it wouldn’t have worked. The key
thing to notice: this example uses modules. Outside the
case
of course calling x.method_from_1
works
because x
starts out at type I1
. But if we
treated x
as only I2
inside the
when I2
, we’d start reporting an error for calling
x.method_from_1
because it doesn’t exist on
I2
.
Unlike intersecting unrelated classes (our T.all(B, A)
example from earlier), intersecting unrelated modules does’t collapse to
T.noreturn
. There’s nothing stopping some class from
including both I1
and I2
. Instances of that
class would be values of type T.any(I1, I2)
:
class SomeClass
include I1
include I2
end
# This type assertion is okay:
.let(SomeClass.new, T.all(I1, I2)) T
So at least for implementing certain cases of flow sensitive typing, we’ll need intersection types anyways. Then for these certain cases we’d incur the usability and performance problems we discovered earlier and have to solve them.
But more than that, intersection types are fundamentally easier to work with compared to some ad hoc approach to flow sensitive typing. Type system bugs are weird. It’s frquently harder to figure out whether the current behavior is buggy in the first place than it is to find the cause!
In that light, intersection types present an elegant, robust model for arriving at what the correct behavior should be, independent of what Sorbet’s existing behavior is. It’s clear how intersection types interact with union types, and with subtyping, and with generics, and with variance, etc.
By repurposing intersection types to model control flow sensitivity, when things go wrong there’s a framework for discovering what’s right.
(Speaking of repurposing, intersection types also play an important role in how Sorbet suggests potential method signatures! That’s three birds with one stone.)