Disclaimer: this post was first drafted as a Stripe-internal email. On December 10, 2022 I republished it here, largely unchanged from the original. See Some Old Sorbet Compiler Notes for more. The Sorbet Compiler is still largely an experimental project: this post is available purely for curiosity’s sake.
Any benchmark numbers included in this post are intended to be educational about how the Sorbet Compiler approaches speeding up code. They should not be taken as representative or predictive of any real-world workload, and are likely out-of-date with respect to improvements that have been made since this post originally appeared.
Last week in Types Make Array Access Faster we compared the Ruby VM’s performance on array accesses with the Sorbet Compiler’s performance on array accesses, as an example of how making types available to the Sorbet Compiler let it speed up code. The snippet under scrutiny was basically this operation:
xs[0]
but repeated many (10M) times to make the performance difference obvious.
The data we collected looked like this:
benchmark | interpreted | compiled | interpreted, minus while | compiled, minus while | compiler speedup, w/o while |
---|---|---|---|---|---|
while_10_000_000.rb | 0.205s | 0.048s | — | — | — |
untyped_array_aref.rb | 0.282s | 0.174s | 0.077s | 0.126s | 0.61x |
typed_array_aref.rb | 0.282s | 0.061s | 0.077s | 0.013s | 5.92x |
And our ultimate conclusion was:
With type information, Sorbet-compiled code is even faster than both the interpreted code and the compiled but untyped code.
But there was an interesting caveat along the way:
The array access operation is actually slower than the Ruby VM if Sorbet doesn’t have type information (0.61x speedup is less than 1, so it’s a slowdown).
The idea was that for our plain xs[0] program, the compiler was actually slower than the interpreter.
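For reference, the derived columns in the table come from simple arithmetic: subtract the time of the bare while loop from each measurement, then divide the interpreted remainder by the compiled remainder. Here is a minimal sketch of that calculation, using the numbers from the table above. The constant and method names are made up for illustration and are not part of the Sorbet Compiler's benchmark tooling.

```ruby
# Timings in seconds, copied from the table above.
WHILE_BASELINE = { interpreted: 0.205, compiled: 0.048 }.freeze

TIMINGS = {
  "untyped_array_aref.rb" => { interpreted: 0.282, compiled: 0.174 },
  "typed_array_aref.rb"   => { interpreted: 0.282, compiled: 0.061 },
}.freeze

# Hypothetical helper: speedup of compiled over interpreted code
# once the cost of the surrounding while loop has been subtracted.
def speedup_without_while(timing, baseline)
  interpreted = timing[:interpreted] - baseline[:interpreted]
  compiled = timing[:compiled] - baseline[:compiled]
  (interpreted / compiled).round(2)
end

TIMINGS.each do |name, timing|
  puts "#{name}: #{speedup_without_while(timing, WHILE_BASELINE)}x"
end
```

A result below 1 means the compiled code spent more time on the array access than the interpreter did.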
Why was the compiler slower?
It turns out that array access is one of the operations the Ruby VM is already pretty good at, because it’s special-cased. We can check this by looking at the bytecode instructions that the Ruby VM uses to evaluate an array access:
❯ ruby --dump=insns -e 'xs = []; xs[0]'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,14)> (catch: FALSE)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] xs@0
0000 newarray 0 ( 1)[Li]
0002 setlocal_WC_0 xs@0
0004 getlocal_WC_0 xs@0
0006 putobject_INT2FIX_0_
0007 opt_aref <callinfo!mid:[], argc:1, ARGS_SIMPLE>, <callcache>
0010 leave
Here’s how to read this output:
- We used the special --dump=insns flag to the ruby command line. You can try this at home!
- There’s some stuff we don’t need on the first few lines, and then the bytecode instructions start with the line reading 0000.
- The actual instruction that corresponds to the xs[0] expression happens at index 0007. The name of the instruction is opt_aref.
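The same listing can also be produced from inside a Ruby program, which is handy for comparing several snippets. This is a small sketch that assumes MRI (CRuby), since RubyVM::InstructionSequence is specific to that implementation. The helper name disasm_for is made up for illustration, and the exact text of the listing varies between Ruby versions.

```ruby
# MRI-only: RubyVM::InstructionSequence is not available on JRuby or TruffleRuby.
# Hypothetical helper that compiles a source string and returns its bytecode listing.
def disasm_for(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

puts disasm_for("xs = []; xs[0]")
```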
That’s interesting! Instead of treating array access like any other method call, the Ruby VM treats it as a special, optimized instruction called opt_aref. (Did you know that square brackets are just a method call in Ruby?) Checking the implementation of that instruction, we find that the optimization only works if the method receiver (xs in this case) is exactly an instance of the Array or Hash class.
In other words, it’s easy to defeat this optimization by subclassing Array:
class MyArray < Array
end
xs = MyArray.new([2])
xs[0]
In this case, since xs is not exactly Array or Hash anymore, the optimization won’t apply, and the Ruby VM falls back to calling a method named [] on xs with argument 0.
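To make the “exactly an instance” condition concrete, here is a small sketch in plain Ruby. Note that instance_of? is only a Ruby-level approximation of the check the VM performs internally. The point is that a MyArray is still an Array as far as is_a? is concerned, but it is not exactly an Array, and that xs[0] is ordinary method dispatch underneath.

```ruby
class MyArray < Array
end

plain = [2]
sub = MyArray.new([2])

# Approximation of the VM's "exactly Array" condition.
p plain.instance_of?(Array) # true
p sub.instance_of?(Array)   # false
p sub.is_a?(Array)          # true

# Square brackets are sugar for calling the method named [].
p sub[0]                    # 2
p sub.[](0)                 # 2
p sub.public_send(:[], 0)   # 2
```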
We can see the effect of this by writing another Sorbet compiler benchmark, and adding it to our table:
benchmark | interpreted | compiled | interpreted, minus while | compiled, minus while | compiler speedup, w/o while |
---|---|---|---|---|---|
while_10_000_000.rb | 0.205s | 0.048s | — | — | — |
untyped_array_aref.rb | 0.282s | 0.174s | 0.077s | 0.126s | 0.61x |
typed_array_aref.rb | 0.282s | 0.061s | 0.077s | 0.013s | 5.92x |
untyped_array_subclass_aref.rb | 0.388s | 0.172s | 0.183s | 0.124s | 1.48x |
Changing the untyped Array to an untyped subclass of Array slows the interpreter down by an extra 0.106s, but our compiled version doesn’t care whether it was the Array case or the MyArray case, because they’re both untyped. (Editing note: These numbers are unchanged from when I first measured in September 2020. They do not necessarily reflect the Sorbet Compiler’s current performance.)
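The interpreter side of this comparison is easy to try at home. The sketch below is a hypothetical stand-in for the benchmark files, not the actual contents of the Sorbet Compiler's benchmarks: it times 10M accesses on a plain Array and on a MyArray using Ruby's standard Benchmark module. The helper name time_aref is made up, and the absolute numbers will depend on your machine and Ruby version.

```ruby
require "benchmark"

class MyArray < Array
end

# Hypothetical stand-in for the benchmark files:
# index into xs n times inside a while loop and return the elapsed seconds.
def time_aref(xs, n)
  Benchmark.realtime do
    i = 0
    while i < n
      xs[0]
      i += 1
    end
  end
end

n = 10_000_000
puts "Array:   #{time_aref([2], n).round(3)}s"
puts "MyArray: #{time_aref(MyArray.new([2]), n).round(3)}s"
```

Both timings include the cost of the while loop itself, so they correspond to the “interpreted” column rather than the “minus while” column.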
Now that the Ruby VM is no longer effectively special-casing our benchmark, the compiler starts to shine! This is another reason why we’re really optimistic about the impact of the compiler. Our initial plans were to speed up typed code, and count on other teams adding types everywhere. While adding types definitely helps (look at that 5.92x speedup!), the compiler can still speed up certain kinds of untyped code, too.