TypeScript has this really handy error that flags when it looks like two values of unrelated types are getting compared.
I would love to build the same error into Sorbet, but there are two
features which make that hard: custom overrides of ==
and
subtyping. Here are some heuristics we might consider building in
Sorbet, and why they don’t work.
Heuristic 1
Reject all calls to
==
when the operand types don’t overlap.
The problem: Ruby equality methods make liberal use of implicit type conversions and other tricks. Some examples in the standard library:
Array, Hash, and String allow the right operand to implement a method (to_ary, to_hash, or to_str, respectively) that gets called implicitly on arguments whose type does not match the receiver.
In addition to implicit conversions, most numeric types (for example, Integer and Float) will call other == self whenever self == other initially returns false. Here’s an example:
class A
  def ==(other)
    other.is_a?(A) || other == 0
  end
end
p(0 == A.new) # => true
This allows classes like Process::Status in the standard library to be compared interchangeably with Integer. (In one sense, it’s neat: you can have methods like status.success? without monkey patching Integer, while still allowing code like 0 == status.)
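To make the first item concrete, here is a minimal sketch. FilePath is a hypothetical class invented for illustration. As far as I understand CRuby’s behavior, String#== checks that the right operand responds to to_str and then delegates to that operand’s own ==, so the sketch defines both methods.

```ruby
# Hypothetical class for illustration; not part of the standard library.
class FilePath
  def initialize(path)
    @path = path
  end

  # Defining to_str marks FilePath as "string-like".
  def to_str
    @path
  end

  # Compare by string contents against anything string-like.
  def ==(other)
    other.respond_to?(:to_str) && to_str == other.to_str
  end
end

# String and FilePath are unrelated classes, yet the comparison succeeds.
p("/tmp" == FilePath.new("/tmp")) # => true

# An object without to_str is simply unequal.
p("/tmp" == Object.new)           # => false
```

A checker that only asked "do String and FilePath overlap?" would reject the first comparison even though it is meaningful at runtime.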
Library and application authors pattern their code off of precedents set in the standard library. Implicit conversions like these surface throughout real-world Ruby code.
Heuristic 2
Reject all calls to
==
when the operand types don’t overlap, as long as the types being compared don’t have a custom ==
method.
Maybe we pessimistically assume that any override of
==
might do something wacky. In response, we could limit
the check to calls on types that don’t override
BasicObject#==
.
First of all, that’s almost crippling on its own! When
==
isn’t overridden it defaults to reference equality, but
people use ==
most often to compare structures, not
references! Ruby provides equal?
for reference
equality—seeing a call to ==
(not to equal?
)
is a strong indicator that the author doesn’t want reference
equality.
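A small sketch of the difference, using two hypothetical point classes (PlainPoint and StructPoint are names made up for this illustration):

```ruby
# Hypothetical classes for illustration.
class PlainPoint
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
  # No == override: falls back to BasicObject#==, i.e. reference equality.
end

# Struct generates a structural == that compares members.
StructPoint = Struct.new(:x, :y)

a = PlainPoint.new(1, 2)
b = PlainPoint.new(1, 2)

p(a == b)      # => false (different objects, same contents)
p(a.equal?(b)) # => false (equal? is always reference equality)
p(a.equal?(a)) # => true  (same object)
p(StructPoint.new(1, 2) == StructPoint.new(1, 2)) # => true (structural)
```

Only PlainPoint would be eligible for the check under this heuristic, and it is exactly the kind of class where == is least likely to be what the author wanted.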
Even if we were satisfied with how limiting this heuristic is, it’s still not enough, this time because of subtyping:
class Foo; end
class FooChild < Foo
  def ==(other); true; end
end
class Bar; end

sig {params(x: Foo, y: Bar).void}
def example(x, y)
  if x == y
    p(x, y)
  end
end
In this case, Foo and Bar do not overlap, and Foo does not override ==. (Note that Foo and Bar do not overlap because they are classes. If one of them had been a module, they would overlap, also defeating the equality check.)
Unfortunately that says nothing about what subclasses of
Foo
might do. In this example, Foo
has a
subclass called FooChild
with a custom override that
misbehaves. (I’ve made it always return true
for
simplicity, but you can just as well imagine it doing some sort of
implicit conversion.) Thus, subtyping has defeated our heuristic.
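Here is the same scenario as a runnable sketch. The compare method is a hypothetical, untyped stand-in for example above, with the sig omitted so that it runs without sorbet-runtime.

```ruby
class Foo; end

class FooChild < Foo
  # Misbehaving override, as in the example above.
  def ==(other)
    true
  end
end

class Bar; end

# Untyped stand-in for `example`: statically, x would be a Foo and y a Bar.
def compare(x, y)
  x == y
end

p(compare(Foo.new, Bar.new))      # => false (default reference equality)
p(compare(FooChild.new, Bar.new)) # => true  (a FooChild is a valid Foo argument)
```

Both calls would type check against the signature, so flagging x == y as always false would be wrong for the second one.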
If the left operand had been final!, Sorbet could rule out caveats like this, but then our heuristic would be so limited as to be surprising: few classes are final!, and few classes which rely on == do so for reference equality.
Someone may see this class of error once, assume it applies more widely
than it does, and then be shocked when a problem sneaks past the
check.
Heuristic 3
Add custom logic for
==
which knows about the implicit conversions specific standard library classes do.
Maybe trying to be as general as possible is the wrong approach? We could try picking only a few important classes in the standard library and special case the check for those types.
For example, if we pretend that Symbol is final (it’s not, but maybe we pretend anyway), we could require that the right operand’s type overlaps with Symbol, catching attempts to compare for equality against String (super common). (This is not merely an example: it is the approach we’ve implemented here.)
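The kind of bug this catches looks something like the following sketch, where params is a hypothetical hash of request parameters:

```ruby
# Hypothetical request parameters, e.g. parsed from a query string.
params = { "status" => "active" }

# A Symbol is never == to a String, even when the characters match.
p(params["status"] == :active)        # => false
p(:active == params["status"])        # => false

# Converting one side makes the comparison meaningful.
p(params["status"].to_sym == :active) # => true
p(params["status"] == :active.to_s)   # => true
```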
This only gets us as far as types that don’t do implicit conversions.
For a type like String
, even though we know it allows
comparisons against “anything that defines to_str
” and
Sorbet can look up whether such a method exists during inference,
subtyping once again gets in our way:
class AbstractParent; end
class Child < AbstractParent
  def to_str; ""; end
end

sig {params(x: String, y: AbstractParent).void}
def example(x, y)
  if x == y
    p(x, y)
  end
end
In the comparison of String
and
AbstractParent
, the types don’t overlap
and AbstractParent
doesn’t implement
to_str
. But Child
does, which defeats the
heuristic.
Sorbet could both assume String
is final!
and check that the right operand is
final!
, and then implement the check (the last time
subtyping got in our way, it was only with the left operand). Adding one
more constraint makes this heuristic even less useful and more
surprising than the previous.
A manual approach?
Maybe I’m a zealous Sorbet user who acknowledges that the problem
can’t be solved for every class. I still want to report an error
whenever one of my classes is compared using ==
on
mismatched types—can I take the problem into my own hands? It’s
technically as simple as adding a signature:
class A < T::Struct
  extend T::Sig
  const :val, Integer

  sig {params(other: A).returns(T::Boolean)}
  def ==(other)
    case other
    when A then self.val == other.val
    else false
    end
  end
end
The signature requires that other
has type
A
. Done! Well, this fixes the immediate problem (x == y reports an error if y is not an A), but it causes two of its own.
First, this is technically an incompatible override (see A
note on variance for more). As it happens, Sorbet silences the error
because the RBI definition for BasicObject#==
is not marked
overridable
. (At this point, marking it overridable would be a backwards-incompatible change, requiring existing == signatures to start mentioning override, likely with no benefit.)
Maybe you declare “ignorance is bliss,” ignore the voice
telling you that all incompatible overrides are bad, and blaze
ahead.
This leads to our second problem, heterogeneous collections.
Ruby’s include?
method calls ==
under the
hood. Even if the program never mentions a literal call to
==
with A
and Integer
, it still
might happen indirectly by way of include?
if
xs
has any A
values in it. But now instead of
returning false
, the sig
on our
A#==
method will raise an exception!
That’s a problem. (You might be tempted to think of this as an indictment of runtime checking, but I don’t see it that way. In my opinion, this is the runtime type system flagging a real problem, an incompatible override, which the static type system couldn’t catch because of gradual typing.)
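The include? failure described above can be sketched in plain Ruby. This version of A is a stand-in for the T::Struct: the explicit raise imitates the runtime check a sig would perform, which is an assumption of the sketch and not sorbet-runtime’s actual error class or message.

```ruby
# Plain-Ruby stand-in for the T::Struct version of A.
class A
  attr_reader :val

  def initialize(val)
    @val = val
  end

  def ==(other)
    # Imitates the sig's runtime check (assumption: the real error differs).
    raise TypeError, "expected A, got #{other.class}" unless other.is_a?(A)
    val == other.val
  end
end

# A heterogeneous collection: Integers and an A side by side.
xs = [1, A.new(2), 3]

# Found at the first element, so A#== is never reached.
p(xs.include?(1)) # => true

# Not found at the first element, so include? goes on to compare A.new(2) with 0.
begin
  xs.include?(0)
rescue TypeError => e
  puts "raised: #{e.message}" # prints "raised: expected A, got Integer"
end
```

Nothing in this snippet writes == between an A and an Integer, yet the comparison still happens.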
To fix it, we could try marking the sig
with
checked(:never)
, but then Sorbet’s dead code checking would
prevent us from handling other types in our method body.
Another attempted fix might be to use an overloaded signature. These aren’t allowed outside of standard library signatures, but here’s what would happen if they were:
The overloaded signature specifies that if other
is
A
the comparison happens, but if other
is any
other type, the method returns FalseClass
. The resulting
errors aren’t quite as obvious: errors now show up indirectly as dead
code errors, rather than something descriptive at the point of the
problem. It’s unclear whether hard-to-understand errors are better than
no errors.
However, Sorbet’s overload resolution doesn’t work well with these signatures. T.nilable(A) is not a subtype of A, causing Sorbet to apply the BasicObject overload. This means Sorbet ascribes FalseClass to res, which is wrong when y is non-nil at runtime.
(Writing this post made me wonder if Sorbet should do something smarter here, like try to decompose the argument type and combine all the overloads that apply to each component. But that’s getting into territory where I can’t think of any prior art, which sets off my ⚠️ bad idea ⚠️ radar. For example, TypeScript behaves like Sorbet when porting the above example.)
All of this leads me to the conclusion that not only have we failed to add heuristics to Sorbet to solve this, it’s not even really practical for users to take the problem into their own hands.
It’s an unfortunate state of affairs, and one that we likely can’t fix. The best advice I can offer is just to be aware of this and try to write thorough tests. Like I mentioned above, the approach we’ll likely have to take is to insert ad hoc checks for specific standard library classes that are subclassed infrequently in practice and that do no implicit conversions. It will be possible to create false-positive errors, and we’ll have to live with that.
Appendix: What do other languages do?
Some languages require that for two things to be equal, their types
must always be the same. For example, in Haskell the Eq
type class only provides a function of type
(Eq a) => a -> a -> Bool
. All the occurrences of
a
in that signature force the left and right operands of
==
to have matching types.
Other languages say that the two operands’ types must be the same by default, but allow opting into comparisons between other types explicitly. This is how both Rust (example) and C++ work.
Java has basically all the same problems as Sorbet faces with Ruby.
java.lang.Object
implements equals
by default
for all classes and its argument takes another Object
. The
wild thing to me is that the Java designers must have been aware of the
way other typed languages handled equality—it’s not like Java takes this
approach because it started from an untyped language! C++ had operator
overloading powering ==
all the way back in the 80s, well
before Java appeared (to say nothing of Haskell or ML languages).
Mypy implements behavior like TypeScript, but it’s behind a
--strict-equality
flag. It suffers the same problems as
above because it has overridable __eq__
methods and
subtyping, but the maintainers have made the call that since individual
projects can choose to opt in and since implicit conversions are more
rare in Python, the problems are tolerable.
Flow implements the same check that TypeScript does, but only for
==
not for ===
(from my limited poking).