Problems typing equality in Ruby

TypeScript has this really handy error that flags when it looks like two values of unrelated types are getting compared:

function f(x: number, y: string) {
  if (x === y) {
    //  ^^^ error: This comparison appears to be unintentional because
    //             the types 'number' and 'string' have no overlap.
    console.log(x, y);
  }
}

I would love to build the same error into Sorbet, but there are two features which make that hard: custom overrides of == and subtyping. Here are some heuristics we might consider building in Sorbet, and why they don’t work.

Heuristic 1

Reject all calls to == when the operand types don’t overlap.

The problem: Ruby equality methods make liberal use of implicit type conversions and other tricks. Some examples in the standard library:
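As a minimal sketch of what these conversions look like in plain Ruby: Integer#== accepts a Float, and Array#== delegates to the other operand when it responds to to_ary. The Pair class is hypothetical and exists only to exercise the to_ary path.

```ruby
# Integer#== accepts a Float, although neither class is a subclass of the other.
p(1 == 1.0) # => true

# Hypothetical class (not from the standard library): it is not an Array,
# but it defines to_ary and its own ==.
class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  def to_ary
    [@left, @right]
  end

  def ==(other)
    other.respond_to?(:to_ary) && to_ary == other.to_ary
  end
end

# Array#== sees a non-Array operand that responds to to_ary and
# hands the comparison over to that operand's == method.
p([1, 2] == Pair.new(1, 2)) # => true
p([1, 3] == Pair.new(1, 2)) # => false
```

Neither comparison involves overlapping types, yet both are meaningful, so a blanket "no overlap" error would reject working code.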

Library and application authors pattern their code off of precedents set in the standard library. Implicit conversions like these surface throughout real-world Ruby code.

Heuristic 2

Reject all calls to == when the operand types don’t overlap, as long as the types being compared don’t have a custom == method.

Maybe we pessimistically assume that any override of == might do something wacky. In response, we could limit the check to calls on types that don’t override BasicObject#==.

First of all, that’s almost crippling on its own! When == isn’t overridden it defaults to reference equality, but people use == most often to compare structures, not references! Ruby provides equal? for reference equality—seeing a call to == (not to equal?) is a strong indicator that the author doesn’t want reference equality.
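Here is a small sketch of the difference, using a hypothetical Point class that does not override == and a Struct, which defines a structural == of its own:

```ruby
# Hypothetical class that inherits the default == (reference equality).
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

a = Point.new(1, 2)
b = Point.new(1, 2)
p(a == b)      # => false (same contents, different objects)
p(a.equal?(b)) # => false (equal? is always reference equality)

# Struct overrides == to compare members, which is what people usually want.
PointStruct = Struct.new(:x, :y)
p(PointStruct.new(1, 2) == PointStruct.new(1, 2)) # => true
```

Under this heuristic only the Point comparison would be checked, and that is exactly the kind of class where == is least likely to be used on purpose.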

Even if we were satisfied with how limiting this heuristic is, it’s still not enough, this time because of subtyping:

class Foo; end
class FooChild < Foo
  def ==(other); true; end
end
class Bar; end

sig {params(x: Foo, y: Bar).void}
def example(x, y)
  if x == y
    p(x, y)
  end
end

In this case, Foo and Bar do not overlap. (Note that Foo and Bar do not overlap because they are classes. If one of them had been a module, they would overlap, which would also defeat the equality check.)

Foo does not override ==. Unfortunately that says nothing about what subclasses of Foo might do. In this example, Foo has a subclass called FooChild with a custom override that misbehaves. (I’ve made it always return true for simplicity, but you can just as well imagine it doing some sort of implicit conversion.) Thus, subtyping has defeated our heuristic.
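To see this at runtime, here is the same class hierarchy in plain Ruby with the sig dropped so it runs without Sorbet. The same? helper is hypothetical and stands in for example above.

```ruby
class Foo; end

class FooChild < Foo
  # Misbehaving override: claims equality with anything.
  def ==(other)
    true
  end
end

class Bar; end

# Hypothetical helper; statically, x would be typed as Foo and y as Bar.
def same?(x, y)
  x == y
end

p(same?(Foo.new, Bar.new))      # => false
p(same?(FooChild.new, Bar.new)) # => true
```

Both calls are well typed for params(x: Foo, y: Bar), so an error saying the comparison can never succeed would be wrong for the second one.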

If the left operand had been final! Sorbet could rule out caveats like this, but then our heuristic would be so limited as to be surprising: few classes are final!, and few classes which rely on == do so for reference equality. Someone may see this class of error once, assume it applies more widely than it does, and then be shocked when a problem sneaks past the check.

Heuristic 3

Add custom logic for == which knows about the implicit conversions specific standard library classes do.

Maybe trying to be as general as possible is the wrong approach? We could try picking only a few important classes in the standard library and special case the check for those types.

For example, if we pretend that Symbol is final (it’s not, but maybe we pretend anyway), we could require that the right operand’s type overlaps with Symbol, catching attempts to compare for equality against String (super common). (This is not merely an example: it is the approach we’ve implemented.)
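A quick sketch of the bug this special case would catch. Symbol does no implicit conversion, so comparing against a String is false in either direction. The admin? method is a hypothetical example of how the mistake tends to show up.

```ruby
p(:admin == "admin")        # => false
p("admin" == :admin)        # => false
p(:admin == "admin".to_sym) # => true

# Hypothetical helper: the caller passes a String (say, from parsed JSON),
# so the check silently never succeeds.
def admin?(role)
  role == :admin
end

p(admin?("admin")) # => false
p(admin?(:admin))  # => true
```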

This only gets us as far as types that don’t do implicit conversions. For a type like String, even though we know it allows comparisons against “anything that defines to_str” and Sorbet can look up whether such a method exists during inference, subtyping once again gets in our way:

class AbstractParent; end
class Child < AbstractParent
  def to_str; ""; end
end

sig {params(x: String, y: AbstractParent).void}
def example(x, y)
  if x == y
    p(x, y)
  end
end

In the comparison of String and AbstractParent, the types don’t overlap and AbstractParent doesn’t implement to_str. But Child does, which defeats the heuristic.
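A runnable sketch of that situation in plain Ruby, with no sig. One assumption goes beyond the snippet above: String#== only hands the comparison over to the other operand’s ==, so I’ve given Child a hypothetical == as well. Without it, the inherited reference equality would make the comparison false.

```ruby
class AbstractParent; end

class Child < AbstractParent
  def to_str
    ""
  end

  # Assumption, not in the snippet above: compare by string contents.
  def ==(other)
    other.respond_to?(:to_str) && to_str == other.to_str
  end
end

# Statically, y would be typed as AbstractParent.
def example(x, y)
  x == y
end

p(example("", AbstractParent.new)) # => false
p(example("", Child.new))          # => true
```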

Sorbet could both assume String is final! and check that the right operand is final!, and then implement the check (the last time subtyping got in our way, it was only with the left operand). Adding one more constraint makes this heuristic even less useful and more surprising than the previous.

A manual approach?

Maybe I’m a zealous Sorbet user who acknowledges that the problem can’t be solved for every class. I still want to report an error whenever one of my classes is compared using == on mismatched types—can I take the problem into my own hands? It’s technically as simple as adding a signature:

class A < T::Struct
  extend T::Sig

  const :val, Integer

  sig {params(other: A).returns(T::Boolean)}
  def ==(other)
    case other
    when A then self.val == other.val
    else false
    end
  end
end

The signature requires that other has type A. Done! Well, this fixes the immediate problem (x == y reports an error if y is not an A), but it causes two of its own.

First, this is technically an incompatible override (see A note on variance for more). As it happens, Sorbet silences the error because the RBI definition for BasicObject#== is not marked overridable. (At this point, marking it overridable would be a backwards-incompatible change, requiring existing == signatures to start mentioning override, likely with no benefit.)

Maybe you declare “ignorance is bliss,” ignore the voice telling you that all incompatible overrides are bad, and blaze ahead.

This leads to our second problem, heterogeneous collections.

sig {params(xs: T::Array[T.any(Integer, A)]).void}
def example(xs)
  xs.include?(0) # 💥
end

Ruby’s include? method calls == under the hood. Even if the program never mentions a literal call to == with A and Integer, it still might happen indirectly by way of include? if xs has any A values in it. But now instead of returning false, the sig on our A#== method will raise an exception!

That’s a problem. (You might be tempted to think of this as an indictment of runtime checking, but I don’t. In my opinion, this is the runtime type system flagging a real problem, an incompatible override, which the static type system couldn’t catch because of gradual typing.)
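The indirect call can be reproduced without sorbet-runtime. In this sketch, StrictA is a hypothetical stand-in for A whose == raises a TypeError by hand, imitating what a runtime-checked params(other: A) sig would do. The try_include helper is also hypothetical.

```ruby
# Hypothetical stand-in for A: the explicit raise imitates a runtime-checked
# `params(other: A)` sig, so no sorbet-runtime is needed to run this.
class StrictA
  attr_reader :val

  def initialize(val)
    @val = val
  end

  def ==(other)
    raise TypeError, "expected StrictA, got #{other.class}" unless other.is_a?(StrictA)
    val == other.val
  end
end

# Hypothetical helper: reports whether include? returned or raised.
def try_include(xs, needle)
  xs.include?(needle)
rescue TypeError
  :raised
end

# include? calls element == needle for each element, in order.
p(try_include([StrictA.new(1), 0], 0)) # => :raised
p(try_include([0, StrictA.new(1)], 0)) # => true
```

The second call only succeeds because the search stops at the matching 0 before it reaches the StrictA element, so whether the exception appears depends on the order of the elements.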

To fix it, we could try marking the sig with checked(:never), but then Sorbet’s dead code checking would prevent us from handling other types in our method body.

Another attempted fix might be to use an overloaded signature. These aren’t allowed outside of standard library signatures, but here’s what would happen if they were:

class A < T::Struct
  extend T::Sig

  const :val, Integer

  sig {params(other: A).returns(T::Boolean)}
  sig {params(other: BasicObject).returns(FalseClass)}
  def ==(other)
    case other
    when A then self.val == other.val
    else false
    end
  end
end

sig {params(x: A, y: T.nilable(A)).void}
def example(x, y)
  if x == y
    p(x, y)
  end
end
editor.rb:19: This code is unreachable https://srb.help/7006
    22 |    p(x, y)
            ^^^^^^^
    editor.rb:18: This condition was always `falsy` (`FalseClass`)
    21 |  if x == y
             ^^^^^^
  Got `FalseClass` originating from:
    editor.rb:18:
    21 |  if x == y
             ^^^^^^
Errors: 1

The overloaded signature specifies that if other is A the comparison happens, but if other is any other type, the method returns FalseClass. The resulting errors aren’t quite as obvious: errors now show up indirectly as dead code errors, rather than something descriptive at the point of the problem. It’s unclear whether hard-to-understand errors are better than no errors.

However, Sorbet’s overload resolution doesn’t work well with these signatures. (Writing this post made me wonder if Sorbet should do something smarter here, like try to decompose the argument type and combine all the overloads that apply to each component. But that’s getting into territory where I can’t think of any prior art, which sets off my ⚠️ bad idea ⚠️ radar. For example, TypeScript behaves like Sorbet when porting the above example.)

T.nilable(A) is not a subtype of A, causing Sorbet to apply the BasicObject overload. This means Sorbet ascribes FalseClass to x == y, which is wrong when y is non-nil at runtime.
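To confirm that the branch Sorbet reports as unreachable can in fact run, here is a plain-Ruby sketch with no sigs. PlainA is a hypothetical stand-in for the T::Struct-based A, and the symbols returned by example are only there to make the outcome visible.

```ruby
# Hypothetical stand-in for A, written without T::Struct or sigs.
class PlainA
  attr_reader :val

  def initialize(val)
    @val = val
  end

  def ==(other)
    case other
    when PlainA then val == other.val
    else false
    end
  end
end

# Statically, y would be typed as T.nilable(A).
def example(x, y)
  if x == y
    :reached
  else
    :skipped
  end
end

p(example(PlainA.new(1), PlainA.new(1))) # => :reached
p(example(PlainA.new(1), nil))           # => :skipped
```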

All of this leads me to the conclusion that not only have we failed to add heuristics to Sorbet to solve this, it’s not even really practical for users to take the problem into their own hands.



It’s an unfortunate state of affairs, and one that we likely can’t fix. The best advice I can offer is just to be aware of this and try to write thorough tests. Like I mentioned above, the approach we’ll likely have to take is to insert ad hoc checks for specific standard library classes that are subclassed infrequently in practice and that do no implicit conversions. It will be possible to create false positive errors, and we’ll have to live with that.


Appendix: What do other languages do?

Some languages require that for two things to be equal, their types must always be the same. For example, in Haskell the Eq type class only provides a function of type (Eq a) => a -> a -> Bool. All the occurrences of a in that signature force the left and right operands of == to have matching types.

Other languages say that the two operands’ types must be the same by default, but allow opting into comparisons between other types explicitly. This is how both Rust and C++ work.

Java has basically all the same problems as Sorbet faces with Ruby. java.lang.Object implements equals by default for all classes and its argument takes another Object. The wild thing to me is that the Java designers must have been aware of the way other typed languages handled equality—it’s not like Java takes this approach because it started from an untyped language! C++ had operator overloading powering == all the way back in the 80s, well before Java appeared (to say nothing of Haskell or ML languages).

Mypy implements behavior like TypeScript, but it’s behind a --strict-equality flag. It suffers the same problems as above because it has overridable __eq__ methods and subtyping, but the maintainers have made the call that since individual projects can choose to opt in and since implicit conversions are more rare in Python, the problems are tolerable.

Flow implements the same check that TypeScript does, but only for == not for === (from my limited poking).