Identity and Equality in Ruby and Smalltalk

One important concept in object-oriented languages is the difference between equality and identity. The concept isn’t complicated, but the English words we use to talk about it are imprecise. The thesaurus says that “identity” and “equality” are synonyms. So let’s back up and make sure we’re being clear on the concepts.

We often talk about variables as if they “hold” a certain object, but that’s not a very good metaphor. You can put the same object “inside” any number of different variables. If we’re working with containment as a metaphor, we run into problems here. It’s as if we’re saying that the same person is in two different houses at the same time.

Instead, we need to think of variables as names. Different people use different names for the same thing (Puma, Mountain Lion, Cougar) or the same name for multiple things (my “home” is not the same house as your “home”). Names are just a reference to something real, a way to address things.

We say that two variables are identical when they both refer to the same object. If they refer to different objects that represent the same value, we say that they are equal. Consider, for example, two points that have the same X and Y coordinates. They are equal, but not identical. Objects themselves can tell you whether they’re identical or equal.
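To make the distinction concrete, here is a minimal Ruby sketch. The Point struct and the variable names are assumptions made up for illustration; == and equal? are Ruby’s messages for equality and identity.

```ruby
# Point is a hypothetical value class; Struct supplies a member-wise ==.
Point = Struct.new(:x, :y)

a = Point.new(1, 2)
b = Point.new(1, 2)   # a different object with the same coordinates
c = a                 # a second name for the object that a refers to

a == b         # => true  (equal: same X and Y)
a.equal?(b)    # => false (not identical: two separate objects)
a.equal?(c)    # => true  (identical: two names, one object)

# Because a and c name the same object, a change made through one name
# is visible through the other.
c.x = 10
a.x            # => 10
```

Note that after the last assignment a and b are no longer equal, while a and c are still identical. Identity never changes; equality depends on the current values.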

In Smalltalk, the relevant messages are:

= The objects are equal
== The object(s) are identical
hash A number that must be the same for all equal objects, typically used by collections

The Smalltalk virtual machine is the sole place where object identity can be determined, so the #== method is implemented as a primitive and never overridden by any class. But scores of classes implement their own method for #= and #hash. A classic Smalltalk programming mistake is to override #= but not #hash, thus breaking the requirement that two equal objects have the same hash value.

In Ruby, there are significantly more equality messages that you may encounter. The definitions I’m using here come from pp. 95 and 571 of Programming Ruby:

== Test for equal value
=== Used to compare each of the items with the target in the when clause of a case statement
<=> General comparison operator. Returns -1, 0, or +1, depending on whether its receiver is less than, equal to, or greater than its argument.
.eql? True if the receiver and the argument have both the same type and equal values. 1 == 1.0 returns true, but 1.eql?(1.0) returns false.
.equal? True if the receiver and argument have the same object ID
hash Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.
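A quick tour of these messages on built-in objects may help. This is only a sketch: the classify method and the sample values are made up for illustration, and everything else is core Ruby.

```ruby
# == compares values and lets numeric types coerce.
1 == 1.0             # => true

# eql? also requires the same class, so no coercion happens.
1.eql?(1.0)          # => false
1.eql?(1)            # => true

# equal? is identity: two separately built strings are equal, not identical.
s1 = String.new("cougar")
s2 = String.new("cougar")
s1 == s2             # => true
s1.equal?(s2)        # => false

# Objects that are eql? must report the same hash.
s1.eql?(s2)          # => true
s1.hash == s2.hash   # => true

# <=> returns -1, 0, or +1.
1 <=> 2              # => -1
2 <=> 2              # => 0
3 <=> 2              # => 1

# === is what a when clause sends to its target: Class#=== tests
# membership and Range#=== tests inclusion.
def classify(value)
  case value
  when Integer  then "an integer"
  when 0.0..1.0 then "a float between 0 and 1"
  else "something else"
  end
end

classify(7)      # => "an integer"
classify(0.5)    # => "a float between 0 and 1"
classify("x")    # => "something else"
```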

This is a little overwhelming. Ruby’s :equal? method is a test for identity, and like Smalltalk, it is implemented in Object and never overridden. That gets us started.

The methods for :eql? and :hash are probably the most similar to Smalltalk. If you override one, you need to override the other. Ruby’s collections use them to determine equality for things like Array#uniq or Hash lookup keys.

This has bitten me more than once. I’ve implemented a new sort of object and overridden ==, only to find later that

  • I can’t use those objects for Hash keys, and
  • Arrays return unexpected results when sent :uniq
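Both symptoms are easy to reproduce. The Cell class below is a hypothetical stand-in that overrides == and nothing else; it is a sketch of the mistake, not my original code.

```ruby
# Hypothetical class that overrides == only.
class Cell
  attr_reader :row, :col

  def initialize(row, col)
    @row = row
    @col = col
  end

  def ==(other)
    other.is_a?(Cell) && row == other.row && col == other.col
  end
end

a = Cell.new(1, 2)
b = Cell.new(1, 2)

a == b              # => true
a.eql?(b)           # => false, eql? is still the default identity test

lookup = { a => "found" }
lookup[b]           # => nil, Hash lookup relies on hash and eql?

[a, b].uniq.size    # => 2, Array#uniq relies on hash and eql? as well
```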

It seems that instead of overriding ==, I should have overridden eql? and hash. This makes my Hashes and Array#uniq results work like I expected. But wait… The default Ruby implementation of == defers to equal?, which is not what I want either:

>> g1 =
=> #<GridAddress:0xb7b569c8 @key=:key, @row=nil>
>> g2 =
=> #<GridAddress:0xb7b50028 @key=:key, @row=nil>
>> g1.eql?(g2)
=> true
>> g1.hash == g2.hash
=> true

>> g1 == g2
=> false

There’s no deferral between eql? and ==: overriding one does not change the other. In other words, the behavior I want requires me to override eql?, hash, and ==. These are independent, parallel implementations. The difference is primarily semantic. With eql?, there is no attempt to coerce the objects to a similar type. Methods for ==, on the other hand, typically try to coerce objects to the same type before making comparisons.
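Here is a minimal sketch of overriding all three together. The Address class is hypothetical: its key and row fields are borrowed from the inspect output above, but the implementation is an assumption, not the actual GridAddress code.

```ruby
# Hypothetical value object with key and row fields.
class Address
  attr_reader :key, :row

  def initialize(key, row)
    @key = key
    @row = row
  end

  # Strict equality: same class and eql? fields, no coercion.
  def eql?(other)
    other.instance_of?(self.class) &&
      key.eql?(other.key) && row.eql?(other.row)
  end

  # Must agree with eql?: objects that are eql? produce the same hash.
  def hash
    [self.class, key, row].hash
  end

  # Looser equality: fields are compared with ==, so 1 and 1.0 match.
  def ==(other)
    other.is_a?(Address) && key == other.key && row == other.row
  end
end

g1 = Address.new(:key, 1)
g2 = Address.new(:key, 1)
g3 = Address.new(:key, 1.0)

g1.eql?(g2)             # => true
g1.hash == g2.hash      # => true
g1 == g2                # => true
g1.equal?(g2)           # => false, still two separate objects

g1 == g3                # => true, == lets 1 and 1.0 match
g1.eql?(g3)             # => false, eql? does not coerce

{ g1 => "found" }[g2]   # => "found"
[g1, g2].uniq.size      # => 1
```

Building the hash from an array of the same fields that eql? compares is one simple way to keep the two methods in agreement.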

Finally, we come to the “spaceship” operator, <=>. This is used by a mixin called Comparable, and you automatically get various other comparison operators (including ==) if you include Comparable and implement <=>. The only objects that understand it are those for which less-than and greater-than comparisons actually make sense.
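As a sketch of what Comparable provides, consider a hypothetical Version class (the name and fields are made up for this example) that implements only <=>:

```ruby
# Hypothetical class: versions ordered by major number, then minor number.
class Version
  include Comparable
  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # The only comparison written by hand. Array#<=> compares element by
  # element; returning nil signals that the two objects are not comparable.
  def <=>(other)
    return nil unless other.is_a?(Version)
    [major, minor] <=> [other.major, other.minor]
  end
end

v1 = Version.new(1, 9)
v2 = Version.new(2, 0)

v1 < v2                               # => true
v1 == Version.new(1, 9)               # => true, via Comparable#==
v2.between?(v1, Version.new(3, 0))    # => true
[v2, v1].min.equal?(v1)               # => true
v1.eql?(Version.new(1, 9))            # => false
```

The last line is worth noticing: Comparable supplies ==, but it leaves eql? and hash alone, so these objects would still misbehave as Hash keys.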

In my opinion, Ruby complicates things here with little payoff. Most programs compare using ==, and the details of other equality comparisons are lost on most Ruby programmers. I think it’s a fair generalization that most Ruby developers don’t use their own objects as hash keys, probably due to a sort of bias towards the more “primitive” object types like Ramon Leon talks about (see “Obsession with Simple Types” in this post).

It’s also worth noting that if you override eql? or ==, you’re expected to check to make sure that the objects have the same type before you start comparing any details. This is the common pattern in Smalltalk, too. Most Smalltalk implementations of #= first check in some way to see that the two objects are the same kind of thing and then proceed to compare relevant details.


Filed under Ruby, Smalltalk

7 responses to “Identity and Equality in Ruby and Smalltalk”

  1. Pingback: Egalité et identité en Smalltalk et Ruby at #doesNotUnderstand:

  2. I’ve come up against this problem, too.

    I am writing a program based on a card game. The card game uses two packs of cards. A 7 of Diamonds is equal to the 7 of Diamonds from the other pack. But they are not identical.

    I need object identity in places, but when considering what card to play I don’t need the AI to think about the 7 of Diamonds twice if he has both in his hand. A .uniq on his hand doesn’t work for this, but I am loath to override .eql? to get this behavior.

    So this is somewhere that ruby isn’t quite working for me. That said, the Comparable mixin that gives you a lot for simply defining the spaceship operator (<=>) is pretty cool.

  3. David Baldwin

    Seems like an identical? method would be a good alias or replacement for equal?

  4. ha ha ha, i face this problem everyday !

  5. Does ruby refer to <=> as rocketship or spaceship? I named that spaceship for Perl. Maybe something got lost in the translation.

  6. Pingback: What is the difference between identity and equality in OOP? - QuestionFocus
