Rust was right in not including inheritance.
Introduction
The title of this post makes a bold claim, I understand. It has been specifically phrased so as to avoid ambiguity; still, it needs some clarifications and caveats.
- Not everyone agrees 100% on what Object-Oriented Programming (henceforth “OOP”) constitutes. By Smalltalk's standards, neither C++ nor Java is OOP; by my own, Rust is OOP. I chose the phrasing “traditionally-implemented OOP” to refer to the most popular meaning, which includes inheritance; this is exemplified e.g. by C++ and Java. (Henceforth “tradOOP” for short.)
- The purpose of inheritance, in each and every explanation I could find, is said to be the structuring of is-a relationships, explained further below. This is the purpose for which I believe inheritance to be unfit, and by extension tradOOP as a whole. I do not doubt the existence of programs for which it is a good fit, or even the best we currently have; my assertion concerns its stated purpose, and only that.
- The main observation presented herein is not mine; it even exists in Wikipedia, with the name “circle-ellipse problem”. Wikipedia's article, however, is not accessible to the casual reader, and does not offer alternatives. The present article, in contrast, aims to achieve both those goals.
Our motivating question
Let us begin with a simple enough question, whose answer will stay with us throughout the rest of this article: What purpose does a programming language serve?
For the purposes of this article, the working answer to this question shall be as follows:
The purpose of a programming language is to map real-world problems to a computer's executive capabilities.
Correct, isn't it? We could go more specific or more vague if we wanted, but this answer is not wrong at all. More to the point, this answer leads us to a very important corollary:
One direct measure of a programming language's quality is its ability to effectively model the real world.
I say this because there is a very specific part of the real world which I would like to highlight, and whose relationship with programming languages I would like to explore: subcategories, and their very close relative, substitutability.
Is-a relationships
Subcategories appear in natural language fairly frequently. Usually, they appear using the phrase “is a”, hence the term “is-a relationship”. For instance: A woman is a human. A human is a mammal. A mammal is an animal.
Substitutability means that, whenever someone wants a Y, you can instead offer an X without any problems; in other words, you substitute an X for a Y. It is a very close relative of subcategories because, if X is a subcategory of Y, then an X can be substituted for a Y. For example, thanks to the hierarchy presented in the prior paragraph, if someone says “Name an example of a specific animal”, a completely correct thing to respond would be “your mother”.
(Sorry about this. Won't do it again.)
With all this in mind, let us examine how subcategories and substitutability (henceforth “subtyping”) work in Rust, and compare and contrast this with object-oriented languages like C++ and Java.
Subtyping in Rust
To illustrate Rust's capabilities in subtyping, we will be using a simple hierarchy of geometric shapes.
Program structure
- A parallelogram will be represented by three points: Two of its consecutive vertices, plus its centre.
- A parallelogram's angle or side can be changed at will. This can lead to either new parallelograms or to a change in existing ones.
- A rectangle is a parallelogram whose angles are all equal.
- A rhombus is a parallelogram whose sides are all equal.
- A square is a rhombus and rectangle at the same time.
The phrasings were not really chosen at random. They were chosen to highlight a very specific way to define things: so-called genus and differentia. In simpler terms, you define something by saying first which category it belongs in, and then what sets it apart from other members of this category.
Available options
So, let's explore how this whole thing could be expressed in Rust! The first thing to note is that, if X and Y are both nouns, Rust has no direct way to express the phrase “an X is a Y”. Instead, we have the following two options, each with its own unique advantages and disadvantages:
- One could choose, for the Y, not a noun but an adjective: in Rust terms, not a concrete type but a trait.
  - Advantage: We can identify common behaviour between types, thereby grouping types together by the behaviour they exhibit.
  - Disadvantage: We can no longer say “let there be a Y called `foo`”, because traits cannot be used to instantiate variables.
- The other option, if Y is a noun, is to implement the `From` trait. This is a way to tell the language how to transform an X into a Y, which in turn lets us use it in place of a Y.
  - Advantage: We can have concrete instances of both Xs and Ys.
  - Disadvantage: Once we change the type of something, its previous type gets lost, and thenceforth we can only use it as a Y.
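The two options can be sketched side by side. This is only a minimal illustration, not the code developed later in this article: the `HasArea` trait, the two structs, and the helper functions are hypothetical names chosen for brevity.

```rust
// Hypothetical types, for illustration only.
struct Rectangle { width: f32, height: f32 }
struct Square { side: f32 }

// Option 1: Y is an adjective (a trait). Any type that exhibits the
// behaviour can be passed wherever the trait is expected.
trait HasArea {
    fn area(&self) -> f32;
}

impl HasArea for Rectangle {
    fn area(&self) -> f32 { self.width * self.height }
}

impl HasArea for Square {
    fn area(&self) -> f32 { self.side * self.side }
}

fn total_area(shapes: &[&dyn HasArea]) -> f32 {
    shapes.iter().map(|s| s.area()).sum()
}

// Option 2: Y is a noun (a concrete type). `From` tells the language
// how to turn a Square into a Rectangle.
impl From<Square> for Rectangle {
    fn from(sq: Square) -> Self {
        Rectangle { width: sq.side, height: sq.side }
    }
}

// Accepts anything convertible into a Rectangle, including a Square.
// After the conversion, the value is only known as a Rectangle.
fn perimeter(shape: impl Into<Rectangle>) -> f32 {
    let r: Rectangle = shape.into();
    2.0 * (r.width + r.height)
}
```

With these definitions, `total_area` accepts a mix of squares and rectangles by reference, while `perimeter` consumes its argument and forgets whether it started life as a `Square`.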
Of course, nothing prevents us from using both of those approaches at the same time! We do exactly this in the code below: There exists a Parallelogram data-type, which represents a generic parallelogram; there also exists an IsPgram trait, which represents the behaviours common to all subcategories of parallelograms.
Architecture step by step
A possible way to structure the code is as follows, detailed step by step.
Firstly, we need a few data-types for lengths and angles and points, and then four structs for each kind of shape:
```rust
type Length = f32;
type Angle = f32;
type Point = ;
```
- This means that, yes, parallelograms are represented both as concrete types and as categories.
Then, a trait detailing their common behaviour:
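One way such a trait might look is sketched below. Only the names `IsPgram` and `Parallelogram` come from this article; the associated types `AfterAngle` and `AfterSide`, the method names, and the `area` helper are assumptions. The model is also simplified to sides and an angle, with no position, rather than the three-point representation described above.

```rust
type Length = f32;
type Angle = f32; // radians, by assumption

// Simplified, position-free stand-in for the generic parallelogram.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Parallelogram {
    primary_side: Length,
    secondary_side: Length,
    primary_angle: Angle,
}

// Behaviour common to every subcategory of parallelogram.
trait IsPgram: Into<Parallelogram> {
    // The type that results from changing the primary angle.
    type AfterAngle: IsPgram;
    // The type that results from changing the primary side.
    type AfterSide: IsPgram;

    fn with_primary_angle(self, angle: Angle) -> Self::AfterAngle;
    fn with_primary_side(self, side: Length) -> Self::AfterSide;
}

// Works for any subcategory, because each one converts into a Parallelogram.
fn area(shape: impl IsPgram) -> f32 {
    let p: Parallelogram = shape.into();
    p.primary_side * p.secondary_side * p.primary_angle.sin()
}

// The generic parallelogram is itself a member of the category.
impl IsPgram for Parallelogram {
    type AfterAngle = Parallelogram;
    type AfterSide = Parallelogram;

    fn with_primary_angle(self, angle: Angle) -> Parallelogram {
        Parallelogram { primary_angle: angle, ..self }
    }

    fn with_primary_side(self, side: Length) -> Parallelogram {
        Parallelogram { primary_side: side, ..self }
    }
}
```

The associated types are the design choice that matters here: they let each implementor state, in its signature, which type an operation yields.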
Each kind of shape shall be given exactly the state it needs, no less and no more:
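A minimal sketch of the idea, under a loud assumption: position and orientation are left out, so each shape stores only sides and an angle. The field names and the `floats_in` helper are hypothetical.

```rust
use std::mem::size_of;

type Length = f32;
type Angle = f32; // radians, by assumption

// The genus: two sides and the angle between them.
struct Parallelogram { primary_side: Length, secondary_side: Length, primary_angle: Angle }

// All angles are equal, so no angle needs storing.
struct Rectangle { primary_side: Length, secondary_side: Length }

// All sides are equal, so one side suffices.
struct Rhombus { side: Length, primary_angle: Angle }

// Both constraints apply at once.
struct Square { side: Length }

// Number of f32 values each representation occupies.
fn floats_in<T>() -> usize {
    size_of::<T>() / size_of::<f32>()
}
```

In this simplified model each differentia removes one stored value: three for a parallelogram, two for a rectangle or rhombus, one for a square.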
- The `TryFrom` trait implementations permit us to perform the opposite transformation, if possible.
We also need several implementations of the `From` trait, as mentioned earlier:
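The conversions might be sketched as follows, again assuming the simplified, position-free model. The error type `NotASubcategory` and the `TOLERANCE` constant are hypothetical, and only the chain from square to rectangle to parallelogram is shown; the rhombus conversions would follow the same pattern.

```rust
use std::convert::TryFrom;
use std::f32::consts::FRAC_PI_2;

type Length = f32;
type Angle = f32; // radians, by assumption

#[derive(Debug, Clone, Copy, PartialEq)]
struct Parallelogram { primary_side: Length, secondary_side: Length, primary_angle: Angle }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rectangle { primary_side: Length, secondary_side: Length }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Square { side: Length }

// Widening conversions always succeed.
impl From<Square> for Rectangle {
    fn from(s: Square) -> Self {
        Rectangle { primary_side: s.side, secondary_side: s.side }
    }
}

impl From<Rectangle> for Parallelogram {
    fn from(r: Rectangle) -> Self {
        Parallelogram {
            primary_side: r.primary_side,
            secondary_side: r.secondary_side,
            primary_angle: FRAC_PI_2,
        }
    }
}

// Hypothetical error: the value does not satisfy the differentia.
#[derive(Debug, PartialEq)]
struct NotASubcategory;

// Hypothetical tolerance for floating-point comparisons.
const TOLERANCE: f32 = 1e-6;

// Narrowing conversions can fail, hence `TryFrom`.
impl TryFrom<Parallelogram> for Rectangle {
    type Error = NotASubcategory;

    fn try_from(p: Parallelogram) -> Result<Self, Self::Error> {
        if (p.primary_angle - FRAC_PI_2).abs() <= TOLERANCE {
            Ok(Rectangle { primary_side: p.primary_side, secondary_side: p.secondary_side })
        } else {
            Err(NotASubcategory)
        }
    }
}

impl TryFrom<Rectangle> for Square {
    type Error = NotASubcategory;

    fn try_from(r: Rectangle) -> Result<Self, Self::Error> {
        if (r.primary_side - r.secondary_side).abs() <= TOLERANCE {
            Ok(Square { side: r.primary_side })
        } else {
            Err(NotASubcategory)
        }
    }
}
```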
- N.B.: just from the function signatures, we can see that each member function denotes the invariants it does and does not maintain. A rhombus can change its primary angle without changing its data-type; not so a rectangle. The reverse holds for changing the primary side.
Time to implement the common behaviour for each type:
```rust
// Parallelograms do not change.
// Rectangles only change when their primary angle does.
// Rhombuses only change when their primary side does.
// Squares always change, becoming either rectangles or rhombuses.
```
- The rest will need maths, which we have omitted to maintain the focus on the structural choices.
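A sketch of how the four implementations could fit together, assuming a simplified model that stores only sides and an angle (which conveniently keeps the maths trivial). The associated types `AfterAngle` and `AfterSide` and the method names are hypothetical; the four rules in the comments are the ones stated above.

```rust
use std::f32::consts::FRAC_PI_2;

type Length = f32;
type Angle = f32; // radians, by assumption

#[derive(Debug, Clone, Copy, PartialEq)]
struct Parallelogram { primary_side: Length, secondary_side: Length, primary_angle: Angle }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rectangle { primary_side: Length, secondary_side: Length }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rhombus { side: Length, primary_angle: Angle }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Square { side: Length }

// Widening conversions, needed for the `Into<Parallelogram>` bound below.
impl From<Rectangle> for Parallelogram {
    fn from(r: Rectangle) -> Self {
        Parallelogram { primary_side: r.primary_side, secondary_side: r.secondary_side, primary_angle: FRAC_PI_2 }
    }
}
impl From<Rhombus> for Parallelogram {
    fn from(r: Rhombus) -> Self {
        Parallelogram { primary_side: r.side, secondary_side: r.side, primary_angle: r.primary_angle }
    }
}
impl From<Square> for Parallelogram {
    fn from(s: Square) -> Self {
        Parallelogram { primary_side: s.side, secondary_side: s.side, primary_angle: FRAC_PI_2 }
    }
}

trait IsPgram: Into<Parallelogram> {
    type AfterAngle: IsPgram;
    type AfterSide: IsPgram;
    fn with_primary_angle(self, angle: Angle) -> Self::AfterAngle;
    fn with_primary_side(self, side: Length) -> Self::AfterSide;
}

// Parallelograms do not change.
impl IsPgram for Parallelogram {
    type AfterAngle = Parallelogram;
    type AfterSide = Parallelogram;
    fn with_primary_angle(self, angle: Angle) -> Parallelogram {
        Parallelogram { primary_angle: angle, ..self }
    }
    fn with_primary_side(self, side: Length) -> Parallelogram {
        Parallelogram { primary_side: side, ..self }
    }
}

// Rectangles only change when their primary angle does.
impl IsPgram for Rectangle {
    type AfterAngle = Parallelogram;
    type AfterSide = Rectangle;
    fn with_primary_angle(self, angle: Angle) -> Parallelogram {
        Parallelogram { primary_angle: angle, ..Parallelogram::from(self) }
    }
    fn with_primary_side(self, side: Length) -> Rectangle {
        Rectangle { primary_side: side, ..self }
    }
}

// Rhombuses only change when their primary side does.
impl IsPgram for Rhombus {
    type AfterAngle = Rhombus;
    type AfterSide = Parallelogram;
    fn with_primary_angle(self, angle: Angle) -> Rhombus {
        Rhombus { primary_angle: angle, ..self }
    }
    fn with_primary_side(self, side: Length) -> Parallelogram {
        Parallelogram { primary_side: side, ..Parallelogram::from(self) }
    }
}

// Squares always change, becoming either rectangles or rhombuses.
impl IsPgram for Square {
    type AfterAngle = Rhombus;
    type AfterSide = Rectangle;
    fn with_primary_angle(self, angle: Angle) -> Rhombus {
        Rhombus { side: self.side, primary_angle: angle }
    }
    fn with_primary_side(self, side: Length) -> Rectangle {
        Rectangle { primary_side: side, secondary_side: self.side }
    }
}
```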
Some member functions can be implemented by changing the data-type, then deferring to an already-existing implementation:
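A small sketch of the deferring pattern, assuming the simplified sides-and-angle model and hypothetical method names. The rectangle does no geometry of its own: it widens itself into a parallelogram and reuses that implementation.

```rust
use std::f32::consts::FRAC_PI_2;

type Length = f32;
type Angle = f32; // radians, by assumption

#[derive(Debug, Clone, Copy, PartialEq)]
struct Parallelogram { primary_side: Length, secondary_side: Length, primary_angle: Angle }
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rectangle { primary_side: Length, secondary_side: Length }

impl From<Rectangle> for Parallelogram {
    fn from(r: Rectangle) -> Self {
        Parallelogram {
            primary_side: r.primary_side,
            secondary_side: r.secondary_side,
            primary_angle: FRAC_PI_2,
        }
    }
}

impl Parallelogram {
    // The already-existing implementation.
    fn with_primary_angle(self, angle: Angle) -> Parallelogram {
        Parallelogram { primary_angle: angle, ..self }
    }
}

impl Rectangle {
    // Changing the angle breaks the rectangle invariant, so change the
    // data-type first, then defer to the parallelogram's implementation.
    fn with_primary_angle(self, angle: Angle) -> Parallelogram {
        Parallelogram::from(self).with_primary_angle(angle)
    }
}
```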
- As previously, the implementations defer to prior ones.
Finally, time to implement the `in_place` methods, and more generally all behaviour that is unique for each data-type:
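The `in_place` methods might look like the sketch below. The full method names are hypothetical, and the simplified sides-and-angle model is assumed. The point is what is absent: each type only offers the mutations that keep it the same type.

```rust
type Length = f32;
type Angle = f32; // radians, by assumption

#[derive(Debug, PartialEq)]
struct Rectangle { primary_side: Length, secondary_side: Length }
#[derive(Debug, PartialEq)]
struct Rhombus { side: Length, primary_angle: Angle }

impl Rectangle {
    // Changing a side keeps every angle right, so mutation is safe.
    fn set_primary_side_in_place(&mut self, side: Length) {
        self.primary_side = side;
    }
    // Deliberately no `set_primary_angle_in_place`: the result would
    // no longer be a rectangle.
}

impl Rhombus {
    // Changing the angle keeps every side equal, so mutation is safe.
    fn set_primary_angle_in_place(&mut self, angle: Angle) {
        self.primary_angle = angle;
    }
    // Deliberately no `set_primary_side_in_place`, for the mirror reason.
}
```

Under this sketch a square would get neither method, since either change would turn it into a different type.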
Final code
```rust
// Irrelevant to the current discussion.
type Length = f32;
type Angle = f32;
type Point = ;
```
Cool Bear's Cousin's Hot Tip:
Although the degrees of freedom are exactly the ones we want, not all possible states here are valid. Sides can have upper or lower limits, lengths must be positive, floats must be finite. Those will have to be maintained using sanity checks in the constructor.
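A sanity-checking constructor could be sketched like this. The `Square::new` signature and the choice of `Option` as the return type are assumptions; the checks shown cover only the "positive" and "finite" conditions mentioned above.

```rust
type Length = f32;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Square {
    side: Length,
}

impl Square {
    // Returns `None` unless the side is finite and strictly positive.
    fn new(side: Length) -> Option<Square> {
        if side.is_finite() && side > 0.0 {
            Some(Square { side })
        } else {
            None
        }
    }

    // Read-only access, so the invariant cannot be broken afterwards.
    fn side(&self) -> Length {
        self.side
    }
}
```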
Evaluation
Having detailed how to structure the code, let us now examine its merits and demerits.
Merits:
- Very accurate modelling of the problem
- State is never wasted
Demerits:
- Highly unintuitive to write
- The differentiæ are not explicit in the code
- The concept of “parallelogram” is modelled twice in the code, with subtle differences
- Fairly verbose and repetitive (just the `From` and `TryFrom` implementations take a fair bit of code)
Ho hum. Two merits, four demerits. Not horrible, but definitely not great.
Can tradOOP do better?
Subcategories in tradOOP
Subcategories in tradOOP are so tightly connected with inheritance as to be basically synonymous. Briefly: when an X is declared to be a Y, it inherits all its state and behaviour. After that, it can expand either or both of those things as it sees fit.
Let's translate the above example to old-style C++, as a representative example. I am purposefully avoiding modern C++ idioms, because as mentioned above I am aiming to write about tradOOP specifically.
Parallelograms
Just look at this:
[Code listing: C++ declaration of the `Parallelogram` class]
Much tidier, isn't it? Unlike Rust, we have no need to separate Parallelogram into two; the same class models both the state and the relationships. Convenient!
On to rectangles:
[Code listing: C++ declaration of `Rectangle`, inheriting from `Parallelogram`]
…uh-oh.
Just like that, we encounter the first insurmountable problem.
State cannot be constrained
Said problem is as follows: If we denote Rectangle to be a subcategory of Parallelogram via inheritance, we automatically bestow upon it all the state that the latter already has. This, however, goes directly against the entire concept of genus and differentia, which we mentioned earlier.
Think of it this way: the entire job of the differentia is to take a genus and constrain its possible members. Thus, if the possible members have been constrained, their representation ought to need strictly less state than the representation of the genus. Inheritance disagrees, and decides that their representation will need weakly more state than the representation of the genus. Which is, even in the absolute best case, a complete waste of resources.
If we want to constrain the possible states, we have to go the other way, and denote Parallelogram to be a subcategory of Rectangle. This solves some problems but creates 10× as many, because now we can use any random Parallelogram wherever a Rectangle is expected.
But fine, whatever. Let's say we don't care about wasting state. Let's keep going and see how to implement the methods.
Method implementation:
[Code listing: C++ method implementations]
Just four methods after that, we encounter the second insurmountable problem, which also proves fatal.
Behaviour cannot be constrained either
In the Rust example earlier, we had both in_place versions of the methods (which merely modified an already-existing variable) and ordinary methods, which created a new copy. The important thing to note is this: The in_place methods only existed for specific data-types, because they're not common to all of them!
The C++ example above has no way to declare this. As soon as we create a subcategory of a Parallelogram, it has to have at least the same behaviour as the Parallelogram, including the in_place methods. With the side one there is no problem, but the angle one has to make a Rectangle change its angle, in-place, while remaining a Rectangle. Which is the opposite of what we want.
There is a third problem, more minor but still noteworthy.
Composition also bestows state
A common piece of advice for tradOOP languages is to favour “Composition over inheritance”. Briefly put, this says that when we want to ensure that X has at least as much state as Y, we ought to do that by just including a Y as a member of each X, not by writing class X: Y.
This is, in my humble opinion, a disappointing duplication of concerns. Two data-types with the same behaviour have no reason to have the same state; indeed, as we showed earlier, the most natural way to describe things is the exact opposite. We are told to favour one over the other, when they should never have stepped on each other's toes in the first place.
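The effect on state can be observed even in Rust, where composition is the only option. In the sketch below, all type and field names are hypothetical and the simplified sides-and-angle model is assumed: a rectangle built by embedding a parallelogram carries the angle field along, whereas one with independent state does not.

```rust
use std::mem::size_of;

struct Parallelogram { primary_side: f32, secondary_side: f32, primary_angle: f32 }

// Composition: the rectangle embeds a whole parallelogram, including
// an angle field that can only ever hold a right angle.
struct ComposedRectangle { base: Parallelogram }

// Independent state: only what a rectangle actually needs.
struct LeanRectangle { primary_side: f32, secondary_side: f32 }

// How many bytes a type occupies beyond the lean rectangle.
fn bytes_over_lean<T>() -> usize {
    size_of::<T>() - size_of::<LeanRectangle>()
}
```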
Summary and conclusion
The most natural way to think of a subcategory is as something that has the following three properties:
- It can be used (“substituted”) wherever its super-category can.
- It needs strictly less state to be described, compared to its super-category.
- It has some behaviour in common with its super-category, though not necessarily all of it.
Traditional OOP, by using inheritance to achieve the first part, automatically defenestrates the other two. It is exactly this weakness of tradOOP which IMHO makes it lose to Rust on its own turf. And to think: Rust does not really do a great job at modelling those properties either! But, unlike tradOOP, at the very least it can model them all at the same time.
Postscript
Credit where it's due: I think method-calling syntax is great, and am grateful to tradOOP for introducing it to the world.
I've seen someone denigrate OOP in general, saying among other things that method-calling syntax can't be all that significant or useful. “Have you seen headlines writing ‘Method-calling syntax introduced, giving developers world-wide huge boosts in productivity’?”
This seems to me to be an example of motivated reasoning: beginning from the axiom that OOP is bad, we look for excuses to support the idea. I disagree, and have a very strong example in mind: point_1.route_to(point_2) is visibly different from point_2.route_to(point_1), which in turn is visibly the same as point_1.route_from(point_2). Method-calling syntax lets us distinguish between the subject and object of a phrase, making some things much clearer. Without this syntactic sugar we'd need to write route(point_1, point_2) which makes it extremely easy to misuse, because we don't have immediate hints as to which point is the beginning and which is the destination. There's also the possibility of route_from_to(point_1, point_2) but it doesn't really solve the problem IMHO.
And that's before we get into the fact that method-calling can be chained, leading to an operation being read very naturally left-to-right: point_1.route_to(point_2).via(point_3).using(mode_of_transport). Without method syntax this would be written as using(via(route(point_1, point_2), point_3), mode_of_transport). This alternative, instead of being read left-to-right, needs to be read in a spiral manner from the middle towards either end. I think the readability does not even compare.
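For the curious, the comparison can be made concrete. The sketch below assumes hypothetical `Point`, `Mode`, and `Route` types; only the call shapes come from the text above. Both styles are implemented so that they can be checked against each other.

```rust
// Hypothetical types; only the call shapes matter.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point { x: f32, y: f32 }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode { Walking, Cycling }

#[derive(Debug, Clone, PartialEq)]
struct Route { stops: Vec<Point>, mode: Mode }

impl Point {
    // The receiver is the origin; the argument is the destination.
    fn route_to(self, destination: Point) -> Route {
        Route { stops: vec![self, destination], mode: Mode::Walking }
    }

    // The receiver is the destination; the argument is the origin.
    fn route_from(self, origin: Point) -> Route {
        origin.route_to(self)
    }
}

impl Route {
    // Inserts a waypoint just before the destination.
    fn via(mut self, waypoint: Point) -> Route {
        let last = self.stops.len() - 1;
        self.stops.insert(last, waypoint);
        self
    }

    fn using(mut self, mode: Mode) -> Route {
        self.mode = mode;
        self
    }
}

// Free-function equivalents, for comparison.
fn route(from: Point, to: Point) -> Route { from.route_to(to) }
fn via(r: Route, waypoint: Point) -> Route { r.via(waypoint) }
fn using(r: Route, mode: Mode) -> Route { r.using(mode) }
```

With these in place, `a.route_to(b).via(c).using(Mode::Cycling)` and `using(via(route(a, b), c), Mode::Cycling)` build the same route; only the reading order differs.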