The only features originally allowed into C++ were those which had efficient implementations. This sounds great, but the results were sometimes dubious. Consider the following program which uses multiple inheritance.
#include <cstdio>
#include <cstdint>
using namespace std;

class Left {
    int32_t a, b;
};

class Right {
    int32_t c, d;
};

class Bottom : public Left, public Right {
};

int main() {
    Bottom *bottom = new Bottom();
    Left *left = bottom;    // implicit conversion to the first base class
    Right *right = bottom;  // implicit conversion to the second base class

    // %p expects a void*, so cast explicitly.
    printf("left -> %p\n", static_cast<void *>(left));
    printf("bottom -> %p\n", static_cast<void *>(bottom));
    printf("right -> %p\n", static_cast<void *>(right));

    if (left == bottom && bottom == right) {
        printf("left == bottom == right\n");
    } else {
        printf("!(left == bottom == right)\n");
    }

    delete bottom;
    return 0;
}
The first three lines of output look like this (the exact addresses vary from run to run):
left -> 0x24dd010
bottom -> 0x24dd010
right -> 0x24dd018
Right off the bat, we see that something happened to right. The statement right = bottom did something special: instead of setting right to the address stored in bottom, it set it to that address plus 8 bytes. To understand why, we need to look at the memory representation of *bottom.
0x24dd00c -> | ...       |
0x24dd010 -> | int32_t a | <- bottom, left
0x24dd014 -> | int32_t b |
0x24dd018 -> | int32_t c | <- right
0x24dd01c -> | int32_t d |
0x24dd020 -> | ...       |
A Bottom object consists of a Left object, followed by a Right object. In turn, Left and Right objects consist of their own fields. So, a Bottom consists of four int32_t fields.
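The layout can be checked from inside the program. The following is a minimal sketch, not part of the original program: it redeclares the three classes as structs, and the helper names right_offset_in_bottom and check_layout are made up for illustration. It assumes a typical ABI that places base subobjects in declaration order without padding; the C++ standard does not promise this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same shape as the classes in the program, declared as structs for brevity.
struct Left  { std::int32_t a, b; };
struct Right { std::int32_t c, d; };
struct Bottom : Left, Right {};

// Byte distance between the start of a Bottom and its Right subobject.
std::ptrdiff_t right_offset_in_bottom() {
    Bottom obj{};
    Bottom *bottom = &obj;
    Right *right = bottom;  // implicit derived-to-base conversion
    return reinterpret_cast<const char *>(right) -
           reinterpret_cast<const char *>(bottom);
}

// Sanity checks matching the diagram. ASSUMPTION: common ABIs only;
// the standard leaves the order of base subobjects unspecified.
void check_layout() {
    assert(sizeof(Bottom) == 4 * sizeof(std::int32_t));
    assert(right_offset_in_bottom() ==
           static_cast<std::ptrdiff_t>(sizeof(Left)));
}
```

On such a platform the offset is 8 bytes, which is the difference between the two addresses printed for bottom and right.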
One of the principles behind C++ was that access to object fields should be as fast as possible. In our case, it’s possible to avoid any indirection and make it as fast as normal variable access: since the compiler knows the memory layout, it replaces accesses to bottom->a by *bottom, accesses to bottom->b by *(bottom + 4), and so on, where the offsets are counted in bytes rather than in units of sizeof(Bottom). This works fine for bottom, which actually points to a Bottom object, and for left, because the first part of a Bottom object is a Left object.
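To make the "fixed byte offset" idea concrete, here is a rough sketch of what such an access boils down to. The helper load_i32 and the function check_field_access are hypothetical names introduced for this example, and the offsets used for Bottom assume that the Left part sits at offset 0, as on common ABIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Left  { std::int32_t a, b; };
struct Right { std::int32_t c, d; };
struct Bottom : Left, Right {};

// Read the 32-bit field stored `offset` bytes past `base`.
std::int32_t load_i32(const void *base, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, static_cast<const char *>(base) + offset, sizeof value);
    return value;
}

void check_field_access() {
    Bottom obj{};
    obj.a = 1; obj.b = 2; obj.c = 3; obj.d = 4;
    const Bottom *bottom = &obj;
    const Right *right = bottom;  // already adjusted by the compiler

    // ASSUMPTION: Left is laid out first, so a and b sit at offsets 0 and 4.
    assert(load_i32(bottom, 0) == obj.a);
    assert(load_i32(bottom, 4) == obj.b);
    // Through a Right*, c and d sit at offsets 0 and 4 of the Right part.
    assert(load_i32(right, 0) == obj.c);
    assert(load_i32(right, 4) == obj.d);
}
```

The same offsets 0 and 4 work through a Right* only because that pointer already points at the Right part of the object.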
What about right? We want right->d to be replaceable by *(right + 4), but it’s the second part of a Bottom object that is a Right object. For this to work, the statement right = bottom cannot simply assign bottom to right; instead, it casts Bottom* to Right* by assigning bottom + <offset of Right in Bottom> to right.
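The adjustment also works in the other direction, and it can be bypassed. The sketch below is an illustration only, with made-up function names (to_right, to_bottom, unadjusted, check_casts): static_cast from Right* back to Bottom* subtracts the offset again, while reinterpret_cast leaves the address as it is.

```cpp
#include <cassert>
#include <cstdint>

struct Left  { std::int32_t a, b; };
struct Right { std::int32_t c, d; };
struct Bottom : Left, Right {};

// Upcast: the compiler adds the offset of Right within Bottom.
Right *to_right(Bottom *bottom) { return bottom; }

// Downcast: static_cast subtracts the same offset again.
Bottom *to_bottom(Right *right) { return static_cast<Bottom *>(right); }

// reinterpret_cast keeps the address unchanged, so the result does NOT
// point at the Right subobject. For contrast only; never dereference it.
Right *unadjusted(Bottom *bottom) { return reinterpret_cast<Right *>(bottom); }

void check_casts() {
    Bottom obj{};
    Bottom *bottom = &obj;

    // The round trip through Right* gives back the original pointer.
    assert(to_bottom(to_right(bottom)) == bottom);
    // The adjusted pointer holds a different raw address...
    assert(static_cast<void *>(to_right(bottom)) != static_cast<void *>(bottom));
    // ...while the reinterpret_cast result holds the same one.
    assert(static_cast<void *>(unadjusted(bottom)) == static_cast<void *>(bottom));
}
```

This is one reason why a C-style or reinterpret_cast between related class pointers is risky under multiple inheritance: it skips the offset the compiler would otherwise apply.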
Now that we understand what’s happening, we can ask the next question: what is the last line of output? Is it left == bottom == right? Or !(left == bottom == right)? It’s the former: before comparing bottom == right, the compiler implicitly converts bottom to a Right*, applying the same offset as in the assignment. The pointers keep their static types, and the offsetting stays hidden from the programmer.
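The difference between the typed comparison and the raw addresses can be shown side by side. Again, this is only a sketch with invented helper names (same_object, same_address, check_comparisons), not code from the original program.

```cpp
#include <cassert>
#include <cstdint>

struct Left  { std::int32_t a, b; };
struct Right { std::int32_t c, d; };
struct Bottom : Left, Right {};

// Typed comparison: bottom is implicitly converted to Right* first,
// so the offset is applied before the addresses are compared.
bool same_object(Bottom *bottom, Right *right) { return bottom == right; }

// Untyped comparison: raw addresses, no adjustment.
bool same_address(Bottom *bottom, Right *right) {
    return static_cast<void *>(bottom) == static_cast<void *>(right);
}

void check_comparisons() {
    Bottom obj{};
    Bottom *bottom = &obj;
    Right *right = bottom;

    assert(same_object(bottom, right));    // equal as typed pointers
    assert(!same_address(bottom, right));  // different raw addresses

    // A null pointer stays null: the offset is not added to it.
    Bottom *no_bottom = nullptr;
    Right *no_right = no_bottom;
    assert(no_right == nullptr);
}
```

Note that comparing left == right directly would not compile, because Left* and Right* are unrelated types and neither converts to the other; the program gets around this by comparing each of them with bottom.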
I generally like C++, but this gives me mixed feelings. A part of me thinks that this is awesome, and another part screams in horror.
This article is a very good description of C++’s multiple and virtual inheritance with a focus on memory layouts.