Wade Not in Unknown Waters. Part Four.
Author: Andrey Karpov
Date: 15.07.2013
This time we will discuss virtual inheritance in C++ and find out why one should be very careful using it.
See other articles of this series: N1, N2, N3.
Initialization of Virtual Base Classes
First, let's see how objects are laid out in memory without virtual inheritance. Have a look at this
code fragment:
class Base { ... };
class X : public Base { ... };
class Y : public Base { ... };
class XY : public X, public Y { ... };
It's pretty clear: members of the non-virtual base class 'Base' are laid out like ordinary data members
of a derived class. As a result, the 'XY' object contains two independent 'Base' subobjects. Here is a
scheme to illustrate that:
Figure 1. Multiple non-virtual inheritance.
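The duplication can be observed directly. The following is a minimal sketch, assuming a 'Base' with a single hypothetical data member 'value' in place of the elided class bodies; the helper function name is also made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for the elided class bodies.
struct Base { int value = 0; };
struct X : Base {};
struct Y : Base {};
struct XY : X, Y {};

// Returns true when the X part and the Y part of one XY object
// each carry their own, separate Base subobject.
bool has_two_base_subobjects() {
    XY xy;
    Base *viaX = static_cast<X *>(&xy);  // the Base inside X
    Base *viaY = static_cast<Y *>(&xy);  // the Base inside Y
    viaX->value = 1;                     // changes only the X copy
    viaY->value = 2;                     // changes only the Y copy
    return viaX != viaY && viaX->value == 1 && viaY->value == 2;
}
```

Note that a plain 'xy.value' would not compile here, because the name is ambiguous between the two 'Base' subobjects.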
When we deal with virtual inheritance, an object of a virtual base class is included in the object of a
derived class only once. Figure 2 shows the structure of the 'XY' object in the code fragment below.
class Base { ... };
class X : public virtual Base { ... };
class Y : public virtual Base { ... };
class XY : public X, public Y { ... };
Figure 2. Multiple virtual inheritance.
Memory for the shared subobject 'Base' is most likely to be allocated at the end of the 'XY' object.
The exact layout depends on the compiler. For example, the classes 'X' and 'Y' may store pointers to
the shared object 'Base'. But as far as I understand, this practice is out of use nowadays. Instead, a
reference to the shared subobject is usually implemented as an offset or as information stored in the
virtual function table.
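Whatever layout the compiler chooses, the sharing itself is easy to check. Here is a minimal sketch, again assuming a hypothetical data member 'value' in 'Base' and a made-up helper function:

```cpp
#include <cassert>

// Hypothetical stand-ins for the elided class bodies.
struct Base { int value = 0; };
struct X : virtual Base {};
struct Y : virtual Base {};
struct XY : X, Y {};

// Returns true when the X part and the Y part of one XY object
// refer to the same, single Base subobject.
bool shares_one_base_subobject() {
    XY xy;
    Base *viaX = static_cast<X *>(&xy);  // Base reached through X
    Base *viaY = static_cast<Y *>(&xy);  // Base reached through Y
    viaX->value = 7;                     // visible through every path
    return viaX == viaY && viaY->value == 7 && xy.value == 7;
}
```

Unlike in the non-virtual case, 'xy.value' is unambiguous here, since there is only one 'Base' to refer to.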
The "most derived" class 'XY' alone knows where exactly a subobject of the virtual base class 'Base' is to
be allocated. That's why it is the most derived class which is responsible for initializing all the subobjects
of virtual base classes.
'XY' constructors initialize the 'Base' subobject and the pointers to it in 'X' and 'Y'. After that, the
remaining members of the classes 'X', 'Y' and 'XY' are initialized.
Once the 'XY' constructor has initialized the 'Base' subobject, the 'X' and 'Y' constructors are not allowed
to re-initialize it. How this is enforced depends on the compiler. For example, it can pass a
special additional argument into the 'X' and 'Y' constructors to tell them not to initialize the 'Base' class.
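The resulting order of construction can be made visible with a small log. This is only an illustrative sketch: the logging function and the helper are hypothetical names, not part of the original classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical global log that records which constructor bodies ran, in order.
std::string &construction_log() {
    static std::string log;
    return log;
}

struct Base { Base() { construction_log() += "Base "; } };
struct X : virtual Base { X() { construction_log() += "X "; } };
struct Y : virtual Base { Y() { construction_log() += "Y "; } };
struct XY : X, Y { XY() { construction_log() += "XY"; } };

// Creates one XY object and returns the recorded construction order.
std::string order_for_xy() {
    construction_log().clear();
    XY xy;
    return construction_log();  // "Base X Y XY": Base is constructed first, and only once
}
```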
Now for the most interesting part, which causes much confusion and a lot of mistakes. Have a look at the
following constructors:
X::X(int A) : Base(A) {}
Y::Y(int A) : Base(A) {}
XY::XY() : X(3), Y(6) {}
Which number will the base class's constructor receive as an argument: 3 or 6? Neither!
The 'XY' constructor initializes the virtual subobject 'Base', but it does so implicitly: since 'Base' does
not appear in the initializer list of 'XY', the default constructor of 'Base' is called.
When the 'XY' constructor then calls the 'X' and 'Y' constructors, they do not re-initialize 'Base'. That's
why the 'Base(A)' calls written in 'X' and 'Y' are ignored and neither argument reaches 'Base'.
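The effect is easy to reproduce. The sketch below assumes that 'Base' stores its argument in a hypothetical member 'value' and has a default constructor that sets it to 0; 'XYExplicit' is a made-up variant showing that the most derived class has to name 'Base' itself if it wants to pass an argument.

```cpp
#include <cassert>

struct Base {
    int value;
    Base() : value(0) {}                // default constructor (assumed for this sketch)
    explicit Base(int a) : value(a) {}
};
struct X : virtual Base { explicit X(int a) : Base(a) {} };
struct Y : virtual Base { explicit Y(int a) : Base(a) {} };

struct XY : X, Y {
    XY() : X(3), Y(6) {}                // Base() is called implicitly; 3 and 6 never reach Base
};

// Hypothetical variant: the most derived class initializes Base explicitly.
struct XYExplicit : X, Y {
    XYExplicit() : Base(42), X(3), Y(6) {}
};

int value_seen_by_base()           { return XY().value; }          // 0
int value_with_explicit_init()     { return XYExplicit().value; }  // 42
int value_when_x_is_most_derived() { return X(3).value; }          // 3
```

The last function shows that 'Base(A)' in 'X' is not dead code: it is used when 'X' itself is the most derived class.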
Troubles with virtual base classes don't end here. Besides constructors, there are also assignment
operators. The standard leaves it unspecified whether an assignment operator generated by the
compiler assigns a subobject of a virtual base class once or several times. So, you just
don't know how many times the 'Base' object will be copied.
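One way to see what a particular compiler does is to count the calls. This is a minimal sketch with a hypothetical global counter and hypothetical data members; the count it reports is compiler-dependent, so no specific number should be relied upon.

```cpp
#include <cassert>

int base_assignments = 0;  // hypothetical counter of Base::operator= calls

struct Base {
    int value = 0;
    Base &operator=(const Base &src) {
        value = src.value;
        ++base_assignments;
        return *this;
    }
};
struct X : virtual Base { int x = 0; };
struct Y : virtual Base { int y = 0; };
struct XY : X, Y {};       // all assignment operators below Base are compiler-generated

// Assigns one XY to another and reports how often Base was assigned.
int count_base_assignments() {
    base_assignments = 0;
    XY a, b;
    a.value = 5;
    b = a;                 // Base may be assigned once or several times
    assert(b.value == 5);  // the result is correct either way
    return base_assignments;
}
```

For a 'Base' with cheap, idempotent assignment the extra copies are merely wasteful; they become a real problem when assignment has side effects or is expensive.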
If you implement your own assignment operator, make sure you have prevented multiple copying of the
'Base' object. The following code fragment is incorrect:
XY &XY::operator =(const XY &src)
{
  if (this != &src)
  {
    X::operator =(src);  // copies 'Base' the first time
    Y::operator =(src);  // copies 'Base' again
    ....
  }
  return *this;
}
This code leads to double copying of the 'Base' object. To avoid this, we should add special functions
into the 'X' and 'Y' classes that copy only their own members and leave the 'Base' class's members
alone. The contents of the 'Base' class are then copied just once, directly in 'XY::operator ='. This is
the fixed code:
XY &XY::operator =(const XY &src)
{
  if (this != &src)
  {
    Base::operator =(src);   // 'Base' is copied exactly once
    X::PartialAssign(src);   // copies only the members of 'X'
    Y::PartialAssign(src);   // copies only the members of 'Y'
    ....
  }
  return *this;
}
This code will work well, but it still doesn't look nice and clear. That's the reason why programmers are
advised to avoid multiple virtual inheritance.
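For completeness, here is how the whole 'PartialAssign' scheme might look when written out. It is only a sketch: the data members, the counter and the bodies of 'PartialAssign' are assumptions made for illustration.

```cpp
#include <cassert>

int base_assignments = 0;  // hypothetical counter of Base::operator= calls

struct Base {
    int b = 0;
    Base &operator=(const Base &src) {
        b = src.b;
        ++base_assignments;
        return *this;
    }
};
struct X : virtual Base {
    int x = 0;
    void PartialAssign(const X &src) { x = src.x; }  // X's own members only
};
struct Y : virtual Base {
    int y = 0;
    void PartialAssign(const Y &src) { y = src.y; }  // Y's own members only
};
struct XY : X, Y {
    int xy = 0;
    XY &operator=(const XY &src) {
        if (this != &src) {
            Base::operator=(src);    // the shared subobject, exactly once
            X::PartialAssign(src);
            Y::PartialAssign(src);
            xy = src.xy;
        }
        return *this;
    }
};

// Assigns one XY to another and reports how often Base was assigned.
int assign_and_count() {
    XY a, b;
    a.b = 1; a.x = 2; a.y = 3; a.xy = 4;
    base_assignments = 0;
    b = a;
    assert(b.b == 1 && b.x == 2 && b.y == 3 && b.xy == 4);
    return base_assignments;         // 1
}
```

The price is visible in the sketch: 'XY' has to know that 'Base' is a virtual base of both 'X' and 'Y', and every class in the hierarchy needs an extra function.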
Virtual Base Classes and Type Conversion
Because of the specifics of how virtual base classes are allocated in memory, you can't perform type
conversions like this one:
Base *b = Get();
XY *q = static_cast<XY *>(b); // Compilation error
XY *w = (XY *)(b); // Compilation error
A persistent programmer, though, will force the conversion by employing the operator 'reinterpret_cast':
XY *e = reinterpret_cast<XY *>(b);
However, the result will hardly be of any use. The address of the beginning of the 'Base' object will be
interpreted as the beginning of the 'XY' object, which is quite a different thing. See Figure 3 for details.
The only correct way to perform this type conversion is to use the operator dynamic_cast, which in turn
requires 'Base' to be polymorphic, i.e. to have at least one virtual function. But using dynamic_cast
too often makes the code smell.
Figure 3. Type conversion.
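A working conversion through dynamic_cast might look as follows. This is a sketch under the assumption that 'Base' has a virtual destructor, which makes it polymorphic; 'to_xy' and the data members are hypothetical.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;  // makes Base polymorphic, as dynamic_cast requires
    int b = 0;
};
struct X : virtual Base { int x = 0; };
struct Y : virtual Base { int y = 0; };
struct XY : X, Y { int xy = 0; };

// Hypothetical helper: recovers the XY object behind a Base pointer,
// or returns nullptr when the object is not an XY at all.
XY *to_xy(Base *b) { return dynamic_cast<XY *>(b); }

bool roundtrip_works() {
    XY xy;
    X onlyX;
    Base *fromXY = &xy;     // implicit upcast to the virtual base is always fine
    Base *fromX = &onlyX;
    return to_xy(fromXY) == &xy && to_xy(fromX) == nullptr;
}
```

The cast is resolved at run time using type information stored with the object, which is why it both works and costs more than a static_cast.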
Should We Abandon Virtual Inheritance?
I agree with many authors that one should avoid virtual inheritance whenever possible, as well as
ordinary multiple inheritance.
Virtual inheritance causes troubles with object initialization and copying. Since it is the "most derived"
class which is responsible for these operations, it has to be familiar with all the intimate details of the
structure of base classes. Due to this, a more complex dependency appears between the classes, which
complicates the project structure and forces you to make some additional revisions in all those classes
during refactoring. All this leads to new bugs and makes the code less readable.
Troubles with type conversions may also be a source of bugs. You can partly solve the issues by using
the dynamic_cast operator. But it is slow compared to other casts, and if you have to use it too often in
your code, it means that your project's architecture is probably very poor. A project structure can almost
always be implemented without multiple inheritance. After all, many other languages have no such
exotic features, and that doesn't prevent programmers writing in these languages from developing large
and complex projects.
We cannot insist on abandoning virtual inheritance entirely: it may be useful and convenient at times. But
always think twice before making a heap of complex classes. Growing a forest of small classes with
shallow hierarchies is better than handling a few huge trees. For example, multiple inheritance can in
most cases be replaced by object composition.
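As a rough illustration of that last point, here is a sketch with entirely hypothetical classes: instead of inheriting from both 'Engine' and 'Radio', 'Car' owns one object of each and forwards the calls it wants to expose.

```cpp
#include <cassert>
#include <string>

// Hypothetical building blocks.
class Engine {
public:
    int power() const { return 100; }
};
class Radio {
public:
    std::string station() const { return "FM"; }
};

// Composition instead of `class Car : public Engine, public Radio`.
class Car {
public:
    int power() const { return engine_.power(); }            // forwarded to the engine
    std::string station() const { return radio_.station(); } // forwarded to the radio
private:
    Engine engine_;  // Car has an Engine rather than being one
    Radio radio_;    // Car has a Radio rather than being one
};
```

With this design the compiler-generated constructors and assignment operators of 'Car' copy each part exactly once, and no cast between 'Car' and its parts is ever needed.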
Good Sides of Multiple Inheritance
OK, we now understand and agree with the criticism of multiple virtual inheritance and multiple
inheritance as such. But are there cases when it can be safe and convenient to use?
Yes, I can name at least one: mix-ins. If you don't know what they are, see the book "Enough Rope to
Shoot Yourself in the Foot" [3].
A mix-in class doesn't contain any data. All its functions are usually pure virtual. It has no constructor,
or if it has one, that constructor doesn't do anything. It means that no troubles will occur when creating
or copying objects of these classes.
If a base class is a mix-in class, assignment is harmless. Even if the subobject is copied many times, it
doesn't matter: there is no data to copy, so the compiler will most likely eliminate these copies
altogether.
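A minimal sketch of mix-ins in this sense is shown below. The interfaces 'Printable' and 'Comparable' and the class 'Number' are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical mix-ins: no data members, only pure virtual functions.
struct Printable {
    virtual std::string Print() const = 0;
    virtual ~Printable() = default;
};
struct Comparable {
    virtual bool Equals(int other) const = 0;
    virtual ~Comparable() = default;
};

// A class that mixes in both interfaces and holds all the data itself.
class Number : public Printable, public Comparable {
public:
    explicit Number(int v) : value_(v) {}
    std::string Print() const override { return std::to_string(value_); }
    bool Equals(int other) const override { return value_ == other; }
private:
    int value_;
};

bool mixins_copy_cleanly() {
    Number a(5), b(7);
    b = a;                    // only value_ is copied; the mix-ins hold no data
    const Printable &p = b;   // b can be used through either interface
    const Comparable &c = b;
    return p.Print() == "5" && c.Equals(5);
}
```

Since the mix-ins carry no state, the compiler-generated assignment operator of 'Number' is safe, and none of the initialization or copying problems described above can arise.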
References
1. Stephen C. Dewhurst. "C++ Gotchas: Avoiding Common Problems in Coding and Design". -
Addison-Wesley Professional. - 352 pages; illustrations. ISBN-13: 978-0321125187. (See gotchas
45 and 53).
2. Wikipedia. Object composition.
3. Allen I. Holub. "Enough Rope to Shoot Yourself in the Foot". (You can easily find it on the
Internet. Start reading at section 101 and further on).