Implementation details and performance traits of generics in .NET, Java and C++. Presentation for the Jerusalem .NET/C++ User Group by Sasha Goldshtein.
2. What Is This Talk?
• Understanding how generics are implemented in
C++, Java and .NET at the runtime and machine-code
level
• Understanding the performance implications and other
pros/cons of each mechanism
• We will not learn how to use generics
3. Why Do We Want Them?
• “Pure” object-oriented programming does not always
provide a clean and type-safe solution with good
performance
• In other words, what’s wrong here?
public class ArrayList {
object[] items;
public void Add(object item) { ... }
public object ElementAt(int index) { ... }
}
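One way to see the problem, sketched in Java rather than the slide's C#. The class name `ObjectList` and the helper `castFails` are hypothetical stand-ins for the `ArrayList` above. Every element is stored as `Object`, so the compiler accepts any mix of types, the caller must cast on the way out, and a wrong cast only fails at runtime. Primitive values also have to be boxed before they can be stored.

```java
import java.util.Arrays;

// Hypothetical Object-based list, mirroring the non-generic ArrayList on the slide.
final class ObjectList {
    private Object[] items = new Object[4];
    private int size;

    void add(Object item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = item;
    }

    Object elementAt(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return items[index];
    }

    // Returns true if reading element `index` as an Integer fails at runtime.
    static boolean castFails(ObjectList list, int index) {
        try {
            Integer ignored = (Integer) list.elementAt(index);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ObjectList list = new ObjectList();
        list.add(42);      // autoboxed to Integer: stored as an object, not a raw int
        list.add("oops");  // compiles fine: nothing ties the list to one element type
        System.out.println(castFails(list, 0)); // the Integer reads back correctly
        System.out.println(castFails(list, 1)); // the String only fails when cast
    }
}
```

Neither the type mismatch nor the boxing is visible in the method signatures, which is the gap generics are meant to close.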
4. The C++ Approach
“Templates, the smart macros from hell.”
• Use parameterized template as a sketch
• No constraints on the original template code
• Everything happens at compile-time
template <typename RanIt>
void sort(RanIt begin, RanIt end) {
… if (*begin < *(begin+1)) …
}
5. C++ Template Definition
template <typename T>
class vector {
T* data; int size; int cap;
public:
vector(int capacity) { ... }
void push_back(const T& datum) { ... }
T operator[](int index) const { ... }
};
6. C++ Template Instantiation
You say:
vector<int> v(2);
Compiler says:
class __vector__int__ {
int* data; int size; int cap;
public:
__vector__int__(int capacity) { ... }
};
7. C++ Template Instantiation
You say:
vector<int> v(2);
v.push_back(42);
Compiler says:
class __vector__int__ {
int* data; int size; int cap;
public:
__vector__int__(int capacity) { ... }
void push_back(const int& datum) { ... }
};
8. C++ Template Instantiation
You say:
vector<EmptyClass> v(2);
sort(v.begin(), v.end());
Compiler says:
error C2784: 'bool std::operator <(const std::vector<_Ty,_Alloc> &,const
std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const
std::vector<_Ty,_Alloc> &' from 'EmptyClass'
vector(1724) : see declaration of 'std::operator <'
templatesstuff.cpp(20) : see reference to function template instantiation
'void sort<std::_Vector_iterator<_Myvec>>(RanIt,RanIt)' being compiled
with
[
_Myvec=std::_Vector_val<std::_Simple_types<EmptyClass>>,
RanIt=std::_Vector_iterator<std::_Vector_val<
std::_Simple_types<EmptyClass>>>
]
9. The C++ Approach—Pros and Cons
Pros:
• No performance cost
• Very flexible
• Full compile-time type safety
Cons:
• Can’t share templates between translation units
• Can’t share templates between libraries (code bloat)
• Can’t reliably export templates from libraries
• No constraints = no readable compiler errors
10. The Java Approach
• Use parameterized template as a compiler aid
• Constraints used to prove things to the compiler
• Erase type information at runtime
public class LinkedList<E> {
private LinkedList<E> head;
private E value;
public void add(E element) { ... }
public E getAt(int index) { ... }
}
11. Java Generic Type Erasure
There is just one type (raw type) at runtime:
public class LinkedList {
private LinkedList head;
private Object value;
public void add(Object element) { ... }
public Object getAt(int index) { ... }
}
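Erasure can be observed directly from Java code. The sketch below uses the standard `java.util.ArrayList` instead of the slide's `LinkedList`, and the class and method names (`ErasureDemo`, `sameRuntimeClass`) are made up for illustration. Two differently parameterized lists report the very same runtime class.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // True when both lists share a single runtime class, whatever their type arguments.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        System.out.println(sameRuntimeClass(strings, ints)); // true
        System.out.println(strings.getClass().getName());    // java.util.ArrayList
    }
}
```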
12. Java Generic Type Constraints
Cannot use anything but java.lang.Object methods
without specifying a constraint (a bounded type parameter):
public class SortedList<E
extends Comparable<E>> {
...
public void add(E element) {
... if (element.compareTo(other) < 0) ...
}
}
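A possible completion of the outline above, as a minimal sketch. Only the bound `E extends Comparable<E>` comes from the slide; the backing `ArrayList`, the linear insertion, and the `getAt` and `size` methods are assumptions added to make it runnable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical completion of the SortedList outline.
final class SortedList<E extends Comparable<E>> {
    private final List<E> items = new ArrayList<>();

    void add(E element) {
        int i = 0;
        // compareTo is callable here only because of the bound on E.
        while (i < items.size() && items.get(i).compareTo(element) <= 0) {
            i++;
        }
        items.add(i, element); // insert before the first larger element
    }

    E getAt(int index) {
        return items.get(index);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        SortedList<String> names = new SortedList<>();
        names.add("pear");
        names.add("apple");
        names.add("mango");
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.getAt(i));
        }
        // SortedList<Object> bad;  // rejected at compile time:
        //                          // Object does not implement Comparable<Object>
    }
}
```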
13. The Java Approach—Pros and Cons
Pros:
• Backwards compatible with non-generic Java versions
• Constraint violation results in clear compiler error
• Can share generic types and objects between packages/applications
Cons:
• Can’t use generics with primitive types
• Can’t distinguish between generic class instantiations
• Can’t instantiate generic type parameters (“new E”)
• Can’t use type parameters in static methods or fields
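The "new E" limitation is commonly worked around by handing the generic class a factory. The sketch below shows one such workaround using the standard `java.util.function.Supplier`; the `Pool` class is a hypothetical example, not something from the talk.

```java
import java.util.function.Supplier;

// Hypothetical holder that needs to create E instances on demand.
final class Pool<E> {
    private final Supplier<E> factory;

    Pool(Supplier<E> factory) {
        this.factory = factory;
    }

    E create() {
        // "new E()" would not compile: E is erased, so there is no
        // runtime type whose constructor could be invoked.
        return factory.get();
    }

    public static void main(String[] args) {
        // The caller supplies the constructor, since it still knows the real type.
        Pool<StringBuilder> pool = new Pool<>(StringBuilder::new);
        StringBuilder sb = pool.create();
        sb.append("created via factory");
        System.out.println(sb);
    }
}
```

The cost is that every caller must pass the factory explicitly, which is the kind of plumbing a runtime-aware generics implementation can avoid.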
14. The .NET Approach
• Use parameterized template as a compiler aid and a
runtime code generation sketch for the JIT
• Constraints used to prove things to the compiler
public class List<T> {
T[] items; int size; int cap;
public void Add(T item) { ... }
public T this[int index] {
get { ... } set { ... }
}
}
16. .NET Generic Types at Runtime
• There is a separate type at runtime for each generic
instantiation, but not necessarily a separate copy of the
methods’ code
• Does this method’s machine code depend on T?
public void Add(T item) {
if (size < items.Length - 1) {
items[size] = item;
++size;
} else AllocateAndAddSlow(item);
}
21. Type-Specific Code
• What about new T[12] or typeof(T).FullName?
• When .NET generic methods need access to T, they get it
from the method table (this or hidden parameter)
• …Unless the type parameters are value types, in which
case the MT is hard-coded into the method:
C#:
Foo<T>() { … typeof(T) … } T=int
Machine code:
mov ecx,offset 798b6844 (MT: System.Int32)
call clr!JIT_GetRuntimeType (6ca40aa8)
22. Generics and Reflection
• Because generic types are first-class citizens, they are
accessible to Reflection at runtime
Type to = typeof(Dictionary<,>);
Type tc = to.MakeGenericType(
typeof(string), typeof(int));
to = typeof(List<double>).GetGenericTypeDefinition();
tc = to.MakeGenericType(typeof(int)); //List<int>
23. Generic Constraints
• .NET constraints restrict type parameters at compile time,
very similar to Java’s
• Only a limited set of constraints available:
• Interface constraint: where T : IComparable<T>
• Base constraint: where T : UserControl
• Category constraint: where T : class or where T : struct
• Constructor constraint: where T : new()
Note that constraints don’t break the machine code equivalence
for reference types. Why?
24. Case Study: IEquatable<T>
public static void CallEquals<T>(T inst) {
inst.Equals(inst);
}
public struct Point {
public int X, Y;
public override bool Equals(object o) {
if (o is Point) return Equals((Point)o);
return false;
}
public bool Equals(Point pt) { ... }
}
25. Case Study: IEquatable<T>
• CallEquals has no constraints, so the C# compiler
chooses the Object.Equals(Object) virtual method
• We can add an interface constraint with a strongly-typed
Equals method—now the compiler prefers it
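• Note: on value types the interface call costs neither a virtual dispatch nor boxing, because the JIT compiles a separate copy of the method for each value type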
public static void CallEquals<T>(T inst)
where T : IEquatable<T>
{
inst.Equals(inst);
}
26. Sorting “If Possible”, a la C++
public class List<T> {
T[] items; ...
public void Add(T item) { ... }
public void Sort(SortProvider<T> sorter = null) {
sorter = sorter ?? SortProvider<T>.GetDefault();
if (sorter == null)
throw new NotImplementedException();
sorter.Sort(items);
}
}
27. Sorting “If Possible”, a la C++
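public abstract class SortProvider<T> {
    public abstract void Sort(T[] items);
    public static SortProvider<T> GetDefault() {
        // "T is IComparable<T>" is not valid C#; test the Type objects instead
        if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
            return Create(typeof(DefaultSortProvider<>));
        if (typeof(IGreaterthanable<T>).IsAssignableFrom(typeof(T)))
            return Create(typeof(GreaterThanSortProvider<>));
        return null;
    }
    // T is unconstrained here, so "new DefaultSortProvider<T>()" would not
    // compile; close the generic type with Reflection instead
    static SortProvider<T> Create(Type openProvider) {
        return (SortProvider<T>)Activator.CreateInstance(
            openProvider.MakeGenericType(typeof(T)));
    }
}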
internal class DefaultSortProvider<T> : SortProvider<T>
where T : IComparable<T> {
//Use T.CompareTo for sorting
}
28. Getting Generic Math Right in .NET
• Pretty nasty:
• Consider Complex<T>: you can’t implement operators…
• Solution sketch:
• Define ICalculator<T> with methods instead of operators
• Implement ICalculator<T> for each T
• Choose between ICalculator<T>’s implementations at
runtime, and use them in your generic math code
• For more:
http://www.codeproject.com/Articles/8531/Using-generics-for-calculations
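The solution sketch above is not tied to C#. Here is a minimal rendering of the same pattern in Java, given only as an illustration: `ICalculator`, `DoubleCalculator`, `Complex`, and `GenericMathDemo` are hypothetical names, and the calculator is passed in explicitly instead of being chosen at runtime as the slide suggests.

```java
final class GenericMathDemo {
    public static void main(String[] args) {
        ICalculator<Double> calc = new DoubleCalculator();
        Complex<Double> a = new Complex<>(1.0, 2.0, calc);
        Complex<Double> b = new Complex<>(3.0, 4.0, calc);
        Complex<Double> product = a.times(b); // (1+2i)(3+4i) = -5+10i
        System.out.println(product.re + " + " + product.im + "i");
    }
}

// Methods stand in for the operators that cannot be used on a type parameter.
interface ICalculator<T> {
    T add(T a, T b);
    T subtract(T a, T b);
    T multiply(T a, T b);
}

// One implementation per numeric type; only double is shown here.
final class DoubleCalculator implements ICalculator<Double> {
    public Double add(Double a, Double b) { return a + b; }
    public Double subtract(Double a, Double b) { return a - b; }
    public Double multiply(Double a, Double b) { return a * b; }
}

// Generic math code written purely against ICalculator<T>.
final class Complex<T> {
    final T re;
    final T im;
    private final ICalculator<T> calc;

    Complex(T re, T im, ICalculator<T> calc) {
        this.re = re;
        this.im = im;
        this.calc = calc;
    }

    Complex<T> plus(Complex<T> o) {
        return new Complex<>(calc.add(re, o.re), calc.add(im, o.im), calc);
    }

    Complex<T> times(Complex<T> o) {
        // (a+bi)(c+di) = (ac - bd) + (ad + bc)i
        T real = calc.subtract(calc.multiply(re, o.re), calc.multiply(im, o.im));
        T imag = calc.add(calc.multiply(re, o.im), calc.multiply(im, o.re));
        return new Complex<>(real, imag, calc);
    }
}
```

Every arithmetic step goes through an interface call, so this pattern trades the convenience of operators for an indirection on each operation.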
29. The .NET Approach—Pros and Cons
Pros:
• Constraint violation results in clear compiler error
• Can share generic types and objects between packages/applications
• Can use generics efficiently with value types
• Can use Reflection to query over generic types
Cons:
• Constraints are not enough for everything (e.g., generic math)
• No meta-programming abilities (advantage?)