Generics in .NET, C++ and Java

GENERICS
Sasha Goldshtein
blog.sashag.net | @goldshtn

What Is This Talk?
• Understanding how generics are implemented in
C++, Java and .NET at the runtime and machine-code
level
• Understanding the performance implications and other
pros/cons of each mechanism

• We will not learn how to use generics

Why Do We Want Them?
• “Pure” object-oriented programming does not always
provide a clean and type-safe solution with good
performance
• In other words, what’s wrong here?

public class ArrayList {
    object[] items;
    public void Add(object item) { ... }
    public object ElementAt(int index) { ... }
}
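One way to see the problem, sketched here in Java rather than C#. ObjectList is a hypothetical analog of the class above, and UntypedListDemo is an illustrative helper that is not part of the talk. Every read needs a cast that can only fail at runtime, and every int is boxed on its way into the list.

```java
// Hypothetical Java analog of the object-based ArrayList above.
final class ObjectList {
    private Object[] items = new Object[4];
    private int size;

    void add(Object item) {
        if (size == items.length) {
            items = java.util.Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = item;
    }

    Object elementAt(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return items[index];
    }
}

final class UntypedListDemo {
    // Returns true if reading element `index` as a String fails at runtime.
    static boolean readAsStringFails(ObjectList list, int index) {
        try {
            String s = (String) list.elementAt(index); // cast needed on every read
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ObjectList list = new ObjectList();
        list.add("hello");
        list.add(42); // compiles: the int is boxed and nothing stops the mix
        System.out.println(readAsStringFails(list, 0)); // prints "false"
        System.out.println(readAsStringFails(list, 1)); // prints "true"
    }
}
```

The compiler accepts both add calls, so the type error surfaces only when the second element is read.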

The C++ Approach
“Templates, the smart macros from hell.”

• Use parameterized template as a sketch
• No constraints on the original template code
• Everything happens at compile-time

template <typename RanIt>
void sort(RanIt begin, RanIt end) {
    … if (*begin < *(begin+1)) …
}

C++ Template Definition
template <typename T>
class vector {
    T* data; int size; int cap;
public:
    vector(int capacity) { ... }
    void push_back(const T& datum) { ... }
    T operator[](int index) const { ... }
};

C++ Template Instantiation
You say:
vector<int> v(2);

Compiler says:
class __vector__int__ {
    int* data; int size; int cap;
public:
    // only the members actually used are instantiated
    __vector__int__(int capacity) { ... }
};

You say:
vector<int> v(2);
v.push_back(42);

Compiler says:
class __vector__int__ {
    int* data; int size; int cap;
public:
    __vector__int__(int capacity) { ... }
    void push_back(const int& datum) { ... }
};

You say:
vector<EmptyClass> v(2);
sort(v.begin(), v.end());

Compiler says:
error C2784: 'bool std::operator <(const std::vector<_Ty,_Alloc> &,const
std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const
std::vector<_Ty,_Alloc> &' from 'EmptyClass'
vector(1724) : see declaration of 'std::operator <'
templatesstuff.cpp(20) : see reference to function template instantiation
'void sort<std::_Vector_iterator<_Myvec>>(RanIt,RanIt)' being compiled
with
[
_Myvec=std::_Vector_val<std::_Simple_types<EmptyClass>>,
RanIt=std::_Vector_iterator<std::_Vector_val<
std::_Simple_types<EmptyClass>>>
]

The C++ Approach—Pros and Cons
Pros
• No performance cost
• Very flexible
• Full compile-time type safety

Cons
• Can’t share templates between translation units
• Can’t share templates between libraries (code bloat)
• Can’t reliably export templates from libraries
• No constraints = no readable compiler errors

The Java Approach
• Use parameterized template as a compiler aid
• Constraints used to prove things to the compiler
• Erase type information at runtime

public class LinkedList<E> {
    private LinkedList<E> head;
    private E value;
    public void add(E element) { ... }
    public E getAt(int index) { ... }
}

Java Generic Type Erasure
There is just one type (raw type) at runtime:

public class LinkedList {
    private LinkedList head;
    private Object value;
    public void add(Object element) { ... }
    public Object getAt(int index) { ... }
}
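A quick way to observe erasure, as a minimal sketch. ErasureDemo is an illustrative name, and the standard java.util.ArrayList stands in for the LinkedList<E> shown above.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Both instantiations share a single runtime class, so this returns true.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
        // The runtime name carries no type argument:
        System.out.println(new ArrayList<String>().getClass().getName()); // prints "java.util.ArrayList"
    }
}
```

This is also why Java code cannot distinguish a list of strings from a list of integers at runtime.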

Java Generic Type Constraints
Cannot use anything but java.lang.Object methods
without specifying a constraint (bounded type parameter):

public class SortedList<E extends Comparable<E>> {
    ...
    public void add(E element) {
        // compareTo returns an int, so compare the result against 0
        ... if (element.compareTo(other) < 0) ...
    }
}
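A runnable version of the same idea, as a sketch only. The backing ArrayList, the insertion loop, and the getAt and size helpers are assumptions added for illustration and are not part of the slide.

```java
import java.util.ArrayList;
import java.util.List;

// The bound on E is what allows add() to call compareTo on its elements.
final class SortedList<E extends Comparable<E>> {
    private final List<E> items = new ArrayList<>();

    void add(E element) {
        int i = 0;
        // Skip every stored element that sorts at or before the new one.
        while (i < items.size() && items.get(i).compareTo(element) <= 0) {
            i++;
        }
        items.add(i, element);
    }

    E getAt(int index) { return items.get(index); }

    int size() { return items.size(); }
}

final class SortedListDemo {
    public static void main(String[] args) {
        SortedList<Integer> list = new SortedList<>();
        list.add(3);
        list.add(1);
        list.add(2);
        System.out.println(list.getAt(0) + " " + list.getAt(1) + " " + list.getAt(2)); // prints "1 2 3"
        // SortedList<Object> bad; // rejected at compile time: Object is not Comparable<Object>
    }
}
```

Removing the bound makes the compareTo call fail to compile, because only java.lang.Object methods are known for an unbounded E.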

The Java Approach—Pros and Cons
Pros
• Backwards compatible with non-generic Java versions
• Constraint violation results in clear compiler error
• Can share generic types and objects between packages/applications

Cons
• Can’t use generics with primitive types
• Can’t distinguish between generic class instantiations
• Can’t instantiate generic type parameters (“new E”)
• Can’t use a class’s type parameters in static methods or fields

The .NET Approach
• Use parameterized template as a compiler aid and a
runtime code generation sketch for the JIT
• Constraints used to prove things to the compiler

public class List<T> {
    T[] items; int size; int cap;
    public void Add(T item) { ... }
    public T this[int index] {
        get { ... } set { ... }
    }
}

Digression: .NET Object Layout

.NET Generic Types at Runtime
• There is a separate type at runtime for each generic
instantiation, but not necessarily a separate copy of the
methods’ code
• Does this method’s machine code depend on T?

public void Add(T item) {
    if (size < items.Length - 1) {
        items[size] = item;
        ++size;
    } else AllocateAndAddSlow(item);
}

Concrete Example: Stack Push
BasicStack`1[[System.__Canon, mscorlib]].Push(System.__Canon)
00260360 57 push edi
00260361 56 push esi
00260362 8b7104 mov esi,dword ptr [ecx+4]
00260365 8b7908 mov edi,dword ptr [ecx+8]
00260368 8d4701 lea eax,[edi+1]
0026036b 894108 mov dword ptr [ecx+8],eax
0026036e 52 push edx
0026036f 8bce mov ecx,esi
00260371 8bd7 mov edx,edi
00260373 e8f4cb3870 call clr!JIT_Stelem_Ref (705ecf6c)
00260378 5e pop esi
00260379 5f pop edi
0026037a c3 ret

BasicStack`1[[System.Int32, mscorlib]].Push(Int32)
002603c0 57 push edi
002603c1 56 push esi
002603c2 8b7104 mov esi,dword ptr [ecx+4]
002603c5 8b7908 mov edi,dword ptr [ecx+8]
002603c8 8d4701 lea eax,[edi+1]
002603cb 894108 mov dword ptr [ecx+8],eax
002603ce 3b7e04 cmp edi,dword ptr [esi+4]
002603d1 7307 jae 002603da
002603d3 8954be08 mov dword ptr [esi+edi*4+8],edx
002603d7 5e pop esi
002603d8 5f pop edi
002603d9 c3 ret
002603da e877446170 call clr!JIT_RngChkFail (70874856)
002603df cc int 3

BasicStack`1[[System.Double, mscorlib]].Push(Double)
00260420 56 push esi
00260421 8b5104 mov edx,dword ptr [ecx+4]
00260424 8b7108 mov esi,dword ptr [ecx+8]
00260427 8d4601 lea eax,[esi+1]
0026042a 894108 mov dword ptr [ecx+8],eax
0026042d 3b7204 cmp esi,dword ptr [edx+4]
00260430 730c jae 0026043e
00260432 dd442408 fld qword ptr [esp+8]
00260436 dd5cf208 fstp qword ptr [edx+esi*8+8]
0026043a 5e pop esi
0026043b c20800 ret 8
0026043e e813446170 call clr!JIT_RngChkFail (70874856)
00260443 cc int 3

Type-Specific Code
• What about new T[12] or typeof(T).FullName?
• When .NET generic methods need access to T, they get it
from the method table (this or hidden parameter)

• …Unless the type parameters are value types, in which
case the MT is hard-coded into the method:
C#:
Foo<T>() { … typeof(T) … } T=int
Machine code:
mov ecx,offset 798b6844 (MT: System.Int32)
call clr!JIT_GetRuntimeType (6ca40aa8)

Generics and Reflection
• Because generic types are first-class citizens, they are
accessible to Reflection at runtime

Type to = typeof(Dictionary<,>);
Type tc = to.MakeGenericType(
typeof(string), typeof(int));

to = typeof(List<double>).GetGenericTypeDefinition();
tc = to.MakeGenericType(typeof(int)); //List<int>

Generic Constraints
• .NET constraints restrict type parameters at compile-time,
very similar to Java’s
• Only a limited set of constraints available:
• Interface constraint: where T : IComparable<T>
• Base constraint: where T : UserControl
• Category constraint: where T : class or where T : struct
• Constructor constraint: where T : new()

Note that constraints don’t break the machine code equivalence
for reference types. Why?

Case Study: IEquatable<T>
public static void CallEquals<T>(T inst) {
    inst.Equals(inst);
}

public struct Point {
    public int X, Y;
    public override bool Equals(object o) {
        if (o is Point) return Equals((Point)o);
        return false;
    }
    public bool Equals(Point pt) { ... }
}

Case Study: IEquatable<T>
• CallEquals has no constraints, so the C# compiler
chooses the Object.Equals(Object) virtual method
• We can add an interface constraint with a strongly-typed
Equals method—now the compiler prefers it
• Note: the interface call has no virtual cost on value types

public static void CallEquals<T>(T inst)
    where T : IEquatable<T>
{
    inst.Equals(inst);
}

Sorting “If Possible”, a la C++
public class List<T> {
    T[] items; ...
    public void Add(T item) { ... }
    public void Sort(SortProvider<T> sorter = null) {
        sorter = sorter ?? SortProvider<T>.GetDefault();
        if (sorter == null)
            throw new NotImplementedException();
        sorter.Sort(items);
    }
}

Sorting “If Possible”, a la C++
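public abstract class SortProvider<T> {
    public abstract void Sort(T[] items);
    public static SortProvider<T> GetDefault() {
        // "T is IComparable<T>" is not valid C#; test typeof(T) instead
        if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
            return Create(typeof(DefaultSortProvider<>));
        if (typeof(IGreaterthanable<T>).IsAssignableFrom(typeof(T)))
            return Create(typeof(GreaterThanSortProvider<>));
        return null;
    }
    // T is unconstrained here, so "new DefaultSortProvider<T>()" would not
    // compile; close the open generic type through Reflection instead
    static SortProvider<T> Create(Type openType) {
        return (SortProvider<T>)Activator.CreateInstance(
            openType.MakeGenericType(typeof(T)));
    }
}
internal class DefaultSortProvider<T> : SortProvider<T>
    where T : IComparable<T> {
    //Use T.CompareTo for sorting
}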

Getting Generic Math Right in .NET
• Pretty nasty:
• Consider Complex<T>: you can’t implement operators…
• Solution sketch:
• Define ICalculator<T> with methods instead of operators
• Implement ICalculator<T> for each T
• Choose between ICalculator<T>’s implementations at
runtime, and use them in your generic math code
• For more:
http://www.codeproject.com/Articles/8531/Using-generics-for-calculations
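The calculator idea can be sketched as follows, written in Java only for illustration. Calculator, DoubleCalculator and Complex are hypothetical names, and because Java cannot inspect T at runtime, this sketch passes the calculator in explicitly instead of selecting it at runtime as the .NET solution would.

```java
// Hypothetical analog of ICalculator<T>: methods stand in for operators.
interface Calculator<T> {
    T add(T a, T b);
    T subtract(T a, T b);
    T multiply(T a, T b);
}

// One implementation per numeric type; only Double is shown here.
final class DoubleCalculator implements Calculator<Double> {
    public Double add(Double a, Double b) { return a + b; }
    public Double subtract(Double a, Double b) { return a - b; }
    public Double multiply(Double a, Double b) { return a * b; }
}

final class Complex<T> {
    final T re;
    final T im;
    private final Calculator<T> calc;

    Complex(T re, T im, Calculator<T> calc) {
        this.re = re;
        this.im = im;
        this.calc = calc;
    }

    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using only calculator methods.
    Complex<T> times(Complex<T> other) {
        T real = calc.subtract(calc.multiply(re, other.re), calc.multiply(im, other.im));
        T imag = calc.add(calc.multiply(re, other.im), calc.multiply(im, other.re));
        return new Complex<>(real, imag, calc);
    }
}

final class GenericMathDemo {
    public static void main(String[] args) {
        Calculator<Double> calc = new DoubleCalculator();
        Complex<Double> product =
            new Complex<>(1.0, 2.0, calc).times(new Complex<>(3.0, 4.0, calc));
        System.out.println(product.re + " " + product.im); // prints "-5.0 10.0"
    }
}
```

The generic Complex code never applies an operator to T directly, which is the point of the workaround. The price is an indirect call per arithmetic operation, plus boxing in the Java version.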

The .NET Approach—Pros and Cons

Pros
• Constraint violation results in clear compiler error
• Can share generic types and objects between packages/applications
• Can use generics efficiently with value types
• Can use Reflection to query over generic types

Cons
• Constraints are not enough for everything (e.g., generic math)
• No meta-programming abilities (advantage?)
QUESTIONS?
Sasha Goldshtein
blog.sashag.net | @goldshtn
