The .NET garbage collector can be your best friend or your worst enemy, and it is not friendly with a lot of people. The GC has left more than a few production systems going up in smoke after developers failed to anticipate the effects of real production loads on the memory subsystem. In this talk, we will methodically measure and improve the .NET garbage collector’s performance. We will begin with a quick refresher on dynamic performance tools that can identify GC issues (CLR performance counters, ETW GC events, and ETW object allocation events) as well as static analysis tools, such as the Roslyn-based heap allocations analyzer. Then, we will inspect multiple issues at the source code level: excessive boxing, unintended effects of lambdas closing over local variables, await-generated state machines, intermediate objects in LINQ queries, and many others. We will also discuss higher-level memory problems: how to get rid of large object allocations, how to avoid finalization, and how to convert heap-based designs to local objects. Some of these ideas are now being applied at the language and framework level in C# 7 and .NET Core. At the end of the talk, you will be equipped to reduce memory traffic and GC overhead in your own applications, often by a factor of 10 or more!
8. • Performance counters provide numeric performance information about the system
• Located in different areas in the system: disk, .NET, networking, OS objects
• Available on demand using built-in tools like perfmon and typeperf
• You can write your own, but that’s not the topic of our talk…
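The counters above are read with external tools. As a rough in-process analogue, the sketch below samples the gen0 collection count around a piece of work using `GC.CollectionCount`. The class and method names (`GcSnapshotDemo`, `CountGen0Collections`) are made up for illustration, and this is not a replacement for perfmon or typeperf.

```csharp
using System;

public static class GcSnapshotDemo
{
    // Returns how many gen0 collections happened while `work` was running.
    public static int CountGen0Collections(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        int collections = CountGen0Collections(() =>
        {
            // Allocate many short-lived arrays to create gen0 pressure.
            for (int i = 0; i < 100_000; ++i)
            {
                var temp = new byte[1024];
                temp[0] = 1;
            }
        });

        Console.WriteLine($"Gen0 collections: {collections}");
        Console.WriteLine($"Heap bytes: {GC.GetTotalMemory(false)}");
    }
}
```

The numbers depend on the runtime, GC mode, and machine, so treat them as a trend indicator rather than an exact measurement.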
10. • ETW is a high-speed logging framework supporting more than 100K structured messages per second
• Used by .NET, drivers, services, third-party components
• Can be turned on on demand while the process is running
• Very small overhead
• You can write your own, but that’s not the topic of our talk…
21. struct StructWithSpecializedEquals {
    public int Value { get; set; }
    // Equals(object) takes a reference, so a struct argument is boxed by the caller
    public override bool Equals(object obj) {
        if (!(obj is StructWithSpecializedEquals)) return false;
        return ((StructWithSpecializedEquals)obj).Value == Value;
    }
}
22. struct StructEquatable : IEquatable<StructEquatable> {
    public int Value { get; set; }
    // Strongly typed Equals: the argument is passed by value, no boxing
    public bool Equals(StructEquatable other) {
        return Value == other.Value;
    }
}
23. private const int N = 10000;
…
structs = Enumerable.Range(0, N)
    .Select(v => new Struct { Value = v })
    .ToList();
structWithSpecializedEqualses = Enumerable.Range(0, N)
    .Select(v => new StructWithSpecializedEquals { Value = v })
    .ToList();
structEquatables = Enumerable.Range(0, N)
    .Select(v => new StructEquatable { Value = v })
    .ToList();
24. [Benchmark]
public bool SearchStruct() {
    return structs.Contains(structs.Last());
}
[Benchmark]
public bool SearchStructWithSpecializedEquals() {
    …
}
[Benchmark]
public bool SearchStructEquatable() {
    …
}
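To see the boxing difference without a benchmark harness, one option is to compare the bytes allocated by `List<T>.Contains` for the two kinds of struct. The sketch below is an assumption-laden illustration, not the talk's benchmark: the type names (`PlainEquals`, `TypedEquals`, `BoxingDemo`) are hypothetical, and it relies on `GC.GetAllocatedBytesForCurrentThread`, which requires a reasonably recent runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Overrides Equals(object) only, so comparisons go through a boxed argument.
public struct PlainEquals
{
    public int Value { get; set; }
    public override bool Equals(object obj) => obj is PlainEquals other && other.Value == Value;
    public override int GetHashCode() => Value;
}

// Implements IEquatable<T>, so the default comparer can call the typed Equals.
public struct TypedEquals : IEquatable<TypedEquals>
{
    public int Value { get; set; }
    public bool Equals(TypedEquals other) => Value == other.Value;
    public override bool Equals(object obj) => obj is TypedEquals other && Equals(other);
    public override int GetHashCode() => Value;
}

public static class BoxingDemo
{
    // Bytes allocated on the current thread while searching for the last element.
    public static long AllocatedBySearch<T>(List<T> items)
    {
        T target = items[items.Count - 1];
        items.Contains(target); // warm-up: JIT compilation and comparer setup

        long before = GC.GetAllocatedBytesForCurrentThread();
        bool found = items.Contains(target);
        long allocated = GC.GetAllocatedBytesForCurrentThread() - before;

        if (!found) throw new InvalidOperationException("target should be present");
        return allocated;
    }

    public static void Main()
    {
        var plain = Enumerable.Range(0, 10000).Select(v => new PlainEquals { Value = v }).ToList();
        var typed = Enumerable.Range(0, 10000).Select(v => new TypedEquals { Value = v }).ToList();

        Console.WriteLine($"Equals(object) only: {AllocatedBySearch(plain)} bytes");
        Console.WriteLine($"IEquatable<T>:       {AllocatedBySearch(typed)} bytes");
    }
}
```

On runtimes that do not optimize the box away, the first figure should be noticeably larger than the second. Exact values vary by runtime version.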
35. public void CaptureState() {
    _globalSum = 0;
    for (int i = 0; i < Elements; ++i) {
        var data = new Data { Value = i };
        // The lambda closes over `data`: a closure object and a delegate per iteration
        TaskStub.StartNew(() => {
            _globalSum += data.Value;
        });
    }
}
36. public void PassStateAsParameter() {
    _globalSum = 0;
    for (int i = 0; i < Elements; ++i) {
        var data = new Data { Value = i };
        // State is passed as an argument, so the lambda itself captures nothing
        TaskStub.StartNew(d => {
            _globalSum += (d as Data).Value;
        }, data);
    }
}
37. public void NoCapturedState() {
    _globalSum = 0;
    for (int i = 0; i < Elements; ++i) {
        // Nothing is captured, so no closure object is needed
        TaskStub.StartNew(() => {
            _globalSum += Data.Default.Value;
        });
    }
}
38. public void NoStateAndNoLambda() {
    _globalSum = 0;
    for (int i = 0; i < Elements; ++i) {
        // Method group conversion: whether the delegate is cached depends on the compiler version
        TaskStub.StartNew(AddFunction);
    }
}
private static void AddFunction() {
    _globalSum += Data.Default.Value;
}
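The variants above can be compared outside a benchmark by measuring allocated bytes directly. The following is a minimal sketch under stated assumptions: `Run` is a hypothetical stand-in for `TaskStub.StartNew` that simply invokes the delegate, it is marked `NoInlining` so the delegate really escapes, and all other names are invented for the example.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ClosureDemo
{
    private static int _sum;

    // Hypothetical stand-in for TaskStub.StartNew: runs the action immediately.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Run(Action action) => action();

    public static int SumWithCapture(int count)
    {
        _sum = 0;
        for (int i = 0; i < count; ++i)
        {
            int local = i;            // captured below: closure object + delegate per iteration
            Run(() => _sum += local);
        }
        return _sum;
    }

    public static int SumWithoutCapture(int count)
    {
        _sum = 0;
        for (int i = 0; i < count; ++i)
            Run(() => _sum += 1);     // captures nothing: the compiler can cache one delegate
        return _sum;
    }

    // Bytes allocated on the current thread by one call to `method(count)`.
    public static long AllocatedBy(Func<int, int> method, int count)
    {
        method(count); // warm-up so one-time delegate caching is not measured
        long before = GC.GetAllocatedBytesForCurrentThread();
        method(count);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine($"With capture:    {AllocatedBy(SumWithCapture, 1000)} bytes");
        Console.WriteLine($"Without capture: {AllocatedBy(SumWithoutCapture, 1000)} bytes");
    }
}
```

The capturing version is expected to allocate roughly in proportion to the iteration count, while the non-capturing version should stay flat. Exact byte counts are runtime-specific.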
47. public static double CalculateWithLoops() {
    int sum = 0;
    for (int i = Minimum; i < Maximum; ++i) {
        var digits = new int[10]; // one array allocation per number
        var number = i;
        while (number > 0) {
            digits[number % 10] += 1;
            number /= 10;
        }
        for (int d = 0; d < digits.Length; ++d)
            if (digits[d] == 1) // then this is a unique digit
                ++sum;
    }
    return (double)sum / (Maximum - Minimum);
}
48. public static double CalculateWithLoopsAndString() {
    int sum = 0;
    for (int i = Minimum; i < Maximum; ++i) {
        var digits = new int[10];
        var s = i.ToString(); // an extra string allocation per number
        for (var k = 0; k < s.Length; ++k)
            digits[s[k] - '0'] += 1;
        for (int d = 0; d < digits.Length; ++d)
            if (digits[d] == 1) // then this is a unique digit
                ++sum;
    }
    return (double)sum / (Maximum - Minimum);
}
49. public static double CalculateWithLinq() {
    return Enumerable.Range(Minimum, Maximum - Minimum)
        .Select(i => i.ToString()
            .AsEnumerable()
            .GroupBy(
                c => c,
                c => c,
                (k, g) => new {
                    Character = k,
                    Count = g.Count()
                })
            .Count(g => g.Count == 1))
        .Average();
}
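The three versions above allocate progressively more per number. Going the other way, the per-iteration array can be replaced by a single stack buffer that is cleared and reused. This is a sketch rather than code from the talk: it assumes `Span<T>` and `stackalloc` in safe code (C# 7.2 or later), and it takes the bounds as parameters instead of the `Minimum` and `Maximum` constants.

```csharp
using System;

public static class UniqueDigits
{
    // Average number of digits that appear exactly once, over [minimum, maximum).
    public static double CalculateWithReusedBuffer(int minimum, int maximum)
    {
        Span<int> digits = stackalloc int[10]; // one stack buffer for the whole run
        int sum = 0;

        for (int i = minimum; i < maximum; ++i)
        {
            digits.Clear();                    // reset the counts instead of allocating
            for (int number = i; number > 0; number /= 10)
                digits[number % 10] += 1;

            for (int d = 0; d < digits.Length; ++d)
                if (digits[d] == 1)            // this digit appears exactly once
                    ++sum;
        }

        return (double)sum / (maximum - minimum);
    }

    public static void Main()
    {
        Console.WriteLine(CalculateWithReusedBuffer(10, 13));
    }
}
```

The `stackalloc` sits outside the loop on purpose: stack memory is only released when the method returns, so allocating inside the loop would grow the stack on every iteration.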
52. • You have a large heap. Don’t make it worse with frequent calls that trigger a full GC, which is going to take a long time.
• Instead, use what we learned to reduce memory usage
• And don’t forget to remove debugging code from production
56. • Value types (structs) have a compact memory layout and can be embedded in their parent object, which is easier on the CPU cache and generally reduces the memory footprint
58. • Pool expensive or large objects instead of returning them to the GC
• For large arrays (e.g. byte[]), you may use System.Buffers
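As an illustration of the pooling advice, `System.Buffers` provides `ArrayPool<T>`, whose shared instance hands out reusable arrays. The sketch below rents a buffer, uses it, and returns it. The method name and the checksum workload are invented for the example.

```csharp
using System;
using System.Buffers;

public static class PoolingDemo
{
    // Fills a pooled buffer with a simple pattern and returns the sum of its bytes.
    public static int ChecksumWithPooledBuffer(int count)
    {
        // Rent may return an array longer than requested, so always use `count`, not buffer.Length.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(count);
        try
        {
            for (int i = 0; i < count; ++i)
                buffer[i] = (byte)(i % 256);

            int sum = 0;
            for (int i = 0; i < count; ++i)
                sum += buffer[i];
            return sum;
        }
        finally
        {
            // Hand the array back so later callers can reuse it instead of allocating.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(ChecksumWithPooledBuffer(256));
    }
}
```

A rented array must not be touched after it has been returned, and its contents are not cleared by default, so code that handles sensitive data may want to clear it first.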
60. • Finalizers run at some point after the object is no longer referenced by the application (non-deterministic)
• Finalizers run on a separate thread and create potential concurrency issues
• Finalization prolongs object lifetime and can create leaks if finalizers don’t complete quickly enough
• Better to use deterministic resource management (IDisposable)
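A minimal sketch of the deterministic alternative: a sealed class with no finalizer that implements `IDisposable` and is released by a `using` statement. `TempResource` and `DisposeDemo` are hypothetical names, and the actual resource handling is left as comments.

```csharp
using System;

public sealed class TempResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Use()
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(TempResource));
        // ... work with the underlying resource ...
    }

    public void Dispose()
    {
        if (IsDisposed) return; // Dispose must be safe to call more than once
        // Release the underlying resource here.
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    // Returns true if the resource was released when the using block ended.
    public static bool UseAndRelease()
    {
        var resource = new TempResource();
        using (resource)
        {
            resource.Use();
        } // Dispose runs here, on this thread, at a known point
        return resource.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(UseAndRelease());
    }
}
```

Because the class has no finalizer, the object never enters the finalization queue and can be reclaimed as soon as it becomes unreachable.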
61. • Although a single memory allocation is extremely fast, allocations add up
• All based on real questions, stories and bugs
• Don’t overcomplicate where it’s not needed
• Measure
• And optimize…
• DIY: https://github.com/dinazil/look-mommy-no-gc