Although I wouldn’t call myself a Garbage Collection expert, I have a general grasp of how it works in .Net. Like many areas where my knowledge is somewhat superficial, I tend to forget what I learned if I don’t use it for a while, only to have to relearn it later. The relearning process can be frustrating, especially when you’re trying to relocate the same articles and links you used previously. To streamline this process in the future, I’m consolidating some of the top resources on this topic, along with my own insights, in this post. This exercise will also compel me to see it through.
This post, which focuses on the inner workings of the Garbage Collector, is the first in a two-part series. The second part will delve into the practical implications of this information for us as developers.
Some Foundational Concepts
.Net and Mono (since version 2.8) both use a generational garbage collector. When memory is allocated (by the new operator, just before the constructor is called) and not enough memory is available, the GC kicks in. Depending on the type of GC in use (Workstation or Server), the collection runs either on the thread that attempted the allocation or on a separate thread. When a dedicated thread does the work, we call it concurrent garbage collection. Other threads can then keep running for a significant portion of the collection, which minimizes noticeable pauses and is an improvement over older “stop-the-world” approaches. .Net 4 enhanced concurrent GC and renamed it background garbage collection (Background GC), further reducing the likelihood of noticeable slowdowns.
Prior to Background GC, when the GC was processing Generation 2 objects (a potentially time-consuming task), other threads running concurrently and making new allocations could fill the segment designated for new objects (the ephemeral segment, which we’ll explore shortly). This would cause these threads to stall due to lack of space for their new objects. With the background collection algorithm, if this scenario arises, all threads (including the one handling the collection) are paused, and a new GC process is initiated specifically to address this ephemeral segment. These specialized ephemeral segment collections, now called foreground collections, free up space, allowing all threads (regular application threads and the background collection thread) to resume operation.
The .Net managed heap is made up of two heaps: the Small Object Heap (SOH) and the Large Object Heap (LOH), where objects of 85,000 bytes or more reside. Objects in the LOH are treated as Generation 2 (although Gen 2 also includes SOH objects), so they are only collected during a full garbage collection. Garbage collections that target only Generation 0 and 1 objects are called ephemeral collections in .Net terminology. They are typically very fast (the object graph to traverse is much smaller) and, as a result, are not run as concurrent or background collections. The segment (currently 16 MB) where Gen 0 and Gen 1 objects are allocated is known as the ephemeral segment, and it’s always the most recently acquired segment.
Ephemeral generations are allocated within the ephemeral segment. Each new segment the garbage collector acquires becomes the new ephemeral segment, holding objects that persisted through a generation 0 garbage collection. The previous ephemeral segment then becomes the new generation 2 segment.
Unlike the SOH, which is compacted after a GC run, the LOH is not. This means that improper allocation of large objects can result in fragmentation issues. Compacting the SOH leads to substantial performance improvements compared to traditional native applications. Allocations in the SOH are linear. The runtime maintains a pointer to the next available position in the SOH (within the ephemeral segment). This enables direct allocation to this location, contrasting with native applications where finding a free memory area is necessary. This also ensures that objects created sequentially are allocated contiguously, further enhancing performance. The Generational GC Performance Optimizations section here provides a more in-depth explanation.
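To make the difference concrete for myself, here is a minimal sketch of the two allocation strategies. It is a toy model written in Python purely to keep it short; the class names (BumpSegment, FreeListHeap) and the sizes are made up for illustration and have nothing to do with how the CLR is actually implemented.

```python
class BumpSegment:
    """Toy model of a compacted segment: allocating just advances a pointer."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0  # offset of the next available position

    def allocate(self, nbytes):
        if self.next_free + nbytes > self.size:
            return None  # segment is full; a real runtime would start a GC here
        address = self.next_free
        self.next_free += nbytes
        return address


class FreeListHeap:
    """Toy model of a non-compacted heap: first-fit search over free holes."""

    def __init__(self, free_blocks):
        # free_blocks: list of (offset, length) holes left behind by freed objects
        self.free_blocks = list(free_blocks)

    def allocate(self, nbytes):
        for i, (offset, length) in enumerate(self.free_blocks):
            if length >= nbytes:
                if length == nbytes:
                    del self.free_blocks[i]
                else:
                    self.free_blocks[i] = (offset + nbytes, length - nbytes)
                return offset
        return None  # no single hole is big enough


# Sequential allocations in the compacted segment end up side by side.
segment = BumpSegment(size=1024)
a = segment.allocate(24)
b = segment.allocate(40)
c = segment.allocate(16)

# A fragmented heap has to search, and can fail with free space left over.
fragmented = FreeListHeap([(0, 16), (100, 16), (200, 64)])
d = fragmented.allocate(48)  # only the last hole fits
e = fragmented.allocate(32)  # 48 bytes are still free, but in 16-byte holes
```

In this model a, b and c land at consecutive offsets with no searching at all, while the last allocation on the fragmented heap fails even though enough total space is free. That second case is the kind of fragmentation problem a non-compacted heap like the LOH can run into.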
Finalization
Garbage collection encompasses more than just memory deallocation; it also handles the release of unmanaged resources such as Windows handles. Code responsible for releasing these resources should be placed within a Finalize method. This allows the runtime to release these resources even if they’re overlooked in application code.
So how do memory release and resource release work in tandem?
Several articles and Q&A resources listed below provide excellent explanations. The GC, when initiated, operates under the assumption that all objects in the heap are garbage. To identify objects that shouldn’t be discarded, it starts by traversing GC roots (primarily stack variables and global variables) to construct a graph of reachable objects. Objects not included in this graph are considered inaccessible by the application and, therefore, true garbage. This is where finalization comes in, before memory is released. The GC checks whether any of these unused objects are referenced in the Finalization queue (every object whose type overrides the Finalize method inherited from the Object class is added to this queue when it is created). Objects found in the Finalization queue are removed from it and added to the freachable queue. That queue also acts as a GC root, so any objects referenced by your finalizable object are kept alive at this point as well. These objects are no longer classified as garbage, so their memory isn’t reclaimed in this collection cycle.
As the GC completes, a dedicated application thread known as the Finalizer Thread is alerted to the new entries in the freachable queue. This thread then starts executing the Finalize method for each object in the queue. Once an object’s Finalize method has run, it’s removed from the freachable queue. This ensures that during the next GC cycle, the object won’t be part of the graph, and its memory will be freed.
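The two-cycle behaviour described above is easier to follow with a small simulation. Again, this is only a toy model in Python under my own assumptions: ToyHeap, Obj and the method names are invented for the example, the "finalizer thread" is just a method call, and the real GC is far more involved.

```python
class Obj:
    def __init__(self, name, has_finalizer=False):
        self.name = name
        self.has_finalizer = has_finalizer
        self.refs = []  # outgoing references to other objects


class ToyHeap:
    def __init__(self):
        self.objects = []       # everything currently allocated
        self.roots = []         # stand-in for stack and global variables
        self.finalization = []  # finalizable objects, registered at creation
        self.freachable = []    # objects waiting for their Finalize to run
        self.finalized = []     # names, in the order Finalize "ran"

    def new(self, name, has_finalizer=False):
        obj = Obj(name, has_finalizer)
        self.objects.append(obj)
        if has_finalizer:
            self.finalization.append(obj)
        return obj

    def _mark(self, start):
        # Build the set of objects reachable from the given starting points.
        reachable, stack = set(), list(start)
        while stack:
            obj = stack.pop()
            if obj not in reachable:
                reachable.add(obj)
                stack.extend(obj.refs)
        return reachable

    def collect(self):
        live = self._mark(self.roots + self.freachable)
        # Unreachable objects that still need finalizing move to freachable.
        for obj in list(self.finalization):
            if obj not in live:
                self.finalization.remove(obj)
                self.freachable.append(obj)
        # freachable acts as a root, so mark again before freeing anything.
        live = self._mark(self.roots + self.freachable)
        freed = [o.name for o in self.objects if o not in live]
        self.objects = [o for o in self.objects if o in live]
        return freed

    def run_finalizer_thread(self):
        # Stand-in for the finalizer thread draining the freachable queue.
        while self.freachable:
            obj = self.freachable.pop(0)
            self.finalized.append(obj.name)


heap = ToyHeap()
keep = heap.new("keep")
heap.roots.append(keep)
handle = heap.new("handle", has_finalizer=True)
buffer = heap.new("buffer")
handle.refs.append(buffer)  # kept alive through the finalizable object
plain = heap.new("plain")

first = heap.collect()       # only "plain" can be freed in this cycle
heap.run_finalizer_thread()  # "handle" gets finalized
second = heap.collect()      # now "handle" and "buffer" are freed
```

The point to notice is that the finalizable object and everything it references survive the first collection and are only reclaimed by the second one, after the finalizer has run.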
Further reading:
- Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework
- When does CLR say that an object has a finalizer?
- Fundamentals of Garbage Collection
- Difference between background and concurrent garbage collection?
- how the mark and sweep phases of .NET GC can run concurrently with application threads?
- Background and Foreground GC in .NET 4
- Maoni (aka the GC queen) blog
- .NET Garbage Collector PopQuiz
- Very complete and not too dense article
- Very interesting article that I didn’t come across until March 2012
- No entry in this blog would be complete without a link to wikipedia :-)