docs/pdds/pdd09_gc.pod - Garbage Collection Subsystems
This PDD describes how DOD/GC systems work, and what's required of PMC classes.
$Revision$
Doing DOD takes a bit of work--we need to make sure that everything is findable from the root set, and that we don't go messing up data shared between interpreters.
There are basically three general schemes to achieve garbage collection.
There are several variants possible with the preceding schemes. These variants achieve different goals:
All managed objects (PMCs, Strings, Buffers) inside Parrot are subject to garbage collection. As these objects aren't allowed to move after creation, garbage collection is done by a non-copying scheme. Further: as we have to cope with pointers to objects on the C stack and in CPU registers, the garbage collection scheme is a conservative one.
DOD/GC is normally triggered by the allocation of new objects, which usually happens at some stack nesting level below the run loop. There is a small chance that an integer on the C stack is misinterpreted as a pointer to an object. In that case, the object would be kept alive.
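The conservative behaviour described above can be illustrated with a minimal sketch. It is not Parrot's stack-walking code: the arena, the DemoObj layout, and all demo_ names are assumptions made for this example, and an array of words stands in for the C stack and CPU registers.

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_OBJECTS 8

/* Hypothetical fixed-size object header; not Parrot's real PObj layout. */
typedef struct DemoObj {
    int live;       /* set when the scan finds a possible reference */
    int payload;
} DemoObj;

static DemoObj demo_arena[ARENA_OBJECTS];

/* Treat `word` as a possible pointer: if it falls inside the arena and
 * lands exactly on an object boundary, mark that object live.
 * Returns 1 if an object was marked, 0 otherwise. */
static int demo_mark_if_pointer(uintptr_t word)
{
    uintptr_t start = (uintptr_t)&demo_arena[0];
    uintptr_t end   = (uintptr_t)&demo_arena[ARENA_OBJECTS];

    if (word < start || word >= end)
        return 0;
    if ((word - start) % sizeof(DemoObj) != 0)
        return 0;
    demo_arena[(word - start) / sizeof(DemoObj)].live = 1;
    return 1;
}

/* Scan a range of words standing in for the C stack and registers.
 * Returns the number of words that were treated as object pointers. */
size_t demo_scan_words(const uintptr_t *words, size_t n)
{
    size_t marked = 0;
    for (size_t i = 0; i < n; i++)
        marked += (size_t)demo_mark_if_pointer(words[i]);
    return marked;
}
```

The scan cannot tell a real pointer from an integer that happens to hold the same bit pattern, so both keep their object alive. This costs some retained garbage but never frees an object that is still referenced.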
The liveness information gained by dead object detection (DOD) is also the basis for collecting variable-sized data that may hang off buffers.
Variable-sized memory, such as string memory, gets collected when the associated header isn't found to be alive during DOD. While a copying collection could basically[1] be done at any time, it's inefficient to copy the buffers of objects that have not yet been detected as dead. This implies that a DOD run for fixed-sized headers is triggered before a collection in the memory pools is run.
[1] Dead objects stay dead; there is no possibility of reusing dead objects.
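The ordering above (DOD first, then collection of the memory pools) can be sketched as follows. This is only an illustration under assumptions: DemoBuffer, its fields, and demo_compact are hypothetical names, and the live field stands in for the result of a preceding DOD run.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer header; not Parrot's real Buffer layout. */
typedef struct DemoBuffer {
    int    live;   /* liveness as determined by the preceding DOD run */
    char  *data;   /* points into the variable-sized memory pool */
    size_t len;    /* number of bytes owned by this header */
} DemoBuffer;

/* Copy the data of live buffers back to back into to_space and update
 * the header pointers. Data of dead headers is simply not copied.
 * Returns the number of bytes used in to_space. */
size_t demo_compact(DemoBuffer *headers, size_t n, char *to_space)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (!headers[i].live) {
            headers[i].data = NULL;
            headers[i].len  = 0;
            continue;
        }
        memcpy(to_space + used, headers[i].data, headers[i].len);
        headers[i].data = to_space + used;
        used += headers[i].len;
    }
    return used;
}
```

Without up-to-date liveness information, the copy loop would have to treat every header as live and move memory that is about to be discarded anyway.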
GC subsystems are rather independent. The goal for Parrot is just to provide new object headers in the fastest possible way. How that is achieved can be considered as an implementation detail.
While GC subsystems are independent, they may share some code to reduce Parrot's memory footprint. E.g. stop-the-world mark and sweep and incremental mark and sweep use the same arena structures and share arena creation and DOD routines.
Currently only one GC system is active (selected at configure or compile time). Future versions might support switching GC systems at runtime to accommodate different workloads.
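void Parrot_gc_XXX_init(Interp *)
Initialization routine of a GC subsystem, where XXX stands for the name of that subsystem. It operates on arena_base.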
The initialization code is responsible for the creation of the header pools and has to fill the following function pointer slots in arena_base:
void (*do_dod_run) (Interp *, int flags)
The flag DOD_trace_stack_FLAG indicates that the C stack (and other system areas like the processor registers) has to be traced too.
void (*de_init_gc_system) (Interp *)
void (*mark_object) (Interp *, PObj *)
See also pobject_lives(Interp *, PObj *).
void (*init_pool) (Interp *, struct Small_Object_Pool *)
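A minimal sketch of an initialization routine that fills these slots might look like this. The four slot names and signatures are taken from the list above; everything else is an assumption made so the example compiles on its own: the subsystem name "demo", struct DemoArenas, DemoPrivate, the stand-in definitions of Interp, PObj and Small_Object_Pool, and the trivial slot implementations.

```c
#include <stddef.h>

/* Stand-in types; Parrot's real definitions are much larger. */
typedef struct Interp Interp;
typedef struct PObj { int live; } PObj;
struct Small_Object_Pool { size_t object_size; };

/* Hypothetical layout of the structure behind arena_base. */
struct DemoArenas {
    void (*do_dod_run)(Interp *, int flags);
    void (*de_init_gc_system)(Interp *);
    void (*mark_object)(Interp *, PObj *);
    void (*init_pool)(Interp *, struct Small_Object_Pool *);
    void *gc_private;   /* reserved for the GC subsystem */
};

struct Interp { struct DemoArenas *arena_base; };

/* Private state of the hypothetical "demo" subsystem. */
typedef struct DemoPrivate { int dod_runs; } DemoPrivate;
static DemoPrivate demo_private;

static void demo_do_dod_run(Interp *interp, int flags)
{
    DemoPrivate *priv = interp->arena_base->gc_private;
    (void)flags;            /* a real run would honour DOD_trace_stack_FLAG */
    priv->dod_runs++;
}

static void demo_de_init(Interp *interp)
{
    interp->arena_base->gc_private = NULL;
}

static void demo_mark_object(Interp *interp, PObj *obj)
{
    (void)interp;
    obj->live = 1;
}

static void demo_init_pool(Interp *interp, struct Small_Object_Pool *pool)
{
    (void)interp;
    pool->object_size = sizeof(PObj);
}

/* Follows the Parrot_gc_XXX_init naming pattern with XXX = demo. */
void Parrot_gc_demo_init(Interp *interp)
{
    struct DemoArenas *arena_base = interp->arena_base;

    arena_base->do_dod_run        = demo_do_dod_run;
    arena_base->de_init_gc_system = demo_de_init;
    arena_base->mark_object       = demo_mark_object;
    arena_base->init_pool         = demo_init_pool;
    arena_base->gc_private        = &demo_private;
}
```

Callers only ever go through the slots, which is what would allow a different subsystem to be plugged in by providing another init routine.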
Each header pool provides one function pointer to get a new object from that pool.
PObj * (*get_free_object) (Interp *, struct Small_Object_Pool*)
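One simple way to back this slot is a free list, sketched below. Only the get_free_object signature comes from the text above; the free_list field, the next_free link, and the demo_ functions are assumptions for illustration, not Parrot's actual pool layout.

```c
#include <stddef.h>

typedef struct Interp Interp;   /* opaque stand-in */

/* Hypothetical header that is linked into a free list while unused. */
typedef struct PObj { struct PObj *next_free; } PObj;

struct Small_Object_Pool {
    PObj *free_list;
    PObj *(*get_free_object)(Interp *, struct Small_Object_Pool *);
};

/* Pop the first header off the free list, or return NULL if the pool
 * is exhausted. A real subsystem would presumably trigger a DOD run
 * or allocate a new arena instead of giving up. */
static PObj *demo_get_free_object(Interp *interp,
                                  struct Small_Object_Pool *pool)
{
    PObj *obj = pool->free_list;
    (void)interp;
    if (obj == NULL)
        return NULL;
    pool->free_list = obj->next_free;
    obj->next_free  = NULL;
    return obj;
}

/* Thread n headers from `storage` onto the free list, lowest index
 * first, and install the allocation function. */
void demo_pool_init(struct Small_Object_Pool *pool, PObj *storage, size_t n)
{
    pool->free_list = NULL;
    for (size_t i = n; i-- > 0; ) {
        storage[i].next_free = pool->free_list;
        pool->free_list = &storage[i];
    }
    pool->get_free_object = demo_get_free_object;
}
```

Because allocation is a pointer pop, the common case stays cheap, which fits the stated goal of providing new object headers as fast as possible.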
The GC subsystem has to provide these (possibly empty) macros:
DOD_WRITE_BARRIER(Interp *, PMC *agg, PMC *old, PMC *new)
In the aggregate agg, the element old is getting overwritten by new. Both old and new may be NULL.
DOD_WRITE_BARRIER_KEY(Interp *, PMC *agg, PMC *old, PObj *old_key, PMC *new, PObj *new_key)
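As an illustration of how an aggregate might invoke the barrier, here is a minimal sketch. The macro name and argument order follow the description above; the barrier policy (marking new live when agg is already marked), the PMC stand-in, DemoArray, and the demo_ names are assumptions and do not describe any particular Parrot GC subsystem.

```c
#include <stddef.h>

typedef struct Interp Interp;           /* opaque stand-in */
typedef struct PMC { int live; } PMC;   /* hypothetical minimal PMC */

static int demo_barrier_hits;

/* One possible barrier policy: if the aggregate has already been
 * marked, mark the newly stored element too, so that it cannot be
 * missed by a mark phase that is still in progress. */
static void demo_write_barrier(Interp *interp, PMC *agg,
                               PMC *old, PMC *new_pmc)
{
    (void)interp;
    (void)old;
    if (agg->live && new_pmc != NULL && !new_pmc->live) {
        new_pmc->live = 1;
        demo_barrier_hits++;
    }
}

/* A subsystem that needs no barrier could define this macro as empty. */
#define DOD_WRITE_BARRIER(interp, agg, old, new) \
    demo_write_barrier((interp), (agg), (old), (new))

/* Hypothetical aggregate with four element slots. */
typedef struct DemoArray {
    PMC  header;
    PMC *slots[4];
} DemoArray;

/* Store `value` at index i, invoking the barrier before the slot is
 * overwritten. Both the old and the new element may be NULL. */
void demo_array_set(Interp *interp, DemoArray *arr, size_t i, PMC *value)
{
    DOD_WRITE_BARRIER(interp, &arr->header, arr->slots[i], value);
    arr->slots[i] = value;
}
```

Keeping the barrier behind a macro means aggregate code is written once and each GC subsystem decides at compile time whether the call costs anything.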
The arena_base holds the mentioned function pointers, pointers to the header pools, some statistic counters, and a pointer void *gc_private reserved for the GC subsystem.
The GC subsystem is responsible for updating the appropriate statistic fields of the structure.
Being able to block GC and DOD is important--you'd hate to have the newly allocated Buffers or PMCs you've got yanked out from underneath you. That'd be no fun. Use the following routines to control GC:
Note that the blocking is recursive--if you call Parrot_block_DOD() three times in succession, you need to call Parrot_unblock_DOD() three times to re-enable DOD.
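The recursive blocking can be pictured as a simple counter. The sketch below uses hypothetical stand-ins (DemoInterp, demo_block_DOD, demo_unblock_DOD, demo_try_dod_run) rather than the real Parrot_block_DOD() and Parrot_unblock_DOD(), whose implementation is not shown in this document.

```c
/* Hypothetical interpreter state; only the fields needed here. */
typedef struct DemoInterp {
    unsigned dod_block_level;   /* > 0 means DOD is blocked */
    unsigned dod_runs;          /* number of DOD runs performed */
} DemoInterp;

/* Each call adds one level of blocking. */
void demo_block_DOD(DemoInterp *interp)
{
    interp->dod_block_level++;
}

/* Each call removes one level; extra calls are ignored. */
void demo_unblock_DOD(DemoInterp *interp)
{
    if (interp->dod_block_level > 0)
        interp->dod_block_level--;
}

/* Run DOD only when no block is outstanding.
 * Returns 1 if a run happened, 0 if it was suppressed. */
int demo_try_dod_run(DemoInterp *interp)
{
    if (interp->dod_block_level > 0)
        return 0;
    interp->dod_runs++;
    return 1;
}
```

A counter rather than a boolean lets nested code block and unblock independently without accidentally re-enabling DOD for an outer caller.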
For PMCs and Buffers to be collected properly, you must get the flags set on them properly. Otherwise Bad Things Will Happen.
Note: don't manipulate these flags directly. Always use the macros defined in include/parrot/pobj.h.
The mark vtable slot will be called during DOD. The mark function must call pobject_lives for all non-NULL objects that the PMC refers to. Note that pobject_lives may be a macro.
PMC_int_val(SELF)
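The requirement on a custom mark function, stated above, can be sketched as follows. The pobject_lives defined here is only a stand-in with the signature given earlier in this document (the real one may be a macro), and DemoPair and demo_pair_mark are hypothetical names for a PMC that refers to two other objects.

```c
#include <stddef.h>

typedef struct Interp Interp;             /* opaque stand-in */
typedef struct PObj { int live; } PObj;   /* hypothetical minimal header */

/* Stand-in for Parrot's pobject_lives: record that obj is reachable. */
static void pobject_lives(Interp *interp, PObj *obj)
{
    (void)interp;
    obj->live = 1;
}

/* Hypothetical PMC that refers to two other objects. */
typedef struct DemoPair {
    PObj  header;
    PObj *first;
    PObj *second;
} DemoPair;

/* Custom mark function: report every non-NULL object this PMC refers
 * to. The PMC itself is assumed to have been marked by the caller. */
void demo_pair_mark(Interp *interp, DemoPair *self)
{
    if (self->first != NULL)
        pobject_lives(interp, self->first);
    if (self->second != NULL)
        pobject_lives(interp, self->second);
}
```

Any referenced object that the mark function fails to report would look dead to DOD and could be reclaimed while still in use.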
The following flags can be used by the GC subsystem:
None.
None.
"A unified theory of garbage collection" <http://portal.acm.org/citation.cfm?id=1028982>
"Scalable Locality-Conscious Multithreaded Memory Allocation" <http://people.cs.vt.edu/~scschnei/papers/ismm06.pdf>