NAME ^

docs/memory_internals.pod - Memory Internals

ABSTRACT ^

This document tries to explain the internals of parrot structures related to memory management and should give the answer 42, when questions are related to the whole and everything and memory management.

OVERVIEW ^

All allocated items are managed in memory pools. Memory pools hold collections of similar items. These pools can be roughly divided into two kinds: pools holding fixed sized items and variable sized items.

A pool is organized as a linked list of big chunks, holding many managed items of one kind.

Abbreviations used here ^

DOD ... dead object detection, see docs/pdds/pdd09_gc.pod for details.

By scanning through all the interpreter's registers, stacks and the processor stack, all objects that are detected here are marked as alive. Objects in all pools not alive are considered dead.

Top down: the interpreter ^

A overall layout of the interpreter's memory management looks like so:

    typedef Interp {
        ...
        struct Arenas *arena_base;
        ...
    } Interp;

All object-like things that get allocated during the execution of parrot bytecode are managed here.

Arenas ^

struct Arenas holds pointers to different subkinds of managed memory. A simplification looks similar to this:

    struct Arenas {
        struct Memory_Pool *memory_pool;
        ...
        struct Small_Object_Pool * header_pool;
    }

Memory_pool and header_pool are variable and fixed sized pool pointers respectively.

Memory_Pool ^

Here all variable sized items are allocated, managed in linked lists of Memory_Blocks - but see below.

Small_Object_Pool ^

This pool manages Small_Object_Arenas, linked together, which provide the space for the fixed sized objects.

Fixed sized items ^

These items are either objects by themselves, like a PMC, or are a header structure of a variable sized object like a STRING. The general object of this kind is a buffer-like object, which consists of a Buffer or a PObj at the beginning and a variable but fixed sized part after the buffer structure. Examples of this are a STRING or an IntList.

Buffer-like objects of the same size are maintained in the sized_header_pools, which manage objects of the same size in one slot.

All fixed sized objects are allocated with alloc_objects(), and first put onto the pool's free_list. When there is need for a new object it is taken off the free list, and when it stops being used, it will be put back on the free list again. DOD detects which objects are no longer being used.

If the free list is empty then a DOD run is started, which may be able to refill the free list with dead objects it detects are free for re-use. If it finds none then a new pool is allocated to hold more objects.

These fixed sized objects are never freed during the lifetime of an interpreter, they just get reused or recycled.

General structure of a buffer-like item ^

    struct parrot_object_t {
    struct {
        void *bufstart;
        size_t buflen;
    } b;
    unsigned flags;
    ...
    } PObj;

This does not totally reflect the current implementation, but is the spirit of the abstraction of current objects. Above structure including flags is the current Buffer. With some additional fields like strstart and bufused, inserted at the ellipses, you get a STRING. Adding a vtable (and some other structure members) yields a PMC.

ARENA_DOD_FLAGS ^

Only three flags need to be checked during a DOD run: PObj_live_FLAG, PObj_on_free_list_FLAG, and PObj_is_special_PMC_FLAG. Normally these flags are stored in PObj->flags, meaning that each PMC must be accessed during the DOD run.

An alternative approach is to store the DOD-Flags in the Arenas as a packed array of bits. This approach will be used if the preprocessor variable ARENA_DOD_FLAGS is defined to 1, which happens by default if the system provides a memory alignment primitive such as memalign. In this case the struct Small_Object_Arena is extended with a pointer to the packed bitarray

    struct Small_Object_Arena {
        UINTVAL *dod_flags;
        size_t object_size;
        ...
    };

The memory for this Small_Object_Arena is allocated at the beginning of a large aligned block of memory (currently 4Mib) and the objects in this arena come from this memory block. Therefore the arena of an object can be found by simply masking out the lower bits of the pointer to the object:

    arena = object & ARENA_MASK;

The macro GET_ARENA does exactly this, including the necessary type casts to remove warnings. The dod_flags are accessed by getting the element number in the arena (this is possible because the object_size is fixed and stored in the struct Small_Object_Arena), creating the appropriate bitmask by shifting and accessing the right element of dod_flags[].

    n = (object - arena->start_objects) / arena->object_size;
    arena->dod_flags[n >> ARENA_FLAG_SHIFT] 
        & flag << ((n & ARENA_FLAG_MASK) << 2)

pobj.h provides macros to facilitate referencing individual object flags: DOD_flag_SET, DOD_flag_CLEAR and DOD_flag_TEST. They are also defined in the case without ARENA_DOD_FLAGS, so they make up a portable way of manipulating the DOD-relevant object flags.

Flag packing format

The three object flags consulted during DOD are packed into one 4-bit nibble per object (one bit is currently unused). Since arena->dod_flags[] is an array of native-sized UINTVALs ("words"), the number of nibbles per array entry varies depending on the platform.

To reference a particular flag in a test, set, or clear operation, we require both the word containing the object's flags and a bitmask to isolate the flag.

The flag word for object o is found as follows:

- Get the dod_flags[] array for the object's arena [GET_ARENA(o)->dod_flags]

- Get the index of the object in its arena [n=GET_OBJ_N(arena,o)]

- Determine which word in dod_flags contains the needed flag nibble [(obj index)/(obj flag sets per word) = n>>ARENA_FLAG_SHIFT]

The composite expression to reference the flag word is then GET_ARENA(o)->dod_flags[ GET_OBJ_N(GET_ARENA(o), o) >> ARENA_FLAG_SHIFT ]

The bitmask for the desired flag is constructed as follows:

- Determine the index into the flag word of the object's nibble [(obj index)%(nibbles per word) = n & ARENA_FLAG_MASK]

- Convert nibble index into bit index [(nibble index)*(bits per nibble) = (n & ARENA_FLAG_MASK) << 2]

- Create bitmask for desired flag by shifting base flag by bit index [d_PObj_whatever_FLAG << (nibble index)]

The composite expression to form the bitmask is then (d_PObj_ ## flag ## _FLAG << (( GET_OBJ_N(GET_ARENA(o), o) & ARENA_FLAG_MASK ) << 2))

The DOD_flag_* macros combine these constructions to access the flags.

Variable sized items ^

These items never live alone, they are part of a Buffer structure, described above. They are allocated at bufstart. This is, for example, used to manage the buffer's free list, where bufstart is used as a pointer to the next object.

These items are managed in two different pools: the memory_pool and the constant_string_pool. The former holds all variable sized items, while the latter containing the word "string", holds constant strings only, as we don't have other variable sized constant items to store.

Here, different memory allocation schemes jump in:

Copying GC ^

A memory_pool gets allocated in big blocks, namely a Memory_Block.

When some part is needed, e.g. to store a string, this part is carved out of the memory block, until this block is used up. If no more space is available in this memory block, a garbage collection (GC) is started. This copies all living items of all used memory blocks into one new block, which holds thereafter only used items tightly packed together.

The old memory blocks, containing sparse unused parts and used parts already copied to the new place, are then unused altogether and get free()ed thereafter.

When GC doesn't provide enough free space needed for a new item, a new block is added to the memory pool.

This also implies that buffers are moved around during their life. Users of these buffers are therefore not allowed to hold pointers to buffers over pieces of code that might trigger a GC run, like Parrot_allocate() or string_compare().

Defragmentating allocator ^

An alternative to the above is to use a memory allocator, which is as fast as the above and does reuse free()ed items in a memory conserving way. Actually, the malloc implementations in some systems' libc efficiently provide this, such as the glibc malloc based on Doug Lea's allocator.

Using this allocator, all variable sized items are just allocated via a plain malloc() call, or resized with realloc(), and after they lose their existence (ie when DOD detects that the managing buffer header is no longer in use) they are free()ed. That's all. The underlying allocator collects these pieces, coalesces them if possible to form bigger pieces, and then puts them on free lists, sorted by size. Eventually, when a new allocation request arrives, it may give them back to Parrot.

So here, the variable length memory_pool is unused. You can consider this pool to live inside the allocator.

Buffers allocated this way don't move around, except when reallocated of course.

The Configure.pl option --gc allows one to use either method.

Buffer_Tail and COW ^

Both implementations have the same problem to solve: STRINGs that get copied, or parts of strings as the result of a substr() call, do not allocate new space in the memory pool to hold this string or part of string. They just use a new_string_header() and set up a pointer (strstart) pointing somewhere into the original string and remember the used length there in bufused.

This is all OK, as long as the original and the lazy copy of the string are not modified. So that's well-known and called COW (copy on write).

Now, during GC (or freeing buffers) the problem arises: to whom does this string belong? You shouldn't copy the same string to different places, thus rendering COW to a noop, nor are you allowed to free this same string more then once (your debugger will tell you why...).

Both allocation schemes therefore use a part of the allocated string to do this bookkeeping.

Copying GC uses a Buffer_Tail after the end of the actual variable length string and marks such COW strings with TAIL_moved and stores the new address in the buffer header, so other users of this string can be updated to reuse the same string (XXX one or all other users?).

The malloc()/free() approach stores a refcount at bufstart. During DOD all dead users increment the refcount, living users set it to an huge value. When freeing the buffer, the string is only freed if the refcount reaches zero.

Simplified Figure ^

                             +--------+
      +------------------<---| Arenas |<-----------+
      |                      +--------+-->--+      |
      |                                     |      |
      |         +------+    +-----------+   |  +=============+
      |         | S0   |<---| Registers |<--)--| Interpreter |
      |         +------+    +-----------+   |  +=============+
      |     +---| S1   |                    |
      |     |   +------+                    +----------+
 +-------+  |                                          |
 | Blk 1 |--)-->+----------+    +--------------+   +---------+
 +-------+  |   | Buffer 1 |    | OnestringAno |   | Block 1 |
 | Blk 2 |  |   +----------+    | therstring.. |   +---------+
 +-------+  |   | Buffer 2 |    | ..String...  |<--| Block 2 |
 | .     |  |   +----------+    +--------------+   +---------+
 +-------+  |   | ...      |        ^    ^         | ...     |
 Small Obj  |   +----------+        |    |         +---------+
 Pool       +-->| Buffer N |--------+----+         Memory Pool
                +----------+
                 Buffer          Memory Block

Now consider, that the Interpreter shall be a PMC living in Buffer X of the underlying interpreter and is currently running perldoc docs/memory_internals.pod, and then redo this figure, with all the blanks filled in ;-)

FILES ^

smallobject.[ch], headers.c, resources.[ch], res_lea.c, dod.c, string.[ch], pobj.h.

BUGS ^

All spelling errors belong to those who honestly collected them, as well as all errors related to my abuse of the English language - I'm not natively speaking it.

To minimize this bugs section - please patch this document and keep it up to date - thanks.

AUTHOR ^

Leopold Tötsch <lt@toetsch.at>

VERSION ^

0.1.0 Mar 2004


parrot