NAME ^

src/intlists.c - Regex stack handling routines

DESCRIPTION ^

intlist emulation code, calls routines in src/list.c.

Here is the original documentation for intlist:

The basic data structure is a variant of a doubly-linked list of 'chunks', where a chunk is a Buffer header subclass containing the link pointers and other metadata for the chunk. As expected from it being a Buffer header, the PObj_bufstart field points to the actual array of INTVALs. The handle used by external code for one of these IntLists is just a pointer to a chunk, always called 'list' in this code.

For now, all of the chunks are fixed-length in size. (That could easily be changed, at the cost of another integer in each header.)

Notice that I said 'variant' of a doubly-linked list. That is because if you start at 'list' and follow prev pointers, you will loop through all the used nodes of the list, as usual. But if you follow next pointers instead, you might find a spare node hanging off the last node in the list (the last node is always list->prev, so if there is a spare node, it will be at list->prev->next. If no spare exists, then list->prev->next==list.)

The first node in the list may be partly full; the intermediate nodes are always completely full; and the last node may be partly full. Each node has a .start field, giving the offset of the first valid element (always zero except for possibly the first node), and a .end field, giving one past the offset of the last valid element (always equal to INTLIST_CHUNK_SIZE except for possibly the last node).

To make it concrete, let's walk through some sample operations. To push onto the end of the list, first find the last chunk: list->prev. Then if chunk->end < INTLIST_CHUNK_SIZE, there is space to fit another element and so just stick it in. If not, we must add a chunk to the end of the list. If there is a spare, just link it fully into the list (forming a conventional doubly-linked list). Otherwise, create a new chunk and link it fully into the list. Easy enough.

To pop something off the end, first go to the end chunk (list->prev). Pop off an element and decrement .end if the chunk is nonempty. If it is empty, make that last chunk into the spare (discarding the previous spare). Then go to the previous chunk, which is guaranteed to have .end set to INTLIST_CHUNK_SIZE, and return data[.end--].

The length of the list is always cached in the overall header chunk. If an operation changes which chunk is the header (i.e., shift or unshift), then the length is copied to the new header.

Invariants:

There is always space in list->prev to insert an element.

The 'list' chunk is never empty unless the entire list is empty.

In combination, the above invariants imply that the various operations are implemented as:

push

Write element, push a new chunk if necessary.

pop

Check to see if we have to back up a chunk, read element.

shift

Read element, discard chunk and advance if necessary.

unshift

Unshift a chunk if necessary, write element.

Direct aka indexed access of intlist data:

The classic method would be to walk the intlist->next pointers (or optimized, the ->prev pointers if an index near the end is requested) and locate the chunk, that holds the wanted list item.

To speed things up, especially for bigger lists, there are additional fields in the 'list' (the head chunk):

chunk_list

Holds pointers to individual chunks.

collect_runs

collect_runs counter, when chunk_list was rebuilt last.

n_chunks

Used length in chunk_list

If on any indexed access interpreter's collect_runs is different, the chunks might have been moved, so the chunk_list has to be rebuilt.

Getting data outside the array dimensions will return the value NULL, which will SIGSEGV, the intlist did an explicit exception, so there is not much difference. Of course, a check for valid pointers could be added here.

Functions ^

void intlist_mark(PARROT_INTERP, NOTNULL(IntList *l))

Marks the list as live.

PARROT_MALLOC PARROT_CANNOT_RETURN_NULL IntList *intlist_clone(PARROT_INTERP, ARGIN(const IntList *list))

Returns a clone of the list.

PARROT_MALLOC PARROT_CANNOT_RETURN_NULL IntList *intlist_new(PARROT_INTERP)

Returns a new list.

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION INTVAL intlist_length(SHIM_INTERP, ARGIN(const IntList *list))

Returns the length of the list.

void intlist_assign(PARROT_INTERP, NOTNULL(IntList *l), INTVAL idx, INTVAL val)

Assigns <val> to the item at idx.

void intlist_push(PARROT_INTERP, NOTNULL(IntList *l), INTVAL val)

Pushes val on the end of the list.

void intlist_unshift(PARROT_INTERP, NOTNULL(IntList **l), INTVAL val)

Pushes val on the front of the list.

INTVAL intlist_pop(PARROT_INTERP, NOTNULL(IntList *l))

Popping/shifting into a sparse hole returns 0.

INTVAL intlist_shift(PARROT_INTERP, NOTNULL(IntList **l))

Removes and returns the first item on the list.

PARROT_WARN_UNUSED_RESULT INTVAL intlist_get(PARROT_INTERP, NOTNULL(IntList *list), INTVAL idx)

Returns the item at idx.

void intlist_dump(NOTNULL(FILE *fp), NOTNULL(IntList *list), int verbose)

Prints out the list in human-readable form.

SEE ALSO ^

include/parrot/intlist.h, src/list.c and include/parrot/list.h.


parrot