NAME ^

src/hash.c - Hash table

DESCRIPTION ^

A hashtable contains an array of bucket indexes. Buckets are nodes in a linked list, each containing a void * key and value. During hash creation, the types of key and value as well as appropriate compare and hashing functions can be set.

This hash implementation uses just one piece of malloced memory. The hash->bs bucket store points to this region.

This hash doesn't move during GC, therefore a lot of the old caveats don't apply.

Functions ^

PARROT_WARN_UNUSED_RESULT static size_t key_hash_STRING(PARROT_INTERP, NOTNULL(STRING *value), size_t seed)

Return the hashed value of the key value. See also string.c.

PARROT_WARN_UNUSED_RESULT static int STRING_compare(PARROT_INTERP, NOTNULL(const void *search_key), NOTNULL(const void *bucket_key))

Compares the two strings, returning 0 if they are identical.

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION static int pointer_compare(SHIM_INTERP, NULLOK(const void *a), NULLOK(const void *b))

Compares the two pointers, returning 0 if they are identical

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION static size_t key_hash_pointer(SHIM_INTERP, NULLOK(void *value), size_t seed)

Returns a hashvalue for a pointer.

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION static size_t key_hash_cstring(SHIM_INTERP, NOTNULL(const void *value), size_t seed)

TODO: Not yet documented!!!

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION static int cstring_compare(SHIM_INTERP, NOTNULL(const char *a), NOTNULL(const char *b))

C string versions of the key_hash and compare functions.

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION size_t key_hash_int(SHIM_INTERP, NULLOK(void *value), size_t seed)

Custom key_hash function.

PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION int int_compare(SHIM_INTERP, NULLOK(const void *a), NULLOK(const void *b))

Custom compare function.

PARROT_API void parrot_dump_hash(SHIM_INTERP, NOTNULL(const Hash *hash))

Print out the hash in human-readable form. Except it's empty.

PARROT_API void parrot_mark_hash(PARROT_INTERP, NOTNULL(Hash *hash))

Marks the hash and its contents as live.

static void hash_thaw(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(visit_info *info))

This is used by freeze/thaw to visit the contents of the hash.

pinfo is the visit info, (see include/parrot/pmc_freeze.h>).

static void hash_freeze(PARROT_INTERP, NOTNULL(const Hash *const hash), NOTNULL(visit_info *info))

TODO: Not yet documented!!!

PARROT_API void parrot_hash_visit(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(void *pinfo))

TODO: Not yet documented!!!

static void expand_hash(PARROT_INTERP, NOTNULL(Hash *hash))

For a hashtable of size N, we use MAXFULL_PERCENT % of N as the number of buckets. This way, as soon as we run out of buckets on the free list, we know that it's time to resize the hashtable.

Algorithm for expansion: We exactly double the size of the hashtable. Keys are assigned to buckets with the formula

        bucket_index = hash(key) % parrot_hash_size
so when doubling the size of the hashtable, we know that every key is either already in the correct bucket, or belongs in the current bucket plus parrot_hash_size (the old parrot_hash_size). In fact, because the hashtable is always a power of two in size, it depends only on the next bit in the hash value, after the ones previously used.

So we scan through all the buckets in order, moving the buckets that need to be moved. No bucket will be scanned twice, and the cache should be reasonably happy because the hashtable accesses will be two parallel sequential scans. (Of course, this also mucks with the ->next pointers, and they'll be all over memory.)

PARROT_API void parrot_new_hash(SHIM_INTERP, NOTNULL(Hash **hptr))

Returns a new Parrot STRING hash in hptr.

PARROT_API void parrot_new_pmc_hash(PARROT_INTERP, NOTNULL(PMC *container))

Create a new Parrot STRING hash in PMC_struct_val(container)

PARROT_API void parrot_new_cstring_hash(SHIM_INTERP, NOTNULL(Hash **hptr))

Returns a new C string hash in hptr.

static void init_hash(NOTNULL(Hash *hash), PARROT_DATA_TYPE val_type, Hash_key_type hkey_type, hash_comp_fn compare, hash_hash_key_fn keyhash)

TODO: Not yet documented!!!

PARROT_API void parrot_hash_destroy(SHIM_INTERP, NOTNULL(Hash *hash))

TODO: Not yet documented!!!

void parrot_chash_destroy(PARROT_INTERP, NOTNULL(Hash *hash))

TODO: Not yet documented!!!

void parrot_new_hash_x(NOTNULL(Hash **hptr), PARROT_DATA_TYPE val_type, Hash_key_type hkey_type, hash_comp_fn compare, hash_hash_key_fn keyhash)

Returns a new hash in hptr.

FIXME: This function can go back to just returning the hash struct pointer once Buffers can define their own custom mark routines.

The problem is: During DODs stack walking the item on the stack must be a PMC. When an auto Hash* is seen, it doesn't get properly marked (only the Hash* buffer is marked, not its contents). By passing the **hptr up to the Hash's init function, the newly constructed PMC is on the stack including this newly constructed Hash, so that it gets marked properly.

void parrot_new_pmc_hash_x(SHIM_INTERP, NOTNULL(PMC *container), PARROT_DATA_TYPE val_type, Hash_key_type hkey_type, hash_comp_fn compare, hash_hash_key_fn keyhash)

Like parrot_new_hash_x but w/o the described problems. The passed in container PMC gets stored in the Hash end the newly created Hash is in PMC_struct_val(container).

PARROT_API void parrot_new_pointer_hash(SHIM_INTERP, NOTNULL(Hash **hptr))

Create a new HASH with void * keys and values.

PARROT_API PARROT_WARN_UNUSED_RESULT PARROT_CANNOT_RETURN_NULL PMC *Parrot_new_INTVAL_hash(PARROT_INTERP, UINTVAL flags)

Create a new Hash PMC with INTVAL keys and values. flags can be PObj_constant_FLAG or 0.

PARROT_API PARROT_WARN_UNUSED_RESULT PARROT_PURE_FUNCTION INTVAL parrot_hash_size(PARROT_INTERP, NOTNULL(const Hash *hash))

Return the number of used entries in the hash.

PARROT_API PARROT_WARN_UNUSED_RESULT PARROT_CAN_RETURN_NULL void *parrot_hash_get_idx(SHIM_INTERP, NOTNULL(const Hash *hash), NOTNULL(PMC *key))

Called by iterator.

PARROT_API PARROT_WARN_UNUSED_RESULT PARROT_CAN_RETURN_NULL HashBucket *parrot_hash_get_bucket(PARROT_INTERP, NOTNULL(const Hash *hash), NOTNULL(void *key))

Returns the bucket for key.

PARROT_API PARROT_WARN_UNUSED_RESULT PARROT_CAN_RETURN_NULL void *parrot_hash_get(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(void *key))

Returns the value keyed by key or NULL if no bucket is found.

PARROT_API PARROT_WARN_UNUSED_RESULT INTVAL parrot_hash_exists(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(void *key))

Returns whether the key exists in the hash.

PARROT_API PARROT_IGNORABLE_RESULT PARROT_CANNOT_RETURN_NULL HashBucket *parrot_hash_put(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(void *key), NULLOK(void *value))

Puts the key and value into the hash. Note that key is not copied.

PARROT_API void parrot_hash_delete(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(void *key))

Deletes the key from the hash.

PARROT_API void parrot_hash_clone(PARROT_INTERP, NOTNULL(Hash *hash), NOTNULL(Hash *dest))

Clones hash to dest.

SEE ALSO ^

docs/pdds/pdd08_keys.pod.

TODO ^

Future optimizations:


parrot