
src/string.c - Parrot Strings

This file implements the non-ICU parts of the Parrot string subsystem.
Note that bufstart and buflen are used by the memory subsystem.
The string functions may use buflen only to determine
whether there is any space left beyond bufused.
This is the only valid use of these two data members,
besides setting bufstart/buflen for external strings.

Parrot_unmake_COW

Parrot_make_COW_reference

Parrot_reuse_COW_reference

string_set

Creation, enlargement, etc.

string_init

string_deinit

string_capacity

string_make_empty

string_rep_compatible
ascii <op> utf8 => utf8
=> ascii, if STRING *b has ASCII characters only.

string_append

string_from_cstring

string_primary_encoding_for_representation

const_string

string_make
Creates a Parrot string from len bytes of string data read from buffer.
charset_name specifies the string's representation. The currently recognised values are:
'iso-8859-1'
'ascii'
'binary'
'unicode'
'unicode' implies the utf-8 encoding, and the other three assume fixed-8 encoding.
If charset is unspecified, the default charset 'ascii' will be used.
flags is optionally one or more PObj_* flags OR-ed together.

string_grow

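
The charset rules listed for string_make can be sketched as a small lookup. This is only an illustration in plain C, not Parrot code: the helper name encoding_for_charset is hypothetical, and treating a NULL pointer as "unspecified" is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Parrot: returns the encoding name implied
 * by a charset name, following the rules listed for string_make. */
static const char *encoding_for_charset(const char *charset_name)
{
    /* Assumption: an unspecified charset is modelled as NULL and falls back
     * to the default charset, 'ascii'. */
    if (charset_name == NULL)
        charset_name = "ascii";

    /* 'unicode' implies the utf-8 encoding. */
    if (strcmp(charset_name, "unicode") == 0)
        return "utf-8";

    /* The other three recognised values assume fixed-8 encoding. */
    if (strcmp(charset_name, "iso-8859-1") == 0 ||
        strcmp(charset_name, "ascii") == 0 ||
        strcmp(charset_name, "binary") == 0)
        return "fixed-8";

    return NULL; /* not one of the recognised values */
}
```

With this sketch, encoding_for_charset("unicode") yields "utf-8", while NULL and "binary" both yield "fixed-8".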
string_length

string_index

string_str_index
Finds the second Parrot string in the first, searching from start. The return value is a (0-based) offset in characters, not bytes. If the second string is not specified, then -1 is returned.

string_ord

string_chr

string_copy

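
The difference between a character offset and a byte offset, as noted for string_str_index, matters for variable-width encodings such as utf-8. The following sketch works on plain NUL-terminated UTF-8 C strings rather than Parrot strings; both function names are hypothetical, and the start argument is left out for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Counts the UTF-8 characters in the first nbytes bytes of s by skipping
 * continuation bytes (those of the form 10xxxxxx). */
static long utf8_chars_in(const char *s, size_t nbytes)
{
    long count = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical stand-in for the lookup: returns the 0-based character
 * offset of needle in haystack, or -1 if needle is NULL or not found. */
static long utf8_char_index(const char *haystack, const char *needle)
{
    if (haystack == NULL || needle == NULL)
        return -1;

    const char *hit = strstr(haystack, needle);
    if (hit == NULL)
        return -1;

    /* Convert the byte offset of the match into a character offset. */
    return utf8_chars_in(haystack, (size_t)(hit - haystack));
}
```

For the haystack "h\xC3\xA9llo" (the letter e with an acute accent takes two bytes), the needle "llo" starts at byte offset 3 but at character offset 2, and the sketch returns 2.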
string_compute_strlen

string_max_bytes

string_concat
If either string is NULL, then a copy of the non-NULL string is returned. If both strings are NULL, then a new zero-length string is created and returned.

string_repeat

string_substr
Copies the substring of length length from offset from the specified Parrot string and stores it in **d, allocating memory if necessary. The substring is also returned.

string_replace
substr EXPR, OFFSET, LENGTH, REPLACEMENT
Replaces length characters from offset in the first Parrot string with the second Parrot string, returning what was replaced.

string_chopn
Chops off the last n characters of the specified Parrot string. If n is negative, cuts the string after +n characters. The returned string is a copy of the one passed in.

string_chopn_inplace
Chops off the last n characters of the specified Parrot string. If n is negative, cuts the string after +n characters. The string passed in is modified and returned.

string_equal

make_writable
Makes the specified Parrot string writable, with minimum length len. The representation argument is required in case a new Parrot string has to be created.

nonnull_encoding_name(STRING *s)
[...] real_exception to print the exception message could potentially be null.

string_bitwise_and
Performs a bitwise AND on two Parrot strings, performing type and encoding conversions if necessary. If the third string is not NULL then it is reused, otherwise a new Parrot string is created.

string_bitwise_or
Performs a bitwise OR on two Parrot strings, performing type and encoding conversions if necessary. If the third string is not NULL then it is reused, otherwise a new Parrot string is created.

string_bitwise_xor
Performs a bitwise XOR on two Parrot strings, performing type and encoding conversions if necessary. If the third string is not NULL then it is reused, otherwise a new Parrot string is created.

string_bitwise_not
Performs a bitwise NOT on a Parrot string. If the second string is not NULL then it is reused, otherwise a new Parrot string is created.

string_bool
A string is true if it is anything other than 0, "" or "0".

string_nprintf
Like Parrot_snprintf() except that it writes to and returns a Parrot string.
bytelen does not include space for a (non-existent) trailing '\0'. dest may be a NULL pointer, in which case a new native string will be created. If bytelen is 0, the behaviour becomes more sprintf-like than snprintf-like. bytelen is measured in the encoding of *dest.

string_printf

string_to_int
sign = '+' | '-'
digit = "Any code point considered a digit by the chartype"
indicator = 'e' | 'E'
digits = digit [digit]...
decimal-part = digits '.' [digits] | ['.'] digits
exponent-part = indicator [sign] digits
numeric-string = [sign] decimal-part [exponent-part]
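
The grammar above can be read as a small hand-written recognizer. The sketch below is plain C, not the Parrot implementation: the function names are hypothetical, and it assumes that only the ASCII digits 0-9 count as digits, whereas the grammar leaves that choice to the chartype.

```c
#include <ctype.h>
#include <stddef.h>

/* Advances i past any run of ASCII digits and returns the new index. */
static size_t skip_digits(const char *s, size_t i)
{
    while (isdigit((unsigned char)s[i]))
        i++;
    return i;
}

/* Hypothetical helper: returns the length of the longest prefix of s that
 * matches numeric-string, or 0 if no prefix matches. */
static size_t numeric_prefix_len(const char *s)
{
    size_t i = 0;

    /* [sign] */
    if (s[i] == '+' || s[i] == '-')
        i++;

    /* decimal-part = digits '.' [digits] | ['.'] digits */
    size_t int_end = skip_digits(s, i);
    size_t end;
    if (int_end > i) {
        end = int_end;
        if (s[end] == '.')
            end = skip_digits(s, end + 1);
    } else if (s[i] == '.') {
        end = skip_digits(s, i + 1);
        if (end == i + 1)
            return 0; /* a lone '.' has no digits */
    } else {
        return 0;
    }

    /* [exponent-part] = indicator [sign] digits; taken only if complete */
    if (s[end] == 'e' || s[end] == 'E') {
        size_t j = end + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        size_t exp_end = skip_digits(s, j);
        if (exp_end > j)
            end = exp_end;
    }
    return end;
}
```

Under these assumptions "-.5e+3x" matches on its first six characters, "12." matches in full, and "1e" matches only the leading "1" because the exponent part has no digits.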

string_to_num
Same as string_to_int() except that a floating-point value is returned.

string_from_int

string_from_num

string_to_cstring
Use string_cstring_free() to free the string. Failure to do this will result in a memory leak.

string_cstring_free
Frees a string created by string_to_cstring().

string_pin

string_unpin
Undoes a string_pin() so that the string once again uses managed memory.

string_hash
[...] s->hashval.

string_escape_string
Control characters that string_unescape_cstring can handle are escaped as \x, as well as a double quote character. Other control chars and codepoints < 0x100 are escaped as \xhh, codepoints up to 0xffff as \uhhhh, and codepoints greater than this as \x{hh...hh}.

string_escape_string_delimited

string_unescape_cstring
\xhh 1..2 hex digits
\ooo 1..3 oct digits
\cX control char X
\x{h..h} 1..8 hex digits
\uhhhh 4 hex digits
\Uhhhhhhhh 8 hex digits
\a, \b, \t, \n, \v, \f, \r, \e
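
The escaping rules described for string_escape_string can be sketched per codepoint. This is an illustration in plain C, not the Parrot code. The function name escape_codepoint is hypothetical, and several details are assumptions the text does not settle: the single-letter escapes are taken from the list above, printable ASCII (including the backslash) is passed through unchanged, and hex digits are written in lower case.

```c
#include <stdio.h>

/* Hypothetical helper: writes the escaped form of one codepoint into out
 * (at most outsz bytes, NUL-terminated) and returns snprintf's result. */
static int escape_codepoint(unsigned long cp, char *out, size_t outsz)
{
    /* Control characters with a single-letter escape, plus the double quote. */
    switch (cp) {
    case '\a': return snprintf(out, outsz, "\\a");
    case '\b': return snprintf(out, outsz, "\\b");
    case '\t': return snprintf(out, outsz, "\\t");
    case '\n': return snprintf(out, outsz, "\\n");
    case '\v': return snprintf(out, outsz, "\\v");
    case '\f': return snprintf(out, outsz, "\\f");
    case '\r': return snprintf(out, outsz, "\\r");
    case 0x1B: return snprintf(out, outsz, "\\e");
    case '"':  return snprintf(out, outsz, "\\\"");
    default:   break;
    }

    /* Assumption: printable ASCII is copied through unchanged. */
    if (cp >= 0x20 && cp < 0x7F)
        return snprintf(out, outsz, "%c", (int)cp);

    /* Other control chars and codepoints < 0x100: \xhh */
    if (cp < 0x100)
        return snprintf(out, outsz, "\\x%02lx", cp);

    /* Codepoints up to 0xffff: \uhhhh */
    if (cp <= 0xFFFF)
        return snprintf(out, outsz, "\\u%04lx", cp);

    /* Anything larger: \x{hh...hh} */
    return snprintf(out, outsz, "\\x{%lx}", cp);
}
```

Under these assumptions a newline becomes \n, codepoint 0xE9 becomes \xe9, 0x263A becomes \u263a, and 0x1F600 becomes \x{1f600}.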

string_upcase

string_upcase_inplace

string_downcase

string_downcase_inplace

string_titlecase

string_titlecase_inplace

string_increment

Parrot_string_cstring

Parrot_string_is_cclass
Returns 1 if the codepoint of string s at the given offset is in the given character class flags. See also include/parrot/cclass.h for possible character classes. Returns 0 otherwise, or if the string is empty or NULL.

Parrot_string_trans_charset
If dest == NULL, converts src to the given charset or encoding in place, else returns a copy of src with the charset/encoding in dest.

Parrot_string_trans_encoding
If dest == NULL, converts src to the given charset or encoding in place, else returns a copy of src with the charset/encoding in dest.

uint_to_str
Returns num converted to a Parrot STRING.
base must be defined; a default of 10 is not assumed. The caller has to verify that base >= 2 && base <= 36. The buffer tc must be at least sizeof (UHUGEINTVAL)*8 + 1 chars big.
If minus is true then - is prepended to the string representation.

int_to_str
Returns num converted to a Parrot STRING.
base must be defined; a default of 10 is not assumed.
If num < 0 then - is prepended to the string representation.
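
The buffer requirement for uint_to_str follows from the worst case, base 2, which needs one character per bit plus one for the sign. The sketch below shows the idea in plain C and is not the Parrot implementation: the name uint_to_cstr is hypothetical, unsigned long long stands in for UHUGEINTVAL, lower-case digits are an assumption, and one extra byte is reserved only because the sketch produces a NUL-terminated C string rather than a Parrot STRING.

```c
#include <stddef.h>

/* Room for every binary digit, a '-' sign, and a trailing '\0'. The '\0' is
 * needed only because this sketch produces a C string. */
#define UINT_STR_BUFSIZE (sizeof(unsigned long long) * 8 + 2)

/* Hypothetical stand-in: fills tc (UINT_STR_BUFSIZE bytes) from the end and
 * returns a pointer to the first character of the result. As in the text,
 * the caller must ensure base >= 2 && base <= 36. */
static char *uint_to_cstr(char *tc, unsigned long long num,
                          unsigned base, int minus)
{
    char *p = tc + UINT_STR_BUFSIZE - 1;
    *p = '\0';

    /* Emit digits from least to most significant, moving backwards. */
    do {
        unsigned digit = (unsigned)(num % base);
        *--p = (char)(digit < 10 ? '0' + digit : 'a' + (digit - 10));
        num /= base;
    } while (num != 0);

    if (minus)
        *--p = '-';
    return p;
}
```

For example, with a buffer declared as char tc[UINT_STR_BUFSIZE], converting 255 in base 16 gives "ff", and converting 10 in base 2 with minus set gives "-1010".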