string.ops - String Operations
Operations that work on strings, whether constructing, modifying or examining them.
- ord(out INT, in STR)
- The codepoint in the current character set of the first character of string $2 is returned in integer $1. If $2 is empty, an exception is thrown.
- ord(out INT, in STR, in INT)
- The codepoint in the current character set of the character at integer index $3 of string $2 is returned in integer $1. If $2 is empty, an exception is thrown. If $3 is greater than the length of $2, an exception is thrown. If $3 is less than zero but greater than the negative of the length of $2, the index counts backwards through $2, such that -1 is the last character, -2 is the second-to-last character, and so on. If $3 is less than the negative of the length of $2, an exception is thrown.
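The index rules above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Parrot's implementation: the helper name `ord_at` and the use of `IndexError` for the thrown exception are assumptions made for illustration. The boundary cases the text leaves open (an index equal to the length, or equal to its negative) are handled the way Python indexing handles them, which is also an assumption.

```python
def ord_at(s, index=0):
    """Model of ord $1, $2[, $3] as described above (hypothetical helper)."""
    n = len(s)
    if n == 0:
        raise IndexError("cannot take ord of an empty string")
    # Negative indexes count back from the end: -1 is the last character.
    pos = index + n if index < 0 else index
    if pos < 0 or pos >= n:
        raise IndexError("index out of range")
    return ord(s[pos])
```

For example, `ord_at("abc")` gives the codepoint of "a", and `ord_at("abc", -1)` gives the codepoint of "c".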
- chr(out STR, in INT)
- The character specified by codepoint integer $2 in the current character set is returned in string $1.
- chopn(inout STR, in INT)
- Remove the number of characters given by integer $2 from the tail of string $1. If $2 is negative, cut the string after -$2 characters.
- chopn(out STR, in STR, in INT)
- Remove the number of characters given by integer $3 from the tail of string $2, and return the characters not chopped in string $1. If $3 is negative, cut the string after -$3 characters.
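The two cases of chopn (positive and negative count) can be sketched in Python as follows. The helper name `chopn` is reused here purely for illustration, and the behaviour when the count exceeds the string length (the result is the empty string) is an assumption, since the text does not state it.

```python
def chopn(s, n):
    """Model of chopn $1, $2, $3: return the characters not chopped."""
    if n >= 0:
        # Drop n characters from the tail.
        # Assumption: chopping more than the length leaves "".
        return s[:max(len(s) - n, 0)]
    # Negative n: cut the string after -n characters,
    # i.e. keep only the first -n characters.
    return s[:-n]
```

With this model, `chopn("abcdef", 2)` keeps "abcd", while `chopn("abcdef", -2)` keeps "ab".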
- concat(inout STR, in STR)
- concat(invar PMC, in STR)
- concat(invar PMC, invar PMC)
- Modify string $1 in place, appending string $2.
- concat(out STR, in STR, in STR)
- concat(invar PMC, invar PMC, in STR)
- concat(invar PMC, invar PMC, invar PMC)
- Append string $3 to string $2 and place the result into string $1.
- repeat(out STR, in STR, in INT)
- repeat(invar PMC, invar PMC, in INT)
- repeat(invar PMC, invar PMC, invar PMC)
- Repeat string $2 integer $3 times and return the result in string $1. The PMC versions are MMD (multi-method dispatch) operations.
- repeat(invar PMC, in INT)
- repeat(invar PMC, invar PMC)
- Repeat string $1 $2 times and store the result back in $1. The PMC versions are MMD operations.
- length(out INT, in STR)
- Calculate the length (in characters) of string $2 and return as integer $1. If $2 is NULL or zero length, zero is returned.
- bytelength(out INT, in STR)
- Calculate the length (in bytes) of string $2 and return as integer $1. If $2 is NULL or zero length, zero is returned.
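The difference between the two lengths only shows up for strings containing multi-byte characters. The Python snippet below illustrates the idea under the assumption that the string is stored as UTF-8; in Parrot the byte length depends on the string's actual encoding.

```python
s = "na\u00efve"  # "naive" with a diaeresis on the i (U+00EF)

# Length in characters (codepoints), analogous to the length op.
char_length = len(s)

# Length in bytes, analogous to bytelength, assuming UTF-8 storage.
byte_length = len(s.encode("utf-8"))
```

Here the character length is 5 and the byte length is 6, because U+00EF occupies two bytes in UTF-8.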
- pin(inout STR)
- Make the memory in string $1 immobile. This memory will not be moved by the garbage collector, and may be safely passed to external libraries (as long as they don't free it). Pinning a string will move the contents.
- If $1 is still used after pinning is no longer necessary, it should be unpinned.
- unpin(inout STR)
- Make the memory in string $1 movable again. This will make the memory in $1 move.
- substr(out STR, in STR, in INT)
- substr(out STR, in STR, in INT, in INT)
- substr(out STR, inout STR, in INT, in INT, in STR)
- substr(inout STR, in INT, in INT, in STR)
- substr(out STR, invar PMC, in INT, in INT)
- Set $1 to the portion of $2 starting at (zero-based) character position $3 and having length $4. If no length ($4) is provided, it is equivalent to passing in the length of $2. This creates a COW copy of $2.
- Optionally pass in string $5 for replacement. If the length of $5 is different from the length specified in $4, then $2 will grow or shrink accordingly. If $3 is one character position larger than the length of $2, then $5 is appended to $2 (and the empty string is returned); this is essentially the same as concat $2, $5.
- Finally, if $3 is negative, then it is taken to count backwards from the end of the string (i.e. an offset of -1 corresponds to the last character).
- The form substr(inout STR, in INT, in INT, in STR) is optimized for replace-only use: it ignores the replaced substring and so does not waste a register on it.
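The extraction and replacement rules for substr can be modelled roughly as below. This is a sketch of the documented behaviour only. The helper returns a pair (extracted substring, resulting string) because Python strings are immutable. The range check, the `IndexError`, and the reading of "one character position larger than the length" as an offset equal to the length are all assumptions.

```python
def substr(s, offset, length=None, replacement=None):
    """Model of substr: returns (extracted, new_s). Hypothetical helper."""
    n = len(s)
    if offset < 0:
        offset += n          # -1 refers to the last character
    if not 0 <= offset <= n:
        raise IndexError("offset out of range")  # assumed behaviour
    if length is None:
        length = n           # documented default: the length of the string
    extracted = s[offset:offset + length]
    if replacement is None:
        return extracted, s
    # The string grows or shrinks when the replacement length differs.
    return extracted, s[:offset] + replacement + s[offset + length:]
```

For instance, `substr("hello world", 0, 5, "howdy,")` yields `("hello", "howdy, world")`, and `substr("abc", 3, 0, "def")` yields `("", "abcdef")`, matching the append case described above.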
- index(out INT, in STR, in STR)
- index(out INT, in STR, in STR, in INT)
- The index function searches for a substring within a target string, without the wildcard-like behavior of a full regular-expression pattern match. It returns the zero-based position of the first occurrence of substring $3 in target string $2 at or after zero-based position $4. If $4 is omitted, index starts searching from the beginning of the string. If the string is null, or the substring is not found or is null, index returns -1.
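A rough Python model of this search is shown below. It uses `None` to stand in for a null string, which is an assumption, and it delegates the search to `str.find`. Empty substrings and negative start positions follow Python's rules here and may not match Parrot.

```python
def index(target, sub, start=0):
    """Model of index $1, $2, $3[, $4] (hypothetical helper)."""
    # A null target or null substring yields -1, as documented.
    if target is None or sub is None:
        return -1
    # str.find returns the zero-based position, or -1 when not found.
    return target.find(sub, start)
```

For example, `index("hello world", "o")` returns 4, and `index("hello world", "o", 5)` returns 7.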
- sprintf(out STR, in STR, invar PMC)
- sprintf(out PMC, invar PMC, invar PMC)
- Sets $1 to the result of calling Parrot_psprintf with the given format ($2) and arguments ($3, which should be an ordered aggregate PMC). In the (unimplemented) versions that don't include $3, arguments are popped off the user stack.
- The result is quite similar to using the system sprintf, but is protected against buffer overflows and the like. There are some differences, especially concerning sizes (which are largely ignored); see misc.c for details.
- new(out STR)
- new(out STR, in INT)
- Allocate a new empty string, of length $2 (optional), encoding $3 (optional) and type $4 (optional).
- stringinfo(out INT, in STR, in INT)
- Extract some information about string $2 and store it in $1. If a null string is passed, $1 is always set to 0. If an invalid $3 is passed, an exception is thrown. Possible values for $3 are:
- 1 The location of the string buffer header.
- 2 The location of the start of the string.
- 3 The length of the string buffer (in bytes).
- 4 The flags attached to the string (if any).
- 5 The amount of the string buffer used (in bytes).
- 6 The length of the string (in characters).
- upcase(out STR, in STR)
- Uppercase $2 and put the result in $1.
- upcase(inout STR)
- Uppercase $1 in place.
- downcase(out STR, in STR)
- Downcase $2 and put the result in $1.
- downcase(inout STR)
- Downcase $1 in place.
- titlecase(out STR, in STR)
- Titlecase $2 and put the result in $1.
- titlecase(inout STR)
- Titlecase $1 in place.
- join(out STR, in STR, invar PMC)
- Create a new string $1 by joining array elements from array $3 with string $2.
- split(out PMC, in STR, in STR)
- Create a new Array PMC $1 by splitting the string $3 into pieces delimited by the string $2. If $2 does not appear in $3, then $3 is returned as the sole element of the Array PMC. Delimiters at the beginning and end of $3 produce empty strings.
- Note: the string $2 is just a string. If you want a Perl-ish split on a regular expression, use PGE::Util's split from the standard library.
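Python's `str.split` with an explicit delimiter happens to follow the same rules described for split (a literal delimiter, empty strings for leading and trailing delimiters, and the whole string as the sole element when the delimiter is absent), so it can serve as a quick model. The wrapper name and argument order below mirror the op and are otherwise hypothetical. Note that Python rejects an empty delimiter, a case this document does not cover.

```python
def split(delimiter, text):
    """Model of split $1, $2, $3: $2 is the delimiter, $3 the string."""
    return text.split(delimiter)

# Delimiters at both ends produce empty strings in the result.
pieces = split(",", ",a,b,")

# join is the inverse operation: join $1, $2, $3 with $2 as separator.
rejoined = ",".join(pieces)
```

Here `pieces` is `["", "a", "b", ""]` and `rejoined` is the original string again.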
- charset(out INT, in STR)
- Return the charset number $1 of string $2.
- charsetname(out STR, in INT)
- Return the name $1 of charset number $2. If charset number $2 is not found, name $1 is set to null.
- find_charset(out INT, in STR)
- Return the charset number of the charset named $2. If the charset doesn't exist, throw an exception.
- trans_charset(inout STR, in INT)
- Change the string to have the specified charset.
- trans_charset(out STR, in STR, in INT)
- Create a string $1 from $2 with the specified charset.
- Both functions may throw an exception on information loss.
- encoding(out INT, in STR)
- Return the encoding number $1 of string $2.
- encodingname(out STR, in INT)
- Return the name $1 of encoding number $2. If encoding number $2 is not found, name $1 is set to null.
- find_encoding(out INT, in STR)
- Return the encoding number of the encoding named $2. If the encoding doesn't exist, throw an exception.
- trans_encoding(inout STR, in INT)
- Change the string to have the specified encoding.
- trans_encoding(out STR, in STR, in INT)
- Create a string $1 from $2 with the specified encoding.
- Both functions may throw an exception on information loss.
- is_cclass(out INT, in INT, in STR, in INT)
- Set $1 to 1 if the codepoint of $3 at position $4 is in the character class(es) given by $2.
- find_cclass(out INT, in INT, in STR, in INT, in INT)
- Set $1 to the offset of the first codepoint matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If no matching character is found, set $1 to (offset + count).
- find_not_cclass(out INT, in INT, in STR, in INT, in INT)
- Set $1 to the offset of the first codepoint not matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If the substring consists entirely of matching characters, set $1 to (offset + count).
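The scanning behaviour of find_cclass and find_not_cclass can be sketched as below. Since the numeric character-class flags are not listed here, the sketch takes a Python predicate such as `str.isdigit` in place of $2; that substitution, the helper names, and the handling of a range that runs past the end of the string are assumptions.

```python
def find_cclass(is_member, s, offset, count, negate=False):
    """Model of find_cclass; with negate=True, of find_not_cclass."""
    # Never scan past the end of the string (assumed behaviour).
    end = min(offset + count, len(s))
    for pos in range(offset, end):
        # Stop at the first character whose membership is what we want.
        if is_member(s[pos]) != negate:
            return pos
    # Nothing found: the documented result is offset + count.
    return offset + count

def find_not_cclass(is_member, s, offset, count):
    return find_cclass(is_member, s, offset, count, negate=True)
```

For example, `find_cclass(str.isdigit, "abc123", 0, 6)` returns 3, and `find_not_cclass(str.isalpha, "abc123", 0, 6)` also returns 3.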
- escape(out STR, invar STR)
- Escape all non-ASCII characters to backslashed escape sequences. The result is a string with charset ascii.
- compose(out STR, in STR)
- Compose (normalize) a string.
Copyright (C) 2001-2008, The Perl Foundation.
This program is free software. It is subject to the same license as the Parrot interpreter itself.