NAME
string.ops - String Opcodes
DESCRIPTION
Operations that work on strings, whether constructing, modifying or examining them.
When making changes to any ops file,
run make bootstrap-ops to regenerate all generated ops files.
- ord(out INT, in STR) The codepoint in the current character set of the first character of string $2 is returned in integer $1. If $2 is empty, an exception is thrown.
- ord(out INT, in STR, in INT) The codepoint in the current character set of the character at integer index $3 of string $2 is returned in integer $1. If $2 is empty, an exception is thrown. If $3 is greater than the length of $2, an exception is thrown. If $3 is less then zero but greater than the negative of the length of $2, counts backwards through $2, such that -1 is the last character, -2 is the second-to-last character, and so on. If $3 is less than the negative of the length of $2, an exception is thrown.
- chr(out STR, in INT) The character specified by codepoint integer $2 is returned in string $1.
- chopn(out STR, in STR, in INT) Remove n characters specified by integer $3 from the tail of string $2, and returns the characters not chopped in string $1. If $3 is negative, cut the string after -$3 characters.
- concat(invar PMC, in STR)
- concat(invar PMC, invar PMC) Modify string $1 in place, appending string $2.
- concat(out STR, in STR, in STR)
- concat(invar PMC, invar PMC, in STR)
- concat(invar PMC, invar PMC, invar PMC) Append string $3 to string $2 and place the result into string $1.
- repeat(out STR, in STR, in INT)
- repeat(invar PMC, invar PMC, in INT)
- repeat(invar PMC, invar PMC, invar PMC) Repeat string $2 integer $3 times and return result in string $1. The
- repeat(invar PMC, in INT)
- repeat(invar PMC, invar PMC) Repeat string $1 number $2 times and return result in string $1. The
- length(out INT, in STR) Calculate the length (in characters) of string $2 and return as integer $1. If $2 is NULL or zero length, zero is returned.
- bytelength(out INT, in STR) Calculate the length (in bytes) of string $2 and return as integer $1. If $2 is NULL or zero length, zero is returned.
- pin(inout STR) Make the memory in string $1 immobile. This memory will not be moved by the Garbage Collector, and may be safely passed to external libraries. (Well, as long as they don't free it) Pinning a string will move the contents.$1 should be unpinned if it is used after pinning is no longer necessary.
- unpin(inout STR) Make the memory in string $1 movable again. This will make the memory in $1 move.
- substr(out STR, in STR, in INT)
- substr(out STR, in STR, in INT, in INT)
- substr(out STR, invar PMC, in INT, in INT) Set $1 to the portion of $2 starting at (zero-based) character position $3 and having length $4. If no length ($4) is provided, it is equivalent to passing in the length of $2.
- replace(out STR, in STR, in INT, in INT, in STR) Replace part of $2 starting from $3 of length $4 with $5. If the length of $5 is different from the length specified in $4, then $2 will grow or shrink accordingly. If $3 is one character position larger than the length of $2, then $5 is appended to $2 (and the empty string is returned); this is essentially the same as
- index(out INT, in STR, in STR)
- index(out INT, in STR, in STR, in INT) The index function searches for a substring within target string, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the position of the first occurrence of substring $3 in target string $2 at or after zero-based position $4. If $4 is omitted, index starts searching from the beginning of the string. The return value is based at "0". If the string is null, or the substring is not found or is null, index returns "-1".
- sprintf(out STR, in STR, invar PMC)
- sprintf(out PMC, invar PMC, invar PMC) Sets $1 to the result of calling
- new(out STR)
- new(out STR, in INT) Allocate a new empty string of length $2 (optional).XXX: Do these ops make sense with immutable strings?
- stringinfo(out INT, in STR, in INT) Extract some information about string $2 and store it in $1. If a null string is passed, $1 is always set to 0. If an invalid $3 is passed, an exception is thrown. Possible values for $3 are:
- 1 The location of the string buffer header.
- 2 The location of the start of the string.
- 3 The length of the string buffer (in bytes).
- 4 The flags attached to the string (if any).
- 5 The amount of the string buffer used (in bytes).
- 6 The length of the string (in characters).
- upcase(out STR, in STR) Uppercase $2 and put the result in $1
- downcase(out STR, in STR) Downcase $2 and put the result in $1
- titlecase(out STR, in STR) Titlecase $2 and put the result in $1
- join(out STR, in STR, invar PMC) Create a new string $1 by joining array elements from array $3 with string $2.
- split(out PMC, in STR, in STR) Create a new Array PMC $1 by splitting the string $3 into pieces delimited by the string $2. If $2 does not appear in $3, then return $3 as the sole element of the Array PMC. Will return empty strings for delimiters at the beginning and end of $3Note: the string $2 is just a string. If you want a perl-ish split on regular expression, use
- encoding(out INT, in STR) Return the encoding number $1 of string $2.
- encodingname(out STR, in INT) Return the name $1 of encoding number $2. If encoding number $2 is not found, name $1 is set to null.
- find_encoding(out INT, in STR) Return the encoding number of the encoding named $2. If the encoding doesn't exist, throw an exception.
- trans_encoding(out STR, in STR, in INT) Create a string $1 from $2 with the specified encoding.Both functions may throw an exception on information loss.
- is_cclass(out INT, in INT, in STR, in INT) Set $1 to 1 if the codepoint of $3 at position $4 is in the character class(es) given by $2.
- find_cclass(out INT, in INT, in STR, in INT, in INT) Set $1 to the offset of the first codepoint matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If no matching character is found, set $1 to (offset + count).
- find_not_cclass(out INT, in INT, in STR, in INT, in INT) Set $1 to the offset of the first codepoint not matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If the substring consists entirely of matching characters, set $1 to (offset + count).
- escape(out STR, invar STR) Escape all non-ascii chars to backslashed escape sequences. A string with charset ascii is created as result.
- compose(out STR, in STR) Compose (normalize) a string.
- find_codepoint(out INT, in STR) Set $1 to the codepoint with the name given in $2, or -1 if there is none. Requires ICU lib, otherwise exception is thrown.
PMC versions are MMD operations.
PMC versions are MMD operations.
concat $2, $5Finally, if $3 is negative, then it is taken to count backwards from the end of the string (ie an offset of -1 corresponds to the last character).New $1 string returned.
Parrot_psprintf with the given format ($2) and arguments ($3, which should be an ordered aggregate PMC).The result is quite similar to using the system sprintf, but is protected against buffer overflows and the like. There are some differences, especially concerning sizes (which are largely ignored); see misc.c for details.
PGE::Util's split from the standard library.
COPYRIGHT
Copyright (C) 2001-2010, Parrot Foundation.
LICENSE
This program is free software. It is subject to the same license as the Parrot interpreter itself.
