string.ops - String Operations
Operations that work on strings, whether constructing, modifying or examining them.
See also rx.ops.
- ord(out INT, in STR)
- Two-argument form returns the numeric value of the 0th character of string $2 in register $1. If $2 is empty, throws an exception.
- ord(out INT, in STR, in INT)
- Three-argument form returns the numeric value of character $3 of string $2 in register $1. If $2 is empty, throws an exception. If $3 is greater than the length of string $2, throws an exception. If $3 is less than zero but greater than the negative of the length, counts backwards through the string, such that -1 is the last character, -2 is the second-to-last character, and so on. If $3 is less than the negative of the length, throws an exception.
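The indexing rules for ord can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not Parrot code: the function name ord_at and the exception types are hypothetical, and it assumes that a position equal to the length is out of range and that the negative of the length addresses the first character, neither of which the text states explicitly.

```python
def ord_at(s: str, index: int = 0) -> int:
    """Model of ord: numeric value of the character at `index` in `s`."""
    if not s:
        # Both forms throw on an empty string.
        raise ValueError("cannot take ord of an empty string")
    # Assumption: index == len(s) is out of range, index == -len(s) is valid.
    if index >= len(s) or index < -len(s):
        raise IndexError("character position out of range")
    # Negative positions count backwards: -1 is the last character.
    return ord(s[index])


if __name__ == "__main__":
    print(ord_at("abc"))      # two-argument form: position 0
    print(ord_at("abc", -1))  # last character
```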
- chr(out STR, in INT)
- Returns the character specified by the $2 number.
- chopn(inout STR, in INT)
- Removes $2 characters from the end of the string in $1. If $2 is negative, cuts the string after -$2 characters.
- chopn(out STR, in STR, in INT)
- Removes $3 characters from the end of the string in $2 and returns the result in $1. If $3 is negative, cuts the string after -$3 characters.
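The two sign conventions of chopn are easy to confuse, so here is a small Python model of them. It is a sketch of the description above rather than Parrot code. The text does not say what happens when the count exceeds the string length; the model assumes the result is then the empty string.

```python
def chopn(s: str, n: int) -> str:
    """Model of the three-argument chopn: returns the shortened string."""
    if n < 0:
        # Negative count: keep the first -n characters.
        return s[:-n]
    # Non-negative count: drop n characters from the end.
    # Assumption: chopping more than the length yields the empty string.
    return s[:max(0, len(s) - n)]


if __name__ == "__main__":
    print(chopn("hello", 2))   # drops the last two characters
    print(chopn("hello", -2))  # keeps the first two characters
```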
- concat(inout STR, in STR)
- Appends the string in $2 to the string in $1.
- concat(out STR, in STR, in STR)
- Appends the string in $3 to the string in $2 and places the result in $1.
- repeat(out STR, in STR, in INT)
- Repeats string $2 $3 times and stores the result in $1.
- length(out INT, in STR)
- Set $1 to the length (in characters) of the string in $2.
- bytelength(out INT, in STR)
- Set $1 to the length (in bytes) of the string in $2.
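The difference between length and bytelength only shows up for strings whose encoding uses more than one byte per character. The Python sketch below illustrates the distinction; it assumes UTF-8 as the encoding, which is an illustrative choice and not something the text specifies for Parrot strings.

```python
def length(s: str) -> int:
    """Model of length: number of characters."""
    return len(s)


def bytelength(s: str, encoding: str = "utf-8") -> int:
    """Model of bytelength: number of bytes in the encoded form.

    Assumption: the string is stored as UTF-8; other encodings give
    different byte counts for the same characters.
    """
    return len(s.encode(encoding))


if __name__ == "__main__":
    word = "na\u00efve"  # contains one non-ASCII character
    print(length(word), bytelength(word))
```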
- pin(inout STR)
- Make the memory in $1 immobile. This memory will not be moved by the GC, and may be safely passed to external libraries (as long as they don't free it). Pinning a string will move the contents.
- The memory need only be unpinned if you plan on using it for any length of time after its pinning is no longer necessary.
- unpin(inout STR)
- Make the memory in $1 movable again. This will make the memory in $1 move.
- substr(out STR, in STR, in INT)
- substr(out STR, in STR, in INT, in INT)
- substr_r(out STR, in STR, in INT, in INT)
- substr(out STR, inout STR, in INT, in INT, in STR)
- substr(inout STR, in INT, in INT, in STR)
- substr(out STR, in PMC, in INT, in INT)
- Set $1 to the portion of $2 starting at (zero-based) character position $3 and having length $4. If no length ($4) is provided, it is equivalent to passing in the length of $2.
- Optionally pass in string $5 for replacement. If the length of $5 is different from the length specified in $4, then $2 will grow or shrink accordingly. If $3 is one character position larger than the length of $2, then $5 is appended to $2 (and the empty string is returned); this is essentially the same as
concat $2, $5
- Finally, if $3 is negative, then it is taken to count backwards from the end of the string (i.e., an offset of -1 corresponds to the last character).
- The third form is optimized for replacement only: it ignores the replaced substring and does not waste a register on it.
- The _r variants reuse an existing string header and therefore normally do not create a new string in the destination register.
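The extraction and replacement rules for substr can be summarised with a Python model. This is a sketch under stated assumptions, not the Parrot implementation: the function returns a pair (extracted, updated source) instead of writing registers, it reads "one character position larger than the length" as an offset equal to the length, it clamps a length that runs past the end of the string, and it does not model negative lengths.

```python
from typing import Optional, Tuple


def substr(s: str, offset: int, length: Optional[int] = None,
           replacement: Optional[str] = None) -> Tuple[str, str]:
    """Model of substr. Returns (extracted substring, updated source)."""
    if offset < 0:
        # Negative offsets count backwards from the end of the string.
        offset += len(s)
    if offset < 0 or offset > len(s):
        raise IndexError("offset out of range")
    if length is None:
        # Omitting the length is equivalent to passing the length of s.
        length = len(s)
    # Assumption: a length running past the end is clamped.
    end = min(offset + length, len(s))
    extracted = s[offset:end]
    if replacement is None:
        return extracted, s
    # The source grows or shrinks when the replacement length differs.
    return extracted, s[:offset] + replacement + s[end:]


if __name__ == "__main__":
    print(substr("Hello, world", 7))
    print(substr("abcdef", 1, 2, "XYZ"))
    print(substr("abc", 3, 0, "d"))  # offset at the end: behaves like concat
```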
- index(out INT, in STR, in STR)
- index(out INT, in STR, in STR, in INT)
- The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the position of the first occurrence of $3 in $2 at or after $4. If $4 is omitted, the search starts from the beginning of the string. Positions are zero-based. If the substring is not found, returns -1.
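Python's str.find happens to follow the same conventions as the description above (zero-based positions, -1 when nothing is found), so it can serve as a quick model. The wrapper below is only an illustration; it rejects negative start positions because the text does not say how index treats them.

```python
def index(haystack: str, needle: str, start: int = 0) -> int:
    """Model of index: first position of `needle` at or after `start`."""
    if start < 0:
        # Assumption: negative start positions are not modelled here.
        raise ValueError("start position must be non-negative in this model")
    # str.find returns -1 when the needle does not occur.
    return haystack.find(needle, start)


if __name__ == "__main__":
    print(index("hello world", "o"))
    print(index("hello world", "o", 5))
    print(index("hello world", "z"))
```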
- sprintf(out STR, in STR, in PMC)
- sprintf(out PMC, in PMC, in PMC)
- sprintf(out STR, in STR) [unimplemented]
- sprintf(out PMC, in PMC) [unimplemented]
- Sets $1 to the result of calling Parrot_psprintf with the given format ($2) and arguments ($3, which should be an ordered aggregate PMC). In the (unimplemented) versions that don't include $3, arguments are popped off the user stack.
- The result is quite similar to using the system sprintf, but is protected against buffer overflows and the like. There are some differences, especially concerning sizes (which are largely ignored); see misc.c for details.
- new(out STR)
- new(out STR, in INT)
- Allocate a new empty string, of length $2 (optional), encoding $3 (optional) and type $4 (optional).
- stringinfo(out INT, in STR, in INT)
- Extract some information about string $2 and store it in $1. Possible values for $3 are:
- 1 The location of the string buffer header.
- 2 The location of the start of the string.
- 3 The length of the string buffer (in bytes).
- 4 The flags attached to the string (if any).
- 5 The amount of the string buffer used (in bytes).
- 6 The length of the string (in characters).
- upcase(out STR, in STR)
- Uppercase $2 and put the result in $1.
- upcase(inout STR)
- Uppercase $1 in place.
- downcase(out STR, in STR)
- Downcase $2 and put the result in $1.
- downcase(inout STR)
- Downcase $1 in place.
- titlecase(out STR, in STR)
- Titlecase $2 and put the result in $1.
- titlecase(inout STR)
- Titlecase $1 in place.
- join(out STR, in STR, in PMC)
- Create a new string $1 by joining array elements from array $3 with string $2.
- split(out PMC, in STR, in STR)
- Create a new Array PMC $1 by splitting the string $3 into pieces delimited by the string $2. If $2 does not appear in $3, then $3 is returned as the sole element of the Array PMC. Delimiters at the beginning or end of $3 produce empty strings in the result.
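The edge cases of split (no delimiter present, delimiters at either end) match what Python's str.split does with an explicit separator, so the following sketch can be used to check expectations. It keeps the delimiter-first argument order of the ops and uses a plain Python list in place of an Array PMC. It covers non-empty delimiters only, since the text does not describe splitting on an empty string.

```python
from typing import List


def split(delim: str, s: str) -> List[str]:
    """Model of split: pieces of `s` delimited by `delim`."""
    if not delim:
        # Assumption: empty delimiters are outside the scope of this model.
        raise ValueError("empty delimiter is not modelled")
    return s.split(delim)


def join(delim: str, items: List[str]) -> str:
    """Model of join: elements of `items` joined with `delim`."""
    return delim.join(items)


if __name__ == "__main__":
    print(split(",", "a,b,c"))
    print(split(",", "abc"))   # delimiter absent: one element
    print(split(",", ",a,"))   # leading and trailing empty strings
    print(join("-", ["a", "b", "c"]))
```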
- charset(out INT, in STR)
- Return the charset number of string $2.
- charsetname(out STR, in INT)
- Return the name of charset numbered $2.
- find_charset(out INT, in STR)
- Return the charset number of the charset named $2. If the charset doesn't exist, throw an exception.
- trans_charset(inout STR, in INT)
- Change the string to have the specified charset.
- trans_charset(out STR, in STR, in INT)
- Create a string $1 from $2 with the specified charset.
- Both functions may throw an exception on information loss.
- encoding(out INT, in STR)
- Return the encoding number of string $2.
- encodingname(out STR, in INT)
- Return the name of encoding numbered $2.
- find_encoding(out INT, in STR)
- Return the encoding number of the encoding named $2. If the encoding doesn't exist, throw an exception.
- trans_encoding(inout STR, in INT)
- Change the string to have the specified encoding.
- trans_encoding(out STR, in STR, in INT)
- Create a string $1 from $2 with the specified encoding.
- Both functions may throw an exception on information loss.
- is_cclass(out INT, in INT, in STR, in INT)
- Set $1 to 1 if the codepoint of $3 at position $4 is in the character class(es) given by $2.
- find_cclass(out INT, in INT, in STR, in INT, in INT)
- Set $1 to the offset of the first codepoint matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If no matching character is found, set $1 to (offset + count).
- find_not_cclass(out INT, in INT, in STR, in INT, in INT)
- Set $1 to the offset of the first codepoint not matching the character class(es) given by $2 in string $3, starting at offset $4 for up to $5 codepoints. If the substring consists entirely of matching characters, set $1 to (offset + count).
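The scanning behaviour of the three character-class ops, and in particular the (offset + count) result on failure, can be sketched in Python. This is a model under assumptions, not Parrot code: the character class is represented by a Python predicate such as str.isdigit rather than by the integer class flags the ops take, and scanning stops at the end of the string if offset + count runs past it.

```python
from typing import Callable

CharClass = Callable[[str], bool]  # hypothetical stand-in for the class flags


def is_cclass(cls: CharClass, s: str, pos: int) -> int:
    """Model of is_cclass: 1 if the character at `pos` is in the class."""
    return 1 if cls(s[pos]) else 0


def find_cclass(cls: CharClass, s: str, offset: int, count: int) -> int:
    """Model of find_cclass: offset of the first matching character."""
    # Assumption: never scan past the end of the string.
    for i in range(offset, min(offset + count, len(s))):
        if cls(s[i]):
            return i
    # No match in the scanned range.
    return offset + count


def find_not_cclass(cls: CharClass, s: str, offset: int, count: int) -> int:
    """Model of find_not_cclass: offset of the first non-matching character."""
    return find_cclass(lambda ch: not cls(ch), s, offset, count)


if __name__ == "__main__":
    print(find_cclass(str.isdigit, "abc123", 0, 6))
    print(find_not_cclass(str.isdigit, "123abc", 0, 6))
    print(find_cclass(str.isdigit, "abcdef", 1, 4))  # no digit: offset + count
```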
- escape(out STR, invar STR)
- Escape all non-ASCII characters to backslashed escape sequences. The result is a new string with the ascii charset.
- compose(out STR, in STR)
- Compose (normalize) a string.
Copyright (C) 2001-2004 The Perl Foundation. All rights reserved.
This program is free software. It is subject to the same license as the Parrot interpreter itself.