string.ops - String Operations
Operations that work on strings, whether constructing, modifying or examining them.
See also rx.ops.
- ord(out INT, in STR)
- Two-argument form returns the 0th character of string $2 in register $1. If $2 is empty, throws an exception.
- ord(out INT, in STR, in INT)
- Three-argument form returns character $3 of string $2 in register $1. If $2 is empty, throws an exception. If $3 is greater than the length of string $2, throws an exception. If $3 is less than zero but greater than the negative of the length, counts backwards through the string, such that -1 is the last character, -2 is the second-to-last character, and so on. If $3 is less than the negative of the length, throws an exception.
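The index rules for the three-argument form can be modelled outside Parrot. The following Python sketch is not Parrot code: `ord_at` is a hypothetical helper, and it assumes that the result is the integer character code, that an index equal to the length is out of range, and that an index equal to the negative of the length refers to the first character (the text above leaves both boundary cases open).

```python
def ord_at(s, i=0):
    """Model of ord: return the code of character i of s.

    Hypothetical helper for illustration only, not part of Parrot.
    """
    if len(s) == 0:
        raise ValueError("cannot take a character of an empty string")
    if i < 0:
        i += len(s)              # -1 is the last character, -2 the one before
    if i < 0 or i >= len(s):     # assumption: i == len(s) is out of range
        raise IndexError("character index out of range")
    return ord(s[i])
```

With this model, `ord_at("abc")` gives the code of `a`, and `ord_at("abc", -1)` gives the code of `c`.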
- chr(out STR, in INT)
- Returns the character specified by the number in $2.
- chopn(inout STR, in INT)
- chopn(out STR, in STR, in INT)
- Remove $2 characters from the end of the string in $1. The three-argument version removes $3 characters from the end of the string in $2 and returns the result in $1.
- concat(inout STR, in STR)
- Append the string in $2 to the string in $1.
- concat(out STR, in STR, in STR)
- Append the string $3 to $2 and place the result into $1.
- repeat(out STR, in STR, in INT)
- Repeats string $2 $3 times and stores the result in $1.
- length(out INT, in STR)
- Set $1 to the length (in characters) of the string in $2.
- bytelength(out INT, in STR)
- Set $1 to the length (in bytes) of the string in $2.
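The difference between the two lengths only shows up for strings with multi-byte characters. As a rough illustration in Python (not Parrot code, and assuming a UTF-8 encoding, which the text above does not specify):

```python
text = "na\u00efve"                          # "naive" with a diaeresis on the i

char_length = len(text)                      # analogous to length
byte_length = len(text.encode("utf-8"))      # analogous to bytelength, assuming UTF-8
```

Here the character count is 5 while the byte count is 6, because the accented letter occupies two bytes in UTF-8.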
- pin(inout STR)
- Make the memory in $1 immobile. This memory will not be moved by the GC, and may be safely passed to external libraries (as long as they don't free it). Pinning a string will move the contents.
- The memory needs to be unpinned only if you plan on using it for any length of time after its pinning is no longer necessary.
- unpin(inout STR)
- Make the memory in $1 movable again. This will make the memory in $1 move.
- substr(out STR, in STR, in INT)
- substr(out STR, in STR, in INT, in INT)
- substr_r(out STR, in STR, in INT, in INT)
- substr(out STR, inout STR, in INT, in INT, in STR)
- substr(inout STR, in INT, in INT, in STR)
- substr(out STR, in PMC, in INT, in INT)
- Set $1 to the portion of $2 starting at (zero-based) character position $3 and having length $4. If no length ($4) is provided, it is equivalent to passing in the length of $2.
- Optionally pass in string $5 for replacement. If the length of $5 is different from the length specified in $4, then $2 will grow or shrink accordingly. If $3 is one character position larger than the length of $2, then $5 is appended to $2 (and the empty string is returned); this is essentially the same as concat $2, $5.
- Finally, if $3 is negative, then it is taken to count backwards from the end of the string (i.e., an offset of -1 corresponds to the last character).
- The form substr(inout STR, in INT, in INT, in STR) is optimized for replace only: it ignores the replaced substring and does not waste a register to do the string replace.
- The _r variants reuse an existing string header and therefore normally do not create a new string in the destination register.
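The extraction and replacement rules can be summarised in a small Python model. This is a sketch under stated assumptions, not the Parrot implementation: `substr` here is a hypothetical function that returns both the extracted substring and the updated source string, it treats an offset equal to the length of the string as the "append" position, and it assumes a non-negative length.

```python
def substr(s, offset, length=None, replacement=None):
    """Model of substr: return (extracted, updated_s).

    Hypothetical helper for illustration only, not part of Parrot.
    """
    n = len(s)
    if offset < 0:
        offset += n                      # -1 corresponds to the last character
    if offset < 0 or offset > n:         # assumption: offset == n means "append"
        raise IndexError("offset outside of string")
    if length is None:
        length = n                       # same as passing the length of s
    end = min(offset + length, n)        # never read past the end of s
    extracted = s[offset:end]
    if replacement is None:
        return extracted, s
    # s grows or shrinks when the replacement length differs from `length`
    return extracted, s[:offset] + replacement + s[end:]
```

For example, `substr("hello world", 0, 5, "howdy,")` yields the pair `("hello", "howdy, world")`, and `substr("abc", 3, 0, "def")` yields `("", "abcdef")`, which matches the append case described above.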
- index(out INT, in STR, in STR)
- index(out INT, in STR, in STR, in INT)
- The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the zero-based position of the first occurrence of $3 in $2 at or after position $4. If $4 is omitted, the search starts from the beginning of the string. If the substring is not found, it returns -1.
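Python's built-in `str.find` happens to follow the same conventions (zero-based result, -1 when nothing is found), so it can serve as a quick model. The wrapper below is a hypothetical illustration, not Parrot code, and assumes a non-negative start position.

```python
def index(haystack, needle, start=0):
    """Model of index: first position of needle in haystack at or after start.

    Hypothetical helper for illustration only; assumes start >= 0.
    """
    return haystack.find(needle, start)   # str.find returns -1 if not found
```

With the string `"parrot"`, searching for `"r"` from the beginning gives 2, searching from position 3 gives 3, and searching for `"x"` gives -1.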
- pack(inout STR, in INT, in INT)
- pack(inout STR, in INT, in NUM)
- pack(inout STR, in INT, in STR)
- pack(inout STR, in INT, in INT, in INT)
- pack(inout STR, in INT, in NUM, in INT) (unimplemented)
- pack(inout STR, in INT, in STR, in INT) (unimplemented)
- Concatenate $2 bytes from $3 onto the end of $1, or, if $4 is provided, replace the bytes of $1 at $4 instead.
- BE AFRAID, THIS IS A QUICK HACK, USE IT AT YOUR OWN RISK.
- sprintf(out STR, in STR, in PMC)
- sprintf(out PMC, in PMC, in PMC)
- sprintf(out STR, in STR) [unimplemented] [[what is this op supposed to do? --jrieks]]
- sprintf(out PMC, in PMC) [unimplemented] [[what is this op supposed to do? --jrieks]]
- Sets $1 to the result of calling Parrot_psprintf with the given format ($2) and arguments ($3, which should be an ordered aggregate PMC). In the (unimplemented) versions that don't include $3, arguments are popped off the user stack.
- The result is quite similar to using the system sprintf, but is protected against buffer overflows and the like. There are some differences, especially concerning sizes (which are largely ignored); see misc.c for details.
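The calling convention (one format string plus one ordered collection of arguments) can be pictured with Python's `%` operator. This is only an analogy: the helper name is hypothetical, and the directives accepted by Parrot_psprintf may differ from Python's.

```python
def sprintf(fmt, args):
    """Model of sprintf: format `fmt` with an ordered sequence of arguments.

    Hypothetical helper for illustration only; Python's % directives are
    assumed here and are not guaranteed to match Parrot_psprintf.
    """
    return fmt % tuple(args)    # arguments are consumed in order


message = sprintf("%s has %d items", ["cart", 3])
```

As with the op, the arguments travel as a single ordered aggregate rather than as separate operands.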
- new(out STR)
- new(out STR, in INT)
- Allocate a new empty string, of length $2 (optional), encoding $3 (optional) and type $4 (optional).
- stringinfo(out INT, in STR, in INT)
- Extract some information about string $2 and store it in $1. Possible values for $3 are:
- 1 The location of the string buffer header.
- 2 The location of the start of the string.
- 3 The length of the string buffer (in bytes).
- 4 The flags attached to the string (if any).
- 5 The amount of the string buffer used (in bytes).
- 6 The length of the string (in characters).
- upcase(out STR, in STR)
- Uppercase $2 and put the result in $1.
- upcase(inout STR)
- Uppercase $1 in place.
- downcase(out STR, in STR)
- Downcase $2 and put the result in $1.
- downcase(inout STR)
- Downcase $1 in place.
- titlecase(out STR, in STR)
- Titlecase $2 and put the result in $1.
- titlecase(inout STR)
- Titlecase $1 in place.
- join(out STR, in STR, in PMC)
- Create a new string $1 by joining array elements from array $3 with string $2.
- split(out PMC, in STR, in STR)
- Create a new Array PMC $1 by splitting the string $3 with regexp $2. Currently implemented only for the case where $2 is the empty string.
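A Python model of the two ops, keeping the operand order used above (separator or pattern first, data second). Both helpers are hypothetical, and the sketch assumes that splitting on the empty string produces one element per character, which the text above implies but does not state.

```python
def join(separator, items):
    """Model of join: concatenate items with separator between them."""
    return separator.join(str(item) for item in items)


def split(pattern, s):
    """Model of split, restricted to the empty pattern as in the text above.

    Assumption: the empty pattern yields one element per character.
    """
    if pattern != "":
        raise NotImplementedError("only the empty pattern is modelled")
    return list(s)
```

Under that assumption the two are inverses for the empty string: `join("", split("", "abc"))` gives back `"abc"`.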
- isnull(in STR, labelconst INT)
- Branch to $2 if $1 is a NULL string.
- charset(out INT, in STR)
- Return the charset number of string $2.
- charsetname(out STR, in INT)
- Return the name of charset numbered $2.
- find_charset(out INT, in STR)
- Return the charset number of the charset named $2. If the charset doesn't exist, throw an exception.
- trans_charset(inout STR, in INT)
- Change the string to have the specified charset.
- trans_charset(out STR, in STR, in INT)
- Create a string $1 from $2 with the specified charset.
- Both functions may throw an exception on information loss.
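Information loss occurs when the target charset cannot represent a character of the source string. The Python sketch below illustrates the idea with strict encoding; it is an analogy only, using Python codec names rather than Parrot charset numbers, and a hypothetical helper name.

```python
def to_charset(s, charset_name):
    """Model of a strict charset conversion: fail rather than lose data.

    Hypothetical helper for illustration only; `charset_name` is a Python
    codec name such as "ascii" or "latin-1", not a Parrot charset number.
    """
    # errors="strict" raises UnicodeEncodeError for unrepresentable characters
    return s.encode(charset_name, errors="strict")
```

Converting `"caf\u00e9"` to `"latin-1"` succeeds, while converting it to `"ascii"` raises an exception because the accented letter has no ASCII representation.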
- is_whitespace(out INT, in STR, in INT)
- Set $1 to 1 if the codepoint of string $2 at offset $3 is whitespace.
- is_wordchar(out INT, in STR, in INT)
- Set $1 to 1 if the codepoint of string $2 at offset $3 is a wordchar.
- is_digit(out INT, in STR, in INT)
- Set $1 to 1 if the codepoint of string $2 at offset $3 is a digit.
- is_punctuation(out INT, in STR, in INT)
- Set $1 to 1 if the codepoint of string $2 at offset $3 is a punctuation char.
- is_newline(out INT, in STR, in INT)
- Set $1 to 1 if the codepoint of string $2 at offset $3 is a newline char.
- find_whitespace(out INT, in STR, in INT)
- Set $1 to the offset of the next whitespace codepoint or to -1.
- find_wordchar(out INT, in STR, in INT)
- Set $1 to the offset of the next wordchar codepoint or to -1.
- find_digit(out INT, in STR, in INT)
- Set $1 to the offset of the next digit codepoint or to -1.
- find_punctuation(out INT, in STR, in INT)
- Set $1 to the offset of the next punctuation codepoint or to -1.
- find_newline(out INT, in STR, in INT)
- Set $1 to the offset of the next newline codepoint or to -1.
- find_word_boundary(out INT, in STR, in INT)
- Set $1 to the offset of the next word boundary or to -1.
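The is_* and find_* ops pair naturally: each find_* op can be read as a scan that applies the matching is_* test to successive offsets. The Python sketch below models that relationship. All names are hypothetical, the character classes are Python's rather than Parrot's, and it assumes the scan starts at offset $3 inclusive (the text above says only "next").

```python
def is_whitespace(s, offset):
    return 1 if s[offset].isspace() else 0


def is_digit(s, offset):
    return 1 if s[offset].isdigit() else 0


def is_wordchar(s, offset):
    # assumption: a word character is alphanumeric or an underscore
    return 1 if (s[offset].isalnum() or s[offset] == "_") else 0


def find_class(is_class, s, offset):
    """Return the first offset >= `offset` where is_class is true, else -1."""
    for i in range(offset, len(s)):
        if is_class(s, i):
            return i
    return -1


def find_whitespace(s, offset):
    return find_class(is_whitespace, s, offset)


def find_digit(s, offset):
    return find_class(is_digit, s, offset)
```

For instance, `find_whitespace("foo bar", 0)` gives 3, and `find_digit("abc", 0)` gives -1 because the string contains no digit.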
Copyright (C) 2001-2004 The Perl Foundation. All rights reserved.
This program is free software. It is subject to the same license as the Parrot interpreter itself.