
# NAME

string.ops - String Operations

# DESCRIPTION

Operations that work on strings, whether constructing, modifying or examining them. See also rx.ops.

ord(out INT, in STR)

Two-argument form sets register \$1 to the integer value of the 0th character of string \$2. If \$2 is empty, throws an exception.

ord(out INT, in STR, in INT)

Three-argument form returns character \$3 of string \$2 in register \$1. If \$2 is empty, throws an exception. If \$3 is greater than the length of string \$2, throws an exception. If \$3 is less than zero but greater than the negative of the length, counts backwards through the string, such that -1 is the last character, -2 is the second-to-last character, and so on. If \$3 is less than the negative of the length, throws an exception.
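The indexing rules above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not Parrot code: `ord_at` is a hypothetical helper, and two boundary cases the text leaves open are assumptions here (an index equal to the length is rejected, and an index equal to the negative of the length selects the first character).

```python
def ord_at(s: str, index: int = 0) -> int:
    """Model of the documented ord semantics (hypothetical helper, not Parrot's API)."""
    if not s:
        # Both forms throw on an empty string.
        raise ValueError("cannot take ord of an empty string")
    length = len(s)
    if index < 0:
        # Negative indices count backwards: -1 is the last character.
        index += length
    # Assumption: index == length is treated as out of range as well.
    if index < 0 or index >= length:
        raise IndexError("index out of range")
    return ord(s[index])
```

Under these assumptions, `ord_at("abc")` gives 97 and `ord_at("abc", -1)` gives 99.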

chr(out STR, in INT)

Returns the character specified by the number \$2.

chopn(inout STR, in INT)

chopn(out STR, in STR, in INT)

Remove \$2 characters from the end of the string in \$1. The three-argument version removes \$3 characters from the end of the string in \$2 and returns the result in \$1.
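A rough Python model of the three-argument form follows. `chopn` here is a hypothetical stand-in for the op, and its handling of negative counts and of counts larger than the string is an assumption, since the text does not define either case.

```python
def chopn(s: str, n: int) -> str:
    """Return s with its last n characters removed (model of the op, not Parrot's API)."""
    if n < 0:
        # Assumption: the text does not define negative counts, so reject them.
        raise ValueError("n must be non-negative")
    # Assumption: removing more characters than exist yields the empty string.
    return s[:max(len(s) - n, 0)]
```

The two-argument form corresponds to assigning the result back to the same variable, e.g. `s = chopn(s, 3)`.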

concat(inout STR, in STR)

Append the string in \$2 to the string in \$1.

concat(out STR, in STR, in STR)

Appends the string \$3 to \$2 and places the result into \$1.

repeat(out STR, in STR, in INT)

Repeats string \$2 \$3 times and stores the result in \$1.

length(out INT, in STR)

Set \$1 to the length (in characters) of the string in \$2.

bytelength(out INT, in STR)

Set \$1 to the length (in bytes) of the string in \$2.
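The difference between the two lengths only shows up for characters that need more than one byte. The Python sketch below illustrates this under the assumption of UTF-8 storage. The actual byte length of a Parrot string depends on its own encoding, and both function names are merely borrowed from the ops for illustration.

```python
def length(s: str) -> int:
    """Length in characters."""
    return len(s)


def bytelength(s: str, encoding: str = "utf-8") -> int:
    """Length in bytes once encoded; the UTF-8 default is an assumption."""
    return len(s.encode(encoding))


text = "na\u00efve"  # five characters; the i-diaeresis takes two bytes in UTF-8
```

Here `length(text)` is 5 while `bytelength(text)` is 6. For pure ASCII text the two values agree.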

pin(inout STR)

Make the memory in \$1 immobile. This memory will not be moved by the GC and may be safely passed to external libraries (as long as they don't free it). Pinning a string will move the contents.

The memory only need be unpinned if you plan on using it for any length of time after its pinning is no longer necessary.

unpin(inout STR)

Make the memory in \$1 movable again. This will make the memory in \$1 move.

substr(out STR, in STR, in INT)

substr(out STR, in STR, in INT, in INT)

substr_r(out STR, in STR, in INT, in INT)

substr(out STR, inout STR, in INT, in INT, in STR)

substr(inout STR, in INT, in INT, in STR)

substr(out STR, in PMC, in INT, in INT)

Set \$1 to the portion of \$2 starting at (zero-based) character position \$3 and having length \$4. If no length (\$4) is provided, it is equivalent to passing in the length of \$2.

Optionally pass in string \$5 for replacement. If the length of \$5 is different from the length specified in \$4, then \$2 will grow or shrink accordingly. If \$3 is one character position larger than the length of \$2, then \$5 is appended to \$2 (and the empty string is returned); this is essentially the same as

`  concat $2, $5`

Finally, if \$3 is negative, it counts backwards from the end of the string (i.e., an offset of -1 corresponds to the last character).

The replace-only form, substr(inout STR, in INT, in INT, in STR), is optimized for replacement: it discards the replaced substring and so does not waste a register on it.

The _r variants reuse an existing string header and therefore normally do not create a new string in the destination register.
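The extraction and replacement rules for substr can be sketched in Python as follows. Because Python strings are immutable, this hypothetical `substr` returns both the extracted portion and the (possibly modified) source string instead of updating \$2 in place. Treating an offset equal to the length as the append position, clamping over-long lengths, and rejecting negative lengths are assumptions, not documented behavior.

```python
from typing import Optional, Tuple


def substr(s: str, offset: int, length: Optional[int] = None,
           replacement: Optional[str] = None) -> Tuple[str, str]:
    """Return (extracted, new_source); a model of the op, not Parrot's API."""
    n = len(s)
    if offset < 0:
        # Negative offsets count backwards from the end of the string.
        offset += n
    # Assumption: offset == n is allowed and denotes the append position.
    if offset < 0 or offset > n:
        raise IndexError("offset out of range")
    if length is None:
        # Omitting the length is equivalent to passing the length of s.
        length = n
    if length < 0:
        # Assumption: negative lengths are not described, so reject them.
        raise ValueError("length must be non-negative")
    end = min(offset + length, n)
    extracted = s[offset:end]
    if replacement is None:
        return extracted, s
    # The source grows or shrinks when the replacement length differs.
    return extracted, s[:offset] + replacement + s[end:]
```

For example, `substr("Hello, world", 0, 5, "Howdy")` yields `("Hello", "Howdy, world")`, and `substr("abc", 3, 0, "def")` yields `("", "abcdef")`, mirroring the append case described above.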

index(out INT, in STR, in STR)

index(out INT, in STR, in STR, in INT)

The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the position of the first occurrence of \$3 in \$2 at or after \$4. If \$4 is omitted, starts searching from the beginning of the string. Positions are zero-based. If the substring is not found, returns -1.
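Python's `str.find` has the same zero-based, -1-on-failure convention, so it makes a convenient model of this op. The wrapper below is a hypothetical sketch. Rejecting a negative start position is an assumption, since the text does not say how such a value is handled.

```python
def index(s: str, sub: str, start: int = 0) -> int:
    """Position of the first occurrence of sub in s at or after start, else -1."""
    if start < 0:
        # Assumption: negative start positions are not described, so reject them.
        raise ValueError("start must be non-negative")
    return s.find(sub, start)
```

With this model, `index("hello world", "o")` is 4, `index("hello world", "o", 5)` is 7, and `index("hello", "z")` is -1.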

pack(inout STR, in INT, in INT)

pack(inout STR, in INT, in NUM)

pack(inout STR, in INT, in STR)

pack(inout STR, in INT, in INT, in INT)

pack(inout STR, in INT, in NUM, in INT) (unimplemented)

pack(inout STR, in INT, in STR, in INT) (unimplemented)

Concatenates \$2 bytes from \$3 onto the end of \$1, or, if \$4 is provided, replaces the bytes at offset \$4 instead.

BE AFRAID, THIS IS A QUICK HACK, USE IT AT YOUR OWN RISK.

sprintf(out STR, in STR, in PMC)

sprintf(out PMC, in PMC, in PMC)

sprintf(out STR, in STR) [unimplemented] [[what is this op supposed to do? --jrieks]]

sprintf(out PMC, in PMC) [unimplemented] [[what is this op supposed to do? --jrieks]]

Sets \$1 to the result of calling `Parrot_psprintf` with the given format (\$2) and arguments (\$3, which should be an ordered aggregate PMC). In the (unimplemented) versions that don't include \$3, arguments are popped off the user stack.

The result is quite similar to using the system `sprintf`, but is protected against buffer overflows and the like. There are some differences, especially concerning sizes (which are largely ignored); see misc.c for details.
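As a loose analogy only, the shape of this op (a format string plus an ordered aggregate of arguments) resembles Python's `%` operator applied to a tuple. The hypothetical wrapper below shows that shape. It says nothing about which directives or size modifiers `Parrot_psprintf` accepts, which may differ from Python's.

```python
def sprintf(fmt: str, args: list) -> str:
    """Format args (an ordered sequence) according to fmt, C-style.

    Analogy for the op's calling convention; Python's directive set is
    an assumption and is not guaranteed to match Parrot's.
    """
    return fmt % tuple(args)
```

For instance, `sprintf("%s has %d items", ["list", 3])` produces `"list has 3 items"`.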

new(out STR)

new(out STR, in INT)

Allocate a new empty string, of length \$2 (optional), encoding \$3 (optional) and type \$4 (optional).

stringinfo(out INT, in STR, in INT)

Extract some information about string \$2 and store it in \$1. Possible values for \$3 are:

1 The location of the string buffer header.

2 The location of the start of the string.

3 The length of the string buffer (in bytes).

4 The flags attached to the string (if any).

5 The amount of the string buffer used (in bytes).

6 The length of the string (in characters).

upcase(out STR, in STR)

Uppercase \$2 and put the result in \$1.

upcase(inout STR)

Uppercase \$1 in place.

downcase(out STR, in STR)

Downcase \$2 and put the result in \$1.

downcase(inout STR)

Downcase \$1 in place.

titlecase(out STR, in STR)

Titlecase \$2 and put the result in \$1.

titlecase(inout STR)

Titlecase \$1 in place.

join(out STR, in STR, in PMC)

Create a new string \$1 by joining array elements from array \$3 with string \$2.

split(out PMC, in STR, in STR)

Create a new Array PMC \$1 by splitting the string \$3 with regexp \$2. Currently implemented only for the empty string \$2.
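The argument order of join and split (separator first, data second) can be mirrored in a short Python sketch. Both functions are hypothetical models. In particular, the assumption that splitting on the empty string yields one element per character is not stated in the text, and any other delimiter is rejected to reflect the "currently implemented only for the empty string" restriction.

```python
from typing import Iterable, List


def join(sep: str, items: Iterable) -> str:
    """Join the elements of items with sep between them."""
    return sep.join(str(item) for item in items)


def split(delim: str, s: str) -> List[str]:
    """Split s on delim; only the empty delimiter is modeled."""
    if delim != "":
        raise NotImplementedError("only the empty delimiter is supported")
    # Assumption: an empty delimiter splits the string into single characters.
    return list(s)
```

Under these assumptions, `join(", ", ["a", "b", "c"])` gives `"a, b, c"` and `split("", "abc")` gives `["a", "b", "c"]`.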

isnull(in STR, labelconst INT)

Branch to \$2 if \$1 is a NULL string.

charset(out INT, in STR)

Return the charset number of string \$2.

charsetname(out STR, in INT)

Return the name of charset numbered \$2.

find_charset(out INT, in STR)

Return the charset number of the charset named \$2. If the charset doesn't exist, throw an exception.

trans_charset(inout STR, in INT)

Change the string to have the specified charset.

trans_charset(out STR, in STR, in INT)

Create a string \$1 from \$2 with the specified charset.

Both functions may throw an exception on information loss.
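The "exception on information loss" behavior has a close analogue in Python's strict encoding mode, sketched below. This is an illustration rather than Parrot's mechanism: the function returns `bytes` because Python strings carry no charset, and the Python codec names stand in for Parrot's charset numbers as an assumption.

```python
def trans_charset(s: str, charset: str) -> bytes:
    """Re-encode s in charset, raising if a character cannot be represented."""
    # errors="strict" raises UnicodeEncodeError instead of silently dropping data.
    return s.encode(charset, errors="strict")
```

Converting `"caf\u00e9"` to `"latin-1"` succeeds, while converting the same string to `"ascii"` raises `UnicodeEncodeError`, since the accented character has no ASCII representation.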

is_whitespace(out INT, in STR, in INT)

Set \$1 to 1 if the codepoint of string \$2 at offset \$3 is whitespace.

is_wordchar(out INT, in STR, in INT)

Set \$1 to 1 if the codepoint of string \$2 at offset \$3 is a wordchar.

is_digit(out INT, in STR, in INT)

Set \$1 to 1 if the codepoint of string \$2 at offset \$3 is a digit.

is_punctuation(out INT, in STR, in INT)

Set \$1 to 1 if the codepoint of string \$2 at offset \$3 is a punctuation char.

is_newline(out INT, in STR, in INT)

Set \$1 to 1 if the codepoint of string \$2 at offset \$3 is a newline char.
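The classification ops above all share one shape: inspect the codepoint at an offset and produce 1 or 0. A Python sketch of three of them follows. The character classes come from Python's own Unicode predicates, and the definition of a word character as alphanumeric or underscore is an assumption. Parrot's charsets may classify some codepoints differently.

```python
def is_whitespace(s: str, offset: int) -> int:
    """1 if the character at offset is whitespace, else 0."""
    return int(s[offset].isspace())


def is_digit(s: str, offset: int) -> int:
    """1 if the character at offset is a digit, else 0."""
    return int(s[offset].isdigit())


def is_wordchar(s: str, offset: int) -> int:
    """1 if the character at offset is a word character, else 0."""
    c = s[offset]
    # Assumption: word characters are alphanumerics plus underscore.
    return int(c.isalnum() or c == "_")
```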

find_whitespace(out INT, in STR, in INT)

Set \$1 to the offset of the next whitespace codepoint or to -1.

find_wordchar(out INT, in STR, in INT)

Set \$1 to the offset of the next wordchar codepoint or to -1.

find_digit(out INT, in STR, in INT)

Set \$1 to the offset of the next digit codepoint or to -1.

find_punctuation(out INT, in STR, in INT)

Set \$1 to the offset of the next punctuation codepoint or to -1.

find_newline(out INT, in STR, in INT)

Set \$1 to the offset of the next newline codepoint or to -1.

find_word_boundary(out INT, in STR, in INT)

Set \$1 to the offset of the next word boundary or to -1.
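The find_* ops can be pictured as a single scan parameterized by a character class. The Python sketch below assumes that \$3 is the offset at which the search starts and that the character at that offset is itself a candidate. Neither point is spelled out in the text, and `find_class` is a hypothetical helper.

```python
from typing import Callable


def find_class(pred: Callable[[str], bool], s: str, offset: int) -> int:
    """Offset of the first character at or after offset satisfying pred, else -1."""
    if offset < 0:
        # Assumption: negative offsets are not described, so reject them.
        raise ValueError("offset must be non-negative")
    for i in range(offset, len(s)):
        if pred(s[i]):
            return i
    return -1


def find_whitespace(s: str, offset: int) -> int:
    return find_class(str.isspace, s, offset)


def find_digit(s: str, offset: int) -> int:
    return find_class(str.isdigit, s, offset)
```

With this model, `find_whitespace("foo bar", 0)` is 3 and `find_digit("abc", 0)` is -1.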