P6C::IMCC::ExtRegex::Adapter

Convert the native perl6 compiler's representation of a regex parse tree into languages/regex's format.

convert_p6tree: Convert a P6C parse tree to a regex tree. Main entry point. Calls convert() to do the dirty work.
convert: Recursively perform the conversion. For any node type TYPE, calls a method named convert_TYPE.
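
The name-based dispatch described for convert can be sketched as follows. This is an illustrative Python sketch, not the module's Perl code: the Node and Adapter classes, the tuple output format, and the "advance" op name are assumptions made up for the example.

```python
class Node:
    """Hypothetical stand-in for a P6C parse-tree node."""

    def __init__(self, kind, *children):
        self.kind = kind                # e.g. "rx_seq", "rx_alt", "rx_any"
        self.children = list(children)


class Adapter:
    def convert(self, node):
        # Look up a method called convert_<TYPE> for this node's type.
        handler = getattr(self, "convert_" + node.kind, None)
        if handler is None:
            raise NotImplementedError("no converter for node type %r" % node.kind)
        return handler(node)

    def convert_rx_seq(self, node):
        # Recurse into each element of the sequence.
        return ("seq", [self.convert(child) for child in node.children])

    def convert_rx_alt(self, node):
        # Recurse into each branch of the alternation.
        return ("alternate", [self.convert(child) for child in node.children])

    def convert_rx_any(self, node):
        # '.' advances one character (op name is an assumption).
        return ("advance", 1)


tree = Node("rx_alt",
            Node("rx_seq", Node("rx_any"), Node("rx_any")),
            Node("rx_any"))
result = Adapter().convert(tree)
```

Adding support for a new node type then only requires adding a method with the matching name; an unknown type fails loudly instead of being silently skipped.
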
convert_rule: Grab the parse tree out of a P6C::rule and convert it, then wrap it in a scan node unless there was a ^ within it. Probably buggy, since a ^ can appear in ways that do not anchor the entire expression (for example, inside only one branch of an alternation).
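
The scan-wrapping rule, and the reason it is suspected to be buggy, can be sketched like this. It is an illustrative Python sketch under assumed names: the tuple tree format and the "anchor_start", "match", and "scan" op names are hypothetical, and the "any ^ anywhere" check is a guess at the kind of test involved, not the module's actual logic.

```python
def contains_start_anchor(tree):
    """Return True if a '^' assertion occurs anywhere in the tree."""
    op, *args = tree
    if op == "anchor_start":
        return True
    # Child nodes are assumed to live in list-valued arguments.
    return any(contains_start_anchor(child)
               for arg in args if isinstance(arg, list)
               for child in arg)


def convert_rule(tree):
    # Wrap in a scan node unless the expression contains a '^'.
    if contains_start_anchor(tree):
        return tree
    return ("scan", tree)


unanchored = ("seq", [("match", "a"), ("match", "b")])
anchored = ("seq", [("anchor_start",), ("match", "a")])
# '^' appears in only one branch, yet the whole rule loses its scan node.
partial = ("alternate", [("seq", [("anchor_start",), ("match", "a")]),
                         ("match", "b")])
```

With this check, partial is returned unwrapped even though its second branch is not anchored, which is the kind of case the note above worries about.
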
convert_rx_alt: P6C::rx_alt(T) -> Regex::Ops::Tree::alternate(branches of T)
convert_rx_seq: P6C::rx_seq(T) -> Regex::Ops::Tree::seq(elements of T)
convert_rx_atom: Convert P6C::rx_atom(T) depending on what kind of atom T is. A P6C::rx_val is converted directly; an ARRAY is treated as a code block; if T has a type field that is of type PerlArray, it is matched as an array literal (as if it were an alternation of each of its elements); otherwise, it is assumed to be a string. This also plays around with group captures: it increments the group id ($1 -> $2, etc.), which is overly simplistic and stupid.
intvalue: Utility routine to extract an integer value out of a P6C tree.
convert_rx_repeat: P6C::rx_repeat(T) -> Regex::Ops::Tree::multi_match(T.min, T.max, T.greedyflag, T.expr)
convert_rx_meta: Convert a metacharacter (backslashed character), such as \s or \x34. Fun stuff. Many metacharacters are still missing.
convert_rx_any: Convert P6C::rx_any, which represents . in a regex, which just means to skip ahead one character.
convert_rx_any: Convert a zero-width assertion. Currently mishandles ^ and $, although the $ implementation may actually be fine. No other assertions are implemented.
string_to_incexc: Generate an inclusion/exclusion list out of a string representing a character class. An inc/exc list L is a sequence of code points representing a character class, which can also be thought of as a set of code points. Anything less than the first element L[0] is not in the set; anything greater than or equal to L[0] but less than or equal to L[1] is in the set; anything greater than L[1] but less than or equal to L[2] is not in the set; and so on. FIXME: makes no attempt to handle Unicode.
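
A membership test against an inc/exc list can be sketched as follows. This is an illustrative Python sketch that reads the boundary rules in the description literally (included from L[0] through L[1], excluded above L[1] through L[2], and so on); the function name in_incexc is made up for the example, and the real encoding may treat the boundaries differently.

```python
def in_incexc(incexc, code_point):
    """Return True if code_point is in the set described by the inc/exc list."""
    if not incexc or code_point < incexc[0]:
        return False              # below L[0]: never in the set
    included = True               # the region starting at L[0] is included
    for boundary in incexc[1:]:
        if code_point <= boundary:
            return included
        included = not included   # crossing a boundary flips in/out
    return included


# Under this reading, [97, 99] covers 'a' through 'c', and
# [97, 99, 119, 122] covers 'a' through 'c' plus 'x' through 'z'.
lower_abc = [97, 99]
abc_xyz = [97, 99, 119, 122]
```

Because the list is sorted, a longer list could be searched with a binary search instead of the linear walk shown here.
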
ord_to_incexc: Generate an inclusion/exclusion list from a single code point. Unless the code point is negated, it is kind of silly to use this instead of a simple 'match' op.
convert_rx_oneof: Convert the P6C compiler's notion of a character class into languages/regex's.
convert_rx_assertion: Placeholder for assertions.
convert_rx_call: Argument: $tree - A P6C::rx_call object representing a call to a nested rule within a regex match tree
convert_sv_literal: Match a literal value by breaking it up into individual characters. Which seems pretty stupid at the time I'm documenting this, considering that I have code to match a whole string. Oh well; with optimization, it should boil down to pretty much the same thing. And maybe there's some brilliant reason why I chose to do it this way instead. (But I doubt it; I probably just did this one first.)
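
The character-by-character approach described for convert_sv_literal amounts to something like the sketch below. It is illustrative Python, not the module's Perl code; the tuple format and the use of code points as match operands are assumptions.

```python
def convert_sv_literal(text):
    # One single-character match per character, chained in a sequence.
    return ("seq", [("match", ord(ch)) for ch in text])


literal_tree = convert_sv_literal("ab")
```

A later optimization pass could merge adjacent single-character matches back into one string match, which is why the two approaches should end up roughly equivalent.
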
