P6C::IMCC::ExtRegex::Adapter

Convert the native Perl 6 compiler's representation of a regex parse tree into languages/regex's format.


Convert a P6C parse tree to a regex tree. Main entry point. Calls convert() to do the dirty work.


Recursively perform the conversion. For any node type TYPE, calls a method named convert_TYPE.
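
The dispatch-by-type-name pattern described above can be sketched roughly as follows. This is an illustrative Python sketch, not the module's Perl code: the classes `Node`, `rx_any` and `rx_seq` are stand-ins, and the `things` field and the tuple output format are assumptions made for the example.

```python
class Node:
    """Hypothetical stand-in for a P6C parse tree node."""


class rx_any(Node):
    pass


class rx_seq(Node):
    def __init__(self, *things):
        self.things = list(things)  # assumed field name


class Adapter:
    def convert(self, node):
        # Look up a method named convert_TYPE from the node's type name.
        type_name = type(node).__name__
        method = getattr(self, "convert_" + type_name, None)
        if method is None:
            raise NotImplementedError("no converter for " + type_name)
        return method(node)

    def convert_rx_any(self, node):
        return ("advance", 1)

    def convert_rx_seq(self, node):
        # Recurse through convert() so each child picks its own handler.
        return ("seq",) + tuple(self.convert(t) for t in node.things)
```

Adding support for a new node type then only requires adding another convert_TYPE method; convert() itself does not change.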


Grab the parse tree out of a P6C::rule and convert it, then wrap it in a scan node unless there was a ^ within it. Probably buggy, since a ^ does not necessarily apply to the entire expression (for example, when it appears in only one branch of an alternation).
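
A rough sketch of the wrapping rule, written in Python for illustration only. The tuple-based tree shape and the node names `bol`, `scan`, `match`, `seq` and `alternate` are assumptions for this example and are not taken from languages/regex.

```python
def contains_anchor(tree):
    """Return True if a ("bol",) node appears anywhere in the tuple tree."""
    if tree == ("bol",):
        return True
    return any(
        contains_anchor(child) for child in tree[1:] if isinstance(child, tuple)
    )


def wrap_rule(tree):
    # If a ^ appears anywhere, leave the tree alone; otherwise wrap it in
    # a scan node so the match is attempted at every starting offset.
    return tree if contains_anchor(tree) else ("scan", tree)


unanchored = ("match", "a")
anchored = ("seq", ("bol",), ("match", "b"))
# Something like a|^b: only the second branch is anchored.
mixed = ("alternate", ("match", "a"), anchored)
```

Under this sketch, wrap_rule(mixed) returns the tree unwrapped even though its first branch is not anchored, which is the kind of case that makes the "any ^ anywhere" test look suspect.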


P6C::rx_alt(T) -> Regex::Ops::Tree::alternate(branches of T)


P6C::rx_seq(T) -> Regex::Ops::Tree::seq(elements of T)


Convert P6C::rx_atom(T) depending on what kind of atom T is. A P6C::rx_val is converted directly; an ARRAY is treated as a code block; if T has a type field that is of type PerlArray, then it is matched as an array literal (as if it were an alternation of each of its elements); otherwise, it is assumed to be a string.

This also plays around with group captures: it simply increments the group id ($1 -> $2, etc.), which is overly simplistic and stupid.


Utility routine to extract an integer value out of a P6C tree.


P6C::rx_repeat(T) -> Regex::Ops::Tree::multi_match(T.min, T.max, T.greedyflag, T.expr)


Convert a metacharacter (backslashed character), such as \s or \x34. Fun stuff. Many metacharacters are still missing.


Convert P6C::rx_any, which represents . in a regex and simply means to skip ahead one character.


Convert a zero-width assertion. Currently mishandles ^, and possibly $ (though I think the $ implementation may be fine). No other assertions are implemented.


Generate an inclusion/exclusion list out of a string representing a character class. An inc/exc list L is an ascending sequence of code points representing a character class, which can also be thought of as a set of code points. Anything less than the first element L[0] is not in the set; anything greater than or equal to L[0] but less than L[1] is in the set; anything greater than or equal to L[1] but less than L[2] is not in the set, etc. For example:


  ()    - the empty set
  (0)   - the universal set
  (5)   - anything 5 or greater
  (2,4) - 2 or 3

FIXME: makes no attempt to handle Unicode.
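
The inc/exc representation can be illustrated with a short Python sketch. This is not the module's code: the function names `contains`, `from_code_points` and `negate` are hypothetical, and the builder takes a plain collection of code points rather than parsing a character class string.

```python
from bisect import bisect_right


def contains(incexc, code_point):
    # Membership flips at every boundary, so count the boundaries that are
    # less than or equal to code_point; an odd count means "in the set".
    return bisect_right(incexc, code_point) % 2 == 1


def from_code_points(points):
    """Build an inc/exc list from an iterable of code points."""
    result = []
    for p in sorted(set(points)):
        if result and result[-1] == p:
            result[-1] = p + 1      # p extends the current run
        else:
            result += [p, p + 1]    # start a new run covering just p
    return result


def negate(incexc):
    # Toggling a boundary at 0 flips the membership of every code point.
    return incexc[1:] if incexc[:1] == [0] else [0] + incexc
```

With these definitions, the four examples listed above behave as described: contains([], x) is always false, contains([0], x) is always true for x >= 0, contains([5], x) is true from 5 upward, and contains([2, 4], x) is true only for 2 and 3. Negation is cheap in this representation, which is one reason to prefer it over an explicit set.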


Generate an inclusion/exclusion list from a single code point. Unless it is negated, it is kind of silly to use this instead of a simple 'match' op.
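
For a single code point the list is tiny, as this Python sketch suggests. The function name `single` and its `negated` parameter are hypothetical and chosen only for illustration.

```python
def single(code_point, negated=False):
    """Inc/exc list for one code point, or for everything except it."""
    if not negated:
        # In the set from code_point up to, but not including, code_point + 1.
        return [code_point, code_point + 1]
    if code_point == 0:
        # Everything except 0 is simply "1 or greater".
        return [1]
    # In from 0, out at code_point, back in at code_point + 1.
    return [0, code_point, code_point + 1]
```

For example, single(97) gives [97, 98] and single(97, negated=True) gives [0, 97, 98]. The non-negated form carries no more information than a plain 'match' of that character, which is why it is only really worthwhile in the negated case.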


Convert the P6C compiler's notion of a character class into languages/regex's.


Placeholder for assertions.


Argument: $tree - A P6C::rx_call object representing a call to a nested rule within a regex match tree


Match a literal value by breaking it up into individual characters, which seems pretty stupid at the time I'm documenting this, considering that I have code to match a whole string. Oh well; with optimization, it should boil down to pretty much the same thing. And maybe there's some brilliant reason why I chose to do it this way instead. (But I doubt it; I probably just did this one first.)
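
A small Python sketch of the idea, for illustration only. The tuple node shapes and both function names are assumptions, and `merge_matches` is a hypothetical optimization pass included to show how per-character matches could collapse back into a whole-string match; it is not a description of what languages/regex actually does.

```python
def literal_to_seq(text):
    # One match op per character, wrapped in a sequence node.
    return ("seq",) + tuple(("match", ch) for ch in text)


def merge_matches(seq):
    """Hypothetical optimization: fuse adjacent match ops into one."""
    merged = []
    for op in seq[1:]:
        if merged and merged[-1][0] == "match" and op[0] == "match":
            merged[-1] = ("match", merged[-1][1] + op[1])
        else:
            merged.append(op)
    return ("seq",) + tuple(merged)
```

Here literal_to_seq("abc") yields three single-character match ops, and running merge_matches over that result yields a single op matching "abc".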