PDD 23: Exceptions

Abstract

This document defines the requirements and implementation strategy for Parrot's exception system.

Version

$Revision$

Description

Exceptions are indications by running code that something unusual -- an "exception" to the normal processing -- has occurred. When code detects an exceptional condition, it throws an exception object. Before that happens, code can register exception handlers, which are functions (or closures) that may, but are not obligated to, handle the exception. Some exceptions permit continued execution immediately after the throw; some don't.

Exceptions transfer control to a piece of code outside the normal flow of control. They are mainly used for error reporting or cleanup tasks.

(A digression on terminology: In a systems analysis sense, the word "exception" usually refers to the exceptional event that requires out-of-band handling. However, in Parrot, "exception" also refers to the object that holds all the information describing the exceptional condition: the nature of the exception, the error message describing it, and other ancillary information. The specific type (class) of an exception object indicates its category.)

Exception Opcodes

These are the opcodes relevant to exceptions and exception handlers:

push_eh LABEL
push_eh EXCEPTIONHANDLER_PMC
Push an exception handler PMC onto the exception handler stack.
When an exception is thrown, Parrot walks up the stack of active exception handlers, invoking each one in turn, but still in the dynamic context of the exception (i.e. the call stack is not unwound first). See below for more detail.
If a LABEL is provided, Parrot creates and pushes an exception handler that resumes execution at LABEL if invoked, which has the effect of unconditionally handling all errors, and unwinding the stack to that label.
If an EXCEPTIONHANDLER_PMC is provided, Parrot pushes that PMC itself onto the exception handler stack.
pop_eh
Pop the most recently pushed exception handler off the exception handler stack.
throw EXCEPTION [ , CONTINUATION ]
Throw an exception consisting of the given EXCEPTION PMC, after taking a continuation at the next opcode. When a CONTINUATION is passed in, it will use that instead of generating a new continuation. Active exception handlers (if any) will be invoked with EXCEPTION as the only parameter. The CONTINUATION is stored in the 'resume' slot of the EXCEPTION.
PMCs other than Parrot's Exception PMC may also be thrown, but they must support the interface of an Exception PMC. An HLL may implement throwing any arbitrary type of PMC, by storing that PMC in the 'payload' slot of the Exception PMC.
Exception handlers can resume execution after handling the exception by invoking the continuation stored in the 'resume' slot of the exception object. That continuation must be invoked with no parameters; in other words, throw never returns a value.
rethrow EXCEPTION
While handling an exception, rethrow the exception to the next handler. Aside from selecting a different handler, the behaviour of rethrow is the same as throw. Each successive call to rethrow will select a different handler, until it exhausts the list of possible handlers. A rethrown exception that is not handled behaves the same as an unhandled thrown exception.
die [ MESSAGE ]
The die opcode throws an exception of type exception;death and severity EXCEPT_error with a payload of MESSAGE. The exception payload is a String PMC containing MESSAGE.
{{NOTE: Exception classes NYI. Currently throws CONTROL_ERROR}}
The default when no MESSAGE is given is "Fatal exception at LINE in FILE." followed by a backtrace.
{{NOTE: Not yet implemented.}}
If this exception is not handled, it results in Parrot returning an error indication and the stringification of MESSAGE to its embedding environment. When running standalone, this means writing the stringification of MESSAGE to standard error and executing the standard Parrot function Parrot_exit, to shut down the interpreter cleanly.
exit [ EXITCODE ]
Throw an exception of type exception;exit with a payload of EXITCODE, which defaults to zero, as an Integer PMC.
{{NOTE: Exception classes NYI. Currently throws a type based on the EXITCODE.}}
If not handled, this exception results in Parrot returning EXITCODE as a status to its embedding environment or, when running standalone, in a call to the C function exit(EXITCODE).
{{NOTE: This is not currently the case. Parrot now stores the EXITCODE argument in the type, not the payload.}}
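
As a minimal sketch of how these opcodes fit together, the following PIR pushes an explicitly constructed ExceptionHandler PMC, throws an Exception, and reads its message in the handler. The sub name, label name, and message text are made up for illustration. Pointing the handler at a label with set_addr is an assumption about how such a PMC is usually prepared in PIR; this document does not mandate it.

```pir
.sub 'main' :main
    # Build a handler PMC by hand instead of letting push_eh create one
    # from a label. Using set_addr here is an assumption, see above.
    .local pmc eh
    eh = new 'ExceptionHandler'
    set_addr eh, on_error
    push_eh eh

    # Create an exception, give it a message, and throw it.
    .local pmc ex
    ex = new 'Exception'
    ex['message'] = "illustrative failure"
    throw ex

    # Only reached if a handler resumes the exception.
    pop_eh
    .return ()

  on_error:
    # Active handlers receive the exception as their only parameter.
    .local pmc caught
    .local string text
    .get_results (caught)
    text = caught['message']
    say text
.end
```

Passing the label directly (push_eh on_error) should behave the same way for this simple case; the explicit PMC form matters mainly when the handler object itself needs to be configured or shared.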

Exception Introspection Opcodes

These are the opcodes relevant to introspection of the exception handler stack:

count_eh
Return the number of currently active exception handlers.
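
A small sketch of count_eh in use, assuming the opcode writes its result to an integer register as described above. The variable and label names are hypothetical. The absolute numbers depend on how many handlers are already active when the sub starts, so only the differences between the three readings are meaningful.

```pir
.sub 'main' :main
    .local int n_before, n_during, n_after

    n_before = count_eh            # handlers active on entry
    push_eh ignore
    n_during = count_eh            # expected to be n_before + 1
    pop_eh
    n_after = count_eh             # expected to equal n_before again

    print "before push_eh: "
    say n_before
    print "after push_eh:  "
    say n_during
    print "after pop_eh:   "
    say n_after
    .return ()

  ignore:
    # Never reached in this sketch; present only so push_eh has a target.
    .return ()
.end
```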

Order of Operations in Exception Handling

When throw is called, for all active exception handlers, in LIFO order:

1. Find the topmost exception handler.
2. Push an exception record somewhere, presumably on the exception handler stack. The exception record contains a pointer to an exception handler block, an exception PMC, and (optionally) a continuation.
3. Invoke the handler (note: this is still in the thrower's dynamic context).
4. If the exception is rethrown, repeat steps 1-3 above, finding the next exception handler.
5. If no handler is found, and the exception is non-fatal (such as a warning), and there is a continuation in the exception record (because the throwing opcode was throw), invoke the continuation (resume execution). Whether to resume or die when an exception isn't handled is determined by the severity of the exception.
6. Otherwise, terminate the program as die does.

When running an embedded Parrot interpreter, the interpreter does not immediately terminate on an unhandled exception. Instead, it returns control to the embedding program and stores the unhandled exception so that the embedding program can query it. The embedding program may choose to handle the exception and continue execution by invoking the exception's continuation.
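
The LIFO ordering and the rethrow step can be sketched in PIR as follows. This is an illustration of the behaviour specified in this section, not a statement about what any particular Parrot release does; the label names and message strings are hypothetical.

```pir
.sub 'main' :main
    push_eh outer                  # pushed first, so consulted last
    push_eh inner                  # pushed last, so consulted first

    .local pmc ex
    ex = new 'Exception'
    ex['message'] = "illustrative failure"
    throw ex

    # Only reached if a handler resumes the exception.
    pop_eh
    pop_eh
    .return ()

  inner:
    .local pmc caught
    .get_results (caught)
    say "inner handler: declining"
    rethrow caught                 # step 4: pass to the next handler

  outer:
    .get_results (caught)
    say "outer handler: handling"
.end
```

Under the rules above, the inner handler should run before the outer one. If the outer handler also rethrew, no handler would remain, and the outcome would be decided by the exception's severity as described in steps 5 and 6.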

Implementation

Exception Object Interface

All of Parrot's standard exceptions provide at least the following interface. It is recommended that all classes intended for throwing provide at least this interface as well.

PMC *get_attr_str(STRING *name)
Retrieve an attribute from the Exception. All exceptions will have at least message, severity, resume, and payload attributes.
The message is an exception's human-readable self-description. Note that the type of the returned PMC isn't required to be String, but you should still be able to stringify and print it.
The severity is an integer from an internal Parrot enum of exception severities.
The resume is a continuation that you can invoke to resume normal execution of the program.
The payload more specifically identifies the detailed cause/nature of the exception. Each exception class will have its own specific payload type(s). See the table of standard exception classes for examples.
PMC *set_attr_str(STRING *name, PMC *value)
Set an attribute on the Exception. All exceptions will have at least message, severity, resume, and payload attributes.
PMC *annotations()
Gets a Hash containing any bytecode annotations in effect at the point where the exception was thrown. If none were in effect, returns an empty Hash. See the PIR PDD for the syntax and semantics of bytecode annotations.
PMC *annotations(STRING *name)
Returns a PMC representing the bytecode annotation with the key specified in name at the point where the exception was thrown. If there was no such annotation in effect, a NULL PMC will be returned.
PMC *backtrace()
Gets a representation of the backtrace at the point that this exception was thrown. Returns an array of hashes. Each array element represents a caller in the backtrace, the most recent caller first. Each hash has two keys: sub, which holds the PMC representing the sub, and annotations, which is a hash of annotations. For the current sub, these are the annotations in effect at the point where the exception was thrown; for every other entry, they are the annotations at the point of the call one level deeper.
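
From PIR, these attributes are reachable through keyed access on the exception object, in the same style as the handler examples elsewhere in this document. The sketch below sets message and payload before throwing and reads message, severity, and payload in the handler. The payload value and all names are hypothetical, and the severity printed is whatever default the implementation assigns.

```pir
.sub 'main' :main
    push_eh report

    .local pmc ex, detail
    detail = new 'String'
    detail = "extra detail"        # hypothetical payload value
    ex = new 'Exception'
    ex['message'] = "illustrative failure"
    ex['payload'] = detail
    throw ex

    # Only reached if a handler resumes the exception.
    pop_eh
    .return ()

  report:
    .local pmc caught, payload
    .local string message
    .local int severity
    .get_results (caught)
    message  = caught['message']
    severity = caught['severity']
    payload  = caught['payload']

    print "message:  "
    say message
    print "severity: "
    say severity
    print "payload:  "
    say payload
.end
```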

Standard Parrot Exceptions

Parrot comes with a small hierarchy of classes designed for use as exceptions. Parrot throws them when internal Parrot errors occur, but any user code can throw them too.

{{NOTE: Currently NYI. Parrot currently uses integers to represent exception types.}}

{{NOTE: Questions about how this interoperates with custom HLL exception classes}}

exception
Base class of all standard exceptions. Provides no special functionality. Exists for the purpose of isa testing.
exception;death
Exception type that is thrown by the die opcode. See the description of the die opcode in this document.
exception;errno
A system error as reported in the C variable errno. Payload is an integer. Message is the return value of the standard C function strerror().
exception;exit
Exception type that is thrown by the exit opcode. See the description of the exit opcode in this document.
exception;math
Generic base class for math errors.
exception;math;division_by_zero
Division by zero (integer or float). No payload.
exception;domain
Generic base class for miscellaneous domain (input value) errors. Payload is an array, the first element of which is the operation that failed (e.g. the opcode name); subsequent elements depend on the value of the first element.
(Note: There is not a separate exception class for every operation that might throw a domain exception. Class proliferation is expensive, both to Parrot and to the humans working with it who have to memorize a class hierarchy. But I understand the temptation.)
exception;lexical
A find_lex or store_lex operation failed because a given lexical variable was not found. Payload is an array: [0] the name of the lexical variable that was not found, [1] the LexPad in which it was not found.

Opcodes that Throw Exceptions

Exceptions have been incorporated into built-in opcodes in a limited way. For the most part, they're used when the return value is either impractical to check (perhaps because we don't want to add that many error checks in line), or where the output type is unable to represent an error state (e.g. the output I register of the ord opcode).

The div, fdiv, and cmod opcodes throw exception;math;division_by_zero.
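
A sketch of catching such an exception from PIR. The operands are kept in integer registers so that the division is performed at run time; all names are hypothetical, and the message text is whatever the implementation supplies.

```pir
.sub 'main' :main
    .local int numerator, denominator, quotient
    numerator   = 1
    denominator = 0

    push_eh on_div_zero
    quotient = div numerator, denominator    # expected to throw
    pop_eh
    say quotient                             # skipped when the division throws
    .return ()

  on_div_zero:
    .local pmc caught
    .local string message
    .get_results (caught)
    message = caught['message']
    print "division failed: "
    say message
.end
```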

The ord opcode throws exception;domain when it's passed an empty argument or a string index that's outside the length of the string. Payload is an array, first element being the string 'ord'.

The find_charset opcode throws exception;domain if the charset name it's looking up doesn't exist. Payload is an array: [0] string 'find_charset', [1] charset name that was not found.

The trans_charset opcode throws exception;domain on "information loss" (presumably, this means when one charset doesn't have a one-to-one correspondence in the other charset). Payload is an array: [0] string 'trans_charset', [1] source charset name, [2] destination charset name, [3] untranslatable code point.

The find_encoding opcode throws exception;domain if the encoding name it's looking up doesn't exist. Payload is an array: [0] string 'find_encoding', [1] encoding name that was not found.

The trans_encoding opcode throws exception;domain on "information loss" (presumably, this means when one encoding doesn't have a one-to-one correspondence in the other encoding). Payload is an array: [0] string 'trans_encoding', [1] source encoding name, [2] destination encoding name, [3] untranslatable code point.

Parrot's default version of the LexPad PMC throws exception;lexical for some error conditions, though other implementations can choose to return error values instead.

By default, the find_lex and store_lex opcodes throw an exception (exception;lexical) when the given name can't be found in any visible lexical pads. However, this behavior is only a default, as provided by the default Parrot lexical pad PMC LexPad. If a given HLL has its own lexical pad PMC, its behavior may be very different. (For example, in Tcl, store_lex is likely to succeed every time, as creating new lexicals at runtime is OK in Tcl.)

{{ TODO: List any other opcodes that currently throw exceptions and general categories of opcodes that should throw exceptions. }}

Other opcodes respond to an errorson setting to decide whether to throw an exception or return an error value. get_hll_global and get_root_global throw an exception (or return a Null PMC) if the global name requested doesn't exist. find_name throws an exception (or returns a Null PMC) if the name requested doesn't exist in a lexical, current, global, or built-in namespace.
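
The non-throwing path might look like the sketch below. It assumes that the companion errorsoff opcode and the .PARROT_ERRORS_GLOBALS_FLAG constant from errors.pasm are the way to select this behaviour; neither is named in this document, so treat both as assumptions. The global name is hypothetical.

```pir
.include 'errors.pasm'

.sub 'main' :main
    # Assumption: this flag controls whether a missing global throws.
    errorsoff .PARROT_ERRORS_GLOBALS_FLAG

    .local pmc found
    found = get_hll_global 'no_such_global'   # hypothetical name
    if_null found, missing
    say "global is defined"
    .return ()

  missing:
    # With the flag off, the lookup yields a Null PMC instead of throwing.
    say "global is not defined"
.end
```

With the flag turned on instead (errorson), the same lookup would be expected to throw, and the check for a Null PMC would be replaced by an exception handler.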

{{ TODO: "errorson" as specified is dynamically rather than lexically scoped; is this good? Probably not good. Let's revisit it when we get the basic exceptions functionality implemented. }}

It's a little odd that so few opcodes throw exceptions (these are the ones that are documented, but a few others throw exceptions internally even though they aren't documented as doing so). It's worth considering either expanding the use of exceptions consistently throughout the opcode set, or eliminating exceptions from the opcode set entirely. The strategy for error handling should be consistent, whatever it is. [I like the way LexPads and the errorson settings provide the option for exception-based or non-exception-based implementations, rather than forcing one or the other.]

{{ NOTE: There are a couple of different factors here. One is the ability to globally define the severity of certain exceptions or categories of exceptions without needing to define a handler for each one. (e.g. Perl 6 may have pragmas to set how severe type-checking errors are. A simple "incompatible type" error may be fatal under one pragma, a resumable warning under another pragma, and completely silent under a third pragma.) Another is the ability to "defang" opcodes so they return error codes instead of throwing exceptions. We might provide a very simple interface to catch an exception and capture its payload without the full complexity of manually defining exception handlers (though it would still be implemented as an exception handler internally). Something like:

This could eliminate the need for "defanging" because it would be almost as easy to use as error codes. It could be implemented once for all exceptional opcodes, instead of needing to be defined for each one. And, it still keeps the error information out-of-band, instead of mixing the error in with normal return values. }}

Exception Object Interface

Retrieving the Exception Message

The exception message is stored in the 'message' attribute:

  # ...
 handler:
  .local pmc exception
  .local string message
  .get_results (exception)
  message = exception['message']
  say message

Resuming after Exceptions

Exceptions thrown by standard Parrot opcodes (like the one thrown by get_hll_global above or by the throw opcode) are always resumable, so when the exception handler function returns normally, execution continues at the opcode immediately after the one that threw the exception. Other exceptions at the run-loop level are also generally resumable.

{{NOTE: Currently only implemented for the actual throwing opcodes, throw, die, exit.}}

You resume from an exception by invoking the return continuation stored in the 'resume' attribute of the exception.

  .sub 'main' :main
    push_eh handler
    $P0 = new 'Exception'          # create new exception object
    throw $P0                      # throw it
    pop_eh                         # execution continues here after resuming
    say "Everything is just fine."
    .return()
   handler:
    .local pmc exception, continuation
    .get_results (exception)
    continuation = exception['resume']
    continuation()                 # resume; takes no parameters
  .end

Attachments

None.

Footnotes

None.

References

  src/ops/core.ops
  src/exceptions.c
  src/pmc/exception.pmc
  src/pmc/exceptionhandler.pmc