docs/pdds/pdd15_objects.pod - Object and Class semantics for Parrot
This PDD describes the semantics of Parrot's object and class systems. The PDD is divided into two parts: the semantics expressed to user programs through PMCs, and the default back-end class scheme.
Note that the class system is not the single mandated class scheme, merely the one designed to express the semantics needed for Perl 6, Ruby, and Python. Alternate class systems are certainly possible, and direct compatibility with the system as described here isn't strictly necessary.
This is a reasonably straightforward object system. It assumes that objects have:
- An array of attributes. Note that attribute values are always PMCs.
- A parent class
- A custom (though possibly class-wide) vtable
and that you can:
- Call a method on an object
- Get a method PMC for a method for an object (for deferred method calls)
- Fetch the class for an object
- Subclass an existing object (note that objects may not necessarily be able to have their classes changed arbitrarily, but making a subclass and moving the object to it is allowable)
- Get an attribute by name or offset
- Set an attribute by name or offset
Additionally we assume that all objects can have properties on them, as all PMCs can have properties. The property get/set method may be overridden on a per-class basis as any other vtable method may be.
For classes, we assume that:
- Classes have an associated namespace (which may be anonymous).
- Classes have one or more immediate parent classes
- Classes have a catalog of attribute names and offsets for all attributes.
- Classes have a list of interfaces they implement
And we further assume that classes can:
- Instantiate an object of their class
- Add parent classes
- Remove parent classes
- Add attributes
- Remove attributes
- Add interfaces
- Remove interfaces
This list is likely not definitive, but it's enough to start with. It also doesn't address the semantics of method calls, which need to be dealt with, possibly separately.
With that in mind, the object system supports these features with a combination of PMC classes (not to be confused with object classes) and opcodes.
There are four pieces to the object implementation: the PMCs for the classes and objects, the opcodes the engine uses to do objecty things, the specific vtable methods used to perform those objecty things, and the supporting code provided by the interpreter engine to do the heavy lifting.
Please note that Parrot, in general, does not restrict operations on objects and classes. If a language has restrictions on what can be done with them, the language is responsible for making sure that disallowed things do not happen. For example, Parrot permits multiple inheritance, and will not stop code that adds a new parent to an existing class. If a language doesn't allow multiple inheritance, it must not emit code which would add multiple parents to a class. (Parrot may, at some point, allow imposition of runtime restrictions on a class, but currently it doesn't.)
There are two PMC classes, ParrotClass and ParrotObject. ParrotObject PMCs are the actual objects, and hold all the per-object instance data. ParrotClass PMCs hold all the class-specific information. Instantiating a new OO class creates a new ParrotClass PMC, and enters the new OO class into Parrot's PMC class table, at which point it is indistinguishable from any other PMC class. (This doesn't mean that non-ParrotClass things can be subclassed or treated as an OO class. Nor is that forbidden; it is just unimplemented.)
It's important to note that all 'standard' classes are ParrotClass PMC instances, and all 'standard' objects are ParrotObject PMCs. We do not create a brand new PMC class for each OO class, and they all share the ParrotClass or ParrotObject vtable, respectively. This distinction is mostly an artifact of the implementation, and may change in the future.
While the internals of the class and object PMCs should be considered black boxes, here's some documentation as to what they are for implementation purposes.
The ParrotClass PMC holds a six-element array, which is:
- 0
- An array PMC of the immediate parent classes
- 1
- The class name PMC
- 2
- An array of all parent PMCs, in search order
- 3
- The class attribute section hash. Keys are the class name in language-defined format (so Perl would be foo::bar, while Java would be some.damn.long.thing.with.dots), values are the integer offset from the start of the attribute array where that class' attributes start.
- 4
- The class attribute name hash. Keys are the fully qualified attribute names and the values are the offset from the beginning of the attribute array of the particular attribute.
- 5
- The class attribute array. This is an array of unqualified attribute names.
Note that the attribute catalog holds all the attributes for an object. This includes the attributes in the object's class as well as all the attributes defined in all the parent classes. (Multiple inheritance makes this necessary: the offsets of a class' attributes will change from child class to child class.)
ParrotClass PMCs also have the "I am a class" flag set on them.
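The catalog layout described above can be modelled outside Parrot. The following is a minimal Python sketch, not Parrot's implementation: the function name `build_catalog` and the choice to lay classes out in the order given are assumptions made purely for illustration. It mirrors the section hash and the fully qualified name hash, and shows why a class' offsets can differ from one child class to another.

```python
def build_catalog(layout):
    """Model the section hash and name hash of a ParrotClass-style catalog.

    layout is a list of (class_name, [unqualified attribute names]) pairs
    in the order their attributes are laid out (an assumed ordering).
    """
    section_offsets = {}  # class name -> offset of that class' first attribute
    name_offsets = {}     # "Class\0attr" -> offset of that attribute
    next_offset = 0
    for class_name, attrs in layout:
        section_offsets[class_name] = next_offset
        for attr in attrs:
            # Fully qualified name: class name, a null, then the attribute name.
            name_offsets[class_name + "\0" + attr] = next_offset
            next_offset += 1
    return section_offsets, name_offsets


# Foo inherits from A and B; Bar inherits from B only.
foo_sections, foo_names = build_catalog(
    [("A", ["x"]), ("B", ["y", "z"]), ("Foo", ["a", "b"])]
)
bar_sections, bar_names = build_catalog([("B", ["y", "z"]), ("Bar", ["c"])])

print(foo_sections["B"], bar_sections["B"])  # B's section starts at 1 in Foo, 0 in Bar
print(foo_names["Foo\0b"])                   # offset of Foo's second attribute: 4
```

Because B's section starts at a different offset in each child, code that wants positional access has to ask for the class offset per object rather than hard-coding it.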
The ParrotObject PMC is an array of meta-information and attributes.
The elements of this array are:
- 0
- The class PMC
- 1
- The class name PMC
- 2+
- The attributes for the object
Note that ParrotObject PMCs also have the "I am an object" flag set on them.
The following ops are provided to deal with objects. Please note that method calls are governed by Parrot's calling conventions, and as such objects, method PMCs, return continuations, and parameters must be in the right places, though some ops will put parameters where they need to go.
- classoffset Ix, Py, Sz
- Returns the offset of the first attribute for class Sz in object Py.
- getattribute Px, Py, Iz
- Returns attribute Iz of object Py and puts it in Px. Note that the attribute number is an absolute offset.
- getattribute Px, Py, Sz
- Get the attribute with the fully qualified name Sz from object Py and put it in Px.
- setattribute Px, Iy, Pz
- Set the attribute Iy of object Px to Pz. Note that this op stores the actual PMC rather than a copy, and so if the PMC's value is subsequently changed, the value of the attribute will also change.
- setattribute Px, Sy, Pz
- Set the attribute of object Px with the fully qualified name Sy to Pz.
- fetchmethod Px, Py, Sz
- Find the PMC for method Sz of object Py, and put it in Px. How the returned method PMC behaves if it goes out of scope, if the class hierarchy changes, or if the method definitions change is entirely up to the class that provides the PMC.
- callmethod
- callmethod Sz
- Call a method. If the method name is provided, we find the PMC for the named method and put it in the sub/method slot. If no name is provided we assume that all the calling conventions have already been set up and the method PMC is already in the proper place.
- callmethodcc
- callmethodcc Sx
- Make a method call, automatically generating a return continuation. If a method name is passed in we look up the method PMC for the object and put it in the method slot. If a method name isn't provided then we assume that things are already properly set up.
- tailcallmethod (Unimplemented)
- tailcallmethod Sx (Unimplemented)
- Make a tail call to method Sx. If no method name is given, we assume everything is already set up properly.
- newclass Px, Sy
- Create a new base class named Sy, and put the PMC for it in Px.
- subclass Px, Py, Sz
- Create a new class, named Sz, which has Py as its immediate parent.
- addparent Px, Py
- Add class Py to the end of the list of immediate parents of class Px. Adds any attributes of Py (and its parent classes) that aren't already in Px.
- removeparent Px, Py (Unimplemented)
- Remove class Py from the parent list of Px. All parent classes of Py which aren't parent classes of what remains of Px's parent list are removed, as are their attributes.
- addattribute Px, Sy
- Add attribute Sy to class Px. This will add the attribute slot to all objects of class Px and children of class Px, with a default value of Null.
- removeattribute Px, Sy (Unimplemented)
- Remove the attribute Sy from class Px, all objects of class Px, and all objects of a child of class Px.
- instantiate Px, Py, Sz (Unimplemented)
- Instantiate a brand new class, based on the metadata in Py, named Sz.
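The note that setattribute stores the actual PMC rather than a copy is easy to overlook. As a rough Python model (the `Box` and `Obj` classes are hypothetical stand-ins for a PMC and an object with an attribute array, not Parrot APIs), a change made to the value after it has been stored remains visible through the attribute:

```python
class Box:
    """Hypothetical stand-in for a PMC holding a mutable value."""
    def __init__(self, value):
        self.value = value


class Obj:
    """Hypothetical stand-in for an object with an attribute array."""
    def __init__(self, slots):
        self.attrs = [None] * slots   # None plays the role of a null default

    def setattribute(self, offset, pmc):
        self.attrs[offset] = pmc      # stores the PMC itself, not a copy

    def getattribute(self, offset):
        return self.attrs[offset]


obj = Obj(2)
counter = Box(1)
obj.setattribute(0, counter)
counter.value = 42                    # change the PMC after storing it
print(obj.getattribute(0).value)      # the attribute sees the new value: 42
```

Under this model, code that needs an independent value would have to make its own copy before storing it.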
To make this work all PMCs must have the following vtable entries. They may, for non-objects, throw an exception.
The catalog metadata for objects is considered to be attributes on the class, so to get the offset for a class for an object, you fetch the object's class then look up the offset attribute from it. (The class attributes are detailed later.) This is safe in general, since the only code reasonably querying a class' attribute list is the class code itself, and if a class doesn't know whether it's a ParrotClass-style class or not you've got bigger problems.
- find_method(string *)
- Returns the PMC for the named method. If no method of this name exists, nor can be constructed, returns a Null PMC.
- Note that for languages which support default fallback methods, such as Perl 5's AUTOLOAD, this would be the place to return it if a normal lookup fails.
- isa(class *)
- Returns true if the class passed in as a parameter is in the inheritance hierarchy of the object, false otherwise.
- can(string *)
- Returns true if the object can perform the requested method (including via an AUTOLOAD), false otherwise.
- does(class *)
- Returns true if the object in question implements the interface passed in, false otherwise.
- get_attr(INTVAL)
- Returns the attribute at the passed-in offset for the object.
- get_attr(STRING*)
- Returns the attribute with the fully qualified name for the object.
- set_attr(INTVAL, PMC *)
- Sets the attribute for the passed-in offset to the passed-in PMC value.
- set_attr(STRING*, PMC *)
- Sets the attribute with the fully qualified name for the object to the passed-in PMC value.
- get_class
- Returns the class PMC for the object.
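As an illustration of how find_method, can, and isa relate to one another, here is a small Python model. It is a sketch under stated assumptions: classes are plain dictionaries, `search_order` stands for the class followed by all its parents in search order, the `AUTOLOAD` key is a hypothetical fallback hook in the spirit of Perl 5, and `None` plays the role of the Null PMC.

```python
def find_method(search_order, name):
    """Return the first method called `name`, a fallback, or None (Null PMC)."""
    for cls in search_order:
        if name in cls["methods"]:
            return cls["methods"][name]
    # Normal lookup failed: offer a fallback method if any class defines one.
    for cls in search_order:
        if "AUTOLOAD" in cls["methods"]:
            return cls["methods"]["AUTOLOAD"]
    return None


def can(search_order, name):
    """True if the object can perform the method, fallback included."""
    return find_method(search_order, name) is not None


def isa(search_order, class_name):
    """True if class_name is in the object's inheritance hierarchy."""
    return any(cls["name"] == class_name for cls in search_order)


base = {"name": "Base", "methods": {"speak": lambda: "base speaks"}}
child = {"name": "Child", "methods": {"walk": lambda: "child walks"}}
order = [child, base]  # the class itself, then its parents

print(find_method(order, "speak")())   # found in the parent class
print(can(order, "fly"))               # False: no such method and no fallback
print(isa(order, "Base"))              # True
```

In this model, adding an `AUTOLOAD` entry to either class would make `can` answer true for any method name, which matches the parenthetical in the description of can.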
Currently Parrot only supports mutating a class' metainformation for ParrotClass classes.
This is a restriction which will be lifted at some point soon.
The bytecode is isolated from most of the internal details of the implementation. This allows both for flexibility in the implementation and forward compatibility, generally good things. It also allows for multiple concurrent interoperable object systems. The major thrust is for transparent use of objects, though most class activity (including creation of subclasses and modifications of existing classes) should be transparent as well.
The following examples all assume we're working with basic ParrotObject objects and ParrotClass classes.
To create a new class Foo which has no parent classes:
newclass $P0, "Foo"
To create a class Foo with the parents A and B, the code would be:
getclass $P0, "A"
getclass $P1, "B"
subclass $P2, $P0, "Foo"
addparent $P2, $P1
Adding the attributes a and b to the new class Foo:
newclass $P0, "Foo"
addattribute $P0, "a" # This is offset 0 + classoffset
addattribute $P0, "b" # This is offset 1 + classoffset
Assuming we want an object of class Foo:
.local int FooType
.local pmc MyObject
find_type FooType, "Foo"
new MyObject, FooType
Calling the method Xyzzy on an object, assuming the PDD03 calling conventions are respected:
# Either pass the method name as an operand:
callmethod "Xyzzy"
# or put it in S0 first and use the no-operand form:
set S0, "Xyzzy"
callmethod
Or, if a return continuation needs constructing:
callmethodcc "Xyzzy"
# or, equivalently:
set S0, "Xyzzy"
callmethodcc
Assuming we've an object that has class Foo in it somewhere and want to get the second attribute b out of it:
.local int BaseOffset
.local int BOffset
classoffset BaseOffset, $P0, "Foo"
BOffset = BaseOffset + 1
getattribute $P1, $P0, BOffset
Or with named access, if it isn't time critical:
# The string is meant as the fully qualified name: "Foo", a null, then "b"
getattribute $P1, $P0, "Foo\x0b"
To get a new class, you can do a newclass, which creates a new class with no parents besides Parrot's default super-ish parent class (which doesn't appear in the class list anywhere, though arguably it ought to).
To get a new child class, you have two potential options:
- Subclass the parent
- Create a new standalone class and add a parent
Both ways work. It is, however, more efficient to use the first method, and just subclass the immediate parent class of your new class.
When adding in extra parents in a multiple-inheritance scenario, subclass the first class in the immediate parent list, then use the addparent op to add in the rest of the immediate parents.
Do be aware that, right now, you should not add attributes or parents to a class that's been subclassed or has had objects instantiated. Doing so leaves the internal structures of the classes and objects in an inconsistent state, and things won't work the way you want them to. At the moment Parrot won't warn if you do this, but it will soon. The restriction on parent list changes and attribute addition will be lifted in future releases, though doing so will be an expensive operation.
Classes may override the vtable methods, allowing objects of a class to behave like a primitive PMC. Each vtable slot has a corresponding named method that Parrot looks for in your class hierarchy when an object is used in a primitive context.
To use these properly at a low level requires a good working knowledge of the way Parrot works; generally, for higher-level languages the language compiler or runtime will provide easier-to-use wrappers. These methods are all prototyped, take a single fixed argument list, and return at most a single value.
While vtable methods may take a continuation, those continuations may not escape the vtable method's execution. This is due to the way that vtable methods are called by the interpreter: once a vtable method is exited, any continuation taken within it is no longer valid and may not be used.
Note that any class method that wishes to use Parrot's multi-method dispatch system may do so. This is, in fact, encouraged, though it is not required. In the absence of explicit multimethod dispatch, a left-side-wins scheme is used.
The following list details the raw method names:
- __init
- Called when the object is first created.
- __init_pmc
- __init_pmc_props
- __morph
- __mark
- Called when the DOD (dead object detection) pass is tracing live PMCs. If this method is called then the code must mark all strings and PMCs that it contains as live, otherwise they may be collected.
- This method is only called if the PMC is flagged as having a special mark routine, and is not necessary for normal objects.
- __destroy
- Called when the object is destroyed. This method is only called if the PMC is marked as having an active finalizer.
- __getprop
- __setprop
- __delprop
- __getprops
- __type
- __type_keyed
- __type_keyed_int
- __type_keyed_str
- __subtype
- __name
- __clone
- __find_method
- __get_integer
- Return the integer value of the object
- __get_integer_keyed
- __get_integer_keyed_int
- __get_integer_keyed_str
- __get_number
- Return the floating-point value of the object
- __get_number_keyed
- __get_number_keyed_int
- __get_number_keyed_str
- __get_bignum
- Return the extended precision numeric value of the PMC
- __get_string
- Return the string value of the PMC
- __get_string_keyed
- __get_string_keyed_int
- __get_string_keyed_str
- __get_bool
- Return the true/false value of the PMC
- __get_bool_keyed
- __get_bool_keyed_int
- __get_bool_keyed_str
- __get_pmc
- Return the PMC for this PMC.
- __get_pmc_keyed
- __get_pmc_keyed_int
- __get_pmc_keyed_str
- __get_pointer
- __get_pointer_keyed
- __get_pointer_keyed_int
- __get_pointer_keyed_str
- __set_integer_native
- Set the integer value of this PMC
- __set_integer_same
- __set_integer_keyed
- __set_integer_keyed_int
- __set_integer_keyed_str
- __set_number_native
- Set the floating-point value of this PMC
- __set_number_same
- __set_number_keyed
- __set_number_keyed_int
- __set_number_keyed_str
- __set_bignum_int
- Set the extended-precision value of this PMC
- __set_string_native
- Set the string value of this PMC
- __set_string_same
- __set_string_keyed
- __set_string_keyed_int
- __set_string_keyed_str
- __set_bool
- Set the true/false value of this PMC
- __assign_pmc
- Set the value to the value of the passed-in PMC
- __set_pmc
- Make the PMC refer to the PMC passed in
- __set_pmc_keyed
- __set_pmc_keyed_int
- __set_pmc_keyed_str
- __set_pointer
- __set_pointer_keyed
- __set_pointer_keyed_int
- __set_pointer_keyed_str
- __elements
- Return the number of elements in the PMC, if the PMC is treated as an aggregate.
- __pop_integer
- __pop_float
- __pop_string
- __pop_pmc
- __push_integer
- __push_float
- __push_string
- __push_pmc
- __shift_integer
- __shift_float
- __shift_string
- __shift_pmc
- __unshift_integer
- __unshift_float
- __unshift_string
- __unshift_pmc
- __splice
- __add
- __add_int
- __add_float
- __subtract
- __subtract_int
- __subtract_float
- __multiply
- __multiply_int
- __multiply_float
- __divide
- __divide_int
- __divide_float
- __modulus
- __modulus_int
- __modulus_float
- __cmodulus
- __cmodulus_int
- __cmodulus_float
- __neg
- __bitwise_or
- __bitwise_or_int
- __bitwise_and
- __bitwise_and_int
- __bitwise_xor
- __bitwise_xor_int
- __bitwise_ors
- __bitwise_ors_str
- __bitwise_ands
- __bitwise_ands_str
- __bitwise_xors
- __bitwise_xors_str
- __bitwise_not
- __bitwise_shl
- __bitwise_shl_int
- __bitwise_shr
- __bitwise_shr_int
- __concatenate
- __concatenate_native
- __is_equal
- __is_same
- __cmp
- __cmp_num
- __cmp_string
- __logical_or
- __logical_and
- __logical_xor
- __logical_not
- __repeat
- __repeat_int
- __increment
- __decrement
- __exists_keyed
- __exists_keyed_int
- __exists_keyed_str
- __defined
- __defined_keyed
- __defined_keyed_int
- __defined_keyed_str
- __delete_keyed
- __delete_keyed_int
- __delete_keyed_str
- __nextkey_keyed
- __nextkey_keyed_int
- __nextkey_keyed_str
- __substr
- __substr_str
- __invoke
- __can
- __does
- __isa
- __freeze
- __thaw
- __thawfinish
- __visit
- __share
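To make the lookup of named vtable methods and the left-side-wins rule concrete, here is a hedged Python model. It is not how Parrot is implemented: the `Instance` class, the method dictionaries, and the `binary_op` helper are invented for illustration. The left operand's class hierarchy is searched for the named method, and the right operand's classes are never consulted.

```python
class Instance:
    """Hypothetical object: a payload plus its class search order."""
    def __init__(self, search_order, payload):
        self.search_order = search_order  # list of {method name: function}
        self.payload = payload


def find_vtable_method(obj, name):
    """Look up a named vtable override such as "__add" in the hierarchy."""
    for methods in obj.search_order:
        if name in methods:
            return methods[name]
    return None


def binary_op(name, left, right):
    """Left-side wins: only the left operand's classes are consulted."""
    method = find_vtable_method(left, name)
    if method is None:
        raise TypeError("left operand does not override " + name)
    return method(left, right)


# Two unrelated classes that each override "__add" differently.
plain = {"__add": lambda self, other: self.payload + other.payload}
scaled = {"__add": lambda self, other: self.payload + other.payload * 1000}

a = Instance([plain], 2)
b = Instance([scaled], 3)

print(binary_op("__add", a, b))  # uses the "__add" found through a: 5
print(binary_op("__add", b, a))  # uses the "__add" found through b: 2003
```

Swapping the operands changes which override runs, which is the observable consequence of a left-side-wins scheme.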
Since every object system on the planet shares a common set of terms but uses them completely differently, this section defines the terms as they are used in this document.
- Property
- A name and value pair attached to a PMC. Properties may be attached to the PMC in its role as a container or the PMC in its role as a value.
- Properties are global to the PMC. That is, there can only be one property named "FOO" attached to a PMC, and it is globally visible to all inspectors of the PMC's properties. They are not restricted by class.
- Properties are generally assigned at runtime, and a particular property may or may not exist on a PMC at any particular time. Properties are not restricted to objects as such, and any PMC may have a property attached to it.
- Attribute
- An attribute is a slot in an object that contains a value, generally a PMC. (Containing non-PMCs leads to interesting garbage collection issues at the moment.) Attributes are referenced either by slot number or by class name/attribute name pairs (at least conceptually).
- Attributes are set on a class-wide basis, and all the objects of a class will have the same set of attributes. Generally attributes aren't added or removed from classes at runtime, as this would require resizing and moving the elements of the attribute arrays of existing objects, and potentially recompiling code with fixed attribute offsets embedded in it. Most OO languages don't allow attribute changes to existing classes, though Parrot's base attribute system does allow this.
- The fully qualified name of an attribute is the classname, a null, and the attribute name. Parrot synthesizes the fully-qualified name where it needs to.
- Method
- In its strictest sense, a method is a chunk of code that you call with an object in the object slot of the calling conventions.
- More generally, a method is some piece of code that you invoke by name through an object. You call the object's "Invoke a method" vtable entry, passing in the method name (assuming we don't just get it from the sub name register, per calling conventions). The object is then responsible for doing something with the method being requested. Presumably it calls the method, though this isn't strictly required.
- Delegate
- An object that is transparently (to the user) embedded in another object. Delegate objects are used in those cases where we can't inherit from a class because the class is from a different object universe.
- As an example, assume you have a class A, which inherits from class B. The classes are incompatible, so Parrot can't automatically meld B into A, as it might if they were compatible. When instantiating an object of class A, Parrot will automatically instantiate an object of class B and embed it in the object of class A. The object of class B is class A's delegate: when a method call comes in that A can't handle, that method call is delegated to B.
- Parent class
- Also called the super-class. The parent class is, in an inheritance situation, the class being derived from. If A derives from B, B is the parent class of A.
- Child class
- Also called the sub-class. The child class is, in an inheritance situation, the class doing the deriving. If A derives from B, A is the child class.
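The delegate entry above can be sketched in Python. This is an illustrative model only: the class names `ForeignB` and `A` and the use of Python's `__getattr__` hook are assumptions standing in for two incompatible object universes, not a description of Parrot's mechanism. An `A` object embeds a `ForeignB` instance when it is created and forwards any method it cannot handle itself.

```python
class ForeignB:
    """Stands in for a class from a different object universe."""
    def greet(self):
        return "hello from B"


class A:
    """Cannot truly inherit from ForeignB, so it embeds a delegate instead."""
    def __init__(self):
        self._delegate = ForeignB()   # instantiated along with the A object

    def name(self):
        return "A"

    def __getattr__(self, method_name):
        # Called only when normal lookup on A fails: delegate the call.
        return getattr(self._delegate, method_name)


obj = A()
print(obj.name())    # handled by A itself
print(obj.greet())   # A has no greet, so the call is delegated to B
```

From the caller's point of view the embedding is invisible, which is what "transparently (to the user)" means in the definition.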
The following lists a set of languages and, within each language, what its terms translate to in Parrot.
- Python
- Attribute
- A Python attribute maps to a Parrot property
- .NET
- Attribute
- What .NET calls an attribute Parrot calls a property
- Property
- What .NET calls a property we call an attribute
- Generic Terminology
- Instance Variable
- Instance Variables map to what we call attributes
Maintainer: Dan Sugalski
Class: Internals
PDD Number: 15
Version: 1.2
Status: Developing
Last Modified: February 09, 2004
PDD Format: 1
Language: English
- Version 1.3
- April 3, 2004
- Version 1.2
- February 9, 2004
- Version 1.1
- March 11, 2002
- Version 1.0
- None. First version
- Version 1.3
- Removed some unimplemented notes. Changed vtables to get_*, set_* so that they match other vtable function syntax.
- Version 1.2
- A complete overhaul from the original spec.
- Version 1.1
- Removed attributes from the object interface and put them in the class interface section, where they belong.
- Version 1.0
- None. First version