docs/pdds/pdd15_objects.pod - Object and Class semantics for Parrot
This PDD describes the semantics of Parrot's object and class systems.
$Revision$
An object is a value that incorporates both data and behavior related to that data.
A class defines a pattern of characteristics and behaviors from which objects are constructed.
An attribute is a slot in an object that contains a value, generally a PMC. Attributes are referenced by class name/attribute name pairs.
Attributes are set on a class-wide basis, and all the objects of a class will have the same set of attributes. Most OO languages don't allow attribute changes to existing classes, but Parrot's base attribute system does allow it. In order to safely support advanced dynamic features in HLLs, attributes are not accessible via fixed attribute offsets, but only via named lookup.
A method is a piece of code that you invoke by name through an object. Methods implement the behaviour of an object.
Also called the super-class. The parent class is, in an inheritance situation, the class being derived from. If A derives from B, B is the parent class of A.
Also called the sub-class. The child class is, in an inheritance situation, the class doing the deriving. If A derives from B, A is the child class.
A role adds attributes and methods into a class without inheritance.
The composed class retains a list of roles applied to it (so they can be checked with does), but otherwise maintains no distinction between composed attributes and methods and those defined in the class.
An object that is transparently (to the user) embedded in another object. Delegate objects are used in those cases where we can't inherit from a class because the class is from a different object universe.
A property is a role that only adds attributes and accessors.
Properties are generally assigned at runtime, and a particular property may or may not exist on a PMC at any particular time. Properties are not restricted to objects as such--any PMC may have a property attached to it.
An interface is a role that only adds methods.
There are four pieces to the object implementation. There are the PMCs for the classes, roles, and objects, the opcodes the engine uses to do objecty things, the specific vtable functions used to perform those objecty things, and the supporting code provided by the interpreter engine to do the heavy lifting.
Parrot, in general, doesn't restrict operations on objects and classes. If a language has restrictions on what can be done with them, the language is responsible for making sure that disallowed things do not happen. For example, Parrot permits multiple inheritance, and will not stop code that adds a new parent to an existing class. If a language doesn't allow for multiple inheritance it must not emit code which would add multiple parents to a class. (Parrot may, at some point, allow imposition of runtime restrictions on a class, but currently it doesn't.)
There are two PMC classes, Class and Object. Class PMCs hold all the class-specific information. Instantiating a new OO class creates a new Class PMC, and enters the new OO class into Parrot's PMC class table, at which point it is indistinguishable from any other PMC class.
It's important to note that 'standard' classes are Class PMC instances, or instances of a subclass of the Class PMC, and 'standard' objects are Object PMCs. It isn't necessary to create a brand new low-level PMC class for each OO class, and they all share the Class or Object vtable, respectively.
An instance of the Class PMC has ten internal attributes, which are:
The attribute catalog holds only the attributes defined in a particular class. When instantiating an object, the object data store is created as a ResizablePMCArray, so it doesn't need any specific details of the class's attribute structure. As attributes are set in the object (based on the index in the lookup table), the array expands to accommodate the attribute indexes that are actually used. In the common case, only a relatively small set near the lower index range will be used.
When setting the attribute cache it is necessary to scan all parent classes as well as the instantiated class for attributes defined there. The inheritance rules (MRO) for a particular HLL will determine which child class attributes override which parent class attributes. The cache is only set on individual accesses to a particular attribute.
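The lookup scheme described above can be modeled in a few lines. The following Python sketch is an illustration only, not Parrot's implementation: the names ClassModel, ObjectModel and attribute_index, and the simple depth-first scan of parents, are assumptions made for the example. It shows a per-class attribute catalog, a cache that is filled only when a particular attribute is first accessed, and an object data store that grows to fit the indexes actually used.

```python
class ClassModel:
    """Hypothetical stand-in for a Class PMC (illustration only)."""

    def __init__(self, name, attributes=(), parents=()):
        self.name = name
        self.catalog = list(attributes)   # attributes defined in this class only
        self.parents = list(parents)
        self.cache = {}                   # (class name, attribute name) -> index

    def _defines(self, class_name, attr_name):
        # Scan this class, then its parents, for the class defining the attribute.
        if self.name == class_name and attr_name in self.catalog:
            return True
        return any(p._defines(class_name, attr_name) for p in self.parents)

    def attribute_index(self, class_name, attr_name):
        key = (class_name, attr_name)
        if key not in self.cache:         # the cache is set on first access only
            if not self._defines(class_name, attr_name):
                raise AttributeError(
                    "no attribute %s in class %s" % (attr_name, class_name))
            self.cache[key] = len(self.cache)
        return self.cache[key]


class ObjectModel:
    """Hypothetical stand-in for an Object PMC."""

    def __init__(self, cls):
        self.cls = cls
        self.store = []                   # grows on demand, like a resizable array

    def set_attribute(self, class_name, attr_name, value):
        index = self.cls.attribute_index(class_name, attr_name)
        while len(self.store) <= index:
            self.store.append(None)
        self.store[index] = value

    def get_attribute(self, class_name, attr_name):
        index = self.cls.attribute_index(class_name, attr_name)
        return self.store[index] if index < len(self.store) else None


animal = ClassModel("Animal", attributes=["name"])
dog = ClassModel("Dog", attributes=["name", "tricks"], parents=[animal])
rex = ObjectModel(dog)
rex.set_attribute("Dog", "tricks", ["sit"])
rex.set_attribute("Animal", "name", "generic")
rex.set_attribute("Dog", "name", "Rex")
```

Because every access goes through a class name/attribute name pair, the parent's "name" and the child's "name" occupy separate slots, and no caller depends on a fixed offset.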
(If a parent class changes its set of attributes, should that change appear in later instantiations of objects from child classes? If so, all of these classes would need to be re-constructed as a result of the change; note that any already instantiated objects would refer to the old class. NOTE: flag old classes with an "updated" status, to notify objects of the old class that they should rebless themselves into the new class next time they access the old class?)
Class PMCs also have the "I am a class" flag set on them.
Extending an existing class that has been instantiated creates a new class object that replaces the old class object in the Namespace. However, the old class object must be kept, as the old objects still point to it and do their method resolution and attribute lookup through that class object.
If a class hasn't been instantiated, adding a method or attribute only modifies the existing class object instead of creating a new class object. Extending a class that has been instantiated only causes the creation of a new class object the first time it's extended. After that, methods and attributes added to it will only modify the existing class object until it is instantiated again.
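The replacement rule above can be sketched as a small Python model. This is an illustration under stated assumptions, not Parrot code: ClassVersion, Instance and the add_method helper are hypothetical names, and the namespace is modeled as a plain dictionary. It shows that extending an uninstantiated class modifies it in place, while the first extension after instantiation creates a new class object that the namespace then points to.

```python
class ClassVersion:
    """Hypothetical model of one incarnation of a class (illustration only)."""

    def __init__(self, name, namespace, methods=None):
        self.name = name
        self.namespace = namespace        # class objects keep a pointer to their namespace
        self.methods = dict(methods or {})
        self.instantiated = False

    def new(self):
        self.instantiated = True
        return Instance(self)


class Instance:
    def __init__(self, cls):
        self.cls = cls                    # objects keep pointing at the class that made them

    def call(self, method_name):
        return self.cls.methods[method_name](self)


def add_method(namespace, class_name, method_name, code):
    """Add a method, creating a new class object first if instances exist."""
    current = namespace[class_name]
    if current.instantiated:
        # First extension after instantiation: build a replacement class object.
        current = ClassVersion(class_name, namespace, current.methods)
        namespace[class_name] = current   # namespace points at the newest incarnation
    current.methods[method_name] = code
    return current


namespace = {}
namespace["Foo"] = ClassVersion("Foo", namespace)
add_method(namespace, "Foo", "hello", lambda self: "hello")   # modifies in place
first = namespace["Foo"]
old_object = first.new()
add_method(namespace, "Foo", "bye", lambda self: "bye")       # creates a new class object
second = namespace["Foo"]
add_method(namespace, "Foo", "again", lambda self: "again")   # modifies the new one in place
new_object = second.new()
```

In this model old_object still resolves methods through the first class object, which never gains "bye", while new_object sees all three methods.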
The Namespace always points to the most current incarnation of the class. All the class objects that belong to a particular namespace store a pointer to that Namespace object. They keep that pointer even if the Namespace object no longer stores a pointer to them.
Since any given class name may have multiple corresponding class objects, the class registry has a much diminished role in this implementation. Its only responsibility is maintaining a mapping of unique IDs to class objects throughout the system. It can't be used for looking up classes by name, because it's possible to have multiple classes with the same name in the same namespace. The class registry may need to have names removed (since it doesn't care about names anymore). Low-level PMC types will also need entries in the namespace hierarchy. We may eventually be able to eliminate the registry of class IDs altogether.
A class can be garbage collected when it has no instantiated objects and no Namespace object referencing it (to mark it as live). When a class is garbage collected, it should remove itself from the registry.
To make this work all Classes need the following vtable entries.
Currently Parrot only supports mutating a class' metainformation for Class classes. This is a restriction which will be lifted at some point soon.
These methods are just syntactic sugar for the vtable functions. They are not included in the Class PMC by default, but added to Class as a role.
$P1 = $P2.name( $S3 )
$P1 = $P2.namespace()
$P1 = $P2.new( 'myattrib' => "Foo" )
$P1.add_attribute($S2)
$P1.add_attribute($S2, $S3)
$P1.add_attribute($S2, $P3)
does, and doesn't necessarily correspond to a class or role namespace.)
$P1 = $P2.attributes()
$P1.add_method($S2, $P3)
$P1.add_method($S2, $P3, 'vtable' => 1)
$P1 = $P2.methods()
$P1.add_parent($P3)
$P1 = $P2.parents()
$P1 = $P2.roles()
$P1.add_role($P2, [named])
exclude and alias; see "Role Conflict Resolution" for more details.
$P1 = $P2.subclass($S3)
$I1 = $P2.isa($S3)
$I1 = $P2.can($S3)
$I1 = $P2.does($S3)
$P1 = $P2.inspect()
$P1 = $P2.inspect($S3)
Object PMCs are the actual objects, and hold all the per-object instance data.
The Object PMC is an array of meta-information and attributes. The elements of this array are:
A list of the object's attributes is accessible from the class. The attribute cache is the most straightforward way to retrieve a complete list of attributes visible to the object, but the first time you introspect for a complete list the class may have to calculate the list by traversing the inheritance hierarchy.
Object PMCs have the "I am an object" flag set on them.
Object PMCs have no methods aside from those defined in their associated class. They do have vtable functions providing access to certain low-level information about the object, method call functionality, etc. See the sections below on Objects and Vtables.
In addition to a value type, objects can have a container type. The container type can't be stored in the object itself, because a single object may live within multiple containers. So, the container type (when it exists) is stored in the LexPad or Namespace entry for a particular variable.
In a static language like C#.Net:
B isa A
A o1 = new B();
B o2 = new B();
o1.x; # retrieves A's attribute
o2.x; # retrieves B's attribute
o1.foo(); # calls B's method
o2.foo(); # calls B's method
All Objects need the following vtable entries.
An instance of the Role PMC has five attributes, which are:
All Roles need the following vtable entries.
These methods are just syntactic sugar for the vtable functions. They are not included in the Role PMC by default, but added to Role as a role.
$P1 = $P2.name( $S3 )
$P1 = $P2.namespace()
$P1 = $P2.attributes()
$P1.add_attribute($S2)
$P1.add_attribute($S2, $S3)
$P1.add_attribute($S2, $P3)
does, and doesn't necessarily correspond to a class or role namespace.)
$P1.add_role($P2, [named])
exclude and alias; see "Role Conflict Resolution" for more details.
$P1 = $P2.roles()
$P1.add_method($S2, $P3)
$P1.add_method($S2, $P3, 'vtable' => 1)
$P1 = $P2.methods()
$P1 = $P2.inspect()
$P1 = $P2.inspect($S3)
When a role is added to a class, we try to compose it right away, and throw an exception on any conflicts that are detected. A conflict occurs if two roles try to supply a method of the same name (but see the note on multi-methods below). High level languages will provide varying facilities to deal with this, and Parrot provides the primitives to implement them.
When declaring a composed class, you can optionally supply an array of method names that will be defined by the class to resolve a conflict in its roles. This is done using the named parameter resolve. This feature supports composition conflict resolution in languages such as Perl 6.
When adding a role to a class, you can optionally supply an array of method names from the role to exclude from the composition process. This is done using the named parameter exclude. It is not an error to list a method name in this array that the role does not have. This makes it possible to implement languages that provide for explicit exclusions on a role-by-role basis.
When adding a role to a class, you can optionally specify that specific methods are to be aliased to different names within the class. This is done with the optional alias named parameter. The parameter takes a hash of strings, where the key is a method name in the role, and the value is the name it will have in the class. (This is also sometimes used for conflict resolution.)
If you alias a method, it won't automatically exclude the original name from the role. You can also explicitly exclude the method name, if you want a proper renaming of the method. A resolve at the class level will automatically exclude all methods of that name from any role composed into the class. You can alias the method if you want to call it from the composed class. (You might use this if you want the resolving method to be able to call either of the conflicting methods from two composed roles.)
If a method in a role is a MultiSub PMC and there is either no method of that name yet OR what is in the method slot with that name is also a MultiSub PMC, there will be no error. Instead, the multi-methods from the role will be added to the multi-methods of the MultiSub PMC already in the class. Any attempt to combine a multi with a non-multi will result in an error.
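The interaction of resolve, exclude and alias can be sketched with a small Python model. This is an illustration, not Parrot's composition code: ComposedClass, CompositionConflict and the dictionaries of methods are hypothetical, multi-methods are left out, and the sketch tracks only role-supplied methods, so how a method defined directly in the class interacts with a role method is not modeled.

```python
class CompositionConflict(Exception):
    """Raised when two roles supply a method of the same name."""


class ComposedClass:
    """Hypothetical model of a class that composes roles (illustration only)."""

    def __init__(self, resolve=()):
        self.resolve = set(resolve)   # names the class promises to define itself
        self.methods = {}             # role-supplied methods, name -> callable
        self.roles = []               # roles applied so far, for 'does' checks

    def add_role(self, role_name, role_methods, exclude=(), alias=None):
        alias = alias or {}
        # A class-level resolve excludes that name from every composed role.
        skipped = set(exclude) | self.resolve
        incoming = {}
        for name, code in role_methods.items():
            if name in alias:
                incoming[alias[name]] = code   # alias alone keeps the original name too
            if name not in skipped:
                incoming[name] = code
        clashes = sorted(n for n in incoming if n in self.methods)
        if clashes:
            raise CompositionConflict("conflicting methods: %s" % ", ".join(clashes))
        self.methods.update(incoming)
        self.roles.append(role_name)


painter = {"draw": lambda: "picture", "clean": lambda: "brushes"}
cowboy = {"draw": lambda: "gun"}

plain = ComposedClass()
plain.add_role("Painter", painter)
try:
    plain.add_role("Cowboy", cowboy)
    conflict_detected = False
except CompositionConflict:
    conflict_detected = True

resolved = ComposedClass(resolve=["draw"])
resolved.add_role("Painter", painter, alias={"draw": "draw_picture"})
resolved.add_role("Cowboy", cowboy, alias={"draw": "draw_gun"})
```

In the second case both role versions of draw stay reachable under their aliases, so a resolving draw method defined by the class could call either one.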
The following ops are provided to deal with objects. Please note that method calls are governed by Parrot's calling conventions, and as such objects, method PMCs, return continuations, and parameters must be in the right places, though some ops will put parameters where they need to go.
$P1 = getattribute $P2, $S3
$P1 = getattribute $P2, $P3, $S4
Undef PMC.
setattribute $P1, $S2, $P3
setattribute $P1, $P2, $S3, $P4
callmethod
callmethod $S1
callmethodcc
callmethodcc $S1
callmethodsupercc $S1
callmethodcc that skips over the current class when searching for the method, and only looks in the parent classes. PIR may provide some syntactic sugar for this.
callmethodnextcc $S1
callmethodcc that picks up an existing find_method search where it left off for the current call. {{ Note: this depends on find_method being resumable, and on the context of a particular method including a pointer to the find_method call that found it. Neither may be feasible. }} PIR may provide some syntactic sugar for this.
$P1 = newclass $S2
$P1 = newclass $S2, $P3
$P1 = subclass $P2, $S3
Integer or ResizablePMCArray.
$P1 = get_class $S2
$P1 = get_class $P2
$P1 = new $S2
$P1 = new $S2, $P3
$P1 = new $P2
$P1 = new $P2, $P3
addparent $P1, $P2
removeparent $P1, $P2
addattribute $P1, $S2
addattribute $P1, $S2, $S3
addattribute $P1, $S2, $P3
Null. It optionally takes a simple string value or key specifying a type of the attribute.
removeattribute $P1, $S2
addrole $P1, $P2
$P1 = inspect $P2
$P1 = inspect $P2, $S3
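The difference between an ordinary method search and the search that skips over the current class, as described for callmethodsupercc above, can be modeled as follows. This Python sketch is an assumption-laden illustration: find_method here is a hypothetical function, not Parrot's vtable function of the same name, and it takes an already linearized list of classes, since Parrot leaves the resolution order to the language.

```python
def find_method(mro, name, skip_through=None):
    """Return (class_name, code) for the first class in `mro` that has `name`.

    `mro` is a list of (class_name, methods_dict) pairs, most-derived class first.
    If `skip_through` names a class, the search starts after that class,
    which models looking only in the parent classes.
    """
    start = 0
    if skip_through is not None:
        names = [class_name for class_name, _ in mro]
        start = names.index(skip_through) + 1
    for class_name, methods in mro[start:]:
        if name in methods:
            return class_name, methods[name]
    raise LookupError("method %r not found" % name)


mro = [
    ("Child", {"speak": lambda: "child speaks"}),
    ("Parent", {"speak": lambda: "parent speaks", "walk": lambda: "parent walks"}),
]

normal_owner, normal_code = find_method(mro, "speak")
super_owner, super_code = find_method(mro, "speak", skip_through="Child")
```

A normal call finds the method in Child, while the skipping search finds the one in Parent.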
PIR provides some syntactic sugar for declaring classes.
.sub custom_method :method
# ...
.end
.sub get_integer :vtable
# ...
.end
:method and :vtable can be combined to indicate that a particular code entity is callable both as a method and as a vtable override.
If the class object has not yet been created at the point when the PIR subs are compiled, the methods and vtable overrides are temporarily stored in the associated namespace.
Classes may override the vtable functions, allowing objects of a class to behave like a primitive PMC. To use these properly at a low-level requires a good working knowledge of the way Parrot works--generally for higher-level languages the language compiler or runtime will provide easier-to-use wrappers. These methods are all prototyped, and take a single fixed argument list, and return at most a single value.
To override a vtable function, either add the :vtable pragma to the declaration of the method, or pass a named parameter "vtable" into the add_method method on a class or role.
The bytecode is isolated from most of the internal details of the implementation. This allows both for flexibility in the implementation and forward compatibility, generally good things. It also allows for multiple concurrent interoperable object systems. The major thrust is for transparent use of objects, though most class activity (including creation of subclasses and modifications of existing classes) should be transparent as well.
The following examples all assume we're working with basic Object objects and Class classes.
To create a new class Foo which has no parent classes:
newclass $P0, "Foo"
To create a class Foo with the parents A and B, the code would be:
get_class $P0, "A"
get_class $P1, "B"
subclass $P2, $P0, "Foo"
addparent $P2, $P1
Adding the attributes a and b to the new class Foo:
$P0 = newclass "Foo"
addattribute $P0, "a"
addattribute $P0, "b"
Assuming we want an object of class Foo:
.local pmc FooClass
.local pmc MyObject
FooClass = get_class "Foo"
MyObject = FooClass.new()
Calling the method Xyzzy on an object, assuming the PDD03 calling conventions are respected:
callmethod "Xyzzy"
set S0, "Xyzzy"
callmethod
Or, if a return continuation needs constructing:
callmethodcc "Xyzzy"
set S0, "Xyzzy"
callmethodcc
With named access:
getattribute $P1, $P0, "Foo\x0b"
To get a new class, you can do a newclass, which creates a new class with no parents besides Parrot's default super-ish parent class.
To get a new child class, you have two potential options:
Both ways work. It is, however, more efficient to use the first method, and just subclass the immediate parent class of your new class.
When adding in extra parents in a multiple-inheritance scenario, subclass the first class in the immediate parent list, then use the addparent op to add in the rest of the immediate parents.
Notes on some of the OO-related needs of various languages.
Ruby: Just like Smalltalk, everything is an object. I'm hoping to be able to implement core Ruby classes (String, Array, Hash, Module, etc) something like this.
    ParrotClass
         |
     RubyClass   String
         |         |
          \       /
         RubyString
Ruby: Objectspace in Ruby allows the programmer to iterate through every live object in the system. There is some debate about how to make this play nice with different garbage collection schemes.
A class is a collection of methods and attributes. It would be desirable, for those classes whose definition is fully known at compile time, to have a convenient way to have the class along with its attributes and methods stored into a PBC file rather than created at runtime. However, creation of new classes at runtime will be needed too.
Ruby: Ruby has meta-classes. It would be nice if classes were objects in Parrot's OO model.
Attributes are instance data associated with a class (or role, however those are supported). They may not always be of a type specified by a PMC, though boxing/unboxing is of course an option.
Perl 6: All attributes are opaque (not externally visible, even to any subclasses).
.Net: Attributes may be private (not externally visible), public (always externally visible), protected (only visible to subclasses) and internal (only visible inside the current assembly; the closest correspondence in Parrot is perhaps only visible inside the same PBC file). Additionally, a subclass may introduce an attribute with the same name as one in a parent class. Both attributes then exist, and which one is accessed depends on the type an instance is currently viewed as (read: there is a difference between the type of the reference and the type of the value).
Ruby: Attributes can be dynamically added and removed at runtime.
Perl 6: Methods may be public (anyone can invoke them) or private (only invokable by the class they are defined in). Additionally, submethods are methods that do not get inherited.
.Net: Like attributes, methods may be public, private, protected or internal.
Ruby: has a method_missing that gets called when method resolution fails to find a method. Methods can be dynamically added and removed at runtime.
A constructor is run when an object is instantiated.
.Net: There may be many constructors for an object (provided they all have different signatures), and the correct one is called based upon the passed parameters.
Perl 6: Multiple inheritance.
.Net: Single inheritance.
Ruby: Single inheritance but support for mixins of Ruby modules.
An interface specifies a set of methods that must be implemented by a class that inherits (or implements) the interface, but does not provide any form of implementation for them.
.Net: Interfaces are pretty much what was just described above. XXX Need to check the behavior if you implement two interfaces with methods of the same name.
A role consists of a set of methods and attributes. It cannot be instantiated on its own, but must be composed into a class. When this happens, its methods and attributes become part of that class's methods and attributes. This may happen at compile time or runtime. When a role is composed into a class at runtime, what really happens is that a new anonymous class is created with the role composed into it, and then the namespace entry for the existing class is updated to refer to the new one. Note that this means classes must be garbage collectable, with all those referred to by a namespace, or with objects of that class still existing, being marked live.
Perl 6: Roles pretty much are a Perl 6 thing, so the definition above contains all that is needed. An open question is whether Parrot should worry about collision detection. For compile-time composition that's easy to punt to the compiler; for runtime composition, it's not so easy.
Perl 6: Reflection provides access to a list of methods that a class has, its parent classes and the roles it does, as well as the name of the class and its memory address. For methods, their name, signature, return type and whether the method is declared multi are available.
.Net: Reflection provides access to a list of attributes and methods as well as the name of the class and its parent. The types of attributes and signatures of methods are also available.
An inner class is essentially a class defined within a class. Therefore it has access to things private to its outer class.
Perl 6: Inner classes are allowed, and may also be private.
.Net: Inner classes are allowed and may be private, public, protected or internal.
Delegation is where a method call is "forwarded" to another class. Parrot may provide support for simple cases of it directly, or could just provide a "no method matched" fallback method that the compiler fills out to implement the delegation.
Perl 6: Delegation support is highly flexible, even allowing a regex to match method names that should be delegated to a particular object.
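The "no method matched" fallback mentioned above can be illustrated in Python, whose __getattr__ hook is called only when normal lookup fails. This is a sketch of the general technique under that assumption, not a description of how Parrot implements delegation; Delegator and Engine are hypothetical names.

```python
class Delegator:
    """Hypothetical wrapper that forwards unmatched method calls to a delegate."""

    def __init__(self, delegate):
        self._delegate = delegate

    def describe(self):
        return "wrapper"

    def __getattr__(self, name):
        # Reached only when normal lookup on the wrapper fails,
        # so it plays the role of the "no method matched" fallback.
        return getattr(self._delegate, name)


class Engine:
    def start(self):
        return "engine started"


car = Delegator(Engine())
```

Calling car.start() is forwarded to the Engine, while car.describe() is answered by the wrapper itself.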
Prototype-based OO has no classes. All objects are cloned from existing objects and modified. This requires lightweight singleton creation, without needing a separate class for every instance object. (Self, JavaScript, and Io are examples of prototype-based OO.) An example from Io:
Dog := Object clone # The Dog object is a clone of Object
Dog tail := "long" # it has an attribute 'tail' with the value 'long'
Dog bark := method("yap" print) # It has a method 'bark'
Dog bark # call the method 'bark', printing 'yap'
The following lists a set of languages, then within each language what the Parrot term translates to.
docs/pdds/pdd15_object_metamodel.png
None.
None.