docs/intro.pod - The Parrot Primer
This is an update to the article 'Parrot: Some Assembly Required', which appeared on http://www.perl.com for the 0.0.2 release of Parrot. It is intended to be the best way for a newcomer to Parrot to learn what Parrot is and how to use it.
First, though, what is Parrot, and why are we making such a fuss about it? Well, if you haven't been living in a box for the past few years, you'll know that the Perl community has embarked on the design and implementation of a new version of Perl, both the language and the interpreter.
Parrot is related to Perl 6, but it is not Perl 6. To find out what it actually is, we need to know a little about how Perl works. When you feed your program into perl, it is first compiled into an internal representation, or bytecode; then this bytecode is fed to an almost separate subsystem inside perl to be interpreted. So there are two distinct phases of perl's operation - compilation to bytecode, and interpretation of bytecode. This is not unique to Perl; other languages following this design include Python, Ruby, Tcl and, believe it or not, even Java.
In previous versions of Perl, this arrangement has been pretty ad hoc: there hasn't been any overarching design to the interpreter or the compiler, and the interpreter has ended up being pretty reliant on certain features of the compiler. Nevertheless, the interpreter (some languages call it a Virtual Machine) can be thought of as a software CPU - the compiler produces "machine code" instructions for the virtual machine, which it then executes, much like a C compiler produces machine code to be run on a real CPU.
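The two phases, and the "software CPU" idea, can be sketched in a few lines of Python. This is only a toy model: the instruction names (PUSH, ADD, MUL) and the nested-tuple expression format are assumptions made up for illustration, not anything perl or Parrot actually uses.

```python
# Toy model of the two phases: compile an expression tree to "bytecode",
# then interpret that bytecode. All instruction names are hypothetical.

def compile_expr(expr):
    """Phase 1: flatten a nested tuple such as ("+", 2, ("*", 3, 4))
    into a flat list of instructions."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    opcode = {"+": "ADD", "*": "MUL"}[op]
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]

def interpret(bytecode):
    """Phase 2: execute the instruction list on a simple value stack."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()

bytecode = compile_expr(("+", 2, ("*", 3, 4)))
result = interpret(bytecode)
print(bytecode)
print(result)
```

The point of the split is that interpret() knows nothing about the source syntax; any compiler that emits the same instruction list could share it.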
Perl 6 plans to separate out the design of the compiler and the interpreter. This is why we've come up with a subproject, which we've called Parrot, which has a certain, limited amount of independence from Perl 6. Parrot is destined to be the Perl 6 Virtual Machine, the software CPU on which we will run Perl 6 bytecode. We're working on Parrot before we work on the Perl 6 compiler because it's much easier to write a compiler once you've got a target to compile to!
The name "Parrot" was chosen after the 2001 April Fool's Joke which had Perl and Python collaborating on the next version of their interpreters. This is meant to reflect the idea that we'd eventually like other languages to use Parrot as their VM; in a sense, we'd like Parrot to become a "common language runtime" for dynamic languages.
It should be stressed we're still in the early stages of development.
But don't let that put you off! Parrot is still very much usable; we already have a lot of languages (in different states of completeness) which compile down to Parrot bytecode. Please have a look at the languages/ subdirectory.
At the moment, it's possible to write simple programs in Parrot assembly language, use an assembler to convert them to bytecode and then execute them on a test interpreter. We have support for a wide variety of ordinary and transcendental mathematical operations, some rudimentary string support, and some conditional operators.
So let's get ourselves a copy of Parrot, so that we can start investigating how to program in the Parrot assembler.
Periodic, numbered releases will appear on CPAN (we're currently on version 0.1.2), but at this stage of the project an awful lot is changing between releases. To really keep up to date with Parrot, we should get our copy from the CVS repository. Here's how we do that:
% cvs -d :pserver:anonymous@cvs.perl.org:/cvs/public login
(Logging in to anonymous@cvs.perl.org)
CVS password: [ and here we just press return ]
% cvs -d :pserver:anonymous@cvs.perl.org:/cvs/public co parrot
cvs server: Updating parrot
U parrot/.cvsignore
U parrot/Config_pm.in
....
There's also a web interface to the CVS repository, available at http://cvs.perl.org/cvsweb/parrot/.
For those of you who can't use CVS, there are CVS snapshots built every six hours which you can find at http://cvs.perl.org/snapshots/parrot/.
Now that we have downloaded Parrot, we need to build it:
% cd parrot
% perl Configure.pl
Parrot Configure
Copyright (C) 2001-2003 The Perl Foundation. All Rights Reserved.
Since you're running this script, you obviously have
Perl 5--I'll be pulling some defaults from its configuration.
...
The Configure script will then attempt to discover your local configuration automatically; you can supply the --ask switch if you wish to configure the build manually. You might also have a look at:
% perl Configure.pl --help
Once Configure has finished successfully, type make (or the name of your local make program). With any luck, Parrot will successfully build. (If it doesn't, the address to complain to is at the end of this introduction...)
Now we should run some tests; so type make test and you should see a readout like the following:
perl t/harness --gc-debug --running-make-test -b t/op/*.t t/pmc/*.t \
t/native_pbc/*.t imcc/t/*/*.t t/src/*.t
t/op/00ff-dos...........ok
t/op/00ff-unix..........ok
...
All tests successful, 40 subtests skipped.
Files=95, Tests=1386, 125 wallclock secs (56.96 cusr + 23.71 csys = 80.67 CPU)
(Of course, there might be more tests than this, but you get the idea; tests may be skipped - for one reason or another - but none of them should fail!)
If you have problems with parrot, please send a message to bugs-parrot@bugs6.perl.org with a description of your problem. Please include the myconfig file that was generated as part of the build process.
Before we dive into programming Parrot assembly, let's take a brief look at some of the concepts involved.
The Parrot CPU has four basic data types:
INTVAL
FLOATVAL
STRING
PMC
The first three types are pretty much self-explanatory; the final type, the Parrot Magic Cookie (PMC), is slightly more difficult to understand. But that's OK! We'll talk more about PMCs at the end of the article.
The current Perl 5 virtual machine is a stack machine - it communicates values between operations by keeping them on a stack. Operations load values onto the stack, do whatever they need to do, and put the result back onto the stack. This is very easy to work with, but it's very slow: to add two numbers together, you need to perform three stack pushes and two stack pops. Worse, the stack has to grow at runtime, and that means allocating memory just when you don't want to be allocating it.
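To make the cost concrete, here is a small Python sketch that counts the stack traffic for one addition. It is a model of a generic stack machine, not of the Perl 5 internals; the CountingStack class and stack_add function are hypothetical names.

```python
# Toy stack machine that counts stack traffic for a single addition.
# The class and function names are hypothetical.

class CountingStack:
    def __init__(self):
        self.items = []
        self.pushes = 0
        self.pops = 0

    def push(self, value):
        self.items.append(value)   # the list may have to grow here
        self.pushes += 1

    def pop(self):
        self.pops += 1
        return self.items.pop()

def stack_add(a, b):
    """Add two numbers the way a stack machine would."""
    stack = CountingStack()
    stack.push(a)                  # load the first operand
    stack.push(b)                  # load the second operand
    right = stack.pop()            # the add op fetches its operands...
    left = stack.pop()
    stack.push(left + right)       # ...and leaves the result on the stack
    return stack.items[-1], stack.pushes, stack.pops

total, pushes, pops = stack_add(2, 3)
print(total, pushes, pops)
```

In this model a single addition costs three pushes and two pops, which is the overhead described above.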
So Parrot's going to break with the established tradition for virtual machines, and use a register architecture, more akin to the architecture of a real hardware CPU. This has another advantage: we can use all the existing literature on how to write compilers and optimizers for register-based CPUs for our software CPU!
Parrot has specialist registers for each type: 32 INTVAL registers, 32 FLOATVAL registers, 32 string registers and 32 PMC registers. In Parrot assembler, these are named I0...I31, N0...N31, S0...S31, P0...P31.
Now let's look at some assembler. We can set these registers with the set operator:
set I1, 10
set N1, 3.1415
set S1, "Hello, Parrot"
All Parrot ops have the same format: the name of the operator, the destination register, and then the operands.
There are a variety of operations you can perform: the file docs/core_ops.pod documents them, along with a little more about the assembler syntax. For instance, we can print out the contents of a register, or a constant:
print "The contents of register I1 is: "
print I1
print "\n"
Or we can perform mathematical functions on registers:
add I1, I1, I2 # Add the contents of I2 to the contents of I1
mul I3, I2, I4 # Multiply I2 by I4 and store in I3
inc I1 # Increment I1 by one
dec N3, 1.5 # Decrement N3 by 1.5
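For readers who find it easier to think in a high-level language, the register operations shown so far can be modelled roughly as follows. This is a Python sketch only: the RegisterFile class, its method names and the initial register contents are assumptions for illustration, not part of Parrot.

```python
# Toy model of Parrot-style register banks. The RegisterFile class, its
# methods and the default register contents are hypothetical.

class RegisterFile:
    def __init__(self):
        self.I = [0] * 32       # INTVAL registers I0..I31
        self.N = [0.0] * 32     # FLOATVAL registers N0..N31
        self.S = [""] * 32      # string registers S0..S31
        self.P = [None] * 32    # PMC registers P0..P31

    def get(self, name):
        # "I12" -> bank "I", index 12
        return getattr(self, name[0])[int(name[1:])]

    def set(self, name, value):
        getattr(self, name[0])[int(name[1:])] = value

    # As in the assembler, the destination comes first, then the operands.
    def add(self, dest, a, b):
        self.set(dest, self.get(a) + self.get(b))

    def mul(self, dest, a, b):
        self.set(dest, self.get(a) * self.get(b))

    def inc(self, name, amount=1):
        self.set(name, self.get(name) + amount)

    def dec(self, name, amount=1):
        self.set(name, self.get(name) - amount)

regs = RegisterFile()
regs.set("I1", 10)
regs.set("I2", 4)
regs.set("I4", 3)
regs.add("I1", "I1", "I2")   # add I1, I1, I2
regs.mul("I3", "I2", "I4")   # mul I3, I2, I4
regs.inc("I1")               # inc I1
regs.set("N3", 3.5)
regs.dec("N3", 1.5)          # dec N3, 1.5
print(regs.get("I1"), regs.get("I3"), regs.get("N3"))
```

No stack is involved: each op names the registers it reads and the register it writes.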
We can even perform some simple string manipulation:
set S1, "fish"
set S2, "bone"
concat S1, S2 # S1 is now "fishbone"
substr S4, S1, 0, 1, "w" # S1 is now "wishbone"
length I1, S1 # I1 is 8
end
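For comparison, here is roughly what the string example does, written in Python. One assumption is labelled in the comments: that five-argument substr places the replaced piece in its destination register, as Perl's four-argument substr returns it. The text above only states the effect on S1.

```python
# Python equivalents of the string ops above (a sketch, not Parrot code).
s1 = "fish"                     # set S1, "fish"
s2 = "bone"                     # set S2, "bone"
s1 = s1 + s2                    # concat S1, S2
offset, length = 0, 1
# Assumption: the piece being replaced ends up in S4.
s4 = s1[offset:offset + length]
# "w" is spliced in over the one character at offset 0.
s1 = s1[:offset] + "w" + s1[offset + length:]
i1 = len(s1)                    # length I1, S1
print(s1, s4, i1)
```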
Code gets a little boring without flow control; for starters, Parrot knows about branching and labels. The branch op is equivalent to Perl's goto:
branch TERRY
JOHN: print "fjords\n"
branch END
MICHAEL: print " pining"
branch GRAHAM
TERRY: print "It's"
branch MICHAEL
GRAHAM: print " for the "
branch JOHN
END: end
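To see why the program above prints its words in the right order, it helps to trace the jumps. The following Python sketch is a toy interpreter for just these three ops; the (label, op, argument) tuple format is an assumption for illustration and has nothing to do with how Parrot stores bytecode.

```python
# Toy interpreter for the branch example: labels are resolved to line
# indexes, and "branch" simply overwrites the program counter.
# The (label, op, argument) tuple format is hypothetical.

PROGRAM = [
    (None,      "branch", "TERRY"),
    ("JOHN",    "print",  "fjords\n"),
    (None,      "branch", "END"),
    ("MICHAEL", "print",  " pining"),
    (None,      "branch", "GRAHAM"),
    ("TERRY",   "print",  "It's"),
    (None,      "branch", "MICHAEL"),
    ("GRAHAM",  "print",  " for the "),
    (None,      "branch", "JOHN"),
    ("END",     "end",    None),
]

def run(program):
    # Map each label to the index of the line it is attached to.
    labels = {label: index
              for index, (label, _, _) in enumerate(program) if label}
    output = []
    pc = 0
    while True:
        _, op, arg = program[pc]
        if op == "end":
            break
        if op == "branch":
            pc = labels[arg]     # jump: do not fall through
            continue
        output.append(arg)       # op == "print"
        pc += 1                  # fall through to the next line
    return "".join(output)

text = run(PROGRAM)
print(text, end="")
```

The order of execution is TERRY, MICHAEL, GRAHAM, JOHN, END, regardless of the order in which the lines appear in the file.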
It can also do simple tests for whether or not a register contains a true value:
set I1, 12
set I2, 5
mod I3, I1, I2
if I3, REMAIND
print "5 is an integer divisor of 12"
branch DONE
REMAIND: print "5 divides 12 with remainder "
print I3
DONE: print "\n"
end
Note that if branches to REMAIND if I3 contains a true (i.e. non-zero) value; if I3 is zero, execution falls through to the next statement. Here's what that would look like in Perl, for comparison:
$i1 = 12;
$i2 = 5;
$i3 = $i1 % $i2;
if ($i3) {
print "5 divides 12 with remainder ";
print $i3;
} else {
print "5 is an integer divisor of 12";
}
print "\n";
exit;
And speaking of comparison, we have the full range of numeric comparators: eq, ne, lt, gt, le and ge. Note that you can't use these operators on arguments of disparate types; you may even need to add the suffix _i or _n to the op to tell it what type of argument you are using - although the assembler ought to divine this for you, by the time you read this.
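As a rough illustration of what these comparison ops decide, here is a Python sketch that maps each op name to a test and refuses mixed-type arguments. The should_branch function and its error message are hypothetical; how Parrot itself reports a type mismatch is not covered here.

```python
# Sketch of the comparison ops as a lookup table of tests.
# should_branch() is a hypothetical helper, not a Parrot API.
import operator

COMPARATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "gt": operator.gt,
    "le": operator.le, "ge": operator.ge,
}

def should_branch(op, left, right):
    """Return True if 'op left, right, LABEL' would jump to LABEL."""
    if type(left) is not type(right):
        # Mirrors the rule that both arguments must have the same type.
        raise TypeError("arguments of disparate types")
    return COMPARATORS[op](left, right)

print(should_branch("le", 3, 5))
print(should_branch("gt", 2.5, 7.0))
```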
Now let's have a look at a few simple Parrot programs to give you a feel for the language.
This little program displays the Unix epoch time every second (or so):
set I3, 3000000
REDO: time I1
print I1
print "\n"
set I2, 0
SPIN: inc I2
le I2, I3, SPIN
branch REDO
end
First, we set integer register 3 to contain 3 million. That number is fairly arbitrary: Parrot averages a massive six million ops per second on my machine, and each pass through the inner loop executes two ops (inc and le), so 3 million passes take roughly a second. The program then consists of two loops: the outer loop stores the current Unix time in integer register 1, prints it out, prints a new line, and resets register 2 to zero. The inner loop increments register 2 until it passes the 3 million we stored in register 3. When it is no longer less than (or equal to) 3 million, we go back to the REDO of the outer loop. In essence, we're just spinning around a busy loop to waste some time.
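The same control flow can be written in Python as a point of comparison. This sketch differs from the assembler in one deliberate way: the outer loop stops after a fixed number of passes instead of running forever. The function name show_time and both parameters are hypothetical.

```python
# Python rendering of the showtime loops (a sketch; unlike the assembler
# version, the outer loop is bounded so that the program terminates).
import time

def show_time(repeats, spin_limit):
    stamps = []
    for _ in range(repeats):       # REDO: (the original never stops)
        now = int(time.time())     # time I1
        print(now)                 # print I1, then a newline
        stamps.append(now)
        counter = 0                # set I2, 0
        while True:                # SPIN:
            counter += 1           # inc I2
            if counter > spin_limit:
                break              # le I2, I3, SPIN no longer branches
    return stamps

stamps = show_time(repeats=3, spin_limit=100_000)
```

How long each pass takes depends entirely on the machine, which is why the spin count has to be tuned by hand.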
How do we run this? Copy the assembler to a file showtime.pasm, and inside your Parrot directory, run:
parrot showtime.pasm
This will assemble and run the code in showtime.pasm. You can also create an assembled bytecode file from the assembler source by running:
parrot -o showtime.pbc showtime.pasm
(.pbc is the file extension for Parrot bytecode.)
To run this bytecode, type:
parrot showtime.pbc
The Fibonacci series is defined like this: take two numbers, 1 and 1. Then repeatedly add together the last two numbers in the series to make the next one: 1, 1, 2, 3, 5, 8, 13, and so on. The Fibonacci number fib(n) is the n'th number in the series. Here's a simple Parrot assembler program which finds the first 20 Fibonacci numbers:
# Some simple code to print some Fibonacci numbers
# Leon Brocard <acme@astray.com>
print "The first 20 fibonacci numbers are:\n"
set I1, 0
set I2, 20
set I3, 0
set I4, 1
REDO: set I5, I4
add I4, I3, I4
set I3, I5
print I3
print "\n"
inc I1
lt I1, I2, REDO
DONE: end
This is the equivalent code in Perl:
print "The first 20 fibonacci numbers are:\n";
my $i = 0;
my $target = 20;
my $a = 0;
my $b = 1;
until ($i == $target) {
my $num = $b;
$b += $a;
$a = $num;
print $a,"\n";
$i++;
}
Additional examples of what can be done with Parrot assembler can be found in the parrot/examples/ subdirectory, and on the web at http://www.parrotcode.org/examples/.
Parrot is obviously developing very rapidly, and we've still got a long way to go before we are ready to target a compiler to this platform. This section is for those who are interested in helping us take Parrot further.
PMCs are almost like Perl 5's SVs and Python's Objects, only more so. A PMC is an object of some type, which can be instructed to perform various operations. So when we say
inc P1
to increment the value in PMC register 1, the increment method is called on the PMC - and it's up to the PMC how it handles that method.
PMCs are how we plan to make Parrot language-independent - a Perl PMC would have different behavior from a Python PMC or a Tcl PMC. The individual methods are function pointers held in a structure called a vtable, and each PMC has a pointer to the vtable which implements the methods of its "class". Hence a Perl interpreter would link in a library full of Perl-like classes and its PMCs would have Perl-like behaviour.
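The vtable idea can be sketched in Python, with a dictionary of functions standing in for the structure of function pointers. Everything named here (the two vtables, the PMC class, the inc function and the two behaviours) is a hypothetical illustration, not Parrot's actual PMC layout or the real behaviour of any Perl or Python PMC.

```python
# Sketch of vtable dispatch. Each PMC carries a reference to a table of
# functions; the "inc" op only looks up an entry and calls it.
# All names and behaviours here are hypothetical.

def perlish_increment(pmc):
    """Perl-like behaviour: a numeric string is treated as a number."""
    pmc.value = int(pmc.value) + 1

def strict_increment(pmc):
    """Stricter behaviour: refuse anything that is not already an int."""
    if not isinstance(pmc.value, int):
        raise TypeError("cannot increment a non-integer value")
    pmc.value += 1

PERLISH_VTABLE = {"increment": perlish_increment}
STRICT_VTABLE = {"increment": strict_increment}

class PMC:
    def __init__(self, vtable, value):
        self.vtable = vtable   # "pointer" to the methods of its class
        self.value = value

def inc(pmc):
    """Model of 'inc P1': the op itself knows nothing about the type."""
    pmc.vtable["increment"](pmc)

p1 = PMC(PERLISH_VTABLE, "41")
inc(p1)
print(p1.value)
```

The same inc call on a PMC built with STRICT_VTABLE and the value "41" raises a TypeError instead, which is the sense in which it is up to the PMC how it handles the method.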
The PMC types available are described in docs/vtables.pod; you can create a new PMC with
new P0, <typename>
and then use any of the instructions in ops/core.ops and ops/pmc.ops which support PMCs. docs/vtables.pod also tells you how to implement your own PMC vtable classes.
We've got a good number of people working away on Parrot, but we could always use a few more. To help out, you'll need a subscription to the perl6-internals mailing list (perl6-internals@perl.org), where all the development takes place. You should also keep up to date with the CVS version of Parrot; if you want to be alerted to CVS commits, you can subscribe to the cvs-parrot mailing list (cvs-parrot@perl.org). CVS commit access is given to those who take responsibility for a particular area of Parrot, or who often submit high-quality patches.
A useful web page is http://cvs.perl.org, which reminds you how to use CVS, and allows you to browse the CVS repository; the code page is a summary of this information and other resources about Parrot. Another good resource is http://www.parrotcode.org.
So don't delay - pick up a Parrot today!