docs/jit.pod - Parrot JIT Subsystem
This PDD describes the Parrot Just In Time compilation subsystem.
The Just In Time, or JIT, subsystem converts a bytecode file to native machine code instructions and executes the generated instruction sequence directly.
The JIT currently works on ALPHA, Arm, Intel x86, PPC, and SPARC version 8 processor systems, on most operating systems. Only 32-bit INTVALs are currently supported.
The initial step in generating native code is to invoke Parrot_jit_begin, which generally provides architecture-specific preamble code. For each Parrot opcode in the bytecode, either a generic or an opcode-specific sequence of native code is generated. The .jit files provide functions that generate native code for specific opcode functions, for a given instruction set architecture. If a function is not provided for a specific opcode, a generic sequence of native code is output which calls the interpreter C function that implements the opcode. Such opcodes are handled by Parrot_jit_normal_op.
If the opcode can cause a control flow change, as in the case of a branch or call opcode, an extended or modified version of this generic code is used that tracks changes in the bytecode program counter with changes in the hardware program counter. This type of opcode is handled by Parrot_jit_cpcf_op.
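The choice among the three code generators described above can be sketched in C. This is only an illustration of the decision order stated in the text: the op_info struct, its fields, the generator_kind enum, and choose_generator are hypothetical names introduced here, not Parrot's real data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical categories mirroring the three cases in the text. */
typedef enum {
    GEN_JIT_FILE,   /* opcode-specific function from a .jit file */
    GEN_NORMAL_OP,  /* generic call to the C op function (Parrot_jit_normal_op) */
    GEN_CPCF_OP     /* generic call that also tracks the PC (Parrot_jit_cpcf_op) */
} generator_kind;

/* Hypothetical per-opcode description; an assumption for this sketch. */
typedef struct {
    int has_jit_function;     /* non-zero if the .jit file provides a function */
    int changes_control_flow; /* non-zero for branch or call opcodes */
} op_info;

/* A .jit function wins if present; otherwise pick one of the generic paths. */
static generator_kind choose_generator(const op_info *op)
{
    if (op->has_jit_function)
        return GEN_JIT_FILE;
    if (op->changes_control_flow)
        return GEN_CPCF_OP;
    return GEN_NORMAL_OP;
}
```

In this sketch an opcode with no .jit function falls back to the generic call, and only branch-like opcodes pay for the extra program counter tracking.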
While generating native code, certain offsets and absolute addresses may not be available. This occurs with forward opcode branches, as the native code corresponding to the branch target has not yet been generated. On some platforms, function calls are performed using program-counter relative addresses. Since the location of the buffer holding the native code may move as code is generated (due to growing of the buffer), these relative addresses may only be calculated once the buffer is guaranteed to no longer move. To handle these instances, the JIT subsystem uses "fixups", which record locations in native code where adjustments to the native code are required.
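A minimal sketch of the fixup idea for forward branches, in C. All names here (fixup, jit_buf, emit_bytes, emit_branch_placeholder, apply_fixups) are hypothetical and are not taken from jit.c; the 32-bit displacement measured from the end of the patched field is likewise an assumption chosen for illustration. The sketch stores buffer offsets rather than pointers, so a buffer that moves while growing does not invalidate the recorded locations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixup record: a spot in the native code that needs patching. */
typedef struct {
    size_t patch_offset; /* offset of a 32-bit displacement in the buffer */
    size_t target_op;    /* index of the bytecode op the branch jumps to */
} fixup;

typedef struct {
    uint8_t *code;   size_t used, cap;       /* native code buffer */
    fixup   *fixups; size_t n_fixups, fcap;  /* pending fixups */
} jit_buf;

/* Append bytes; realloc may move the buffer, so only offsets are kept. */
static int emit_bytes(jit_buf *b, const void *src, size_t n)
{
    if (b->used + n > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        uint8_t *p;
        while (cap < b->used + n)
            cap *= 2;
        p = realloc(b->code, cap);
        if (!p)
            return -1;
        b->code = p;
        b->cap = cap;
    }
    memcpy(b->code + b->used, src, n);
    b->used += n;
    return 0;
}

/* Emit a zero displacement for a branch and remember to patch it later. */
static int emit_branch_placeholder(jit_buf *b, size_t target_op)
{
    const uint32_t zero = 0;
    if (b->n_fixups == b->fcap) {
        size_t fcap = b->fcap ? b->fcap * 2 : 8;
        fixup *f = realloc(b->fixups, fcap * sizeof *f);
        if (!f)
            return -1;
        b->fixups = f;
        b->fcap = fcap;
    }
    b->fixups[b->n_fixups].patch_offset = b->used;
    b->fixups[b->n_fixups].target_op = target_op;
    if (emit_bytes(b, &zero, sizeof zero) != 0)
        return -1;
    b->n_fixups++;
    return 0;
}

/* Call once code generation is finished: op_offset[i] is the native offset
 * of op i. The displacement is relative to the end of the patched field. */
static void apply_fixups(jit_buf *b, const size_t *op_offset)
{
    size_t i;
    for (i = 0; i < b->n_fixups; i++) {
        const fixup *f = &b->fixups[i];
        int32_t disp = (int32_t)((int64_t)op_offset[f->target_op]
                               - (int64_t)(f->patch_offset + 4));
        memcpy(b->code + f->patch_offset, &disp, sizeof disp);
    }
    b->n_fixups = 0;
}
```

Program-counter relative function calls, the other case mentioned above, would need the final absolute address of the buffer as an additional input when the fixups are applied; that case is not covered by this sketch.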
The architecture specific jit_emit.h file communicates some defines and tables with jit.c and languages/imcc/imc.c. The structure of the file and the defines must therefore follow a specific syntax.
    #if JIT_EMIT

    ... emit code

    #else

    ... defines
    static const jit_arch_info arch_info = {
       ... initialization of maps
       ... and possibly private static functions
    }

    #endif
See src/jit/skeleton/jit_emit.h for a more detailed explanation.
XXX most are moved into jit_arch_info now.
    #define MAP(i) OMAP(i)
    #undef MAP
    #define MAP(i) (i) >= 0 ? 0 : OMAP(i)
See src/jit/i386/jit_emit.h for actual usage of these defines.
Jit files are interpreted as follows:
Identifiers with OFFS in their names:

    REG_OFFS_INT(reg_no) ...
    ROFFS_INT(n) ...

INT_CONST[n]
    Gets replaced by the INTVAL constant specified in the nth argument.

NUM_CONST[n]
    Gets replaced by the FLOATVAL constant specified in the nth argument.

MAP[n]
    The nth integer or floating point processor register, mapped in this section.
    Note: The register with the physical number zero cannot be mapped.

NATIVECODE
    Gets replaced by the current native program counter.

*CUR_OPCODE[n]
    Gets replaced by the address of the current opcode in the Parrot bytecode.

ISRn FSRn
    The nth integer or floating point scratch register.
    TEMPLATE Parrot_set_x_ic {
        if (MAP[1]) {
            jit_emit_mov_ri<_N>(NATIVECODE, MAP[1], <typ>_CONST[2]);
        }
        else {
            jit_emit_mov_mi<_N>(NATIVECODE, &INT_REG[1], <typ>_CONST[2]);
        }
    }

    Parrot_set_i_ic {
        Parrot_set_x_ic s/<_N>/_i/ s/<typ>/*INT/
    }

    Parrot_set_n_ic {
        Parrot_set_x_ic s/<_N>/_ni/ s/<typ>/&INT/ s/INT_R/NUM_R/
    }

The jit function Parrot_set_i_ic is based on the template Parrot_set_x_ic. The s/x/y/ entries are substitutions applied to the template body to generate the actual function body. These substitutions are done before the other substitutions. See jit/i386/core.jit for more.
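To illustrate what the template substitutions do to a body line, here is a small C sketch. It is not how the .jit files are actually processed (this document does not show that tooling); substitute_all is a hypothetical helper, and it treats each s/x/y/ as a literal, replace-everywhere string substitution, which is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replace every literal occurrence of `from` in `text` with `to`.
 * Returns a newly allocated string (caller frees), or NULL on failure. */
static char *substitute_all(const char *text, const char *from, const char *to)
{
    size_t from_len = strlen(from), to_len = strlen(to);
    size_t count = 0;
    const char *p;
    char *out, *w;

    if (from_len == 0)
        return NULL;  /* an empty pattern would never advance */

    /* First pass: count matches so the result can be sized exactly. */
    for (p = text; (p = strstr(p, from)) != NULL; p += from_len)
        count++;

    out = malloc(strlen(text) + count * to_len - count * from_len + 1);
    if (!out)
        return NULL;

    /* Second pass: copy the text, swapping each match for `to`. */
    w = out;
    while ((p = strstr(text, from)) != NULL) {
        size_t n = (size_t)(p - text);
        memcpy(w, text, n);
        w += n;
        memcpy(w, to, to_len);
        w += to_len;
        text = p + from_len;
    }
    strcpy(w, text);
    return out;
}
```

Applying s/<_N>/_i/ and then s/<typ>/*INT/ this way to the first template line turns it into jit_emit_mov_ri_i(NATIVECODE, MAP[1], *INT_CONST[2]);, which is then subject to the identifier replacements listed earlier.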
To make it easier to share core.jit files between machines of similar architecture, the jit_emit functions should follow this syntax:
jit_emit_<op>_<args>_<type>
Access to Parrot registers is done relative to $6; all other memory access is done relative to $27. Float constants are accessed relative to $7, so you must precede the instruction with ldah $7,0($27).
Only 32 bit INTVALs are supported. Long double FLOATVALs are ok.
There are four mapped integer registers: %edi, %esi, %ecx, and %edx. The first two of these are callee-saved; they preserve their values across external function calls.
For floating point operations, the registers ST1 ... ST4 are mapped and considered as preserved over function calls.
The register %ebx holds the register frame pointer.
Let's see how this works:
Parrot Assembly:
    set I0,8
    set I2,I0
    print I2
    end
Parrot Bytecode: (only the bytecode segment is shown)
    +--------------------------------------+
    | 73 | 0 | 8 | 72 | 2 | 0 | 21 | 2 | 0 |
    +-|------------|------------|--------|-+
      |            |            |        |
      |            |            |        +----------- end (no arguments)
      |            |            +-------------------- print_i (1 argument)
      |            +--------------------------------- set_i_i (2 arguments)
      +---------------------------------------------- set_i_ic (2 arguments)
Please note that the opcode numbers used might have already changed. Also generated assembly code might be different.
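Reading the bytecode segment above requires knowing how many arguments each opcode takes, since a word such as 0 can be either an operand or the end opcode. The following C sketch walks the nine words of the example. The opcode numbers and argument counts are copied from the diagram; op_desc, op_table, find_op, and decode_segment are hypothetical names for this illustration, and the use of long for a bytecode word is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode description, filled in from the diagram above. */
typedef struct {
    long        opcode;
    const char *name;
    size_t      n_args;
} op_desc;

static const op_desc op_table[] = {
    {  0, "end",      0 },
    { 21, "print_i",  1 },
    { 72, "set_i_i",  2 },
    { 73, "set_i_ic", 2 },
};

static const op_desc *find_op(long opcode)
{
    size_t i;
    for (i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
        if (op_table[i].opcode == opcode)
            return &op_table[i];
    return NULL;
}

/* Walk the segment op by op, storing op names in `names`.
 * Returns the number of ops found, or -1 on an unknown opcode
 * or on operands that run past the end of the segment. */
static int decode_segment(const long *code, size_t len,
                          const char **names, size_t max_names)
{
    size_t pc = 0;
    int count = 0;
    while (pc < len) {
        const op_desc *op = find_op(code[pc]);
        if (!op || op->n_args > len - pc - 1)
            return -1;
        if ((size_t)count < max_names)
            names[count] = op->name;
        count++;
        pc += 1 + op->n_args;  /* skip the opcode word and its operands */
    }
    return count;
}
```

For the segment 73 0 8 72 2 0 21 2 0 this walk visits set_i_ic, set_i_i, print_i, and end, in that order, matching the four ops of the assembly source.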
Intel x86 assembly version of the Parrot ops:
Parrot_jit_begin
    0x817ddd0 <jit_func>:      push   %ebp
    0x817ddd1 <jit_func+1>:    mov    %esp,%ebp
    0x817ddd3 <jit_func+3>:    push   %ebx
    0x817ddd4 <jit_func+4>:    push   %esi
    0x817ddd5 <jit_func+5>:    push   %edi

normal function header till here, now push interpreter

    0x817ddd6 <jit_func+6>:    push   $0x8164420

get jit function table to %ebp and jump to first instruction

    0x817dddb <jit_func+11>:   mov    0xc(%ebp),%eax
    0x817ddde <jit_func+14>:   mov    $0x81773f0,%ebp
    0x817dde3 <jit_func+19>:   sub    $0x81774a8,%eax
    0x817dde9 <jit_func+25>:   jmp    *%ds:0x0(%ebp,%eax,1)
set_i_ic
0x817ddee <jit_func+30>: mov $0x8,%edi
set_i_i
0x817ddf3 <jit_func+35>: mov %edi,%ebx
Parrot_jit_save_registers
    0x817ddf5 <jit_func+37>:   mov    %edi,0x8164420
    0x817ddfb <jit_func+43>:   mov    %ebx,0x8164428
Parrot_jit_normal_op
    0x817de01 <jit_func+49>:   push   $0x81774c0
    0x817de06 <jit_func+54>:   call   0x804be00 <Parrot_print_i>
    0x817de0b <jit_func+59>:   add    $0x4,%esp
Parrot_jit_end
    0x817de0e <jit_func+62>:   add    $0x4,%esp
    0x817de14 <jit_func+68>:   pop    %edi
    0x817de16 <jit_func+70>:   pop    %ebx
    0x817de18 <jit_func+72>:   pop    %esi
    0x817de1a <jit_func+74>:   pop    %ebp
    0x817de1c <jit_func+76>:   ret
Please note the reverse argument direction. PASM and JIT notations use dest,src,src, while gdb and the internal macros in jit_emit.h have src,dest.
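The Parrot_jit_begin code in the listing ends with an indexed jump: it loads a value from 0xc(%ebp), subtracts a constant, and jumps through a table held in %ebp. One plausible reading, stated here as an assumption rather than as documented behavior, is that the loaded value is the bytecode program counter, the constant is the start of the bytecode, and the table holds one native address per bytecode word. The C sketch below shows that lookup; opcode_word and native_entry_for are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_word;  /* hypothetical stand-in for one bytecode word */

/* op_map has one slot per bytecode word; the slot for the first word of
 * an op holds the address of the native code generated for that op.
 * (Assumed layout, inferred from the listing, not confirmed by the text.) */
static void *native_entry_for(void *const *op_map,
                              const opcode_word *code_start,
                              const opcode_word *pc)
{
    ptrdiff_t word_index = pc - code_start;  /* distance in bytecode words */
    return op_map[word_index];
}
```

The listing uses the byte difference directly as the table offset, which only lines up when a table entry and a bytecode word have the same size, as they would on a 32-bit x86 build with 32-bit opcodes. The C version indexes by word count instead, so it does not depend on that coincidence.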
The above listing was generated by gdb, the GNU debugger, with a little help from Parrot_jit_debug, which generates a symbol file in stabs format; see info stabs for more (or less :-().
The following script calls ddd (the graphical debugger frontend) and attaches the symbol file after it has been built in parrot_build_asm.

    # dddp
    # run ddd parrot with given file
    # gdb confirmations should be off
    parrot -o $1.pbc -d1 $1.pasm
    echo "b runops_jit
    r -D4 -j $1.pbc
    n
    add-symbol-file $1.o 0
    s
    " > .ddd
    ddd --command .ddd parrot &
Run this with e.g. dddp t/op/jit_2, then turn on the register status, step or nexti through the source, or set break points as with any other language.
You can examine Parrot registers via the debugger, or even set them, and you can always step into external opcodes and look at *interpreter.
The tests t/op/jit*.t have some test cases for testing register allocation. These tests are written for a mapping of 4 processor registers. If your processor architecture has more mapped registers, reduce them to 4 and run these tests.
    $ cat j.pasm
    set I0, 10
    set N1, 1.1
    set S2, "abc"
    print "\n"
    end
    $ dddp j

ddd shows the above source code and assembly (startup code snipped):

    0x815de46 <jit_func+30>:   mov    $0xa,%ebx
    0x815de4b <jit_func+35>:   fldl   0x81584c0
    0x815de51 <jit_func+41>:   fstp   %st(2)
    0x815de53 <jit_func+43>:   mov    %ebx,0x8158098
    0x815de59 <jit_func+49>:   fld    %st(1)
    0x815de5b <jit_func+51>:   fstpl  0x8158120
    0x815de61 <jit_func+57>:   push   $0x815cd90
    0x815de66 <jit_func+62>:   call   0x804db90 <Parrot_set_s_sc>
    0x815de6b <jit_func+67>:   add    $0x4,%esp
    0x815de6e <jit_func+70>:   push   $0x815cd9c
    0x815de73 <jit_func+75>:   call   0x804bcd0 <Parrot_print_sc>
    0x815de78 <jit_func+80>:   add    $0x4,%esp
    0x815de7b <jit_func+83>:   add    $0x4,%esp
    0x815de81 <jit_func+89>:   pop    %edi
    0x815de83 <jit_func+91>:   pop    %ebx
    0x815de85 <jit_func+93>:   pop    %esi
    0x815de87 <jit_func+95>:   pop    %ebp
    0x815de89 <jit_func+97>:   ret
    (gdb) n
    (gdb) n
    (gdb) n
    (gdb) p I0
    $1 = 10
    (gdb) p N1
    $2 = 1.1000000000000001
    (gdb) p *S2
    $3 = {bufstart = 0x815ad30, buflen = 15, flags = 336128, bufused = 3, strstart = 0x815ad30 "abc"}
    (gdb) p &I0
    $4 = (INTVAL *) 0x8158098
XXX (p)rinting register contents like shown above is currently not supported.
docs/dev/jit_i386.pod, jit/skeleton/jit_emit.h