YUAN, Feng
( Hewlett Packard Singapore )
email: yuanfeng@hpsgrt1.sgp.hp.com
Paper submitted to PROMPT HP-GC Anniversary Conference 1994
This paper describes an ongoing experiment to compile Pascal program into system RPL source code, which is a high level language used to build the HP RPN calculators.
Despite the popularity of HP RPN calculators like HP 48SX and HP 48 GX, the ability to program HP calculators in user RPL or system RPL is still regarded as something only the super users could do.
System RPL ( reverse polish lisp ) and user RPL are hard to programming for several reasons:
Technically speaking, RPL is an intermediate code that works on a software emulated stack machine. RPL has lots of cousins: for example, Forth which is one of the predecessor of RPL, P_code which is used by several Pascal compilers, HPcode which is used by HP in many of its compilers. For this reason, the author has chosen to work on porting the P4 Pascal compiler to generate system RPL source code.
The P4 package is a public domain portable Pascal compiler and P_code interpreter written by Urs Ammann and others from ETH Zurich and fully documented in S. Pemberton's book titled "Pascal Implementation". The P4 compiler has been substantially modified to generate system RPL code.
The following sections will describe the details of compiling Pascal source code into system RPL code, familiarity with both languages are assumed.
One of the design objective is that we must produce system RPL code that can merge with hand written system RPL code as smoothly as possible. That is to say we should be able to call hand written code and being called by hand written code. One big obstacle to this is variable storage and access.
P_code machine stores local variables on the stack. Each procedure call creates a standard stack frame, a chunk of parameters and local variables. The stack frame contains 5 elements for return address, function return value, base address, last base address and maximum stack extent. To access a variable, you must specify the static block level and offset within the block. During runtime, there is a MP register which points to the first variable of the current block. To access a local variable, adding offset to MP will get the address. To access variables defined in outer block, you get its base address from the stack frame chain.
For example, the following example shows a simple Pascal procedure and the corresponding P_code:
procedure p(i,j: integer); var k: integer; begin k:=i+j end; ent1 8 p needs 5+2+1 for stack frame, parameters and locals ent2 7 p will need another 7 stack levels for computation, not so accurate lodi 5 copy parameter i to top_of_stack lodi 6 copy parameter j to top_of_stack adi integer addition stri 7 store into local variable k retp procedure return
In system RPL, there is no MP register and stack frame chain ( except a return address that is pushed on the return stack ), so we can't generate 5 element stack frame and use (block level, offset) based addressing.
Nevertheless, because Pascal is a strong typed language, for local variable access, we know for sure during compile time how many elements are there on the (local) stack. We can then use stack operations to fetch and store variables, similar to what system RPL programmers are doing. For the example shown above, we know for sure when entering routine p, two elements would be on the stack: i on stack level 2 and j on stack level 1. So it can be compiled into the following system RPL code:
NAMELESS proc_3 :: (2) standard RPL routine entering ZERO (3) reserve space for local variable k 3PICK (4) i is three levels away 3PICK (5) j is now three levels away too #+ (4) binary integer addition ( integer is implemented as unsigned integer now, beware ) SWAPDROP (3) store into location for k 3DROP (0) remove i,j,k from stack ; standard RPL routine quitting
The numbers enclosed in parenthesis indicate the current (local) stack levels. It starts with 2 for 2 parameters, incremented as you reserving space for k, loading i and j, decremented as you adding or storing and finally cleared before you quit. While the original P4 compiler does maintain a stack level during compilation, it's not updated in some cases.
Stack level is very critical for compiling into system RPL, the P4 code is modified to maintain exact stack level. As a side note, because system RPL code uses lots of relative fetching and storing words like 3PICK, ROT, UNROT and 4UNROLL3DROP, system RPL code is hard to write, understand, debug and modify. A Pascal or C programmer seldom has to worry about the runtime address of a variable.
Compiling a Pascal function would be a little more complicated in handling return value. To following example illustrates this:
function p(a,b,c: real): real; begin p:=sqrt(b*b-4.0*a*c) end; NAMELESS proc_3 :: (3) ZERO (4) reserve space for return function value 3PICK (5) b is three levels away 4PICK (6) b is four levels away now ( DUP would be better ) %* (5) b*b % 4.0 (6) direct constant ( not smart enough to use %4 ) 6PICK (7) a %* (6) 4.0*a 4PICK (7) c %* (6) 4.0*a*c %- (5) b*b-4.0*a.c %SQRT (5) real square root XY>Y (4) store as function return value 4UNROLL3DROP (1) retain top of stack, drop 3 levels under it ;
For global variables, while we can store them on the stack, we have no way to locate them during runtime without using some form of base pointer. So, we decided to use temporary environment and unnamed LAM variables for global variables. The compilation is quite straight forward. Here is an example for globals:
program test(input,output); var a,b,p: real; begin p:=a+b end. NAMELESS proc_3 :: (0) ZERO THREE NDUPN DROP (3) three dummy values ' NULLLAM THREE NDUPN DOBIND (0) environment for 3 variables 3GETLAM (1) a is at fixed location now 2GETLAM (2) b %+ (1) a+b 1PUTLAM (0) p is the first variable ABND (0) discard environment ;
Program like shown above can only be useful if it can read value from somewhere outside and return result to somewhere. Generally speaking, we need the power to bypass the Pascal compiler, and to be able to insert system RPL words into the code generated. Inline procedures as defined Turbo Pascal would be a solution.
If we define two inline procedures 'tos' to fetch the top of the non local stack element and 'push' to push something on the non local stack, we can rewrite the above program as:
program test(input,output); var a,b: real; function tos: real; inline ' ( tos ) '; procedure push(val: real); inline ' ( push ) '; begin a:=tos; b:=tos; push(a+b) end. NAMELESS proc_3 :: (0) ZEROZERO (2) two dummy values ' NULLLAM TWO NDUPN DOBIND (0) environment for 2 variables ( tos ) (1) comment only 2PUTLAM (0) store tos into a ( tos ) (1) comment only 1PUTLAM (0) store tos into b 2GETLAM (1) 1GETLAM (2) %+ (1) a+b ( push ) (0) leaving result on stack ABND (0) discard environment ;
By non local stack, we mean everything on the stack before the first parameter of the current procedure/function is pushed. Beware that the tos and push defined here only works for empty local stack. They should be defined by the compiler so that they can work even when the local stack is not empty. For example, the expression 'a+tos' should be compiled into:
2GETLAM (1) local stack contains one element SWAP (2) get one element from nonlocal stack, remove old copy %+
With tos, we can even accept variable number of parameters. Here is an example which does list summation:
function decompose(l: list): integer; inline 'INNERCOMP'; function sumlist(listlen: integer): integer; var sum: integer; begin sum:=0; while listlen <> 0 do begin sum:=sum+tos; listlen:=listlen-1 end; sumlist:=sum end; begin sumlist(decompose(tos)); end.
Inline procedures/functions can be very useful in extending Pascal's limited standard procedures/functions and access HP calculator's rich reservoir of math functions and user interface functions, while enjoying the benefits of Pascal compiler at the same time. Here are some examples:
function max(a,b: integer): integer; inline '#MAX'; function min(a,b: integer): integer; inline '#MIN'; type string=integer; (* Define string as occupying one element space *) (* so that it can be a function return type *) function concat(s1,s2: string): string; inline '&S'; (* Char is implemented as binary integer internally, so *) (* it needs to be converted to CHR before calling >T$ *) function conchr(s1: string; c: char): string; inline '#>CHR >T$'; (* graphics *) procedure lineon(x1,y1,x2,y2: integer); inline 'LINEON3'; (* complex, list, matrix, etc add your own *)
Although user defined record type in Pascal looks like a major feature, its implementation is quite straight forward. The compiler has to remember the offset address of every field relative to the start of the record. When generating code for simple variable access, adding field offset to the variable address will get the field address. Here is an example with both local and global variables:
type complex = record rel: real; img: real end; var a: complex; procedure negate; var b: complex; begin b.rel:=-a.rel; b.img:=-a.img end; NAMELESS proc_3 :: (0) ZEROZERO (2) ( b.rel b.img ) 1GETLAM (3) ( b.rel b.img a.rel ) %CHS (3) ( b.rel b.img -a.rel ) XYZ>ZY (2) ( -a.rel b.img ) 2GETLAM (3) ( -a.rel b.img a.img ) %CHS (3) ( -a.rel b.img -a.img ) XY>Y (2) ( -a.rel -b.img ) 2DROP (0) ( ) ;
Up to now, the compiler still manages to find the exact address ( offset ) of a variable, local or global. Array as defined in Pascal would be totally different. Array in Pascal is a composite data type, with each element being a valid data object.While system RPL and user RPL defines array as a simple data type (atomic). Which means that each element is an object body. Object prologue should be added to it to form a valid object. When we compile Pascal into system RPL here, array is implemented as a composite data type. Separate mechanism should be provided to handle one dimensional and two dimensional arrays in RPL, and to explore HP calculator's rich set of array functions.
The basic problem of compiling array access is to generate code to compute the address of 'a[i]', where a is defined as:
a: array [index_min..index_max] of typet;
Any book on compiler will give a formula like:
address(a[i]) = address(a) + (i-index_min)*sizeof(typet); = (address(a)-index_min*sizeof(typet)) + i*sizeof(typet);
For local variable access, another version of the formula is used, because we are using offset that is relative to the current top of stack:
offset(a[i]) = offset(a) - (i-index_min)*sizeof(typet); = (offset(a)+index_min*sizeof(typet)) - i*sizeof(typet);
It follows that, we can't use DUP, OVER, 3PICK, 4PICK ... to fetch an array element, we can't use XY>Y, XYZ>ZY to store, because address is only available at runtime. We should instead use PICK and STOI which both accept an offset on the stack.
STOI means indexed store and can be defined as:
( Indexed store: obn obn-1 ... ob1 index val ) ( ==> val obn-1 ... ob1 ) NAMELESS STOI :: SWAPDUP #2+ ROLLDROP UNROLL ;
An example will help to illustrate:
procedure p; var a: array [5..7] of record rel: integer; img: integer end; i: integer; begin i:=6; a[i].rel:=1; a[i+1].rel:=a[i].rel+1 end; NAMELESS proc_3 :: ( 0) ZERO SEVEN NDUPN DROP ( 7) 6 for array, one for integer SIX ( 8) 6 XY>Y ( 7) store into i SEVENTEEN ( 8) 7+5*2, 7 is offset(a) OVER ( 9) read i #2* #- ( 8) 7+(5-i)*2 ONE ( 9) 1 STOI ( 7) a[i].rel:=1 SEVENTEEN ( 8) 7+5*2, 7 is offset(a) OVER ( 9) i ONE (10) 1 #+ ( 9) i+1 #2* #- ( 8) 7+(5-i-1)*2, offset for a[i+1] EIGHTEEN ( 9) 8+5*2, 8 is offset(a) now 3PICK (10) i #2* #- ( 9) 8+(5-i)*2 PICK ( 9) a[i] ONE (10) 1 #+ ( 9) a[i]+1 STOI ( 7) a[i+1]:=a[i]+1 7DROP ( 0) clean up ;
An experienced system RPL programmer would say, this is only the code generated by a compiler, not me.
Control structures in system RPL is actually very rich, we have IT, ITE, DO_LOOP, BEGIN_WHILE_REPEAT, BEGIN_AGAIN, BEGIN_UNTIL, ', SEMI, COLA, SKIP, case and even GOTO. The excessive usage of SKIP ( explicitly or implicitly ) make system RPL control structures not as efficient as GOTO based implementation. For example, skipping one single object or object pointers in run stream takes about 120 clock cycles. Skipping a block of 33 objects would take 1ms on GX machines.
We base our implementation here on relative jumps. Two new routines are defined to support IF_THEN_ELSE, WHILE_DO, REPEAT_UNTIL and GOTO in Pascal: JMP and FJMP.
JMP expects a binary integer offset on the stack, adding this offset to the next RPL instruction pointer will get the target instruction pointer. That integer should be a direct embedded constant. The original system RPL word GOTO must be followed by an object pointer. Which is faster, shorter but more restrictive.
FJMP expects a True/False flag and a binary integer offset on the stack. If the flag is False, control is passed to that target address, otherwise, the next instruction in run stream is executed.
GOTO statement is just JMP: goto 00; ... 00: ... can be compiled as:
DOBINT ASSEMBLE REL(5) lab_1 RPL JMP ... LOCALLABEL lab_1 ...
To simplify the example codes, we assume we could define a macro for system RPL compiler:
DEFINE OFFSET(x) DOBINT \ ASSEMBLE \ REL(5) x \ RPL
IF statement: if ... then ... else ... can be compiled as:
.. conditional evaluation OFFSET(lab_1) FJMP ... then part OFFSET(lab_2) JMP LOCALLABEL lab_1 ... else part LOCALLABEL lab_2
WHILE statement: while ... do ... can be compiled as:
LOCALLABEL lab_1 ... conditional part OFFSET(lab_2) FJMP ... loop body OFFSET(lab_1) JMP LOCALLABEL lab_2
REPEAT statement: repeat ... until ... can be compiled as:
LOCALLABEL lab_1 ... loop body ... conditional part OFFSET(lab_1) FJMP
Implementation of FOR statement needs a special location for storing the end of loop variable index. Our Prting of P4 to system RPL compiler is designed to be a single pass compiler, so we can't allocate space for the end of loop index together with parameters and other variables. We choose to put them dynamically on the data stack. FOR statement: for i:=start to end do .... can be compiled to:
... (1) evaluate 'start' 1PUTLAM (0) i:=start ... (1) evaluate 'end', leave it on stack LOCALLABEL lab_1 1GETLAM (2) read current i OVER (3) end of loop value #> NOT (2) i<=end OFFSET(lab_2) jump to end if index passed end_of_loop FJMP (1) ... loop body 1GETLAM (2) read i #1+ (2) increment by 1 1PUTLAM (1) store i back OFFSET(lab_1) jump to loop test JMP LOCALLABEL lab_2 DROP (0) remove end_of_loop index from data stack
System RPL programmer would notice that this implementation is less efficient than the system RPL built-in DO_LOOP which is supported by a runtime Do_Loop_Environment. DO_LOOP is not used to implement Pascal's FOR statement because DO_LOOP is executed at least once and access of the loop index within and outside the loop body is not well defined ( especially if there are more than two levels of FOR loop). Another reason is DO_LOOP does not allow simple break_out_of_loop mechanism ( especially for nested loops ).
Breaking out of WHILE and REPEAT loop is simply a normal GOTO statement. Breaking out of a FOR loop using GOTO statement is possible too except that compiler would not automatically drop the end_of_loop index(es) for you. You will have to place the exact number of 'drop's before GOTO, where 'drop' is defined as:
procedure drop; inline 'DROP';
For example, the following code fragment will be compiled correctly:
label 00; ... for i:=1 to 10 do for j:=1 to 10 do for k:=1 to 10 do if solution_found(i,j,k) then begin drop; drop; drop; goto 00 end; 00: ...
We will finish this section by implementing the CASE statement. The concept of CASE in Pascal or switch in C is implemented by a series of conditional statements in user RPL. System RPL provides several methods to perform similar tasks: OVER#=case, EQLookup, CK To implement CASE statement, we define a new basic control word here: XJP ( case jump ). XJP accepts a binary integer on the data stack and a binary integer array of relative offsets that follows the XJP word. XJP just finds the relative jump offset for that index, and jumps to that location. Here is an example:
The preceding sections describe how Pascal features can be compiled into system RPL code, we have touched:
Some features are already supported, but not explicitly mentioned:
Still, there are lots of standard or commonly accepted extended Pascal features not supported or not implemented quite well in this implementation, which provides lots of opportunities for future developments:
Another direction of possible developments is to expand Pascal's limited data types and standard procedures/functions using system RPL's rich resources and good concepts:
Still another direction of possible development is to make the experiment reported here truly accessible and useful to end users. Things to be done includes:
The author would like to acknowledge the HP calculator software team members for introducing me to the wonderful land of HP calculator. Jim Donnelly gave the first lesson on user RPL programming. Charlie Patton's lectures and lots of emails are very valuable to the understand of RPL internals.
[1] W.C. Wickes, "An Evolutionary RPN Calculators for Technical professionals", HP Journal, Vol. 38, No. 8, August 1987, pp.11-17
procedure testcase(c: char);
begin
case c of
'0': ...
'1': ...
'2': ...
'3': ...
end;
...
end;
NAMELESS proc_3
::
DUP fetch c
OFFSET(lab_1) jump forward for XJP
JMP multiple pass compiler can do a better job
LOCALLABEL lab_3 '0' goes here
...
OFFSET(lab_2) jump pass case statement
JMP
LOCALLABEL lab_4 '1' goes here
...
OFFSET(lab_2) jump pass case statement
JMP
LOCALLABEL lab_5 '2' goes here
...
OFFSET(lab_2) jump pass case statement
JMP
LOCALLABEL lab_6 '3' goes here
...
OFFSET(lab_2) jump pass case statement
JMP
LOCALLABEL lab_1 actual dispatch without check
FORTYEIGHT subtract internal value of '0'
#-
XJP case jump
ASSEMBLE
CON(5) =DOARRY an array object here
REL(5) lab_2 indicating size of array
CON(5) =DOBINT element type
CON(5) 1 one dimensional array
CON(5) 4 four elements
REL(5) lab_3 offset for '0'
REL(5) lab_4 offset for '1'
REL(5) lab_5 offset for '2'
REL(5) lab_6 offset for '3'
RPL
LOCALLABEL lab_2 end of case code
...
DROP
;
6. Limitations and possible future developments
7. Acknowledgment
8. References
[2] D.K. Byrne et al, "An Advanced Scientific Graphing Calculators", HP Journal, Vol. 45, No. 4, August 1994, pp.6-22
[3] S. Pemberton & M. Daniels, "Pascal Implementation", Ellis Horwood, Chichester, UK.
[4] B. Kinnersley, "The Language List", Ver. 2.3, Sep 1994, Internet newsgroup comp.lang.misc
[5] James Donnelly, "The HP48 Handbook", Armstrong Publishing Company, 2nd Edition, 1993
[6] W.C. Wickes, "HP 48 Insights", Part I and II, Larken Publications, 1992
[7] RPLMAN.DOC, RPLCOMP.DOC, ftp from hpcvbbs.cv.hp.com
9. Appendix
Basic Pascal runtime support library
( data stack: # -> )
( run stream: DOARRY len*5+5 DOBINT off0 off1 off2 .... offn-1 )
NAMELESS XJP
CODE
GOSBVL =POP# A=offset
C=A A
A=A+A A
A=A+A A
A=C+A A A*=5
D0=D0+ 10
D0=D0+ 15 D0: point to first offset
CD0EX
C=C+A A C: point to THE offset
CD0EX
A=DAT0 A A: value of offset
CD0EX
C=C+A A C=D0+offset
CD0EX
LOOP
ENDCODE
NAMELESS JMP ( #offset -> )
CODE
GOSBVL =POP# A=offset
CD0EX C=Instruction pointer
C=C+A A C+=offset
CD0EX D0+=offset
D0=D0- 10 adjustment ( 16 for external ROM )
LOOP
ENDCODE
NAMELESS FJMP ( T/F #offset -> )
CODE
GOSBVL =POP# A=offset
R0=A.F A R0=offset
GOSBVL =popflag
A=R0.F A
GOC over
CD0EX C=Instruction pointer
C=C+A A C+=offset
CD0EX D0+=offset
D0=D0- 10 adjustment ( 16 for external ROM )
over LOOP
ENDCODE
* Indexed store: stack[n] stack[n-1] .... stack[1] index value
* ==> value stack[n-1] .... stack[1]
NAMELESS STOI
::
SWAPDUP #2+ ROLLDROP UNROLL
;