\documentstyle[a4,12pt]{article}
\begin{document}
\author{APM Manual pages}
\title{Procedures, Registers and Modules in IMP/Pascal}
\maketitle
\parskip .1 in
\setcounter{secnumdepth}{10}
\parindent 0in

\section{Procedure Call Conventions and Register Usage}

\subsection{Modules}

The code generated by the IMP/Pascal compilers is pure and position-independent.

A module is an independent compilation unit, corresponding to a main program or a file of external procedures and/or external data. A main program module may also contain external procedures/data; the only distinction is that it has a non-null main block.

If a module has any static data (including linkage for external references), the module contains a code sequence to initialise the allocated static data area. This allows a module to be reset without re-loading or re-allocating the static data area.

The system provides for automatic linking of modules at load time, or dynamically, at the time of first call.

\subsection{Register usage}

\begin{description}
\item[SP] stack pointer
\item[A6] link register for level 1 procedures
\item[A5] base for process-global data
\item[A4] base for static data at positive displacements (all modules which have static data) and for top-level data at negative displacements (main program module, level 0)
\item[A3-A0] temporaries and parameter passing
\item[D7] unassigned pattern
\item[D6] stack limit + 256
\item[D5] line number (if maintained dynamically)
\item[D4] byte accumulator (high order 24 bits kept clear)
\item[D3-D0] temporaries and parameter passing
\end{description}

\subsection{Parameter passing}

By default, parameters and results are passed in registers, up to the number assigned for this purpose (4 data registers and 4 address registers). {\em Values} are passed in data registers and {\em addresses} are passed in address registers. Registers in the two groups are assigned starting from D0 and A0, respectively.
A {\em value result} for a function is passed in D0 (D0 and D1 if two-valued) and an {\em address result} in A0.

For these purposes a {\em value} is any parameter passed by value which is a simple scalar or an aligned structure occupying no more than 32 bits; an {\em address} is any parameter passed by name or any other structure passed by value.

The initial code in a procedure has the task of assigning the parameters passed into it in registers to local storage as appropriate. [Under optimisation, this may not be necessary in the case of simple procedures.] In the case of structure values passed as addresses, it is the structure content which is stored, not the address, so that the properties of pass-by-value are preserved.

When the result of a structure function is returned as an address in A0, it may point to the now released stack frame, and the receiving code must move the structure elsewhere (e.g. by pushing it properly onto the stack).

Parameters beyond those which can be accommodated in registers of the appropriate group are pushed to the stack in reverse order of occurrence. In this case, the actual value is pushed for all value parameters (including structures), and the 32-bit address for \%name parameters.

There are special unpublicised conventions in the IMP compiler which allow explicit control to be exercised over how many registers are used for parameter passing and hence when the escape to the stack is invoked.
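
To illustrate the default rules in this subsection, consider two hypothetical declarations. The procedure and parameter names are invented for this example, and it is assumed that a Pascal VAR parameter is treated as a by-name (address) parameter. On that reading, the parameters would be assigned roughly as follows:

\tt \small
\begin{verbatim}
PROCEDURE p(a: Integer; VAR b: Integer; c: Integer; VAR d: Integer);
   (hypothetical declaration, for illustration only)

   a     value      D0      first value parameter
   b     address    A0      first address parameter
   c     value      D1      second value parameter
   d     address    A1      second address parameter

PROCEDURE q(v1, v2, v3, v4, v5, v6: Integer);
   (hypothetical declaration, for illustration only)

   v1..v4   values   D0..D3   the four data registers are now used up
   v6       value    stack    pushed first (reverse order of occurrence)
   v5       value    stack    pushed last
\end{verbatim}
\normalsize \rm

The second case assumes the default allocation of four data registers; the unpublicised IMP conventions mentioned above could change the point at which parameters overflow to the stack.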
\subsection{Procedure call mechanism}

{\em Internal} procedures (i.e. those local to the module making the call) are called using BSR (possibly reach-extended by one or more BRAs as stepping-stones):
\tt \small
\begin{verbatim}
   evaluate parameters; BSR dddd; remove any stacked parameters
\end{verbatim}
\normalsize \rm

{\em Extracode} procedures (regarded as hardware extensions) are called by the sequence:
\tt \small
\begin{verbatim}
   evaluate parameters; JSR aaaa; remove any stacked parameters
\end{verbatim}
\normalsize \rm
where aaaa is the absolute address of the relevant extracode vector slot.

{\em System} procedures are called by the sequence:
\tt \small
\begin{verbatim}
   evaluate parameters; JSR xxxx(A4); remove any stacked parameters
\end{verbatim}
\normalsize \rm
with the content of the section of the static data area starting at xxxx initialised at load/link time to:
\tt \small
\begin{verbatim}
   JMP eeeeeeee
\end{verbatim}
\normalsize \rm

{\em External} procedures are called by the sequence:
\tt \small
\begin{verbatim}
   push A4; eval param; JSR xxxx(A4); remove param; pop A4
\end{verbatim}
\normalsize \rm
with the content of the section of the static data area starting at xxxx initialised at load/link time to:
\tt \small
\begin{verbatim}
   MOVE.L #newa4,A4; JMP eeeeeeee
\end{verbatim}
\normalsize \rm

For {\em dynamic} external procedures, the call is the same but the 12-byte sequence is initially set to invoke the loader.

Note that it is the responsibility of the calling sequence to adjust the Stack Pointer to remove any stacked parameters following the call. [Motorola thought of ReTurn and De-allocate stack (RTD) too late for the original 68000.]

There is no general assumption about the preservation of the values of the temporary registers across procedure calls. The standard entry sequence therefore does not save these registers.
In the case of most extracodes and certain system procedures, however, there are stipulated registers involved in each case and other registers are preserved.

\subsection{Addressing conventions}

The {\em process-global} data area maintains the process environment, in particular the input/output context and the exception trap linkage. This is common to all the modules making up a complete program. It is addressed by the Global Base register (A5).

{\em Static} (own) data, in either main program modules or external modules, is accessed at non-negative displacements from the Static Base register (A4). [Comment: the first declared own enjoys an efficiency bonus.]

Every module which has any static data (required also for linkage for external references) has its own static data area which is private to (a specific incarnation of) the module. [The code of a module is sharable; the data area, in general, is not.] The calling sequence for an external procedure (or one that might be, e.g. a procedure parameter) includes code to save and restore the current Static Base, and the transfer sequence sets it for the module containing the procedure. This is done without affecting the way in which parameters are handled.

In a {\em main program module}, the static data area is allocated from the initial stack allocation for the program and addressed in the usual way positively from the Static Base. `{\em Dynamic}' data in the main program (top-level) is then allocated on the stack and is addressable negatively from Static Base.

{\em Local} variables in the current stack frame are accessed SP-relative (positive or zero displacements). This strategy is used since many procedures have variables which are accessed only locally, so that no other form of access needs to be provided.

Variables at {\em intermediate} levels, neither local nor top-level, are accessed using a version of the `display' technique for multi-level addressing.
A single address register (A6) is reserved for level 1 procedures (main program being taken as level 0) and the display for lower levels is held in store (at the start of the process-global area, maximum level 7). The store-held part of the display is slaved into temporary address registers as needed.

In most cases the stack extension for a block or procedure is of {\em fixed} size. The extension is made incrementally, corresponding to the declaration of individual dynamic variables, but with accumulation of multiple declarations. This strategy allows efficient assignment of initial values and enables the use of SP-relative addressing on a one-pass basis.

Where arrays of {\em variable} size are involved, the total stack extension is of unknown size. In such cases, the dope information and one or two pointer slots for each array are allocated as part of the fixed stack extension, and the arrays are subsequently allocated at the end of the fixed part. The presence of this section of unknown length prevents the use of SP-relative addressing (except for later declared compiler temporaries), so that local addressing in such cases follows the pattern for intermediate level addressing. Fixed size arrays which are big enough to threaten addressability of the fixed section (max 32k bytes) are treated as if they were of variable size.

At procedure entry, a LINK operation (on A6 or stored display slot) is performed if at least one of the following conditions holds:
\begin{itemize}
\item there are non-local references to variables at this level;
\item there are dynamic (or large) objects declared at this level;
\item there is an event trap statement in the procedure;
\item stack diagnostics are enabled [pending].
\end{itemize}
[Although the compilers are one-pass, the entry sequence is generated at the end of a procedure, rather than the beginning.]

\section{Object Module Format}

Object module files consist of a number of sections.
These are, in order of occurrence:
\begin{description}
\item[HEADER] information about complete module and section lengths
\item[EXPORTS] list of identifiers defined in this module for external use
\item[IMPORTS] list of external identifiers referenced in this module
\item[CODE] binary position-independent code of module
\item[DIAG] diagnostic tables if present
\end{description}

All section sizes are even numbers of bytes. The total length of the object module file is the sum of the sizes of the individual sections. Numeric information is in standard M68000 binary two's-complement representation (MS bytes first).

The {\em header} section has a fixed size of 32 bytes (for module format V02). Its layout is:
\tt \small
\begin{verbatim}
         __________________
0000 /  |    F E    0 2    |   object module code + version
0002 /  |    0 0    0 0    |   spare
0004 /  |   export size    |   size of export section (bytes)
0006 /  |   import size    |   size of import section (bytes)
0008 /  |       code       |   size of
000A /  |       size       |   code section (bytes)
000C /  |   reset entry    |   reset entry point (word) [1]
000E /  |   main entry     |   main entry point (word) [1]
0010 /  |   static data    |   required size for
0012 /  |       size       |   static data area (bytes)
0014 /  |      stack       |   required size for
0016 /  |       size       |   stack (bytes) [2]
0018 /  |   diag section   |   size of
001A /  |       size       |   diagnostic section
001C /  |    0 0    0 0    |   spare
001E /  |    0 0    0 0    |   spare
         __________________

[1] word (16-bits) displacement from start of code section
[2] stack size > 0 is actual requirement when known
    stack size <= 0 is negated minimum requirement
\end{verbatim}
\normalsize \rm

Both the {\em import} and {\em export} sections consist of a variable number of consecutive records, one for each identifier. The last record is followed by at least one zero word as an end marker. (This is for processing convenience; the overall section size is as given in the header.) The records are of variable size, depending on the length of the identifier.
The size of the record is 13 plus the length of the identifier, rounded up to an even number of bytes: 12 bytes of fixed fields, one length byte and the identifier text. For example, a 4-character identifier gives $13 + 4 = 17$, rounded up to 18 bytes. The layout of each record is:
\tt \small
\begin{verbatim}
         __________________
0000 /  |    x x    x x    |   flag word [1]
0002 /  |       type       |   type [2]
0004 /  |        "         |   information
0006 /  |        "         |   words
0008 /  |     address      |   byte address of
000A /  |        "         |   entity or reference [3]
000C+/  |    identifier    |   text of identifier
     /  :        "         :   (length-prefixed string)
     /  :        "         :
     /  |        "         |
         __________________

Note [1] relevant bits ( MS bit = 15):
            15 = 0: last (terminator word)
               = 1: not-last
            14 = 0: internal (to be ignored)
               = 1: external
         13,12 = 00: data object
               = 01: system procedure
               = 10: external procedure
               = 11: dynamic external procedure
\end{verbatim}
\normalsize \rm

Note [2]: to be specified; not used at present.

Note [3]: the meaning of the address field depends on the kind of entry:
\begin{itemize}
\item for a {\em data} entry in the {\em export} list, the address is the byte displacement of the object relative to the start of the static data area;
\item for a {\em procedure} entry in the {\em export} list, the address is the byte displacement of the entry-point relative to the start of the code area;
\item for a {\em data} entry in the {\em import} list, the address is the byte displacement in the static data area of a 4-byte slot to be filled in with the absolute address of the object;
\item for a {\em procedure} entry in the {\em import} list, the address is the byte displacement in the static data area of a 6-byte (system) or 12-byte (external) call sequence to be set up to call the procedure.
\end{itemize}

\subsection{Example Pascal program}
\tt \small
\begin{verbatim}
PROGRAM simple;
PROCEDURE process(a:Integer); EXTERN;
VAR i:Integer;
BEGIN
   read(i);
   process(i);
END.
\end{verbatim}
\normalsize \rm

\subsection{Complete object module file (no diagnostics)}
\tt \small
\begin{verbatim}
HEADER
0000  FE02   file-type code (FE), version (02)
0002  0000   [unused]
0004  0000   size of export section (0)
0006  0028   size of import section (40 bytes)
0008  0000   code section size
000A  0044      (68 bytes)
000C  000D   reset entry (word 13 = byte 26)
000E  0001   main entry (word 1 = byte 2)
0010  0000   static data requirement
0012  0018      (24 bytes)
0014  FFFF   stack requirement
0016  FFF0      (16+ bytes)
0018  0000   diag section size
001A  0000      (0)
001C  0000   [unused]
001E  0000   [unused]

IMPORTS
0020  D000   system procedure "RINT"
0022  0000
0024  0000
0026  0000
0028  0000   relative address in static
002A  0000   data area of transfer sequence (0)
002C  0452   "R
002E  494E    IN
0030  5400    T" + pad
0032  E000   external procedure "process"
0034  0000
0036  0000
0038  0000
003A  0000   relative address in static data
003C  000C   area of transfer sequence (=12)
003E  0770   "p
0040  726F    ro
0042  6365    ce
0044  7373    ss"
0046  0000   terminator

CODE
0048  4E75
004A  206D
....  ....
008A  0000
\end{verbatim}
\normalsize \rm

\section{Creating object modules in assembly language programming}

An include file is available for use when preparing assembly language programs which are to be compatible with modules written in IMP or Pascal. By including the statement
\begin{center}
{\tt INCL INC:MODULE.ASM}
\end{center}
in the assembly language source file, the following macros are made available to allow the assembled output to be in FE02 format suitable for ``linking'' with modules written in IMP or Pascal:
\begin{description}
\item[MODULE text-string]
The file should contain exactly one call of this. TEXT-STRING is the module-name, which is ignored except that it appears as the title on every page of the listing.
\item[EXPORT symbol,mode]
For each procedure or data object defined in this module for use from other modules, a call on EXPORT should be made. SYMBOL is the name given to the external object, and is also the label which must appear in the code or data section.
MODE is one of DATA, EXTERNAL, or SYSTEM.
\item[IMPORT symbol,mode]
For each procedure or data object referenced but not defined in this program, a call on IMPORT should be made. SYMBOL is the name of the external object, and the label which must appear in the data section. MODE is one of DATA, EXTERNAL, SYSTEM, or DYNAMIC.
\item[CODE]
This macro should be called if the module contains any code.
\item[STACK SET value]
This defines the value plugged into the module header which informs the loader of the estimated maximum stack requirement of the procedures in this module ($>0$: known maximum, $<0$: negated guesstimate, $=0$: unknown).
\item[CALL symbol]
This is used to call an external or dynamic procedure.
\item[SCALL symbol]
This is used to call a system (data-less external) procedure.
\item[ADDRESS symbol,ptr]
This is used to address an external data object.
\item[COPYDATA]
This may be used in the reset procedure to initialise the data area.
\item[DATA]
This macro should be called if the module contains any static data or any references to external objects.
\item[symbol VECTOR mode]
This macro should be called, in the data section, for every imported object. SYMBOL and MODE are the same as in the IMPORT statement. This generates either a single longword to receive the address of the external object (for DATA references) or the appropriate entry sequence for calling external procedures (6 bytes for SYSTEM procedures, 12 for EXTERNAL and DYNAMIC ones).
\item[ENDMODULE]
This should be called last thing, immediately before the END statement. Its effect is to delimit the end of the last section, and to insert dummy export, import, and code sections if they were absent.
\end{description}

\subsection{Sequencing of directives}

The calls on the macros listed above must be made in the following order, corresponding to the order of sections in the object module:
\begin{enumerate}
\item (optional, once) STACK SET value
\item (obligatory, once) MODULE name \\
This generates the header.
\item (optional, repeated) EXPORT name,mode \\
On the first call, this identifies the start of the export section. It generates a loader record for the symbol.
\item (optional, repeated) IMPORT name,mode \\
On the first call, this identifies the end of the export section and the start of the import section. It generates a loader record similar to that for the EXPORT directive.
\item (optional, once) CODE \\
This identifies the end of the import section, and the start of the code section. If the code section is present, it must contain the labels RESET and BEGIN. The RESET procedure is called by the loader whenever the module is loaded; it should be used to initialise the data section, but note that external objects may not yet be accessed at that point. The BEGIN procedure is called by the loader (after the RESET one) if the module is being invoked as a main program.
\item (optional, once) DATA \\
This identifies the end of the code section proper and the beginning of the data initialisation section. It must be present if the module uses any static data or references any external objects.
\item (optional, repeated) name VECTOR mode \\
This reserves space in the data section for addresses of external data objects or call sequences for external procedures. These may be placed only in the data section, but can appear anywhere within it; they need not all appear together at the front, for example.
\item (obligatory, once) ENDMODULE \\
This identifies the end of the data section and deals with any sections which may have been absent.
\end{enumerate}

\subsection{Accessing the static data area}

It is expected that the RESET procedure will copy the contents of the (read-only) data initialisation area within the code section (accessed by PC-relative addressing) into the writeable static data area (accessed by A4). The data initialisation area starts at label DATABEG (defined by the DATA macro) and is of size DATASIZE bytes (defined by the ENDMODULE macro).
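
A hand-written RESET procedure along these lines might look like the sketch below. It is only an illustration under stated assumptions: the labels RLOOP and RNEXT are invented for the example, DATASIZE is assumed to fit in 16 bits, A0, A1 and D0 are assumed to be free for use as temporaries, and the COPYDATA macro supplied by the include file may well expand to something different.

\tt \small
\begin{verbatim}
* Sketch only: RLOOP and RNEXT are illustrative labels, and the
* real COPYDATA expansion is not shown in this manual.
RESET   LEA     DATABEG(PC),A0   source: read-only initialisation image
        MOVE.L  A4,A1            destination: writeable static data area
        MOVE.W  #DATASIZE,D0     byte count (assumed to fit in 16 bits)
        BRA.S   RNEXT            enter at the test: zero count copies nothing
RLOOP   MOVE.B  (A0)+,(A1)+      copy one byte
RNEXT   DBRA    D0,RLOOP         loop until the count is exhausted
        RTS
\end{verbatim}
\normalsize \rm

Entering the loop at the DBRA is a common 68000 idiom: it makes the loop body run exactly DATASIZE times, including the case where DATASIZE is zero.
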
It is expected that code referring to anything in the data area should do so via A4. A reference to the label X, say, should read X-DATABEG(A4).

In the code section, the following macros may be used in order to access external objects more conveniently:
\begin{description}
\item[ADDRESS symbol,x]
This puts the address of external data object SYMBOL into X, which will usually be an address register.
\item[CALL symbol]
This calls an external (or dynamic) procedure.
\item[SCALL symbol]
This calls a ``system'' (data-less external) procedure.
\item[COPYDATA]
This may be called by the reset routine; it copies DATASIZE bytes from (DATABEG) to (A4).
\end{description}

The distinction between EXTERNAL and SYSTEM procedures is that the latter are called by shorter and faster entry sequences. SYSTEM procedures are expected not to have any static data of their own. The standard call sequence for EXTERNAL procedures is:
\tt \small
\begin{verbatim}
MOVE.L A4,-(SP)
JSR label-DATABEG(A4) --+
                        +--> MOVE.L #,A4
                             JMP.L       --+
                                           +--> RTS --+
MOVE.L (SP)+,A4 <-------------------------------------+
\end{verbatim}
\normalsize \rm

The call sequence for SYSTEM procedures misses out the code affecting A4, on the understanding that the called module will neither affect nor use A4:
\tt \small
\begin{verbatim}
JSR label-DATABEG(A4) --+
                        +--> JMP.L       --+
                                           +--> RTS --+
(straight back) <-------------------------------------+
\end{verbatim}
\normalsize \rm

Although SYSTEM procedures may not use any module-own data, they can access the process-global data area addressed via A5, which is common to all modules but local to the process in which they are executing.
\end{document}