External Program Linkage

A design objective of this scheme is that the code of programs should be protected from accidental or deliberate overwriting and that code should be shared between users whenever possible, with consequent savings of main memory and disc space.

Program File Format and Virtual Memory Layout

The standard program file format is shown above. Program files are shareable and, while being executed, are normally connected in virtual memory in read or read-shared mode. The first four words constitute a 'file header'. The header gives the length of the file and the starting positions of the code area, the GLAP (General Linkage Area Pattern) and the linkage data area, each relative to the start of the file.
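As an illustration only, the four-word header can be modelled as below. The sketch assumes 32-bit big-endian words, byte offsets, fields in the order the text lists them, and areas laid out as code, GLAP, linkage data; none of these details is confirmed by the text, and the names parse_header and build_image are hypothetical.

```python
import struct

# Assumed layout: four 32-bit big-endian words (length, code, GLAP, linkage data).
HEADER = struct.Struct(">4I")

def parse_header(image: bytes) -> dict:
    """Return the four header fields of a toy program file."""
    length, code, glap, linkage = HEADER.unpack_from(image, 0)
    if length != len(image):
        raise ValueError("header length does not match file size")
    # Assumed ordering of the areas within the file.
    if not (HEADER.size <= code <= glap <= linkage <= length):
        raise ValueError("areas out of order or outside the file")
    return {"length": length, "code": code, "glap": glap, "linkage": linkage}

def build_image(code: bytes, glap: bytes, linkage: bytes) -> bytes:
    """Assemble a toy program file: header, code, GLAP, linkage data."""
    code_off = HEADER.size
    glap_off = code_off + len(code)
    link_off = glap_off + len(glap)
    total = link_off + len(linkage)
    return HEADER.pack(total, code_off, glap_off, link_off) + code + glap + linkage
```

Because every field is relative to the start of the file, the same header remains valid wherever the file is connected in virtual memory.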

The code area must be invariant so that the code can be run in read or read-shared mode.

The GLAP contains the information needed for linking this program file to any file it may require, including other program files. When a program is loaded, the program file itself is left unaltered and the necessary changes to establish external linkage are made to a copy of the GLAP, referred to as the GLA (General Linkage Area). This approach was suggested by Arden, Galler, O'Brien and Westervelt (1966). Any initialised data to be used by the program may be set up in the GLAP.
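The copy-then-patch idea can be sketched as follows. This is a toy model, not the EMAS loader: the function make_gla, the offsets and the patched value are all hypothetical, and the GLAP length is passed in explicitly because the text says the header records only starting positions.

```python
def make_gla(program_file: bytes, glap_start: int, glap_length: int) -> bytearray:
    """Return a private, writable GLA initialised from the file's GLAP."""
    return bytearray(program_file[glap_start:glap_start + glap_length])

program_file = bytes(range(32))        # immutable stand-in for a shared program file
gla_user_a = make_gla(program_file, 16, 12)
gla_user_b = make_gla(program_file, 16, 12)

# The loader patches user A's copy only; the file and user B's GLA are untouched.
gla_user_a[0:4] = (0x1000).to_bytes(4, "big")
```

Each user who loads the program gets a separate GLA, so linkage established for one user never disturbs the shared file or another user's copy.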

A particular program file may contain several separately compiled routines, each of which will have a block of code and a block of GLAP within the appropriate areas of the file. Information required by the Program Loader is provided in linked lists, the heads of which are stored at the front of the linkage data area.

Linkage Data Lists

List 1 is a list of entry points at which the program file may be entered. These may be a main program or externally accessible routines. The list itself is contained in the linkage data area. Each item in the list has the format shown in the diagram above.
List 2 is a list of external references to be satisfied by the Loader before the program file is entered. The list items are set up in the GLAP by the compiler which creates the file, with a similar format to that for entry points, except that it sets the three addresses to zero. The loader replaces the three zero-filled words with the virtual memory addresses of the start of the code block, the start of the GLA block and the entry point address respectively of the program file satisfying the reference.
List 3 is a list of external references to be satisfied dynamically (i.e. at the time that an external routine is called). The compilers set up items in this list in the same way as for list 2, but the loader action is different. The three zero words are filled with the addresses of the list item itself, an environment descriptor for the loader (i.e. a register set and the program counter), and the entry point address of a dynamic load sequence. If the program uses this information to attempt to enter the external routine it will enter the dynamic load sequence.
A fourth list of data entry points defines initialised data areas in the GLAP which may be referenced by other routines (e.g.
A fifth list defines external data references to be satisfied by the loader before the program file is entered.
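The difference between the loader's treatment of list 2 and list 3 items can be sketched as below. The model is deliberately simplified and rests on assumptions: the GLA is a list of 4-byte words, a reference item is reduced to its three address words, and all names and addresses (EntryPoint, satisfy_static, prepare_dynamic) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class EntryPoint:
    """Addresses published by a list 1 item of an already loaded file."""
    code_base: int
    gla_base: int
    entry: int

def satisfy_static(gla: list, item: int, target: EntryPoint) -> None:
    """List 2: replace the three zero words with the target's addresses."""
    gla[item:item + 3] = [target.code_base, target.gla_base, target.entry]

def prepare_dynamic(gla: list, item: int, gla_base: int,
                    loader_env: int, load_sequence: int) -> None:
    """List 3: make the three words lead into the dynamic load sequence."""
    item_address = gla_base + 4 * item   # address of the list item itself
    gla[item:item + 3] = [item_address, loader_env, load_sequence]
```

In both cases the compiler has left three zero words; only the values written by the loader differ, which is why the calling code does not need to know which kind of reference it is using.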

When the loading process is complete the code and GLA areas of a program are in different segments of the virtual memory. The code is in one or more segments which are connected in read mode, while the GLA is in a write-unshared file. The same physical copy of the code may be in use by several other users who have the program loaded at the same time, though not necessarily at the same virtual memory address. It can be seen that the shareable part of a program file must not contain any 'absolute' virtual memory addresses. With the exception of the external references satisfied by the loader all virtual memory addresses have to be established by the program at run-time.

External Program Calling Conventions

Before entry to an external routine, the three addresses, code block, GLA block and entry point, inserted into the external reference item for the called routine by the loader, are loaded into three (consecutive) registers. One of the general purpose registers contains the address of the start of the free space on a common stack which all programs are constrained to use for the entry sequence to external routines. The calling routine may copy its registers on to the stack in the 16 words immediately ahead of the stack free space pointer. These are restored on return by the called routine. Any parameters to the call are set up beyond the register save area according to standard conventions. Control is then transferred to the entry point address with the return address in another register. In practice this sequence usually requires only three hardware instructions.
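The layout of the save area can be worked out with a little arithmetic. The sketch below assumes 4-byte words and that register n is saved in word n of the area, which agrees with the offsets 16(11) and 60(11) used in Appendix 1; the function names are hypothetical.

```python
WORD = 4                # bytes per word (assumed)
SAVE_AREA_WORDS = 16    # one slot per register, as stated in the text

def save_slot(free_ptr: int, reg: int) -> int:
    """Address of the save-area slot for register `reg`."""
    if not 0 <= reg < SAVE_AREA_WORDS:
        raise ValueError("no such register")
    return free_ptr + WORD * reg

def first_parameter(free_ptr: int) -> int:
    """Address of the first parameter, just beyond the save area."""
    return free_ptr + WORD * SAVE_AREA_WORDS
```

With the stack free space pointer at 0x8000, register 4 is saved at 0x8010, register 15 at 0x803C, and parameters start at 0x8040.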

Dynamic Loading

In the case of a routine which is to be loaded dynamically this entry sequence, together with the loader action defined above, results in entry to the dynamic load sequence. This sequence finds the program file containing the routine (if it exists), loads it, and overwrites the relevant three words in the GLA of the calling routine with the addresses of the code block, the GLA block and the entry point of the called routine. Control is transferred to the entry point. The overwriting of the three words enables the program to use the same code sequence to jump directly to the external routine on any subsequent call.
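The self-patching behaviour can be imitated in a few lines. This is only a sketch of the idea, with Python callables standing in for entry point addresses and a dictionary standing in for the program files; LIBRARY, sqrt_routine, make_dynamic_ref and call_external are all hypothetical names.

```python
calls = []   # trace of what ran, kept for illustration

def sqrt_routine(code_base, gla_base, arg):
    """Hypothetical external routine."""
    calls.append("routine")
    return arg ** 0.5

# Stand-in for program files: name -> (code block, GLA block, entry point).
LIBRARY = {"SQRT": (0x10000, 0x20000, sqrt_routine)}

def make_dynamic_ref(name):
    """Build a three-word reference that initially targets the load sequence."""
    ref = [None, None, None]

    def load_sequence(item, env, arg):
        calls.append("load")
        code_base, gla_base, entry = LIBRARY[name]   # find and "load" the file
        item[:] = [code_base, gla_base, entry]       # overwrite the three words
        return entry(code_base, gla_base, arg)       # then enter the routine

    ref[:] = [ref, "loader-env", load_sequence]
    return ref

def call_external(ref, arg):
    """The caller's fixed sequence: pick up three words, enter via the third."""
    first, second, entry = ref
    return entry(first, second, arg)
```

Calling call_external twice on the same reference runs the load sequence only on the first call; the second call goes straight to the routine through the overwritten words.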

The actual instruction sequences used in a real implementation of this method are given on the diagram (above) and in Appendix 1 (below).

Reference

G.E. Millard, D.J. Rees and H. Whitfield (1975). The Standard EMAS Subsystem, The Computer Journal, Vol. 18, No. 3, pp. 213-219.

Appendix 1

The use of registers in the EMAS standard subsystem external routine entry sequence was as follows:

R0-R3   Scratch registers - not saved
R4-R14  Environment registers - saved and restored
R15     Return address

The following had special uses:

R11     Stack free space pointer
R12     Code base register
R13     GLA base register
R14     Entry Point Address register

The following were typical call and return sequences:

CALLING ROUTINE
                    Store parameters (if any) beyond the save area

STM 4,14,16(11)     Store R4-R14 in the save area on the stack

LM 12,14,EPREF(13)  Load code, GLA and Entry Point addresses of the called
                    routine from an entry point reference in the GLA of the
                    calling program.
                    
BALR 15,14          Enter the called routine leaving the return address in R15.

CALLED ROUTINE

ST 15,60(11)        It was usual to save the return address on the stack

LR base,11          Load the local base register

LM 4,15,16(base)    Restore R4-R14 and pick up the return address

BR 15               Return to calling routine

This ensured that R13 had been reset to its entry value if the called routine had changed it (which it generally had).

The called routine could set its own code base by using a BALR 12,0 instruction and subtracting some fixed offset. However, the reset sequence was more general and more efficient. The registers could instead have been restored from the save area on return to the calling routine, with an LM 4,14,16(11) instruction, but the present scheme was more efficient: the called routine usually stored the return address in memory to make R15 available for other uses, and the present scheme reloaded R15 at negligible cost. Because we had chosen not to save R0-R3 and R15, the corresponding positions in the save area could be used for other purposes (e.g. a dynamic or static environment chain in an ALGOL-type language). This was by no means necessary, however, and a more general scheme could have used the whole save area.

harry.whitfield@ncl.ac.uk               20 March 1998 ...