\documentstyle[a4,12pt]{article}
\begin{document}
\author{Hamish Dewar}
\title{APM IMP compiler -- Version 1}
\maketitle
\parskip .1 in
\setcounter{secnumdepth}{10}
\parindent 0in

\section{Preamble}

IMP Compiler for Motorola 68000 {\hspace{1.3 in}} Hamish Dewar

Example commands:- \\
{\hspace*{0.6 in}} IMP MYPROG \\
{\hspace*{0.6 in}} IMP PARSER-LIST \\
{\hspace*{0.6 in}} IMP PROG2-NOCHECK-NODIAG

Unless -NOEDIT is specified, the compiler works in tandem with the editor.
On detection of a fault, control is passed to the editor with the
file pointer at the fault position. Corrective action may be taken
(or not, as wished); \%c (or keypad ',') then resumes the compilation.

\section{Calling the Compiler}

The Compiler is invoked by the command IMP. There is only one obligatory
parameter and that is the name of the IMP source program to be compiled.
No default extension is applied to the source file-name.

Error reports are directed to the terminal (as well as the listing file).
Line-numbering continues through any included files.

Unless otherwise directed, the Compiler generates an object file with the
same name as the source file and extension MOB (for Motorola object).
The object file is eligible for execution by citing the name as a command
verb.

For example, the system command IMP MYFIRST calls for the source file
"MYFIRST" (no extension) to be compiled as an IMP program, creating the
file "MYFIRST.MOB". On successful completion of the compilation, the
command "MYFIRST ...." becomes available as a command, and causes the
object program to be loaded and entered.

\section{Options}

There are a number of options affecting the way in which the Compiler
processes the source program. Modifications to the default selections
may be requested by including appropriate keywords in the parameter string.
These are introduced by a dash (minus).

The main options are given below. Those not yet implemented are marked
with an asterisk, and those only partially implemented with a query.
In the case of the yes/no options, the option is suppressed by selecting
the keyword prefixed with "NO".

Source listing control:
\small\tt
\begin{verbatim}
   -LFILE=       send listing to specified file (extension LIS added)
   -LFILE=LP:    send listing to network printer
   -LIST         produce source file listing -- if no LFILE specified,
                 then to a file with same name as source and
                 extension LIS added
                 Default: -NOLIST
   -LOG          output statistics indicating number of statements,
                 atoms per statement, identifiers per statement,
                 time taken, etc.
                 Default: -NOLOG
\end{verbatim}\rm \normalsize

Object program control:
\small\tt
\begin{verbatim}
   -OFILE=       produce object file with specified name
                 (extension MOB added)
   -OFILE=:N     do not produce object file
                 Default: produce object file with same name as source
                 and extension MOB
   -FORCE        produce object file even if program faulty
                 Default: -NOFORCE
\end{verbatim}\rm \normalsize

Run-time checks:
\small\tt
\begin{verbatim}
 ? -ASS          include unassigned check on full integers
 ? -STRASS       include unassigned check on strings
 ? -SASS         include unassigned check on shorts
 ? -BASS         include unassigned check on bytes
   -ARR          include array bound checking
 * -LOOP         include %for loop check
 * -OVER         include overflow check
 * -CAP          include capacity checking
\end{verbatim}\rm \normalsize

Default: -ASS-STRASS-NOSASS-NOBASS-ARR-LOOP-OVER-CAP \\
{\hspace*{0.7 in}} -NOCHECK has the effect of suppressing all checks

Run-time diagnostics:
\small\tt
\begin{verbatim}
   -DIAG         include line-number updating for diagnostics
   -TRACE        generate code which allows the program to be
                 executed one line at a time under the control
                 of the Software Front Panel
                 Default: -DIAG-NOTRACE
\end{verbatim}\rm \normalsize

\section{Language facilities}

The language implemented is as defined in the Departmental Report
"The IMP77 Language" (3rd Edition), with the exceptions noted below.

Reals are limited to 32-bits and are implemented by software.
Features of IMP77 which are not valid in IMP80 (EMAS IMP) and conversely,
or which are peculiar to this implementation, are reported as
'non-standard' (warning only). However, this warning is not produced
for the following widely used IMP77 facilities, even though they are not
available in IMP80:

{\hspace*{0.5 in}} \%shortintegers, \%predicates, \%else \%if, \\
{\hspace*{0.5 in}} the operator mod-sign, \\
{\hspace*{0.5 in}} initialisation of dynamic variables to literal values \\
{\hspace*{0.5 in}} (but initialisation to non-literal values is flagged).

\subsection{Not yet implemented}

Omissions in current release -- to be added:

{\hspace*{0.3 in}} . full diagnostics and \%monitor \\
{\hspace*{0.3 in}} . procedures as parameters \\
{\hspace*{0.3 in}} . NEW and DISPOSE \\
{\hspace*{0.3 in}} . \%string(*)\%array\%name variables

\subsection{Variant features}

{\hspace*{0.3 in}} . \%continue is interpreted as a return to the start of a loop so \\
{\hspace*{0.5 in}} that it by-passes any \%until test on the \%repeat

{\hspace*{0.3 in}} . \%array \%name parameters must have bounds specified in the declaration \\
{\hspace*{0.5 in}} in one of the following ways:

{\hspace*{0.8 in}} all literal bounds {\hspace{0.9 in}} eg \%byteintegerarrayname A(0:31) \\
{\hspace*{0.8 in}} one or more indefinite bounds eg \%integerarrayname B,C(1:*)

{\hspace*{0.5 in}} [Note that in the second example, all actual parameters for B,C \\
{\hspace*{0.5 in}} must have the same upper bound. Cf B(1:*),C(1:*)]

{\hspace*{0.5 in}} The Vax IMP form in which the keyword \%array is followed by the \\
{\hspace*{0.5 in}} number of dimensions in parentheses is also accepted:

{\hspace*{0.5 in}} eg \%integerarray(1)\%name A == \%integerarrayname A(*:*) \\
{\hspace*{0.7 in}} \%shortarray(2)\%name B {\hspace{0.3 in}} == \%shortarrayname B(*:*,*:*)

{\hspace*{0.5 in}} However, if this form is used, the number must be included even \\
{\hspace*{0.5 in}} for the one-dimensional case (cf Vax IMP which allows omission \\
{\hspace*{0.5 in}} in this case).

{\hspace*{0.3 in}} .
the low-level pointer-relative facility (see the section on low-level facilities) differs from \\
{\hspace*{0.5 in}} the version of this implemented in the current Vax compiler

\subsection{Omissions and restrictions}

{\hspace*{0.3 in}} . \%name \%function as variant for \%map is not supported

{\hspace*{0.3 in}} . \%on \%event * as a way of trapping all events is not supported

{\hspace*{0.3 in}} . line-breaks following \%and and \%or are not ignored

{\hspace*{0.3 in}} . there is no built-in function TYPEOF

{\hspace*{0.3 in}} . untyped \%name parameters are usable only to supply an address, \\
{\hspace*{0.5 in}} not type or length information; the built-in function SIZEOF \\
{\hspace*{0.5 in}} is not applicable to such parameters

{\hspace*{0.3 in}} . the Compiler is stricter about the ordering of statements \\
{\hspace*{0.5 in}} than the Vax IMP Compiler. The normal ordering in any block \\
{\hspace*{0.5 in}} should be declarations, then event trap if any, then instructions. \\
{\hspace*{0.5 in}} It is acceptable for static declarations, of procedures and \%own \\
{\hspace*{0.5 in}} variables, to be interspersed among instructions, but the Compiler \\
{\hspace*{0.5 in}} queries any dynamic variable declarations which appear after \\
{\hspace*{0.5 in}} instructions, and hard-faults any which appear in a loop or after a \\
{\hspace*{0.5 in}} sequence change.

See also the restrictions noted in the section on externals.

\subsection{Obsolete features -- to be discontinued}

{\hspace*{0.3 in}} . non-decimal constants expressed in IBM style (X'...')

{\hspace*{0.3 in}} . Atlas Autocode style of \%for loop (without \%for)

\section{Error reports}

Error reports are kept short to economise on screen space. For reports
which relate to a particular component of a statement, the culprit is
identified by a marker at the start of the component.
\small\tt
\begin{verbatim}
REPORT              MEANING

Faulty form         statement part indicated is syntactically faulty
Unknown atom        lexical atom indicated is mis-spelt or unknown
Non-starter         atom at the start of the statement is not a
                    possible statement introducer
Unknown name        identifier indicated has not been declared
Duplicate           identifier indicated has already been declared
Mismatch            parameters in body of procedure do not match spec
Not variable        operand in assignment context is not a variable
Not reference       operand in pointer context is not a reference
Wrong type          expression is of wrong type for context
Wrong class         category of identifier is wrong for context
Not literal         expression in literal context is not literal
Inside out          upper bound is less than lower
Endless loop        literal %for loop cannot terminate
Out of range        operand value is out of range
Too few args        too few arguments are given for procedure call
Too many args       too many arguments are given for procedure call
Not in loop         %exit %or %continue is not within a loop construct
Not in routine      %return is not within a routine
Not in fn/map       %result is not within a function or map
Not in pred         %true/%false is not within a predicate
%CYCLE missing      %repeat encountered with no matching %cycle
%REPEAT missing     %end reached with unmatched %cycle
%START missing      %finish or %else encountered with no matching
                    %start
%FINISH missing     %end reached with unmatched %start
Extra %ELSE         %else encountered matching earlier unconditional
                    %else
%BEGIN missing      %end encountered with no matching %begin or
                    procedure header
%END missing        %end %of %program or %file reached with unmatched
                    %begin or procedure header
%RESULT missing     there is apparently a path to the end of a
                    function or map which does not specify a result
Not accessible      instruction apparently cannot be executed
                    (warning only)
Out of order        statement appears in incorrect order in block
                    (warning if benign)
Nonstandard         non-standard language feature (warning only)
void                identifier is used before a value is
                    assigned to it (warning only)
missing             forward label or specced identifier does not
                    appear in block
(?) missing         switch label to which there is an explicit jump
                    has not been specified
extra values for    too many values for %const or %own array
Faulty operand      incorrect operand type for machine instruction
Wrong size          operand is of incorrect size for machine
                    instruction
Too complex!        compiler limitation exceeded
Not in yet!         language feature not supported in current compiler
Out of reach!       %const or literal string, record or array is not
                    within the range of the machine addressing
                    capability
Out of reach!       call to procedure specified is not within the
                    range of the machine addressing capability
Internal error !    compiler fault


Disastrous errors

These reports relate to compiler limits being exceeded.
They all cause compilation to be abandoned.

Program too big
                    total size of the compiled code and constants
                    exceeds the maximum allowed for
Program space exhausted
                    size of the code for currently open blocks
                    exceeds the maximum allowed for
Identifier space exhausted
                    total length of all identifiers currently in
                    scope exceeds the maximum permitted
Too many identifiers
                    number of identifiers currently in scope exceeds
                    the maximum permitted
Too many levels
                    depth of textual nesting of blocks exceeds the
                    maximum permitted
Too many owns
                    space required for %own variable initialisation
                    exceeds the maximum permitted
Too many literals
                    internal storage space for literals is exhausted
Input ended
                    there is no %endofprogram statement
\end{verbatim}\rm \normalsize

\section{Compiler limits}

The following limits apply:

{\hspace*{0.3 in}} . The maximum depth of textual nesting is 7.

{\hspace*{0.3 in}} . The total size of the static data area (for owns/externals) is \\
{\hspace*{0.5 in}} limited to 32k bytes.

{\hspace*{0.3 in}} . The size of the code area (for program and constants) is limited \\
{\hspace*{0.5 in}} to 64k bytes.
In addition, there is no escape from the machine \\
{\hspace*{0.5 in}} imposed reach of +-32k bytes for accesses to constants. \\
{\hspace*{0.5 in}} The compiler does attempt to cater automatically for procedure calls \\
{\hspace*{0.5 in}} which would otherwise run into this limitation, but is not \\
{\hspace*{0.5 in}} guaranteed to succeed. \\
{\hspace*{0.5 in}} The compiler report 'Out of reach!' indicates that one of these \\
{\hspace*{0.5 in}} limits has been breached. If a name is given, it is that of a \\
{\hspace*{0.5 in}} procedure; if no name is given, the problem is access to a constant.

\section{Efficiency}

{\hspace*{0.3 in}} . In the absence of floating-point hardware, all floating-point \\
{\hspace*{0.5 in}} operations are performed by software.

{\hspace*{0.3 in}} . Integer multiply and divide are also performed by software, although \\
{\hspace*{0.5 in}} multiplication by a literal power of 2 or sum of two powers of 2 \\
{\hspace*{0.5 in}} is optimised. This optimisation is also performed for array \\
{\hspace*{0.5 in}} accesses where the element size meets the condition stated (with \\
{\hspace*{0.5 in}} array bound checks off).

{\hspace*{0.3 in}} . Accesses to arrays (including accesses via array names) are more \\
{\hspace*{0.5 in}} efficient if at least the lower bound is a literal (and for \\
{\hspace*{0.5 in}} multi-dimensional arrays if all the inner bounds are literal)

\section{External linkage (May 83)}

The external linkage mechanism provides for the use of program modules
or libraries containing externally accessible procedures or variables.

A program module containing external procedures or variables, and
terminated by \%endoffile, is compiled in the ordinary way, eg IMP MYLIB.

Before the externals it contains can be accessed, the compiled module must
be installed by the INSTALL command.

The INSTALL command takes as parameter the name of a program file
(extension MOB assumed) -- or list of names separated by commas.
For example INSTALL MYLIB.

The effect of installing a program file is to add all the external names it
contains to the external symbol table; the code of the module is not
loaded at this stage. Thereafter (until logoff) these names are available
for external linking, which is done automatically when a program
referencing any of the names is loaded.

It is only necessary to re-install a program file if a change is made to
the external names it contains.

An external name is the identifier declared as external (standardised to
space-free lower-case form) unless this is over-ridden by an alias, in
which case the alias name is used instead (exactly as presented).
Note that an alias over-rides the declared identifier, rather than
providing an alternative to it.

The system command HELP LIB gives information about generally available
libraries.

\subsection{Restrictions on externals}

The following restrictions apply:-

1. Names are truncated to 11 characters for purposes of matching as externals \\
{\hspace*{0.3 in}} (temporary restriction).

2. The type checking is nominal (temporary amnesty).

3. External \%name variables are not supported (temporary restriction).

4. External specs must all be at the top program level, that is, they may not \\
{\hspace*{0.3 in}} appear within procedures or inner blocks.

5. An external spec for a procedure may be satisfied by a procedure occurring \\
{\hspace*{0.3 in}} later in the same module, but the same is not true for variables.

6. Externals may not be combined with a main program (\%begin ... \\
{\hspace*{0.3 in}} \%endofprogram).

CAUTION: The form of external linkage in V2.0 onwards will differ from this
interim form, necessitating re-compilation.

\section{Permanent procedures}

The permanent procedures listed below are available without specification.
These are generally as provided on Vax, with exceptions and additions noted.
Terminal output is direct, so that there is no need for any TTPUT or
FLUSH OUTPUT procedures.
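
As an illustration of "available without specification", the following
minimal sketch calls a few permanent procedures with no prior \%spec.
It assumes only the signatures of PRINTSTRING, DATE, TIME, WRITE and NEWLINE
given in the Standard PERM listing and ordinary IMP77 syntax; it has not
been checked against this release of the Compiler.
\small\tt
\begin{verbatim}
%begin
   %integer i
   ! DATE and TIME are string functions, so they may be concatenated
   printstring("Run on ".date." at ".time)
   newline
   ! print the squares of 1 to 5, each in a field of 4 places
   %for i = 1, 1, 5 %cycle
      write(i*i, 4)
   %repeat
   newline
%endofprogram
\end{verbatim}\rm \normalsize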
\subsection{Standard PERM}

\small\tt
\begin{verbatim}
%integermap INTEGER(%integer a)
%realmap REAL(%integer a)
%string(*)%map STRING(%integer a)
%record(*)%map RECORD(%integer a)
%bytemap BYTEINTEGER(%integer a)
%shortmap SHORTINTEGER(%integer a)
%bytemap LENGTH(%string(*)%name s)
%bytemap CHARNO(%string(*)%name s, %integer n)
%integerfn ADDR(%name n)
%string(1)%fn TOSTRING(%integer k)
%string(255)%fn SUBSTRING(%string(255) s, %integer from,to)
%integerfn REM(%integer a,b)
%integerfn INTPT(%real x)
%integerfn INT(%real x)
%realfn FRACPT(%real x)
%realfn SQRT(%real x)
%string(8)%fn DATE
%string(5)%fn TIME
%integerfn CPUTIME
      !in milliseconds (actually elapsed time at present)
%record(*) NIL
%recordformat EVENTFM(%integer event,sub,extra,pc,
                      %string(255) message,%integerarray r(0:15))
      !PC and R are APM-specific extensions
%record(eventfm) EVENT
%constinteger NL
%integerfn NEXTSYMBOL
%routine READSYMBOL(%name n)
%routine PRINTSYMBOL(%integer k)
%routine SKIPSYMBOL
%routine PRINTSTRING(%string(255) s)
%routine READ(%name n)
%routine WRITE(%integer m, n)
%routine PRINT(%real x, %integer n,m)
%routine PRINTFL(%real x, %integer n)
%routine NEWLINE
%routine NEWLINES(%integer i)
%routine SPACE
%routine SPACES(%integer i)
%routine SELECTINPUT(%integer n)
%routine SELECTOUTPUT(%integer n)
%routine RESET INPUT
%routine RESET OUTPUT
%routine CLOSE INPUT
%routine CLOSE OUTPUT
%integerfn INSTREAM
%integerfn OUTSTREAM
%routine OPEN INPUT(%integer n, %string(31) S)
      !*selects N at present*
%routine OPEN OUTPUT(%integer n, %string(31) S)
      !*selects N at present*
%routine PROMPT(%string(31) S)
%integerfn TESTSYMBOL
      !*APM specific*
      !value is -1 if no symbol available from terminal
      !otherwise value is the symbol, which is skipped
%constinteger NOECHO,NOTERMECHO,SINGLE,NOPAGE
%routine SET TERMINAL MODE(%integer m)
      !*APM specific*
      !set terminal mode as specified by M (sum of flags)
\end{verbatim}\rm \normalsize

\section{Low-level facilities}

As a system implementation language IMP provides
a number of facilities which permit access to parts of the system other
languages do not reach. The store-mapping functions (INTEGER, BYTEINTEGER,
etc) allow for direct addressing using machine addresses; the
name-relative indexing facility (as in CUR[1] or POS[-2]) extends the
capability of \%name variables; the fixed-address declaration facility
(as in @16\_200 \%integer TIMER, KBIN, KBOUT) allows efficient source-level
access to specific areas of store.

The use of these facilities involves some loss of the protection normally
conferred by a high-level language and a potential (particularly on a
processor without hardware protection) for uncontrolled malfunction.
They should, therefore, be used with restraint and care.

\subsection{Assembler in IMP}

Sequences of assembly language instructions may be incorporated in an IMP
program at any point. At present no enabling directive is required.

Apart from the usual hazards of machine level programming, there is the
danger of corrupting the environment pre-supposed by the high-level
facilities. For this reason, it is sensible to keep the use of these
instructions to a minimum and to take particular care in programming
these sections.

Assembly language instructions are distinguished from IMP statements by
being prefaced by an asterisk (eg *MOVE D0,D1). They appear within IMP
blocks and procedures and are executed according to the normal flow of
control, subject to any control transfers invoked by the instructions
themselves.

IMP conventions for statement termination, labelling and commenting apply,
but otherwise the form of instruction follows closely that defined under
the heading of Assembler Syntax for each individual instruction in the
Motorola M68000 manual.

Machine-level operands are specified using the mnemonics D0-D7 and A0-A7
(or SP), and the standard syntax for $<$ea$>$ modes. Immediate and Address
register variants of op-codes are selected automatically.
By comparison with full Assembler, most of the directives do not apply;
assembly-time expression evaluation is restricted; size specification in
cases where the size is implicit in the op-code is not in general allowed;
the indexed PC-relative mode is not supported, nor is the 'current
location counter'.

The error report 'Faulty operand' indicates use of an invalid operand for
the context, but the compiler does not fully check the validity of
$<$ea$>$ modes for the particular instruction.

NB the default 'size' applied is Long (32-bits) rather than Word (16-bits)
which is the manufacturer's default. It is sensible to make a practice of
including the size suffix explicitly for all relevant instructions.

The special mnemonics USP (User Stack Pointer), SR (Status Register) and
CCR (Condition Code Register) are NOT supported but the relevant effects
can be achieved by use of the additional op-codes:
\small\tt
\begin{verbatim}
   MTCCR        for   MOVE ,CCR
   MTSR         for   MOVE ,SR
   MFSR         for   MOVE SR,
   MTUSP An     for   MOVE An,USP
   MFUSP An     for   MOVE USP,An
   ATCCR #      for   AND #,CCR
   ATSR #       for   AND #,SR
   ETCCR #      for   EOR #,CCR
   ETSR #       for   EOR #,SR
   OTCCR #      for   OR #,CCR
   OTSR #       for   OR #,SR
\end{verbatim}\rm \normalsize

IMP identifiers may also be used as operands for assembly language
instructions. The detailed implications of this over the whole range of
data types are fairly complex and subject to change. The following cases
are reasonably straightforward and stable.

Scalar variables may be used as $<$ea$>$ operands provided that they are:

{\hspace*{0.3 in}} declared as own or \\
{\hspace*{0.3 in}} declared at the outermost level or \\
{\hspace*{0.3 in}} declared at the current level or \\
{\hspace*{0.3 in}} declared by means of a fixed-address declaration

Variables declared at intermediate levels should not be accessed.

In the case of a name variable the reference is to the pointer value
(32-bit address) rather than the referenced object.
At present it is not possible to access record fields in assembly language
instructions.

Labels and procedure names may be referenced in the Branch group of
instructions, including BSR. The only alternative form of operand for
these instructions is immediate (eg *BLT \#-4), the value specified being
the machine-level displacement. Short and long branches are handled
automatically (though the Compiler's CODE listing does not show this).

To access a forward label in an assembly language instruction, it is
necessary for the label to have been declared by means of the IMP
declaration \%LABEL $<$lab$>$.

The registers D0-D3 and A0-A3 may be freely used within assembly sections,
but no assumptions can be made about their contents after execution of
most IMP statements. Other registers may be used only on the basis that
their values are restored before reverting to IMP. In particular this
applies to SP. In addition, NOTE WELL that the accessing of local
variables depends on SP; if SP is changed, access to such variables in
machine instructions becomes a nonsense (though no error report can be
made). Similarly, any modification of A4-A6 rules out access to non-local
variables.

Note that any declaration of IMP identifiers which coincide with the
register mnemonics takes precedence.

\section{Events}

Events may be signalled by the system or by the user (using \%SIGNAL) and
trapped using the \%ON \%EVENT.. construction.

\%on \%event $<$eventno$>$, $<$eventno$>$, ..... {\hspace{0.3 in}} \%start \\
{\hspace*{0.2 in}} ... \%finish

"\%event" {\hspace{0.2 in}} is optional in the statement above.

Once the event has been trapped, further information may be obtained by
interrogating the built-in record EVENT..

\%recordformat eventfm(\%byte event,sub,\%short line,\%integer extra, \\
{\hspace*{1.5 in}} \%string(255)message,\%integerarray r(0:15))

IMP and PASCAL share a common set of Event numbers. {\hspace{0.2 in}}
HELP EVENTS gives a table of these.
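
The construction above might be used as in the following minimal sketch,
which traps a single event and reports the MESSAGE field of EVENT.
The event number 9 (taken here to mean "input ended", as in standard IMP77)
is an assumption and should be checked against the table given by
HELP EVENTS; the statement ordering (declarations, then event trap, then
instructions) follows the rule given under Omissions and restrictions.
\small\tt
\begin{verbatim}
%begin
   %integer n
   ! 9 is an assumed event number -- consult HELP EVENTS
   %on %event 9 %start
      printstring("Trapped: ".event_message)
      newline
      %stop
   %finish
   ! normal flow: the trap body above is skipped on entry
   read(n)
   write(n, 1)
   newline
%endofprogram
\end{verbatim}\rm \normalsize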
\section{Known Defects}

\small\tt
\begin{verbatim}
 . Statements like
      "%if COND %then %result = EXP1 %else %result = EXP2"
      "%if COND %then %return %else ...
      "%if COND %then %true/%false %else ...
      "%if COND %then ->LAB %else ...
   are liable to be mis-compiled.
   Fix: change "%else" to ";".

 . run-time checks are not imposed on capacity, overflow for
   addition and subtraction, invalid %for loops or stack over-run

 . The %include construction cannot currently be nested
   i.e. %included files should not contain %includes.
   This restriction has been lifted in versions of IMP
   for the new operating system.
\end{verbatim}\rm \normalsize

This software is frozen. It may be possible to fix bugs in the run-time
libraries but any faults found in the compiler will have to be programmed
around.

\vspace{.75in}
view:impv1 printed on 01/03/89 at 16.15
\newpage
\tableofcontents
\end{document}