\documentstyle[a4,12pt]{article}
\begin{document}
\author{Hamish Dewar}
\title{APM IMP compiler -- Version 1}
\maketitle
\parskip .1 in
\setcounter{secnumdepth}{10}
\parindent 0in
\section{Preamble}
IMP Compiler for Motorola 68000 {\hspace{1.3 in}} Hamish Dewar

 Example commands:-
\\ {\hspace*{0.6 in}} IMP MYPROG
\\ {\hspace*{0.6 in}} IMP PARSER-LIST
\\ {\hspace*{0.6 in}} IMP PROG2-NOCHECK-NODIAG

Unless -NOEDIT is specified, the compiler works in tandem with the editor.
On detection of a fault, control is passed to the editor with the file
pointer at the fault position. Corrective action may be taken or not, as
wished; \%c (or keypad ',') is then used to resume compilation.


\section{Calling the Compiler}
The Compiler is invoked by the command IMP.
There is only one obligatory parameter and that is the name of the IMP source
program to be compiled. No default extension is applied to the source
file-name.

Error reports are directed to the terminal (as well as the listing file).
Line-numbering continues through any included files.

Unless otherwise directed, the Compiler generates an object file with the same
name as the source file and extension MOB (for Motorola object). The object
file is eligible for execution by citing the name as a command verb.

For example, the system command IMP MYFIRST calls for the source file "MYFIRST"
(no extension) to be compiled as an IMP program, creating the file
"MYFIRST.MOB". On successful completion of the compilation,
"MYFIRST ...." becomes available as a command, which causes the object
program to be loaded and entered.
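
As a minimal sketch of what such a source file might contain (the program text
is purely illustrative and is not taken from any distributed example), the file
MYFIRST could read:

\small\tt \begin{verbatim}
%begin
   %integer i, sum
   sum = 0
   %for i = 1, 1, 10 %cycle
      sum = sum + i
   %repeat
   printstring("Sum of 1 to 10 is")
   write(sum, 1)
   newline
%endofprogram
\end{verbatim}\rm  \normalsize
Only permanent procedures listed later in this document are used, so no
specifications are needed.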

\section{Options}
There are a number of options affecting the way in which the Compiler processes
the source program. Modifications to the default selections may be requested by
including appropriate keywords in the parameter string. These are introduced by
a dash (minus).

The main options are given below. Those not yet implemented are marked with an
asterisk, and those only partially implemented with a query. In the case of the
yes/no options, the option is suppressed by selecting the keyword prefixed with
"NO".

Source listing control:

\small\tt \begin{verbatim}   -LFILE=<filename>   send listing to specified file
                       (extension LIS added)
   -LFILE=LP:          send listing to network printer
   -LIST               produce source file listing
                       -- if no LFILE specified, then to
                       a file with same name as source
                       and extension LIS added
   Default: -NOLIST

   -LOG                output statistics indicating number of statements,
                       atoms per statement, identifiers per statement,
                       time taken, etc.
   Default: -NOLOG
\end{verbatim}\rm  \normalsize 
Object program control:

\small\tt \begin{verbatim}   -OFILE=<filename>   produce object file with specified name
                       (extension MOB added)
   -OFILE=:N           do not produce object file

 Default: produce object file with same name as source and extension MOB

   -FORCE              produce object file even if program faulty

 Default: -NOFORCE
\end{verbatim}\rm  \normalsize 
 Run-time checks:
\small\tt \begin{verbatim}  ?   -ASS       include unassigned check on full integers
  ?   -STRASS    include unassigned check on strings
  ?   -SASS      include unassigned check on shorts
  ?   -BASS      include unassigned check on bytes

      -ARR       include array bound checking
  *   -LOOP      include %for loop check
  *   -OVER      include overflow check
  *   -CAP       include capacity checking
\end{verbatim}\rm  \normalsize 
 Default: -ASS-STRASS-NOSASS-NOBASS-ARR-LOOP-OVER-CAP
\\ {\hspace*{0.7 in}} -NOCHECK has the effect of suppressing all checks


 Run-time diagnostics:

\small\tt \begin{verbatim}      -DIAG         include line-number updating for diagnostics
      -TRACE        generate code which allows the program to
                    be executed one line at a time under the
                    control of the Software Front Panel
  Default: -DIAG-NOTRACE
\end{verbatim}\rm  \normalsize 

\section{Language facilities}
The language implemented is as defined in the Departmental Report "The IMP77
Language" (3rd Edition), with the exceptions noted below. Reals are limited to
32 bits and are implemented by software.

Features of IMP77 which are not valid in IMP80 (EMAS IMP) and conversely, or
which are peculiar to this implementation, are reported as 'non-standard'
(warning only). However, this warning is not produced for the following widely
used IMP77 facilities, even though they are not available in IMP80:

{\hspace*{0.5 in}} \%shortintegers, \%predicates, \%else \%if,
\\ {\hspace*{0.5 in}} the operator mod-sign,
\\ {\hspace*{0.5 in}} initialisation of dynamic variables to literal values
\\ {\hspace*{0.5 in}} (but initialisation to non-literal values is flagged).

\subsection{Not yet implemented}
Omissions in current release -- to be added:

 . full diagnostics and \%monitor

 . procedures as parameters

 . NEW and DISPOSE

 . \%string(*)\%array\%name variables


\subsection{Variant features}
{\hspace*{0.3 in}} . \%continue is interpreted as a return to the start of a loop so
\\ {\hspace*{0.5 in}} that it by-passes any \%until test on the \%repeat

{\hspace*{0.3 in}} . \%array \%name parameters must have bounds specified in the declaration
\\ {\hspace*{0.5 in}} in one of the following ways:

{\hspace*{0.8 in}} all literal bounds {\hspace{0.9 in}} eg \%byteintegerarrayname A(0:31)
\\ {\hspace*{0.8 in}} one or more indefinite bounds eg \%integerarrayname B,C(1:*)

{\hspace*{0.5 in}} [Note that in the second example, all actual parameters for B,C
\\ {\hspace*{0.5 in}} must have the same upper bound. Cf B(1:*),C(1:*)]

{\hspace*{0.5 in}} The Vax IMP form in which the keyword \%array is followed by the
\\ {\hspace*{0.5 in}} number of dimensions in parentheses is also accepted:

{\hspace*{0.5 in}} eg \%integerarray(1)\%name A == \%integerarrayname A(*:*)
\\ {\hspace*{0.7 in}} \%shortarray(2)\%name B {\hspace{0.3 in}} == \%shortarrayname B(*:*,*:*)

{\hspace*{0.5 in}} However, if this form is used, the number must be included even
\\ {\hspace*{0.5 in}} for the one-dimensional case (cf Vax IMP which allows omission
\\ {\hspace*{0.5 in}} in this case).

{\hspace*{0.3 in}} . the low-level pointer-relative facility (see the section on low-level facilities) differs from
\\ {\hspace*{0.5 in}} the version of this implemented in the current Vax compiler


\subsection{Omissions and restrictions}
{\hspace*{0.3 in}} . \%name \%function as variant for \%map is not supported

{\hspace*{0.3 in}} . \%on \%event * as a way of trapping all events is not supported

{\hspace*{0.3 in}} . line-breaks following \%and and \%or are not ignored

{\hspace*{0.3 in}} . there is no built-in function TYPEOF

{\hspace*{0.3 in}} . untyped \%name parameters are usable only to supply an address,
\\ {\hspace*{0.5 in}} not type or length information; the built-in function SIZEOF
\\ {\hspace*{0.5 in}} is not applicable to such parameters

{\hspace*{0.3 in}} . the Compiler is stricter about the ordering of statements
\\ {\hspace*{0.5 in}} than the Vax IMP Compiler. The normal ordering in any block
\\ {\hspace*{0.5 in}} should be declarations, then event trap if any, then instructions.
\\ {\hspace*{0.5 in}} It is acceptable for static declarations, of procedures and \%own
\\ {\hspace*{0.5 in}} variables, to be interspersed among instructions, but the Compiler
\\ {\hspace*{0.5 in}} queries any dynamic variable declarations which appear after
\\ {\hspace*{0.5 in}} instructions, and hard-faults any which appear in a loop or after a
\\ {\hspace*{0.5 in}} sequence change.

See also the restrictions noted in the section on externals.


\subsection{Obsolete features -- to be discontinued}

{\hspace*{0.3 in}} . non-decimal constants expressed in IBM style (X'...')

{\hspace*{0.3 in}} . Atlas Autocode style of \%for loop (without \%for)


\section{Error reports}

Error reports are kept short to economise on screen space. For reports which
relate to a particular component of a statement, the culprit is identified by a
marker at the start of the component.

\small\tt \begin{verbatim} REPORT                     MEANING

Faulty form      statement part indicated is syntactically faulty
Unknown atom     lexical atom indicated is mis-spelt or unknown
Non-starter      atom at the start of the statement is not a
                 possible statement introducer
Unknown name     identifier indicated has not been declared
Duplicate        identifier indicated has already been declared
Mismatch         parameters in body of procedure do not match spec
Not variable     operand in assignment context is not a variable
Not reference    operand in pointer context is not a reference
Wrong type       expression is of wrong type for context
Wrong class      category of identifier is wrong for context
Not literal      expression in literal context is not literal
Inside out       upper bound is less than lower
Endless loop     literal %for loop cannot terminate
Out of range     operand value is out of range
Too few args     too few arguments are given for procedure call
Too many args    too many arguments are given for procedure call
Not in loop      %exit or %continue is not within a loop construct
Not in routine   %return is not within a routine
Not in fn/map    %result is not within a function or map
Not in pred      %true/%false is not within a predicate
%CYCLE missing   %repeat encountered with no matching %cycle
%REPEAT missing  %end reached with unmatched %cycle
%START missing   %finish or %else encountered with no matching %start
%FINISH missing  %end reached with unmatched %start
Extra %ELSE      %else encountered matching earlier unconditional %else
%BEGIN missing   %end encountered with no matching %begin or
                 procedure header
%END missing     %end %of %program or %file reached with unmatched
                 %begin or procedure header
%RESULT missing  there is apparently a path to the end of a function
                 or map which does not specify a result
Not accessible   instruction apparently cannot be executed (warning only)
Out of order     statement appears in incorrect order in block
                 (warning if benign)
Nonstandard      non-standard language feature (warning only)
<ident> void     identifier is used before a value is assigned to it
                 (warning only)
<ident> missing  forward label or specced identifier does not appear
                 in block
<ident>(?) missing  switch label to which there is an explicit jump
                 has not been specified
<n> extra values for <ident>
                 too many values for %const or %own array
Faulty operand   incorrect operand type for machine instruction
Wrong size       operand is of incorrect size for machine instruction
Too complex!     compiler limitation exceeded
Not in yet!      language feature not supported in current compiler
Out of reach!    %const or literal string, record or array is not
                 within the range of the machine addressing capability
<ident> Out of reach!
                 call to procedure specified is not within the
                 range of the machine addressing capability
Internal error <n>! compiler fault

Disastrous errors

These reports relate to compiler limits being exceeded.  They all cause
compilation to be abandoned.

Program too big             total size of the compiled code and constants
                            exceeds the maximum allowed for
Program space exhausted     size of the code for currently open blocks
                            exceeds the maximum allowed for
Identifier space exhausted  total length of all identifiers currently
                            in scope exceeds the maximum permitted
Too many identifiers        number of identifiers currently in scope
                            exceeds the maximum permitted
Too many levels             depth of textual nesting of blocks exceeds
                            the maximum permitted
Too many owns               space required for %own variable
                            initialisation exceeds the maximum permitted
Too many literals           internal storage space for literals is exhausted
Input ended                 there is no %endofprogram statement
\end{verbatim}\rm  \normalsize 
\section{Compiler limits}
The following limits apply:

{\hspace*{0.3 in}} . The maximum depth of textual nesting is 7.

{\hspace*{0.3 in}} . The total size of the static data area (for owns/externals) is
\\ {\hspace*{0.5 in}} limited to 32k bytes.

{\hspace*{0.3 in}} . The size of the code area (for program and constants) is limited
\\ {\hspace*{0.5 in}} to 64k bytes. In addition, there is no escape from the machine
\\ {\hspace*{0.5 in}} imposed reach of $\pm$32k bytes for accesses to constants.
\\ {\hspace*{0.5 in}} The compiler does attempt to cater automatically for procedure calls
\\ {\hspace*{0.5 in}} which would otherwise run into this limitation, but is not
\\ {\hspace*{0.5 in}} guaranteed to succeed.
\\ {\hspace*{0.5 in}} The compiler report 'Out of reach!' indicates that one of these
\\ {\hspace*{0.5 in}} limits has been breached. If a name is given, it is that of a
\\ {\hspace*{0.5 in}} procedure; if no name is given, the problem is access to a constant.

\section{Efficiency}

{\hspace*{0.3 in}} . In the absence of floating-point hardware, all floating-point
\\ {\hspace*{0.5 in}} operations are performed by software.

{\hspace*{0.3 in}} . Integer multiply and divide are also performed by software, although
\\ {\hspace*{0.5 in}} multiplication by a literal power of 2 or sum of two powers of 2
\\ {\hspace*{0.5 in}} is optimised. This optimisation is also performed for array
\\ {\hspace*{0.5 in}} accesses where the element size meets the condition stated (with
\\ {\hspace*{0.5 in}} array bound checks off).

{\hspace*{0.3 in}} . Accesses to arrays (including accesses via array names) are more
\\ {\hspace*{0.5 in}} efficient if at least the lower bound is a literal (and for
\\ {\hspace*{0.5 in}} multi-dimensional arrays if all the inner bounds are literal)
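
The points above on multiplication and array access might be sketched as
follows. The names are hypothetical, and the comment on the third
multiplication is an inference from the rule stated above rather than a
documented fact.

\small\tt \begin{verbatim}
%begin
   %integerarray a(0:99)
   %integer i, n
   n = 7
   ! 8 is a power of 2 and 10 = 8 + 2 is a sum of two powers of 2
   ! so both of these multiplications qualify for optimisation
   i = n * 8
   i = n * 10
   ! 7 = 4 + 2 + 1 needs three powers of 2
   ! so this is presumably left to the software multiply
   i = n * 7
   ! the literal lower bound (0) gives the more efficient form of access
   a(n) = i
%endofprogram
\end{verbatim}\rm  \normalsize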

\section{External linkage (May 83)}
The external linkage mechanism provides for the use of program modules or
libraries containing externally accessible procedures or variables.

A program module containing external procedures or variables, and terminated by
\%endoffile, is compiled in the ordinary way, eg IMP MYLIB. Before the externals
it contains can be accessed, the compiled module must be installed by the
INSTALL command. The INSTALL command takes as parameter the name of a program
file (extension MOB assumed) -- or list of names separated by commas. For
example INSTALL MYLIB. The effect of installing a program file is to add all
the external names it contains to the external symbol table; the code of the
module is not loaded at this stage. Thereafter (until logoff) these names are
available for external linking, which is done automatically when a program
referencing any of the names is loaded. It is only necessary to re-install a
program file if a change is made to the external names it contains.

An external name is the identifier declared as external (standardised to
space-free lower-case form) unless this is over-ridden by an alias, in which
case the alias name is used instead (exactly as presented). Note that an alias
over-rides the declared identifier, rather than providing an alternative to it.
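
A minimal sketch of the mechanism, using two separate source files, is given
below. The file name MYPROG and the routine name SHOW TOTAL are hypothetical.
With no alias given, the external name would be the standardised form
"showtotal", which is within the 11 character limit noted in the restrictions.

\small\tt \begin{verbatim}
! File MYLIB: compile with IMP MYLIB, then INSTALL MYLIB
%externalroutine show total(%integer n)
   printstring("Total =")
   write(n, 1)
   newline
%end
%endoffile

! File MYPROG: a separate source file, compiled with IMP MYPROG
%externalroutinespec show total(%integer n)
%begin
   show total(42)
%endofprogram
\end{verbatim}\rm  \normalsize
The specification is placed before \%begin so that it is at the top program
level, and the external routine is kept in a module of its own, since externals
may not be combined with a main program.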

The system command HELP LIB gives information about generally available
libraries.

\subsection{Restrictions on externals}
The following restrictions apply:-

1. Names are truncated to 11 characters for purposes of matching as externals
\\ {\hspace*{0.3 in}} (temporary restriction).
\\ 2. The type checking is nominal (temporary amnesty).
\\ 3. External \%name variables are not supported (temporary restriction).
\\ 4. External specs must all be at the top program level, that is, they may not
\\ {\hspace*{0.3 in}} appear within procedures or inner blocks.
\\ 5. An external spec for a procedure may be satisfied by a procedure occurring
\\ {\hspace*{0.3 in}} later in the same module, but the same is not true for variables.
\\ 6. Externals may not be combined with a main program (\%begin ...
\\ {\hspace*{0.3 in}} \%endofprogram).

CAUTION: The form of external linkage in V2.0 onwards will differ from this
interim form, necessitating re-compilation.


\section{Permanent procedures}
The permanent procedures listed below are available without specification.
These are generally as provided on Vax, with exceptions and additions noted.
Terminal output is direct, so that there is no need for any TTPUT or FLUSH
OUTPUT procedures.

\subsection{Standard PERM}

\small\tt \begin{verbatim}%integermap      INTEGER(%integer a)
%realmap         REAL(%integer a)
%string(*)%map   STRING(%integer a)
%record(*)%map   RECORD(%integer a)
%bytemap         BYTEINTEGER(%integer a)
%shortmap        SHORTINTEGER(%integer a)
%bytemap         LENGTH(%string(*)%name s)
%bytemap         CHARNO(%string(*)%name s, %integer n)
%integerfn       ADDR(%name n)
%string(1)%fn    TOSTRING(%integer k)
%string(255)%fn  SUBSTRING(%string(255) s, %integer from,to)

%integerfn       REM(%integer a,b)

%integerfn       INTPT(%real x)
%integerfn       INT(%real x)
%realfn          FRACPT(%real x)
%realfn          SQRT(%real x)

%string(8)%fn    DATE
%string(5)%fn    TIME
%integerfn       CPUTIME  
                 !in milliseconds (actually elapsed time at present)

%record(*)       NIL
%recordformat    EVENTFM(%integer event,sub,extra,pc,
                 %string(255) message,%integerarray r(0:15))
                 !PC and R are APM-specific extensions
%record(eventfm) EVENT
%constinteger    NL
%integerfn       NEXTSYMBOL
%routine         READSYMBOL(%name n)
%routine         PRINTSYMBOL(%integer k)
%routine         SKIPSYMBOL
%routine         PRINTSTRING(%string(255) s)
%routine         READ(%name n)
%routine         WRITE(%integer m, n)
%routine         PRINT(%real x, %integer n,m)
%routine         PRINTFL(%real x, %integer n)
%routine         NEWLINE
%routine         NEWLINES(%integer i)
%routine         SPACE
%routine         SPACES(%integer i)
%routine         SELECTINPUT(%integer n)
%routine         SELECTOUTPUT(%integer n)
%routine         RESET INPUT
%routine         RESET OUTPUT
%routine         CLOSE INPUT
%routine         CLOSE OUTPUT
%integerfn       INSTREAM
%integerfn       OUTSTREAM
%routine         OPEN INPUT(%integer n, %string(31) S)
                 !*selects N at present*
%routine         OPEN OUTPUT(%integer n, %string(31) S)
                 !*selects N at present*
%routine         PROMPT(%string(31) S)
%integerfn       TESTSYMBOL
                 !*APM specific*
                 !value is -1 if no symbol available from terminal
                 !otherwise value is the symbol, which is skipped

%constinteger    NOECHO,NOTERMECHO,SINGLE,NOPAGE
%routine         SET TERMINAL MODE(%integer m)
                 !*APM specific*
                 !set terminal mode as specified by M (sum of flags)
\end{verbatim}\rm  \normalsize 
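
As an illustrative sketch (the variable names and strings are hypothetical), a
program using a few of these procedures without any specification might read:

\small\tt \begin{verbatim}
%begin
   %integer n
   %string(255) s
   prompt("Number: ")
   read(n)
   s = "Value"
   printstring(s)
   write(n, 1)
   newline
   ! characters 1 to 3 of S
   printstring(substring(s, 1, 3))
   newline
%endofprogram
\end{verbatim}\rm  \normalsize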

\section{Low-level facilities}

As a system implementation language IMP provides a number of facilities which
permit access to parts of the system other languages do not reach. The
store-mapping functions (INTEGER, BYTEINTEGER, etc) allow for direct addressing
using machine addresses; the name-relative indexing facility (as in CUR[1] or
POS[-2]) extends the capability of \%name variables; the fixed-address
declaration facility (as in @16\_200 \%integer TIMER, KBIN, KBOUT) allows
efficient source-level access to specific areas of store.

The use of these facilities involves some loss of the protection normally
conferred by a high-level language and a potential (particularly on a processor
without hardware protection) for uncontrolled malfunction. They should,
therefore, be used with restraint and care.
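
The store-mapping and name-relative facilities might be used as in the
following sketch. The names are hypothetical. The sketch assumes that the
elements of an integer array occupy consecutive store locations in increasing
order, and that the index in CUR[1] counts in units of the object referenced,
neither of which is stated explicitly above.

\small\tt \begin{verbatim}
%begin
   %integerarray buf(0:3)
   %integername cur
   %integer i
   %for i = 0, 1, 3 %cycle
      buf(i) = i * 2
   %repeat
   cur == buf(1)
   ! store-mapping function applied to a machine address
   write(integer(addr(buf(2))), 1)
   ! name-relative indexing either side of the integer referenced by CUR
   write(cur[1], 1)
   write(cur[-1], 1)
   newline
%endofprogram
\end{verbatim}\rm  \normalsize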

\subsection{Assembler in IMP}

Sequences of assembly language instructions may be incorporated in an IMP
program at any point. At present no enabling directive is required. Apart from
the usual hazards of machine level programming, there is the danger of
corrupting the environment pre-supposed by the high-level facilities. For this
reason, it is sensible to keep the use of these instructions to a minimum and to
take particular care in programming these sections.

Assembly language instructions are distinguished from IMP statements by being
prefaced by an asterisk (eg *MOVE D0,D1). They appear within IMP blocks and
procedures and are executed according to the normal flow of control, subject to
any control transfers invoked by the instructions themselves. IMP conventions
for statement termination, labelling and commenting apply, but otherwise the
form of instruction follows closely that defined under the heading of Assembler
Syntax for each individual instruction in the Motorola M68000 manual.
Machine-level operands are specified using the mnemonics D0-D7 and A0-A7 (or
SP), and the standard syntax for $<$ea$>$ modes. Immediate and Address register
variants of op-codes are selected automatically. By comparison with full
Assembler, most of the directives do not apply; assembly-time expression
evaluation is restricted; size specification in cases where the size is implicit
in the op-code is not in general allowed; the indexed PC-relative mode is not
supported, nor is the 'current location counter'.

The error report 'Faulty operand' indicates use of an invalid operand for the
context, but the compiler does not fully check the validity of $<$ea$>$ modes for
the particular instruction.

NB the default 'size' applied is Long (32 bits) rather than Word (16 bits),
which is the manufacturer's default. It is sensible to make a practice of
including the size suffix explicitly for all relevant instructions.

The special mnemonics USP (User Stack Pointer), SR (Status Register) and CCR
(Condition Code Register) are NOT supported but the relevant effects can be
achieved by use of the additional op-codes:

\small\tt \begin{verbatim}  MTCCR <ea>          for      MOVE <ea>,CCR
  MTSR <ea>           for      MOVE <ea>,SR
  MFSR <ea>           for      MOVE SR,<ea>
  MTUSP An            for      MOVE An,USP
  MFUSP An            for      MOVE USP,An
  ATCCR #<data>       for      AND #<data>,CCR
  ATSR #<data>        for      AND #<data>,SR
  ETCCR #<data>       for      EOR #<data>,CCR
  ETSR #<data>        for      EOR #<data>,SR
  OTCCR #<data>       for      OR #<data>,CCR
  OTSR #<data>        for      OR #<data>,SR
\end{verbatim}\rm  \normalsize 
IMP identifiers may also be used as operands for assembly language
instructions. The detailed implications of this over the whole range of data
types are fairly complex and subject to change. The following cases are
reasonably straightforward and stable. Scalar variables may be used as $<$ea$>$
operands provided that they are:

{\hspace*{0.3 in}} declared as own or
\\ {\hspace*{0.3 in}} declared at the outermost level or
\\ {\hspace*{0.3 in}} declared at the current level or
\\ {\hspace*{0.3 in}} declared by means of a fixed-address declaration

Variables declared at intermediate levels should not be accessed. In the case
of a name variable the reference is to the pointer value (32-bit address) rather
than the referenced object. At present it is not possible to access record
fields in assembly language instructions.

Labels and procedure names may be referenced in the Branch group of
instructions, including BSR. The only alternative form of operand for these
instructions is immediate (eg *BLT \#-4), the value specified being the
machine-level displacement. Short and long branches are handled automatically
(though the Compiler's CODE listing does not show this). To access a forward
label in an assembly language instruction, it is necessary for the label to have
been declared by means of the IMP declaration \%LABEL $<$lab$>$.

The registers D0-D3 and A0-A3 may be freely used within assembly sections,
but no assumptions can be made about their contents after execution of most IMP
statements. Other registers may be used only on the basis that their values are
restored before reverting to IMP. In particular this applies to SP. In
addition, NOTE WELL that the accessing of local variables depends on SP; if SP
is changed, access to such variables in machine instructions becomes a nonsense
(though no error report can be made). Similarly, any modification of A4-A6
rules out access to non-local variables.

Note that any declaration of an IMP identifier which coincides with a
register mnemonic takes precedence.
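
A minimal sketch of assembly language embedded in IMP, under the rules given
above, follows. The variable name R is hypothetical. It is declared at the
current level, only D0 from the freely usable set is touched, no register value
is assumed to survive an IMP statement, and the size suffix is given explicitly
on every instruction.

\small\tt \begin{verbatim}
%begin
   %integer r
   r = 21
   *MOVE.L r,D0
   *ADD.L D0,D0
   *MOVE.L D0,r
   write(r, 1)
   newline
%endofprogram
\end{verbatim}\rm  \normalsize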


\section{Events}
Events may be signalled by the system or by the user (using \%SIGNAL) and
trapped using the \%ON \%EVENT construction:

\%on \%event $<$eventno$>$, $<$eventno$>$, ..... {\hspace{0.3 in}} \%start
\\ {\hspace*{0.2 in}} ...
\\ \%finish

"\%event" {\hspace{0.2 in}} is optional in the statement above.
Once the event has been trapped, further information may be obtained by
interrogating the built-in record EVENT:

\%recordformat eventfm(\%byte event,sub,\%short line,\%integer extra,
\\ {\hspace*{1.5 in}} \%string(255)message,\%integerarray r(0:15))


IMP and PASCAL share a common set of Event numbers. {\hspace{0.2 in}} HELP EVENTS gives a table
of these.
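
The following sketch shows an event being signalled and trapped. The event
number 12 is chosen arbitrarily for illustration and is an assumption, as is
the form of the \%signal statement (event number followed by sub-event). The
table given by HELP EVENTS should be consulted for the numbers actually
available. Note the ordering required by this Compiler: declarations, then the
event trap, then instructions.

\small\tt \begin{verbatim}
%begin
   %integer n
   %on %event 12 %start
      printstring("Event")
      write(event_event, 1)
      printstring(" sub-event")
      write(event_sub, 1)
      newline
      %stop
   %finish
   n = 3
   %signal %event 12, n
%endofprogram
\end{verbatim}\rm  \normalsize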

\section{Known Defects}

\small\tt \begin{verbatim} . Statements like "%if COND %then %result = EXP1 %else %result = EXP2"
                   "%if COND %then %return %else ...
                   "%if COND %then %true/%false %else ...
                   "%if COND %then ->LAB %else ...
   are liable to be mis-compiled. Fix: change "%else" to ";".

 . run-time checks are not imposed on capacity, overflow for addition
   and subtraction, invalid %for loops or stack over-run

 . The %include construction cannot currently be nested i.e. %included files
   should not contain %includes.  This restriction has been lifted in versions
   of IMP for the new operating system.
\end{verbatim}\rm  \normalsize 
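As a sketch of the fix for the first defect above (the function name SIGN is
hypothetical), a function which would naturally be written with
"\%then \%result = -1 \%else \%result = 1" can be recast with a semicolon in
place of the \%else:

\small\tt \begin{verbatim}
%begin
   %integerfn sign(%integer x)
      %if x < 0 %then %result = -1; %result = 1
   %end
   write(sign(-5), 1)
   newline
%endofprogram
\end{verbatim}\rm  \normalsize
The second statement is reached only when the condition fails, so the effect is
the same as that intended by the \%else form.
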
This software is frozen. It may be possible to fix bugs in the run-time
libraries but any faults found in the compiler will have to be programmed
around.

\vspace{.75in} view:impv1 printed on 01/03/89 at 16.15

\newpage
\tableofcontents
\end{document}