\documentstyle[a4,12pt]{article}
\begin{document}
\author{Frank Cringle}
\title{APM C Compiler -- Version 2}
\maketitle
\parskip .1 in
\setcounter{secnumdepth}{10}
\parindent 0in
\section{Preamble}

The Bell Labs portable C compiler, with M68000 code generator added at MIT, is
available on the APM.

This compiler conforms to ``The C Programming Language'' by Kernighan \& Ritchie,
with the addition of structure assignment, enumerated types and in-line
assembly.

A large proportion of the UNIX(tm) run-time library exists, and programs which
do not rely on multiple processes can be ported from UNIX to the APM and vice
versa.

\section{Preparation}

Before running any programs compiled with the V2 C compiler, the run-time
library must be installed, and preferably preloaded. Add the following
commands to your login.com file or, if you have a personal machine, to its
custom:xx.com file:

{\hspace*{0.2 in}} preload nc:libc.mob
\\ {\hspace*{0.2 in}} install nc:libc.mob

If your programs use transcendental functions or graphics, install and preload
the corresponding libraries too:

{\hspace*{0.2 in}} preload nc:libc.mob,nc:libm.mob,nc:libg.mob
\\ {\hspace*{0.2 in}} install nc:libc.mob,nc:libm.mob,nc:libg.mob


\section{Command line parsing}

The C compiler automatically generates a call to the run-time function \_START
before executing the main() function of a C program. This parses the command
line in the style of a UNIX shell and builds the parameters argc and argv for
the main() function. The name of the command itself is not available under the
current APM operating system, so argv[0] is always set to ``main''.

The following features are provided:

\subsection{I/O redirection}

{\hspace*{0.2 in}} $<$ file {\hspace{0.4 in}} open file as stdin
\\ {\hspace*{0.2 in}} $>$ file {\hspace{0.4 in}} open file as stdout
\\ {\hspace*{0.2 in}} $>$$>$ file {\hspace{0.3 in}} open file as stdout in append mode

Any of the above may be preceded by a digit (0-9), in which case the file
descriptor corresponding to the digit is used instead of 0 (stdin) or 1
(stdout). For example, 2$>$ errors redirects the error stream stderr to the
file errors.
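
As an illustration of the rule above, the following sketch recognises a
redirection operator at the start of a token. It is written in ANSI C so that
it can be tried on any host, and it is not the code used by \_START: the
structure redirect and the function parse\_redirect are hypothetical names
introduced for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical description of one redirection; not the library's own type. */
struct redirect {
    int fd;        /* file descriptor to be redirected */
    int append;    /* 1 for >>, otherwise 0            */
    int is_input;  /* 1 for <, otherwise 0             */
};

/*
 * Recognise "<", ">" or ">>" at the start of s, optionally preceded by a
 * single digit.  On success fill in *r and return a pointer to the text
 * following the operator; return NULL if s does not start with an operator.
 */
const char *parse_redirect(const char *s, struct redirect *r)
{
    int fd = -1;

    if (isdigit((unsigned char)s[0]) && (s[1] == '<' || s[1] == '>')) {
        fd = s[0] - '0';              /* explicit descriptor, e.g. 2> */
        s++;
    }
    if (*s == '<') {
        r->is_input = 1;
        r->append = 0;
        r->fd = (fd >= 0) ? fd : 0;   /* default is stdin */
        return s + 1;
    }
    if (*s == '>') {
        r->is_input = 0;
        r->append = (s[1] == '>');
        r->fd = (fd >= 0) ? fd : 1;   /* default is stdout */
        return s + (r->append ? 2 : 1);
    }
    return NULL;
}
```

The caller is expected to skip any blanks between the operator and the file
name before opening the file.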

\subsection{Single character escape}

The sequence $\backslash$c is replaced with a single character as follows:

\small\tt \begin{verbatim}   c        character      code (decimal)

   n        newline        10
   t        tab             9
   b        backspace       8
   r        return         13
   f        formfeed       12
   v        vertical tab   11
   other    c itself
\end{verbatim}\rm  \normalsize 
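
The table above can be expressed as a small translation function. This is a
sketch in ANSI C, not the run-time library's own code; the names escape\_char
and unescape are hypothetical, and copying a trailing lone backslash unchanged
is an assumption.

```c
/* Translate the character that follows a backslash (codes as in the table). */
int escape_char(int c)
{
    switch (c) {
    case 'n': return 10;   /* newline      */
    case 't': return 9;    /* tab          */
    case 'b': return 8;    /* backspace    */
    case 'r': return 13;   /* return       */
    case 'f': return 12;   /* formfeed     */
    case 'v': return 11;   /* vertical tab */
    default:  return c;    /* any other character stands for itself */
    }
}

/* Replace every backslash sequence in s, in place.  Returns s. */
char *unescape(char *s)
{
    char *src = s, *dst = s;

    while (*src != '\0') {
        if (*src == '\\' && src[1] != '\0') {
            *dst++ = (char)escape_char((unsigned char)src[1]);
            src += 2;
        } else {
            *dst++ = *src++;   /* ordinary character, or a trailing backslash */
        }
    }
    *dst = '\0';
    return s;
}
```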
\subsection{Wild card expansion}

Any blank-delimited string containing the characters ?, * or [ is treated
as a filename template and is expanded to a sorted list of corresponding
filenames:

\small\tt \begin{verbatim}   ?     matches one character
   *     matches any number (including 0) of characters
   [...] matches any one of the characters enclosed.  A range of
         characters may be specified with a hyphen, e.g. a-z.
\end{verbatim}\rm  \normalsize 
Matching does not take place in the directory part of filenames.
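
The matching rules can be sketched as a recursive function that compares one
filename (without its directory part) against a template. This is an
illustration in ANSI C under the rules listed above, not the library's
implementation; the name match is hypothetical, and treating an unterminated
[ as a failed match is an assumption. Collecting and sorting the matching
filenames is not shown.

```c
/* Return 1 if name matches the template pat, otherwise 0. */
int match(const char *pat, const char *name)
{
    while (*pat != '\0') {
        if (*pat == '*') {
            /* Try every possible length for the text matched by '*'. */
            pat++;
            do {
                if (match(pat, name))
                    return 1;
            } while (*name++ != '\0');
            return 0;
        } else if (*pat == '?') {
            if (*name == '\0')
                return 0;               /* '?' needs one character */
        } else if (*pat == '[') {
            const char *p = pat + 1;
            int found = 0;

            if (*name == '\0')
                return 0;
            while (*p != '\0' && *p != ']') {
                if (p[1] == '-' && p[2] != '\0' && p[2] != ']') {
                    if (*name >= p[0] && *name <= p[2])
                        found = 1;      /* inside a range such as a-z */
                    p += 3;
                } else {
                    if (*name == *p)
                        found = 1;      /* one of the listed characters */
                    p++;
                }
            }
            if (*p != ']' || !found)
                return 0;
            pat = p;                    /* now at the closing ']' */
        } else if (*pat != *name) {
            return 0;                   /* literal character mismatch */
        }
        pat++;
        name++;
    }
    return *name == '\0';
}
```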

\subsection{Symbol substitution}

A \$ followed by a string of letters and digits, optionally enclosed in \{\},
is replaced with the value of that symbol if it was previously defined using
symbol=value.

{\hspace*{0.2 in}} examples:

{\hspace*{0.4 in}} \} symb=fred
\\ {\hspace*{0.4 in}} \} echo hallo\$\{symb\}die
\\ {\hspace*{0.4 in}} hallofreddie
\\ {\hspace*{0.4 in}} \} flags=-O -c
\\ {\hspace*{0.4 in}} \} cc68 \$flags *.c
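
The substitution shown in these examples can be sketched as follows. The code
is an illustration in ANSI C, not the run-time library's source: the symbol
table layout and the names symbol, lookup and expand are hypothetical, and
references to undefined symbols are simply copied unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table entry, for illustration only. */
struct symbol {
    const char *name;
    const char *value;
};

/* Find the value of the symbol whose name is the len characters at name. */
static const char *lookup(const struct symbol *tab, size_t n,
                          const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strlen(tab[i].name) == len && strncmp(tab[i].name, name, len) == 0)
            return tab[i].value;
    return NULL;
}

/*
 * Copy in to out (at most outsize-1 characters plus a terminator), replacing
 * $name and ${name} with the value of a defined symbol.  Returns out.
 */
char *expand(const char *in, char *out, size_t outsize,
             const struct symbol *tab, size_t n)
{
    size_t o = 0;

    if (outsize == 0)
        return out;
    while (*in != '\0' && o + 1 < outsize) {
        const char *start, *end, *val = NULL;
        int braced;

        if (*in != '$') {
            out[o++] = *in++;
            continue;
        }
        braced = (in[1] == '{');
        start = in + 1 + braced;
        for (end = start; isalnum((unsigned char)*end); end++)
            ;                           /* scan letters and digits */
        if (end > start && (!braced || *end == '}'))
            val = lookup(tab, n, start, (size_t)(end - start));
        if (val == NULL) {
            out[o++] = *in++;           /* not a defined symbol: keep the '$' */
            continue;
        }
        while (*val != '\0' && o + 1 < outsize)
            out[o++] = *val++;
        in = end + braced;              /* skip the closing brace, if any */
    }
    out[o] = '\0';
    return out;
}
```

The braces matter when the symbol is followed directly by letters or digits:
without them, the first example would look up a symbol named symbdie.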

\subsection{Quoting}

Strings separated with space or tab are considered to be separate tokens,
unless quoted using matching pairs of ' or ", in which case the whole
quoted string is treated as one token. No wild card expansion is done
inside quoted strings, and symbol substitution and single character
escapes are also suppressed if the quote character is '.
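
The grouping effect of quotes can be sketched as a tokeniser that splits a
line in place. This is a simplified illustration in ANSI C: it only shows how
quoted text stays within one token, leaves out wild card expansion, symbol
substitution and escapes, and assumes that an unterminated quote runs to the
end of the line. The name split\_tokens is hypothetical.

```c
/*
 * Split line into tokens separated by spaces or tabs, modifying it in place.
 * Text between matching ' or " characters is kept in one token and the quote
 * characters themselves are dropped.  Pointers to at most max tokens are
 * stored in argv; the number of tokens found is returned.
 */
int split_tokens(char *line, char **argv, int max)
{
    int argc = 0;
    char *src = line, *dst;

    while (argc < max) {
        while (*src == ' ' || *src == '\t')
            src++;                          /* skip separators */
        if (*src == '\0')
            break;
        argv[argc++] = dst = src;
        while (*src != '\0' && *src != ' ' && *src != '\t') {
            if (*src == '\'' || *src == '"') {
                char quote = *src++;        /* remember which quote opened */
                while (*src != '\0' && *src != quote)
                    *dst++ = *src++;
                if (*src == quote)
                    src++;                  /* drop the closing quote */
            } else {
                *dst++ = *src++;
            }
        }
        if (*src != '\0')
            src++;                          /* step over the separator */
        *dst = '\0';                        /* terminate this token */
    }
    return argc;
}
```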


\section{Compiling and linking C programs}
A Unix-style cc command is available for the new C compiler. Its options
correspond fairly closely to the original. The command is nc:cc68 with the
following options:

\small\tt \begin{verbatim}   -o    name of output file
   -c    Compile named files to .mob, but do not run the linker
   -p    Generate profiling code (count function calls)
   -pt   Generate profiling code (accumulate cputime per function)
   -O    Run peephole optimiser
   -l    Maintain source line number in register d5
   -L    Generate trap #15 instruction before each source line for tracing
   -S    Compile named files to .a68
   -P    Run the preprocessor and leave macro-expanded source in .i
   -E    As -P, but output to stdout
   -Dx   Mark preprocessor symbol x as defined.
   -Ux   Mark preprocessor symbol x as undefined.
   -Ix   Add directory x to the list of directories to search for #includes.
\end{verbatim}\rm  \normalsize 
The options are followed by a list of source files. The files with .c
extensions are compiled, those with .a68 extensions are just assembled. By
default (unless -c is given), all the resulting .mob files are linked to produce
a file named a.mob (or as specified after -o).

Examples:

{\hspace*{0.2 in}} Compile a simple one-module program with no external data:
\\ {\hspace*{0.4 in}} nc:cc68 -c -O prog.c
\\ {\hspace*{0.2 in}} This produces prog.mob which can be run without further linking.

{\hspace*{0.2 in}} Compile and link a one-module program which does refer to uninitialised
\\ {\hspace*{0.2 in}} external data:
\\ {\hspace*{0.4 in}} nc:cc68 -O -o prog.mob prog.c
\\ {\hspace*{0.2 in}} If -o prog.mob is omitted, the output is put in a.mob.

{\hspace*{0.2 in}} Compile and link all .c programs in a directory:
\\ {\hspace*{0.4 in}} cc68 -O -o prog.mob *.c

{\hspace*{0.2 in}} Compile a mixture of C and assembler sources:
\\ {\hspace*{0.4 in}} cc68 -O prog1.c prog2.c prog3.a68

\subsection{Details of the compile and link phases}
\subsubsection{cpp - preprocessor}

The preprocessor expands \# directives in the source file and produces output
acceptable to the compiler. It is run automatically by cc68, or can be
invoked directly:

\small\tt \begin{verbatim}    nc:cpp file.c file.i

    Flags:  -C              do not delete comments
            -Dname=val      define name, as if by #define
            -Dname          define name=1
            -Idirectory     search directory for #include files         
            -P              do not insert line directives (#line 12, foo.c)
                            in the output
            -R              allow macro recursion
            -Uname          remove any built-in definition of name
\end{verbatim}\rm  \normalsize 
The symbols apm and mc68000 are predefined in this version of the preprocessor,
so machine dependent code can be expressed as:

{\hspace*{0.5 in}} \#ifdef apm
\\ {\hspace*{0.5 in}} /* do one thing */
\\ {\hspace*{0.5 in}} \#else
\\ {\hspace*{0.5 in}} /* do another */
\\ {\hspace*{0.5 in}} \#endif

Use \#ifdef apm for APM dependencies (e.g. filenames) and \#ifdef mc68000 for
architecture dependencies (e.g. byte order).
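
A slightly fuller sketch of both kinds of dependency is given below. The file
names, the macro names SCRATCH\_NAME and LOW\_BYTE\_INDEX and the function
low\_byte are hypothetical, and the non-M68000 branch assumes a host that
stores the least significant byte first. The function uses ANSI syntax so that
the sketch also compiles on other hosts.

```c
/* Operating system dependency: hypothetical scratch file names. */
#ifdef apm
#define SCRATCH_NAME "scratch.tmp"
#else
#define SCRATCH_NAME "/tmp/scratch"
#endif

/* Architecture dependency: position of the low-order byte of a 2-byte value. */
#ifdef mc68000
#define LOW_BYTE_INDEX 1    /* the M68000 stores the most significant byte first */
#else
#define LOW_BYTE_INDEX 0    /* assumption: least significant byte first */
#endif

/* Return the low-order byte of a two-byte value as it is laid out in memory. */
unsigned int low_byte(const unsigned char *two_bytes)
{
    return two_bytes[LOW_BYTE_INDEX];
}
```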

\subsubsection{c68 - C compiler}

The compiler reads a `pure' C source file (one that has already been
preprocessed) and produces an assembler file suitable for the MIT assembler
a68.

\small\tt \begin{verbatim}    nc:c68 file.i file.a68

    Flags:  -l              generate code to maintain the line number in
                            register d5 (displayed in run-time error
                            messages).
            -XL             generate line number traps, so the program can
                            be traced using the software front panel.
            -XP             generate profiling code (count function calls)
            -XT             generate code to count cputime per function.
\end{verbatim}\rm  \normalsize 
Assembler instructions can be included in the source file using the asm(..)
directive. Example:

{\hspace*{0.5 in}} asm("trap \#15"); asm(".word 999"); {\hspace{0.3 in}} /* line trap 999 for sfp */

The instructions must correspond to MIT's idea of the M68000 op-codes; see
C:A68.DOC.

\subsubsection{o68 - optimiser}

The optimiser endeavours to reduce the size of an assembler file. This applies
not only to the code section, but also involves removing redundant symbol
information, which would otherwise slow down the assembler. There is
considerable latitude for improvement in the raw compiled code, so use of the
optimiser is highly recommended.

{\hspace*{0.5 in}} nc:o68 infile outfile

\subsubsection{a68 - assembler}

The assembler processes the output of the compiler, or user written assembler
programs, and produces an APM-style object module file.

{\hspace*{0.5 in}} nc:a68 file {\hspace{0.9 in}} -- assemble file.a68 to file.mob

Details of the instruction formats accepted by the assembler can be found in
C:A68.DOC. Changes and additions made for version 2 include:

{\hspace*{0.2 in}} New control instructions:

\small\tt \begin{verbatim}    .insrt   "filename"       (include text of named file.  If it is not
                               in the current directory, C: is tried)
    .if      value            (if operand is non-zero, assemble following)
    .elif    value            (  else if this operand is non-zero ...    )
    .else                     (  else .... )
    .endif                    (end of conditional )
\end{verbatim}\rm  \normalsize 
{\hspace*{0.2 in}} New operand formats:

\small\tt \begin{verbatim}    ?symbol                 ( = 0 if symbol is defined in .text section,
                            else 1 )
    [symbol,register]       ( equivalent to .pc@(symbol-.-2) if symbol
                            is defined in the .text section, otherwise
                            register@(symbol-_dbeg), where _dbeg is
                            the beginning of the .data section)
\end{verbatim}\rm  \normalsize 
{\hspace*{0.2 in}} New data declarations:

\small\tt \begin{verbatim}    .vect    "name",.extdata,size These cause space to be reserved for an
    .vect    "name",.sysproc      import vector of appropriate type, and an
    .vect    "name",.extproc      appropriate entry in the import list of the
    .vect    "name",.dynproc      .mob file.  If the type is .extdata, the
                                  optional value 'size' gives the minimum
                                  required size of the object referenced.
\end{verbatim}\rm  \normalsize 


\subsubsection{Clink - the C linker}

\subsubsection{When do I need to clink?}

Clink is similar to the standard APM link program. It combines a number of .mob
files, resolving cross-references among them. There are two reasons why the
linker is language dependent. Firstly, the APM object module format was not
specified with pre-linking in mind: the header contains insufficient
information to locate the initialised data within the file in order to fix
import vectors. Secondly, the APM loader does not support the concept of common
data blocks, which are referenced but not defined in any module of a program,
and should be allocated space by the loader.

The standard linker analyses the reset code to find the initialised data, but
this only works for products of H-series compilers. Clink expects to find a
'secondary header' at the beginning of the code section, which specifies the
location and size of the initialised data. This is provided by a68, so any .mob
file produced using cc68 will have one. If the initial value of a data import
vector is greater than zero, this is taken to signify a reference to a common
block, of size at least equal to the value. All references to the same symbol
among the files being linked are resolved to the same address within the final
data area, either to the defining instance if there is one (more than one
definition is an error), or to an address beyond the final location of the
initialised data. The data size requirement of the output module is adjusted
accordingly, and code is added to the reset routine to clear this common area
before the program is run.

Any program which references common data, even if it consists of only one
module, will have to be processed by clink, or messages of the form
Cannot find $<$symbol$>$ will be encountered on loading. Common references result
from external declarations of the form

{\hspace*{0.2 in}} int a; {\hspace{1.5 in}} /* better to declare static or initialise */
\\ or
\\ {\hspace*{0.2 in}} struct \{ ... complicated ... \} array[9999]; /* leave as is and clink */

These variables may be declared and initialised in another module, in which
case the reference is resolved in the normal way, either by clink or
dynamically on loading if clink is not used. If there is no initialising
declaration, however, the reference must be resolved statically using clink.
The code required to access a variable is more efficient if it is declared
with storage class static or, if it really is referenced in other modules,
given an initial value. However, an initialised variable is allocated space in
the initialised data image, so large structures or arrays are best left to be
allocated by clink.
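
The module below puts the different kinds of declaration side by side. It is a
hypothetical example: the names record, counter, hits, limit, table and bump
are invented for illustration, and the comments restate the rules given above.

```c
/* Hypothetical record type, used only to make the array large. */
struct record {
    char name[32];
    long value;
};

int counter;                 /* uninitialised external: common reference, needs clink */
static int hits;             /* static: private to this module, no common reference   */
int limit = 10;              /* initialised: defined here, held in the data image     */
struct record table[9999];   /* large and uninitialised: best left to clink           */

/* Use each variable once so that every declaration is really referenced. */
int bump(void)
{
    counter++;
    hits++;
    table[counter % 9999].value = limit;
    return counter + hits;
}
```

Since the common area is cleared before the program runs, counter and table
start out as zero, just as they would under UNIX.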

\subsubsection{Parameters}

\small\tt \begin{verbatim}   -c             suppresses allocation of common data blocks
   -e filename    file contains a list of symbols to be excluded from
                  the export list
   -f filename    file contains a list of .mob file names to be linked.
                  The names would normally be placed on the command line.
   -i filename    file contains a list of symbols to be included in the
                  export list
   -o filename    output is written to file.  Default is a.mob.
   -v             a list of the modules linked with the position and size
                  of their code and data in the output module is produced
                  (verbose)
\end{verbatim}\rm  \normalsize 
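The -e or -i filename can be `-', in which case the list is read from stdin. A
line containing a single `/' terminates a list of names. The filename can also
be *, in which case all the relevant names are selected; on the command line
the * is written as $\backslash$* to protect it from wild card expansion.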

Input files can be either .mob files (the usual case) or archives (produced by
ar) containing a number of .mob files.
\\ {\hspace*{0.2 in}} Examples:

{\hspace*{0.4 in}} \} clink -e$\backslash$* -o prog.mob prog.mob
\\ {\hspace*{0.4 in}} \} clink -i - -v -o lib.mob mod1.mob mod2.mob mod3.mob
\\ {\hspace*{0.4 in}} mod1entry mod1func mod2data
\\ {\hspace*{0.4 in}} /


\subsubsection{Filename extensions}

\small\tt \begin{verbatim}        .h      a C include (header) file
        .c      a C source file
        .i      a preprocessed C source file
        .a68    an assembler source file
        .mob    an apm object module
\end{verbatim}\rm  \normalsize 


\section{Execution profiling}

Execution profiles can be generated using the compiler flags -p and -pt.
With -p each call of a function is counted, and the totals are
displayed when the program terminates. With -pt the cputime (in
milliseconds) spent in each function is accumulated and displayed on
program termination. The results are written to the terminal, or into
a file if the symbol pro\_file gives a filename.

Cputime totals are only correct if a function is exited normally (via
a return statement or through the end of the function). If exit is via
a signal or longjmp, the time is incremented by 1 ms.


Example:

\small\tt \begin{verbatim}     854  Proc_2()
     587  Func_3()
    2498  Proc_6()
    2613  Proc_3()
   11167  Proc_1()
    5151  Proc_8()
    1854  Proc_7()
    1790  Func_1()
    4510  Func_2()
     741  Proc_4()
     583  Proc_5()
   42538  main()
\end{verbatim}\rm  \normalsize 
The results can be processed using cutils:sort -nr pro\_file

\section{Stack trace-back on error}
Stack trace-back shows which functions were active when the program hit
an error. Example:

{\hspace*{0.5 in}} abort(8F624B) line 300
\\ {\hspace*{0.5 in}} Proc7() + 36
\\ {\hspace*{0.5 in}} Proc0() line 178
\\ {\hspace*{0.5 in}} main(1, 8F8B68) line 117

The line number is displayed if line numbers were requested at compile
time (by the -l flag). If a line number is not available the offset
from the start of the function is shown instead. Function parameters
are shown in hex. The symbol cdebug controls trace-back. Its value is
the sum of the following options:

{\hspace*{0.6 in}} 1 trace-back if stopped by CTRL-Y
\\ {\hspace*{0.6 in}} 2 trace-back if program returns non-zero status
\\ {\hspace*{0.6 in}} 4 trace-back if program returns zero status
\\ {\hspace*{0.6 in}} 8 show function parameters

The default value is cdebug=8.

The trace-back is displayed on the terminal, and then written into the file
'stacktrace' in the current directory.
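
Because each option is a distinct power of two, the value of cdebug can be
tested one option at a time with a bitwise AND. The sketch below shows the
arithmetic only; the macro names and the function cdebug\_has are hypothetical
and are not part of the run-time library.

```c
/* Option values from the list above; the names are hypothetical. */
#define TRACE_ON_CTRL_Y    1   /* trace-back if stopped by CTRL-Y               */
#define TRACE_ON_NONZERO   2   /* trace-back if program returns non-zero status */
#define TRACE_ON_ZERO      4   /* trace-back if program returns zero status     */
#define TRACE_SHOW_PARAMS  8   /* show function parameters                      */

/* Return 1 if the given option is included in the sum cdebug, otherwise 0. */
int cdebug_has(int cdebug, int option)
{
    return (cdebug & option) != 0;
}
```

For example, cdebug=11 (1 + 2 + 8) requests a trace-back on CTRL-Y and on a
non-zero exit status, with function parameters shown.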

\section{Libraries}

Run-time support, UNIX system calls and standard subroutines are available in
nc:libc.mob, which should always be installed to run C programs. Transcendental
maths functions are in nc:libm.mob, and links to Fred's graphics routines are in
nc:libg.mob.


\subsection{System calls}
Simulations of the UNIX system calls available on the APM:

\small\tt \begin{verbatim}access(2)   existence and executability are equated to mode 4 (read)
acct(2)     not available
alarm(2)    fully implemented
brk(2)      not available - use sbrk()
chdir(2)    change filestore default directory
chmod(2)    set owner and world protection on filestore file
chown(2)    not available
close(2)    fully implemented
creat(2)    mode interpreted as per chmod(2)
dup(2)      fully implemented
exec(2)     fully implemented
exit(2)     fully implemented
fork(2)     dummy routine
getpid(2)   returns filestore user number
getuid(2)   dummy routines
indir(2)    not implemented
ioctl(2)    dummy routine, returns 0 for a tty, -1 otherwise
kill(2)     dummy routine
link(2)     dummy routine - rename(n1, n2) provides alternative
lock(2)     not implemented
lseek(2)    seek past end of file not supported
mknod(2)    not implemented
mount(2)    not implemented
mpx(2)      not implemented
nice(2)     dummy routine
open(2)     ":" is interpreted as stream 0 (usually console),
            other names are passed to the filestore
pause(2)    fully implemented
phys(2)     not implemented
pipe(2)     dummy routine
pkon(2)     not implemented
profil(2)   dummy routine, alternative provided (see profiling)
ptrace(2)   not implemented
read(2)     fully implemented
setuid(2)   dummy routine
signal(2)   partially implemented
stat(2)     st_mode and st_size implemented.  All times set to FS timestamp
stime(2)    not implemented
sync(2)     not implemented
time(2)     dummy routine
times(2)    tms_utime set to 50 * seconds since APM boot, others 0
umask(2)    dummy routine
unlink(2)   interpreted as delete file
utime(2)    not implemented
wait(2)     dummy routine
write(2)    fully implemented
\end{verbatim}\rm  \normalsize 
\subsection{C Subroutines}

The C subroutines, as documented in section 3 of the UNIX manuals, have been
compiled from UNIX source code, so they should be complete. Any which rely on
unavailable system calls or UNIX-specialities, e.g. the password file, will of
course not work.

\subsection{Maths routines}

The standard UNIX maths library for calculating transcendental functions is
available in nc:libm.mob.


\section{Utilities}
\subsection{xr - cross-referencer}

This utility takes a list of object files and libraries, and analyses the
interdependencies of symbol reference and definition among them.

\small\tt \begin{verbatim}    nc:xr {flags} files

    Flags:  -f      three column output consisting of filename, symbols
                    defined, symbols referenced.
            default five column output of symbol name, symbol type, value,
                    defining file, referencing files.
\end{verbatim}\rm  \normalsize 

\vspace{.75in} c:v2.hlp printed on 05/04/89 at 21.04

\newpage
\tableofcontents
\end{document}