\documentstyle[a4,12pt]{article}
\begin{document}
\author{Rainer Thonnes}
\title{APM NS 32000 Cross-Assembler}
\maketitle
\parskip .1 in
\setcounter{secnumdepth}{10}
\parindent 0in

\section{Preamble}
EUCSD Cross-Assembler for the National Semiconductor 32000 Series

\section{User Notes}
This document describes how to use the NatSemi assembler which is available on the EUCSD APMs.

\section{Using the Assembler}
The APM command ``ASSEM:NS32000 HARRY'' assembles the sequence of statements held in file HARRY.32K and generates two output files. Object code (in a locally defined format, described below) is sent to HARRY.O32, and a listing is sent to HARRY.LIS.

\section{Assembly Language}
The assembler recognises all the instruction mnemonics listed in National Semiconductor Corporation's publication ``Series 32000 Instruction Set Reference Manual'', and in addition supports a number of locally defined assembler directives. These are described below.

The source file contains at most one statement per line. Blank lines are accepted and ignored. Comments begin with a semicolon and occupy the entire rest of the line.

Statements are either machine instructions or assembler directives, and may be labelled. Labels, where they occur, must begin in column 1, and be separated from the statement proper by one or more spaces (or by a colon and any number of spaces). Non-labelled statements must have a space in column 1.

Labels and other names may be defined by the programmer. They must contain alphanumeric characters only, and must begin with a letter. Upper and lower case are not distinguished. Names may be up to 255 characters long. Names may be used to define literals or arbitrary operands, but not to redefine instruction mnemonics. Where different names are given to registers (R0-R7, F0-F7), these will only be recognised in the context of general operands, not, for example, in register lists as in SAVE [R0,R5].

Literals are either quoted characters (e.g.
\verb|'F'|, \verb|'5'|, \verb|'%'|, \verb|''''|), or numbers. Numbers are interpreted in decimal radix unless overridden using '\_'-notation. So 2\_101111, 8\_57, 16\_2F, and 47 all denote the same value. No other notation for non-decimal radix is supported. Literals may be signed using '-' (minus) or '$\backslash$' (not).

In certain contexts constant expressions are allowed, in which all the usual integer operations are supported (+, -, *, /, \% (remainder), \& (and), ! (or), !! (xor), $<$$<$ (logical shift left), $>$$>$ (logical shift right)). These operations are performed strictly left-to-right, i.e. all operations have the same precedence. For example, 2+3*4 evaluates to 20, not 14. No parentheses are allowed.

Floating point literals may be specified. These consist of an integer part (which may be zero, but not absent), followed by a '.' and a string of decimal digits. Although the integer part may be specified in any radix, the fractional part is always in decimal. Floating point numbers are stored internally in IEEE format, and subsequently treated as integers. Thus constant expressions involving floating point numbers are allowed, but meaningless.

\section{Restrictions}
No checking is performed on whether operands are valid in the contexts in which they occur (for example using a literal as a destination operand). In particular, R0-R7 and F0-F7 are not distinguished.

The assembler makes no attempt at forward-jump-squeezing. Displacements can be 1, 2, or 4 bytes long, and the shortest form will be chosen for backward jumps, but forward jumps will always use 4-byte displacements.

\section{Assembler directives}
EQU defines a name to be an operand. The name appears in the label field, and the equivalent operand expression appears in the operand field.

ORG resets the location counter. The operand must be a constant value. If an ORG statement is labelled, the label is defined as the value of the location counter BEFORE it is reset.

The assembler maintains two independent location counters, one for the code area, the other for the data area.
Both are assumed initially zero, and the code counter is selected at the start. Two directives, CODE and DATA, are used to switch between the two areas; they expect no operands. If labelled, the label is defined as a reference to the old location counter.

DC:B, DC:W, DC:D (DC:F, DC:L also accepted) are used to plant in-line values (allowed in data area only), of length 1, 2, 4 (4, 8) bytes, respectively. The operand field contains one or more constant expressions, separated by commas. In the case of DC:B, an operand may also be an ASCII string delimited by double quotes.

DS:B, DS:W, DS:D (DS:F, DS:L) are used to reserve storage (in the data area) for a specified number of variables of the specified length. The operand field contains a single constant, the number of locations required. The effect is that the location counter is incremented by the product of that number and the location width (1, 2, 4 (4, 8)). A special case is when the operand (number of locations required) is zero. In this case the location counter is rounded up to the next multiple of the location width, for alignment purposes. No such alignment takes place unless the operand is zero.

IMPORT is used to reference a symbol defined in another module. A label must be given, and is defined to be the next entry in the link table. The operand field contains the external name of the symbol (case significant but otherwise constrained by the same rules as names).

EXPORT is used to define an external symbol for use by another module. No label is necessary. The operand field contains the external name.

END is used to mark the end of a module, should it be desired to assemble several modules together in one file.

\section{Description of object format}
The object format is a stream of data bytes and instructions to a linker or loader. It is assumed that this linker or loader maintains three location counters, all initially zero, one for the code area and one for the data area, plus a third for a link table.
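
By way of illustration only, the following hypothetical byte sequence (shown in hexadecimal) sketches how a loader of the kind assumed here might interpret such a stream. The data values 12, 34 and FF are arbitrary and not taken from any real program; the escape byte 16\_5e and the action codes that follow it are those listed later in this section.

\small\tt
\begin{verbatim}
12 34              two bytes appended to the code area
                   (code counter becomes 2)
5E 5E              one literal 5E byte appended to the code area
                   (code counter becomes 3)
5E 01              switch to data area
5E 03 00 01 00 00  set data location counter to 16_100
                   (four-byte value, LS byte first)
FF                 one byte appended to the data area
                   (data counter becomes 16_101)
5E 02              end of module
\end{verbatim}\rm \normalsize
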
Bytes appearing in the object file are appended one by one to whichever of the code or data areas is currently selected (initially the code area is assumed selected). Adding a byte to an area has the effect of incrementing the location counter for that area by one.

There is an escape character (16\_5e) which causes the loader to take special action. The loader reads the next byte. If it is also the escape character, then that is added to the current area and normal processing continues. Other values cause the following actions:

\small\tt
\begin{verbatim}
0 - Switch to code area
1 - Switch to data area
2 - End of module
3 - Set current location counter
    [followed by a four-byte value, LS first]
4 - Export an external label
    [followed by a length-prefixed string]
    (define an external symbol of that name as being in the
     current area, and having the value of the current
     location counter.)
5 - Import an external label
    [followed by a length-prefixed string]
    (reference an external symbol defined previously (usually
     in another file), and add a four-byte entry describing it
     (either as a 32-bit data address or as a pair of 16-bit
     values (module number, offset)) to the current module's
     link table, incrementing the link index by one.)
\end{verbatim}\rm \normalsize

Documentation dated 19/12/86

\vspace{.75in}
assem:NS32000.doc printed on 14/03/89 at 15.27
\newpage
\tableofcontents
\end{document}