Description of DEIMOS, an operating system for PDP-11s

Design aims

DEIMOS was designed for operation in a medium-sized PDP-11. At least 16k words of core, a memory management unit, a disc or similar fast mass storage device, a terminal and a clock are required. The system was designed with six main aims:

Running user programs
The system is designed to run general user programs. Normally about twenty simultaneous programs are supported, but this figure is a parameter set at system generation. Each program runs in its own virtual memory environment (VM), which is not necessarily limited to the hardware's mapping limit of 32k words. The system, and other user programs, should be fully protected from the failure of a user program.

Multiple terminal support
The system supports multiple terminals. Each terminal can, optionally, be linked to a command language interpreter which enables the user to initiate and control programs from the terminal.

Peripheral support
The system supports a wide range of peripherals, e.g. line printer, card reader, paper tape reader and punch, various discs, magnetic tape, graph plotter, asynchronous communication lines and a synchronous communication line running under the HDLC protocol. New peripherals can be added with minimum disturbance to the rest of the system.

Self supporting
The system is self supporting on a medium core configuration (approx 28k words). This size is required in order to run the compiler.

Swapping
The number and size of user programs is currently limited by the physical store size of the machine; a limited swapping strategy will be implemented to support a virtual store size of two to three times the physical store size.

Minimal resident section
The size of the resident system is kept small to allow as much store as possible for user programs.

Constraints
The resident part of the system has been kept reasonably small to enable the system to run in smaller core configurations. This constraint affects the overall design, distinguishing it from systems with a large set of facilities, such as UNIX and RSX11D, which require at least 48k words of core to do useful work.

Structural Overview

The system is based on the concept of a number of tasks, each of which manages either a resource or a user program, and which communicate and synchronise with each other using 'messages'. The concept of messages comes from EMAS, the GEC 4080 and others. The user interface is heavily influenced by a number of machines running in the Computer Science department.

The system has two main sections: a resident section and a potentially swappable section. The resident section consists of a kernel and the mass storage device handler, which runs as an otherwise standard system task. The kernel provides the following services:

1) Controls the CPU allocation.
2) Passes interrupts to their device handlers.
3) Passes messages between tasks, storing them if necessary.
4) Supports the virtual memories, including mapping between them.
5) Provides clock and timer functions.
6) Controls core allocation.

All peripherals and other system functions - e.g. the file storage handler, command language interpreter and loader - are handled by system tasks. The system tasks are 'privileged', which entitles them to access parts of the real machine and other tasks.

A new task is created when a user program is run and is deleted on its termination. A task consists of a virtual memory environment and a 'task descriptor block' held within the kernel. A task on this system does not have a 'task monitor'; all interrupts and messages are processed by the task in its 'user' state.

The virtual memory environment of a task consists of a number of segments; these are used to hold the program code, data areas and shared system code. The hardware of the PDP-11 allows eight segments to be mapped onto real store at any given time, giving a virtual memory address space of 32k words. The number of segments owned by a task is not limited to eight. A segment may be mapped in a read-only or a read/write mode, which allows protection of code areas.

The 'task descriptor block' contains the registers (when the task is not actually executing) and other information, such as the state, priority level and message queue, that constitutes the context of the task.
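As a rough illustration of the task context just described, the sketch below renders a task descriptor block as a C structure. DEIMOS itself is written in IMP and assembler and the real layout is not given here, so the field names and sizes are assumptions; the per-segment mapping entries anticipate the global segment table described in the next paragraph.

    /* Illustrative sketch only: the field names and sizes are assumed,
     * not taken from the DEIMOS kernel. */

    #define NSEGS 8                     /* the PDP-11 MMU maps 8 segments at once */

    enum task_state { READY, RUNNING, WAITING, FAILED };   /* assumed states */

    struct task_descriptor {            /* held within the kernel            */
        unsigned regs[6];               /* R0-R5, saved while the task is    */
        unsigned sp, pc, psw;           /* not actually executing            */
        enum task_state state;
        unsigned char   priority;
        struct message *msg_queue;      /* messages queued for this task     */

        /* Per-task copy of the mapping information, kept here to make the
         * context switch cheap; each entry shadows a row of the kernel's
         * global segment table. */
        struct {
            unsigned      core_addr;
            unsigned char access;       /* read-only or read/write           */
            unsigned      gst_index;    /* position in the global segment table */
        } map[NSEGS];
    };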
The list of segments used by all tasks is held in a GLOBAL SEGMENT TABLE within the kernel, together with the core or disc address, access permission and the number of tasks using each segment. This table enables the kernel to maintain control over the usage of segments and to determine easily which parts of tasks may be swapped out. The core address, access permission and a pointer into the global segment table are also maintained within the task descriptor to optimise the context switch.

If a task fails, either with a hardware fault (e.g. an address error or memory protection violation) or with a fault detected by software (e.g. an illegal supervisor call or message), the kernel generates a message to a 'system error task'; to allow later investigation the failed task is prevented from continuing. The 'error task' informs the user of the task failure and the reason for it. The 'error task' is also used by some system tasks to inform the operator about the state of devices.

All communication in the system is done by sending messages. These messages are queued by the kernel, if necessary, until they are requested. Interrupts are handled similarly: the kernel generates and queues a message for the appropriate task. A table is used to determine which task a message (or interrupt) is for. A supervisor call is provided to enable tasks to 'link' themselves to a particular message number. This is slightly less efficient than direct ownership but enables device handlers to be configured into the system dynamically.

The address of a data area may be passed in a message. The segment containing this area may then be mapped from the caller's VM into the receiver's VM. Currently this mechanism is only used to share segments, which are eventually returned to the caller; there is no restriction to stop segments actually being transferred by this method.

Input/output on this system uses a separate segment in each user task to hold its I/O buffers. This will allow the kernel to swap the major part of a task whilst slow I/O is in progress. The sharing of segments, as described above, is used by the device handlers to process the buffers, the segment being released and a reply sent on completion.
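From a task's point of view, the message mechanism might be used along the following lines. This is a hypothetical C rendering: the call names, the message layout and the message number are invented for illustration and are not the actual DEIMOS supervisor call interface.

    #define LP_MESSAGE_NUMBER 12        /* hypothetical device message number */

    struct message {
        unsigned from;                  /* sending task (or device) identifier */
        unsigned number;                /* message number used to route it     */
        void    *area;                  /* optional address of a shared data area */
        unsigned length;                /* size of that area, if any            */
    };

    /* Supervisor calls assumed for this sketch (not the real DEIMOS names). */
    extern void link_message(unsigned number);    /* claim a message number     */
    extern void send_message(struct message *m);  /* queue m for its owner      */
    extern void wait_message(struct message *m);  /* block until one arrives    */
    extern void reply_message(struct message *m); /* answer the original sender */

    /* Outline of a device handler task: it links itself to its message
     * number and then serves requests and interrupts, both of which the
     * kernel delivers as messages. */
    void device_handler_task(void)
    {
        struct message m;

        link_message(LP_MESSAGE_NUMBER);
        for (;;) {
            wait_message(&m);           /* a user request, or an interrupt
                                           turned into a message by the kernel */
            /* ... start the transfer, or deal with the completed one ... */
            reply_message(&m);          /* shared segment released, caller told */
        }
    }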
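Continuing the same sketch, and reusing its invented declarations, the user side of an output request could take the shape below: the data is placed in the task's separate I/O buffer segment, the address of that buffer is passed in the request message, and the kernel maps the segment into the device handler's VM until the reply comes back.

    /* lp_buffer is assumed to lie in the task's separate I/O buffer segment,
     * so the kernel can swap the rest of the task while the transfer runs. */
    static char lp_buffer[512];

    void print_line(const char *text, unsigned len)
    {
        struct message m;
        unsigned i;

        for (i = 0; i < len && i < sizeof lp_buffer; i++)
            lp_buffer[i] = text[i];

        m.number = LP_MESSAGE_NUMBER;   /* route the request to the handler      */
        m.area   = lp_buffer;           /* address within the I/O buffer segment */
        m.length = i;

        send_message(&m);               /* buffer segment mapped into the
                                           handler's VM by the kernel            */
        wait_message(&m);               /* reply: segment returned, output done  */
    }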
Implementation

This system was written in IMP. IMP was chosen for a variety of reasons:

1) A deliberate decision was made to write the system in a high level language for ease of expansion and maintainability.
2) IMP is a proven systems implementation language; for example, it was used to write both the System Four and 2900 versions of EMAS.
3) A new implementation of IMP was available on the PDP-11, of a high enough standard to consider using it for systems implementation.

Two modules of the system have been written in assembler. The first module is at the lowest level of the system, loading up the registers on context switching; since IMP provides no explicit register manipulation, this module had to be written in assembler. The second assembler module provides the run time support for IMP programs. This module could probably be converted to IMP later, but was written in assembler for bootstrapping reasons. Fortunately these two sections are changed infrequently, as they have proved to be a disproportionate source of problems in relation to their size.

The rest of the system consists of six IMP modules, comprising the kernel and the system tasks. These modules are compiled separately and then 'linked' by a purpose-built linker which also sets up the bootstrapping area. Application programs, with the exception of the editor - which was brought over from the previous system - have been written in IMP. The sources are held and compiled on the system.

Operation

The first system, operational since May 1976, is used for a large spooling RJE system with the usual peripherals plus a magnetic tape drive and a graph plotter. The second system, also operational in May 1976, is part of ERTE (Edinburgh Remote Terminal Emulation - see ???). The third system, on a PDP-11/34 used as the front end for the ERCC 2970 running EMAS, became operational in December 1977; the fourth and fifth systems are used part time as the drivers for an interactive benchmark of two ICL 2900s running VME/K and VME/K. At this stage swapping has still to be implemented, though most of the necessary kernel features are already present.