.pg 5. ROBOTIC VISION
A robot whose only sensory capabilities are touch and force is still almost
totally isolated from the real world. This would be so even if touch and
force sensing could be as good as their human counterparts; to imagine the
sensory information available to one of today's robots, think of being blind
and deaf
and wearing woolen mittens in sub-zero temperatures which have numbed your
fingers and frozen the mittens stiff. You can still 'feel' objects in your
grasp, but only by a sense of muscle resistance, and you can only sense the
grossest characteristics of shape.
Being blind and deaf, you can become aware of new objects only by actually
touching them, and only memory will allow you to keep track of 'known'
objects.
.ti 4
This analogy helps to explain why providing a robot with a visual sense
is such a desirable goal, even though the technology of visual processing is
still fairly rudimentary. At present it is possible for a robot vision
system to find and identify an object provided that it has an adequate
description of the object and that the lighting conditions are well
controlled so that the object can be seen as a clear silhouette. Finding
and identifying objects when they overlap (the 'bin picking' problem) or
in normal industrial lighting conditions is still the subject of a
considerable amount of research work. There are various ad hoc systems that
work when figure-ground separation is difficult; many function by assuming 
that
.ul 1
only
objects known to the system are ever viewed, and that these objects can be
distinguished from each other by some test that looks for special
features. For example, one test might be to extract a line drawing from the
image (this is by no means easy) and then match some lines against some edges
of known models of the various objects after suitable projective 
transformation. Usually easier and more specialised tests are applied - for
example, if the objects are all washers of different sizes the test might be
to identify the washer by computing the ratio of the area of the hole to the
area of the washer.
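.ti 4
The washer test can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not a description of any particular
system: the image is taken to be already reduced to a clean binary
silhouette (1 for washer material, 0 elsewhere), exactly one washer is in
view, 'area of the washer' is taken to mean the area of the material, and
the function names and the table of known ratios are hypothetical.
.nf
```python
from collections import deque


def hole_to_washer_ratio(image):
    """Return (hole area) / (washer material area) for a binary image.

    image is a list of rows; 1 = washer material, 0 = background or hole.
    Assumes one washer, fully in view, whose hole is closed off from the
    surrounding background.
    """
    rows, cols = len(image), len(image[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and image[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # A 4-connected fill marks all background reachable from the border.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                if image[nr][nc] == 0 and not outside[nr][nc]:
                    outside[nr][nc] = True
                    queue.append((nr, nc))
    material = sum(sum(row) for row in image)
    # Any 0 pixel the fill could not reach must lie inside the hole.
    hole = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if image[r][c] == 0 and not outside[r][c]
    )
    if material == 0:
        raise ValueError("no washer pixels in image")
    return hole / material


def identify_washer(image, known_ratios):
    """Return the name whose stored ratio is closest to the measured one."""
    measured = hole_to_washer_ratio(image)
    return min(known_ratios, key=lambda name: abs(known_ratios[name] - measured))
```
.fi
Because it is a ratio of two areas, the measure does not change with
magnification, so the camera-to-object distance need not be known. By the
same token the test can only separate washers whose proportions differ, not
washers that are scaled copies of one another.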
.ti 4
There are other tasks to which a vision system might be applied besides
finding and identifying objects, such as checking that the robot's internal
model of the world corresponds to reality or measuring some quantity such as
how far a shaft has been pushed into its bearing.
Clearly, which tasks can be tackled depends on what information is provided by
the vision system. Using more than one camera makes it possible to get range
and other 3-D information for objects that are simultaneously in the view of
both cameras, though the amount of processing this requires is large. Using
a colour camera makes it possible, in theory, to get much better quality of
information than is provided by a monochrome camera, though a lot of
research work has yet to be done on this.
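.ti 4
As an illustration of how two cameras yield range, consider the simplest
arrangement, which is assumed here rather than taken from any particular
system: two identical cameras with parallel optical axes, a known distance
(the baseline) apart. A point seen in both images then lies at a range equal
to the focal length times the baseline, divided by the difference between
its horizontal positions in the two images. The function and parameter names
below are hypothetical.
.nf
```python
def range_from_disparity(x_left, x_right, focal_length, baseline):
    """Distance along the optical axis to a point seen by both cameras.

    x_left, x_right: horizontal image coordinates of the same scene point,
        measured from each image centre, in the same units as focal_length.
    baseline: separation of the two camera centres.
    The result is in the units of baseline.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero disparity means a point at infinity; negative disparity
        # cannot occur in this geometry, so the match must be wrong.
        raise ValueError("non-positive disparity: no finite range")
    return focal_length * baseline / disparity


# Example: 16 mm lenses 100 mm apart, image positions 1.5 mm and 0.5 mm.
print(range_from_disparity(1.5, 0.5, focal_length=16.0, baseline=100.0))
# prints 1600.0 (mm)
```
.fi
The arithmetic is trivial; the heavy processing lies in deciding which point
in one image corresponds to which point in the other, which this sketch
takes as given.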
.ti 4
A good vidicon camera, the sort used by the BBC for TV, is expensive. There
are cheap ones available, but the linearity and stability of the image they
produce are poor. Although in many applications it might be possible to have
the software compensate for the non-linearity, it is harder to compensate for
the correspondingly poorer accuracy. Vidicon cameras can suffer from noise
problems (e.g. temperature variations can have a major effect on picture
quality) that may not bother a TV viewer but will bother a computer vision
system. It is possible to compensate by taking an average image over a number
of consecutive frames, but this increases the minimum time to process a 
picture. In general it is better to choose one of the various sorts of solid
state array cameras that are now available. These depend for their functioning
on the fact that many materials, including many semiconductors, 'release'
electric charge when light falls on them, the quantity depending on the light
intensity. In CCD (charge-coupled device) cameras the released charge is
moved at a known rate by varying the potential across two (or more) interlaced
sets of electrodes on the surface of the material. The moving charge
is a current whose magnitude depends on the amount of light. In diode array
cameras the released charge affects the conduction characteristics of diodes;
again the current flowing for a given applied voltage depends on the amount
of light.
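.ti 4
Whatever the sensor, the frame averaging mentioned above as a remedy for
noise can be sketched as follows. The sketch assumes that the frames are
already in memory as equal-sized arrays of grey levels and that the scene
does not move between frames; the function name is hypothetical. If the
noise in each frame is independent and averages to zero, the mean of N
frames has its noise reduced by a factor of about the square root of N, at
the cost of waiting for N frame times. A slow drift, such as one caused by
temperature, is not removed this way.
.nf
```python
def average_frames(frames):
    """Pixel-by-pixel mean of a list of equally sized grey-level frames.

    Each frame is a list of rows of numbers.
    """
    if not frames:
        raise ValueError("need at least one frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != rows or any(len(row) != cols for row in frame):
            raise ValueError("frames differ in size")
    n = len(frames)
    # For each pixel position, sum across all frames and divide by N.
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(cols)]
        for r in range(rows)
    ]
```
.fi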
Devices called RAMeras are now becoming available; they consist essentially of
semiconductor memory on which an image is projected. If one fills the memory
with 1s and waits briefly, the light will cause some of the 1s to deteriorate
to 0s and the image is obtainable merely by reading the memory! RAMeras are
very cheap but so far offer only poor resolution. It is possible to obtain
solid state array cameras that provide reasonable resolution combined with
good linearity and stability, though not cheaply (yet). Typical square arrays
might produce a 100x100 or 256x256 image. A higher resolution can be had by
using a linear array, say 1x2048, combined with a mechanical scan on the
second axis to generate a 2-D image. If the vision system is only required to
view objects passing on a conveyor belt, then the belt motion provides this
mechanical scanning for free!
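.ti 4
The conveyor arrangement can be sketched as below. This is a minimal
illustration under assumptions not stated above: the linear array lies
across the belt, the belt speed is constant and known, and read_line stands
for a hypothetical routine that returns the current contents of the array.
Reading the array once every time the belt has advanced by one pixel width
gives square pixels.
.nf
```python
def line_period(pixel_size, belt_speed):
    """Time between readings that gives square pixels.

    pixel_size: width on the belt covered by one array element.
    belt_speed: belt travel per unit time, in matching length units.
    """
    if belt_speed <= 0:
        raise ValueError("belt must be moving")
    return pixel_size / belt_speed


def assemble_image(read_line, num_lines):
    """Build a 2-D image from successive readings of a linear array.

    read_line: hypothetical callable returning one sequence of pixel
        values each time it is called (one call per line period).
    num_lines: number of scan lines to collect.
    """
    image = []
    for _ in range(num_lines):
        line = list(read_line())
        if image and len(line) != len(image[0]):
            raise ValueError("scan line length changed")
        image.append(line)
    return image


# Example: elements covering 0.5 mm of a belt moving at 100 mm per second.
print(line_period(0.5, 100.0))  # prints 0.005 (seconds between readings)
```
.fi
If the belt speed varies, the readings would instead have to be triggered by
a measurement of belt travel, which this sketch does not attempt.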