.pg
5. ROBOTIC VISION

A robot whose only sensory capabilities are touch and force is still almost totally isolated from the real world.
This would be so even if touch and force sensing could be as good as their human counterparts; to imagine the sensory information available to one of today's robots, think of being blind and deaf and wearing woolen mittens in sub-zero temperatures which have numbed your fingers and frozen the mittens stiff.
You can still 'feel' objects in your grasp, but only by a sense of muscle resistance, and you can only sense the grossest characteristics of shape.
Being blind and deaf you can only become aware of new objects by actually touching them, and only memory will allow you to keep track of 'known' objects.
.ti 4
This analogy helps to explain why providing a robot with a visual sense is such a desirable goal, even though the technology of visual processing is still fairly rudimentary.
At present it is possible for a robot vision system to find and identify objects provided that it has an adequate description of the object and that the lighting conditions are well controlled so that the object can be seen as a clear silhouette.
Finding and identifying objects when they overlap (the 'bin picking' problem) or in normal industrial lighting conditions is still the subject of a considerable amount of research work.
There are various ad hoc systems that work when figure-ground separation is difficult; many function by assuming that
.ul 1
only
objects known to the system are ever viewed, and that these objects can be distinguished from each other by some test that looks for special features.
For example, one test might be to extract a line drawing from the image (this is by no means easy) and then match some lines against some edges of known models of the various objects after suitable projective transformation.
Usually easier and more specialised tests are applied; for example, if the objects are all washers of different sizes, the test might be to identify each washer by computing the ratio of the area of the hole to the area of the washer.
.ti 4
There are other tasks to which a vision system might be applied besides finding and identifying objects, such as checking that the robot's internal model of the world corresponds to reality or measuring some quantity such as how far a shaft has been pushed into its bearing.
What tasks can be tackled clearly depends on what information is provided by the vision system.
Using more than one camera makes it possible to get range and other 3-D information for objects that are simultaneously in the view of both cameras, though the amount of processing this requires is large.
In theory, using a colour camera makes it possible to get much better information than a monochrome camera provides, though a lot of research work remains to be done on this.
.ti 4
A good vidicon camera, the sort used by the BBC for TV, is expensive.
There are cheap ones available, but the linearity and stability of the images they produce are poor.
Although in many applications it might be possible to have the software compensate for the non-linearity, it is harder to compensate for the correspondingly poorer accuracy.
Vidicon cameras can suffer from noise problems (e.g. temperature variations can have a major effect on picture quality) that may not bother a TV viewer but will bother a computer vision system.
It is possible to compensate by taking an average image over a number of consecutive frames, but this increases the minimum time to process a picture.
In general it is better to choose one of the various sorts of solid state array cameras that are now available.
These depend for their functioning on the fact that many materials, including many semiconductors, 'release' electric charge when light falls on them, and that the quantity released depends on the light intensity.
In CCD (charge-coupled device) cameras the released charge is moved at a known rate by varying the potential across two (or more) interlaced sets of electrodes on the surface of the material.
The moving charge is a current whose magnitude depends on the amount of light.
In diode array cameras the released charge affects the conduction characteristics of diodes; again the current flowing for a given applied voltage depends on the amount of light.
Devices called RAMeras are now becoming available; they consist essentially of semiconductor memory on which an image is projected.
If one fills the memory with 1s and waits briefly, the light will cause some of the 1s to deteriorate to 0s and the image is obtainable merely by reading the memory!
RAMeras are very cheap but so far offer only poor resolution.
It is possible to obtain solid state array cameras that provide reasonable resolution combined with good linearity and stability, though not cheaply (yet).
Typical square arrays might produce a 100x100 or 256x256 image.
A higher resolution can be had by using a linear array, say 1x2048, combined with a mechanical scan on the second axis to generate a 2-D image.
If the vision system is only required to view objects passing on a conveyor belt, then the belt motion provides this mechanical scanning for free!
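.ti 4
Two of the techniques mentioned above, averaging consecutive frames to suppress noise and building a 2-D image from a linear array scanned by belt motion, can be sketched in a few lines of Python.
This is only an illustrative sketch under stated assumptions: images are plain lists of rows of integer intensities, and the names average_frames, assemble_line_scan and read_line are hypothetical, not part of any real camera interface.
In particular, read_line stands in for whatever routine delivers one scan of the linear array, and it is assumed to be called once per fixed increment of belt travel.
.nf
```python
from typing import Callable, List, Sequence

Row = List[int]
Image = List[Row]


def average_frames(frames: Sequence[Image]) -> List[List[float]]:
    """Pixel-wise mean of consecutive frames of identical size.

    Averaging reduces random noise, at the cost of waiting for
    several frames before a picture can be processed.
    """
    if not frames or not frames[0]:
        raise ValueError("need at least one non-empty frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same dimensions")
    n = len(frames)
    return [[sum(frame[y][x] for frame in frames) / n for x in range(width)]
            for y in range(height)]


def assemble_line_scan(read_line: Callable[[], Row], n_lines: int) -> Image:
    """Stack successive scans of a linear array into a 2-D image.

    read_line is a hypothetical callback returning one scan line per
    call; the belt motion between calls supplies the second axis.
    """
    image: Image = []
    for _ in range(n_lines):
        line = list(read_line())
        if image and len(line) != len(image[0]):
            raise ValueError("scan lines must all have the same width")
        image.append(line)
    return image


if __name__ == "__main__":
    # Simulated 1x4 linear array viewing a bright bar carried past on a belt.
    scans = iter([[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]])
    picture = assemble_line_scan(lambda: next(scans), 4)
    for row in picture:
        print(row)
```
.fi
With a real 1x2048 array the same loop would apply unchanged; only the source of the scan lines and the timing of the calls relative to the belt speed would differ.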