DIGITAL OPHTHALMOLOGIST
ROBOT VISION COMING OF AGE
by Daithi ó hAnluain
CARS that drive themselves, guides for the blind and even hard-working fruit pickers are just some of the applications of robotic vision currently in research. But the quest to give sight to machines has also revealed hints about our own evolutionary heritage and insights into how the visual cortex works.
"There is a link between intelligence and vision. All highly intelligent species also have highly evolved visual systems. Perceptual ability of an organism is directly responsible for its intelligence. We cannot expect to construct intelligent machines without providing them with capable vision systems," said Vladimir Brajovic PhD, Director of the Computational Sensor Laboratory at Carnegie-Mellon University , Pittsburgh , Pennsylvania USA .That's just one of the insights gained by research into robotic vision, usually called machine or computer vision, in the last 40 years. Another insight reveals that the human visual system is more complex than the pioneers of robotic vision ever suspected.
"We still don't know how a lot of it actually works," said Andrew Fitzgibbon, imaging researcher at Oxford University 's machine vision team at New College . In the 1960s MIT's Marvin Minsky, one of the pioneers of artificial intelligence, believed that machine vision was a simple problem that could be solved by one PhD student. Now scientists realise that vision is a much, bigger problem than Dr Minsky thought. "We've had 40 years of computer vision research and we still can't imitate the visual system of a common housefly. It's a tiny insect with a tiny brain and yet its visual system is amazingly capable. We can't match it. And if we could match it, would we be able to put it on a computing platform available to us? Is that even possible, given that the way neurons compute is completely different from the way digital computers do? We don't know yet," said Dr Brajovic.
Dr Minsky's mistake was that he believed the human vision system itself is simple, and that it would be a routine affair to replicate that simple system in machines. Of course it is relatively simple to get an "image" from light, laser or sonar sensors. To go from image to meaning, say "predator" or "food", takes a leap of understanding. "We want to move from photons to decisions," said Dr. Brajovic. The main problem, then, is not giving human sight to machines. The main problem is understanding what human sight actually does in the first place. Once Dr Minsky and his contemporaries got the question right, they settled down to a long period of biomimetic research, attempting to replicate biological processes in machines.
"Machine vision grew out of artificial intelligence research, which in the early days had a lot of influence from fields like psychology, philosophy and pure math. But to solve this problem really requires a mainstream, rigorous engineering approach, so we bring knowledge from optics and electrical engineering and applied physics. Research into machine vision has really matured as an engineering discipline," said Dr Larry Matthies, supervisor of the Machine Vision group at the Jet Propulsion Laboratory in Pasadena, California, USA.
There are now two schools of thought on machine vision. One, the engineering school, exemplified by Dr Matthies, believes that computer vision is essentially an applied research and engineering problem. "The feasibility and utility of machine vision has been proven, and steady technical and mathematical advances will solve any remaining issues over time. But some subproblems, like object recognition, are not well understood today and may require revolutionary new approaches to solve.
"I think we are going to see a revolution in the next 10 years. Because the basic feasibility questions are not stumbling blocks, so it comes down to having a fast enough computer, that's affordable, small and low power enough to work on these mobile platforms. That's just a question of the semi-conductor industry developing them," Dr Mathies said. The second approach to machine vision is the biological or biomimetic school, which says that the problem has been barely defined and will take between 20 and 50 years to resolve. Biomimetics is a scientific method that seeks to mimic biological systems. Both schools are working towards different, though related, goals. While the engineers want to develop systems that accomplish given tasks, the biology-inspired scientists want their robots to adapt to and interact with the world around them. Engineering takes a problem and fixes it with the classical tools of that trade- geometry, applied physics and mathematics.
"My own speciality is stereo vision using two cameras and finding the same feature in both images to triangulate a 3-dimensional structure. And that's has enabled a fair amount of autonomous navigation indoors. The next step is to recognise elements of the environment," explains Dr Matthies. Stereo vision for 3-D perception and obstacle avoidance will be used in the two Mars Rovers, developed by JPL, that are due to land on the red planet early next year. Back on Earth, JPL is using colour and infrared to distinguish between types of terrain that could form an obstacle, say a rock, and other terrain that just appears as an obstacle, like tall grass. The Mars Rovers come with a whole battery of sensors. A panoramic camera at human-eye height will scout the terrain, while a miniature thermal emission spectrometer, with infrared vision, will help scientists identify the most interesting rocks.
A special rock abrasion tool will expose the interior of interesting rocks so that another imager, like a geologist's hand lens, can get close-up texture views while two spectrometers will identify the composition of the rock.
The engineering approach doesn't necessarily need a human view of the world. Laser range finders allow robots to locate themselves and judge depth. Industrial robots may soon use spectroscopic cameras to check the quality of fruit. One fairly simple-minded robot vacuum cleaner that is now available to consumers bumps around a room using impact sensors. Once it hits an object, it changes direction and continues blindly bouncing around the room.
"There are examples where computer vision works today. I like to cite a recent success at Carnegie Mellon on autonomous road following. We demonstrated a car that can drive automatically at 75 mph and stay within its lane. It can also autonomously take highway exits. You don't have to steer," said Dr. Brajovic. But he notes that such a system is not adaptive. If the environment changes suddenly, for example, if road works appear ahead, the system doesn't know what to do.
"Machine vision works well in constrained environments. Say, in an industrial production line, if I can control my lighting, if I can control how parts move past my camera, then my camera can identify it. But if the environment is not constrained, like the real world, say walking in town, now you have to find your way, avoid unpredictable obstacles and recognise when it is safe to cross the street, and that is very difficult for computer vision today," said Dr. Brajovic.
The second school of machine vision, biomimetics, wants to develop adaptive, autonomous robots that can interact with the world around them in real time. Members of the biomimetics school believe that the machine vision problem is only now being understood, and the solution is much further away.
"We're using aspects of the human vision system to create robotic vision. For example, the brain identifies interesting points in a scene, say high-contrast areas and edges, and maps those. There are computer vision systems that do the same thing. It's not a model, but it is inspired from how human vision works," said Oxford 's Dr. Fitzgibbon. Yet much of the human visual system remains a mystery. Even so, there is progress. Researchers at Foveola, an aptly named high tech company, say they have developed software that can rapidly learn to identify objects. The Foveola software uses banks of simulated cells to analyse a small region of an image producing numerical signatures to recognise shapes. Like the real foveola, this 50 by 50 pixel window can be repeatedly moved about a scene.
Similarly, Dr. Laurent Itti at the University of Southern California is using neurophysical insights to improve machine vision. His approach, called selective attention modelling, is inspired by how the brain constructs scenes around highlights of interest, like striking colours and sharp contrasts. The machine version of this process gives more weight, or significance, to elements of an image that stand out.
Researchers hope that mimicking human or animal vision will eventually allow the development of robots that can act autonomously and learn from their environment as they interact with it. Dr. Brajovic's team is developing a light-sensitive silicon chip using neuromorphic engineering, a biomimetic approach that seeks to imitate neural signal processing. "The prime inspiration has been the biological retina, considered by some scientists to be 'the approachable part of the brain'," he noted.
Motivated by biological systems, Dr Brajovic has developed a mathematical model for reflectance perception that has the potential of being implemented in an adaptive imaging chip. In a single mathematical step the model accounts for both scene contours, which in humans are produced by the retina, and the scene surfaces, which are reconstructed in the early visual cortex through the "filling-in" process.
The reflectance function describes objects in the scene. It determines how the surfaces in the scene reflect light to produce the optical images our eyes see. Human vision is adaptive; it is largely sensitive to the reflectance of objects and is able to suppress wide variations in lighting conditions. Because of that we are able to function in environments with widely varying illumination: night, day, deep shadows, very bright scenes and so on. But today's cameras lack that level of adaptation. They will quickly saturate, underexpose and fail in real-world conditions, rendering the entire computer vision system ineffective.
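The article does not give the mathematics of Dr Brajovic's model. As a much cruder stand-in, the sketch below assumes the common simplification that pixel intensity is reflectance multiplied by illumination, and estimates illumination as a local average. The function name, window radius and sample values are all hypothetical.

```python
def normalise_by_local_mean(row, radius=2):
    """Divide each pixel by the mean of its neighbourhood.

    The neighbourhood mean serves as a rough estimate of illumination, so
    the result depends mostly on how surfaces reflect light rather than on
    how brightly they are lit. This is a simple illustration only.
    """
    out = []
    for i, value in enumerate(row):
        lo = max(0, i - radius)
        hi = min(len(row), i + radius + 1)
        window = row[lo:hi]
        local_mean = sum(window) / len(window)
        # Guard against an all-black neighbourhood.
        out.append(value / local_mean if local_mean > 0 else 0.0)
    return out


# The same striped surface (hypothetical reflectances) in bright light and in shadow.
reflectance = [0.2, 0.8, 0.2, 0.8, 0.2, 0.8]
bright = [r * 100.0 for r in reflectance]
shadow = [r * 10.0 for r in reflectance]

bright_view = normalise_by_local_mean(bright)
shadow_view = normalise_by_local_mean(shadow)
```

Although the raw intensities differ tenfold, the two normalised rows agree, because a uniform change in lighting cancels in the division. Real scenes, where illumination varies across the image, are much harder, which is the problem the reflectance perception work addresses.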
"The results of our reflectance perception model are amazingly good. From ordinary digital photographs, which do not exhibit any kind of adaptation we are able to reconstruct a view of the world that closely reconstructs what the photographer's eye perceived at the time of taking the picture. For example, ordinary photographs usually exhibit much harsher shadows than we remember seeing. This technology could, when combined with a light-sensitive silicone chip implantation, allow digital cameras to adapt to harsh lighting conditions and deliver images that would dramatically improve the reliability of computer vision" said Dr Brajovic. (An interactive on-line demonstration of Dr Brajovic's reflectance perception algorithm is available at www.shadowillumiinator.org). And while adaptive and truly autonomous robots may be a generation or two away, their slower forebears are already a commercial success, like Sony's famous robo-pet Aibo. The commercialisation of robot technologies has had two benefits for research: providing a market for new discoveries and providing platforms for new experiments.
"Sony's Aibo has helped research a lot. It has freed researchers to get on with research rather than worrying about the robot. Before researchers had to spend a lot of time maintaining the robots," commented Dr Fitzgibbon. Many still do, such as those involved in designing the highly specialised machines that can roam Mars. But the work will be worth it if it can deliver what Dr. Brajovic believes is the long-term goal: "Ideally you want a computer vision system that you turn on, just like you open your eyes, and suddenly it sees things and understands what they are."
Vladimir M. Brajovic PhD
brajovic@cs.cmu.edu
Andrew Fitzgibbon
awf@robots.ox.ac.uk
Dr Larry Matthies
lhm@robotics.jpl.nasa.gov