Researchers at the MIT McGovern Institute for Brain Research have used a biological model to train a computer model to recognize objects, such as cars or people, in busy street scenes. Their approach, which combines neuroscience with artificial intelligence and computer science, mimics how the brain recognizes objects in the real world. This versatile model could soon be used for automobile driver's assistance, visual search engines, biomedical imaging analysis or robots with realistic vision. It also has many potential applications for neuroscientists, to design augmented sensory prostheses for example. And the researchers are thinking about a commercial implementation of their technology.
This computer model was built in Tomaso Poggio's laboratory at the McGovern Institute. Poggio is also co-director of the Center for Biological & Computational Learning (CBCL) at MIT, where he worked with Thomas Serre. Here is how this biologically-inspired computer model works.
The team "showed" the model randomly selected images so that it could "learn" to identify commonly occurring features in real-world objects, such as trees, cars, and people. In so-called supervised training sessions, the model used those features to label by category the varied examples of objects found in digital photographs of street scenes: buildings, cars, motorcycles, airplanes, faces, pedestrians, roads, skies, trees, and leaves.
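To make the idea of a supervised training session a little more concrete, here is a minimal sketch. It is not the classifier the MIT team used: it assumes each image patch has already been reduced to a short feature vector, and it swaps in a simple nearest-centroid rule. The function names and the toy numbers are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each label.

    examples: list of (feature_vector, label) pairs labeled by hand.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Toy, made-up feature vectors standing in for hand-labeled street-scene patches.
training = [
    ([0.9, 0.1], "car"),
    ([0.8, 0.2], "car"),
    ([0.1, 0.9], "tree"),
    ([0.2, 0.8], "tree"),
]
centroids = train_centroids(training)
print(classify(centroids, [0.7, 0.3]))
```

The point of the sketch is that nothing in the training code is specific to cars or trees: adding a new category only means adding labeled examples.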
Compared to traditional computer-vision systems, the biological model was surprisingly versatile. Traditional systems are engineered for specific object classes. For instance, systems engineered to detect faces or recognize textures are poor at detecting cars. In the biological model, the same algorithm can learn to detect widely different types of objects.
Below are several images which show how this computer model works. The top row contains examples taken from the Street Scene database of the CBCL. The middle row shows the results of true hand-labeling: "color overlay indicates texture-based objects and bounding rectangles indicate shape-based objects. Note that pixels may have multiple labels due to overlapping objects or no label at all (indicated in white)." Finally, the bottom row shows the "results obtained with a system trained on examples like (but not including) those in the second row." (Credit: McGovern Institute at MIT, via IEEE)

Teaching a computer to recognize objects has always been difficult, even though children do it easily. What makes this new computer model innovative is that it mimics the brain's own hierarchy.
Specifically, the "layers" within the model replicate the way neurons process input and output stimuli -- according to neural recordings in physiological labs. Like the brain, the model alternates several times between computations that help build an object representation that is increasingly invariant to changes in appearances of an object in the visual field and computations that help build an object representation that is increasingly complex and specific to a given object.
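The alternation described above can be sketched on a one-dimensional toy signal. This is only an illustration under simplifying assumptions, not the published model: the "selectivity" step is assumed to be a Gaussian-tuned template match, the "invariance" step is assumed to be a maximum over neighboring positions, and the function names, template, and signals are hypothetical.

```python
import math


def simple_layer(signal, template, sigma=1.0):
    """Selectivity step: Gaussian-tuned match of the template at every position."""
    width = len(template)
    responses = []
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        sq_dist = sum((x - w) ** 2 for x, w in zip(window, template))
        responses.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    return responses


def complex_layer(responses, pool=4):
    """Invariance step: keep only the strongest response in each block of positions."""
    return [max(responses[i:i + pool]) for i in range(0, len(responses), pool)]


template = [0, 1, 0]
signal_a = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
signal_b = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # same feature, shifted by one position

s_a = simple_layer(signal_a, template)
s_b = simple_layer(signal_b, template)
c_a = complex_layer(s_a)
c_b = complex_layer(s_b)

print(s_a == s_b)  # the selective responses differ when the feature moves
print(c_a == c_b)  # the pooled responses do not
```

A deeper hierarchy would feed the pooled output into another selective layer with more complex templates, then pool again, which is the "several times" in the description above.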
While the team is working on a commercial implementation, it also has ideas to go further.
"The lab is now elaborating the model to include the brain's feedback loops from the cognitive centers. This slower form of object recognition provides time for context and reflection, such as: if I see a car, it must be on the road not in the sky. Giving the model the ability to recognize such semantic features will empower it for broader applications, including managing seemingly insurmountable amounts of data, work tasks, or even email."
This research was published in IEEE Transactions on Pattern Analysis and Machine Intelligence under the title "Robust Object Recognition with Cortex-Like Mechanisms" (Volume 29, Number 3, Pages 411-426, March 2007). The abstract and the full paper (PDF format, 31 pages, 3.48 MB), from which the above images have been extracted, are available online.
Finally, the Street Scene database, other databases, and various pieces of software are available from the CBCL page.
Sources: The McGovern Institute at MIT, February 7, 2007; and various other websites
______________________
* The term Wetware is used to describe the integration of the concepts of the physical construct known as the central nervous system (CNS) and the mental construct known as the human mind. It is a two-part abstraction drawn from the computer-related ideas of hardware and software.
The first abstraction solely concerns the bioelectric and biochemical properties of the CNS, specifically the brain. If the impulses traveling the various neurons are analogized as software, then the physical neurons would be the hardware. The amalgamated interaction of the software and hardware manifests through continuously changing physical connections, and through chemical and electrical influences spreading across a wide spectrum of supposedly unrelated areas. This interaction requires a new term that exceeds the definition of those individual terms.
The second abstraction is relegated to a higher conceptual level. If the human mind is analogized as software, then the first abstraction described above is the hardware. The process by which the mind and brain interact to produce the collection of experiences that we define as self-awareness is still seriously in question. Importantly, the intricate interaction between physical and mental realms is observable in many instances. The combination of these concepts is expressed in the term wetware.
-------------------------------------------------
Note: Will Evans is a software information architect for a risk modeling software company in Boston. Previously he was the information architect responsible for designing the Gather user experience. He has published articles about Information Architecture, User Experience, and Interaction Design. He has taught User Centered Design and Building Usable Enterprise Architectures to both small and large corporate audiences.
He enjoys publishing his musings, ideas, poetry and pre-Simulationist and post-modern critiques of modern culture and aesthetics. He drinks way too much coffee and needs more sleep but is really trying to change that.


Comments: 4
So, whereas before it was originally from "software" to imitation of neural nets and weak CGI by way of reverse engineering, the new computer scientists with faster computers, models based on actual intelligent adaptations of neural nets and the latest neuroanatomy can now go the other direction as Evans suggested in his cogent title, "from Wetware to Hardware."
The implications of these new models and their capacity to "map" the world as the brain does, not only by calculating in the same way neurons process visual input but by "adding feedback from the cognitive centers", with rules based on semantic context processing, are enormous, and Will has drawn them out for us: "automobile driver's assistance, visual search engines, biomedical imaging analysis or robots with realistic vision. It also has many potential applications for neuroscientists, to design augmented sensory prostheses for example. And the researchers are thinking about a commercial implementation of their technology."
I can see how this information can be used as well in sim world simulation, in routing encounters between individuals in imaginary space, for instance. I am brainstorming right now to try to imagine its application in the arts.
Above all, it shows the soundness of the pre-Sim view that Will Evans and I and a few others hold: that we must treat this new use of environmental intelligence and recognition, by machines that are being trained to simulate neural functioning, as an urgent issue in talking about what is today's "reality". What reality does this imply in a world where our own modalities of sensing it are being 'encoded' into the surround? What does that mean for our representation of it in the future?
What possible windows of human/machine interface will be opened, and more ominously, what possible gates can be erected against the individual in the name of faceless security?