Neuroscience

Computational vision

The new study, "A Feedforward Architecture Accounts for Rapid Categorization" (Serre, T., A. Oliva and T. Poggio, PNAS 2007, in press [not online yet]), reports the success of a computational vision system modeled on how the visual cortex recognizes objects immediately. The feedforward model is based on what the ventral stream processes in the first 100-200 milliseconds of exposure, before cognitive feedback loops kick in. It recognizes objects in a database of street scenes with reasonable accuracy and uses a learning algorithm to become better at categorizing new objects. In this study, the system was trained by exposure to images and then pitted against human vision; the two performed nearly the same, with over 90% accuracy for close-ups and 74% for distant views.

Thomas Serre, Tomaso Poggio and others at the Center for Biological and Computational Learning in the McGovern Institute, the Department of Brain and Cognitive Sciences, and the Computer Science and Artificial Intelligence Lab at MIT collaborated on the system. Another new paper, "Robust Object Recognition with Cortex-Like Mechanisms" (Serre et al., IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 29 No 3, March 2007 [free PDF]), describes how the system was developed. The feedforward model uses four layers and rests on the following principles:

- Visual processing is hierarchical, aiming to build invariance to position and scale first and then to viewpoint and other transformations.
- Along the hierarchy, the receptive fields of the neurons (i.e., the part of the visual field that could potentially elicit a response from the neuron) as well as the complexity of their optimal stimuli (i.e., the set of stimuli that elicit a response of the neuron) increases.
- The initial processing of information is feedforward (for immediate recognition tasks, i.e., when the image presentation is rapid and there is no time for eye movements or shifts of attention).
- Plasticity and learning probably occurs at all stages and certainly at the level of inferotemporal (IT) cortex and prefrontal cortex (PFC), the top-most layers of the hierarchy.
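
To make the first principle concrete, here is a minimal sketch of a single "simple then complex" stage: oriented filters followed by local max pooling, which is one common way to build tolerance to small position shifts. This is not the authors' released software. The function names (`gabor_kernel`, `simple_layer`, `complex_layer`) and every parameter value (filter size, wavelength, pooling width) are assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(size=7, wavelength=4.0, sigma=2.0, theta=0.0, gamma=0.5):
    """Build one oriented Gabor filter (illustrative parameter values)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the filter is tuned to orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    g = g * np.cos(2 * np.pi * xr / wavelength)
    g -= g.mean()                  # zero mean: flat regions give no response
    return g / np.linalg.norm(g)   # unit norm: comparable across orientations


def simple_layer(image, kernels):
    """Filter the image with each kernel (valid region only), keep magnitudes."""
    size = kernels[0].shape[0]
    patches = sliding_window_view(image, (size, size))
    return np.stack(
        [np.abs(np.einsum("ijkl,kl->ij", patches, k)) for k in kernels]
    )


def complex_layer(maps, pool=4):
    """Max-pool each map over non-overlapping pool x pool neighborhoods."""
    n, h, w = maps.shape
    h2, w2 = h // pool, w // pool
    trimmed = maps[:, :h2 * pool, :w2 * pool]
    return trimmed.reshape(n, h2, pool, w2, pool).max(axis=(2, 4))


# Toy input: a vertical bar, and the same bar shifted right by one pixel.
kernels = [gabor_kernel(theta=t) for t in np.arange(4) * np.pi / 4]
image_a = np.zeros((32, 32))
image_a[:, 12] = 1.0
image_b = np.zeros((32, 32))
image_b[:, 13] = 1.0

s1_a, s1_b = simple_layer(image_a, kernels), simple_layer(image_b, kernels)
c1_a, c1_b = complex_layer(s1_a), complex_layer(s1_b)
```

In this toy setup the one-pixel shift changes the filter maps (`s1_a` differs from `s1_b`), but the pooled response of the vertically tuned filter in the cell covering the bar (`c1_a[0, :, 2]` and `c1_b[0, :, 2]`) stays the same. Each pooling step also widens the region of the image that a unit responds to, which matches the second principle in the list.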

Poggio said, “We have not solved vision yet, but this model of immediate recognition may provide the skeleton of a theory of vision. The huge task in front of us is to incorporate into the model the effects of attention and top-down beliefs.”

Their next goal is to study the 200-300 milliseconds that follow the feedforward process of immediate recognition; a larger goal is to incorporate cognitive feedback loops. The feedforward model may ultimately be useful as a front end to more complex processing systems. Bigger implications:

This new study supports a long-held hypothesis that rapid categorization happens without any feedback from cognitive or other areas of the brain. The results also indicate that the model can help neuroscientists make predictions and drive new experiments to explore brain mechanisms involved in human visual perception, cognition, and behavior. Deciphering the relative contribution of feed-forward and feedback processing may eventually help explain neuropsychological disorders such as autism and schizophrenia. The model also bridges the gap between the world of artificial intelligence (AI) and neuroscience because it may lead to better artificial vision systems and augmented sensory prostheses.

Read more.
Download the open source software with StreetScenes dataset.
More research from the MIT CBCL lab.

x-posted to Omni Brain
