Haruo Hosoya
Tue 20 Mar 2018, 11:00 - 12:00
IF 4.31/4.33

If you have a question about this talk, please contact: Gareth Beedham (gbeedham)

Haruo Hosoya, Senior Researcher

Dept. Dynamic Brain Imaging

Cognitive Mechanisms Laboratories

ATR Institute International

Japan

 

Talk title:

A mixture of sparse coding models for holistic and parts-based face processing in the IT cortex

 

Abstract:

Experimental studies have revealed evidence of both parts-based and holistic representations of objects and faces in the primate visual system. However, it is not computationally obvious how such seemingly contradictory types of processing can coexist within a single system. Here, we propose a novel theory called the mixture of sparse coding models, inspired by the formation of category-specific subregions in the inferotemporal (IT) cortex. We developed a hierarchical network that constructed a mixture of two sparse coding submodels on top of a simple Gabor analysis. The submodels were trained with face images and non-face object images, respectively, which resulted in separate representations of facial parts and object parts. Evoked neural activities were modeled by Bayesian inference, which had a top-down explaining-away effect that enabled recognition of an individual part to depend strongly on the category of the whole input. Notably, the resulting model explained, qualitatively and quantitatively, almost all response properties reported by Freiwald, Tsao, and Livingstone (2009) on the middle patch of face processing in IT. Namely, our model units exhibited (1) significant selectivity to face images over object images, (2) tuning to only a small number of facial features that were often related to geometrically large parts, (3) preference and anti-preference for extreme facial features (e.g., very large/small inter-eye distance), (4) reduction of the gain of feature tuning for partial face stimuli compared to whole face stimuli, and (5) similarity of feature tuning between inverted and normal face stimuli. Not all of the above properties could be reproduced with a simple sparse coding model trained with face images or with a multi-layer perceptron trained to discriminate faces from objects. Thus, we hypothesize that the coding principle of facial features in the middle patch of face processing in the macaque IT cortex may be closely related to the mixture of sparse coding models.
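
For readers unfamiliar with the idea, the following is a minimal illustrative sketch of how a mixture of two sparse coding submodels might produce category-dependent part responses. It is not the speaker's implementation. Everything in it is an assumption made for illustration: the toy hand-made dictionaries (no Gabor stage and no learning), the ISTA inference routine, the softmax over submodel energies used as a stand-in for the Bayesian posterior, and the names ista, energy, mixture_responses, lam and temperature.

```python
import numpy as np

def ista(x, D, lam=0.1, n_iter=200):
    """Sparse code a minimising 0.5*||x - D a||^2 + lam*||a||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def energy(x, D, a, lam=0.1):
    """Sparse coding objective value for code a (lower means a better fit)."""
    r = x - D @ a
    return 0.5 * (r @ r) + lam * np.abs(a).sum()

def mixture_responses(x, dictionaries, lam=0.1, temperature=0.05):
    """Return (posterior over submodels, gated codes, raw codes).

    Assumption: the posterior is approximated by a softmax over negative
    energies, and each submodel's code is scaled by its posterior weight.
    This is a crude stand-in for the explaining-away effect, not the
    inference scheme presented in the talk.
    """
    raw = [ista(x, D, lam) for D in dictionaries]
    E = np.array([energy(x, D, a, lam) for D, a in zip(dictionaries, raw)])
    w = np.exp(-(E - E.min()) / temperature)
    post = w / w.sum()
    gated = [p * a for p, a in zip(post, raw)]
    return post, gated, raw

# Toy 4-dimensional input space (hypothetical "parts" as dictionary columns).
face_D = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0],
                   [0.0, 0.0]])
s = 1.0 / np.sqrt(2.0)
object_D = np.array([[s, 0.0],     # this object part overlaps a face part
                     [0.0, 0.0],
                     [s, 0.0],
                     [0.0, 1.0]])

x = np.array([1.0, 0.8, 0.0, 0.0])  # a "face-like" input
post, gated, raw = mixture_responses(x, [face_D, object_D])
print("posterior (face, object):", post)
print("raw object code:", raw[1], "gated object code:", gated[1])
```

In this toy setting the object unit that partially matches the input responds when its submodel is considered in isolation, but its gated response is suppressed because the face submodel explains the whole input better. This is the sense in which the response to an individual part can depend on the category of the whole input.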