Li Yang



Document Type


Degree Name



Department of Computer Science and Electrical Engineering


Oregon Health & Science University


The research objective was to develop visual models and corresponding algorithms that automatically extract parts of novel objects from raw gray-scale images and, ultimately, to achieve rotation-invariant object recognition within a structural description framework. Because primate visual systems process visual information quickly and accurately, we used the primate brain as the inspiration for visual models with sparse, unsupervised learning algorithms. Sparse representation gave our visual models intrinsic fault tolerance and low-power operation compared with other computing paradigms. Unsupervised learning allowed the models to extract features of novel objects automatically from the statistical properties of the input images, without any explicit prior knowledge. Inspired by the primate visual ventral pathway, we developed several visual models in a hierarchical network architecture for low-level visual feature extraction (V1 and V2 models), parts-based shape representation (V4 model), and high-level object recognition (IT model). By using the world as its own representation and extracting information from it as necessary, through feature detectors based on the notion of cells' receptive fields, our visual models are biologically inspired and also computationally tractable. Our results show that these models can efficiently and adaptively process visual information with approximate transformation invariance. The low-level features extracted by the V1 and V2 models are very sparse but rich enough for further visual processing in higher-layer models such as V4 and IT. Under the sparse coding constraint, the V4 model combines unsupervised representation in the feed-forward stream with lateral interaction to achieve a stable, efficient, and natural representation of shapes.
Furthermore, we found that the V4 model cells display the same curvature and object-centered tuning as that reported for V4 cells in the primate visual ventral pathway. Based on the object parts output by the V4 model, we developed an IT model for recognizing objects from different viewing angles, in which objects are represented as flexible constellations of rigid parts. The IT model achieves very good object recognition results with approximate viewpoint invariance. The main contribution of this work is the biologically motivated integration of a number of existing approaches, e.g., unsupervised learning and sparse representation, into a hierarchical network architecture. These models yield better performance than many existing algorithms and represent biologically plausible mechanisms; they may therefore offer ideas for further exploring the mechanisms of visual information processing in both biological and robotic settings.




OGI School of Science and Engineering


