Object Recognition: Computation & Neuroimaging

Psy8036-57061

or

Psy 8993-56020

University of Minnesota, Spring Semester, 2006
http://courses.kersten.org

Instructors:
Dan Kersten (kersten@umn.edu)
Eugene Bart (bart@ima.umn.edu)
Sheng He (sheng@umn.edu)

Meeting time: 8 to 9:45 am Tuesdays
Place: Elliott 204

The ability to visually recognize objects is important for a wide range of behavioral functions such as navigation, object manipulation, social interaction, mate selection, language, foraging, and avoiding danger. Although we are still far from a complete model of human object recognition, there is growing consensus regarding its overall computational architecture. Evidence from computational, behavioral, and neural studies suggests the following picture. Visual recognition begins with a fast feedforward process that extracts features (based on a sequence of spatial-temporal filtering operations, possibly over a sequence of cortical regions). These features serve to rapidly index or “propose” candidate object categories, such as “animal”, “car”, “face”, etc. For a typical natural image, it seems unlikely that complete, reliable object boundaries are extracted at this initial stage. Instead, depending on the confidence level required for specific task goals, feedback would be important to facilitate boundary and shape estimation, verify object decisions, refine categories, and initiate additional fixations. In this seminar we focus on the fast initial feedforward access to object categories. We will read recent literature and discuss current computational theories and neural mechanisms of object categorization. We will pay particular attention to computational models of object categorization that work with natural image input, and explore their possible relationships to studies of human cortical activity as measured using fMRI.


Format: Discussion of journal articles led by seminar members. Students who register for 3 credits will prepare a term paper or term project on a related topic.


Tentative Reading List



A. Computation

*Bart, E., & Ullman, S. Cross-generalization: learning novel classes from a single example by feature replacement. In Proc. CVPR, 2005.

*Bart, Evgeniy, Evgeny Byvatov, and Shimon Ullman (2004). View-invariant recognition using corresponding object fragments. In Proceedings of the European Conference on Computer Vision, Part II, pages 152–165.

Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147.

*Borenstein, E. and Ullman, S. 2002. Class-Specific, Top-Down Segmentation. In Proceedings of the 7th European Conference on Computer Vision-Part II (May 28 - 31, 2002).

*Epshtein, B., S. Ullman (2005), "Identifying Semantically Equivalent Object Fragments", CVPR.

*Fei-Fei, L., Fergus, R., and Perona, P. (2003). A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories. In Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2 (October 13 - 16, 2003). ICCV. IEEE Computer Society, Washington, DC, 1134.

*Krempp, S, D. Geman and Y. Amit, (2002) "Sequential learning with reusable parts for object detection," Technical Report, 2002.

*Kobi Levi, Michael Fink and Yair Weiss (2004). Learning From a Small Number of Training Examples by Exploiting Object Categories, Workshop of Learning in Computer Vision (LCVPR).

Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568.

Liu, Z., & Kersten, D. (1998). 2D observers for human 3D object recognition? Vision Res, 38(15-16), 2507-2519.

Miller, Erik, Nicholas Matsakis, and Paul Viola. (2000) Learning from One Example Through Shared Densities on Transforms. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition.

Palmeri, T. J., Wong, A. C., & Gauthier, I. (2004). Computational approaches to the development of perceptual expertise. Trends Cogn Sci, 8(8), 378-386.

*Sali, Erez and Shimon Ullman (1999). Combining class-specific fragments for object recognition. In British Machine Vision Conference, pages 203–213.

Tanaka, J. W., & Taylor, M. (1991). Object Categories and Expertise: Is the Basic Level in the Eye of the Beholder? Cognitive Psychology, 23, 457-482.

Tarr, M. J., & Cheng, Y. D. (2003). Learning to see faces and objects. Trends Cogn Sci, 7(1), 23-30.

Tarr, M. J., & Gauthier, I. (2000). FFA: a flexible fusiform area for subordinate-level visual processing automatized by expertise. Nat Neurosci, 3(8), 764-769.

*Tarr, M. J., & Gauthier, I. (1998). Do viewpoint-dependent mechanisms generalize across members of a class? Cognition, 67(1-2), 73-110.

Tarr, M. J., & Bulthoff, H. H. (1998). Image-based object recognition in man, monkey and machine. Cognition, 67(1-2), 1-20.

Theoret, H., Merabet, L., & Pascual-Leone, A. (2004). Behavioral and neuroplastic changes in the blind: evidence for functionally relevant cross-modal interactions. J Physiol Paris, 98(1-3), 221-233.

Tenenbaum JB (1999) Bayesian modeling of human concept learning. In: Advances in Neural Information Processing Systems (Kearns MS, Solla SA, Cohn DA, eds). Cambridge, MA: MIT Press.

Tenenbaum JB, Griffiths TL (2001) Generalization, similarity, and Bayesian inference. Behav Brain Sci 24:629-640; discussion 652-791.

*Torralba, A., K. P. Murphy and W. T. Freeman. (2004a). Sharing features: efficient boosting procedures for multiclass object detection. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). pp. 762-769.

*Torralba, A., K. P. Murphy and W. T. Freeman. (2004b). Sharing Visual Features for Multiclass and Multiview Object Detection. MIT AI Lab Memo AIM-2004-008.

*Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nat Neurosci, 5(7), 682-687.

VanRullen, R., & Thorpe, S. J. (2001). Is it a bird? Is it a plane? Ultra-rapid visual categorisation of natural and artifactual objects. Perception, 30(6), 655-668.

*Weber M., M. Welling and P. Perona (2000) Unsupervised Learning of Models for Recognition. Proc. 6th Europ. Conf. Comp. Vis., ECCV2000, Dublin, Ireland, June 2000.

*Weber, M., M. Welling, and P. Perona. Towards automatic discovery of object categories. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 101–109, 2002.

B. Neuroimaging and behavior

Avidan G, Harel M, Hendler T, Ben-Bashat D, Zohary E, Malach R. Contrast sensitivity in human visual areas and its relationship to object recognition. J Neurophysiol. 2002 Jun;87(6):3102-16.

Biederman, I. (2000). Recognizing depth-rotated objects: a review of recent research and theory. Spat Vis, 13(2-3), 241-253.

Carlson TA, Schrater P, He S. Patterns of activity in the categorical representations of objects. J Cogn Neurosci. 2003 Jul 1;15(5):704-17.

Hochstein S, Ahissar M. View from the top: hierarchies and reverse hierarchies in the visual system. Neuron. 2002 Dec 5;36(5):791-804.

Malach R, Levy I, Hasson U. The topography of high-order human object areas. Trends Cogn Sci. 2002 Apr 1;6(4):176-184.

Grill-Spector K. The neural basis of object perception. Curr Opin Neurobiol. 2003 Apr;13(2):159-66.

Hanson SJ, Matsuka T, Haxby JV. Combinatorial codes in ventral temporal lobe for object recognition: Haxby (2001) revisited: is there a "face" area? Neuroimage. 2004 Sep;23(1):156-66.

Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, Pietrini P. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science. 2001 Sep 28;293(5539):2425-30.

James TW, Humphrey GK, Gati JS, Menon RS, Goodale MA. Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron. 2002 Aug 15;35(4):793-801.

Fang F, He S. Viewer-centered object representation in the human visual system revealed by viewpoint aftereffects. Neuron. 2005 Mar 3;45(5):793-800.

Fang F, He S. Cortical responses to invisible objects in the human dorsal and ventral pathways. Nat Neurosci. 2005 Oct;8(10):1380-5.

Friston, K. (2005). A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci, 360(1456), 815-836.

Riesenhuber, M., & Poggio, T. (2002). Neural mechanisms of object recognition. Curr Opin Neurobiol, 12(2), 162-168.

Rousselet, G. A., Fabre-Thorpe, M., & Thorpe, S. J. (2002). Parallel processing in high-level categorization of natural images. Nat Neurosci, 5(7), 629-630.

Serre, T., L. Wolf and T. Poggio. (2005) Object Recognition with Features Inspired by Visual Cortex. In: Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society Press, San Diego, June 2.

Shmuelof L, Zohary E. Dissociation between ventral and dorsal fMRI activation during object and action recognition. Neuron. 2005 Aug 4;47(3):457-70.