Computational Theories of Visual Cortex

Graduate Seminar, Psy 8036


University of Minnesota, Spring Semester, 2012

http://courses.kersten.org

Instructors: Dan Kersten (kersten@umn.edu), Damien Mannion (dmannion@umn.edu)

Meeting time: 3:00 to 4:30, Tuesdays (first meeting is Tuesday, Jan 17th, 2012). Place: Elliott S204

Human visual decisions are believed to be based on a hierarchical organization of stages through which image information is successively transformed from a high-dimensional set of local feature measurements with a small number of types (e.g. edges at many locations) to increasingly lower-dimensional representations of many types (e.g. dog, car, ...). Functional utility requires selecting task-relevant information while discounting confounding factors. For example, decisions requiring object recognition involve pathways in the hierarchy in which representations become increasingly selective for specific pattern types (e.g. boundaries, textures, shapes, parts, objects), together with increased invariance to transformations such as translation, scale, and illumination. Computer vision architectures for object recognition and parsing, as well as models of the primate ventral visual stream, are consistent with this hierarchical view of visual processing. However, this view belies the enormous flexibility and adaptability of human vision. The visual system is able to engage in an unlimited variety of tasks, including estimating the “confounding” factors when required. This ability would seem to require the rapid construction of programs or routines that control the flow of visual information. This seminar will examine recent ideas relevant to understanding flexible and dynamic routines in vision. Examples include top-down selection and grouping of features depending on task, learning and recognition of transformations, integrating information over time, and inferring object relationships. We will study the implications of attention, learning, and memory for early visual areas and object processing. There will be a strong emphasis on hypotheses testable using behavioral, neuroimaging, and electrophysiological methods.
The class format will consist of short lectures to provide overviews of upcoming themes, followed by discussion of journal articles led by seminar participants. Students will prepare a final term paper or computer project on a related topic.

Themes

1 . Tasks: What do we do with our eyes?

*Fei-Fei, L., Iyer, A., Koch, C., and Perona, P. (2007). What do we perceive in a glance of a real-world scene? Journal of Vision, 7(1):10–10

#*Land, M. F. (2009). Vision, eye movements, and natural behavior. Visual Neuroscience, 26(01):51

*Donahue, J. and Grauman, K. Annotator Rationales for Visual Recognition. Proceedings of the International Conference on Computer Vision (ICCV)

Parikh, D. and Grauman, K. (2011). Relative Attributes. Proceedings of the International Conference on Computer Vision (ICCV)

Hayhoe, M. and Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9(4):188–194

2 . Generative models: What is the causal structure of images?

#*Schwartz, O., Sejnowski, T. J., and Dayan, P. (2009). Perceptual organization in the tilt illusion. Journal of Vision, 9(4):19.1–20

*Tappen, M., Freeman, W., and Adelson, E. (2005). Recovering intrinsic images from a single image. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 27(9):1459–1472

#Zhu, S. (2003). Statistical modeling and conceptualization of visual patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 25(6):691–712

2.1 . Neural representation of visual information: Measurement and meaning

Logothetis, N. K. (2008). What we can do and what we cannot do with fMRI. Nature, 453(7197):869–878

Guillery, R. W. and Sherman, S. M. (2002). Thalamic relay functions and their role in corticocortical communication: generalizations from the visual system. Neuron, 33(2):163–175

#Grill-Spector, K. and Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27(1):649–677

#*Green, C. S., Pouget, A., and Bavelier, D. (2010). Improved probabilistic inference as a general learning mechanism with action video games. Current Biology, 20(17):1573–1579

Ma, W. J. and Pouget, A. (2006). Bayesian inference with probabilistic population codes. Nature Neuroscience, 9(11):1432–1438

#Lennie, P. (1998). Single units and visual cortical organization. Perception, 27:889–936

Blake, R. (1995). Psychoanatomical strategies of studying human visual perception. In Early vision and beyond. MIT Press

#Adelson, E. H. and Bergen, J. R. (1991). The plenoptic function and the elements of early vision. In Landy, M. S. and Movshon, J. A., editors, Computational models of visual processing. MIT Press, Cambridge, MA

Callaway, E. (1998). Local circuits in primary visual cortex of the macaque monkey. Annual Review of Neuroscience, 21:47–74

#Graf, A. B. A., Kohn, A., Jazayeri, M., and Movshon, J. A. (2011). Decoding the activity of neuronal populations in macaque primary visual cortex. Nature Neuroscience, 14(2):239–245

3 . Bottom-up models

3.1 . Bottom-up models: Recognition

Riesenhuber, M. and Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2:1019–1025

#Thorpe, S., Fize, D., and Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381(6582):520–522

#*Poggio, T. (2011). The Computational Magic of the Ventral Stream: Towards a Theory. Nature Precedings

*Zeiler, M. D., Krishnan, D., Taylor, G. W., and Fergus, R. (2010). Deconvolutional Networks. In 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2528–2535. IEEE

Leibo, J., Mutch, J., and Rosasco, L. (2010). Learning Generic Invariances in Object Recognition: Translation and Scale

*Serre, T., Oliva, A., and Poggio, T. (2007). A Feedforward Architecture Accounts for Rapid Categorization. Proceedings of the National Academy of Sciences of the United States of America, 104(15):6424–6429

Alpert, S., Galun, M., Brandt, A., and Basri, R. (2011). Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. IEEE Transactions on Pattern Analysis and Machine Intelligence

#*Sharon, E. (2006). Hierarchy and adaptivity in segmenting visual scenes. Nature, 442(7104):810–813

Peelen, M. V., Fei-Fei, L., and Kastner, S. (2009). Neural mechanisms of rapid natural scene categorization in human visual cortex. Nature, 460(7251):94–97

3.2 . Bottom-up models: Saliency

Li, Z. (1997). Primary cortical dynamics for visual grouping. Presented at the "Theoretical Aspects of Neural Computation" workshop, May 1997, Hong Kong University of Science and Technology. Published in Wong, K. M., King, I., and Yeung, D. Y., editors, Theoretical Aspects of Neural Computation, pages 155–164. Springer-Verlag, January 1998

#*Soltani, A. and Koch, C. (2010). Visual saliency computations: mechanisms, constraints, and the effect of feedback. Journal of Neuroscience, 30(38):12831–12843

#*Zhang, X., Zhaoping, L., Zhou, T., and Fang, F. (2012). Neural Activities in V1 Create a Bottom-Up Saliency Map. Neuron, 73(1):183–192

Spratling, M. W. (2011). Predictive coding as a model of the V1 saliency map hypothesis. Neural Networks

#*Spratling, M. W. (2010). Predictive coding as a model of response properties in cortical area V1. Journal of Neuroscience, 30(9):3531–3543

Itti, L. and Baldi, P. (2009). Bayesian surprise attracts human attention. Vision Research, 49(10):1295–1306

*Tatler, B. W., Hayhoe, M. M., Land, M. F., and Ballard, D. (2011). Eye guidance in natural vision: Reinterpreting salience. Journal of Vision, 11(5):5–5

*Zhang, L., Tong, M. H., Marks, T. K., Shan, H., and Cottrell, G. W. (2008). SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision, 8(7):32–32

4 . Top-down models

#Yuille, A. and Kersten, D. (2006). Vision as Bayesian inference: analysis by synthesis? Trends in Cognitive Sciences, 10(7):301–308

Ullman, S. (1995). Sequence seeking and counter streams: a computational model for bidirectional information flow in the visual cortex. Cerebral Cortex, 5(1):1–11

Lauritzen, S. and Spiegelhalter, D. (1988). Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems. Journal of the Royal Statistical Society. Series B (Methodological), 50(2):157–224

#Mumford, D. (1992). On the computational architecture of the neocortex. Biological Cybernetics, 66(3):241–251

#*Epshtein, B., Lifshitz, I., and Ullman, S. (2008). Image interpretation by a single bottom-up top-down cycle. Proceedings of the National Academy of Sciences of the United States of America, 105(38):14298

Rao, R. P. N. and Ballard, D. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2:79–87

Jehee, J. F. M., Rothkopf, C., Beck, J., and Ballard, D. (2006). Learning receptive fields using predictive feedback. Journal of Physiology-Paris, 100(1-3):125–132

*Hinton, G. (2009). Learning to represent visual input. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1537):177–184

#*Lamme, V. and Roelfsema, P. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23(11):571–579

*Ringach, D. L. (2009). Spontaneous and driven cortical activity: implications for computation. Current Opinion in Neurobiology, 19(4):439–444

Tu, Z., Chen, X., Yuille, A., and Zhu, S. (2005). Image parsing: Unifying segmentation, detection, and recognition. International Journal of Computer Vision, 63(2):113–140

*von der Heydt, R. (2002). Image Parsing Mechanisms of the Visual Cortex. pages 1–25

4.1 . Top-down models: “Shutup” or “Stop gossiping”?

#*Cardin, V., Friston, K. J., and Zeki, S. (2011). Top-down Modulations in the Visual Form Pathway Revealed with Dynamic Causal Modeling. Cerebral Cortex, 21(3):550–562

*Cichy, R. M., Heinzle, J., and Haynes, J. D. (2011). Imagery and Perception Share Cortical Representations of Content and Location. Cerebral Cortex

*Hsieh, P. J., Vul, E., and Kanwisher, N. (2010). Recognition alters the spatial pattern of fMRI activation in early retinotopic cortex. Journal of Neurophysiology, 103(3):1501–1507

*Alink, A., Schwiedrzik, C. M., Kohler, A., Singer, W., and Muckli, L. (2010). Stimulus Predictability Reduces Responses in Primary Visual Cortex. Journal of Neuroscience, 30(8):2960–2966

Furl, N., van Rijsbergen, N. J., Kiebel, S. J., Friston, K. J., Treves, A., and Dolan, R. J. (2010). Modulation of Perception and Brain Activity by Predictable Trajectories of Facial Expressions. Cerebral Cortex, 20(3):694–703

Vanni, S. and Rosenström, T. (2010). Local non-linear interactions in the visual cortex may reflect global decorrelation. Journal of Computational Neuroscience, 30(1):109–124

#Ban, H., Yamamoto, H., Fukunaga, M., Nakagoshi, A., Umeda, M., Tanaka, C., and Ejima, Y. (2006). Toward a common circle: interhemispheric contextual modulation in human early visual areas. J. Neurosci., 26(34):8804–8809

#Lee, T. S. and Mumford, D. (2003). Hierarchical Bayesian inference in the visual cortex. J Opt Soc Am A Opt Image Sci Vis, 20(7):1434–1448

#Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1456):815–836

4.2 . Top-down models: Tasks

*McManus, J. N. J., Li, W., and Gilbert, C. D. (2011). Adaptive shape processing in primary visual cortex. Proceedings of the National Academy of Sciences, 108(24):9739–9746

#*McMains, S. and Kastner, S. (2011). Interactions of Top-Down and Bottom-Up Mechanisms in Human Visual Cortex. Journal of Neuroscience, 31(2):587–597

*Williams, M. A., Baker, C. I., Op de Beeck, H. P., Shim, W. M., Dang, S., Triantafyllou, C., and Kanwisher, N. (2008). Feedback of visual object information to foveal retinotopic cortex. Nature Neuroscience, 11(12):1439–1445

*Harrison, S. A. and Tong, F. (2009). Decoding reveals the contents of visual working memory in early visual areas. Nature, 458(7238):632–635

Fang, F., Boyaci, H., Kersten, D., and Murray, S. O. (2008). Attention-dependent representation of a size illusion in human V1. Current Biology, 18(21):1707–1712

Fang, F., Boyaci, H., and Kersten, D. (2009). Border ownership selectivity in human early visual cortex and its modulation by attention. Journal of Neuroscience, 29(2):460–465

Kaas, A., Weigelt, S., Roebroeck, A., Kohler, A., and Muckli, L. (2010). Imagery of a moving object: The role of occipital cortex and human MT/V5+. NeuroImage, 49(1):794–804

*Smith, F. W. and Muckli, L. (2010). Nonstimulated early visual areas carry information about surrounding context. Proceedings of the National Academy of Sciences, 107(46):20099–20103

Weidner, R., Krummenacher, J., Reimann, B., Müller, H. J., and Fink, G. R. (2009). Sources of top-down control in visual search. Journal of Cognitive Neuroscience, 21(11):2100–2113

#*Neri, P. (2011). Global properties of natural scenes shape local properties of human edge detectors. i-Perception

Gilbert, C. D. and Sigman, M. (2007). Brain States: Top-Down Influences in Sensory Processing. Neuron, 54(5):677–696

*Chen, J., Zhou, T., Yang, H., and Fang, F. (2010). Cortical Dynamics Underlying Face Completion in Human Visual System. Journal of Neuroscience, 30(49):16692–16698

Rothkopf, C. A. and Ballard, D. H. (2009). Image statistics at the point of gaze during human navigation. Visual Neuroscience, 26(01):81

#*Egner, T., Monti, J. M., and Summerfield, C. (2010). Expectation and Surprise Determine Neural Population Responses in the Ventral Visual Stream. Journal of Neuroscience, 30(49):16601–16608

#Li, W., Pich, V., and Gilbert, C. D. (2004). Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci, 7(6):651–657

5 . Structures, processes & routines

Lee, T. S. and Yuille, A. L. (2006). Efficient coding of visual scenes by grouping and segmentation: theoretical predictions and biological evidence. In Doya, K., Ishii, S., Pouget, A., and Rao, R. P., editors, Bayesian Brain: Probabilistic Approaches to Neural Coding, pages 1–29

*Ullman, S. (1984). Visual routines. Cognition, 18(1-3):97–159

5.1 . Structures, processes & routines: Representation

Kovacs, I. and Julesz, B. (1993). A Closed Curve Is Much More Than an Incomplete One - Effect of Closure in Figure Ground Segmentation. Proceedings of the National Academy of Sciences of the United States of America, 90(16):7495–7497

#*Kimia, B. B. (2003). On the role of medial geometry in human vision. Journal of Physiology-Paris, 97(2-3):155–190

#Roelfsema, P. R. (2006). Cortical algorithms for perceptual grouping. Annu Rev Neurosci, 29:203–227

#Connor, C. E., Brincat, S. L., and Pasupathy, A. (2007). Transformation of shape information in the ventral pathway. Curr Opin Neurobiol, 17(2):140–147

5.2 . Structures, processes & routines: Filling-in

#*Anderson, B. L., O’Vari, J., and Barth, H. (2011). Non-Bayesian Contour Synthesis. Current Biology, 21(6):492–496

Boyaci, H., Fang, F., Murray, S. O., and Kersten, D. (2010). Perceptual grouping-dependent lightness processing in human early visual cortex. Journal of Vision, 10(9):1–12

Davey, M., Maddess, T., and Srinivasan, M. (1998). The spatiotemporal properties of the Craik-O’Brien-Cornsweet effect are consistent with ’filling-in’. Vision Research, 38(13):2037–2046

*Imber, M. L., Shapley, R. M., and Rubin, N. (2005). Differences in real and illusory shape perception revealed by backward masking. Vision Research, 45(1):91–102

*Nishina, S., Okada, M., and Kawato, M. (2003). Spatio-temporal dynamics of depth propagation on uniform region. Vision Research, 43(24):2493–2503

#*Dakin, S. C. and Bex, P. J. (2003). Natural image statistics mediate brightness ’filling in’. Proceedings of the Royal Society B: Biological Sciences, 270(1531):2341–2348

*Roe, A. W., Lu, H. D., Hung, C. P., and Kaas, J. H. (2005). Cortical Processing of a Brightness Illusion. Proceedings of the National Academy of Sciences of the United States of America, 102(10):3869–3874

Haynes, J. D., Lotto, R., and Rees, G. (2004). Responses of human visual cortex to uniform surfaces. Proceedings of the National Academy of Sciences of the United States of America, 101(12):4286

Koch, C., Marroquin, J., and Yuille, A. (1986). Analog “neuronal” networks in early vision. Proceedings of the National Academy of Sciences of the United States of America, 83(12):4263–4267

#Roach, N. W., McGraw, P. V., and Johnston, A. (2011). Visual motion induces a forward prediction of spatial pattern. Curr Biol, 21(9):740–745

5.3 . Structures, processes & routines: Transformations and relations

*Memisevic, R. and Hinton, G. (2007). Unsupervised learning of image transformations. Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–8

Bergmann, U. and von der Malsburg, C. (2011). Self-organization of topographic bilinear networks for invariant recognition. Neural Computation, pages 1–28

Grimes, D. and Rao, R. P. N. (2003). A bilinear model for sparse coding. Advances in neural information processing systems, pages 1311–1318

*Tenenbaum, J. B. and Freeman, W. (2000). Separating style and content with bilinear models. Neural Computation, 12(6):1247–1283

*Geman, S. (2006). Invariance and selectivity in the ventral visual pathway. Journal of Physiology-Paris

5.4 . Structures, processes & routines: Programs

#*Olshausen, B. A., Anderson, C. H., and Van Essen, D. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13(11):4700–4719

*Ommer, B. and Buhmann, J. M. (2007). Learning the compositional nature of visual objects. Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–8

*Jin, Y. and Geman, S. (2006). Context and hierarchy in a probabilistic image model. Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, 2:2145–2152

#Roelfsema, P., Lamme, V. A., and Spekreijse, H. (2000). The implementation of visual routines. Vision Research, 40(10-12):1385–1411

#Zylberberg, A., Dehaene, S., Roelfsema, P. R., and Sigman, M. (2011). The human Turing machine: a neural framework for mental programs. Trends in Cognitive Sciences, 15(7):293–300

6 . Structures, processes & routines: Learning

*Roelfsema, P. R., van Ooyen, A., and Watanabe, T. (2010). Perceptual learning rules based on reinforcers and attention. Trends in Cognitive Sciences, 14(2):64–71

*Kahnt, T., Grueschow, M., Speck, O., and Haynes, J.-D. (2011). Perceptual Learning and Decision-Making in Human Medial Frontal Cortex. Neuron, 70(3):549–559

*Zhu, L. L., Chen, Y., Torralba, A., Freeman, W., and Yuille, A. (2010). Part and Appearance Sharing: Recursive Compositional Models for Multi-View Multi-Object Detection. pages 1–8

*Song, Y., Hu, S., Li, W., and Liu, J. (2010). The Role of Top-Down Task Context in Learning to Perceive Objects. Journal of Neuroscience, 30(29):9869–9876

#*Kourtzi, Z. and Connor, C. E. (2011). Neural Representations for Object Perception: Structure, Category, and Adaptive Coding. Annual Review of Neuroscience, 34(1):45–67

*Fleuret, F., Li, T., Dubout, C., Wampler, E. K., Yantis, S., and Geman, D. (2011). Comparing machines and humans on a visual categorization test. Proceedings of the National Academy of Sciences

#*Shibata, K., Watanabe, T., Sasaki, Y., and Kawato, M. (2011). Perceptual Learning Incepted by Decoded fMRI Neurofeedback Without Stimulus Presentation. Science, 334(6061):1413–1415

#Fiser, J. (2009). Perceptual learning and representational learning in humans and animals. Learn Behav, 37(2):141–153

References

[Adelson and Bergen, 1991]   Adelson, E. H. and Bergen, J. R. (1991). The plenoptic function and the elements of early vision. In Landy, M. S. and Movshon, J. A., editors, Computational models of visual processing. MIT Press, Cambridge, MA.

[Alink et al., 2010]   Alink, A., Schwiedrzik, C. M., Kohler, A., Singer, W., and Muckli, L. (2010). Stimulus Predictability Reduces Responses in Primary Visual Cortex. Journal of Neuroscience, 30(8):2960–2966.

[Alpert et al., 2011]   Alpert, S., Galun, M., Brandt, A., and Basri, R. (2011). Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[Anderson et al., 2011]   Anderson, B. L., O’Vari, J., and Barth, H. (2011). Non-Bayesian Contour Synthesis. Current Biology, 21(6):492–496.

[Ban et al., 2006]   Ban, H., Yamamoto, H., Fukunaga, M., Nakagoshi, A., Umeda, M., Tanaka, C., and Ejima, Y. (2006). Toward a common circle: interhemispheric contextual modulation in human early visual areas. J. Neurosci., 26(34):8804–8809.

[Bergmann and von der Malsburg, 2011]   Bergmann, U. and von der Malsburg, C. (2011). Self-organization of topographic bilinear networks for invariant recognition. Neural Computation, pages 1–28.

[Blake, 1995]   Blake, R. (1995). Psychoanatomical strategies of studying human visual perception. In Early vision and beyond. MIT Press.

[Boyaci et al., 2010]   Boyaci, H., Fang, F., Murray, S. O., and Kersten, D. (2010). Perceptual grouping-dependent lightness processing in human early visual cortex. Journal of Vision, 10(9):1–12.

[Callaway, 1998]   Callaway, E. (1998). Local circuits in primary visual cortex of the macaque monkey. Annual Review of Neuroscience, 21:47–74.

[Cardin et al., 2011]   Cardin, V., Friston, K. J., and Zeki, S. (2011). Top-down Modulations in the Visual Form Pathway Revealed with Dynamic Causal Modeling. Cerebral Cortex, 21(3):550–562.

[Chen et al., 2010]   Chen, J., Zhou, T., Yang, H., and Fang, F. (2010). Cortical Dynamics Underlying Face Completion in Human Visual System. Journal of Neuroscience, 30(49):16692–16698.

[Cichy et al., 2011]   Cichy, R. M., Heinzle, J., and Haynes, J. D. (2011). Imagery and Perception Share Cortical Representations of Content and Location. Cerebral Cortex.

[Connor et al., 2007]   Connor, C. E., Brincat, S. L., and Pasupathy, A. (2007). Transformation of shape information in the ventral pathway. Curr Opin Neurobiol, 17(2):140–147.

[Dakin and Bex, 2003]   Dakin, S. C. and Bex, P. J. (2003). Natural image statistics mediate brightness ’filling in’. Proceedings of the Royal Society B: Biological Sciences, 270(1531):2341–2348.

[Davey et al., 1998]   Davey, M., Maddess, T., and Srinivasan, M. (1998). The spatiotemporal properties of the Craik-O’Brien-Cornsweet effect are consistent with ’filling-in’. Vision Research, 38(13):2037–2046.

[Donahue and Grauman, ]   Donahue, J. and Grauman, K. Annotator Rationales for Visual Recognition. Proceedings of the International Conference on Computer Vision (ICCV).

[Egner et al., 2010]   Egner, T., Monti, J. M., and Summerfield, C. (2010). Expectation and Surprise Determine Neural Population Responses in the Ventral Visual Stream. Journal of Neuroscience, 30(49):16601–16608.

[Epshtein et al., 2008]   Epshtein, B., Lifshitz, I., and Ullman, S. (2008). Image interpretation by a single bottom-up top-down cycle. Proceedings of the National Academy of Sciences of the United States of America, 105(38):14298.

[Fang et al., 2009]   Fang, F., Boyaci, H., and Kersten, D. (2009). Border ownership selectivity in human early visual cortex and its modulation by attention. Journal of Neuroscience, 29(2):460–465.

[Fang et al., 2008]   Fang, F., Boyaci, H., Kersten, D., and Murray, S. O. (2008). Attention-dependent representation of a size illusion in human V1. Current Biology, 18(21):1707–1712.

[Fei-Fei et al., 2007]   Fei-Fei, L., Iyer, A., Koch, C., and Perona, P. (2007). What do we perceive in a glance of a real-world scene? Journal of Vision, 7(1):10–10.

[Fiser, 2009]   Fiser, J. (2009). Perceptual learning and representational learning in humans and animals. Learn Behav, 37(2):141–153.

[Fleuret et al., 2011]   Fleuret, F., Li, T., Dubout, C., Wampler, E. K., Yantis, S., and Geman, D. (2011). Comparing machines and humans on a visual categorization test. Proceedings of the National Academy of Sciences.

[Friston, 2005]   Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1456):815–836.

[Furl et al., 2010]   Furl, N., van Rijsbergen, N. J., Kiebel, S. J., Friston, K. J., Treves, A., and Dolan, R. J. (2010). Modulation of Perception and Brain Activity by Predictable Trajectories of Facial Expressions. Cerebral Cortex, 20(3):694–703.

[Geman, 2006]   Geman, S. (2006). Invariance and selectivity in the ventral visual pathway. Journal of Physiology-Paris.

[Gilbert and Sigman, 2007]   Gilbert, C. D. and Sigman, M. (2007). Brain States: Top-Down Influences in Sensory Processing. Neuron, 54(5):677–696.

[Graf et al., 2011]   Graf, A. B. A., Kohn, A., Jazayeri, M., and Movshon, J. A. (2011). Decoding the activity of neuronal populations in macaque primary visual cortex. Nature Neuroscience, 14(2):239–245.

[Green et al., 2010]   Green, C. S., Pouget, A., and Bavelier, D. (2010). Improved probabilistic inference as a general learning mechanism with action video games. Current Biology, 20(17):1573–1579.

[Grill-Spector and Malach, 2004]   Grill-Spector, K. and Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27(1):649–677.

[Grimes and Rao, 2003]   Grimes, D. and Rao, R. P. N. (2003). A bilinear model for sparse coding. Advances in neural information processing systems, pages 1311–1318.

[Guillery and Sherman, 2002]   Guillery, R. W. and Sherman, S. M. (2002). Thalamic relay functions and their role in corticocortical communication: generalizations from the visual system. Neuron, 33(2):163–175.

[Harrison and Tong, 2009]   Harrison, S. A. and Tong, F. (2009). Decoding reveals the contents of visual working memory in early visual areas. Nature, 458(7238):632–635.

[Hayhoe and Ballard, 2005]   Hayhoe, M. and Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9(4):188–194.

[Haynes et al., 2004]   Haynes, J. D., Lotto, R., and Rees, G. (2004). Responses of human visual cortex to uniform surfaces. Proceedings of the National Academy of Sciences of the United States of America, 101(12):4286.

[Hinton, 2009]   Hinton, G. (2009). Learning to represent visual input. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1537):177–184.

[Hsieh et al., 2010]   Hsieh, P. J., Vul, E., and Kanwisher, N. (2010). Recognition alters the spatial pattern of fMRI activation in early retinotopic cortex. Journal of Neurophysiology, 103(3):1501–1507.

[Imber et al., 2005]   Imber, M. L., Shapley, R. M., and Rubin, N. (2005). Differences in real and illusory shape perception revealed by backward masking. Vision Research, 45(1):91–102.

[Itti and Baldi, 2009]   Itti, L. and Baldi, P. (2009). Bayesian surprise attracts human attention. Vision Research, 49(10):1295–1306.

[Jehee et al., 2006]   Jehee, J. F. M., Rothkopf, C., Beck, J., and Ballard, D. (2006). Learning receptive fields using predictive feedback. Journal of Physiology-Paris, 100(1-3):125–132.

[Jin and Geman, 2006]   Jin, Y. and Geman, S. (2006). Context and hierarchy in a probabilistic image model. Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, 2:2145–2152.

[Kaas et al., 2010]   Kaas, A., Weigelt, S., Roebroeck, A., Kohler, A., and Muckli, L. (2010). Imagery of a moving object: The role of occipital cortex and human MT/V5+. NeuroImage, 49(1):794–804.

[Kahnt et al., 2011]   Kahnt, T., Grueschow, M., Speck, O., and Haynes, J.-D. (2011). Perceptual Learning and Decision-Making in Human Medial Frontal Cortex. Neuron, 70(3):549–559.

[Kimia, 2003]   Kimia, B. B. (2003). On the role of medial geometry in human vision. Journal of Physiology-Paris, 97(2-3):155–190.

[Koch et al., 1986]   Koch, C., Marroquin, J., and Yuille, A. (1986). Analog “neuronal” networks in early vision. Proceedings of the National Academy of Sciences of the United States of America, 83(12):4263–4267.

[Kourtzi and Connor, 2011]   Kourtzi, Z. and Connor, C. E. (2011). Neural Representations for Object Perception: Structure, Category, and Adaptive Coding. Annual Review of Neuroscience, 34(1):45–67.

[Kovacs and Julesz, 1993]   Kovacs, I. and Julesz, B. (1993). A Closed Curve Is Much More Than an Incomplete One - Effect of Closure in Figure Ground Segmentation. Proceedings of the National Academy of Sciences of the United States of America, 90(16):7495–7497.

[Lamme and Roelfsema, 2000]   Lamme, V. and Roelfsema, P. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23(11):571–579.

[Land, 2009]   Land, M. F. (2009). Vision, eye movements, and natural behavior. Visual Neuroscience, 26(01):51.

[Lauritzen and Spiegelhalter, 1988]   Lauritzen, S. and Spiegelhalter, D. (1988). Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems. Journal of the Royal Statistical Society. Series B (Methodological), 50(2):157–224.

[Lee and Mumford, 2003]   Lee, T. S. and Mumford, D. (2003). Hierarchical Bayesian inference in the visual cortex. J Opt Soc Am A Opt Image Sci Vis, 20(7):1434–1448.

[Lee and Yuille, 2006]   Lee, T. S. and Yuille, A. L. (2006). Efficient coding of visual scenes by grouping and segmentation: theoretical predictions and biological evidence. In Doya, K., Ishii, S., Pouget, A., and Rao, R. P., editors, Bayesian Brain: Probabilistic Approaches to Neural Coding, pages 1–29.

[Leibo et al., 2010]   Leibo, J., Mutch, J., and Rosasco, L. (2010). Learning Generic Invariances in Object Recognition: Translation and Scale.

[Lennie, 1998]   Lennie, P. (1998). Single units and visual cortical organization. Perception, 27:889–936.

[Li et al., 2004]   Li, W., Pich, V., and Gilbert, C. D. (2004). Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci, 7(6):651–657.

[Logothetis, 2008]   Logothetis, N. K. (2008). What we can do and what we cannot do with fMRI. Nature, 453(7197):869–878.

[Ma and Pouget, 2006]   Ma, W. J. and Pouget, A. (2006). Bayesian inference with probabilistic population codes. Nature Neuroscience, 9(11):1432–1438.

[McMains and Kastner, 2011]   McMains, S. and Kastner, S. (2011). Interactions of Top-Down and Bottom-Up Mechanisms in Human Visual Cortex. Journal of Neuroscience, 31(2):587–597.

[McManus et al., 2011]   McManus, J. N. J., Li, W., and Gilbert, C. D. (2011). Adaptive shape processing in primary visual cortex. Proceedings of the National Academy of Sciences, 108(24):9739–9746.

[Memisevic and Hinton, 2007]   Memisevic, R. and Hinton, G. (2007). Unsupervised learning of image transformations. Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–8.

[Mumford, 1992]   Mumford, D. (1992). On the computational architecture of the neocortex. Biological Cybernetics, 66(3):241–251.

[Neri, 2011]   Neri, P. (2011). Global properties of natural scenes shape local properties of human edge detectors. i-Perception.

[Nishina et al., 2003]   Nishina, S., Okada, M., and Kawato, M. (2003). Spatio-temporal dynamics of depth propagation on uniform region. Vision Research, 43(24):2493–2503.

[Olshausen et al., 1993]   Olshausen, B. A., Anderson, C. H., and Van Essen, D. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13(11):4700–4719.

[Ommer and Buhmann, 2007]   Ommer, B. and Buhmann, J. M. (2007). Learning the compositional nature of visual objects. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’07), pages 1–8.

[Parikh and Grauman, 2011]   Parikh, D. and Grauman, K. (2011). Relative Attributes. Proceedings of the International Conference on Computer Vision (ICCV).

[Peelen et al., 2009]   Peelen, M. V., Fei-Fei, L., and Kastner, S. (2009). Neural mechanisms of rapid natural scene categorization in human visual cortex. Nature, 460(7251):94–97.

[Poggio, 2011]   Poggio, T. (2011). The Computational Magic of the Ventral Stream: Towards a Theory. Nature Precedings.

[Rao and Ballard, 1999]   Rao, R. P. N. and Ballard, D. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2:79–87.

[Riesenhuber and Poggio, 1999]   Riesenhuber, M. and Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2:1019–1025.

[Ringach, 2009]   Ringach, D. L. (2009). Spontaneous and driven cortical activity: implications for computation. Current Opinion in Neurobiology, 19(4):439–444.

[Roach et al., 2011]   Roach, N. W., McGraw, P. V., and Johnston, A. (2011). Visual motion induces a forward prediction of spatial pattern. Current Biology, 21(9):740–745.

[Roe et al., 2005]   Roe, A. W., Lu, H. D., Hung, C. P., and Kaas, J. H. (2005). Cortical Processing of a Brightness Illusion. Proceedings of the National Academy of Sciences of the United States of America, 102(10):3869–3874.

[Roelfsema et al., 2000]   Roelfsema, P., Lamme, V. A., and Spekreijse, H. (2000). The implementation of visual routines. Vision Research, 40(10-12):1385–1411.

[Roelfsema, 2006]   Roelfsema, P. R. (2006). Cortical algorithms for perceptual grouping. Annual Review of Neuroscience, 29:203–227.

[Roelfsema et al., 2010]   Roelfsema, P. R., van Ooyen, A., and Watanabe, T. (2010). Perceptual learning rules based on reinforcers and attention. Trends in Cognitive Sciences, 14(2):64–71.

[Rothkopf and Ballard, 2009]   Rothkopf, C. A. and Ballard, D. H. (2009). Image statistics at the point of gaze during human navigation. Visual Neuroscience, 26(1):81.

[Schwartz et al., 2009]   Schwartz, O., Sejnowski, T. J., and Dayan, P. (2009). Perceptual organization in the tilt illusion. Journal of Vision, 9(4):19.1–20.

[Serre et al., 2007]   Serre, T., Oliva, A., and Poggio, T. (2007). A Feedforward Architecture Accounts for Rapid Categorization. Proceedings of the National Academy of Sciences of the United States of America, 104(15):6424–6429.

[Sharon, 2006]   Sharon, E. (2006). Hierarchy and adaptivity in segmenting visual scenes. Nature, 442(7104):810–813.

[Shibata et al., 2011]   Shibata, K., Watanabe, T., Sasaki, Y., and Kawato, M. (2011). Perceptual Learning Incepted by Decoded fMRI Neurofeedback Without Stimulus Presentation. Science, 334(6061):1413–1415.

[Smith and Muckli, 2010]   Smith, F. W. and Muckli, L. (2010). Nonstimulated early visual areas carry information about surrounding context. Proceedings of the National Academy of Sciences, 107(46):20099–20103.

[Soltani and Koch, 2010]   Soltani, A. and Koch, C. (2010). Visual saliency computations: mechanisms, constraints, and the effect of feedback. Journal of Neuroscience, 30(38):12831–12843.

[Song et al., 2010]   Song, Y., Hu, S., Li, W., and Liu, J. (2010). The Role of Top-Down Task Context in Learning to Perceive Objects. Journal of Neuroscience, 30(29):9869–9876.

[Spratling, 2010]   Spratling, M. W. (2010). Predictive coding as a model of response properties in cortical area V1. Journal of Neuroscience, 30(9):3531–3543.

[Spratling, 2011]   Spratling, M. W. (2011). Predictive coding as a model of the V1 saliency map hypothesis. Neural Networks.

[Tappen et al., 2005]   Tappen, M., Freeman, W., and Adelson, E. (2005). Recovering intrinsic images from a single image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(9):1459–1472.

[Tatler et al., 2011]   Tatler, B. W., Hayhoe, M. M., Land, M. F., and Ballard, D. (2011). Eye guidance in natural vision: Reinterpreting salience. Journal of Vision, 11(5):5–5.

[Tenenbaum and Freeman, 2000]   Tenenbaum, J. B. and Freeman, W. (2000). Separating style and content with bilinear models. Neural Computation, 12(6):1247–1283.

[Thorpe et al., 1996]   Thorpe, S., Fize, D., and Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381(6582):520–522.

[Tu et al., 2005]   Tu, Z., Chen, X., Yuille, A., and Zhu, S. (2005). Image parsing: Unifying segmentation, detection, and recognition. International Journal of Computer Vision, pages 113–140.

[Ullman, 1984]   Ullman, S. (1984). Visual routines. Cognition, 18(1-3):97–159.

[Ullman, 1995]   Ullman, S. (1995). Sequence seeking and counter streams: a computational model for bidirectional information flow in the visual cortex. Cerebral Cortex, 5(1):1–11.

[Vanni and Rosenström, 2010]   Vanni, S. and Rosenström, T. (2010). Local non-linear interactions in the visual cortex may reflect global decorrelation. Journal of Computational Neuroscience, 30(1):109–124.

[von der Heydt, 2002]   von der Heydt, R. (2002). Image Parsing Mechanisms of the Visual Cortex. pages 1–25.

[Weidner et al., 2009]   Weidner, R., Krummenacher, J., Reimann, B., Müller, H. J., and Fink, G. R. (2009). Sources of top-down control in visual search. Journal of Cognitive Neuroscience, 21(11):2100–2113.

[Williams et al., 2008]   Williams, M. A., Baker, C. I., Op de Beeck, H. P., Shim, W. M., Dang, S., Triantafyllou, C., and Kanwisher, N. (2008). Feedback of visual object information to foveal retinotopic cortex. Nature Neuroscience, 11(12):1439–1445.

[Yuille and Kersten, 2006]   Yuille, A. and Kersten, D. (2006). Vision as Bayesian inference: analysis by synthesis? Trends in Cognitive Sciences, 10(7):301–308.

[Zeiler et al., 2010]   Zeiler, M. D., Krishnan, D., Taylor, G. W., and Fergus, R. (2010). Deconvolutional Networks. In 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2528–2535. IEEE.

[Zhang et al., 2008]   Zhang, L., Tong, M. H., Marks, T. K., Shan, H., and Cottrell, G. W. (2008). SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision, 8(7):32–32.

[Zhang et al., 2012]   Zhang, X., Zhaoping, L., Zhou, T., and Fang, F. (2012). Neural Activities in V1 Create a Bottom-Up Saliency Map. Neuron, 73(1):183–192.

[Zhu et al., 2010]   Zhu, L. L., Chen, Y., Torralba, A., Freeman, W., and Yuille, A. (2010). Part and Appearance Sharing: Recursive Compositional Models for Multi-View Multi-Object Detection. pages 1–8.

[Zhu, 2003]   Zhu, S. (2003). Statistical modeling and conceptualization of visual patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(6):691–712.

[Zylberberg et al., 2011]   Zylberberg, A., Dehaene, S., Roelfsema, P. R., and Sigman, M. (2011). The human Turing machine: a neural framework for mental programs. Trends in Cognitive Sciences, 15(7):293–300.
