Modeling Human Vision

University of Minnesota, Spring Semester, 2001

Psy 8993 (58119-002) Directed Studies: Special Areas of Psychology and Related Sciences

Special Topics in Computational Vision.


http://vision.psych.umn.edu/www/kersten-lab/courses/Psy8036/Psy8993_Section2Spring2001.html

Instructors: Daniel Kersten and Paul Schrater
Contact: kersten@umn.edu, 625-2589, URL: www.umn.edu/~kersten

First Meeting: 2:00pm-4:00pm, Tuesday, January 16, 2001 in 204 Elliott Hall
Place: 204 Elliott Hall

A challenging goal for a science of visual perception is the development of testable, quantitative models of visual performance that take into account the complexities of natural images and the richness and flexibility of human visual behavior. The purpose of this seminar is to explore the theoretical principles required for modeling human vision. These principles constitute what we call pattern inference theory. Its basic elements are not new: they have mathematical roots in communication and information theory, Bayesian decision theory, pattern theory, control theory, and Bayes nets. There has been recent progress in developing principles and techniques that apply to models of perceptual inference. Although there is still a large gap between theory and experiment, part of our focus will be on psychophysical studies that address these principles of inference. The seminar will be a mixture of lectures and discussion.

The readings will be a combination of lecture notes (indicated by Chapter *) and selected papers from the relevant literature in computational vision, biological vision, and human psychophysics.



 

 

Schedule (Readings below are incomplete and tentative)

1. Week of January 15
Topics: Computational Vision: Introduction and overview.
Readings: NIPS*98 Tutorial Notes (http://vision.psych.umn.edu/www/kersten-lab/papers/NIPS98.pdf)

2. Week of January 22
Topics: Bayes Probability Theory: Pattern Analysis and Pattern Synthesis. Vision for an Agent in the World. Tasks for Experiments and Artificial Systems. Bayesian estimation: Putting the pieces together.
Readings: Chapter 1, Geisler (1989), Kersten (1999)
Mathematica Reviews: Probability, and Linear systems. If you don't have a copy of Mathematica, you can download a free reader: MathReader
Matlab Review of linear systems

3. Week of January 29
Topics: Bayesian Decision Theory I: Discrete state spaces. Decision Theory for Continuous Variables.
Readings: Chapter 2.

4. Week of February 5
Topics: Bayesian Decision Theory II. Multi-dimensional inputs: the Geometry of Decision Surfaces. An Information Theoretic Perspective. ROC curve and two-alternative forced choice. Signal Known Exactly Gaussian Model. Fisher Information and the Cramer-Rao lower bound.
Readings: Chapter 2 cont'd. Burgess et al., Schrater et al. (2000), Blake et al., Eckstein (1998), Gold (1999)

5. Week of February 12
Topics: Integrating Out Secondary Variables I. Secondary Variables: An Example and Overview. Marginalizing over Continuous Secondary Variables. Phase Space and Integrating Secondary Variables.
Readings: Chapter 3. Freeman (1993), Schrater and Kersten (in press)

6. Week of February 19
Topics: Integrating Out Secondary Variables II: Discrete Secondary Variables. The EM algorithm. MFT Approximation and Bounding the Evidence.
Readings: Chapter 3. Yuille, Stolorz & Ultans (1994)

7. Week of February 26
Topics: Learning I. Learning as Empirical Risk Minimization. Learning Histograms for Discrimination. Nearest Neighbour Classification: an example of non-parametric methods. Parametric Learning: Sufficient Statistics, and Exponential Distributions.
Readings: Chapter 4. Freeman, W. T., & Pasztor, E. C. (1999)

8. Week of March 5
Topics: Learning II. Model Selection and Occam's Razor. Clustering and Dimension Reduction. Learning Classification by Perceptrons. Support Vector Machines. Vapnik-Chervonenkis Dimension and Bounds.
Readings: Chapter 4 cont'd. Simoncelli (1997)

9. Week of March 12
Topics: Large Number of Observations: Improving Bayesian Estimates using Multiple Observations. Convolution of Distributions. The Behaviour of Sums of Random Variables. Applications of Limit Theorems: Model Selection Criteria. Large Deviation Theory: Cramer's Theorem.
Readings: Chapter 5. Wu and Zhu (1999), Zhu et al. (1997)

10. Week of March 19
Topics: One-dimensional models I: One-Dimensional Bayes. Spatial Structure and 1-D Markov. 1-D Markov with Line Processors. More learning: PCA, power spectra, and histograms.
Readings: Chapter 6. Atick and Redlich (1992), Bell & Sejnowski (1997), Olshausen & Field (1996)

Week of March 26-30: Spring Break

11. Week of April 2
Topics: One-dimensional models II: 1-D Computational techniques.
Readings: Chapter 6 cont'd. Belhumeur et al. (1997)

12. Week of April 9
Topics: Dynamic one-dimensional models: Dynamic One-Dimensional Bayes, Tracking and Kalman filtering, switching, Control theory. Hidden Markov models, inhomogeneous, learning. Reinforcement learning.
Readings: Chapter 7. Rao and Ballard (1999), Legge et al. (1997)

13. Week of April 16
Topics: Curves, regions and surfaces I: Homogeneous models. Modeling curves, Snakes, Energy, Splines, Modeling regions. Tracking splines. Surfaces, Integrability.
Readings: Chapter 8. Nakayama, K., & Shimojo, S. (1992)

14. Week of April 23
Topics: Curves, regions and surfaces II: Inhomogeneous models. Deformable templates.
Readings: Chapter 9. Kersten, D., & Madarasmi, S. (1995)

15. Week of April 30
Topics: TBA

 

 

 

PRELIMINARY READING LIST

*Atick, J. J., & Redlich, A. N. (1992). What does the retina know about natural scenes? Neural Computation, 4(2), 196-210.

*Barlow, H. B. (1961). Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith (Ed.), Sensory Communication. Cambridge, MA: MIT Press.

*Bell, A. J., & Sejnowski, T. J. (1997). The "independent components" of natural scenes are edge filters. Vision Res., 37(23), 3327-38.

Belhumeur, P., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711-720.

Blake, A., Bülthoff, H. H., & Sheinberg, D. (1993). Shape from texture: Ideal observers and human psychophysics.

*Bloj, M., Kersten, D., & Hurlbert, A. C. (in press). 3D shape perception influences colour perception via mutual illumination. Nature.

Bülthoff, H. H., & Yuille, A. (1991). Bayesian models for seeing surfaces and depth. Comments on Theoretical Biology, 2(4), 283-314.

*Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214, 93-94.

Burgess, A. E. (1985). Visual signal detection. III. On Bayesian use of prior knowledge and cross correlation. J. Opt. Soc. Am. A, 2(9), 1498-1507.

Burns et al. (1995). Relevant work on letter recognition.

Brainard, D. H., & Freeman, W. T. (1994). Bayesian Method for Recovering Surface and Illuminant Properties from Photosensor Responses. Human Vision, Visual Processing, and Digital Display V. Bellingham, Washington. The Society of Photo-Optical Instrumentation Engineers, 2179, 364-376.

Clark, J. J., & Yuille, A. L. (1990). Data Fusion for Sensory Information Processing. Boston: Kluwer Academic Publishers.

Crowell, J. A., & Banks, M. S. (1996). Ideal observer for heading judgments. Vision Research, 36, 471-490.

Eagle and Blake (1995). Relevant work on structure from motion.

Gold, J., Bennett, P. J., & Sekuler, A. B. (1999). Signal but not noise changes with perceptual learning. Nature, 402(6758), 176-178.

Harris and Parker (1992). Work on depth perception in random-dot stereograms (also Scharff and Geisler, 1992).

Huang, J., & Mumford, D. Statistics of Natural Images and Models. http://www.dam.brown.edu/people/mumford/Papers/paper1.ps

Bochud, F. O., Abbey, C. K., & Eckstein, M. P. (2000). Visual signal detection in structured backgrounds. III. Calculation of figures of merit for model observers in statistically nonstationary backgrounds. J Opt Soc Am A Opt Image Sci Vis, 17(2), 193-205.

Eckstein, M. P. (1998). The lower efficiency for conjunctions is due to noise and not serial attentional processing. Psychological Science, 9, 111-118.

Eckstein, M. P., Abbey, C. K., & Bochud, F. O. (2000). Visual signal detection in structured backgrounds. IV. Figures of merit for model performance in multiple-alternative forced-choice detection tasks with correlated responses. J Opt Soc Am A Opt Image Sci Vis, 17(2), 206-217.

Eckstein, M. P., Ahumada, A. J., Jr., & Watson, A. B. (1997). Visual signal detection in structured backgrounds. II. Effects of contrast gain control, background variations, and white noise. J Opt Soc Am A, 14(9), 2406-2419.

Eckstein, M. P., & Whiting, J. S. (1996). Visual signal detection in structured backgrounds. I. Effect of number of possible spatial locations and signal contrast. J Opt Soc Am A, 13(9), 1777-1787.

Field, D. J. (1994). What is the goal of sensory coding? Neural Computation, 6, 559-601.

Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A, 4(12), 2379-2394.

*Freeman, W. T. (1994). The generic viewpoint assumption in a framework for visual perception. Nature, 368(7 April 1994), 542-545.

Freeman, W. T., & Pasztor, E. C. (1999). Learning to estimate scenes from images. In M. S. Kearns, S. A. Solla, & D. A. Cohn (Eds.), Advances in Neural Information Processing Systems 11. Cambridge, MA: MIT Press.

Geisler, W. (1989). Sequential Ideal-Observer analysis of visual discriminations. Psychological Review, 96(2), 267-314.

Ghahramani, Z., & Wolpert, D. M. (1997). Modular decomposition in visuomotor learning. Nature, 386, 392-395.

*Hinton, G. E., & Ghahramani, Z. (1997). Generative models for discovering sparse distributed representations. Philos Trans R Soc Lond B Biol Sci, 352(1358), 1177-90.

Kersten, D. (1984). Spatial summation in visual noise. Vision Research, 24, 1977-1990.

Kersten, D. J. (1987). Predictability and Redundancy of Natural Images. Journal of the Optical Society of America, 4, 2395-2400.

Kersten, D. (1990). Statistical limits to image understanding. In C. Blakemore (Ed.), Vision: Coding and Efficiency (pp. 32-44). Cambridge, UK: Cambridge University Press.

Kersten, D. J. (1991). Transparency and the Cooperative Computation of Scene Attributes. In M. Landy & A. Movshon (Eds.), Computational Models of Visual Processing (pp. 209-228). Cambridge, MA: MIT Press.

Kersten, D., Bülthoff, H. H., Schwartz, B., & Kurtz, K. (1992). Interaction between transparency and structure from motion. Neural Computation, 4(4), 573-589.

Kersten, D., & Madarasmi, S. (1995). The Visual Perception of Surfaces, their Properties, and Relationships. In I. J. Cox, P. Hansen, & B. Julesz (Eds.), Partitioning Data Sets: With applications to psychology, vision and target tracking (pp. 373-389). American Mathematical Society.

Kersten, D. (1999). High-level vision as statistical inference. In M. S. Gazzaniga (Ed.), The New Cognitive Neurosciences -- 2nd Edition (pp. 353-363). Cambridge, MA: MIT Press.

*Knill, D. C. (in press). Surface orientation from texture: Ideal observers, generic observers and the information content of texture cues. Vision Research.

*Knill, D. C. (in press). Discrimination of planar surface slant from texture: Human and ideal observers compared. Vision Research.

Knill, D. C., Field, D., & Kersten, D. (1990). Human discrimination of fractal images. J. Opt. Soc. Am. A, 7, 1113-1123.

*Landy, M. S., Maloney, L. T., Johnston, E. B., & Young, M. J. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research, 35, 389-412.

Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997). Mr. Chips: an ideal-observer model of reading. Psych. Review, 104(3), 524-53.

*Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568.

*Mamassian, P., & Landy, M. S. (1998). Observer biases in the 3D interpretation of line drawings. Vision Research, 38, 2817-2832.

*Mumford, D. (1994). Neuronal architectures for pattern-theoretic problems. In C. Koch & J. L. Davis (Eds.), Large-Scale Neuronal Theories of the Brain (pp. 125-152). Cambridge, MA: MIT Press.

Mumford, D. (1999) The Dawning of the Age of Stochasticity. http://www.dam.brown.edu/people/mumford/Papers/linceiams.pdf

Mumford, D. (1996). Pattern theory: A unifying perspective. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian Inference (Chapter 2). Cambridge: Cambridge University Press.

Nakayama, K., & Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257, 1357-1363.

*Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.

Pelli, D. G. (1990). The quantum efficiency of vision. In C. Blakemore (Ed.), Vision: Coding and Efficiency. Cambridge: Cambridge University Press.

Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci, 2(1), 79-87.

*Roweis, S., & Ghahramani, Z. (1999). A unifying review of linear gaussian models. Neural Comput, 11(2), 305-45.

*Schrater, P. R., Knill, D. C., & Simoncelli, E. P. (under review?). Mechanisms of visual motion detection. Nature Neuroscience.

Schrater, P. R., & Kersten, D. (in press). The role of task specification in optimal cue integration. International Journal of Computer Vision.

Simoncelli, E. P., & Portilla, J. (1998). Texture Characterization via Joint Statistics of Wavelet Coefficient Magnitudes. 5th IEEE Int'l Conf on Image Processing. Chicago, IL.

Simoncelli, E. P. (1997). Statistical Models for Images: Compression, Restoration and Synthesis. Proc. 31st Asilomar Conference on Signals, Systems and Computers. Pacific Grove, CA. © IEEE Signal Processing Society.

Tenenbaum, J. B. Bayesian modeling of human concept learning. http://www-psych.stanford.edu/~jbt/rulesim.pdf

Tjan, B., Braje, W., Legge, G. E., & Kersten, D. (1995). Human efficiency for recognizing 3-D objects in luminance noise. Vision Research, 35(21), 3053-3069.

*Weiss, Y., & Adelson, E. H. (1998). Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision (A.I. Memo No. 1624). M.I.T.

Weiss, Y. (1997). Interpreting images by propagating Bayesian beliefs. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 908-915). Cambridge, MA: MIT Press.

Wu, Y. N., & Zhu, S. C. (1999). Equivalence of Ensembles and Fundamental Bounds--A unified theory of texture modeling and synthesis. IEEE PAMI.

*Yuille, A. L., & Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian Inference. Cambridge, U.K.: Cambridge University Press.

Yuille, A. (1991). Deformable templates for face recognition. Journal of Cognitive Neuroscience, 3(1), 59-70.



Yuille, A., Stolorz, P., & Ultans, J. (1994). Statistical physics, mixtures of distributions and the EM algorithm. Neural Computation, 6, 334-340.

*Zhu, S. C., Wu, Y., & Mumford, D. (1997). Minimax Entropy Principle and Its Applications to Texture Modeling. Neural Computation, 9(8), 1627-1660.

Zucker, S. W., & David, C. (1988). Points and end-points: A size-spacing constraint for dot grouping. Perception, 17, 229-247.

***

Other papers: Maloney (dot grouping in lines); Watamaniak, Norbert, Yuille, McKee (tracking moving dot); Yang & Zemel (cue integration); Knill's NIPS*99 workshop material.