Computational Vision:
Pattern Inference Theory

University of Minnesota
Spring Semester, 2000

Psy 8993 - Section 002
Call #59889

http://vision.psych.umn.edu/www/kersten-lab/courses/Psy8970/psy8993.html

Instructors: Daniel Kersten and Paul Schrater
Contact: kersten@tc.umn.edu, 625-2589
 

First Meeting: 2:00-4:00, Tuesday Jan. 18, 2000
Place: 204 Elliott Hall
 

We see the grand challenge for a science of visual perception to be the development of testable, quantitative theories of visual performance that take into account the complexities of natural images and the richness of visual behavior. The purpose of this seminar is to explore the fundamental principles needed to address this challenge. These principles constitute what we will call pattern inference theory. The basic elements of pattern inference theory are not new; they have their mathematical roots in communication and information theory, Bayesian decision theory, pattern theory, control theory, and Bayes nets. There has nonetheless been considerable recent progress in developing fundamental principles and techniques that apply to perceptual inference. Perception is distinctive by virtue of its high dimensionality and the often non-Gaussian nature of the distributions involved. Although there is still a large gap between theory and experiment, part of our focus will be on psychophysical studies that address these principles of inference. The course will be a mixture of lectures and discussion. Application topics will include: 1) early visual coding as redundancy reduction; 2) learning and using intermediate-level organizational processes (e.g., surface structure and Gestalt principles, cue integration); and 3) high-level visual functions (object recognition and localization).

For an introduction to the approach, see: Yuille, Coughlan and Kersten (pdf), and Kersten and Schrater (pdf).


In addition to the outside readings, we will be distributing notes based on chapters from Yuille, Coughlan, Kersten, and Schrater.


Tentative Schedule

 
1. Week of January 17
Computational Vision: Introduction and overview.
Reading: NIPS*98 Tutorial Notes, Kersten, 1998
 
2. Week of January 24
Introduction to Pattern Inference Theory
Basic Bayes & Information theory, Pattern Theory, Bayesian Reasoning,
Vision in the Complete Agent, Inference Tasks.
Main reading: Chapter 1, Yuille, Coughlan and Kersten (pdf).
 
Mathematica Notebook Reviews: Probability, and Linear systems
If you don't have a copy of Mathematica, you can download a free reader: MathReader
 
 
3. Week of January 31
Bayesian Decision Theory I
Probability, Risk and Loss. 0-order generative models.
Light discrimination
Main reading: Chapter 2 (pig2.pdf) (NEW REVISION-Feb 29, 2000)
(previous revision was Feb. 21)
 
 
4. Week of February 7
Bayesian Decision Theory II: Marginalization
Secondary variables, genericity, robustness (priors vs. loss functions)
View, illumination direction.
 
5. Week of February 14
Bayesian Decision Theory III
Fisher information, Cramer-Rao.
Schrater and Kersten; d'Avossa and Kersten
 
6. Week of February 21
Bayesian Decision Theory IV
Learning. Sufficient Statistics. Minimax entropy, Exponential Distributions.
Knill, Field and Kersten. Zhu et al.
Main reading: Chapter 3 (pig3.pdf)
 
7. Week of February 28
Bayesian Decision Theory V
Gaussians as special case. Linear and Quadratic discriminants. Neural networks.
Belhumeur et al.
 
8. Week of March 6
Bayesian Decision Theory VI
Model selection.
Knill. Tenenbaum.
 
9. Week of March 13
Bayes Nets I
Holmes and Watson. Vision examples of "inferred cues" (occlusion)
 
10. Week of March 20
Bayes Nets II
Algorithms: forward-backward algorithm, dynamic Bayes Nets (Kalman).
 
Spring Break -- Week of March 27
 
11. (Week of April 3)
Control
Decision theory roots of vision for control.
Forward models, inverse models, sensorimotor integration
Kalman filter example (Ghahramani and Wolpert)
Examples from: Schrater, Kersten, Hartung
 
 
12. (Week of April 10)
13. (Week of April 17)
14. (Week of April 24)
 
15. (Week of May 1)
 
Geometrical Generative: Shape and Shift-variant Models, Edges, lines and curves, Fields, Surfaces
Photometrical Generative: illumination models, BRDFs.
 
Readings: Weiss and Adelson 1998.
 
 
 

PRELIMINARY READING LIST

*Atick, J. J., & Redlich, A. N. (1992). What does the retina know about natural scenes? Neural Computation, 4(2), 196-210.

*Barlow, H. B. (1961). Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith (Ed.), Sensory Communication. Cambridge, MA: MIT Press.

*Bell, A. J., & Sejnowski, T. J. (1997). The "independent components" of natural scenes are edge filters. Vision Res, 37(23), 3327-38.

Belhumeur, P., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711-720.

Blake, A., Bülthoff, H. H., & Sheinberg, D. (1993). Shape from texture: Ideal observers and human psychophysics.

*Bloj, M., Kersten, D., & Hurlbert, A. C. (in press). 3D shape perception influences colour perception via mutual illumination. Nature.

Bülthoff, H. H., & Yuille, A. (1991). Bayesian models for seeing surfaces and depth. Comments on Theoretical Biology, 2(4), 283-314.

*Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214, 93-94.

Burgess, A. E. (1985). Visual signal detection. III. On Bayesian use of prior knowledge and cross correlation. J. Opt. Soc. Am. A, 2(9), 1498-1507.

Burns et al., 1995; relevant work on letter recognition.

Brainard, D. H., & Freeman, W. T. (1994). Bayesian Method for Recovering Surface and Illuminant Properties from Photosensor Responses. Human Vision, Visual Processing, and Digital Display V. Bellingham, Washington. The Society of Photo-Optical Instrumentation Engineers, 2179, 364-376.

Clark, J. J., & Yuille, A. L. (1990). Data Fusion for Sensory Information Processing. Boston: Kluwer Academic Publishers.

Crowell, J. A., & Banks, M. S. (1996). Ideal observer for heading judgments. Vision Research, 36, 471-490.

Eagle and Blake, 1995; relevant work on structure from motion.

Harris and Parker, 1992; work on depth perception in random-dot stereograms (also Scharff and Geisler, 1992).

Huang, J., & Mumford, D. Statistics of Natural Images and Models. http://www.dam.brown.edu/people/mumford/Papers/paper1.ps

Field, D. J. (1994). What is the goal of sensory coding? Neural Computation, 6, 559-601.

Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12), 2379-2394.

*Freeman, W. T. (1994). The generic viewpoint assumption in a framework for visual perception. Nature, 368(7 April 1994), 542-545.

Freeman, W. T., & Pasztor, E. C. (1999). Learning to estimate scenes from images. In M. S. Kearns, S. A. Solla, & D. A. Cohn (Eds.), Advances in Neural Information Processing Systems 11. Cambridge, MA: MIT Press.

Geisler, W. (1989). Sequential Ideal-Observer analysis of visual discriminations. Psychological Review, 96(2), 267-314.

Ghahramani, Z., & Wolpert, D. M. (1997). Modular decomposition in visuomotor learning. Nature, 386, 392-395.

*Hinton, G. E., & Ghahramani, Z. (1997). Generative models for discovering sparse distributed representations. Philos Trans R Soc Lond B Biol Sci, 352(1358), 1177-90.

Kersten, D. (1984). Spatial summation in visual noise. Vision Research, 24, 1977-1990.

Kersten, D. J. (1987). Predictability and Redundancy of Natural Images. Journal of the Optical Society of America A, 4, 2395-2400.

Kersten, D. (1990). Statistical limits to image understanding. In C. Blakemore (Ed.), Vision: Coding and Efficiency (pp. 32-44). Cambridge, UK: Cambridge University Press.

Kersten, D. J. (1991). Transparency and the Cooperative Computation of Scene Attributes. In M. Landy, & A. Movshon (Ed.), Computational Models of Visual Processing (pp. 209-228). Cambridge, Massachusetts: M.I.T. Press.

Kersten, D., Bülthoff, H. H., Schwartz, B., & Kurtz, K. (1992). Interaction between transparency and structure from motion. Neural Computation, 4(4), 573-589.

Kersten, D., & Madarasmi, S. (1995). The Visual Perception of Surfaces, their Properties, and Relationships. In I. J. Cox, P. Hansen, & B. Julesz (Ed.), Partitioning Data Sets: With applications to psychology, vision and target tracking (pp. 373-389). American Mathematical Society.

*Knill, D. C. (in press). Surface orientation from texture: Ideal observers, generic observers and the information content of texture cues. Vision Research.

*Knill. (in press). Discrimination of planar surface slant from texture: Human and ideal observers compared. Vision Research.

Knill, D. C., Field, D., & Kersten, D. (1990). Human discrimination of fractal images. Journal of the Optical Society of America A, 7, 1113-1123.

*Landy, M. S., Maloney, L. T., Johnston, E. B., & Young, M. J. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research, 35, 389-412.

Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997). Mr. Chips: an ideal-observer model of reading. Psych. Review, 104(3), 524-53.

*Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568.

*Mamassian, P., & Landy, M. S. (1998). Observer biases in the 3D interpretation of line drawings. Vision Research, 38, 2817-2832.

*Mumford, D. (1994). Neuronal architectures for pattern-theoretic problems. In C. Koch, & J. L. Davis (Ed.), Large-Scale Neuronal Theories of the Brain (pp. 125-152). Cambridge, MA: MIT Press.

Mumford, D. (1999) The Dawning of the Age of Stochasticity. http://www.dam.brown.edu/people/mumford/Papers/linceiams.pdf

Mumford, D. (1996). Pattern theory: A unifying perspective. In D. C. Knill, & W. Richards (Eds.), Perception as Bayesian Inference (Chapter 2). Cambridge: Cambridge University Press.

Nakayama, K., & Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257, 1357-1363.

*Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.

Pelli, D. G. (1990). The quantum efficiency of vision. In C. Blakemore (Ed.), Vision: Coding and Efficiency. Cambridge: Cambridge University Press.

Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci, 2(1), 79-87.

*Roweis, S., & Ghahramani, Z. (1999). A unifying review of linear gaussian models. Neural Comput, 11(2), 305-45.

*Schrater, P. R., Knill, D. C., & Simoncelli, E. P. (under review?). Mechanisms of visual motion detection. Nature Neuroscience.

Simoncelli, E. P., & Portilla, J. (1998). Texture Characterization via Joint Statistics of Wavelet Coefficient Magnitudes. 5th IEEE Int'l Conf on Image Processing. Chicago, IL.

Simoncelli, E. P. (1997). Statistical Models for Images: Compression, Restoration and Synthesis. Proc. 31st Asilomar Conference on Signals, Systems and Computers. Pacific Grove, CA. © IEEE Signal Processing Society.

J. B. Tenenbaum. Bayesian modeling of human concept learning. http://www-psych.stanford.edu/~jbt/rulesim.pdf

Tjan, B., Braje, W., Legge, G. E., & Kersten, D. (1995). Human efficiency for recognizing 3-D objects in luminance noise. Vision Research, 35(21), 3053-3069.

*Weiss, Y., & Adelson, E. H. (1998). Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision (A.I. Memo No. 1624). M.I.T.

Weiss, Y. (1997). Interpreting images by propagating Bayesian beliefs. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 908-915). Cambridge, MA: MIT Press.

Wu, Y. N., & Zhu, S. C. (1999). Equivalence of Ensembles and Fundamental Bounds--A unified theory of texture modeling and synthesis. IEEE PAMI.

*Yuille, A. L., & Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill, & W. Richards (Eds.), Perception as Bayesian Inference. Cambridge, U.K.: Cambridge University Press.

Yuille, A. (1991). Deformable templates for face recognition. Journal of Cognitive Neuroscience, 3(1), 59-70.

Yuille, A., Stolorz, P., & Utans, J. (1994). Statistical physics, mixtures of distributions and the EM algorithm. Neural Computation, 6, 334-340.

*Zhu, S. C., Wu, Y., & Mumford, D. (1997). Minimax Entropy Principle and Its Applications to Texture Modeling. Neural Computation, 9(8), 1627-1660.

Zucker, S. W., & David, C. (1988). Points and end-points: A size-spacing constraint for dot grouping. Perception, 17, 229-247.

***

Maloney-- dot grouping in lines.

Watamaniuk, Norbert, Yuille, McKee -- tracking moving dot.

Yang & Zemel -- cue integration

Knill's NIPS material