Computational Vision: Principles of Perceptual Inference

University of Minnesota

Fall, 1998

Course number: Psy 8970 - Call # 065647

http://vision.psych.umn.edu/www/kersten-lab/courses/Psy8970/psy8970.html

Instructors: Daniel Kersten and Paul Schrater
Contact: kersten@tc.umn.edu, 625-2589
 

First Meeting: 2:30-4:30, Tuesday Sept. 29, 1998
Place: 225 Elliott Hall
 

This seminar will provide an overview of basic principles of visual perception as statistical inference. It is now widely appreciated that the problem of perception is complex and formally hard. Theoretical work has highlighted problems that constrain our understanding of the nature of the neural machinery underlying vision. One problem pointed out as far back as Helmholtz is that interpreting image data is underconstrained--there are multiple interpretations of the world consistent with the image data. A second problem is that for any given visual task (e.g. object recognition), there are image variations (e.g. illumination, clutter, noise) that confound the signal (e.g. object shape). A key to solving the problems of image ambiguity and variations is to understand how vision exploits the inherent statistical structure of natural images for the various tasks vision is used for. Over the past decade, there has been considerable progress in understanding the fundamental principles of perceptual inference. The course will be a mixture of lectures, which primarily emphasize theory, and discussion, which will focus on integrating theory with psychophysical applications. The lectures will be based on chapters developed by Alan Yuille, James Coughlan and Daniel Kersten. Application topics will include: early visual coding as redundancy reduction; learning and using intermediate-level organizational processes (e.g. surface structure and Gestalt principles); and high-level visual functions (object recognition and localization) as Bayesian inference.

1. Pattern Theory (Week of September 28)
Basic Bayes & Information theory, Pattern Theory, Bayesian Reasoning, Vision in the Complete Agent, Inference Tasks
 
2. Bayesian Decision Theory I (Week of October 5)

Probability, Risk and Loss, Generic views, robustness,
The Central Limit Theorem, Stochastic Sampling, Matched filter, Weak cue integration
 
Burgess, A. E. (1985). Visual signal detection. III. On Bayesian use of prior knowledge and cross correlation. J. Opt. Soc. Am. A, 2(9), 1498-1507.
 
3. Bayesian Decision Theory II (Week of October 12)
Fisher information, Cramer-Rao, The Theory of Types.
 
Weiss, Y., & Adelson, E. H. (1998). Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision (A.I. Memo No. 1624). M.I.T.
 
4. Surface reconstruction, Markov Models of Images, Textures, Surfaces, Flows, Transparency (Week of October 19)

Basic Tasks & Performance Bounds, Stochastic Sampling, Sampling on Graphs
Multiple Models, EM, Extensions of Bayesian models, More realistic priors, Color and texture, image formation, Motion Flow, Global Integration of Measurements
 
(5. Edges and Lines: Markov Random Fields -- replaced)

Basic Tasks, Performance Bounds, Stochastic Sampling, Sampling on Graphs
 
6. Redundancy Reduction, Density Estimation, Minimax (Weeks of October 26 & November 9)

Information theory, channel capacity, Minimax Entropy Method
 
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.
 
7. Control, Kalman (Week of November 16)

Active Splines, Mixture Models, Control
 
Wolpert, D. M., Ghahramani, Z., & Jordan, M. I. (1995). An Internal Model for Sensorimotor Integration. Science, 269 (29 September), 1880-1882.
 
8. (Hidden Markov Models -- replaced)

Classification using HMMs, Training
 
9. Bayes Nets (Week of November 23)

Hierarchical Models, Generative models, Top down/Bottom up, Factor Analysis, Cue Integration
 
10. (Shape and Shift-variant Models -- replaced)
 
11. Twenty questions (Week of November 30)
Visual Search, Attention
 
 
Other Topics
PCA, SVD, Fisher, support vectors, RBFs, Projection pursuit,
discriminant analysis, Linear system review. Wavelets.
 
Algorithms: steepest descent, sampling, DP, A*, Marginalization
 
POSSIBILITIES FOR OUTSIDE READINGS ON EXPERIMENTAL APPLICATIONS

Atick, J. J., & Redlich, A. N. (1992). What does the retina know about natural scenes? Neural Computation, 4(2), 196-210.

Barlow, H. B. (1961). Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith (Ed.), Sensory Communication. Cambridge, MA: MIT Press.

Blake, A., Bülthoff, H. H., & Sheinberg, D. (1993). Shape from texture: Ideal observers and human psychophysics.

Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214, 93-94.

Burgess, A. E. (1985). Visual signal detection. III. On Bayesian use of prior knowledge and cross correlation. J. Opt. Soc. Am. A, 2(9), 1498-1507.

Burns et al. (1995); relevant work on letter recognition.

Brainard, D. H., & Freeman, W. T. (1994). Bayesian Method for Recovering Surface and Illuminant Properties from Photosensor Responses. Human Vision, Visual Processing, and Digital Display V. Bellingham, Washington. The Society of Photo-Optical Instrumentation Engineers, 2179, 364-376.

Crowell, J. A., & Banks, M. S. (1996). Ideal observer for heading judgments. Vision Research, 36, 471-490.

Eagle and Blake (1995); relevant work on structure from motion.

Harris and Parker (1992); work on depth perception in random-dot stereograms (also Scharff and Geisler, 1992).

Field, D. J. (1994). What is the goal of sensory coding? Neural Computation, 6, 559-601.

Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12), 2379-2394.

Freeman, W. T. (1994). The generic viewpoint assumption in a framework for visual perception. Nature, 368(7 April 1994), 542-545.

Geisler, W. (1989). Sequential Ideal-Observer analysis of visual discriminations. Psychological Review, 96(2), 267-314.

Ghahramani, Z., & Wolpert, D. M. (1997). Modular decomposition in visuomotor learning. Nature, 386, 392-395.

Kersten, D. (1984). Spatial summation in visual noise. Vision Research, 24, 1977-1990.

Kersten, D. J. (1987). Predictability and Redundancy of Natural Images. Journal of the Optical Society of America, 4, 2395-2400.

Knill, D. C. (in press). Surface orientation from texture: Ideal observers, generic observers and the information content of texture cues. Vision Research.

Knill, D. C. (in press). Discrimination of planar surface slant from texture: Human and ideal observers compared. Vision Research.

Knill, D. C., Field, D., & Kersten, D. (1990). Human discrimination of fractal images. Journal of the Optical Society of America A, 7, 1113-1123.

Landy, M. S., Maloney, L. T., Johnston, E. B., & Young, M. J. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research, 35, 389-412.

Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568.

Mamassian, P., & Landy, M. S. (1998). Observer biases in the 3D interpretation of line drawings. Vision Research, 38, 2817-2832.

Nakayama, K., & Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257, 1357-1363.

Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.
 

Pelli, D. G. (1990). The quantum efficiency of vision. In C. Blakemore (Ed.), Vision: Coding and Efficiency. Cambridge: Cambridge University Press.

Schrater, P. R., Knill, D. C., & Simoncelli, E. P. (under review?). Mechanisms of visual motion detection. Nature Neuroscience.

Simoncelli, E. recent work on natural image statistics and neural coding.

Tjan, B., Braje, W., Legge, G. E., & Kersten, D. (1995). Human efficiency for recognizing 3-D objects in luminance noise. Vision Research, 35(21), 3053-3069.

Weiss, Y., & Adelson, E. H. (1998). Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision (A.I. Memo No. 1624). M.I.T.

Yuille, A. L., & Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian Inference. Cambridge, U.K.: Cambridge University Press.

Zucker, S. W., & David, C. (1988). Points and end-points: A size-spacing constraint for dot grouping. Perception, 17, 229-247.