Psychology Department, University of Minnesota

Computational Vision
Psy 5036W, Fall 2006, 3 credits

Web page:
courses.kersten.org
09:05 A.M. - 10:20 A.M. Mondays and Wednesdays, Elliott 150

Instructor: Daniel Kersten. Office: 212 Elliott Hall. Phone: 612 625-2589 email: kersten@umn.edu
Office hours: Wednesdays 10:20-11:20 am or by appointment.

TA: Evangelos Theodoru, Office: N13 Elliott Hall. Phone: 625-1337 email: theo0027@umn.edu
Office hours: Mondays 10:20 am or by appointment.

The visual perception of what is in the world is accomplished continually, instantaneously, and usually without conscious thought. The very effortlessness of perception disguises the underlying richness of the problem. We can gain insight into the processes and functions of human vision by studying the relationship between neural mechanisms and visual behavior through computer analysis and simulation. Students will learn about the anatomy and neurophysiology of vision and how they relate to the phenomena of perception. An underlying theme will be to treat vision as a process of statistical inference. There will be in-class programming exercises using the language Mathematica. No prior programming experience is required; however, a background in calculus and linear algebra is helpful.

Readings & software

Grade Requirements

There will be a mid-term examination, a final examination, programming assignments, and a final project.

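The grade weights are:

Programming assignments (4, 7% each): 28%
Mid-term: 16%
Final exam: 16%
Final project title & paragraph outline: 2%
Complete draft of final project: 5%
Final revised draft of project: 33%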
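As a rough illustration of how the weights combine, the percentages listed next to each graded item in the lecture schedule can be tallied as below. This is only a sketch: the item names, the 0-100 score scale, and the choice to count a missing item as zero are assumptions made for the example, not a statement of how grades are actually computed.

```python
# Percent of the course grade, as listed next to each item in the lecture schedule.
WEIGHTS = {
    "assignment_1": 7,
    "assignment_2": 7,
    "assignment_3": 7,
    "assignment_4": 7,
    "midterm": 16,
    "final_exam": 16,
    "project_outline": 2,
    "project_draft": 5,
    "project_final": 33,
}


def course_percentage(scores):
    """Combine per-item scores (each on a 0-100 scale) into a weighted course percentage.

    Items missing from `scores` are counted as zero (an assumption for this sketch).
    """
    return sum(WEIGHTS[item] * scores.get(item, 0) for item in WEIGHTS) / 100


# The listed weights account for the whole grade.
print(sum(WEIGHTS.values()))  # prints 100
```

For example, a mid-term score of 50 with nothing else turned in would contribute 8 percentage points (16% of 50).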

The programming assignments will use the Mathematica programming environment. No prior experience with Mathematica is necessary. See: List of Computer Labs at the University of Minnesota with Mathematica installed.

Assignments are due BEFORE class start time (9:05 am) on the due date.
Late Policy: Assignments turned in within 24 hours following the due date will have 15% deducted from the assignment score. Assignments turned in between 24 and 48 hours following the due date will have 30% deducted from the score. Assignments more than 48 hours late will receive a score of zero.
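The late policy can be sketched as a small function. This is a minimal illustration, not an official grading tool: the function name and the example due date are hypothetical, and the boundary handling (a submission at exactly 24 or 48 hours falls in the earlier bracket, and a submission at exactly the due time counts as late) is an assumption, since the policy text does not spell it out.

```python
from datetime import datetime, timedelta


def late_deduction_percent(due, submitted):
    """Percent deducted from an assignment score under the late policy."""
    if submitted < due:
        return 0    # turned in before class start time: no deduction
    delay = submitted - due
    if delay <= timedelta(hours=24):
        return 15   # within 24 hours following the due time
    if delay <= timedelta(hours=48):
        return 30   # between 24 and 48 hours late
    return 100      # more than 48 hours late: score of zero


due = datetime(2006, 9, 25, 9, 5)  # hypothetical due date, 9:05 am class start
print(late_deduction_percent(due, due + timedelta(hours=3)))  # prints 15
```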

 


Lectures

Check this section before each class for recent additions and revisions.

To see what the course looked like last time, with downloadable lecture notes, see
Psy 5036W SPRING2004 Web Pages

University Calendar

Columns: Date, Lecture, Main Readings, Supplementary Material, Assignments due

I. Introduction
Sep 6
1. Introduction to Computational Vision

1.IntroToComputationalVision.nb
(pdf)

Kersten, D., & Yuille, A. (2003). Bayesian models of object perception. Current Opinion in Neurobiology, 13(2), 1-9. (pdf)

intro.nb

http://www.psych.upenn.edu/backuslab/helmholtz/

mystery.jpg

 
11
2. Limits to Vision

2.LimitsToVision.nb
(pdf)

Hecht, S., Shlaer, S., & Pirenne, M. H. (1942). Energy, quanta, and vision. Journal of General Physiology, 25, 819-840. (pdf)

Barlow, H. B. (1981). Critical Limiting Factors in the Design of the Eye and Visual Cortex. Proc. Roy. Soc. Lond. B, 212, 1-34. (pdf)

Baylor, D. A., Lamb, T. D., & Yau, K. W. (1979). Responses of retinal rods to single photons. Journal of Physiology, Lond., 288, 613-634. (pdf)

 
13
3. The Ideal Observer

3.TheIdealObserver.nb
(pdf)

 

 
18
4. Ideal observer analysis: Humans vs. ideals

4.IdealObserverAnalysis.nb
(pdf)

Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214(4516), 93-94. (pdf)

 

II. Image formation,
pattern synthesis

20
5. Psychophysics: tools & techniques

5.Psychophysics.nb
(pdf)

 

ProbabilityOverview.nb

Farell, B. & Pelli, D. G. (1999) Psychophysical methods, or how to measure a threshold and why. In R. H. S. Carpenter & J. G. Robson (Eds.), Vision Research: A Practical Guide to Laboratory Methods, New York: Oxford University Press. http://psych.nyu.edu/pelli/

Huseyin Boyaci's Psychophysics with JAVA


For an excellent (Matlab) psychophysics package, see: http://psychotoolbox.org

 

25
6. Bayesian decision theory & perception

6.BayesDecisionTheory.nb
(pdf)

Geisler, W. S., & Kersten, D. (2002). Illusions, perception and Bayes. Nat Neurosci, 5(6), 508-510. (pdf)

#1 Ideal Detector (7%)
Assignmt_1IdealDetector.nb
27
7. Limits to spatial resolution, image modeling, introduction to linear systems

7.ImageModelLinearSystems.nb
(pdf)

Campbell, F. W., & Green, D. (1965). Optical and retinal factors affecting visual resolution. Journal of Physiology (Lond.), 181, 576-593. (pdf)

Williams, D. R. (1986). Seeing through the photoreceptor mosaic. Trends in Neurosciences, 9(5), 193-197. (pdf)

LinearAlgebraReview.nb
(LinearAlgebraReview.nb.pdf)

Image data files: Fourier128x128.jpeg

Convolutions_Tutorial.nb

 
III. Early visual coding

Oct 2

8. Linear systems analysis

8.LinearSystemsOptics.nb
(pdf)

CSF.gif

Tutorials:
Fourier_neural_image.nb

 
4
9. Spatial filter models of early human vision

9.NeuralSpatialFiltering.nb
(pdf)

Campbell, F. W., & Robson, J. G. (1968). Application of Fourier Analysis to the Visibility of Gratings. Journal of Physiology, 197, 551-566. (pdf)

De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Res, 22(5), 545-559.

Watson, A. B. (1987). Efficiency of a model human image code. J Opt Soc Am A, 4(12), 2401-2417. (pdf)

http://www.cns.nyu.edu/~eero/steerpyr/

 
9

10. Local processing & image analysis

10.ImageProcessing.nb
(pdf)

Albrecht, D. G., De Valois, R. L., & Thorell, L. G. (1980). Visual cortical neurons: are bars or gratings the optimal stimuli? Science, 207(4426), 88-90. (pdf)

Adelson, E. H., & Bergen, J. R. (1991). The plenoptic function and the elements of early vision. In M. S. Landy & J. A. Movshon (Eds.), Computational Models of Visual Processing. Cambridge, MA: The MIT Press: A Bradford Book. (pdf)

ClassificationImage demo (ReverseCorrelation.nb)

Ahumada, A. J., Jr. (2002). Classification image weights and internal noise level estimation. J Vis, 2(1), 121-131. (pdf)

Assignmt #2Convolve.nb (7%)
11
11. Coding efficiency: Retina

11.CodingEfficiency.nb
(pdf)

Meister, M., & Berry, M. J., 2nd. (1999). The neural code of the retina. Neuron, 22(3), 435-450. (pdf)

Laughlin, S. (1981). A simple coding procedure enhances a neuron's information capacity. Z Naturforsch [C], 36(9-10), 910-912. (pdf)

Srinivasan, M. V., Laughlin, S. B., & Dubs, A. (1982). Predictive coding: a fresh view of inhibition in the retina. Proc R Soc Lond B Biol Sci, 216(1205), 427-459. (pdf)

ColorAlpine256x256.jpg
GrayAlpine256x256.jpg
Graygranite256x256.jpg
Grass64x64.jpg
granite64x64.jpg

 
16

12. Coding efficiency: Cortex

12.SpatialCodingEfficiency.nb
(pdf)

Simoncelli, E. P., & Olshausen, B. A. (2001). Natural image statistics and neural representation. Annu Rev Neurosci, 24, 1193-1216. (pdf)

Laughlin, S. B., de Ruyter van Steveninck, R. R., & Anderson, J. C. (1998). The metabolic cost of neural information. Nat Neurosci, 1(1), 36-41. (pdf)

 
IV. Intermediate-level vision,
integration, grouping
18
13. Edge detection

13.EdgeDetection.nb
(pdf)
Hubel, D. H., & Wiesel, T. N. (1977). Ferrier lecture. Functional architecture of macaque monkey visual cortex. Proc R Soc Lond B Biol Sci, 198(1130), 1-59. (pdf)

 

23
  MID-TERM

MID-TERM Study guide (pdf)

MID-TERM (16%)
25
14. Contrast normalization, Scenes from images

14.ScenesfromImages.nb
(pdf)

von der Heydt R (2003) Image parsing mechanisms of the visual cortex. In: The Visual Neurosciences (Werner JS, Chalupa LM, eds.), pp 1139-1150. Cambridge, Mass.: MIT Press. (pdf)

deer.jpg
(from "Walter Wick's Optical Tricks")

Zhou H, Friedman HS, von der Heydt R (2000) Coding of border ownership in monkey visual cortex. J Neuroscience 20: 6594-6611

 

Oct 30
15. Surface geometry, Scene-based generative models

15.SurfaceGeometryDepth.nb
(pdf)

Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian Inference. Annual Review of Psychology, 55, 271-304. (pdf link)

ProjectIdeasF2006.nb (pdf)

 
Nov 1
16. Shape-from-X

16.ShapeFromX.nb
(pdf)

RDS.m, ShowStereo2.m, ImplicitSolids.m

Reflectance map: Shape from shading: Horn BKP (1986) Robot Vision. Cambridge MA: MIT Press. Ch 11 (pdf)

Assignmt_3Illusions.nb
(pdf)
(7%)

bluradaptationdemo

(Webster et al. pdf)

motion-induced-blindness demo

(Bonneh et al. pdf)

 
6
17. Shape-from-shading, bas-relief, Bayesian estimators

17.basrelief.nb
(pdf)

Belhumeur, P. N., Kriegman, D. J., & Yuille, A. (1997). The Bas-Relief Ambiguity. (pdf)

 
8
18. Motion: optic flow

18.MotionOpticFlow.nb
(pdf)

Horn, B. K. P., & Schunck, B. G. (1981). Determining Optical Flow. Artificial Intelligence, 17, 185-203. (pdf)

 
13
19. Motion: biological, human perception

19.MotionHumanPerception.nb
(pdf)

Weiss, Y., Simoncelli, E. P., & Adelson, E. H. (2002). Motion illusions as optimal percepts. Nat Neurosci, 5(6), 598-604. (pdf)

Heeger, D. J., Simoncelli, E. P., & Movshon, J. A. (1996). Computational models of cortical visual processing. Proc Natl Acad Sci U S A, 93(2), 623-627.

aperturedemomovie.mov (quicktime)
http://psych.la.psu.edu/clip/Perception.htm
http://epunix.biols.susx.ac.uk/Home/George_Mather/Motion/index.html

Assignmt4SceneImageModels.nb
(7%)  
15
20. Material perception

20.SurfaceMaterial.nb
(pdf)

Adelson, E. H. (1993). Perceptual organization and the judgment of brightness. Science, 262, 2042-2044 (pdf)

Fleming, R. W., Dror, R. O., & Adelson, E. H. (2003). Real-world illumination and the perception of surface reflectance properties. J Vis, 3(5), 347-368. (link)

Adelson, E. H. (2000). Lightness Perception and Lightness Illusions. Chapter 24 in M. Gazzaniga, ed., The New Cognitive Neurosciences, 2nd ed. Cambridge, MA: MIT Press, 339-351. (html)

http://web.mit.edu/persci/people/adelson/checkershadow_illusion.html
http://vision.psych.umn.edu/www/kersten-lab/demos/transparency.html
http://gandalf.psych.umn.edu/~kersten/kersten-lab/demos/MatteOrShiny.html

Final project title & paragraph outline (2%)
20
21. Texture.

21.Texture.nb
(pdf)

Heeger DJ and Bergen JR, Pyramid Based Texture Analysis/Synthesis, Computer Graphics Proceedings, p. 229-238, 1995. (pdf).

 
22
(Thanksgiving,
the 23rd)

22. Science writing

22.ScienceWriting.nb

Gopen & Swan, 1990 (pdf)  

 

 
V. High-level vision
27
23. Perceptual integration, cue integration, cooperative computation

23.PerceptualIntegration.nb
(pdf)

Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: mandatory fusion within, but not between, senses. Science, 298(5598), 1627-1630. (pdf)

 

McDermott, J., Weiss, Y., & Adelson, E. H. (2001). Beyond junctions: nonlocal form constraints on motion interpretation. Perception, 30(8), 905-923. http://www.perceptionweb.com/perc0801/square.html

 
Nov 29

24. Object recognition

 

24.ObjectRecognition.nb
(pdf)

Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568. (pdf)

Tanaka K (2003) Columns for complex visual object features in the inferotemporal cortex: clustering of cells with similar but slightly different stimulus selectivities. Cerebral Cortex 13:90-99. (pdf)

 

Dec 4
25. Object perception & cortex

25.ObjectRecBackground.nb
(pdf)

Supplement: LearningCamouflage (pdf)

Grill-Spector, K. (2003). The neural basis of object perception. Curr Opin Neurobiol, 13(2), 159-166. (pdf)


Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci, 2(1), 79-87. (pdf)

Bullier, J. (2001). Integrated model of visual processing. Brain Res Brain Res Rev, 36(2-3), 96-107. (pdf)

Cohen and Tong (pdf)

Brady MJ, Kersten D (2003) Bootstrapped learning of novel objects. J Vis 3:413-422. http://journalofvision.org/3/6/2/

Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nat Neurosci, 5(7), 682-687. (pdf)

Example of excellent writing, and ideal observer analysis, Pelli et al., 2006 (pdf)
6
26. Vision for action: attention, eye movements, heading. Science writing 2.

26.SpatialLayoutScenes.nb
(pdf)


Longuet-Higgins, H. C., & Prazdny, K. (1980). The Interpretation of a Moving Retinal Image. Proceedings of the Royal Society of London B, 208, 385-397. (pdf)

Horn BKP (1986) Robot Vision. Cambridge MA: MIT Press., chapter 17 (pdf)

Schrater PR, Kersten D (2000) How optimal depth cue integration depends on the task. International Journal of Computer Vision 40:73-91. (pdf)

 
11
27. Learning, concepts & categories, Theories of cortex

Lecture27Cortex.pdf

Tomaso Poggio and Christian R. Shelton (1999). "Machine Learning, Machine Vision, and the Brain." AI Magazine, 20(3), 37-55. (pdf)

Top-Down Control of Visual Attention in Object Detection. Aude Oliva, Antonio Torralba, Monica S. Castelhano and John M. Henderson (2003). International Conference on Image Processing (ICIP), Vol. I, pages 253-256. September 14-17, Barcelona, Spain. (pdf)

Tenenbaum JB: Bayesian modeling of human concept learning. In Advances in Neural Information Processing Systems. Edited by Kearns MS, Solla SA, Cohn DA. Cambridge, MA: MIT Press; 1999. (pdf)

DRAFT DUE Monday, December 11th, 9 AM.

Complete Draft of Final Project (5%: 2 pts for completing Introduction, 2 pts for completing Methods, 1 pt for completing Discussion)

 
13
(Last day of classes)
  FINAL EXAM

Final Study Guide (pdf)

FINAL EXAM (16%)

(Drafts returned December 14th)

19
Final Revised Draft of Project due (33%)

 

 


Final Project Assignment.


ProjectIdeasF2006.nb (pdf)

Goal: This course integrates the behavioral, neural and computational principles of perception. Students often find the interdisciplinary integration to be the most challenging aspect of the course. Through writing, you will learn to synthesize results from diverse and typically isolated disciplines. By writing about your project work, you will learn to think through the broader implications of your project, and to communicate effectively the rationale and results of your computer project in words. You will write a final research report in which you describe, in the form of a scientific paper, the results of an original computer simulation.

Completing the final paper involves 3 steps:

  1. You will submit a working title and paragraph outline. These outlines will be critiqued in order to help you find an appropriate focus for your papers. (2% of grade)
  2. You will then submit a complete draft of your paper. Each paper will be reviewed with specific recommendations for improvement. (5% of grade)
  3. You will submit a final revision for grading. (33% of grade)

Your final project will involve: 1) a computer simulation; and 2) a 2000-3000 word final paper describing your simulation. For your computer project, you will do one of the following: 1) write a program to simulate a model from the computer vision literature; 2) design and program a method for solving some problem in perception; or 3) design and program a psychophysical experiment to study an aspect of human visual perception. The results of your final project should be written up in the form of a short scientific paper or Mathematica Notebook, describing the motivation, methods, results, and interpretation.

If you choose to write your program in Mathematica, your paper and program can be combined and formatted as a single Mathematica notebook. See: Books and Tutorials on Notebooks.

Your paper will be critiqued and returned for you to revise and resubmit in final form. You should write for an audience consisting of your class peers.

    1. Outline. You must submit a title and paragraph outline of your intended paper by the deadline noted in the syllabus. (Consult with the instructor or TA for ideas well ahead of this first deadline.)
    2. Complete draft. A double-spaced, complete draft of the paper must be turned in by the deadline noted in the syllabus. Papers should be between 2000 and 3000 words. Papers must include the following sections: Introduction, Methods, Results, Discussion, and Bibliography. Use citations to motivate your problem and to justify your claims. Cite authors by name and date, e.g. (Marr & Poggio, 1979). Use a standard citation format, such as APA. (The UM library has information on research, citation style, and in particular APA style.) Papers must be typed, with a page number on each page.
    3. Final draft. The final draft must be turned in by the date noted on the syllabus. Students who wish to submit their final papers to be published in the class electronic journal should turn in both paper and electronic copies of their reports.

     

    Some Resources:

    Student Writing Support: Center for Writing, 306b Lind Hall and satellite locations (612.625.1893) http://writing.umn.edu.
    Online Writing Center: http://writing.umn.edu/sws

    NOTE: Plagiarism, a form of scholastic dishonesty and a disciplinary offense, is described by the Regents as follows: Scholastic dishonesty means plagiarizing; cheating on assignments or examinations; engaging in unauthorized collaboration on academic work; taking, acquiring, or using test materials without faculty permission; submitting false or incomplete records of academic achievement; acting alone or in cooperation with another to falsify records or to obtain dishonestly grades, honors, awards, or professional endorsement; or altering, forging, or misusing a University academic record; or fabricating or falsifying of data, research procedures, or data analysis. http://www1.umn.edu/regents/policies/academic/StudentConductCode.html. See also: http://writing.umn.edu/tww/plagiarism/