CS 4476 / 6476 Computer Vision
Fall 2016, MWF 11:05 to 12:55, College of Computing room 16
Instructor: James Hays
TAs: Shray Bansal, Zhaoyang Lv, Huda Alamri, Varun Agrawal, Mahita Mahesh, Amit Raj, Cusuh Ham.

Course Description
This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.
The difference between the undergraduate version of the class (CS4476) and the graduate version (CS6476) will be the requirements on the projects. In particular, more challenging extensions of the projects will be extra credit for CS4476 but required for CS6476.
The Advanced Computer Vision course (CS7476) in spring will build on this course and deal with advanced and research-related topics in Computer Vision, including Machine Learning, Graphics, and Robotics topics that impact Computer Vision.
Learning Objectives
Upon completion of this course, students should be able to:
- 1. Recognize and describe both the theoretical and practical aspects of computing with images. Connect issues from Computer Vision to Human Vision.
- 2. Describe the foundation of image formation and image analysis. Understand the basics of 2D and 3D Computer Vision.
- 3. Become familiar with the major technical approaches involved in computer vision. Describe various methods used for registration, alignment, and matching in images.
- 4. Get an exposure to advanced concepts leading to object and scene categorization from images.
- 5. Build computer vision applications.
Prerequisites
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:
- Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
- Programming: A good working knowledge of programming environments that support image and video analysis. All lecture code and project starter code will be in MATLAB. Students are strongly encouraged to use MATLAB, and the TAs will support questions about MATLAB. If you've never used MATLAB, that is OK.
- Math: Linear algebra, vector calculus, and probability. Linear algebra is the most important, and students who have not taken a linear algebra course have struggled in the past.
Grading
Your final grade will be made up from:
- 80%: 6 programming projects
- 20%: 2 written quizzes
These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself.
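As a rough illustration of the 80% / 20% split in this section, and of the graduate extra-credit rule described under Graduate Credit, here is a minimal Python sketch. It assumes all scores are on a 0 to 100 scale and that the six projects (and the two quizzes) are weighted equally among themselves; the syllabus does not state either of these. The function names `final_grade` and `credited_extra_points` are hypothetical, not part of any course tooling.

```python
def final_grade(project_scores, quiz_scores):
    """Combine scores using the 80% projects / 20% quizzes split.

    Assumption (not stated in the syllabus): every score is on a
    0-100 scale, and projects and quizzes are each averaged with
    equal weight.
    """
    project_avg = sum(project_scores) / len(project_scores)
    quiz_avg = sum(quiz_scores) / len(quiz_scores)
    return 0.8 * project_avg + 0.2 * quiz_avg


def credited_extra_points(earned, graduate):
    """Extra-credit points on one project that count as a bonus.

    For CS 6476 the first 10 points are required work, so only the
    points beyond 10 count as extra credit. For CS 4476 all earned
    extra-credit points count.
    """
    required_for_graduate = 10
    if graduate:
        return max(0, earned - required_for_graduate)
    return earned


if __name__ == "__main__":
    # Six projects at 90 and quizzes at 80 and 70.
    print(final_grade([90] * 6, [80, 70]))
    # A CS 6476 student who completes 15 points of extensions.
    print(credited_extra_points(15, graduate=True))
```

How a CS 6476 shortfall below the required 10 points affects the project score is not specified here, so the sketch only reports the bonus portion.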
Graduate Credit
If you are enrolled in the graduate section CS 6476 then you will be expected to do additional work on each project. Each project will list several extra credit opportunities available and CS 6476 students will be required to do at least 10 points worth of extra credit (for which you will not get extra credit, unless you do more than 10 points worth).
Academic Integrity
Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. For quizzes, no supporting materials are allowed (notes, calculators, phones, etc.).
You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. That's fine. Feel free to include results built on other software, as long as you are clear in your hand-in that it is not your own work.
Learning Accommodations
If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the ADAPTS office (www.adapts.gatech.edu).
Important Links:
- Piazza for CS 4476 / 6476. This should be your first stop for questions and announcements.
- t-square.gatech.edu will be used to hand in assignments.
- MATLAB Tutorial
- Get MATLAB from software.oit.gatech.edu
Contact Info and Office Hours:
If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.
- James: hays[at]gatech.edu
- GTA Huda Alamri: halamri3[at]gatech.edu
- GTA Shray Bansal: sbansal34[at]gatech.edu
- GTA Zhaoyang Lv: zhaoyang.lv[at]gatech.edu
- GTA Varun Agrawal: varunagrawal[at]gatech.edu
- GTA Mahita Mahesh: mahitamahesh[at]gatech.edu
- GTA Amit Raj: amit.raj[at]gatech.edu
- TA Cusuh Ham: cusuh[at]gatech.edu
- James, Monday 3pm and Wednesday 1pm (CCB 315).
- TA hours: Monday 12 to 3 (CCB 360), Tuesday 11 to 1 (CCB 360), Thursday 11 to 1 (TBD).
Textbook
Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or available for purchase.
Syllabus
Class Date | Topic | Slides | Reading | Projects |
Mon, Aug 22 | Introduction to computer vision | pptx, pdf | Szeliski 1 | |
Wed, Aug 24 | Cameras and Optics | pptx, pdf | Szeliski 2.1, especially 2.1.5 | Project 1 out |
Fri, Aug 26 | Light and Color | pptx, pdf | Szeliski 2.2 and 2.3 | |
Mon, Aug 29 | Image Filtering | pptx, pdf | Szeliski 3.2 | |
Wed, Aug 31 | Thinking in frequency | pptx, pdf | Szeliski 3.4 | |
Fri, Sept 2 | Thinking in frequency part 2 | pptx, pdf | Szeliski 3.5.2 and 8.1.1 | |
Mon, Sept 5 | No classes, Institute holiday | |||
Wed, Sept 7 | Edge detection | pptx, pdf | Szeliski 4.2 | Project 1 due |
Fri, Sept 9 | Interest points and corners | pptx, pdf | Szeliski 4.1.1 | Project 2 out |
Mon, Sept 12 | Local image features | pptx, pdf | Szeliski 4.1.2 | |
Wed, Sept 14 | Feature matching and Hough transform | pptx, pdf | Szeliski 4.1.3 and 4.3.2 | |
Fri, Sept 16 | Model fitting and RANSAC | pptx, pdf | Szeliski 6.1 and 2.1 | |
Mon, Sept 19 | Stereo intro | pptx, pdf | Szeliski 11 | |
Wed, Sept 21 | Camera Calibration | pptx, pdf | Szeliski 6.2.1 | |
Fri, Sept 23 | Epipolar Geometry and Structure from Motion | pptx, pdf | Szeliski 7 | Project 2 due |
Mon, Sept 26 | Feature Tracking and Optical Flow | pptx, pdf | Szeliski 8.1 and 8.4 | Project 3 out |
Wed, Sept 28 | Optical Flow continued | pptx, pdf | ||
Fri, Sept 30 | Machine learning: unsupervised learning | pptx, pdf | Szeliski 5.3 | |
Mon, Oct 3 | Machine learning: Supervised learning | pptx, pdf | Szeliski 5.3 | |
Wed, Oct 5 | Quiz 1 | |||
Fri, Oct 7 | Recognition overview and bag of features | pptx, pdf | Szeliski 14 | |
Mon, Oct 10 | No classes, Institute holiday | | | Project 3 due |
Wed, Oct 12 | No lecture, work on project 4 | | | Project 4 out |
Fri, Oct 14 | Large-scale instance recognition | pptx, pdf | Szeliski 14.3.2 | |
Mon, Oct 17 | Large-scale instance recognition, continued | pptx, pdf | ||
Wed, Oct 19 | Large-scale category recognition and advanced feature encoding | pptx, pdf | ||
Fri, Oct 21 | Detection with sliding windows: Viola-Jones | pptx, pdf | Szeliski 14.1 and 14.2 | |
Mon, Oct 24 | Detection with sliding windows: Dalal-Triggs | pptx, pdf | Szeliski 14.1 | |
Wed, Oct 26 | Pascal VOC and Big Data | pptx, pdf | Szeliski 14.5 | Project 4 due |
Fri, Oct 28 | Big Data 2 | pptx, pdf | | Project 5 out |
Mon, Oct 31 | Human computation and crowdsourcing | pptx, pdf | ||
Wed, Nov 2 | Attributes and more crowdsourcing | pptx, pdf | ||
Fri, Nov 4 | Modern boundary detection and Sketches | pptx, pdf | Szeliski 4.2 | |
Mon, Nov 7 | Context, Spatial Layout, and scene parsing | pptx, pdf | ||
Wed, Nov 9 | Neural networks | pptx, pdf | ||
Fri, Nov 11 | Convolutional networks for recognition | pptx, pdf | | Project 5 due |
Mon, Nov 14 | Object Detectors Emerge in Deep Scene CNNs | pptx, pdf | | Project 6 out |
Wed, Nov 16 | Deep Geolocalization | pptx, pdf | ||
Fri, Nov 18 | MS COCO and Deeper Deep Architectures | pptx, pdf | ||
Mon, Nov 21 | Structured output from Deep Learning | pptx, pdf | ||
Wed, Nov 23 | No classes, Institute holiday | |||
Fri, Nov 25 | No classes, Institute holiday | |||
Mon, Nov 28 | "Unsupervised" Learning and Style Transfer | pptx, pdf, pdf2 | ||
Wed, Nov 30 | Generative Networks - Colorization | pptx, pdf | ||
Fri, Dec 2 | Quiz 2 | |||
Mon, Dec 5 | No classes, final instructional period | | | Project 6 due, Tuesday 11:55pm |
Wed, Dec 7 | No classes, reading period | |||
Wed, Dec 14 | Final Exam Period - not used |