CS 4616
Pattern Recognition
Fall 2008
Howey (Physics) S204
MWF 2:00 - 3:00
This course provides an introduction to the theory and practice of pattern
recognition. The goal of this class is to provide a set of unifying concepts and
theory for standard pattern recognition methods and problems, along with
significant practical experience in applying these methods to real-world
datasets from a number of different problem domains.
Instructor
Jim Rehg
Email: rehg@cc.gatech.edu
Office: TSRB 230B
Office hours: MWF 4:00 - 5:00
Phone: 404-894-9105 (email preferred)
Prerequisites
Familiarity with probability and statistics at the level of an introductory
undergraduate course is the main prerequisite. Linear algebra will be important
for a portion of the material, so an introductory linear algebra course is also
expected, along with familiarity with programming. Please contact me if you have questions.
Text
The class textbook is Pattern Recognition and Machine Learning by
Christopher M. Bishop, Springer, 2006. There is a website with
supplementary material.
Other general pattern recognition texts:
- The Elements of Statistical Learning by T. Hastie, R. Tibshirani,
and J. Friedman, Springer, 2001. Broad treatment of much of our course
material from a statistician's perspective.
- Information Theory, Inference and Learning Algorithms by
D. J. C. MacKay, Cambridge University Press, 2003.
- Pattern Classification by Duda, Hart, and Stork, Wiley, 2001.
Updated version of the classic pattern recognition text.
- Neural Networks for Pattern Recognition by C. Bishop,
Oxford University Press, 1995.
- Data Mining (2nd Ed.) by I. Witten and E. Frank, Morgan Kaufmann, 2005.
Companion text to the Weka software; you may prefer the on-line documentation
instead.
More specialized texts:
- Learning with Kernels by B. Scholkopf and A. J. Smola, MIT Press,
2002
- An Introduction to Support Vector Machines by N. Cristianini and J.
Shawe-Taylor, Cambridge University Press, 2000
- Gaussian Processes for Machine Learning by C. E. Rasmussen and C.
K. I. Williams, MIT Press, 2006
- Learning in Graphical Models by M. I. Jordan (ed.), MIT Press, 1999
- An Introduction to Computational Learning Theory by M. J. Kearns
and U. V. Vazirani, MIT Press, 1994
- Statistical Learning Theory by V. N. Vapnik, Wiley, 1998
Recommended text books for background material:
- All of Statistics by L. Wasserman, Springer, 2004.
Clear and concise summary of relevant probability and statistics material.
- Probability and Random Processes by G. R. Grimmett and D. Stirzaker.
Standard introductory probability text.
- Probability, Random Variables, and Stochastic Processes by A. Papoulis.
Classic engineering text for probability theory and its application. Also see
the on-line lecture notes for EE178 at Stanford.
- Probability Theory: The Logic of Science by E. T. Jaynes. [online]
Classic text on probability theory; chapters 1 and 2 in particular are good
background reading.
- Introduction to Probability by D. P. Bertsekas and J. N.
Tsitsiklis [used to be on-line, unfortunately no longer]
- Linear Algebra and Its Applications or Introduction to Linear
Algebra by G. Strang
- Matrix Reference Manual [online]
Organization
Grades will be assessed as follows:
| Problem Sets | 70% |
| Midterm Exam | 10% |
| Final Exam | 15% |
| Participation | 5% |
The midterm will be a take-home exam that will not require any programming.
The final exam will be administered during our assigned exam period at the end
of the semester.
Problem Sets
There will be approximately 4 problem sets, each of one or two weeks' duration. A
number of them will require implementation of a vision algorithm or small vision
system. I will provide a Python-based software infrastructure for these
projects, but you will also be free to use Java, C/C++, or other languages. I
will announce the projects and provide pointers to the software environment
shortly.
There will also be a miniproject of slightly longer duration, for which you
will be allowed to choose your own topic. If you are currently involved in a
computer vision research project, you may do some additional work on this topic
for your miniproject. I will also provide a list of project suggestions. Grading
for the miniproject will be based on in-class presentations of the projects,
which will take place during the last week of classes.
You are encouraged to work in teams of 2-3 people on the problem sets and
miniprojects. (Of course all exams must be your individual work.) Team-forming
will begin next week, and I will say more about it in class.
All problem sets and exams must be turned in on time. No late submissions
of work will be accepted. I realize that conference travel and job trips may
conflict with a problem set deadline. As a result, you will be allowed to
replace your grade for one of the problem sets with your average grade for the
others, effectively removing that problem set from consideration. You can only
do this once.
Problem Sets and Exams
- 9/8/08: PS 1 (Due in class on Wed Sept 17)
- 10/10/08: PS 2 (Due in class on Fri Oct 24)
- 10/27/08: Midterm (Due at my office or by email by 5:00pm, 10/28/08). Study Guide
- 12/10/08: Final exam, 11:30 - 2:30. Study Guide
Syllabus
Below is a rough outline of the course; I will update it as we go along.
- Unit 1 (slides)
- Overview of elements of a pattern recognition problem
- Example problem: Skin detection in images (paper)
- Overview of statistical inference (frequentist and Bayesian), decision
theory, and generative/discriminative modeling
- Probability review: marginalization, factorization, and Bayes rule.
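The Bayes rule review in Unit 1 can be sketched numerically. Below is a minimal two-class example in the spirit of the skin-detection problem; the likelihood and prior values are made up for illustration, not taken from the paper:

```python
# A minimal sketch of Bayes' rule for a two-class decision. The denominator
# marginalizes over the two classes (skin / not-skin).

def posterior(likelihood_pos, likelihood_neg, prior_pos):
    """P(skin | x) via Bayes' rule for a binary class variable."""
    prior_neg = 1.0 - prior_pos
    evidence = likelihood_pos * prior_pos + likelihood_neg * prior_neg
    return likelihood_pos * prior_pos / evidence

# Example: a pixel color 4x more likely under the skin model,
# but skin pixels are rare a priori (all numbers hypothetical).
p = posterior(likelihood_pos=0.2, likelihood_neg=0.05, prior_pos=0.1)
print(round(p, 4))  # posterior probability that the pixel is skin
```

Note that even a likelihood ratio of 4 in favor of skin yields a posterior below 0.5 here, because the prior strongly favors the non-skin class.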
- Unit 2
- Probability review: conditional mean and variance. Interpretation of
conditional mean as a function. Law of iterated expectations
- Probability review: Gaussian distribution, geometry of covariance matrix structure, mean and covariance
- Frequentist statistical inference, bias-variance decomposition for squared
loss
- Maximum likelihood (ML) method, ML learning of mean and variance of Gaussian
- Bias-variance decomposition for ML mean estimate, biased nature of ML
covariance estimate
- Decision theoretic optimal linear regressor is the conditional mean, which
minimizes the expected squared loss
- General form of Bayes classifier in the K-class case, discussion of Bayes
error.
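The biased nature of the ML variance estimate in Unit 2 is easy to check numerically. This sketch (arbitrary seed, toy parameters) averages the ML estimate over many small samples and recovers the (N-1)/N shrinkage factor:

```python
# Numerical check that the ML variance estimate (divide by N) systematically
# underestimates the true variance by a factor of (N-1)/N.
import random

def ml_mean_var(xs):
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # ML estimate: divide by N
    return mu, var

random.seed(0)
N, trials, true_var = 5, 20000, 4.0
avg_ml_var = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 2.0) for _ in range(N)]
    avg_ml_var += ml_mean_var(sample)[1]
avg_ml_var /= trials

# E[ML variance] = (N-1)/N * true variance = 0.8 * 4.0 = 3.2
print(round(avg_ml_var, 2))
```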
- Unit 3
- Linear regression with fixed basis functions, generative data model based
on additive Gaussian noise.
- ML estimator for linear regressor in scalar case, interpretation as least squares minimizer and geometry of LS
- Logistic regression for classification (discriminative approach)
- Geometry of hyperplanes for linear classification
- ML estimator for logistic regression in two-class case, overfitting
behavior of ML estimator.
- Newton-Raphson for computing weight vector (IRLS algorithm), connection to
nonlinear least squares
- Generative interpretation of binary logistic regression for
Gaussian class conditional distributions with shared covariances
- Bias-variance decomposition for linear regression
- Regularized linear regression
- Summary of issues and approaches
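The least-squares interpretation of the ML linear-regression estimator in Unit 3 can be illustrated with the scalar closed-form solution; the toy data below is noise-free so the true weights are recovered exactly:

```python
# Closed-form least-squares fit of a scalar linear model y = w0 + w1*x,
# matching the ML estimator under additive Gaussian noise.

def fit_line(xs, ys):
    """Least-squares solution for (w0, w1) in the scalar case."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = my - w1 * mx
    return w0, w1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x (noise-free for clarity)
w0, w1 = fit_line(xs, ys)
print(w0, w1)  # -> 1.0 2.0
```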
- Unit 4 (Lecture
1 and
Lecture 2)
- Neural network view of logistic regression
- Multi-layer neural network architecture
- Neural network training, forward and backward passes
- Backpropagation for recursive computation of weight gradients
- Properties of neural net training: efficiency, on-line implementation,
modularity
- Regularization via early stopping
- Elements of Support Vector Machines: Max margin classifier, Kernel method,
Quadratic programming
- Overview of max margin classification and structured risk minimization
- Kernel density estimation via Parzen window method
- Kernelized linear regression, Gram matrix, duality
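As one concrete example from Unit 4, the Parzen-window method estimates a density by averaging kernels centered on the samples. A minimal 1-D sketch with a Gaussian kernel and toy data:

```python
# Parzen-window density estimate: p(x) is the average of Gaussian kernels
# centered on the training samples, scaled by the bandwidth h.
import math

def parzen(x, samples, h):
    """1-D kernel density estimate at x with Gaussian kernel and bandwidth h."""
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    kernel = lambda u: norm * math.exp(-0.5 * u * u)
    return sum(kernel((x - s) / h) for s in samples) / (len(samples) * h)

samples = [-1.0, 0.0, 1.0]       # toy sample set
density = parzen(0.0, samples, h=1.0)
print(round(density, 4))
```

Shrinking h makes the estimate spikier (lower bias, higher variance); widening it smooths the estimate toward a blur of the data.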
- Unit 5
- Efficiency of kernel substitution: Implementing inner products in feature
space consisting of degree d monomial features
- Necessary conditions for kernel functions: symmetry and positive
definiteness; connection to Gram matrix
- Reproducing Kernel Property: definition of feature space via reproducing
kernel map, representer of evaluation, span of Gram matrix
- Designing kernels may be easier than embedding complex objects (e.g.
strings, time series, documents) in an inner product space
- Estimating kernel parameters via cross-validation, empirical equivalence
of kernel solutions
- Optimization with equality constraints: Constraint surface, geometry of
stationary point, Lagrangian
- Optimization with inequality constraints: feasible set, KKT condition,
saddle point inequalities, necessary and sufficient conditions for minima,
convex functions, differential form of KKT conditions
- Derivation of SVM optimization problem: Margin maximization, distance
scaling, dual formulation, definition and properties of support vectors
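The efficiency argument at the start of Unit 5 can be checked directly: for the homogeneous degree-2 polynomial kernel in two dimensions, (x . y)^2 equals an ordinary inner product after the explicit monomial feature map. A small sketch (the test vectors are illustrative choices):

```python
# Kernel substitution check for k(x, y) = (x . y)^2 in 2-D: the kernel value
# equals the inner product after mapping to (x1^2, sqrt(2)*x1*x2, x2^2).
import math

def phi(v):
    x1, x2 = v
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, y = [1.0, 2.0], [3.0, 0.5]
k_direct = dot(x, y) ** 2           # kernel trick: O(n) work
k_feature = dot(phi(x), phi(y))     # explicit feature map: O(n^d) features
print(k_direct, round(k_feature, 10))  # the two agree
```

The same identity holds in n dimensions, where the monomial feature space has dimension growing combinatorially in d while the kernel evaluation stays linear in n.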
- Unit 6
- Slack-variable formulation for non-separable case
- Interpretation as regularized "soft margin" solution via hinge loss.
- Implementation issues: complexity of quadratic program, chunking,
extension to Platt's Sequential Minimal Optimization (SMO)
- Manifold view of data and regularity in kernel feature mappings
- Relationship to logistic and L2 regression via loss function comparison,
approximating 0-1 loss.
- Derivation of AdaBoost as additive model fit to exponential loss
- AdaBoost.M1 algorithm and properties
- Exponential loss comparison, robustness to overfitting, noise sensitivity
- No Free Lunch Theorem, the need to understand your data in practice, the
question of how to build "human-like" inductive bias into general learning
algorithms.
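The AdaBoost.M1 material in Unit 6 can be sketched compactly with 1-D threshold stumps. The toy data below is chosen so that no single stump is perfect but a few boosting rounds classify it exactly (the stump parameterization and all names are illustrative):

```python
# Compact AdaBoost sketch, following the additive-model / exponential-loss
# view: each round fits a weighted stump, then reweights so the misclassified
# points get emphasis in the next round.
import math

def stump_error(xs, ys, w, thresh, sign):
    # stump prediction: sign * (+1 if x > thresh else -1)
    return sum(wi for xi, yi, wi in zip(xs, ys, w)
               if sign * (1 if xi > thresh else -1) != yi)

def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n
    model = []  # list of (alpha, thresh, sign)
    for _ in range(rounds):
        candidates = [(stump_error(xs, ys, w, t, s), t, s)
                      for t in xs for s in (1, -1)]
        err, thresh, sign = min(candidates)
        err = max(err, 1e-12)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, thresh, sign))
        # exponential-loss reweighting, then normalize
        w = [wi * math.exp(-alpha * yi * sign * (1 if xi > thresh else -1))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    score = sum(a * s * (1 if x > t else -1) for a, t, s in model)
    return 1 if score > 0 else -1

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1, 1, -1, -1, 1, 1]   # no single stump separates this labeling
model = adaboost(xs, ys, rounds=5)
preds = [predict(model, x) for x in xs]
print(preds)
```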
Miniprojects
You will have a miniproject which will allow you to work in groups on a topic of
your choice during the last few weeks of the semester. The deliverable for this
project will be a presentation in class during the last three class periods
prior to final exam week. The deadline for forming your project team and
submitting a project proposal is Monday, November 10. Each team is required to
submit one project proposal (and obviously everyone in the class needs to be
involved in one proposal). It is fine if you want to work on a project by
yourself (of course you still need to submit a proposal). Proposals are
submitted to me by email and must contain the following information. Please use
the headings below in formatting your email so it is easy for me to pull out
your answers to these questions.
- Project Title
- Project Team members
- Project Description - What is the goal of the project, what problem are
you solving or what experiment will you be performing, and what do you hope to
accomplish by the end of the semester?
- Project Deliverables - What results and analysis will you have by the time
of the presentation of your project to the class?
- Work Plan - What work needs to be done to accomplish your objectives, and
how will you distribute that work among your project team? It is very
important that each person in the project team have a well-defined
responsibility for a specific portion of the work, and that these assignments
be balanced across the team.
- Datasets - What dataset will you be using? Do you already have experience
in using it? How large is it?
My main interest in your miniprojects is that you have the chance to learn
about a particular topic in more detail and gain some insight. In this regard,
the results you obtain in the miniproject are not as important as the insight
and explanation that you have. In other words, I am less interested in absolute
measures of performance such as error rates, etc. and more interested in how you
can explain and interpret your findings in terms of the strengths and weaknesses
of your approach. This is what I will be looking for in your project
presentations in terms of assigning project grades. Note that you are not
required to write a project report. All of the necessary information about your
project must be contained in your presentation to the class.
If you want to work in a project team but do not have any partners yet,
please send me email. If your team is formed, but you are not sure what
project you want to work on, then please send me an email containing your team
members and some areas of class material that you are interested in. I can
suggest some miniprojects based on that information.
Miniproject Presentation Guidelines
Your miniproject grade will be determined completely by your in-class
presentation and associated slides. In particular there is no formal written
report requirement for the miniproject. See below for information on how to
submit your slides and on the presentation order.
Since we have a small class this year, we can fit all of the project
presentations into two days. Because this class precedes my Computer Vision
class, we will need to start a bit earlier in order to fit everyone in and allow
laptop changes etc. Therefore, we will have presentations during the following
two periods:
- Wed 12/3 1:45-2:55
- Fri 12/5 1:45-2:55
You will have 10 minutes to give your presentation. This includes
whatever time you need to change laptops etc. I will time your presentation and
prevent you from going over time, so please use your allocated time wisely so
that we can remain on schedule. In the case of group projects, it is not
necessary that every project member give part of the presentation. It is better
if one (or at most two) people summarize the work of the group. As stated below,
you will have the opportunity to list the contribution of each project member.
If you feel that you have more details that you would like to present than time
permits, it is OK to include additional slides at the end of your presentation.
I can look at those if I have questions later.
Please include the following information in your presentation:
- Project goal, what you were trying to accomplish
- Technical approach, what method you were using to achieve your goal
- Outcome and results, what were the results of your efforts
- Analysis of your approach and results --- This is the most
important element of the presentation. I want to see your analysis of what
worked, what didn't work, why, and what could be done to improve the approach.
- At the end of your slide deck, please include a slide containing a list of
the contributions that each person in the team made to the project. In
particular, please list contributions to the final results and analysis. Do
not attempt to go over this slide in detail during your presentation; you
can't afford the time. I will look at this off-line.
Please keep in mind that the main purpose of the presentations is to enable
everyone to learn from the efforts of all of the project groups. Please
include images and other visualizations that make it easier to explain and
understand your goal, approach, and analysis. As in real life, the grade I
assign will reflect the fact that the quality and clarity of your presentation
is at least as important as the analysis and results that you obtain. I expect
everyone in
the class to attend all of the presentations next week. Please let me know in
advance if you cannot attend for some reason.
Submission Instructions
Please bring your slides to class on a USB drive. If you have a laptop, you
may use that to present your work. Otherwise please load your slides onto the
classroom computer prior to the start of class.
Please post your slides and any accompanying video files on a
publicly-accessible website, and send the link to me. Please ensure that the link remains active until the Monday following exam
week, as I will use the on-line version of your slides to assign grades. If you are using a
Mac to prepare your slides, please refrain from using image formats (such as
tiff) that require the Quicktime decoder, as this is often not available on PC
platforms and may cause your images to be unviewable. Please test drive your
presentation on a standard Windows PC to avoid problems.
Good luck, I am looking forward to seeing your presentations!