Reviews optimization and convex optimization in relation to statistics. Covers the basics of unconstrained and constrained convex optimization, basics of clustering and classification, entropy, KL divergence and exponential family models, duality, modern learning algorithms such as boosting and support vector machines, and variational approximations in inference. Prerequisite: experience with programming in a high-level language. Offered: W.
The course will cover various methods for solving optimization problems occurring in data fitting, such as the Cholesky, SVD, and conjugate gradient methods for linear least squares, algorithms for penalized least squares with L2 and L1 penalties, gradient boosting, the Levenberg-Marquardt method for nonlinear least squares, Newton and quasi-Newton methods, and the EM algorithm for maximum likelihood estimation. I will also discuss motivating examples from a variety of areas.
Student learning goals
General method of instruction
There will be a mixture of traditional lectures, discussion of homework assignments, and student presentations.
Statistics at least at the level of Stat/Math 390: familiarity with notions like "least squares", "conditional probability", "independence", "random variable", "probability density", "covariance".
Linear algebra: familiarity with notions like "vector space", "norm", "inner product", "projection", "eigenvector".
Some programming experience. The class will use the R language and environment, as well as C.
Class assignments and grading
There will be about five homework assignments on the material presented in the lectures.
In addition, every student will be expected to complete a project and present it during the last week of classes. Projects can, and ideally should, be motivated by the student's own research, but I can also suggest topics if needed.
The project will count for about two thirds of the grade, the homework assignments for one third.