Jeffrey A. Bilmes
E E 512
Bayesian networks, Markov random fields, factor graphs, Markov properties, standard models as graphical models, graph theory (e.g., moralization and triangulation), probabilistic inference (including pearl's belief propagation, Hugin, and Shafer-Shenoy), junction threes, dynamic Bayesian networks (including hidden Markov models), learning new models, models in practice. Prerequisite: E E 508; E E 511.
Course Announcement: EE-512 Graphical Models, Spring 2006, SLN-3204 Instructor: Jeff A. Bilmes, 418 EE/CS Bldg. Email: firstname.lastname@example.org Course web page: TBA Units: 4 Course Hours: TTh 1:30-3:20 EE1-054
Description: This course will serve as a thorough overview of graphical models including Bayesian networks, undirected models, and factor graphs. Topics will include:
1) Graph types: conditional independence; directed, undirected, and factor models; algorithms for conditional independence (e.g., Bayes-ball, d-separation, Markov properties on graphs, factorization, Hammersley-Clifford theorems).
2) Models: linear Gaussian models, mixture models, factor analysis, probabilistic decision trees, Markov Random Fields, Gibbs distributions, static conditional random fields (CRFs), multivariate Gaussians as graphical models.
3) Dynamic (temporal) models: Hidden Markov Models, Kalman filtering and linear-Gaussian HMMs, linear dynamic models, dynamic Bayesian networks (DBNs), label and observation bias in natural language processing, dynamic conditional random fields (CRFs), and general dynamic graphical models.
4) Chordal Graph Theory: moralization; triangulated, decomposable, and intersection graphs, $k$-trees, hyper-graphs. Tree-width and path-width parameters of a graph. Graph separators and their crossings. Relational data-base systems and joins, and phylogenetic tree construction.
5) Exact Probabilistic Inference: junction trees, belief propagation in its various forms (including Pearl's formulation, Hugin, Shafer-Shenoy, Bucket-elimination, etc.); join-trees and data-structures; optimal triangulations. The elimination family of algorithms. Generalized distributed law (GDL). Relation to dynamic programming. Generality (such as Viterbi, MPE, the fast Fourier transform). NP hardness results, and NP hardness results for finding the best form.
6) Inference as search: Search-based approaches to inference and how it relates to junction trees, message passing, and BP. Results from the constraint satisfaction (CSP) and SAT literatures, constraint solvers, and heuristics used in these domains. Cutset and recursive conditioning, comparison with elimination. Branch-and-bound.
7) Approximate Probabilistic Inference: loopy belief propagation (BP), expectation propagation (EP), generalized belief propagation (GBP), cluster GBP, comparisons with mean-field and variational approaches, statistical physics, Bethe and Kikuchi free-energies, fixed points, recent theoretical results and open questions. Also, sampling approaches such as MCMC and particle algorithms (such as Rao-Blackwellized particle filtering). Pruning based approaches.
8) Learning: learning Bayesian networks, hardness results for learning networks in the maximum likelihood sense, learning in the PAC (probably approximately correct) framework, EM for parameter and structure learning, alternating minimization, discriminative parameter and structure learning, CRF training vs. discriminatively trained HMM, other learning methods. Information theory and graphical models.
9) Models in practice: Real-world static graphs. HMMs and DBNs in speech, language, and bioinformatics, QMR and Factorial HMMs, turbo-coding, low-density parity check codes, other codes on graphs, belief propagation algorithms on these graphs. Practical issues such as data structures and algorithms useful for performing inference.
The course will have homework assignments, a midterm exam, and a final project (a research paper and final presentation). There will be no final exam.
Course materials: Main Texts: 1) "An Introduction to Graphical Models", by Mike Jordan (will use a pre-print directly the author) 2) Online course reader and handouts.
1) Chapters from "Graphical Models" by Steffen Lauritzen, Oxford Science Publications 2) Chapters from Koller/Friedman text.
Reference Texts: 1) "Neural Networks for Pattern Recognition", by C. Bishop 2) "The Elements of Statistical Learning: Data Mining, Inference," and Prediction Hastie, Tibshirani, and Friedman 3) "Pattern Classification," R. Duda, P. Hart and D. Stork
Prerequisite: Statistical Pattern Recognition I 511, or prior exposure to pattern recognition concepts, or consent of instructor.
Student learning goals
General method of instruction
Class assignments and grading