(CS/CNS/EE 155) Machine Learning & Data Mining

Course Description

Prerequisite: background in algorithms and statistics (CS/CNS/EE/NB 154 or CS/CNS/EE 156a or instructor’s permission)

This course covers popular methods in machine learning and data mining, with an emphasis on developing a working understanding of how to apply these methods in practice. It also covers the core foundational concepts that underpin and motivate modern machine learning and data mining approaches. The course is research-oriented and will cover recent research developments.

Course Details

Late Homework Policy

Students are allowed 2 free late days for submitting homeworks and miniprojects. After the two free late days are used, a 50% penalty will be applied to submissions that are one day late, and submissions beyond one day late will not be accepted. You can use fractions of late days. For example, you can turn in a homework 12 hours late and use up half of a late day. Please specify how many late days you are using at the top of your homework when you submit it.
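The arithmetic of the late policy can be sketched as a small function. This is only an illustration under one assumed reading of the policy, not an official grading script: free late days are spent first, any lateness they do not cover earns 50% credit if it is at most one day, and anything later earns no credit. The function name `apply_late_policy`, its parameters, and the choice not to consume free days on a rejected submission are all assumptions made for the example.

```python
def apply_late_policy(hours_late, free_days_left=2.0):
    """Return (score_multiplier, free_days_left_after) for one submission.

    Assumed reading of the policy (hypothetical helper, not course code):
    - free late days are spent first, and fractions are allowed;
    - lateness not covered by free days is penalized: up to one
      uncovered day gives 50% credit, more than that gives no credit;
    - a rejected submission does not consume any free late days.
    """
    days_late = max(hours_late, 0.0) / 24.0
    free_used = min(days_late, free_days_left)
    uncovered = days_late - free_used

    if uncovered == 0:
        # Fully covered by free late days: no penalty.
        return 1.0, free_days_left - free_used
    if uncovered <= 1.0:
        # Free days exhausted, at most one further day late: 50% penalty.
        return 0.5, free_days_left - free_used
    # More than one uncovered day late: not accepted.
    return 0.0, free_days_left


# The example from the policy: 12 hours late uses half of a late day.
multiplier, remaining = apply_late_policy(12)
print(multiplier, remaining)
```

Under these assumptions, a student with no free late days left who submits 24 hours late would receive half credit, and one who submits 48 hours late would receive none.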

Instructor

Yisong Yue yyue@caltech.edu

Teaching Assistants

Bryan He bryanhe@caltech.edu
Masoud Farivar mfarivar@caltech.edu
Shenghan Yao syao@caltech.edu
Vighnesh Shiv vshiv@caltech.edu
Minfa Wang mwang5@caltech.edu
Vineet Augustine vaaugust@caltech.edu

Office Hours

Optional Textbook

Machine Learning: a Probabilistic Perspective, by Kevin Murphy.
Since this is an advanced-level course, all relevant course materials can be learned from research papers and supplementary lecture notes. However, this book is an excellent reference, and I will refer to various chapters of it throughout the course.

Assignments

Lectures & Recitation Schedule

Note: schedule is subject to change.

Each entry lists the date, the session topic, links to materials, and further reading where available.
1/06/2015 Lecture: Administrivia, Review [slides][pdf]
1/07/2015 Recitation: Linear Algebra & Optimization [pdf]
1/08/2015 Lecture: Review Part 2 [slides][pdf]
1/13/2015 Lecture: Regularization, Sparsity & Lasso [slides][pdf] Murphy 13.3
1/14/2015 Recitation: Probability & Statistics [pdf]
1/15/2015 Lecture: Recent Applications of Lasso [slides][pdf]
1/20/2015 Lecture: Sequence Prediction & HMMs [slides][pdf] Murphy 17.3–17.5
1/21/2015 Recitation: Viterbi Algorithm (no slides, sorry)
1/22/2015 Lecture: Conditional Random Fields [slides][pdf][notes] Wallach's intro to CRFs [pdf]
1/27/2015 Lecture: Recap of CRFs & Structural SVMs [slides][pdf][notes]
1/28/2015 Recitation: Gradient Descent for non-Differentiable Functions [pdf]
1/29/2015 Lecture: Structural SVMs Part 2 & General Structured Prediction [slides][pdf]
2/03/2015 Lecture: Decision Trees, Bagging & Random Forests [slides][pdf] Overview of Decision Trees [pdf]; Overview of Bagging [pdf]; Overview of Random Forests [pdf]
2/04/2015 Recitation: Brief Tutorial on Kaggle & Decision Tree Packages [pdf]
2/05/2015 Lecture: Boosting & Ensemble Selection [slides][pdf] Schapire's Overview of Boosting [pdf]
2/10/2015 Lecture: Learning Reductions & Recent Applications of DTs [slides][pdf]
2/12/2015 NO LECTURE
2/17/2015 Lecture: Clustering & Dimensionality Reduction [slides][pdf] Murphy 12.2
2/19/2015 Lecture: Latent Factor Models & Non-Negative Matrix Factorization [slides][pdf] Original Netflix Paper [link]
2/24/2015 Lecture: Embeddings [slides][pdf] Locally Linear Embedding [link]; Playlist Embedding [link]
2/26/2015 Lecture: Recent Applications of Latent Factor Models [slides][pdf]
3/03/2015 Lecture: Deep Learning [slides][pdf]
3/05/2015 Lecture: The Multi-Armed Bandit Problem [slides][pdf]
3/10/2015 Lecture: Course Review

Additional References