Tuesdays & Thursdays, 10:50am - 12:05pm
Location: DH 1042
Piazza Course Webpage
Office Hours: Thursday 5:15 - 6:30pm
Office Hours: Tuesdays 12-1pm
Office Hours: Mondays 3-4pm
Recitation: Wednesdays 5-6pm, DH 1046
Elements of Statistical Learning by Hastie, Tibshirani & Friedman.
Introduction to Statistical Learning by James, Witten, Tibshirani & Hastie.
Statistical Learning with Sparsity by Hastie, Tibshirani & Wainwright.
Statistics for High-Dimensional Data by Buhlmann & van de Geer.
Convex Optimization by Boyd & Vandenberghe.
Midterm Exam 20%
Final Project / Competition 40%
Class Participation 5%
A hard copy of each homework is due in class, or by 12pm to the TA in Duncan Hall. All homeworks must be typeset using LaTeX, and no late homeworks will be accepted. Homeworks may be discussed with classmates but must be written and submitted individually.
There will be an in-class midterm exam (open book / open notes) on November 2.
Final Project & Contest:
The final project will be a data analysis contest. The competition will begin on September 19 and may be completed in teams of two people. Competition Description: [pdf]
September 8: Add Deadline
October 6: Drop Deadline
Announcements, Assignments & Lectures:
Course Description:
This course is an advanced survey of statistical machine learning theory and methods. Emphasis will be placed on methodological, theoretical, and computational aspects of tools such as regularized regression, classification, kernels, dimension reduction, clustering, graphical models, trees, and ensemble learning. Students will learn how and when to apply statistical learning techniques, their comparative strengths and weaknesses, their mathematical and statistical properties, how to compute each method, and how to critically evaluate the performance of learning algorithms. Students completing this course should be able to (i) apply sophisticated statistical learning methods to build predictive models or perform exploratory analysis, (ii) evaluate methods from a mathematical, statistical, and computational perspective, and (iii) properly validate statistical learning models and interpret their results.
Tentative Lecture Schedule:
August 22: Intro & MSE / Least Squares
August 24: Ridge Regression
September 5: Sparse Regression I
September 7: Sparse Regression II
September 12: High-Dimensional Theory I
September 14: High-Dimensional Theory II
September 19: GLMs & Regularized GLMs
September 21 & Bayes Classifiers
September 26: LDA
September 28: SVMs I
October 3: SVMs II
October 5: Non-Linear I
October 12: Non-Linear II
October 17: Model Validation I
October 19: Model Validation II
October 24: Dimension Reduction I
October 26: Dimension Reduction & Clustering
October 31: Clustering II
November 2: Graphical Models
November 7: In-Class Midterm Exam
November 9: Trees & Bagging
November 14: Random Forests
November 16: Boosting
November 21: Boosting & Ensemble Learning
November 28: Competition Presentations
November 30: Competition Presentations