This course introduces undergraduate computer science students to the field of machine learning. Assuming no prior knowledge of machine learning, the course focuses on its two major paradigms: supervised and unsupervised learning. For supervised learning, we study various methods for classification and regression. For unsupervised learning, we cover dimensionality reduction and clustering. If time permits, we will learn how to extend those methods to deep architectures.
This course is aimed at 3rd- or 4th-year undergraduate students in computer science.
For non-CS students
Please contact either Romeo Kumar <kumar _at_ cs.nyu.edu> or Leeann Longi <longi _at_ cs.nyu.edu>, the student advisors at the CS Department, directly.
MATH-UA 235 Probability and Statistics
MATH-UA 234 Mathematical Statistics
DS-GA 1001 Introduction to Data Science
DS-GA 1002 Statistical and Mathematical Methods
Note that the schedule below is only a guideline. The content of each lecture will be decided as the course progresses.
Müller & Guido
Classification I: Problem setup, Logistic regression
Ch 2 (56-62)
Classification II: Overfitting, Validation, Regularization
1.4.7-1.4.8, 6.5 (6.5.1-6.5.3)
Ch 2 (26-29), Ch 5 (252-275)
Classification III: Stochastic gradient descent algorithm
Classification IV: Support vector machine and loss functions
7.1 (7.1.1-7.1.2), 6.1-6.2
Ch 2 (92-104)
Classification V: Nonlinear classification and kernel method
Classification VI: Other classifiers
Ch 2 (68-70, 70-83)
Classification VII: Ensemble methods
Ch 2 (83-92)
Regression I: Linear regression, regularization
7.2-7.3, 7.5 (7.5.1, 7.5.4)
Ch 2 (45-55)
Regression II: Regularization and prior distribution
5.1-5.3, 6.5.1, 7.5.1
Regression III: Gaussian process regression
Dimensionality Reduction I: Problem setup
Dimensionality Reduction II: Principal component analysis
12.2 (12.2.1, 12.2.3), 12.3.2
Ch 3 (140-155)
Dimensionality Reduction III: Probabilistic principal component analysis, EM algorithm
12.2 (12.2.1-12.2.2), 9.3
12.2.4-12.2.5, 11.4 (11.4.1)
Dimensionality Reduction IV: Gaussian process latent variable model
Dimensionality Reduction V: Matrix factorization, collaborative filtering
12.2.3, Ilin and Raiko (2010)
Ch 3 (156-163)
Clustering I: Problem setup and evaluation
Ch 3 (191-207)
Clustering II: k-means clustering
Ch 3 (176-181)
Clustering III: Mixture of Gaussians
11.2.1, 11.4.2
Clustering IV: Other clustering methods
Ch 3 (182-187)
Time Series I: MoG to HMM
Time Series II: PCA to Kalman Filter
There will be biweekly homework assignments, starting from the second week of the semester. Each assignment will be announced at the beginning of the Wednesday lecture every other week. Answers must be submitted by email to the grader within two weeks of the announcement; no extensions will be granted. All answers must be typeset using either LaTeX or Microsoft Word and submitted as a PDF file. Handwritten answers will not be accepted. Each assignment may include one or more programming assignments.
As a part of the course, each student is expected to read at least five research papers on one of the following topics and summarize them in a single review paper.
From perceptron, neocognitron to modern convolutional networks.
Matrix Factorization for Collaborative Filtering: from SVD, non-negative matrix factorization, probabilistic PCA to Bayesian matrix factorization
Gaussian Process Latent Variable Models: from PCA to Gaussian process latent variable models and deep Gaussian process
Unsupervised representation learning: independent component analysis, sparse coding, restricted Boltzmann machines and denoising autoencoders
From k-means algorithm, Gaussian mixture models to the infinite Gaussian mixture model
By March 1, each student must compile a list of at least ten papers to read and email it to the instructor for feedback. Based on the feedback, the student should choose at least four papers from the list and write a review paper. The review paper must put the different models under a single general framework and describe each model as a special case of that framework. In doing so, the similarities among the models will naturally emerge; the differences must be discussed separately and in detail. The final review paper must be sent by email to the instructor by May 5.
Students in this course are expected to act professionally. Please also follow the GSAS regulations on academic integrity, found at http://gsas.nyu.edu/page/academic.integrity