
Fall 2015 DS-GA 3001 <Natural Language Understanding with Distributed Representations>


How should natural languages be understood and analyzed? In this course, we will examine some of the modern computational approaches, mainly using deep learning, to understanding, processing and using natural languages. Unlike conventional approaches to language understanding, we will focus on how to represent and manipulate linguistic symbols in a continuous space.

Target Audience

The course is mainly intended for master's- and doctorate-level students in computer science and data science. The number of seats is limited, and priority is given to students enrolled in the master’s programme at the Center for Data Science and to those in the Ph.D. programme of the Department of Computer Science, Courant Institute of Mathematical Sciences.

General Information

  • Lecture: 5.10pm - 7.00pm on Monday at Warren Weaver Hall 202

  • Laboratory: 5.10pm - 6.00pm on Wednesday at BOBS LL139

    • (First four lab sessions are mandatory for every student)

  • Instructor: Kyunghyun Cho

  • Teaching Assistants

    • Sebastien Jean

    • Kelvin Xu

  • Office Hours

    • Instructor: 15.30-16.30 on Monday (location: 726 Broadway Rm 784)

    • TA: 18.10-19.00 on Wednesday (location: BOBS LL139)

  • Grading (tentative): Prerequisite Knowledge Test (5%) + Lab Assignments (30%) + Final Project (50%) + In-Class Exam (15%)

  • Course Site: NYU Classes will be used extensively for the following purposes:

    • Distribution of lecture notes and slides

    • Lab assignments: reports are to be uploaded via NYU Classes.

    • Final project: both proposal and report will be received via NYU Classes.
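
As an illustration of how the tentative grading weights listed above combine into a single score, here is a minimal Python sketch. The component names, the 0-100 scale and the example scores are assumptions made for illustration; this is not an official grade calculation.

```python
# Tentative weights taken from the grading line above (they sum to 100%).
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "prerequisite_test": 0.05,
    "lab_assignments": 0.30,
    "final_project": 0.50,
    "in_class_exam": 0.15,
}


def final_score(scores):
    """Return the weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, only to show how the function is called.
example = {
    "prerequisite_test": 80,
    "lab_assignments": 90,
    "final_project": 85,
    "in_class_exam": 70,
}
print(round(final_score(example), 2))
```

Because the final project carries half of the weight, a change in that component moves the total more than the same change in any other component.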


A student is expected to be familiar with the following topics:

  • Undergraduate level Probability and Statistics

  • Undergraduate level Linear Algebra

  • Undergraduate level Calculus

  • Machine Learning: DS-GA-1003

A student is encouraged to try the following languages/frameworks in advance:

A student is expected to have taken the following courses before taking this course:

  • DS-GA-1002: Statistical and Mathematical Methods

  • DS-GA-1003: Machine Learning and Computational Statistics

This course is complementary to

Schedule (Draft)




Reading List

14 Sep


Guest Lecture by Felix Hill (Cambridge)

Machine Learning: Basic Concepts

(Cho, mandatory)

- Ch. 1 of <Foundations of Statistical Natural Language Processing> by Manning and Schuetze. 1999 (2001). (accessible from NYU)

- Sec. 1.1.1 and 1.1.2 of <Procedures as a Representation for Data in a Computer Program for Understanding Natural Language> by Terry Winograd. 1971.

- <Aetherial Symbols> by Geoff Hinton

- <A Review of B. F. Skinner's Verbal Behavior> by Noam Chomsky. 1967.

- Rumelhart, David E., James L. McClelland, and PDP Research Group. Parallel distributed processing. Vol. 1. IEEE, 1988.

21 Sep

Neural Networks

Theano Tutorial:

Classifying MNIST digits using Logistic Regression

Multilayer Perceptron

Lab Assignment 1

(Jean, mandatory)

1. Video lectures by Hugo Larochelle: 1.1 - 2.11

28 Sep

5 Oct

Recurrent neural networks: Basic, Time series and its modelling

Theano Tutorial: LSTM Networks for Sentiment Analysis

Lab Assignment 2

(Xu, mandatory)

13 Oct

19 Oct

Q&A Session by TAs

26 Oct

Language Modeling & Continuous Space Representation

1. LSTM Language Model with Theano

2. n-gram Language Model with KenLM

Lab Assignment 3

(Jean and Xu, mandatory)

1. <From Sequence Modeling to Translation> by Kyunghyun Cho

2. <From language modeling to machine translation> by Blunsom at DLSS@Montreal 2015

3. <A neural probabilistic language model> by Bengio et al.

4. <Three New Graphical Models for Statistical Language Modelling> by Mnih and Hinton (2007)

2 Nov

1. <Aetherial Symbols> by Geoff Hinton

2. <Deep Consequences-Why Neural Networks are Good for (Language) Science> by Felix Hill

3. <The Lighthill Debate (1973)>

4. “Every time I fire a linguist, the performance of the recognizer goes up” by Fred Jelinek, 1998

5. Warren Weaver memorandum, July 1949

9 Nov

Neural machine translation

16 Nov


(Jean and Xu)

1. Sec. 18.8 of <An introduction to machine translation> by W. John Hutchins and Harold L. Somers

2. Warren Weaver memorandum, July 1949

3. Introduction to Neural Machine Translation with GPUs (Parts 1, 2 and 3) by Kyunghyun Cho

4. The Bandwagon by Claude Shannon.

23 Nov

Beyond Sentences/Languages: Multimodal, Multitask Learning


(Jean and Xu)

1. Describing Multimedia Content using Attention-based Encoder-Decoder Networks by Cho, Courville and Bengio

2. Teaching Machines to Read and Comprehend by Hermann et al.

3. Memory Networks by Weston et al.

4. End-to-End Memory Networks by Sukhbaatar et al.

30 Nov

Guest Lecture by Ryan Kiros (University of Toronto)

Guest Lecture by Antoine Bordes (Facebook)


(Jean and Xu)

1. Skip-Thought Vectors by Kiros et al.

2. Large-scale Simple Question Answering with Memory Networks by Bordes et al.

7 Dec

Break (NIPS 2015)

14 Dec

Final Exam

Deadline for Assignment

Lab Assignments

First of all, it is mandatory to attend the first eight lab sessions. Missing any of these sessions will result in a lower grade/score.

There will be three lab assignments during these eight lab sessions:

  1. Multilayer Perceptron for Object Recognition

    1. TA in charge: Sebastien Jean

    2. Deadline: 30 September

  2. Recurrent neural networks for sequence classification

    1. TA in charge: Kelvin Xu

    2. Deadline: 13 October

  3. Recurrent neural network language model

    1. TA in charge: Kelvin Xu

    2. Deadline: 4 November

For each lab assignment, a student is expected to hand in a short report outlining the model, its implementation and experimental results (up to 3 pages long) and to present the working code to the TA in charge during the lab session. Note that office hours are not meant for assisting students with these assignments.

Final Project

In this course, a student is expected to conduct a research project related to the topics presented during the lectures. The topic of each research project is to be agreed upon with the lecturer and teaching assistants based on the topic proposal submitted by the student. The deadline for the topic proposal is 26 October, and the proposal should be up to 3 pages long and describe the topic, method and experimental procedures. Once the proposal has been submitted, the student will receive a confirmation and feedback by email from the lecturer and/or teaching assistants within two weeks.

The final report is due at the last lecture (14 December). The final report should describe the task, models, experiments and conclusion and be up to 6 pages long (more specific instructions on the format will be announced later). The deadline will not be extended.

Some of the candidate topics include, but are definitely not limited to,

  1. Language modeling

  2. Distributed representation learning for natural languages

  3. Machine translation

  4. Image/Video Description Generation

Students are encouraged to find recent literature on one of these topics and prepare to discuss it with the lecturer and/or teaching assistants in order to narrow down a specific topic. Students are encouraged and expected to use the lab sessions to ask questions about practical issues in implementing these models and running experiments. At each lab session, one of the teaching assistants or the lecturer will be present.


A student in this course is expected to act professionally. Please also follow the GSAS regulations on academic integrity found here