Course Description

The application of neural network methods, under the name Deep Learning, has led to breakthroughs in a wide range of fields, including language technologies (e.g. search, translation, text input prediction). This course provides a hands-on introduction to the use of deep learning methods for processing natural language. Methods covered include static word embeddings, feed-forward networks for text, recurrent neural networks, transformers, and pre-training and transfer learning, with applications including sentiment analysis, translation, generation, and testing linguistic theory.

Days                  Time             Location
Monday and Wednesday  10:25-11:40 AM   Hylan 307

Teaching Staff

Role        Name         Office          Office Hours
Instructor  C.M. Downey  Lattimore 507   Wednesdays 2-4pm

Recommended Textbooks

While relevant readings are posted in the schedule below, the following are very good general resources. The bracketed names are used to refer to these works in the schedule.

  • [JM] Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd edition draft)
  • [YG] Yoav Goldberg, Neural Network Methods for Natural Language Processing

Prerequisites

  • Programming in Python
  • Linux/Unix commands
  • Calculus 1

Course Resources

  • More information coming soon

Policies

Homework

Students will complete 8 homeworks, comprising both written and (Python) programming assignments. Unless noted otherwise on the schedule, homeworks will be released on Wednesdays and due at 11pm on the following Wednesday. All homework will be submitted via Blackboard.

All deadlines and meeting times for this class are in Eastern Time. Please note that on Sunday, November 3, this will change from Eastern Daylight Time (EDT/UTC-4) to Eastern Standard Time (EST/UTC-5).
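For concreteness, here is a minimal Python sketch (standard library only; the year 2024 is inferred from the November 3 time change) showing how the same 11pm Eastern deadline corresponds to different UTC offsets before and after the switch:

    from datetime import datetime
    from zoneinfo import ZoneInfo  # Python 3.9+

    EASTERN = ZoneInfo("America/New_York")

    # An 11pm deadline before the switch falls in EDT (UTC-4)...
    before = datetime(2024, 10, 30, 23, 0, tzinfo=EASTERN)
    # ...while the same wall-clock deadline after the switch falls in EST (UTC-5).
    after = datetime(2024, 11, 6, 23, 0, tzinfo=EASTERN)

    print(before.isoformat())  # 2024-10-30T23:00:00-04:00
    print(after.isoformat())   # 2024-11-06T23:00:00-05:00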

Late work

All work should be submitted by 11:00pm the day it is due. Work that is received late will incur the following penalties:

  • Up to 1 hour late: 5%
  • Up to 24 hours late: 10%
  • Up to 48 hours late: 20%
  • Later than 48 hours: not graded (0 for the assignment)
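To illustrate, the schedule above can be read as a piecewise function of hours late. This is only a sketch, not an official grade calculator; the function name and the reading of each tier as inclusive of its upper bound are my own:

    def late_penalty(hours_late: float) -> float:
        """Return the fractional deduction for a late submission.

        Illustrative sketch of the schedule above; each tier is read as
        inclusive of its upper bound ("up to" N hours).
        """
        if hours_late <= 0:
            return 0.0   # on time
        elif hours_late <= 1:
            return 0.05  # up to 1 hour late: 5%
        elif hours_late <= 24:
            return 0.10  # up to 24 hours late: 10%
        elif hours_late <= 48:
            return 0.20  # up to 48 hours late: 20%
        else:
            return 1.0   # later than 48 hours: not graded (0)

    # Example: a submission 3 hours late loses 10% of its score.
    assert late_penalty(3) == 0.10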

Extensions (without penalty) may be offered if they are requested within a reasonable amount of time (relative to the reason for the extension) before the work is due. Please don't hesitate to ask for an extension if you need one.

Special topic presentations

The latter portion of the course will focus on examples of Deep Learning applied to linguistics and linguistic theory. Students will pick a scholarly paper featuring such an application and present the work in class, including leading a discussion. Depending on course enrollment, this may be completed individually or in small groups.

Final grading

  • 80%: Homework assignments
  • 15%: Special topic presentation / discussion
  • 5%: Participation / attendance
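To make the weighting concrete, here is a minimal sketch with hypothetical component scores (the numbers are invented for illustration; each component is on a 0-100 scale):

    # Final-grade weights from the breakdown above.
    WEIGHTS = {"homework": 0.80, "presentation": 0.15, "participation": 0.05}

    # Hypothetical component averages (illustration only).
    scores = {"homework": 90.0, "presentation": 85.0, "participation": 100.0}

    final = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    print(final)  # 0.80*90 + 0.15*85 + 0.05*100 = 89.75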

Exceptions

Students will not be penalized for important civic, ethnic, family, or religious obligations, or university service. Whenever feasible, you will have the chance to make up, within a reasonable time, any assignment missed for these reasons. Absences for these reasons will count as excused for the sake of the participation grade. It is your responsibility, however, to inform me of any expected missed work as far in advance as possible.

Academic honesty

All assignments and activities associated with this course must be performed in accordance with the University of Rochester's Academic Honesty Policy. More information is available here. Please note: the use of Generative AI to produce any part of the written or programming assignments is not allowed. Given the subject matter of the course, I will make one exception: if you implement and train the model yourself (i.e. no use of pre-trained weights or API calls to pre-existing models), you may use it, provided you turn in the implementation with the assignment you used it on. For the sake of your time, I do not recommend this option.

Schedule


Aug 26   Introduction / Overview; History
Aug 28   Linear Algebra
         Readings: Essence of Linear Algebra, Ch. 1-8
Sep 2    Labor Day: no class
Sep 4    Word vectors; Gradient descent
         Readings: JM 5.4-5.6, 6; YG 2
         Events: hw1 released [pdf, tex], due Sep 11
Sep 9    Word2Vec
         Readings: JM 6.8-6.12
Sep 11   Computation graphs; Backpropagation
         Readings: JM 7.5.3-7.5.5; YG 5.1.1-5.1.2; Calculus on computational graphs; CS 231n notes; Yes, you should understand backprop
Sep 16   GitHub Classroom and Codespaces (Demo)
         Events: hw2 released [pdf, tex], due Sep 23
Sep 18   Neural Networks (edugrad library)
         Readings: JM 7.1-7.4; YG 4
Sep 23   Feed-forward networks for LM and classification
         Readings: JM 7.5; YG 9; A Neural Probabilistic Language Model (Bengio et al. 2003); Deep Unordered Composition Rivals Syntactic Methods for Text Classification (Iyyer et al. 2015)
         Events: hw3 released [pdf, tex], due Sep 30
Sep 25   Recurrent Neural Networks
         Readings: JM 9.1-9.5; The Unreasonable Effectiveness of Recurrent Neural Networks
Sep 30   Vanishing gradients; RNN variants
         Readings: JM 9.6; YG 15; Understanding LSTMs; Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation; On the difficulty of training recurrent neural networks
         Events: hw4 released [pdf, tex], due Oct 7
Oct 2    Sequence-to-sequence; Attention
         Readings: JM 10; Sequence to Sequence Learning with Neural Networks (original seq2seq paper); Neural Machine Translation by Jointly Learning to Align and Translate (original seq2seq + attention paper)
Oct 7    Transformers 1
         Readings: JM 9.7-9.9; Attention Is All You Need (original Transformer paper); The Annotated Transformer; The Illustrated Transformer
Oct 9    Transformers 2
         Readings: same as Oct 7
Oct 14   Fall Break: no class
Oct 16   Pre-training / fine-tuning paradigm
         Readings: JM 11; Contextual Word Representations: Putting Words into Computers; The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
         Events: hw5 released [pdf, tex], due Oct 23
Oct 21   Pre-training / fine-tuning paradigm (cont.)
         Readings: same as Oct 16
Oct 23   Text tokenization in language models
Oct 28   Interpretability and analysis
         Readings: Analysis Methods in Natural Language Processing; A Primer in BERTology
Oct 30   Multilingual language models
         Readings: Cross-Lingual Language Model Pretraining
         Optional / peruse if interested: Are All Languages Created Equal in Multilingual BERT?; Emerging Cross-lingual Structure in Pretrained Language Models; On the Cross-lingual Transferability of Monolingual Representations; Word Translation Without Parallel Data; Bilingual alignment transfers to multilingual alignment for unsupervised parallel text mining
         Events: hw6 released [pdf, tex], due Nov 6
Nov 4    "Large Language Models" (LLMs)
Nov 6    Questions of LLM hype and dangers
         Events: hw7 released [pdf, tex], due Nov 13
Nov 11   Instructor at conference: no class
Nov 13   Instructor at conference: no class
Nov 18   Class cancelled (illness)
         Events: Paper presentation assigned; hw8 released [pdf, tex], due Nov 25
Nov 20   EMNLP Conference Highlights
Nov 25   TBA
Nov 27   Thanksgiving Break: no class
Dec 2    Presentations 1 (TBA)
Dec 4    Presentations 2 (TBA)
Dec 9    Overflow / Summary / Review