This course will cover mathematically rigorous models and algorithms for handling large, high-dimensional datasets, where we can afford only linear or even sublinear time and space. Examples of topics that will be covered:

(1) Dimensionality reduction. General techniques and impossibility results for reducing data dimensionality while still preserving geometric structure.
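A canonical example of this topic is random projection in the spirit of the Johnson-Lindenstrauss lemma: a random linear map to a much lower dimension approximately preserves pairwise distances. A minimal illustrative sketch (the dimensions below are chosen for readability, not from the lemma's actual constants):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 1000, 200          # n points in R^d, projected down to R^k

X = rng.standard_normal((n, d))              # data points as rows
G = rng.standard_normal((d, k)) / np.sqrt(k) # scaled Gaussian projection
Y = X @ G                                    # projected points

# A typical pairwise distance should be roughly preserved.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
ratio = proj / orig              # close to 1 when k is large enough
```

The point of the lemma is that k can be as small as O(log n / eps^2), independent of the original dimension d.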

(2) Numerical linear algebra. Algorithms for big matrices (e.g., a user/product rating matrix for Netflix or Amazon): regression, low-rank approximation, matrix completion.
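As a small illustration of low-rank approximation (the exact method, not the sublinear-resource algorithms this course develops): by the Eckart-Young theorem, the best rank-k approximation in Frobenius norm comes from the truncated SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
# A noisy rating-style matrix that is approximately rank 3.
A = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 80))
A += 0.01 * rng.standard_normal(A.shape)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 3
A_k = U[:, :k] * s[:k] @ Vt[:k]  # truncated SVD: best rank-k approximation

rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)  # small, since A is near rank 3
```

A full SVD costs far too much for truly big matrices, which is exactly why randomized and sketching-based alternatives are interesting.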

(3) Compressed sensing. Recovery of (approximately) sparse signals from few linear measurements.

(4) Sketching and streaming. Using small space to build data structures that summarize a "stream" of data that we can only scan sequentially once or a few times, while preserving important information from that data.
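As a taste of the streaming model, the Misra-Gries summary finds every item occurring more than n/k times in a stream of length n, in one pass, using only k-1 counters (a sketch for illustration; reported counts may be undercounted by up to n/k):

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k-1 counters.

    Every item with frequency > len(stream)/k is guaranteed to
    survive in the returned dict, though its count may be low.
    """
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Decrement every counter; drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a"] * 60 + ["b"] * 25 + ["c"] * 10 + ["d"] * 5
heavy = misra_gries(stream, k=3)  # any item with frequency > 100/3 must appear
```

Note the space usage depends only on k, not on the stream length or the number of distinct items, which is the defining constraint of the streaming model.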

Grading:

Homework: 40%

Final project: 40%

Scribing a class: 20%
