This course will cover mathematically rigorous models for handling large, high-dimensional datasets for which we can only afford linear or even sub-linear time and space resources. Examples of topics that will be covered:
(1) Dimensionality reduction. General techniques and impossibility results for reducing data dimensionality while still preserving geometric structure
(2) Numerical linear algebra. Algorithms for big matrices (e.g. a user/product rating matrix for Netflix or Amazon): regression, low-rank approximation, matrix completion.
(3) Compressed sensing. Recovery of (approximately) sparse signals from few linear measurements.
(4) Sketching and streaming. Using small space to build data structures that describe a "stream" of data that can only be scanned sequentially once or a few times, while preserving important information from that data.
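As a small taste of the dimensionality-reduction topic, the sketch below (illustrative only, not course material) applies a Johnson–Lindenstrauss-style random projection with NumPy and checks that a pairwise distance is roughly preserved; the dimensions and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 points in 10,000 dimensions (arbitrary illustrative sizes).
n, d = 1000, 10_000
X = rng.standard_normal((n, d))

# Random Gaussian projection down to k dimensions, scaled by 1/sqrt(k)
# so that Euclidean norms are preserved in expectation.
k = 500
P = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ P.T

# A sample pairwise distance should be roughly preserved.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
ratio = proj / orig
print(f"original: {orig:.2f}  projected: {proj:.2f}  ratio: {ratio:.3f}")
```

The ratio concentrates near 1 as k grows; the course studies exactly how small k can be while keeping all pairwise distances within a 1 ± ε factor.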
Grading:
Homework: 40%
Final project: 40%
Scribing a class: 20%