Goals
The goal of this course is to provide strong foundations for students interested in the manipulation of data, broadly defined. In particular, this course is highly recommended for students who are interested in machine learning, algorithms, data-mining, mathematical finance, and microeconomics.
Prerequisites
This will be a mathematically rigorous course. Basic knowledge in linear algebra and competency with calculus are required (e.g. Math 23a or Math 25a and Math 25b) as well as basic probability (Stat 110). AM 121 is certainly helpful but is not a necessary prerequisite. An appreciation for aesthetics, as well as prior coursework in algorithms, machine learning, and statistics will be helpful but not necessary. There will be some simple programming exercises (CS 50 and comfort with programming in a scripting language suffice). The course is intended for graduate students, but advanced undergraduates are encouraged to attend as well.
Assessment
Half of the course grade will be determined by the performance on the problem sets, and half determined by the final project. There will be weekly problem sets, and the performance on the problem sets will be the average score of the best ten problem sets, submitted in compliance with the submission policy. The project can be data-oriented or theoretically-focused, or (better) a combination of both. There is room for projects involving algorithms, data mining, machine learning, game theory and mechanism design, and statistics.
Course Outline
Prelude
- Optimization as a foundation for data science
- Convex sets, convex functions
- Projection, separating hyperplanes, polyhedral sets
Linear Optimization
- Farkas' lemma
- Duality theory
- The Simplex method
- Ellipsoid, and interior point methods
Convex Optimization
- Gradient descent
- Newton's method and its variants
- Karush-Kuhn-Tucker conditions
- Lagrangian duality
- Semi-definite programming
Combinatorial Optimization
- Computational complexity
- Approximation algorithms
- Submodular functions
- Local search and greedy algorithms
- Dynamic Programming
- Matroids, multilinear extensions, convex and concave closures
- Rounding techniques
- Continuous approximation algorithms
Resources
The class is self contained, and will not follow a specific book. The material of the course is largely covered in the following textbooks:
- Convex Optimization, by Boyd and Vandenberghe
- Introduction to Linear Optimization, by Bertsimas and Tsitsiklis
- The Design of Approximation Algorithms, by Williamson and Shmoys
- Approximation Algorithms, by Vazirani
- Integer and Combinatorial Optimization, by Nemhauser and Wolsey
- The Elements of Statistical Learning, by Hastie, Tibshirani and Friedman