|
|
Emmanuel Candes
Cal Tech
Monday, March 02
03:30 PM - 04:30 PM
|
Exact Matrix Completion via Convex Optimization: Theory and Algorithms
This talk considers a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. In partially filled out surveys, for instance, we would like to infer the many missing entries. In the area of recommender systems, users submit ratings on a subset of entries in a database, and the vendor provides recommendations based on the user's preferences. Because users only rate a few items, we would like to infer their preference for unrated items (this is the famous Netflix problem). Formally, suppose that we observe m entries selected uniformly at random from a matrix. Can we complete the matrix and recover the entries that we have not seen?
We show that perhaps surprisingly, one can recover low-rank matrices exactly from what appear to be highly incomplete sets of sampled entries; that is, from a comparably small number of entries. Further, perfect recovery is possible by solving a simple convex optimization program, namely, a convenient semidefinite program (SDP). This result hinges on powerful techniques in probability theory. We will also present a very efficient algorithm based on iterative singular value thresholding, which can complete matrices with about a billion entries in a matter of minutes on a personal computer.
|
|
|