Projects


Below are several of my larger projects - for a more complete summary of my work, please see my publications page.

Robust PCA by Convex Optimization

 
Classical PCA is a beautiful algorithm, with applications from vision and signal processing to bioinformatics and search and ranking. Unfortunately, while PCA is optimal for recovering a low-rank matrix under small Gaussian noise, it breaks down under highly-corrupted or outlying observations.

Fortunately, for most cases, the problem of recovering such a low-rank matrix A from highly-corrupted observations D = A + E can be efficiently and exactly solved by convex optimization:

A natural semidefinite programming relaxation perfectly recovers almost all d x d matrices A of rank O(d/log d) from almost all errors E affecting cd2 of the observations.


Face Recognition by Sparse Representation: Theory and Practice


 
Automatic face recognition is a classical problem in computer vision. Despite much progress over the last 30 years, making recognition systems work in the real world is still a surprisingly difficult challenge.

This project shows how face recognition under varying illumination, occlusion and alignment error can be effectively cast as a sparse representation problem, which can then be solved using tools from convex programming.

In turn, peculiarities of face images reveal surprising phenomena in the theory of sparse representation, including the correction of large fractions of errors and the success of l1 minimization even in highly coherent dictionaries that violate classical sufficient conditions such as the restricted isometry property.

Read more:
    Robust Face Recognition via Sparse Representation
    Towards a Practical Automatic Face Recognition System: Robust Illumination and Alignment by Sparse Representation
    Dense Error Correction via L1 Minimization
    Project webpage - Face Recognition by Sparse Representation

Clustering and Classification by Lossy Data Compression

 
This project investigates the use of lossy data coding and compression as a tool for clustering and classifying multivariate data. Compression provides a natural answer to the model selection problem, and also has nice regularization effects in classification.

Algorithms built around these principles perform very effectively on real-world tasks involving mixture distributions with components of varying dimension. One important example of this is natural image segmentation, where the allowable distortion in computing the coding length provides a natural measure of scale. 

Read more:
    Segmentation of Multivariate Mixed Data via Lossy Coding and Compression
    Classification via Minimum Incremental Coding Length (MICL)
    Unsupervised Segmentation of Natural Images via Lossy Data Compression
    Project webpage - Clustering and Classification by Lossy Compression