Research Overview
My research focuses on the interaction among the fields of signal processing, statistical inference, and machine learning. These fields share a complex relationship, with advances in one often providing insights into important outstanding problems in another. Together, they pervade nearly every scientific and applied mathematical field, serving as a key interface between mathematics, science, and engineering. At their core, they are all primarily concerned with efficient algorithms for extracting information from signals or data.
In recent years, however, all three fields have come under mounting pressure to accommodate massive amounts of increasingly high-dimensional data. Witness, for example, the explosion in the quantity of high-resolution audio, imagery, video, and other sensed data produced by relatively inexpensive mobile devices. Similar "big data" scenarios arise in application areas as diverse as medical and scientific imaging, genomic data analysis, meteorology, remote surveillance, digital communications, and large-scale network analysis. Despite extraordinary advances in computational power, such high-dimensional data continues to pose a number of challenges in signal processing, statistical inference, and machine learning. Mobile devices equipped with a range of sensors can easily acquire high-dimensional data at a rate that exceeds local storage and/or communication capacity by several orders of magnitude. In many other scientific and industrial applications (such as the Large Hadron Collider at CERN), data can be generated at rates that far outstrip our ability to store it. In such scenarios we face the challenge of extracting meaningful information from massively undersampled data. Moreover, even if we could collect all of this data, any effort to extract information must somehow overcome the "curse of dimensionality."
From a classical perspective, there is no obvious way to overcome these challenges. In many cases, however, these nominally high-dimensional signals actually obey some sort of low-dimensional model, i.e., a model with only a few degrees of freedom. Geometrically, this means that the data lies near a low-dimensional structure such as a manifold, a union of subspaces, the set of low-rank matrices, or some other natural set. This structure, when present, allows for principled approaches to handling high-dimensional data. For example, the model of sparsity lies at the heart of compressive sensing, an emerging framework for efficient sensing that is revolutionizing the data processing pipeline, starting with the acquisition of the data itself.
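As a minimal illustration of how a low-dimensional model can make recovery from undersampled data possible, the following Python sketch recovers a sparse signal of length 256 from only 80 random linear measurements. It uses orthogonal matching pursuit, chosen here only because it is a simple, well-known greedy method; it is not meant to represent the specific algorithms developed in this research. The function name omp, the problem dimensions, the noise-free setting, and the Gaussian measurement matrix are all assumptions made for the example.

```python
import numpy as np


def omp(A, y, k):
    """Orthogonal matching pursuit (illustrative sketch).

    Greedily selects k columns of A and fits y on them by least squares.
    """
    n = A.shape[1]
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat


# Hypothetical problem sizes: ambient dimension n, measurements m, sparsity k.
rng = np.random.default_rng(0)
n, m, k = 256, 80, 4

# A k-sparse signal whose nonzero entries have magnitude between 1 and 2.
x = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
x[idx] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)

# Undersampled, noise-free measurements with a random Gaussian matrix.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x

x_hat = omp(A, y, k)
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"measurements: {m} of {n}, relative error: {rel_err:.2e}")
```

Without the sparsity assumption, the system y = Ax has far fewer equations than unknowns and infinitely many solutions; restricting attention to signals with only a few nonzero entries is what makes the problem well posed in this sketch.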
To address the challenges posed by high-dimensional data, my research focuses largely on the use of low-dimensional models such as sparsity, low-rank structure, and manifold or parametric models to perform practical signal processing and machine learning tasks. Such models can yield elegant solutions in a wide range of contexts.