NESTA

A Fast and Accurate First-order Method for Sparse Recovery

Introduction

NESTA is a fast and robust first-order method than solves basis-pursuit problems and a large number of extensions (including tv-denoising). It is described in the paper NESTA: A Fast and Accurate First-order Method for Sparse Recovery (technical report, April 2009, California Institute of Technology) by Jérôme Bobin, Stephen Becker, and Emmanuel Candès. A journal version appears in SIAM J Imaging Sciences 4(1) 2011.

The algorithm uses two ideas due to Yurii Nesterov. The first idea is an accelerated convergence scheme for first-order methods, giving the optimal convergence rate for this class of problems. The second idea is a smoothing technique that replaces the non-smooth l1 norm with a smooth version.

The basic algorithm solves the following equation, often known as basis pursuit denoising, or simply as l1-minimization:

The parameter epsilon is typically small and proportional to an estimate of the standard deviation of any noise in the measurements. If epsilon is 0, the problem is known as just basis pursuit. Basis pursuit and basis pursuit denoising are often used in statistics, image-processing, and compressed sensing.

NESTA is capable of solving many other problems; some popular extensions are discussed on this website.

The current version of code for NESTA requires that the measurement matrix A is an orthogonal projection. It is sometimes possible to extend NESTA to allow general matrices A; a few special cases have been implemented in the code.

Analysis

For real-world signals, the signal is not sparse, but it may be sparse (or be approximately-sparse, i.e. most energy is concentrated in only a few coefficients) in a dictionary W. For example, W may be a discrete cosine basis, a Gabor frame, a curvelet frame, a wavelet basis, etc. In this case, it may be desirable to solve the analysis-based problem:

An alternative is the synthesis-based problem:

Unless the dictionary is a basis, the two problems are not equivalent. Traditionally, the synthesis-based approach has been used because it requires no modification to basis pursuit solvers: simply treat (AW) as a single operator.

NESTA is one of few algorithms that can solve the analysis problem (in addition to the synthesis problem). The code accepts W as in input.

Note that this allows NESTA to solve the reweighted l1 problem; for this case, W is a diagonal matrix. Demos showing the use of analysis and reweighting are included in the code.

Total-Variation Minimization

It is possible to use NESTA to solve the total-variation (TV) minimization problem, which is often used to recover images from noisy and/or undersampled data.

Briefly, the TV norm is given as

where

NESTA can solve the following TV-minimization problem:

A demo showing how to solve the TV minimization problem is included in the code.

Parameters

One of the most attractive features of NESTA is that it relies on very few parameters. The most important parameter is mu, which is explained below.

NESTA solves a smoothed version of the l1-norm. Instead of solving equation

where the primal feasible set is

NESTA solves instead

where f_mu is the smoothed version of the l1 norm. When mu is zero, f_mu is identically the l1 norm. For high accuracy, mu should be set small. For total-variation minimization, the setup is analogous, and mu should be small for high accuracy, or large for faster performance.

When mu is large, the algorithm converges faster. Because of this, there is a continuation scheme which will initially solve the problem with large mu, and use the solution to this problem as the initial starting point for the algorithm. The number of continuation steps, T, may be varied. Typical choices are about 4 or 5. Setting T to 1 (i.e. no continuation) may be beneficial for problems that are very sparse, but in general T should be larger than 1. Setting T much larger than about 7 will probably hurt performance most of the time.

The stopping criteria is controlled by a single parameter delta. For high accuracy, delta should be small. If mu is small, then delta should be small, but if mu is large, then delta can be larger as well.

The parameter epsilon is an estimate based on the noise level. In the paper, the following values were used for these parameters:

where sigma is the standard deviation of the noise and m is the number of measurements.

Tricks for the code

In many problems, the measurement matrix A is a sub-sampled orthogonal operator, which can be written as

where R is a subsampling operator. In this case, it is beneficial to make the following change of variables: equation

and then solve the following problem

This is an analysis problem, where the analysis operator W* is just U*, and the measurement matrix is just R; after solving the problem, the solution may be recovered by undoing the change of variables. The advantage of this approach is that U and U* will each be applied exactly once per iteration; applying R is extremely rapid and typically applying U and U* is the dominant cost of the whole algorithm. Without the change of variables, U or U* will be applied six times per iteration. If U is an fft, then the cost per iteration is 2 fft's with the change of variables, compared to 6 fft's without the change of variables.

The code contains an example of the change of variables trick.

Download the code

UPDATE: NESTA v.1.1 is released (188 kB zip file, Nov 24 2009). There are no algorithmic changes, but there are some bug fixes, more demos, and more support for complex numbers and the case when A is not a projector. You may view the important changes in the change log.

Download NESTA v.1.0 (100 kB zip file, released June 3 2009, updated October 9 2009), which contains NESTA implemented in MATLAB. The code was written primarily by J. Bobin. Also included is a version of NESTA that is set to handle the Lagrangian form of the basis pursuit problem (aka the unconstrained version, denoted by "UP" in the code).

The code can accept the linear operators A and W as either matrices or as function handles.

There are many demo scripts that demonstrate how to use NESTA.

The version of NESTA used in the paper was slightly less user-friendly. To download that version, and all the scripts needed to recreate the data that was presented in the paper, you may download the NESTA experiments zip file, which was written by J. Bobin and S. Becker. The zip file also contains stripped-down and modified versions of the other solvers that were used in the comparisons. It must be emphasized that these solvers were downloaded in early spring 2009 -- please do not use these versions for your own work! These other software packages are updated frequently, so please download the most recent verions. Links are provided below:

RecPF: Reconstruction from Partial Fourier data. The paper used version 1.1 (Feb 4 2009); as of June 2011, version 2.2 (Nov 23, 2010) is available.
SpaRSA: Sparse Reconstruction by Separable Approximation. The paper used the version from Jan 20, 2009.
GPSR: Gradient Projection for Sparse Reconsruction. The paper used version 5.0 from December 2007, with a few parameter modifications suggested by the authors. These modifications will likely be made a part of the package soon. The website hosts version 6.0 as of Jan 19, 2009.
FPC: Fixed-Point Continuation. The paper used vesion 2.0 (June 2008).
FPC_AS: Fixed-Point Continuation and Active Set. The paper used version 1.0 (Sept 2008); version 1.21 is available as of July 2010.
Bregman Iterative Algorithms. The paper used version 1.0 (Oct 2007), in conjunction with the recent FPC code for the subproblem solvers (this requires some modification). The website says that version 2.0 will be released in the future.
SPGL1. The paper used version 1.6 (Aug 2008). The current version is 1.7 (May 20, 2009).

Download the paper

NESTA: A Fast and Accurate First-order Method for Sparse Recovery (PDF file, 1.7 MB), technical report, April 2009, California Institute of Technology

Please note there is a typo in equation (3.7). In the equation for lambda, there should be a "L" in both terms of the expression in the second part of the max function.

Contact the authors

For questions/comments about the code, please contact Jérôme Bobin or Stephen Becker; for questions/comments about the website, please contact Stephen.

Last updated June 16, 2011
page by Stephen Becker