Approximate Message Passing with Nearest Neighbor Sparsity Pattern Learning
Xiangming Meng, Sheng Wu, Linling Kuang, Defeng (David) Huang, and Jianhua Lu, Fellow, IEEE
Abstract—We consider the problem of recovering clustered sparse signals with no prior knowledge of the sparsity pattern. Beyond simple sparsity, signals of interest often exhibit an underlying sparsity pattern which, if leveraged, can improve the reconstruction performance. However, the sparsity pattern is usually unknown a priori. Inspired by the idea of the k-nearest neighbor (k-NN) algorithm, we propose an efficient algorithm termed approximate message passing with nearest neighbor sparsity pattern learning (AMP-NNSPL), which learns the sparsity pattern adaptively. AMP-NNSPL specifies a flexible spike and slab prior on the unknown signal and, after each AMP iteration, sets the sparse ratios to the average of their nearest neighbor estimates via expectation maximization (EM). Experimental results on both synthetic and real data demonstrate the superiority of the proposed algorithm in terms of both reconstruction performance and computational complexity.

Index Terms—Compressed sensing, structured sparsity, approximate message passing, k-nearest neighbor.

This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 91338101, 91438206, and 61231011) and the National Basic Research Program of China (Grant No. 2013CB329001).
X. Meng and J. Lu are with the Department of Electronic Engineering, Tsinghua University, Beijing, China (e-mail: [email protected]; [email protected]).
S. Wu and L. Kuang are with the Tsinghua Space Center, Tsinghua University, Beijing, China (e-mail: [email protected]; [email protected]).
Defeng (David) Huang is with the School of Electrical, Electronic and Computer Engineering, The University of Western Australia, Australia (e-mail: [email protected]).

I. INTRODUCTION

Compressed sensing (CS) aims to accurately reconstruct sparse signals from undersampled linear measurements [1]–[3]. To this end, a plethora of methods have been studied in the past years. Among others, approximate message passing (AMP) [4], proposed by Donoho et al., is a state-of-the-art algorithm for sparse signal reconstruction in CS. Moreover, AMP has been extended to Bayesian AMP (B-AMP) [5], [6] and to general linear mixing problems [7]–[9]. While many practical signals can be described as sparse, they often exhibit an underlying structure, e.g., the nonzero coefficients occur in clusters [10]–[16]. Exploiting such intrinsic structure beyond simple sparsity can significantly boost the reconstruction performance [14]–[16]. To this end, various algorithms have been proposed, e.g., group LASSO [10], StructOMP [17], Graph-CoSaMP [18], and block sparse Bayesian learning (B-SBL) [19]–[21]. However, these algorithms require knowledge of the sparsity pattern, which is usually unknown a priori.

To reconstruct sparse signals with unknown structure, a number of methods [22]–[28] have been developed that use various structured priors to encourage sparsity and cluster patterns simultaneously. The main effort of these algorithms lies in constructing a hierarchical prior model, e.g., a Markov tree [23], structured spike and slab priors [24], [25], or a hierarchical Gamma-Gaussian model [26]–[28], to encode the structured sparsity pattern.

In this letter, we take an alternative approach and propose an efficient message passing algorithm, termed AMP with nearest neighbor sparsity pattern learning (AMP-NNSPL), to recover clustered sparse signals adaptively, i.e., without any prior knowledge of the sparsity pattern. For clustered sparse signals, if the nearest neighbors of one element are zero (nonzero), that element tends to be zero (nonzero) with high probability, an idea similar to the k-nearest neighbor (k-NN) algorithm, which assumes that data close together are more likely to belong to the same category [29], [30]. Therefore, instead of explicitly modeling a sophisticated sparsity pattern, AMP-NNSPL specifies a flexible spike and slab prior on the unknown signal and, after each AMP iteration, updates the sparse ratios as the average of their nearest neighbor estimates via expectation maximization (EM) [31]. In this way, the sparsity pattern is learned adaptively. Simulation results on both synthetic and real data demonstrate the superiority of the proposed algorithm in terms of both reconstruction performance and computational efficiency.

II. SYSTEM MODEL

Consider the following linear Gaussian model

$$ y = Ax + w, \qquad (1) $$

where $x \in \mathbb{R}^N$ is the unknown signal, $y \in \mathbb{R}^M$ is the vector of available measurements, $A \in \mathbb{R}^{M \times N}$ is the known measurement matrix, and $w \in \mathbb{R}^M \sim \mathcal{N}(w; 0, \Delta_0 I)$ is the additive noise. $\mathcal{N}(x; m, C)$ denotes a Gaussian distribution of $x$ with mean $m$ and covariance $C$, and $I$ denotes the identity matrix. Our goal is to estimate $x$ from $y$ when $M \ll N$ and $x$ is clustered sparse while its specific sparsity pattern is unknown a priori.
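To make this setting concrete, the following NumPy sketch draws one instance of (1) with a clustered-sparse $x$, mirroring the synthetic setup of Section IV ($K$ nonzero entries split into $L$ blocks, Gaussian nonzeros with mean $\mu_0 = 3$ and variance $\tau_0 = 1$, a standard Gaussian measurement matrix with unit-norm columns). The function name, the default $M = 40$, and the block-placement details are illustrative assumptions made here, not the letter's exact generator.

```python
import numpy as np

def block_sparse_problem(N=100, M=40, K=25, L=4, mu0=3.0, tau0=1.0,
                         snr_db=50.0, seed=0):
    """Draw one clustered-sparse instance of model (1) (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Random block sizes summing to K.
    cuts = np.sort(rng.choice(np.arange(1, K), L - 1, replace=False))
    sizes = np.diff(np.r_[0, cuts, K])
    # Non-overlapping random block locations.
    gaps = np.sort(rng.choice(N - K, L, replace=False))
    starts = gaps + np.cumsum(np.r_[0, sizes[:-1]])
    x = np.zeros(N)
    for s, b in zip(starts, sizes):
        x[s:s + b] = mu0 + np.sqrt(tau0) * rng.standard_normal(b)
    # Standard Gaussian measurement matrix with unit-norm columns.
    A = rng.standard_normal((M, N))
    A /= np.linalg.norm(A, axis=0)
    # Additive noise scaled so that SNR = 20*log10(||Ax||_2 / ||w||_2).
    z = A @ x
    w = rng.standard_normal(M)
    w *= np.linalg.norm(z) / (np.linalg.norm(w) * 10.0 ** (snr_db / 20.0))
    return A, x, z + w
```

Calling `A, x, y = block_sparse_problem()` yields one test instance of the kind compared in Section IV.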
To enforce sparsity, from a Bayesian perspective, the signals are assumed to follow sparsity-promoting prior distributions, e.g., the Laplace prior [32], automatic relevance determination [33], and the spike and slab prior [6], [34]. In this letter we consider a flexible spike and slab prior of the form

$$ p_0(x) = \prod_{i=1}^{N} p_0(x_i) = \prod_{i=1}^{N} \big[ (1-\lambda_i)\,\delta(x_i) + \lambda_i f(x_i) \big], \qquad (2) $$

where $\lambda_i \in (0,1)$ is the sparse ratio, i.e., the probability of $x_i$ being nonzero, $\delta(x_i)$ is the Dirac delta function, and $f(x_i)$ is the distribution of the nonzero entries of $x$, e.g., $f(x_i) = \mathcal{N}(x_i; \mu_0, \tau_0)$ for sparse Gaussian signals and $f(x_i) = \delta(x_i - 1)$ for sparse binary signals.

It is important to note that in (2) we specify an individual $\lambda_i$ for each entry, as opposed to a common value as in [6], [34]. This is a key feature that will be exploited by the proposed algorithm for the reconstruction of structured sparse signals. So far, however, no structure has been introduced to enforce the underlying sparsity pattern; indeed, if the sparse ratios $\lambda_i$, $i = 1, \dots, N$, are learned independently, we do not benefit from the potential structure. The main contribution of this letter is a novel adaptive learning method which encourages clustered sparsity, as described in Section III.

III. PROPOSED ALGORITHM

In this section, inspired by the idea of k-NN, we propose an adaptive reconstruction algorithm to recover clustered sparse signals without any prior knowledge of the sparsity pattern, e.g., its structure and sparse ratio.

Before proceeding, we first give a brief description of AMP. Generally, AMP decouples the vector estimation problem (1) into $N$ scalar problems in the asymptotic regime [35], [36]

$$ y = Ax + w \;\longrightarrow\; R_i = x_i + \tilde{w}_i, \quad i = 1, \dots, N, \qquad (3) $$

where the effective noise $\tilde{w}_i$ asymptotically follows $\mathcal{N}(\tilde{w}_i; 0, \Sigma_i)$. The values of $R_i$ and $\Sigma_i$ are updated iteratively in each AMP iteration (see Algorithm 1) and the posterior distribution of $x_i$ is estimated as

$$ q(x_i \mid R_i, \Sigma_i) = \frac{1}{Z(R_i, \Sigma_i)}\, p_0(x_i)\, \mathcal{N}(x_i; R_i, \Sigma_i), \qquad (4) $$

where $Z(R_i, \Sigma_i)$ is the normalization constant. From (4), the estimates of the mean and variance of $x_i$ are

$$ g_a(R_i, \Sigma_i) = \int x_i\, q(x_i \mid R_i, \Sigma_i)\, dx_i, \qquad (5) $$

$$ g_c(R_i, \Sigma_i) = \int x_i^2\, q(x_i \mid R_i, \Sigma_i)\, dx_i - g_a^2(R_i, \Sigma_i). \qquad (6) $$

For more details of AMP and its extensions, the reader is referred to [4]–[6], [35]. Two problems arise in traditional AMP. First, it assumes full knowledge of the prior distribution and the noise variance, which is impractical. Second, it does not account for the potential structure of the sparsity. In the sequel, we resort to expectation maximization (EM) to learn the unknown hyperparameters. Further, to encourage structured sparsity, we develop a nearest neighbor sparsity pattern learning rule motivated by the idea of the k-NN algorithm. For lack of space, we only consider the sparse Gaussian case, $f(x_i) = \mathcal{N}(x_i; \mu_0, \tau_0)$, while generalization to other settings is possible.

The hidden variables are chosen as the unknown signal vector $x$ and the hyperparameters are denoted by $\theta$. The specific definition of $\theta$ depends on the choice of the distribution $f(x)$ in (2): in the Gaussian case, $\theta = \{\mu_0, \tau_0, \Delta_0, \lambda_i, i = 1, \dots, N\}$, while in the binary case, $\theta = \{\Delta_0, \lambda_i, i = 1, \dots, N\}$. Denote by $\theta^t$ the estimate of the hyperparameters at the $t$-th EM iteration; then EM alternates between the following two steps [31]

$$ Q(\theta, \theta^t) = \mathrm{E}\big\{ \ln p(x, y) \mid y; \theta^t \big\}, \qquad (7) $$

$$ \theta^{t+1} = \arg\max_{\theta} Q(\theta, \theta^t), \qquad (8) $$

where $\mathrm{E}\{\cdot \mid y; \theta^t\}$ denotes expectation conditioned on the observations $y$ with parameters $\theta^t$, i.e., the expectation is with respect to the posterior distribution $p(x \mid y; \theta^t)$.

From (1) and (2), the joint distribution $p(x, y)$ in (7) is

$$ p(x, y) = p(y \mid x) \prod_i \big[ (1-\lambda_i)\,\delta(x_i) + \lambda_i f(x_i) \big], \qquad (9) $$

where $p(y \mid x) = \mathcal{N}(y; Ax, \Delta_0 I)$. AMP offers an efficient approximation of $p(x \mid y; \theta^t)$, denoted as $q(x \mid y; \theta^t) = \prod_i q(x_i \mid R_i, \Sigma_i)$, whereby the E step (7) can be efficiently calculated. Since joint optimization of $\theta$ is difficult, we adopt the incremental EM update rule proposed in [37], i.e., we update one or a subset of the elements at a time while holding the other parameters fixed.

After some algebra, the marginal posterior distribution of $x_i$ in (4) can be written as

$$ q(x_i \mid R_i, \Sigma_i) = (1 - \pi_i)\,\delta(x_i) + \pi_i\, \mathcal{N}(x_i; m_i, V_i), \qquad (10) $$

where

$$ V_i = \frac{\tau_0 \Sigma_i}{\Sigma_i + \tau_0}, \qquad (11) $$

$$ m_i = \frac{\tau_0 R_i + \Sigma_i \mu_0}{\Sigma_i + \tau_0}, \qquad (12) $$

$$ \pi_i = \frac{\lambda_i}{\lambda_i + (1 - \lambda_i)\exp(-L)}, \qquad (13) $$

$$ L = \frac{1}{2}\ln\frac{\Sigma_i}{\Sigma_i + \tau_0} + \frac{R_i^2}{2\Sigma_i} - \frac{(R_i - \mu_0)^2}{2(\Sigma_i + \tau_0)}. \qquad (14) $$
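These E-step quantities are simple closed forms. For illustration, a minimal NumPy sketch of the resulting scalar denoiser is given below; the posterior mean and variance it returns correspond to (15) and (16) stated next. The function name, the vectorized interface, and the clipping of $L$ are implementation choices made here, not part of the letter.

```python
import numpy as np

def spike_slab_denoiser(R, Sigma, lam, mu0, tau0):
    """Posterior statistics of the spike-and-slab Gaussian prior, (10)-(16).

    R, Sigma: AMP pseudo-observations and their variances (length-N arrays).
    lam, mu0, tau0: current estimates of the hyperparameters.
    """
    V = tau0 * Sigma / (Sigma + tau0)                     # (11)
    m = (tau0 * R + Sigma * mu0) / (Sigma + tau0)         # (12)
    L = (0.5 * np.log(Sigma / (Sigma + tau0))
         + R ** 2 / (2.0 * Sigma)
         - (R - mu0) ** 2 / (2.0 * (Sigma + tau0)))       # (14)
    # (13); clipping L keeps exp(-L) finite in this quick sketch
    pi = lam / (lam + (1.0 - lam) * np.exp(-np.clip(L, -50.0, 50.0)))
    g_a = pi * m                                          # posterior mean, (15)
    g_c = pi * (m ** 2 + V) - g_a ** 2                    # posterior variance, (16)
    return g_a, g_c, pi
```

Within Algorithm 1, $g_a$ and $g_c$ are fed back into the AMP recursion, while $\pi_i$ drives the nearest neighbor update (17).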
Note that, for notational brevity, we have omitted the iteration index $t$. The mean and variance defined in (5) and (6) can now be calculated explicitly as

$$ g_a(R_i, \Sigma_i) = \pi_i m_i, \qquad (15) $$

$$ g_c(R_i, \Sigma_i) = \pi_i\big(m_i^2 + V_i\big) - g_a^2(R_i, \Sigma_i). \qquad (16) $$

To learn the sparse ratios $\lambda_i$, $i = 1, \dots, N$, we need to maximize $Q(\theta, \theta^t)$ with respect to $\lambda_i$. After some algebra, we obtain the standard EM update $\lambda_i^{t+1} = \pi_i^t$, which, albeit simple, fails to capture the inherent structure in the sparsity pattern. To address this problem, a novel learning rule is proposed as follows

$$ \lambda_i^{t+1} = \frac{1}{|\mathcal{N}(i)|} \sum_{j \in \mathcal{N}(i)} \pi_j^t, \qquad (17) $$

where $\mathcal{N}(i)$ denotes the set of nearest neighbor indexes of element $x_i$ in $x$ and $|\mathcal{N}(i)|$ denotes the cardinality of $\mathcal{N}(i)$. For one-dimensional (1D) data, $\mathcal{N}(i) = \{i-1, i+1\}$ and $|\mathcal{N}(i)| = 2$,¹ while for two-dimensional (2D) data, $\mathcal{N}(i) = \{(q, l-1), (q, l+1), (q-1, l), (q+1, l)\}$ and $|\mathcal{N}(i)| = 4$, where $(q, l)$ indicates the coordinates of $x_i$ in the 2D space. Generalizations to other cases can be made.

¹For the end points of 1D data, the nearest neighbor set has only one element; for the edge points of 2D data, it has only two or three elements.

Note that in (17) we have chosen the nearest neighbors of each element, excluding the element itself, as the neighboring set. The estimate of one sparse ratio is thus not determined by its own estimate, but rather by the average of its nearest neighbor estimates. The insight behind this choice is that, for clustered sparse signals, if the nearest neighbors of one element are zero (nonzero), the element itself will be zero (nonzero) with high probability, an idea similar to k-NN. If the neighboring set is chosen to be the whole set of elements, the proposed algorithm reduces to EM-BG-GAMP [6], [34].

The learning of the other hyperparameters follows the standard EM rule. Maximizing $Q(\theta, \theta^t)$ with respect to $\Delta_0$, after some algebra, we obtain

$$ \Delta_0^{t+1} = \frac{1}{M}\sum_a \left[ \frac{\big(y_a - Z_a^t\big)^2}{\big(1 + V_a^t/\Delta_0^t\big)^2} + \frac{\Delta_0^t V_a^t}{\Delta_0^t + V_a^t} \right], \qquad (18) $$

where $Z_a^t$ and $V_a^t$ are obtained within the AMP iteration and are defined in Algorithm 1. Similarly, maximizing $Q(\theta, \theta^t)$ with respect to $\mu_0$ and $\tau_0$ results in the update equations

$$ \mu_0^{t+1} = \frac{\sum_i \pi_i^t m_i^t}{\sum_i \pi_i^t}, \qquad (19) $$

$$ \tau_0^{t+1} = \frac{1}{\sum_i \pi_i^t} \sum_i \pi_i^t \big[ \big(\mu_0^t - m_i^t\big)^2 + V_i \big]. \qquad (20) $$

Valid initialization of the unknown hyperparameters is essential since the EM algorithm may converge to a local maximum or a saddle point of the likelihood function [31]. The sparse ratios $\lambda_i$ and the noise variance $\Delta_0$ are initialized as $\lambda_i^1 = 0.5$ and $\Delta_0^1 = \|y\|_2^2 / \big(M(\mathrm{SNR}^0 + 1)\big)$, respectively, where $\mathrm{SNR}^0$ is suggested to be 100 [34]. For the sparse Gaussian case, the active mean $\mu_0$ and variance $\tau_0$ are initialized as $\mu_0^1 = 0$ and $\tau_0^1 = \big(\|y\|_2^2 - M\Delta_0^1\big)/\big(\lambda_i^1 \|A\|_F^2\big)$, respectively, where $\|\cdot\|_2$ and $\|\cdot\|_F$ denote the $\ell_2$ norm and the Frobenius norm, respectively.

The proposed approximate message passing with nearest neighbor sparsity pattern learning (AMP-NNSPL) algorithm is summarized in Algorithm 1. The complexity of AMP-NNSPL is dominated by the matrix-vector multiplications of the original AMP and thus scales only as $O(MN)$, i.e., the proposed algorithm is computationally efficient.

Algorithm 1 AMP-NNSPL Algorithm
Input: $y$, $A$.
Initialization: Set $t = 1$ and $T_{\max}$, $\epsilon_{toc}$. Initialize $\mu_0$, $\tau_0$, $\Delta_0$, and $\lambda_i$, $i = 1, \dots, N$, as in Section III. Set $\hat{x}_i^1 = \int x_i\, p_0(x_i)\, dx_i$, $\nu_i^1 = \int |x_i - \hat{x}_i^1|^2 p_0(x_i)\, dx_i$, $i = 1, \dots, N$, and $V_a^0 = 1$, $Z_a^0 = y_a$, $a = 1, \dots, M$.
1) Factor node update: for $a = 1, \dots, M$,
$$ V_a^t = \sum_i |A_{ai}|^2 \nu_i^t, \qquad Z_a^t = \sum_i A_{ai}\hat{x}_i^t - \frac{V_a^t}{\Delta_0^t + V_a^{t-1}}\big(y_a - Z_a^{t-1}\big). $$
2) Variable node update: for $i = 1, \dots, N$,
$$ \Sigma_i^t = \Big[ \sum_a \frac{|A_{ai}|^2}{\Delta_0^t + V_a^t} \Big]^{-1}, \qquad R_i^t = \hat{x}_i^t + \Sigma_i^t \sum_a \frac{A_{ai}\big(y_a - Z_a^t\big)}{\Delta_0^t + V_a^t}, $$
$$ \hat{x}_i^{t+1} = g_a\big(R_i^t, \Sigma_i^t\big), \qquad \hat{\nu}_i^{t+1} = g_c\big(R_i^t, \Sigma_i^t\big). $$
3) Update $\lambda_i^{t+1}$, $i = 1, \dots, N$, as in (17).
4) Update $\mu_0^{t+1}$, $\tau_0^{t+1}$, $\Delta_0^{t+1}$ as in (19), (20), and (18).
5) Set $t \leftarrow t + 1$ and return to step 1) until $T_{\max}$ iterations are reached or $\|\hat{x}^{t+1} - \hat{x}^t\|_2 < \epsilon_{toc}\,\|\hat{x}^t\|_2$.
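As a concrete illustration of Algorithm 1, the following self-contained NumPy sketch implements the sparse Gaussian case with the 1D chain neighbors of (17), inlining the denoiser of (10)–(16). It is a simplified reading of the letter, not the authors' reference code: the function name, the default arguments, the clipping of $L$, and the omission of any damping are assumptions made here.

```python
import numpy as np

def amp_nnspl(y, A, T_max=200, eps_toc=1e-6, snr0=100.0):
    """Minimal AMP-NNSPL sketch: sparse Gaussian prior, 1D chain neighbors."""
    M, N = A.shape
    A2 = A ** 2
    # Hyperparameter initialization as in Section III.
    lam = np.full(N, 0.5)
    Delta = np.sum(y ** 2) / (M * (snr0 + 1.0))
    mu0 = 0.0
    tau0 = (np.sum(y ** 2) - M * Delta) / (0.5 * np.sum(A2))
    x_hat = lam * mu0                              # prior mean of x_i
    nu = lam * (tau0 + mu0 ** 2) - x_hat ** 2      # prior variance of x_i
    V_old, Z_old = np.ones(M), y.copy()
    for _ in range(T_max):
        # 1) Factor node update.
        V = A2 @ nu
        Z = A @ x_hat - V / (Delta + V_old) * (y - Z_old)
        # 2) Variable node update.
        Sigma = 1.0 / (A2.T @ (1.0 / (Delta + V)))
        R = x_hat + Sigma * (A.T @ ((y - Z) / (Delta + V)))
        # Spike-and-slab posterior statistics, (10)-(16).
        Vp = tau0 * Sigma / (Sigma + tau0)
        m = (tau0 * R + Sigma * mu0) / (Sigma + tau0)
        L = (0.5 * np.log(Sigma / (Sigma + tau0)) + R ** 2 / (2 * Sigma)
             - (R - mu0) ** 2 / (2 * (Sigma + tau0)))
        pi = lam / (lam + (1 - lam) * np.exp(-np.clip(L, -50, 50)))
        x_new = pi * m                             # (15)
        nu = pi * (m ** 2 + Vp) - x_new ** 2       # (16)
        # 3) Nearest neighbor sparsity pattern learning, (17), 1D neighbors.
        lam = np.empty(N)
        lam[1:-1] = 0.5 * (pi[:-2] + pi[2:])
        lam[0], lam[-1] = pi[1], pi[-2]            # end points: one neighbor
        lam = np.clip(lam, 1e-12, 1.0 - 1e-12)
        # 4) EM updates of the remaining hyperparameters, (19), (20), (18).
        mu0_new = np.sum(pi * m) / np.sum(pi)
        tau0 = np.sum(pi * ((mu0 - m) ** 2 + Vp)) / np.sum(pi)
        mu0 = mu0_new
        Delta = np.mean((y - Z) ** 2 / (1 + V / Delta) ** 2
                        + Delta * V / (Delta + V))
        # 5) Stopping rule.
        if np.linalg.norm(x_new - x_hat) < eps_toc * np.linalg.norm(x_hat):
            x_hat = x_new
            break
        x_hat, V_old, Z_old = x_new, V, Z
    return x_hat
```

On an instance from the generator sketched after Section II, `x_hat = amp_nnspl(y, A)` returns the estimate $\hat{x}$; for the 2D images of Section IV, step 3) would instead average $\pi$ over the four adjacent pixels.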
IV. NUMERICAL EXPERIMENTS

In this section, a series of numerical experiments are performed to demonstrate the performance of the proposed algorithm under various settings. Comparisons are made with state-of-the-art methods that need no prior information on the sparsity pattern, namely PC-SBL [26] and its AMP version PCSBL-GAMP [27], MBCS-LBP [28], and EM-BG-GAMP [34]. The performance of Basis Pursuit (BP) [38]–[40] is also evaluated. Throughout the experiments, we set the maximum number of iterations for AMP-NNSPL, PCSBL-GAMP, and EM-BG-GAMP to $T_{\max} = 200$ and the termination tolerance to $\epsilon_{toc} = 10^{-6}$; for the other algorithms, we use their default settings. The elements of the measurement matrix $A \in \mathbb{R}^{M \times N}$ are generated independently from the standard Gaussian distribution and the columns are normalized to unit norm. The success rate is defined as the ratio of the number of successful trials to the total number of experiments, where a trial is successful if the normalized mean square error (NMSE) is less than $-60$ dB, with $\mathrm{NMSE} = 20\log_{10}\big(\|\hat{x} - x\|_2 / \|x\|_2\big)$. The pattern recovery success rate is defined analogously, where a trial is successful if the support is exactly recovered; a coefficient whose magnitude is less than $10^{-4}$ is deemed to be zero.

A. Synthetic Data

We generate synthetic block-sparse signals in a similar way as [21], [26], where $K$ nonzero elements are partitioned into $L$ blocks with random sizes and random locations. We set $N = 100$, $K = 25$, $L = 4$, and the nonzero elements are generated independently from a Gaussian distribution with mean $\mu_0 = 3$ and variance $\tau_0 = 1$. The results are averaged over 1000 independent runs. Fig. 1 depicts the success rate and the pattern recovery success rate; AMP-NNSPL achieves the highest success rate and pattern recovery rate at various measurement ratios. In the noisy setting, Fig. 2 shows the average NMSE and runtime of the different algorithms when the signal to noise ratio (SNR) is 50 dB, where $\mathrm{SNR} = 20\log_{10}\big(\|Ax\|_2 / \|w\|_2\big)$. We see that AMP-NNSPL outperforms the other methods in terms of both NMSE and computational efficiency.

Figure 1. Success rate (left) and pattern success rate (right) vs. $M/N$ for block-sparse signals, $N = 100$, $K = 25$, $L = 4$, noiseless case.

Figure 2. NMSE (left) and recovery time (right) vs. $M/N$ for block-sparse signals, $N = 100$, $K = 25$, $L = 4$, SNR = 50 dB.

B. Real Data

To evaluate the performance on real data, we consider a real angiogram image [18] of $100 \times 100$ pixels with sparsity around 0.12. Fig. 3 depicts the success rate in the noiseless case and the NMSE at SNR = 50 dB, respectively. The MBCS-LBP and PC-SBL algorithms are not included due to their high computational complexity. It can be seen that AMP-NNSPL significantly outperforms the other methods in terms of both success rate and NMSE. In particular, when $M/N = 0.12$ and SNR = 50 dB, typical recovery results are illustrated in Fig. 4, which shows that AMP-NNSPL achieves the best reconstruction performance.

Figure 3. Success rate (left) in the noiseless case and NMSE (right) at SNR = 50 dB vs. $M/N$ for the real 2D angiogram image.

Figure 4. Recovery results for the real 2D angiogram image in the noisy setting when $M/N = 0.12$ and SNR = 50 dB: (a) Original, (b) BP, (c) EM-BG-GAMP, (d) PCSBL-GAMP, (e) AMP-NNSPL, (f) NMSE (dB): BP $-0.401$, EM-BG-GAMP $-1.219$, PCSBL-GAMP $-7.146$, AMP-NNSPL $-17.908$.

V. CONCLUSION

In this letter, we propose an efficient algorithm termed AMP-NNSPL to recover clustered sparse signals when the sparsity pattern is unknown. Inspired by the k-NN algorithm, AMP-NNSPL learns the sparse ratios in each AMP iteration as the average of their nearest neighbor estimates using EM, whereby the sparsity pattern is learned adaptively. Experimental results on both synthetic and real data demonstrate the state-of-the-art performance of AMP-NNSPL.
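As a small reproducibility aid, the evaluation criteria of Section IV can be computed as below; the helper names are illustrative, while the thresholds ($-60$ dB for a successful trial, $10^{-4}$ for declaring a coefficient zero) follow the definitions stated there.

```python
import numpy as np

def nmse_db(x_hat, x):
    """NMSE in dB, 20*log10(||x_hat - x||_2 / ||x||_2), as in Section IV."""
    return 20.0 * np.log10(np.linalg.norm(x_hat - x) / np.linalg.norm(x))

def trial_success(x_hat, x, threshold_db=-60.0):
    """A trial counts as successful if the NMSE is below -60 dB."""
    return nmse_db(x_hat, x) < threshold_db

def pattern_success(x_hat, x, zero_threshold=1e-4):
    """Support recovery: the thresholded supports must coincide exactly."""
    return np.array_equal(np.abs(x_hat) > zero_threshold,
                          np.abs(x) > zero_threshold)
```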
REFERENCES

[1] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[2] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[3] Y. C. Eldar and G. Kutyniok, Eds., Compressed Sensing: Theory and Applications. Cambridge Univ. Press, 2012.
[4] D. L. Donoho, A. Maleki, and A. Montanari, "Message-passing algorithms for compressed sensing," in Proc. Nat. Acad. Sci., vol. 106, no. 45, Nov. 2009, pp. 18914–18919.
[5] ——, "Message passing algorithms for compressed sensing: I. Motivation and construction," in IEEE Information Theory Workshop (ITW), Jan. 2010, pp. 1–5.
[6] F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová, "Probabilistic reconstruction in compressed sensing: Algorithms, phase diagrams, and threshold achieving matrices," Journal of Statistical Mechanics: Theory and Experiment, vol. 2012, no. 08, p. P08009, 2012.
[7] S. Rangan, "Generalized approximate message passing for estimation with random linear mixing," in Proc. IEEE Int. Symp. Inf. Theory, 2011, pp. 2168–2172.
[8] P. Schniter, "A message-passing receiver for BICM-OFDM over unknown clustered-sparse channels," IEEE J. Sel. Topics Signal Process., vol. 5, no. 8, pp. 1462–1474, Dec. 2011.
[9] U. S. Kamilov, S. Rangan, A. K. Fletcher, and M. Unser, "Approximate message passing with consistent parameter estimation and applications to sparse learning," IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2969–2985, May 2014.
[10] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.
[11] V. Cevher, P. Indyk, C. Hegde, and R. G. Baraniuk, "Recovery of clustered sparse signals from compressive measurements," DTIC Document, Tech. Rep., 2009.
[12] M. Stojnic, F. Parvaresh, and B. Hassibi, "On the reconstruction of block-sparse signals with an optimal number of measurements," IEEE Trans. Signal Process., vol. 57, no. 8, pp. 3075–3085, 2009.
[13] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, "Model-based compressive sensing," IEEE Trans. Inf. Theory, vol. 56, no. 4, pp. 1982–2001, 2010.
[14] J. Huang, T. Zhang et al., "The benefit of group sparsity," The Annals of Statistics, vol. 38, no. 4, pp. 1978–2004, 2010.
[15] Y. C. Eldar and M. Mishali, "Robust recovery of signals from a structured union of subspaces," IEEE Trans. Inf. Theory, vol. 55, no. 11, pp. 5302–5316, 2009.
[16] Y. C. Eldar, P. Kuppinger, and H. Bölcskei, "Block-sparse signals: Uncertainty relations and efficient recovery," IEEE Trans. Signal Process., vol. 58, no. 6, pp. 3042–3054, 2010.
[17] J. Huang, T. Zhang, and D. Metaxas, "Learning with structured sparsity," The Journal of Machine Learning Research, vol. 12, pp. 3371–3412, 2011.
[18] C. Hegde, P. Indyk, and L. Schmidt, "A nearly-linear time framework for graph-structured sparsity," in Proceedings of the 32nd International Conference on Machine Learning, 2015, pp. 928–937.
[19] D. P. Wipf and B. D. Rao, "An empirical Bayesian strategy for solving the simultaneous sparse approximation problem," IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3704–3716, 2007.
[20] Z. Zhang and B. D. Rao, "Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning," IEEE J. Sel. Topics Signal Process., vol. 5, no. 5, pp. 912–926, 2011.
[21] Z. Zhang and B. Rao, "Extension of SBL algorithms for the recovery of block sparse signals with intra-block correlation," IEEE Trans. Signal Process., vol. 61, no. 8, pp. 2009–2015, 2013.
[22] L. He and L. Carin, "Exploiting structure in wavelet-based Bayesian compressive sensing," IEEE Trans. Signal Process., vol. 57, no. 9, pp. 3488–3497, 2009.
[23] S. Som and P. Schniter, "Compressive imaging using approximate message passing and a Markov-tree prior," IEEE Trans. Signal Process., vol. 60, no. 7, pp. 3439–3448, 2012.
[24] L. Yu, H. Sun, J.-P. Barbot, and G. Zheng, "Bayesian compressive sensing for cluster structured sparse signals," Signal Processing, vol. 92, no. 1, pp. 259–269, 2012.
[25] M. R. Andersen, O. Winther, and L. K. Hansen, "Spatio-temporal spike and slab priors for multiple measurement vector problems," arXiv preprint arXiv:1508.04556, 2015.
[26] J. Fang, Y. Shen, H. Li, and P. Wang, "Pattern-coupled sparse Bayesian learning for recovery of block-sparse signals," IEEE Trans. Signal Process., vol. 63, no. 2, pp. 360–372, 2015.
[27] J. Fang, L. Zhang, and H. Li, "Two-dimensional pattern-coupled sparse Bayesian learning via generalized approximate message passing," arXiv preprint arXiv:1505.06270, 2015.
[28] L. Yu, H. Sun, G. Zheng, and J.-P. Barbot, "Model based Bayesian compressive sensing via local beta process," Signal Processing, vol. 108, pp. 259–271, 2015.
[29] E. Fix and J. L. Hodges Jr., "Discriminatory analysis - nonparametric discrimination: Consistency properties," DTIC Document, Tech. Rep., 1951.
[30] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inf. Theory, vol. 13, no. 1, pp. 21–27, 1967.
[31] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B (Methodological), pp. 1–38, 1977.
[32] T. Park and G. Casella, "The Bayesian lasso," Journal of the American Statistical Association, vol. 103, no. 482, pp. 681–686, 2008.
[33] M. E. Tipping, "Sparse Bayesian learning and the relevance vector machine," The Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.
[34] J. P. Vila and P. Schniter, "Expectation-maximization Gaussian-mixture approximate message passing," IEEE Trans. Signal Process., vol. 61, no. 19, pp. 4658–4672, 2013.
[35] A. Montanari, "Graphical models concepts in compressed sensing," arXiv preprint arXiv:1011.4328, 2010.
[36] M. Bayati and A. Montanari, "The dynamics of message passing on dense graphs, with applications to compressed sensing," IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.
[37] R. M. Neal and G. E. Hinton, "A view of the EM algorithm that justifies incremental, sparse, and other variants," in Learning in Graphical Models. Springer, 1998, pp. 355–368.
[38] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Journal on Scientific Computing, vol. 20, no. 1, pp. 33–61, 1998.
[39] E. J. Candès and T. Tao, "Decoding by linear programming," IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203–4215, 2005.
[40] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, 2006.