Robust PCA and subspace tracking from incomplete observations using $\ell_0$-surrogates∗

Clemens Hage and Martin Kleinsteuber
Department of Electrical Engineering and Information Technology
Technische Universität München
Arcisstr. 21, 80333 Munich, Germany
{hage,kleinsteuber}@tum.de
http://www.gol.ei.tum.de

October 3, 2012

∗This work has been supported by the DFG excellence initiative research cluster CoTeSys.
Abstract

Many applications in data analysis rely on the decomposition of a data matrix into a low-rank and a sparse component. Existing methods that tackle this task use the nuclear norm and $\ell_1$-cost functions as convex relaxations of the rank constraint and the sparsity measure, respectively, or employ thresholding techniques. We propose a method that allows for reconstructing and tracking a subspace of upper-bounded dimension from incomplete and corrupted observations. It does not require any a priori information about the number of outliers. The core of our algorithm is an intrinsic Conjugate Gradient method on the set of orthogonal projection matrices, the so-called Grassmannian. Non-convex sparsity measures are used for outlier detection, which leads to improved performance in terms of robustly recovering and tracking the low-rank matrix. In particular, our approach can cope with more outliers and with an underlying matrix of higher rank than other state-of-the-art methods.
1 Introduction

The detection of subspaces that best fit high-dimensional data is a challenging and important task in data analysis with countless applications. The classic Principal Component Analysis (PCA) is still the standard tool for searching the best rank-$k$ approximation of a high-dimensional data set $X \in \mathbb{R}^{m \times n}$, where the approximation quality is measured in terms of the Frobenius norm. On the one hand, this minimization problem can be solved easily via a Singular Value Decomposition (SVD), while on the other hand the choice of the Frobenius norm makes this approach highly vulnerable to heavy outliers. In the past years a lot of effort has been spent on methods that allow finding the best-fitting subspace despite the presence of heavy outliers in the measurements or missing data. Certain approaches in this area, commonly known as Robust PCA, stick closely to the classic PCA concept and robustify it by measuring the distance from the data points to the subspace with $\ell_1$-cost functions, cf. Ding et al (2006), Kwak (2008).
Others, including the approach presented here, relate Robust PCA to the field of robust matrix completion. That is, given some incomplete observations $\hat{X}$ of a corrupted and possibly noisy data matrix $X$, which is the superposition of a low-rank matrix $L$ and a sparse matrix $S$, the task is to recover $L$ and $S$.
1.1 Related Work
One of the most prominent approaches to Robust PCA in the past years has been proposed by Candes et al (2009) and Wright et al (2009). The authors formulate a convex optimization problem by relaxing the hard rank constraint to a nuclear norm minimization and analyze how well the $\ell_1$-relaxation approximates the $\ell_0$-norm in the low-rank and sparse decomposition task. Methods to solve the problem include Singular Value Thresholding (SVT, Cai et al (2010)) and the exact (EALM) or inexact (IALM) augmented Lagrangian multiplier method (Lin et al, 2010), which are fast and comparably reliable if the underlying assumptions, i.e. the low-rank property and the sparsity of the outliers, are valid. The problem of reconstructing the low-rank matrix $L$ in the case where entire columns of the measurements are corrupted is considered by Chen et al (2011). A method called SpaRCS is proposed by Waters et al (2011), which recovers the low-rank and the sparse part of a matrix from compressive measurements.
In many applications it is reasonable to assume that no additive Gaussian noise is present in the model, as noise is negligible compared to the signal power, e.g. when dealing with high-quality sensors in the field of Computer Vision. However, there also exist approaches such as GoDec (Zhou and Tao, 2011) that explicitly model additional Gaussian noise. The method uses thresholding techniques and extends the low-rank and sparse decomposition to the noisy case. SpaRCS and GoDec aim at recovering both the low-rank matrix $L$ and the outliers $S$ from noisy measurements and therefore require an upper bound for the cardinality of the sparse component.
In contrast to the often-performed rank relaxation, there also exist methods which fix the dimension of the subspace approximation, such as the greedy algorithm GECO (Shalev-Shwartz et al, 2011). This method reconstructs a data set iteratively based on the SVD while increasing the rank of the approximation in each step. A general framework for optimizing over matrices of a fixed rank has also been proposed by Shalit et al (2010). One drawback here is that the manifold requires a fixed rank $k$, and optimal points that may have a rank strictly lower than $k$ are not in the feasible set.
An alternative and elegant way to control the rank in terms of an upper bound is optimization on the set of fixed-dimensional subspaces, the so-called Grassmannian. As pointed out by Meyer et al (2011), optimizing on the Grassmannian offers many advantages, such as limited memory usage and a lower number of parameters to optimize. Although the optimization problem becomes non-convex, reliable performance can be achieved in practice. In the work of Keshavan and Montanari (2010), spectral techniques are combined with a learning algorithm on the Grassmannian. Boumal and Absil (2011) propose a Riemannian trust-region method on the Grassmannian for low-rank matrix completion, which, however, does not consider heavy outliers. The GROUSE algorithm (Balzano et al, 2010) furthermore demonstrates that optimization on the Grassmannian allows the underlying subspace to be estimated incrementally. Instead of batch-processing a data set, the samples can be processed one at a time, which makes it possible to track a subspace that varies over time, even if it is incompletely observed. A small portion of the data is used to obtain an initial subspace estimate, and with every new incoming data sample this estimate is modified to follow changes in the dominant subspace over time. While the method of Balzano et al (2010) operates with $\ell_2$-cost functions, its recent adaptation GRASTA (He et al, 2012) performs a gradient descent on the Grassmannian and aims at optimizing an $\ell_1$-cost function to mitigate the effects of heavy outliers in the subspace tracking stage. The authors overcome the non-differentiability of the $\ell_1$-norm by formulating an augmented Lagrangian optimization problem at the cost of doubling the number of unknown parameters.
1.2 Our Contribution
Although the $\ell_1$-norm leads to favorably conditioned optimization problems, it is well known that penalizing with non-convex $\ell_0$-surrogates allows reconstruction even in cases where $\ell_1$-based methods fail, see e.g. Chartrand and Staneva (2008). Therefore, we propose a framework that combines the advantages of Grassmannian optimization with non-convex sparsity measures. Our approach focuses primarily on reconstructing and tracking the underlying subspace and can operate on both fully and incompletely observed data sets. The algorithm performs a low-rank and sparse decomposition. However, it does not require any information about the cardinality or the support set of the sparse outliers, which distinguishes it from SpaRCS and GoDec. In contrast to GRASTA (He et al, 2012), the method presented in this paper directly optimizes the cost function and thus operates with less than half the number of unknowns. Like all optimization methods on the Grassmannian, our algorithm allows the dimension of the underlying subspace to be upper-bounded and easily extends to the problem of robustly tracking this subspace.
Experimental results confirm that the proposed method can cope with more outliers and with an underlying matrix of higher rank than other state-of-the-art methods, thus extending possible areas of application from the strict low-rank and sparse decomposition to the more general area of robust dimensionality reduction. In the following section we present an alternating minimization scheme and relate our approach to dimensionality reduction via PCA. Subsequently, we carefully derive and explain a Conjugate Gradient (CG) type algorithm on the Grassmannian for solving the individual minimization tasks. We then extend the static method by a dynamic subspace tracking algorithm and finally, we evaluate the performance of the proposed method with various $\ell_0$-surrogates and compare our approach to other state-of-the-art methods.
2 Problem statement
Let $X \in \mathbb{R}^{m \times n}$ be the data matrix of which we select the partial entries $\hat{X} = \mathcal{A}(X)$ using a linear operator. We consider the data model $X = L + S$, where $L$ is of rank $\operatorname{rk}(L) \leq k$ and $S$ is sparse. Our aim is to recover $L$ from the observations $\hat{X}$. A direct approach leads to the numerically infeasible optimization problem

$$\min_{\operatorname{rk}(L) \leq k} \|\hat{X} - \mathcal{A}(L)\|_0. \qquad (1)$$
Since $\operatorname{rk}(L) \leq k$, we can write the matrix $L$ as the product $L = UY$, where $Y \in \mathbb{R}^{k \times n}$ and $U$ is an element of the so-called Stiefel manifold $\mathrm{St}_{k,m} = \{U \in \mathbb{R}^{m \times k} \mid U^\top U = I_k\}$, with $I_k$ denoting the $(k \times k)$-identity matrix. This factorization of $L$ has the following practical interpretation: $U$ is an orthonormal basis of the robustly estimated subspace of the first $k$ principal components, hereafter referred to as the (robust) dominant subspace. Note that neither $U$ nor $Y$ is uniquely determined in this factorization. Let $O(k) = \{\theta \in \mathbb{R}^{k \times k} \mid \theta^\top \theta = I_k\}$ define the set of orthogonal matrices; then $U$ can always be adjusted by an orthogonal matrix $\theta \in O(k)$ such that $UY = U\theta\theta^\top Y$ with $\theta^\top Y$ having uncorrelated rows, leading to the $k$ (robustly estimated) principal component scores of $X$.
Our concession to the numerical infeasibility of (1) is to employ an alternative, possibly non-convex sparsity measure $h$. Some concrete examples can be found in Section 6.1. As a result, problem (1) relaxes to the minimization problem

$$\min_{U \in \mathrm{St}_{k,m},\, Y \in \mathbb{R}^{k \times n}} h(\hat{X} - \mathcal{A}(UY)). \qquad (2)$$
Problem (2) can be addressed in two different ways. Either the full data set $\hat{X}$ is processed at once, resulting in one estimate for $U$ and $Y$, or the data vectors $\hat{x}$ are processed one at a time. In the latter case, while for each new sample $\hat{x}$ one obtains a corresponding optimal coordinate set $y$, the subspace estimate $U$ should vary smoothly over time and fit both recent and current observations. We will refer to the former problem as subspace reconstruction or Robust PCA and to the latter as subspace tracking.
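For concreteness, a minimal NumPy sketch of one common realization of the sampling operator $\mathcal{A}$, namely entrywise sampling via a boolean observation mask, is given below. The mask-based realization and all variable names are illustrative assumptions rather than part of the method description; with this choice $\mathcal{A}$ is self-adjoint, which is convenient when the adjoint $\mathcal{A}^*$ appears in the gradients later on.

```python
import numpy as np

def sample_operator(X, mask):
    """A(X): keep the observed entries of X (mask == True), set the rest to zero."""
    return np.where(mask, X, 0.0)

def sample_adjoint(R, mask):
    """A*(R): for entrywise sampling the operator is self-adjoint."""
    return np.where(mask, R, 0.0)

# sanity check of the adjoint property <A(X), R> = <X, A*(R)>
rng = np.random.default_rng(0)
m, n = 8, 6
X = rng.standard_normal((m, n))
R = rng.standard_normal((m, n))
mask = rng.random((m, n)) < 0.7        # roughly 70% of the entries are observed
X_hat = sample_operator(X, mask)       # incomplete observations, X^ = A(X)
assert np.isclose(np.sum(sample_operator(X, mask) * R),
                  np.sum(X * sample_adjoint(R, mask)))
```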
3 An alternating minimization framework using $\ell_0$-surrogates
Directly tackling problem (2) has two severe drawbacks. The first is the above-mentioned ambiguity in the factorization $UY$, and the second is the fact that reasonable sparsity measures are not smooth and thus forbid the use of fast smooth optimization methods. We overcome these two problems by proposing an alternating minimization framework that gets rid of the ambiguity by iterating on a more appropriate geometric setting and allows us to use smooth approximations of $h$, which combine the advantage of having fast smooth optimization tools at hand together with the strong capability of non-convex sparsity measures in reconstruction tasks.

So let $h_\mu \colon \mathbb{R}^{m \times n} \to \mathbb{R}$ with $\mu \in \mathbb{R}^+$ be a smooth approximation of $h$ such that $h_\mu$ converges pointwise to $h$ as $\mu$ tends to zero. Monotonically shrinking the smoothing parameter $\mu$ between alternating minimization steps allows us to combine the advantages of both smooth optimization and $\ell_0$-like sparsity measures. Schematically, we tackle problem (2) by iterating the following two steps until convergence.
Step #1 Let $L^{(i)} = U^{(i)} Y^{(i)}$ be the $i$-th iterate of the minimization process. In the first step, the estimate of the dominant subspace is improved by solving

$$U^{(i+1)} = \operatorname*{arg\,min}_{U \in \mathrm{St}_{k,m}} h_\mu\big(\hat{X} - \mathcal{A}(U U^\top L^{(i)})\big). \qquad (3)$$

We master the ambiguity problem by reformulating (3) as a minimization task on the set of symmetric rank-$k$ projectors, which possesses a manifold structure and is known as the Grassmannian

$$\mathrm{Gr}_{k,m} := \{P \in \mathbb{R}^{m \times m} \mid P = U U^\top,\ U \in \mathrm{St}_{k,m}\}. \qquad (4)$$

This results in the optimization problem

$$P^{(i+1)} = \operatorname*{arg\,min}_{P \in \mathrm{Gr}_{k,m}} h_\mu\big(\hat{X} - \mathcal{A}(P L^{(i)})\big), \qquad (5)$$

with $P^{(i+1)} = U^{(i+1)} (U^{(i+1)})^\top$.
Step #2 In the second step, the coordinates of the projected data points with respect to $U^{(i+1)}$ are adjusted by solving

$$Y^{(i+1)} = \operatorname*{arg\,min}_{Y \in \mathbb{R}^{k \times n}} h_\mu\big(\hat{X} - \mathcal{A}(U^{(i+1)} Y)\big). \qquad (6)$$

Then, $\mu$ is decreased prior to the next iteration. Since ordinary PCA of the data allows a reasonably good initialization of $U$ and $Y$, we initialize our algorithm with a truncated SVD of some matrix $X_0$ that is in accordance with the measurements, i.e. $\mathcal{A}(X_0) = \hat{X}$. The alternating scheme is summarized in Algorithm 1.
Algorithm 1 Alternating scheme for Robust PCA
Initialize:
  Choose $X_0$ s.t. $\mathcal{A}(X_0) = \hat{X}$.
  Obtain $U^{(0)}$ from the $k$ leading left singular vectors of $X_0$.
  $Y^{(0)} = U^{(0)\top} X_0$
  $L^{(0)} = U^{(0)} Y^{(0)}$
  $P^{(0)} = U^{(0)} U^{(0)\top}$
  Choose $\mu^{(0)}$ and $\mu^{(I)}$, compute $c_\mu = \big(\mu^{(I)} / \mu^{(0)}\big)^{1/(I-1)}$
for $i = 1 : I$ do
  $P^{(i+1)} = \operatorname*{arg\,min}_{P \in \mathrm{Gr}_{k,m}} h_{\mu^{(i)}}\big(\hat{X} - \mathcal{A}(P L^{(i)})\big)$   (Step #1)
  find $U^{(i+1)}$ s.t. $U^{(i+1)} U^{(i+1)\top} = P^{(i+1)}$
  $Y^{(i+1)} = \operatorname*{arg\,min}_{Y \in \mathbb{R}^{k \times n}} h_{\mu^{(i)}}\big(\hat{X} - \mathcal{A}(U^{(i+1)} Y)\big)$   (Step #2)
  $L^{(i+1)} = U^{(i+1)} Y^{(i+1)}$
  $\mu^{(i+1)} = c_\mu\, \mu^{(i)}$
end for
$\hat{L} = U^{(I)} Y^{(I)}$,  $\hat{S} = X - \hat{L}$
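For illustration, the initialization and the smoothing schedule of Algorithm 1 can be realized as in the following NumPy sketch. It assumes that $X_0$ is obtained by filling the unobserved entries with zeros (one simple choice satisfying $\mathcal{A}(X_0) = \hat{X}$ for an entrywise sampling operator) and leaves the two inner minimizations of Steps #1 and #2 to the CG solvers derived below; the function name and the default values are illustrative.

```python
import numpy as np

def init_and_schedule(X_hat, mask, k, mu_0=1.0, mu_I=1e-4, I=10):
    """Initialization of Algorithm 1 and the geometric schedule for mu.

    X_hat : observed data with zeros at the unobserved entries, X^ = A(X)
    mask  : boolean observation mask defining A
    k     : upper bound on the rank of L
    """
    # one simple choice of X0 with A(X0) = X^
    X0 = np.where(mask, X_hat, 0.0)

    # U(0) from the k leading left singular vectors of X0 (truncated SVD)
    U0 = np.linalg.svd(X0, full_matrices=False)[0][:, :k]
    Y0 = U0.T @ X0            # Y(0) = U(0)^T X0
    L0 = U0 @ Y0              # L(0) = U(0) Y(0)

    # geometric smoothing schedule: mu(i+1) = c_mu * mu(i), reaching mu(I) after I-1 steps
    c_mu = (mu_I / mu_0) ** (1.0 / (I - 1))
    mus = mu_0 * c_mu ** np.arange(I)
    return U0, Y0, L0, mus
```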
4 Optimizing the smooth sparsity measure
The proposed alternating minimization scheme involves the two non-convex but smooth optimization problems (5) and (6). We minimize both using a Conjugate Gradient type method due to its scalability and superlinear rate of convergence. The optimization methods are conceptually very similar but differ in the domains they operate on, as becomes clear from the cost functions and their respective gradients:

$$f_1 \colon \mathrm{Gr}_{k,m} \to \mathbb{R}, \quad P \mapsto h_\mu\big(\hat{X} - \mathcal{A}(PL)\big), \qquad \nabla f_1 = -\mathcal{A}^*\big(\nabla h_\mu(\hat{X} - \mathcal{A}(PL))\big)\, L^\top \qquad (7)$$

$$f_2 \colon \mathbb{R}^{k \times n} \to \mathbb{R}, \quad Y \mapsto h_\mu\big(\hat{X} - \mathcal{A}(UY)\big), \qquad \nabla f_2 = -U^\top \mathcal{A}^*\big(\nabla h_\mu(\hat{X} - \mathcal{A}(UY))\big) \qquad (8)$$
Note that $\mathcal{A}^*$ denotes the adjoint of the operator $\mathcal{A}$. Conjugate Gradient methods for minimization problems in the Euclidean case, such as (8), are standard and well established. In contrast to this, the core of our algorithm, i.e. minimizing (7), is a geometric optimization problem on the real Grassmannian and thus requires additional concepts such as vector transport and retraction.
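To make (7) and (8) concrete, the sketch below combines the entrywise mask realization of $\mathcal{A}$ (so that $\mathcal{A}^* = \mathcal{A}$) with one popular smooth $\ell_0$-surrogate, $h_\mu(Z) = \sum_{i,j} z_{ij}^2/(z_{ij}^2 + \mu)$; this particular surrogate is chosen here only for illustration, the measures actually evaluated in this paper are discussed in Section 6.1.

```python
import numpy as np

def h_mu(Z, mu):
    """Illustrative smooth l0-surrogate: sum_ij z_ij^2 / (z_ij^2 + mu)."""
    return np.sum(Z**2 / (Z**2 + mu))

def grad_h_mu(Z, mu):
    """Entrywise derivative of the surrogate: 2 mu z / (z^2 + mu)^2."""
    return 2.0 * mu * Z / (Z**2 + mu)**2

def grad_f1(P, L, X_hat, mask, mu):
    """Euclidean gradient of f1(P) = h_mu(X^ - A(P L)), cf. (7), with A* = A (masking)."""
    residual = X_hat - np.where(mask, P @ L, 0.0)
    return -np.where(mask, grad_h_mu(residual, mu), 0.0) @ L.T

def grad_f2(Y, U, X_hat, mask, mu):
    """Euclidean gradient of f2(Y) = h_mu(X^ - A(U Y)), cf. (8)."""
    residual = X_hat - np.where(mask, U @ Y, 0.0)
    return -U.T @ np.where(mask, grad_h_mu(residual, mu), 0.0)
```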
In the following, we will recall some general concepts and further collect the ingredients for our algorithm. In particular, we derive a new retraction that is crucial for the algorithm's computational performance. For a deeper and more general insight into the topic of Geometric Optimization we refer to the work of Absil et al (2008).
4.1 Geometry of the Grassmannian
Consider a projector $P \in \mathrm{Gr}_{k,m}$ as a point on the Grassmannian and let $\mathfrak{u}(m) := \{\Omega \in \mathbb{R}^{m \times m} \mid \Omega^\top = -\Omega\}$ be the set of skew-symmetric matrices. Using the Lie bracket operator $[Z_1, Z_2] = Z_1 Z_2 - Z_2 Z_1$, the set of elements in the tangent space of $\mathrm{Gr}_{k,m}$ at $P$ is given by $T_P \mathrm{Gr}_{k,m} = \{[P, \Omega] \mid \Omega \in \mathfrak{u}(m)\}$. In the following, we will endow $\mathbb{R}^{m \times m}$ with the Frobenius inner product $\langle Z_1, Z_2 \rangle := \operatorname{tr}(Z_1^\top Z_2)$, where $\operatorname{tr}(\cdot)$ denotes the trace operator, and consider the Riemannian metric on $\mathrm{Gr}_{k,m}$ accordingly as the restriction of this product to the respective tangent space. The orthogonal projection of an arbitrary point $Z \in \mathbb{R}^{m \times m}$ onto the tangent space at $P$ is

$$\pi_P \colon \mathbb{R}^{m \times m} \to T_P \mathrm{Gr}_{k,m}, \quad Z \mapsto [P, [P, Z_s]] \qquad (9)$$

with $Z_s = \tfrac{1}{2}(Z + Z^\top)$ being the symmetric part of $Z$.
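A direct full-matrix sketch of the projection (9), together with a numerical check that projecting a tangent vector again leaves it unchanged, reads as follows; it serves only as a reference implementation of the formula, the storage-efficient variant is derived in Section 4.3.

```python
import numpy as np

def project_tangent(P, Z):
    """pi_P(Z) = [P, [P, Z_s]] with Z_s the symmetric part of Z, cf. (9)."""
    Zs = 0.5 * (Z + Z.T)
    C = P @ Zs - Zs @ P        # [P, Z_s]
    return P @ C - C @ P       # [P, [P, Z_s]]

# check: pi_P is idempotent, so tangent vectors are left unchanged
rng = np.random.default_rng(1)
m, k = 10, 3
U = np.linalg.qr(rng.standard_normal((m, k)))[0]
P = U @ U.T                    # a point on the Grassmannian
xi = project_tangent(P, rng.standard_normal((m, m)))
assert np.allclose(project_tangent(P, xi), xi)
```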
It is crucial for optimization procedures on manifolds to establish a relation between elements of the tangent space and corresponding points on the manifold, which motivates the usage of retractions. Conceptually, a retraction at some point $P$ is a mapping from the tangent space at $P$ to $\mathrm{Gr}_{k,m}$ with a local rigidity condition that preserves gradients at $P$.
The generic way of locally parameterizing a smooth manifold is via Riemannian exponential mappings. As they are costly to compute in general, we perform an approximation based on the QR-decomposition. Let $\mathrm{Gl}(m)$ be the set of invertible $(m \times m)$-matrices and let $R(m) \subset \mathrm{Gl}(m)$ be the set of upper triangular matrices with positive entries on the diagonal. It follows from the Gram-Schmidt orthogonalization procedure that the QR-decomposition $O(m) \times R(m) \to \mathrm{Gl}(m)$, $(Q, R) \mapsto QR$ is a diffeomorphism. Accordingly, every $A \in \mathrm{Gl}(m)$ decomposes uniquely into $A =: A_Q A_R$ with $A_Q \in O(m)$ and $A_R \in R(m)$. Moreover, it follows that the map

$$q_\Omega \colon \mathbb{R} \to O(m), \quad q_\Omega(t) := \big(I_m + t\Omega\big)_Q \qquad (10)$$

is smooth for all $\Omega \in \mathfrak{u}(m)$, with derivative at $0$ being $\dot{q}_\Omega(0) = \Omega$, cf. Kleinsteuber and Hüper (2007). Using

$$\alpha'_P \colon T_P \mathrm{Gr}_{k,m} \to \mathrm{Gr}_{k,m}, \quad \alpha'_P(\xi) := q_{[\xi,P]}(1)\, P\, \big(q_{[\xi,P]}(1)\big)^\top \qquad (11)$$

and exploiting the properties of the exponential mapping, one can develop the following result:
Lemma 1 Consider arbitrary orthogonal matrices $\theta \in O(m)$. The mapping

$$\alpha_{P,\theta} \colon T_P \mathrm{Gr}_{k,m} \to \mathrm{Gr}_{k,m}, \quad \alpha_{P,\theta}(\xi) := \theta \big(q_{\theta^\top[\xi,P]\theta}(1)\big)\, \theta^\top P\, \theta \big(q_{\theta^\top[\xi,P]\theta}(1)\big)^\top \theta^\top \qquad (12)$$

defines a set of retractions on $\mathrm{Gr}_{k,m}$.
Proof 1 The first condition, namely $\alpha_{P,\theta}(0) = P$, follows straightforwardly. It remains to show that $\frac{d}{dt}\big|_{t=0}\, \alpha_{P,\theta}(t\xi) = \xi$.

Firstly, note that $q_{[t\xi,P]}(1) = q_{[\xi,P]}(t)$. Since it has been verified that $\dot{q}_\Omega(0) = \Omega$ (cf. Helmke et al (2007)) and $[\xi, P]$ is skew-symmetric, it follows that

$$\begin{aligned}
\frac{d}{dt}\Big|_{t=0}\, \alpha_{P,\theta}(t\xi) &= \theta\, \dot{q}_{\theta^\top[\xi,P]\theta}(0)\, \theta^\top P\, \theta \big(q_{\theta^\top[\xi,P]\theta}(0)\big)^\top \theta^\top + \theta\, q_{\theta^\top[\xi,P]\theta}(0)\, \theta^\top P\, \theta \big(\dot{q}_{\theta^\top[\xi,P]\theta}(0)\big)^\top \theta^\top \\
&= \theta\theta^\top[\xi,P]\theta\theta^\top P - P\,\theta\theta^\top[\xi,P]\theta\theta^\top \\
&= [P,[P,\xi]] \\
&= \xi.
\end{aligned}$$

To understand the last step of the equation, note that $\xi \in T_P \mathrm{Gr}_{k,m}$ and therefore $\xi$ is invariant under the projection $\pi_P(\cdot)$. □
As an associated vector transport, i.e. a mapping that for a given $\xi \in T_P \mathrm{Gr}_{k,m}$ transports the tangent element $\eta \in T_P \mathrm{Gr}_{k,m}$ along the retraction $\alpha_{P,\theta}(\xi)$ to the tangent space $T_{\alpha_{P,\theta}(\xi)} \mathrm{Gr}_{k,m}$, we choose

$$\tau_{\xi,P,\theta}(\eta) := \theta \big(q_{\theta^\top[\xi,P]\theta}(1)\big)\, \theta^\top \eta\, \theta \big(q_{\theta^\top[\xi,P]\theta}(1)\big)^\top \theta^\top. \qquad (13)$$

Note that in our algorithm the context of $\tau$ is always clear. Thus, we will drop the subscripts and simply write $\tau(\eta)$ for enhanced legibility.
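Equations (10)-(13) translate directly into the following full-matrix NumPy sketch, which also checks the two retraction properties numerically ($\alpha_{P,\theta}(0) = P$ and the first-order condition via a finite difference) and verifies that a transported vector indeed lies in the tangent space at the new point. The Q-factor is normalized so that the R-factor has a positive diagonal, matching the definition of $q_\Omega$; this is a readable reference only, the storage-efficient version follows in Section 4.3.

```python
import numpy as np

def qf(M):
    """Q-factor of M, normalized so that the R-factor has a positive diagonal."""
    Q, R = np.linalg.qr(M)
    d = np.sign(np.diag(R))
    d[d == 0] = 1.0
    return Q * d

def retraction(P, xi, theta=None):
    """alpha_{P,theta}(xi) as in (12); theta = I recovers (11)."""
    m = P.shape[0]
    theta = np.eye(m) if theta is None else theta
    q = qf(np.eye(m) + theta.T @ (xi @ P - P @ xi) @ theta)   # q_{theta^T [xi,P] theta}(1)
    return theta @ q @ theta.T @ P @ theta @ q.T @ theta.T

def transport(P, xi, eta, theta=None):
    """tau_{xi,P,theta}(eta) as in (13)."""
    m = P.shape[0]
    theta = np.eye(m) if theta is None else theta
    q = qf(np.eye(m) + theta.T @ (xi @ P - P @ xi) @ theta)
    return theta @ q @ theta.T @ eta @ theta @ q.T @ theta.T

def project_tangent(P, Z):
    """pi_P(Z), cf. (9); used here to generate tangent vectors and to test tangency."""
    Zs = 0.5 * (Z + Z.T)
    C = P @ Zs - Zs @ P
    return P @ C - C @ P

rng = np.random.default_rng(2)
m, k = 9, 3
U = qf(rng.standard_normal((m, k)))
P = U @ U.T
xi = project_tangent(P, rng.standard_normal((m, m)))
eta = project_tangent(P, rng.standard_normal((m, m)))

assert np.allclose(retraction(P, 0.0 * xi), P)                      # alpha(0) = P
t = 1e-6
assert np.allclose((retraction(P, t * xi) - P) / t, xi, atol=1e-3)  # derivative at 0 is xi

P_new = retraction(P, xi)
tau_eta = transport(P, xi, eta)
assert np.allclose(project_tangent(P_new, tau_eta), tau_eta)        # tangent at the new point
```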
4.2 CG on the Grassmannian
In the following, we sketch how the well-known nonlinear CG method extends to the Grassmannian for minimizing a smooth function $f \colon \mathrm{Gr}_{k,m} \to \mathbb{R}$. Recall that if $f$ is the restriction of a smooth function $\hat{f} \colon \mathbb{R}^{m \times m} \to \mathbb{R}$, the Riemannian gradient in the tangent space is given by

$$\operatorname{grad} f(P) = \pi_P(\nabla \hat{f}), \qquad (14)$$

where $\nabla \hat{f}$ is the common gradient of $\hat{f}$ in $\mathbb{R}^{m \times m}$. The CG method on the Grassmannian can be outlined as follows. Starting at an initial point $P^{(0)} \in \mathrm{Gr}_{k,m}$, the Riemannian gradient $\Gamma^{(0)}$ can be computed and $H^{(0)} = -\Gamma^{(0)}$ is selected as the initial search direction. In each iteration a suitable step size $t^{(i)}$ is determined using a backtracking line-search algorithm on the Grassmannian, cf. Algorithm 2. The new iterate is then obtained via $P^{(i+1)} = \alpha_{P^{(i)}}(t^{(i)} H^{(i)})$. Finally, the search direction

$$H^{(i+1)} = -\Gamma^{(i+1)} + \beta^{(i)} \tau(H^{(i)}) \qquad (15)$$

is updated, where we consider two update rules for $\beta^{(i)}$, namely

$$\beta^{(i)}_{\mathrm{FR}} = \frac{\langle \Gamma^{(i+1)}, \Gamma^{(i+1)} \rangle}{\langle \Gamma^{(i)}, \Gamma^{(i)} \rangle}, \qquad \beta^{(i)}_{\mathrm{HS}} = \frac{\langle \Gamma^{(i+1)},\, \Gamma^{(i+1)} - \tau(\Gamma^{(i)}) \rangle}{\langle \tau(H^{(i)}),\, \Gamma^{(i+1)} - \tau(\Gamma^{(i)}) \rangle}. \qquad (16)$$
The former is a Riemannian adaptation of the well-known Fletcher-Reeves update formula and guarantees convergence of our algorithm (see the subsequent remark). The latter is an adaptation of the Hestenes-Stiefel formula and typically leads to better convergence behavior in practice, which is why we use it for all our algorithms.
Algorithm 2 Backtracking line search on the Grassmannian
Choose $t_{\mathrm{init}} > 0$; $c, \rho \in (0,1)$ and set $t \leftarrow t_{\mathrm{init}}$
repeat
  $t \leftarrow \rho t$
until $f\big(\alpha_P(t H^{(i)})\big) \leq f(P) + c\, t\, \operatorname{tr}\big(\Gamma^{(i)\top} H^{(i)}\big)$
Choose step size $t^{(i)} := t$
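As a concrete end-to-end illustration, the following sketch runs one full-matrix CG step, i.e. Riemannian gradient (14), backtracking line search (Algorithm 2), retraction, vector transport and the Hestenes-Stiefel update (15)-(16), for a simple stand-in cost $f(P) = \tfrac{1}{2}\|X - PX\|_F^2$. The stand-in cost, the parameter values and the helper names are assumptions made only for this example; the actual cost is the smoothed sparsity measure from (7), and Algorithm 3 below replaces the full matrices by their compressed representations.

```python
import numpy as np

def qf(M):
    """Q-factor of M with a positive-diagonal R-factor."""
    Q, R = np.linalg.qr(M)
    d = np.sign(np.diag(R)); d[d == 0] = 1.0
    return Q * d

def retract(P, xi):
    """alpha_P(xi) from (11): q_{[xi,P]}(1) P q_{[xi,P]}(1)^T."""
    q = qf(np.eye(P.shape[0]) + (xi @ P - P @ xi))
    return q @ P @ q.T

def transport(P, xi, eta):
    """tau(eta) from (13) with theta = I."""
    q = qf(np.eye(P.shape[0]) + (xi @ P - P @ xi))
    return q @ eta @ q.T

def riem_grad(P, egrad):
    """grad f(P) = pi_P(Euclidean gradient), cf. (14) and (9)."""
    Zs = 0.5 * (egrad + egrad.T)
    C = P @ Zs - Zs @ P
    return P @ C - C @ P

def backtracking(f, P, Gamma, H, t_init=1.0, c=1e-4, rho=0.5):
    """Algorithm 2: shrink t until the Armijo condition is satisfied."""
    t, fP, slope = t_init, f(P), np.trace(Gamma.T @ H)
    while f(retract(P, t * H)) > fP + c * t * slope:
        t *= rho
    return t

# one CG step for the stand-in cost f(P) = 0.5 * ||X - P X||_F^2
rng = np.random.default_rng(3)
m, k = 12, 3
X = rng.standard_normal((m, 20))
U = qf(rng.standard_normal((m, k)))
P = U @ U.T
f = lambda P_: 0.5 * np.linalg.norm(X - P_ @ X) ** 2
egrad = lambda P_: -(X - P_ @ X) @ X.T          # Euclidean gradient of the stand-in cost

Gamma = riem_grad(P, egrad(P))
H = -Gamma
t = backtracking(f, P, Gamma, H)
P_new = retract(P, t * H)
assert f(P_new) < f(P)                           # Armijo guarantees descent

# Hestenes-Stiefel update (16) for the next search direction (15)
Gamma_new = riem_grad(P_new, egrad(P_new))
d_Gamma = Gamma_new - transport(P, t * H, Gamma)
beta = np.sum(Gamma_new * d_Gamma) / np.sum(transport(P, t * H, H) * d_Gamma)
H_new = -Gamma_new + beta * transport(P, t * H, H)
```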
Remark Convergence of the geometric CG method, which uses the retraction and vector transport as described above together with the Fletcher-Reeves update $\beta_{\mathrm{FR}}$ and the strong Wolfe-Powell condition, is guaranteed by a result of Ring and Wirth (2012) in the sense that $\liminf_{i \to \infty} \|\Gamma^{(i)}\| = 0$. Note that the Wolfe-Powell condition would require a slight modification of Algorithm 2. However, we do not stress this issue since it has no impact on the practical convergence behavior.
4.3 Implementation of CG on the Grassmannian
So far, the algorithm outlined above requires full $(m \times m)$-matrices for the iterates $P^{(i)}$, $\Gamma^{(i)}$ and $H^{(i)}$. This is a drastic limitation on the performance of a practical implementation. In this section we derive a new retraction and show how it can be used to avoid full matrix multiplication and to reduce the storage requirements tremendously. The key idea is to decompose the projection matrices $P \in \mathrm{Gr}_{k,m}$ into $P = UU^\top$ and to iterate on Stiefel matrices $U \in \mathrm{St}_{k,m}$ instead. Moreover, one can exploit the structure of the tangent space $T_I \mathrm{Gr}_{k,m}$ at the standard projector $I$, that is

$$I = \begin{bmatrix} I_k & 0 \\ 0 & 0 \end{bmatrix}, \qquad T_I \mathrm{Gr}_{k,m} = \left\{ \begin{bmatrix} 0 & A^\top \\ A & 0 \end{bmatrix} \;\middle|\; A \in \mathbb{R}^{(m-k) \times k} \right\}. \qquad (17)$$
Given $U$, a large QR-decomposition of $U$ yields a fast way of constructing

$$V := \begin{bmatrix} U & U^\perp \end{bmatrix} \in O(m), \qquad (18)$$

where $U^\perp \in \mathrm{St}_{(m-k),m}$ denotes a basis of the orthogonal complement of the subspace spanned by $U$. Then, if $P = UU^\top$ and $\xi \in T_P \mathrm{Gr}_{k,m}$, the identities

$$V^\top P V = I \qquad \text{and} \qquad V^\top \xi V = \begin{bmatrix} 0 & A^\top \\ A & 0 \end{bmatrix} \qquad (19)$$

hold for some $A \in \mathbb{R}^{(m-k) \times k}$. Therefore, instead of storing the full $P$ and $\xi$ it is sufficient to store $U$ and $A$. Formally, this defines the bijection

$$\mathrm{con}_V \colon T_P \mathrm{Gr}_{k,m} \to \mathbb{R}^{(m-k) \times k}, \qquad \mathrm{con}_V(\xi) = A. \qquad (20)$$
The projection onto $T_P \mathrm{Gr}_{k,m}$ follows from a straightforward calculation.

Lemma 2 Let $V = \begin{bmatrix} U & U^\perp \end{bmatrix}$, $P = UU^\top$ and $Z_s = \tfrac{1}{2}(Z + Z^\top)$ be defined as above and let the orthogonal projection onto the tangent space at $P$ be denoted by $\pi_P$. Then the identity

$$\mathrm{con}_V(\pi_P(Z)) = (U^\perp)^\top Z_s U \qquad (21)$$

holds.

Proof 2 Using the definition of $\pi_P(Z)$ and the fact that $V^\top P V = I$,

$$V^\top \pi_P(Z)\, V = \big[V^\top P V,\, V^\top [P, Z_s] V\big] = \big[I, [I, V^\top Z_s V]\big]$$

and the same structure as in (19) is obtained. □
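The identity (21) is easily verified numerically; the following sketch (with arbitrary test sizes) compares the lower-left block of $V^\top \pi_P(Z) V$ with the directly computed $(U^\perp)^\top Z_s U$, which avoids all $(m \times m)$-products on its side of the comparison.

```python
import numpy as np

rng = np.random.default_rng(4)
m, k = 10, 3

# V = [U, U_perp] from a large (complete) QR-decomposition, cf. (18)
V = np.linalg.qr(rng.standard_normal((m, k)), mode='complete')[0]
U, U_perp = V[:, :k], V[:, k:]
P = U @ U.T

Z = rng.standard_normal((m, m))
Zs = 0.5 * (Z + Z.T)

# full-matrix projection pi_P(Z) = [P, [P, Z_s]], cf. (9)
C = P @ Zs - Zs @ P
pi_PZ = P @ C - C @ P

# con_V reads off the lower-left block of V^T pi_P(Z) V, cf. (19)-(20)
A_block = (V.T @ pi_PZ @ V)[k:, :k]

# Lemma 2: the same matrix is obtained directly from U, U_perp and Z_s
assert np.allclose(A_block, U_perp.T @ Zs @ U)
```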
Lemma 1 allows us to choose a particular retraction that is easy to compute. Consider $A = \mathrm{con}_V(\xi)$ and let

$$A = \theta_A \begin{bmatrix} R \\ 0 \end{bmatrix} \qquad (22)$$

be the large QR-decomposition of $A$, with $\theta_A \in O(m-k)$ and $R$ an upper triangular (not necessarily invertible) $(k \times k)$-matrix. Furthermore, define

$$M(R) := \begin{bmatrix} I_k & -R^\top \\ R & I_k \end{bmatrix} \qquad (23)$$

and its Q-factor $\theta_M \in O(2k)$.
M
(cid:34) (cid:35)
I 0
Lemma3 Letα (ξ)bearetractionasin(12),whereθischosenasθ:=V k .
P,θ 0 θ
A
ThentheStiefelmatrix
(cid:34) (cid:35)(cid:34) (cid:35)
θ 0 I
U(cid:101):=θ M k ∈St (24)
0 I 0 k,m
m−2k
satisfiesα (ξ)=U(cid:101)U(cid:101)(cid:62).
P,θ
Proof 3 One can deduce $q_{\theta^\top[\xi,P]\theta}(1) = \big(I_m + \theta^\top[\xi,P]\theta\big)_Q = \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix}$ from

$$\begin{aligned}
\theta^\top[\xi,P]\theta &= \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} \big[V^\top \xi V,\, V^\top P V\big] \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \\
&= \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} \begin{bmatrix} 0 & -A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \\
&= \begin{bmatrix} 0 & -A^\top \theta_A \\ \theta_A^\top A & 0 \end{bmatrix} \\
&= \begin{bmatrix} 0 & \begin{bmatrix} -R^\top & 0 \end{bmatrix} \\ \begin{bmatrix} R \\ 0 \end{bmatrix} & 0 \end{bmatrix}.
\end{aligned}$$

Therefore, the retraction is

$$\begin{aligned}
\alpha_{P,\theta}(\xi) &= \theta \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \theta^\top P\, \theta \begin{bmatrix} \theta_M^\top & 0 \\ 0 & I_{m-2k} \end{bmatrix} \theta^\top \\
&= V \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \underbrace{\begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} V^\top P V \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix}}_{I} \begin{bmatrix} \theta_M^\top & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} V^\top \\
&= V \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k \\ 0 \end{bmatrix} \begin{bmatrix} I_k & 0 \end{bmatrix} \begin{bmatrix} \theta_M^\top & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} V^\top,
\end{aligned}$$

which completes the proof. □
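In code, the update (24) needs only the small factorizations (22) and (23) plus the complement basis $U^\perp$ from (18); the following NumPy sketch spells this out and checks numerically that the result is a Stiefel matrix whose projector agrees with $P + \xi$ to first order, as a retraction must. Variable names and test sizes are illustrative.

```python
import numpy as np

def qf(M):
    """QR factor pair with the R-factor normalized to a positive diagonal."""
    Q, R = np.linalg.qr(M)
    d = np.sign(np.diag(R)); d[d == 0] = 1.0
    return Q * d, R * d[:, None]

def retract_efficient(U, U_perp, A):
    """Stiefel update (24): returns U_new with alpha_{P,theta}(xi) = U_new U_new^T,
    where A = con_V(xi) is the compressed representation of the tangent vector."""
    k = U.shape[1]
    theta_A, R = qf(A)                          # reduced QR of A, cf. (22)
    M = np.block([[np.eye(k), -R.T],            # M(R), cf. (23)
                  [R, np.eye(k)]])
    theta_M, _ = qf(M)
    # first k columns of theta * diag(theta_M, I) in the basis V = [U, U_perp]
    return U @ theta_M[:k, :k] + U_perp @ (theta_A @ theta_M[k:, :k])

# first-order check: for a small tangent vector, U_new U_new^T ~ P + xi
rng = np.random.default_rng(5)
m, k = 12, 3
V = np.linalg.qr(rng.standard_normal((m, m)))[0]
U, U_perp = V[:, :k], V[:, k:]
P = U @ U.T
A = 1e-6 * rng.standard_normal((m - k, k))       # small compressed tangent coordinates
xi = U_perp @ A @ U.T + U @ A.T @ U_perp.T       # the tangent vector with con_V(xi) = A
U_new = retract_efficient(U, U_perp, A)
assert np.allclose(U_new.T @ U_new, np.eye(k))   # U_new is a Stiefel matrix
assert np.allclose(U_new @ U_new.T, P + xi, atol=1e-9)
```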
Similarly to the retraction, the vector transport from (13) is simplified, as described in the following.
Lemma 4 Let $\theta$, $\xi$, $P$, $\widetilde{U}$ be as above and $\widetilde{V} := \begin{bmatrix} \widetilde{U} & \widetilde{U}^\perp \end{bmatrix}$. Then for $\eta \in T_P \mathrm{Gr}_{k,m}$ and $B := \mathrm{con}_V(\eta)$, the identity $\mathrm{con}_{\widetilde{V}}(\tau_{\xi,P,\theta}(\eta)) = \theta_A^\top B$ holds.

Proof 4 From (19) we have $V^\top \eta V = \begin{bmatrix} 0 & B^\top \\ B & 0 \end{bmatrix}$. The vector transport can thus be written as

$$\tau_{\xi,P,\theta}(\eta) = \theta \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} \begin{bmatrix} 0 & B^\top \\ B & 0 \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \begin{bmatrix} \theta_M^\top & 0 \\ 0 & I_{m-2k} \end{bmatrix} \theta^\top.$$

Using the fact that

$$\theta^\top \begin{bmatrix} \widetilde{U} & \widetilde{U}^\perp \end{bmatrix} = \theta^\top \begin{bmatrix} \theta \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k \\ 0 \end{bmatrix} & \theta \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} 0 \\ I_{m-k} \end{bmatrix} \end{bmatrix} = \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix}$$

we can show that

$$\begin{aligned}
\widetilde{V}^\top \tau_{\xi,P,\theta}(\eta)\, \widetilde{V} &= \begin{bmatrix} \widetilde{U} & \widetilde{U}^\perp \end{bmatrix}^\top \theta \begin{bmatrix} \theta_M & 0 \\ 0 & I_{m-2k} \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} \begin{bmatrix} 0 & B^\top \\ B & 0 \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \begin{bmatrix} \theta_M^\top & 0 \\ 0 & I_{m-2k} \end{bmatrix} \theta^\top \begin{bmatrix} \widetilde{U} & \widetilde{U}^\perp \end{bmatrix} \\
&= \begin{bmatrix} I_k & 0 \\ 0 & \theta_A^\top \end{bmatrix} \begin{bmatrix} 0 & B^\top \\ B & 0 \end{bmatrix} \begin{bmatrix} I_k & 0 \\ 0 & \theta_A \end{bmatrix} \\
&= \begin{bmatrix} 0 & B^\top \theta_A \\ \theta_A^\top B & 0 \end{bmatrix}
\end{aligned}$$

and $\mathrm{con}_{\widetilde{V}}(\tau_{\xi,P,\theta}(\eta))$ follows again from the matrix structure. □
Algorithm 3 illustrates how the minimization of (2) on the Grassmannian is efficiently implemented in practice. Note that therein, $H$ and $G$ denote the images of the respective tangent elements under the bijection $\mathrm{con}_V$ and thus are of dimension $(m-k) \times k$. We do not further discuss the CG method in the Euclidean space as it is a standard method. In all minimization procedures, the convergence is evaluated by observing the progress in decreasing the cost function.
Algorithm 3 Implementation of CG on $\mathrm{Gr}_{k,m}$
input $U = U^{(i)}$
Obtain $V$ and $U^\perp$ from Eq. (18)
Compute $G = (U^\perp)^\top \nabla \tilde{f}(UU^\top)\, U$
$H = -G$
repeat
  Obtain $\theta_H$, $R$ from $H$ as in Eq. (22)
  Determine step size $t$ acc. to Algorithm 2
  Obtain $\theta_M$ from the QR-decomposition of $M(tR)$ (23)
  Update $U$ according to (24)
  Update $V$, $U^\perp$, $G$ as above
  Compute $\tau(G) = \theta_H^\top G$ and $\tau(H) = \theta_H^\top H$
  Update $H$ following (15) and (16)
until converged
output $U^{(i+1)}$