Large-Scale Machine Learning
in the Earth Sciences
Chapman & Hall/CRC
Data Mining and Knowledge Discovery Series
Series Editor: Vipin Kumar
University of Minnesota
Department of Computer Science and Engineering
Minneapolis, Minnesota, U.S.A.
AIMS AND SCOPE
This series aims to capture new developments and applications in data mining and knowledge
discovery, while summarizing the computational tools and techniques useful in data analysis.
This series encourages the integration of mathematical, statistical, and computational methods
and techniques through the publication of a broad range of textbooks, reference works, and
handbooks. The inclusion of concrete examples and applications is highly encouraged. The
scope of the series includes, but is not limited to, titles in the areas of data mining and
knowledge discovery methods and applications, modeling, algorithms, theory and foundations,
data and knowledge visualization, data mining systems and tools, and privacy and security
issues.
PUBLISHED TITLES
Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation
Scott Spangler
Advances in Machine Learning and Data Mining for Astronomy
Michael J. Way, Jeffrey D. Scargle, Kamal M. Ali, and Ashok N. Srivastava
Biological Data Mining
Jake Y. Chen and Stefano Lonardi
Computational Business Analytics
Subrata Das
Computational Intelligent Data Analysis for Sustainable Development
Ting Yu, Nitesh V. Chawla, and Simeon Simoff
Computational Methods of Feature Selection
Huan Liu and Hiroshi Motoda
Constrained Clustering: Advances in Algorithms, Theory, and Applications
Sugato Basu, Ian Davidson, and Kiri L. Wagstaff
Contrast Data Mining: Concepts, Algorithms, and Applications
Guozhu Dong and James Bailey
Data Classification: Algorithms and Applications
Charu C. Aggarwal
Data Clustering: Algorithms and Applications
Charu C. Aggarwal and Chandan K. Reddy
Data Clustering in C++: An Object-Oriented Approach
Guojun Gan
Data Mining: A Tutorial-Based Primer, Second Edition
Richard J. Roiger
Data Mining for Design and Marketing
Yukio Ohsawa and Katsutoshi Yada
Data Mining with R: Learning with Case Studies, Second Edition
Luís Torgo
Data Science and Analytics with Python
Jesus Rogel-Salazar
Event Mining: Algorithms and Applications
Tao Li
Foundations of Predictive Analytics
James Wu and Stephen Coggeshall
Geographic Data Mining and Knowledge Discovery, Second Edition
Harvey J. Miller and Jiawei Han
Graph-Based Social Media Analysis
Ioannis Pitas
Handbook of Educational Data Mining
Cristóbal Romero, Sebastian Ventura, Mykola Pechenizkiy, and Ryan S.J.d. Baker
Healthcare Data Analytics
Chandan K. Reddy and Charu C. Aggarwal
Information Discovery on Electronic Health Records
Vagelis Hristidis
Intelligent Technologies for Web Applications
Priti Srinivas Sajja and Rajendra Akerkar
Introduction to Privacy-Preserving Data Publishing: Concepts and Techniques
Benjamin C.M. Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S. Yu
Knowledge Discovery for Counterterrorism and Law Enforcement
David Skillicorn
Knowledge Discovery from Data Streams
João Gama
Large-Scale Machine Learning in the Earth Sciences
Ashok N. Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser
Machine Learning and Knowledge Discovery for Engineering Systems Health Management
Ashok N. Srivastava and Jiawei Han
Mining Software Specifications: Methodologies and Applications
David Lo, Siau-Cheng Khoo, Jiawei Han, and Chao Liu
Multimedia Data Mining: A Systematic Introduction to Concepts and Theory
Zhongfei Zhang and Ruofei Zhang
Music Data Mining
Tao Li, Mitsunori Ogihara, and George Tzanetakis
Next Generation of Data Mining
Hillol Kargupta, Jiawei Han, Philip S. Yu, Rajeev Motwani, and Vipin Kumar
Rapidminer: Data Mining Use Cases and Business Analytics Applications
Markus Hofmann and Ralf Klinkenberg
Relational Data Clustering: Models, Algorithms, and Applications
Bo Long, Zhongfei Zhang, and Philip S. Yu
Service-Oriented Distributed Knowledge Discovery
Domenico Talia and Paolo Trunfio
Spectral Feature Selection for Data Mining
Zheng Alan Zhao and Huan Liu
Statistical Data Mining Using SAS Applications, Second Edition
George Fernandez
Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions
Naiyang Deng, Yingjie Tian, and Chunhua Zhang
Temporal Data Mining
Theophano Mitsa
Text Mining: Classification, Clustering, and Applications
Ashok N. Srivastava and Mehran Sahami
Text Mining and Visualization: Case Studies Using Open-Source Tools
Markus Hofmann and Andrew Chisholm
The Top Ten Algorithms in Data Mining
Xindong Wu and Vipin Kumar
Understanding Complex Datasets: Data Mining with Matrix Decompositions
David Skillicorn
Large-Scale Machine Learning
in the Earth Sciences
Edited by
Ashok N. Srivastava
Ramakrishna Nemani
Karsten Steinhaeuser
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the
accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products does not
constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the
MATLAB® software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2017 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed on acid-free paper
International Standard Book Number-13: 978-1-4987-0387-1 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to
publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials
or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If
any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any
form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming,
and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations
that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Srivastava, Ashok N. (Ashok Narain), 1969- editor. | Nemani,
Ramakrishna, editor. | Steinhaeuser, Karsten, editor.
Title: Large-scale machine learning in the earth sciences / [edited by] Ashok
N. Srivastava, Dr. Ramakrishna Nemani, Karsten Steinhaeuser.
Description: Boca Raton : Taylor & Francis, 2017. | Series: Chapman &
Hall/CRC data mining & knowledge discovery series ; 42 | “A CRC title,
part of the Taylor & Francis imprint, a member of the Taylor & Francis
Group, the academic division of T&F Informa plc.”
Identifiers: LCCN 2017006160 | ISBN 9781498703871 (hardback : alk. paper)
Subjects: LCSH: Earth sciences–Computer network resources. | Earth
sciences–Data processing.
Classification: LCC QE48.87 .L37 2017 | DDC 550.285/6312–dc23
LC record available at https://lccn.loc.gov/2017006160
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Foreword
Editors
Contributors
Introduction
1 Network Science Perspectives on Engineering Adaptation to Climate Change and Weather Extremes
Udit Bhatia and Auroop R. Ganguly
2 Structured Estimation in High Dimensions: Applications in Climate
André R. Goncalves, Arindam Banerjee, Vidyashankar Sivakumar, and Soumyadeep Chatterjee
3 Spatiotemporal Global Climate Model Tracking
Scott McQuade and Claire Monteleoni
4 Statistical Downscaling in Climate with State-of-the-Art Scalable Machine Learning
Thomas Vandal, Udit Bhatia, and Auroop R. Ganguly
5 Large-Scale Machine Learning for Species Distributions
Reid A. Johnson, Jason D. K. Dzurisin, and Nitesh V. Chawla
6 Using Large-Scale Machine Learning to Improve Our Understanding of the Formation of Tornadoes
Amy McGovern, Corey Potvin, and Rodger A. Brown
7 Deep Learning for Very High-Resolution Imagery Classification
Sangram Ganguly, Saikat Basu, Ramakrishna Nemani, Supratik Mukhopadhyay, Andrew Michaelis, Petr Votava, Cristina Milesi, and Uttam Kumar
8 Unmixing Algorithms: A Review of Techniques for Spectral Detection and Classification of Land Cover from Mixed Pixels on NASA Earth Exchange
Uttam Kumar, Cristina Milesi, S. Kumar Raja, Ramakrishna Nemani, Sangram Ganguly, Weile Wang, Petr Votava, Andrew Michaelis, and Saikat Basu
9 Semantic Interoperability of Long-Tail Geoscience Resources over the Web
Mostafa M. Elag, Praveen Kumar, Luigi Marini, Scott D. Peckham, and Rui Liu
Index
Foreword
The climate and Earth sciences have recently undergone a rapid transformation from a data-poor to a
data-rich environment. In particular, massive amounts of climate and ecosystem data are now available
from satellite and ground-based sensors, and physics-based climate model simulations. These
information-rich datasets offer huge potential for monitoring, understanding, and predicting the behavior of the Earth’s
ecosystem and for advancing the science of global change.
While large-scale machine learning and data mining have greatly impacted a range of commercial
applications, their use in the field of Earth sciences is still in the early stages. This book, edited by Ashok
Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser, serves as an outstanding resource for
anyone interested in the opportunities and challenges for the machine learning community in analyzing these
datasets to answer questions of urgent societal interest.
This book is a compilation of recent research in the application of machine learning in the field of Earth
sciences. It discusses a number of applications that exemplify some of the most important questions faced
by the climate and ecosystem scientists today and the role the data mining community can play in
answering them. Chapters are written by experts who are working at the intersection of the two fields. Topics
covered include modeling of weather and climate extremes, evaluation of climate models, and the use
of remote sensing data to quantify land-cover change dynamics. Collectively, they provide an excellent
cross-section of research being done in this emerging field of great societal importance.
I hope that this book will inspire more computer scientists to focus on environmental applications, and
Earth scientists to seek collaborations with researchers in machine learning and data mining to advance
the frontiers in Earth sciences.
Vipin Kumar, PhD
Department of Computer Science and Engineering
University of Minnesota
Minneapolis, MN