Methods in
Molecular Biology 1375
Pietro Hiram Guzzi Editor
Microarray
Data Analysis
Methods and Applications
Second Edition
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:
http://www.springer.com/series/7651
Microarray Data Analysis
Methods and Applications
Second Edition
Edited by
Pietro Hiram Guzzi
Department of Surgical and Medical Sciences,
University “Magna Græcia” of Catanzaro, Catanzaro, Italy
Editor
Pietro Hiram Guzzi
Department of Surgical and Medical Sciences
University “Magna Græcia” of Catanzaro
Catanzaro, Italy
ISSN 1064-3745 ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-3172-9 ISBN 978-1-4939-3173-6 (eBook)
DOI 10.1007/978-1-4939-3173-6
Library of Congress Control Number: 2016932899
Springer New York Heidelberg Dordrecht London
© Springer Science+Business Media New York 2007, 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction
on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and
regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to
be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty,
express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Humana Press is a brand of Springer
Springer Science+Business Media LLC New York is part of Springer Science+Business Media (www.springer.com)
Preface
The development of novel technological platforms in molecular biology has given a strong
impetus to research and, in particular, has driven the growth of bioinformatics to
support the storage, management, and analysis of large amounts of data about different
aspects of the omic world. Here we focus in particular on two main techniques for studying
the activity of the transcriptome, i.e., the set of molecules that play a role in the complex
mechanism of protein synthesis. Such a study focuses on the role of mRNA, i.e., coding
fragments of messenger RNA, and miRNA, i.e., small fragments of noncoding RNA. This
study has been conducted through two main technological platforms: microarray and
miRNA-microarray. More recently, next-generation sequencing techniques have been
gaining a prominent role. Despite this, classical microarray studies are still alive, since
a considerable number of published papers relate to the generation and analysis
of microarray data.
The flow of information in this field starts from technological platforms that produce
different data. Examples of such platforms are microarrays for studying the expression of
messenger RNA (mRNA) and microRNA (miRNA); genomic microarrays for studying
copy number variations (CNV) or single-nucleotide polymorphisms (SNP); novel
microarrays for studying noncoding RNAs (e.g., miRNA); and genomic arrays for
pharmacogenomics.
Classical studies focused on identifying the role of a single class of molecules
in a specific disease. Therefore they contained the analysis of a single class of data. More
recently, the biological assumption that different molecules (e.g., miRNA, mRNA, or
transcription factors) are strongly correlated has determined the rise of a novel discipline,
often referred to as computational systems biology or network systems biology. In this
discipline, computer science, bioinformatics, and mathematical modeling play a synergistic
role in the interpretation of large data sets belonging to different data sources.
Consequently, much attention has been paid to the development of integrated methods of
analysis, often based on distributed or high-performance architectures (e.g., Cloud) or
on semantic-based approaches, for extracting biologically relevant knowledge from data. In
parallel, a growing number of biological and medical papers have demonstrated the real
application of these methodologies.
This book is intended to cover the main aspects of this broad area, from
the description of methodologies for data analysis to real applications. The intended
audience is students or researchers who need to learn the main research topics, as well as
practitioners who need an overview of applications. The structure of the presentation of
all the chapters also makes it suitable for use in bioinformatics courses.
The book is composed of 16 chapters. It starts by presenting the main concepts related to
data analysis. Wu and Gantier present the main methodologies for preprocessing of microarray
data in Chapter 1. Cristiano and Veltri present a survey of miRNA data analysis in
Chapter 2, while Calabrese and Cannataro discuss the rise of Cloud-based approaches in
Chapter 3. Chapter 4 by López Kleine et al. presents the application of data mining
techniques for data analysis, and in Chapter 5 Deveci et al. focus on the use of biclustering
to query different datasets. In Chapter 6 Chan and Lin discuss a web-based tool to
analyze the evolution of miRNA clusters. Roy et al. present in Chapter 7 the application
of biclustering to mine patterns of co-regulated genes. Chapters 8 and 9 present the use of
ontologies; in particular, Ovaska discusses the use of the csbl.go tool, while Agapito and Milano
survey the main existing tools for semantic similarity analysis of microarray data. Wang et al. in
Chapter 10 introduce the integration of microarray and proteomic data. Chapter 11 by
Koumakis et al. discusses the relevance of Gene Regulatory Network Inference, while
Chapter 12 by Roy and Guzzi focuses on the assessment of Gene Regulatory Network
methods. The remaining chapters present some relevant applications in different medical
fields. Chapter 13 by Gan et al. is related to the analysis of mouse data for metabolomics
studies. Chapter 14 by Di Martino et al. surveys the functional analysis of microRNA data
in multiple myeloma, which is currently a big research area. Chapter 15 by Bhawe and Aghi
presents the application of microarray data analysis in glioblastomas. Finally, Chapter 16
by Franco et al. discusses the analysis of microRNA data in cardiogenesis.
Catanzaro, Italy    Pietro Hiram Guzzi
Contents
Preface
Contributors
Normalization of Affymetrix miRNA Microarrays
for the Analysis of Cancer Samples
Di Wu and Michael P. Gantier
Methods and Techniques for miRNA Data Analysis
Francesca Cristiano and Pierangelo Veltri
Bioinformatics and Microarray Data Analysis on the Cloud
Barbara Calabrese and Mario Cannataro
Classification and Clustering on Microarray Data for Gene Functional
Prediction Using R
Liliana López Kleine, Rosa Montaño, and Francisco Torres-Avilés
Querying Co-regulated Genes on Diverse Gene Expression Datasets
Via Biclustering
Mehmet Deveci, Onur Küçüktunç, Kemal Eren, Doruk Bozdağ,
Kamer Kaya, and Ümit V. Çatalyürek
MetaMirClust: Discovery and Exploration of Evolutionarily Conserved
miRNA Clusters
Wen-Ching Chan and Wen-chang Lin
Analysis of Gene Expression Patterns Using Biclustering
Swarup Roy, Dhruba K. Bhattacharyya, and Jugal K. Kalita
Using Semantic Similarities and csbl.go for Analyzing Microarray Data
Kristian Ovaska
Ontology-Based Analysis of Microarray Data
Agapito Giuseppe and Marianna Milano
Integrated Analysis of Transcriptomic and Proteomic Datasets Reveals
Information on Protein Expressivity and Factors Affecting Translational
Efficiency
Jiangxin Wang, Gang Wu, Lei Chen, and Weiwen Zhang
Integrating Microarray Data and GRNs
L. Koumakis, G. Potamias, M. Tsiknakis, M. Zervakis, and V. Moustakis
Biological Network Inference from Microarray Data, Current Solutions,
and Assessments
Swarup Roy and Pietro Hiram Guzzi
A Protocol to Collect Specific Mouse Skeletal Muscles
for Metabolomics Studies
Zhuohui Gan, Zhenxing Fu, Jennifer C. Stowe, Frank L. Powell,
and Andrew D. McCulloch
Functional Analysis of microRNA in Multiple Myeloma
Maria Teresa Di Martino, Nicola Amodio, Pierfrancesco Tassone,
and Pierosandro Tagliaferri
Microarray Analysis in Glioblastomas
Kaumudi M. Bhawe and Manish K. Aghi
Analysis of microRNA Microarrays in Cardiogenesis
Diego Franco, Fernando Bonet, Francisco Hernandez-Torres,
Estefania Lozano-Velasco, Francisco J. Esteban, and Amelia E. Aranega
Erratum to: Classification and Clustering on Microarray Data for Gene
Functional Prediction Using R
Liliana López Kleine, Rosa Montaño, and Francisco Torres-Avilés
Index
Contributors
MANISH K. AGHI • Graduate Division of Biomedical Sciences (BMS),
Department of Neurosurgery and Brain Tumor Research Center,
University of California at San Francisco (UCSF), San Francisco, CA, USA
NICOLA AMODIO • Department of Experimental and Clinical Medicine,
T. Campanella Cancer Center, Magna Graecia University
and Medical Oncology Unit, Catanzaro, Italy
AMELIA E. ARANEGA • Cardiovascular Development Group, Department of Experimental
Biology, University of Jaén, Jaen, Spain
DHRUBA K. BHATTACHARYYA • Tezpur University, Napaam, India
KAUMUDI M. BHAWE • Graduate Division of Biomedical Sciences (BMS),
Department of Neurosurgery and Brain Tumor Research Center,
University of California at San Francisco (UCSF), San Francisco, CA, USA
FERNANDO BONET • Cardiovascular Development Group, Department of Experimental
Biology, University of Jaén, Jaen, Spain
DORUK BOZDAĞ • Biomedical Informatics, The Ohio State University, Columbus, OH, USA
BARBARA CALABRESE • Department of Medical and Surgical Sciences, University
Magna Graecia of Catanzaro, Catanzaro, Italy
MARIO CANNATARO • Department of Medical and Surgical Sciences, University
Magna Graecia of Catanzaro, Catanzaro, Italy
ÜMIT V. ÇATALYÜREK • Biomedical Informatics, Department of Electrical
and Computer Engineering, The Ohio State University, Columbus, OH, USA
WEN-CHING CHAN • Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan,
People’s Republic of China; Institute of Biomedical Sciences, Academia Sinica, Taipei,
Taiwan, People’s Republic of China
LEI CHEN • Laboratory of Synthetic Microbiology, School of Chemical Engineering
and Technology, Tianjin University, Tianjin, People’s Republic of China; Key Laboratory
of Systems Bioengineering, Ministry of Education of China, Tianjin, People’s Republic
of China; Collaborative Innovation Center of Chemical Science and Engineering,
Tianjin, People’s Republic of China
FRANCESCA CRISTIANO • Bioinformatics Laboratory, Department of Surgical
and Medical Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
MEHMET DEVECI • Computer Science and Engineering, The Ohio State University,
Columbus, OH, USA
KEMAL EREN • Computer Science and Engineering, The Ohio State University, Columbus,
OH, USA
FRANCISCO J. ESTEBAN • System Biology Group, Department of Experimental Biology,
University of Jaén, Jaen, Spain
DIEGO FRANCO • Cardiovascular Development Group, Department of Experimental
Biology, University of Jaén, Jaen, Spain
ZHENXING FU • Department of Medicine, University of California, San Diego, San Diego,
CA, USA
ZHUOHUI GAN • Department of Bioengineering, University of California, San Diego,
La Jolla, CA, USA