Table of Contents
Saunder January 24, 2011 10:39 book
Multimedia Computing, Communication and Intelligence
Series Editors: Chang Wen Chen and Shiguo Lian
Music Emotion Recognition
Yi-Hsuan Yang and Homer H. Chen
ISBN: 978-1-4398-5046-6
TV Content Analysis: Techniques and Applications
Edited by Yiannis Kompatsiaris, Bernard Merialdo, and Shiguo Lian
ISBN: 978-1-4398-5560-7
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does
not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks
of a particular pedagogical approach or particular use of the MATLAB® software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2011 by Taylor and Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number: 978-1-4398-5046-6 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Preface
Abbreviations
1 Introduction
1.1 Importance of Music Emotion Recognition
1.2 Recognizing the Perceived Emotion of Music
1.3 Issues of Music Emotion Recognition
1.3.1 Ambiguity and Granularity of Emotion Description
1.3.2 Heavy Cognitive Load of Emotion Annotation
1.3.3 Subjectivity of Emotional Perception
1.3.4 Semantic Gap between Low-Level Audio Signal and High-Level Human Perception
1.4 Summary
2 Overview of Emotion Description and Recognition
2.1 Emotion Description
2.1.1 Categorical Approach
2.1.2 Dimensional Approach
2.1.3 Music Emotion Variation Detection
2.2 Emotion Recognition
2.2.1 Categorical Approach
2.2.1.1 Data Collection
2.2.1.2 Data Preprocessing
2.2.1.3 Subjective Test
2.2.1.4 Feature Extraction
2.2.1.5 Model Training
2.2.2 Dimensional Approach
2.2.3 Music Emotion Variation Detection
2.3 Summary
3 Music Features
3.1 Energy Features
3.2 Rhythm Features
3.3 Temporal Features
3.4 Spectrum Features
3.5 Harmony Features
3.6 Summary
4 Dimensional MER by Regression
4.1 Adopting the Dimensional Conceptualization of Emotion
4.2 VA Prediction
4.2.1 Weighted Sum of Component Functions
4.2.2 Fuzzy Approach
4.2.3 System Identification Approach (System ID)
4.3 The Regression Approach
4.3.1 Regression Theory
4.3.2 Problem Formulation
4.3.3 Regression Algorithms
4.3.3.1 Multiple Linear Regression
4.3.3.2 ε-Support Vector Regression
4.3.3.3 AdaBoost Regression Tree (AdaBoost.RT)
4.4 System Overview
4.5 Implementation
4.5.1 Data Collection
4.5.2 Feature Extraction
4.5.3 Subjective Test
4.5.4 Regressor Training
4.6 Performance Evaluation
4.6.1 Consistency Evaluation of the Ground Truth
4.6.2 Data Transformation
4.6.3 Feature Selection
4.6.4 Accuracy of Emotion Recognition
4.6.5 Performance Evaluation for Music Emotion Variation Detection
4.6.6 Performance Evaluation for Emotion Classification
4.7 Summary
5 Ranking-Based Emotion Annotation and Model Training
5.1 Motivation
5.2 Ranking-Based Emotion Annotation
5.3 Computational Model for Ranking Music by Emotion
5.3.1 Learning-to-Rank
5.3.2 Ranking Algorithms
5.3.2.1 RankSVM
5.3.2.2 ListNet
5.3.2.3 RBF-ListNet
5.4 System Overview
5.5 Implementation
5.5.1 Data Collection
5.5.2 Feature Extraction
5.6 Performance Evaluation
5.6.1 Cognitive Load of Annotation
5.6.2 Accuracy of Emotion Recognition
5.6.2.1 Comparison of Different Feature Representations
5.6.2.2 Comparison of Different Learning Algorithms
5.6.2.3 Sensitivity Test
5.6.3 Subjective Evaluation of the Prediction Result
5.7 Discussion
5.8 Summary
6 Fuzzy Classification of Music Emotion
6.1 Motivation
6.2 Fuzzy Classification
6.2.1 Fuzzy k-NN Classifier
6.2.2 Fuzzy Nearest-Mean Classifier
6.3 System Overview
6.4 Implementation
6.4.1 Data Collection
6.4.2 Feature Extraction and Feature Selection
6.5 Performance Evaluation
6.5.1 Accuracy of Emotion Classification
6.5.2 Music Emotion Variation Detection
6.6 Summary
7 Personalized MER and Groupwise MER
7.1 Motivation
7.2 Personalized MER
7.3 Groupwise MER
7.4 Implementation
7.4.1 Data Collection
7.4.2 Personal Information Collection
7.4.3 Feature Extraction
7.5 Performance Evaluation
7.5.1 Performance of the General Method
7.5.2 Performance of GWMER
7.5.3 Performance of PMER
7.6 Summary
8 Two-Layer Personalization
8.1 Problem Formulation
8.2 Bag-of-Users Model
8.3 Residual Modeling and Two-Layer Personalization Scheme
8.4 Performance Evaluation
8.5 Summary
9 Probability Music Emotion Distribution Prediction
9.1 Motivation
9.2 Problem Formulation
9.3 The KDE-Based Approach to Music Emotion Distribution Prediction
9.3.1 Ground Truth Collection
9.3.2 Regressor Training
9.3.2.1 ν-Support Vector Regression
9.3.2.2 Gaussian Process Regression
9.3.3 Regressor Fusion
9.3.3.1 Weighted by Performance
9.3.3.2 Optimization
9.3.4 Output of Emotion Distribution
9.4 Implementation
9.4.1 Data Collection
9.4.2 Feature Extraction
9.5 Performance Evaluation
9.5.1 Comparison of Different Regression Algorithms
9.5.2 Comparison of Different Distribution Modeling Methods
9.5.3 Comparison of Different Feature Representations
9.5.4 Evaluation of Regressor Fusion
9.6 Discussion
9.7 Summary
10 Lyrics Analysis and Its Application to MER
10.1 Motivation
10.2 Lyrics Feature Extraction
10.2.1 Uni-Gram
10.2.2 Probabilistic Latent Semantic Analysis (PLSA)
10.2.3 Bi-Gram
10.3 Multimodal MER System
10.4 Performance Evaluation
10.4.1 Comparison of Multimodal Fusion Methods
10.4.2 Performance of PLSA Model
10.4.3 Performance of Bi-Gram Model
10.5 Summary
11 Chord Recognition and Its Application to MER
11.1 Chord Recognition
11.1.1 Beat Tracking and PCP Extraction
11.1.2 Hidden Markov Model and N-Gram Model
11.1.3 Chord Decoding
11.2 Chord Features
11.2.1 Longest Common Chord Subsequence
11.2.2 Chord Histogram
11.3 System Overview
11.4 Performance Evaluation
11.4.1 Evaluation of Chord Recognition System
11.4.2 Accuracy of Emotion Classification
11.5 Summary
12 Genre Classification and Its Application to MER
12.1 Motivation
12.2 Two-Layer Music Emotion Classification
12.3 Performance Evaluation
12.3.1 Data Collection
12.3.2 Analysis of the Correlation between Genre and Emotion
12.3.3 Evaluation of the Two-Layer Emotion Classification Scheme
12.3.3.1 Computational Model
12.3.3.2 Evaluation Measures
12.3.3.3 Results
12.4 Summary