New Developments in Parsing Technology
Text, Speech and Language Technology
VOLUME 23
Series Editors
Nancy Ide, Vassar College, New York
Jean Véronis, Université de Provence and CNRS, France
Editorial Board
Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands
Kenneth W. Church, AT&T Bell Labs, New Jersey, USA
Judith Klavans, Columbia University, New York, USA
David T. Barnard, University of Regina, Canada
Dan Tufis, Romanian Academy of Sciences, Romania
Joaquim Llisterri, Universitat Autònoma de Barcelona, Spain
Stig Johansson, University of Oslo, Norway
Joseph Mariani, LIMSI-CNRS, France
The titles published in this series are listed on www.wkap.nl/prod/s/TLTB.
New Developments in
Parsing Technology
Edited by
Harry Bunt
Tilburg University,
Tilburg, The Netherlands
John Carroll
University of Sussex,
Brighton, United Kingdom
and
Giorgio Satta
University of Padua,
Padua, Italy
KLUWER ACADEMIC PUBLISHERS
NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-2295-6
Print ISBN: 1-4020-2293-X
©2005 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers
Dordrecht
All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic,
mechanical, recording, or otherwise, without written consent from the Publisher
Created in the United States of America
Visit Springer's eBookstore at: http://ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Contents
Preface xi
1
Developments in Parsing Technology: From Theory to Application 1
Harry Bunt, John Carroll and Giorgio Satta
1 Introduction 1
2 About this book 6
2
Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods 19
Michael Collins
1 Introduction 19
2 Linear Models 20
3 Probabilistic Context-Free Grammars 23
4 Statistical Learning Theory 26
5 Convergence Bounds for Finite Sets of Hypotheses 29
6 Convergence Bounds for Hyperplane Classifiers 34
7 Application of Margin Analysis to Parsing 37
8 Algorithms 39
9 Discussion 46
10 Conclusions 49
3
High Precision Extraction of Grammatical Relations 57
John Carroll and Ted Briscoe
1 Introduction 57
2 The Analysis System 59
3 Empirical Results 61
4 Conclusions and Further Work 68
4
Automated Extraction of TAGs from the Penn Treebank 73
John Chen and K. Vijay-Shanker
1 Introduction 73
2 Tree Extraction Procedure 74
3 Evaluation 80
4 Extended Extracted Grammars 84
5 Related Work 86
6 Conclusions 87
5
Computing the Most Probable Parse for a Discontinuous Phrase-Structure Grammar 91
Oliver Plaehn
1 Introduction 91
2 Discontinuous Phrase-Structure Grammar 92
3 The Parsing Algorithm 95
4 Computing the Most Probable Parse 98
5 Experiments 101
6 Conclusion and Future Work 103
6
A Neural Network Parser that Handles Sparse Data 107
James Henderson
1 Introduction 107
2 Simple Synchrony Networks 108
3 A Probabilistic Parser for SSNs 110
4 Estimating the Probabilities with a Simple Synchrony Network 113
5 Generalizing from Sparse Data 117
6 Conclusion 123
7
An Efficient LR Parser Generator for Tree-Adjoining Grammars 125
Carlos A. Prolo
1 Introduction 125
2 TAGS 127
3 On Some Degenerate LR Models for TAGS 129
4 Proposed Algorithm 133
5 Implementation 141
6 Example 146
7 Some Properties of the Algorithms 146
8 Evaluation 151
9 Conclusions 151
8
Relating Tabular Parsing Algorithms for LIG and TAG 157
Miguel A. Alonso, Éric de la Clergerie, Víctor J. Díaz and Manuel Vilares
1 Introduction 158
2 Tree-Adjoining Grammars 158
3 Linear Indexed Grammars 160
4 Bottom-up Parsing Algorithms 162
5 Earley-like Parsing Algorithms 166
6 Earley-like Parsing Algorithms Preserving the Correct Prefix Property 171
7 Bidirectional Parsing 178
8 Specialized TAG parsers 180
9 Conclusion 182
9
Improved Left-Corner Chart Parsing for Large Context-Free Grammars 185
Robert C. Moore
1 Introduction 185
2 Evaluating Parsing Algorithms 186
3 Terminology and Notation 187
4 Test Grammars 187
5 Left-Corner Parsing Algorithms and Refinements 188
6 Grammar Transformations 193
7 Extracting Parses from the Chart 196
8 Comparison to Other Algorithms 197
9 Conclusions 199
10
On Two Classes of Feature Paths in Large-Scale Unification Grammars 203
Liviu Ciortuz
1 Introduction 203
2 Compiling the Quick Check Filter 205
3 Generalised Rule Reduction 215
4 Conclusion 224
11
A Context-Free Superset Approximation of Unification-Based Grammars 229
Bernd Kiefer and Hans-Ulrich Krieger
1 Introduction 229
2 Basic Inventory 231
3 Approximation as Fixpoint Construction 232
4 The Basic Algorithm 233
5 Implementation Issues and Optimizations 235
6 Revisiting the Fixpoint Construction 240
7 Three Grammars 241
8 Disambiguation of UBGs via Probabilistic Approximations 247
12
A Recognizer for Minimalist Languages 251
Henk Harkema
1 Introduction 251
2 Minimalist Grammars 252
3 Specification of the Recognizer 256
4 Correctness 260
5 Complexity Results 264
6 Conclusions and Future Work 265
13
Range Concatenation Grammars 269
Pierre Boullier
1 Introduction 269
2 Positive Range Concatenation Grammars 270
3 Negative Range Concatenation Grammars 276
4 A Parsing Algorithm for RCGs 281
5 Closure Properties and Modularity 284
6 Conclusion 286
14
Grammar Induction by MDL-Based Distributional Classification 291
Yikun Guo, Fuliang Weng and Lide Wu
1 Introduction 292
2 Grammar Induction with the MDL Principle 293
3 Induction Strategies 295
4 MDL Induction by Dynamic Distributional Classification (DCC) 299
5 Comparison and Conclusion 303
Appendix 305
15
Optimal Ambiguity Packing in Context-Free Parsers with Interleaved Unification 307
Alon Lavie and Carolyn Penstein Rosé
1 Introduction 307
2 Ambiguity Packing in Context Free Parsing 309
3 The Rule Prioritization Heuristic 311
4 Empirical Evaluations and Discussion 315
5 Conclusions and Future Directions 319
16
Robust Data Oriented Spoken Language Understanding 323
Khalil Sima’an
1 Introduction 323
2 Brief Overview of OVIS 324
3 DOP vs. Tree-Gram 326
4 Application to the OVIS Domain 332
5 Conclusions 335
17
SOUP: A Parser for Real-World Spontaneous Speech 339
Marsal Gavaldà
1 Introduction 339
2 Grammar Representation 340
3 Sketch of the Parsing Algorithm 341
4 Performance 343
5 Key Features 345
6 Conclusion 349
18
Parsing and Hypergraphs 351
Dan Klein and Christopher D. Manning
1 Introduction 351
2 Hypergraphs and Parsing 352
3 Viterbi Parsing Algorithm 359
4 Analysis 363
5 Conclusion 368
Appendix 369
19
Measure for Measure: Towards Increased Component Comparability and Exchange 373
Stephan Oepen and Ulrich Callmeier
1 Competence & Performance Profiling 375
2 Strong Empiricism: A Few Examples 378
3 PET – Synthesizing Current Best Practice 384
4 Quantifying Progress 385
5 Multi-Dimensional Performance Profiling 387
6 Conclusion – Recent Developments 391
Index 397
Preface
This book is based on contributions to two workshops in the series
“International Workshop on Parsing Technology”. IWPT2000, the 6th workshop
in the series, was held in Trento, Italy, in February 2001, and was organized
by John Carroll (Programme Chair), Harry Bunt (General Chair) and Alberto
Lavelli (Local Chair). The 7th workshop, IWPT2001, took place in Beijing,
China, in October 2001, and was organized by Giorgio Satta (Programme Chair),
Harry Bunt (General Chair) and Shiwen Yu and Fuliang Weng (Local Co-Chairs).
From each of these events the best papers were selected and re-reviewed, and
subsequently revised, updated and extended by the authors, resulting in a
chapter in this volume. The chapter by Alonso, De la Clergerie, Díaz and
Vilares is based on material from two papers: the IWPT2000 paper by Alonso,
De la Clergerie, Graña and Vilares, and the IWPT2001 paper by Alonso, Díaz
and Vilares. The chapter by Michael Collins corresponds to the paper that he
prepared for IWPT2001 as an invited speaker, but which he was unable to
present at the workshop, due to travel restrictions relating to the events of
September 11, 2001. The introductory chapter of this book was written by the
editors in order to relate the individual chapters to recent issues and
developments in the field of parsing technology.
We wish to acknowledge the important role of the programme committees of
IWPT2000 and IWPT2001 in reviewing submitted papers; these reviews have been
the basis for selecting the material published in this book. These committees
consisted of Shuo Bai, Bob Berwick, Eric Brill, Harry Bunt, Bob Carpenter,
John Carroll, Ken Church, Éric De la Clergerie, Mark Johnson, Aravind Joshi,
Ron Kaplan, Martin Kay, Sadao Kurohashi, Bernard Lang, Alon Lavie, Yuji
Matsumoto, Paola Merlo, Mark-Jan Nederhof, Anton Nijholt, Giorgio Satta,
Christer Samuelsson, Satoshi Sekine, Virach Sornlertlamvanich, Mark Steedman,
Oliviero Stock, Hozumi Tanaka, Masaru Tomita, Hans Uszkoreit, K. Vijay-Shanker,
David Weir, Mats Wirén, Dekai Wu and Tiejun Zhao.
THE EDITORS