ACL 2013
BioNLP Shared Task 2013
Proceedings of the Workshop
August 9, 2013
Sofia, Bulgaria
Production and Manufacturing by
Omnipress, Inc.
2600 Anderson Street
Madison, WI 53704 USA
Order copies of this and other ACL proceedings from:
Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]
ISBN: 978-1-937284-55-8
Introduction
The BioNLP Shared Task (BioNLP-ST) series represents a community-wide trend in text-mining for
biology toward fine-grained information extraction (IE). The two previous events, BioNLP-ST 2009 and
2011, attracted wide attention, with over 30 teams submitting final results. The tasks and their data
have since served as the basis of numerous studies, released event extraction systems, and published
datasets. As in previous events, the results of BioNLP-ST 2013 are presented at the ACL/HLT BioNLP-ST
workshop colocated with the BioNLP workshop in Sofia, Bulgaria (9 August 2013).
BioNLP-ST 2013 follows the general outline and goals of the previous tasks. It identifies biologically
relevant extraction targets and proposes a linguistically motivated approach to event representation.
The tasks in BioNLP-ST 2013 cover many new hot topics in biology that are close to biologists' needs.
BioNLP-ST 2013 broadens the scope of the text-mining application domains in biology by introducing
new issues on cancer genetics and pathway curation. It also builds on the well-known previous datasets
GENIA, LLL/BI and BB to propose more realistic tasks than considered previously, closer to the actual
needs of biological data integration.
The first event in 2009 triggered active research in the community on a specific fine-grained IE task.
Expanding on this, the second BioNLP-ST was organized under the theme “Generalization”, which
was well received by participants, who introduced numerous systems that could be straightforwardly
applied to multiple tasks. This time, the BioNLP-ST takes a step further and pursues the grand
theme of “Knowledge base construction”, which is addressed in various ways: semantic web (GE,
GRO), pathways (PC), molecular mechanisms of cancer (CG), regulation networks (GRN) and ontology
population (GRO, BB). A general overview paper in this volume summarizes the organization and
participation in the shared tasks, with 22 teams submitting 38 final results this year. Each specific task is
additionally covered by an overview paper.
As in previous events, manually annotated data were provided for training, development and evaluation
of information extraction methods. According to their relevance for biological studies, the annotations
are either bound to specific expressions in the text or represented as structured knowledge. Tools for
the evaluation of system outputs are publicly available. Support in performing linguistic processing was
provided to the participants in the form of analyses created by various state-of-the-art tools on the dataset
texts. A final overview paper is dedicated to the preparation of these supporting resources.
Thanks to the many excellent manuscripts received from participants and the efforts of the programme
committee, it is our pleasure to present these proceedings describing the task and the participating
systems.
Claire Nédellec, Organizing Chair
Robert Bossy, BB and GRN Task Chair
Jin-Dong Kim, GE Task Chair
Jung-jae Kim, GRO Task Chair
Tomoko Ohta, PC Task Chair
Sampo Pyysalo, CG Task Chair
Pierre Zweigenbaum, PC Chair
Committees

Scientific Advisory Board
Jun’ichi Tsujii (Microsoft), Chair
Philippe Bessières (INRA)
Sung-Pil Choi (KISTI)
Kevin Cohen (Univ. Colorado)
Yuji Kohara (DBCLS)
Tapio Salakoski (Univ. Turku)
Pierre Zweigenbaum (CNRS)

Organizing Committee
Claire Nédellec (INRA), Organizing Chair
Sophia Ananiadou (NaCTeM, Univ. Manchester)
Robert Bossy (INRA), Task BB and GRN Chair
Julien Jourde (INRA)
Jin-Dong Kim (DBCLS), Task GE Chair
Jung-jae Kim (NTU, Singapore), Task GRO Chair
Tomoko Ohta (NaCTeM, Univ. Manchester), Task PC Chair
Sampo Pyysalo (NaCTeM, Univ. Manchester), Task CG Chair
Pontus Stenetorp (Univ. Tokyo)
Yue Wang (DBCLS)

Programme Committee
Pierre Zweigenbaum, National Center for Scientific Research (CNRS), PC Chair
Sophia Ananiadou, University of Manchester (NaCTeM)
Nathalie Aussenac-Gilles, National Center for Scientific Research (CNRS)
Sabine Bergler, Concordia University
Philippe Bessières, National Institute for Agricultural Research (INRA)
Robert Bossy, National Institute for Agricultural Research (INRA)
Kevin Bretonnel Cohen, University of Colorado
Berry de Bruijn, National Research Council (NRC)
Dina Demner-Fushman, National Library of Medicine (NLM)
Jörg Hakenberg, Arizona State University
Jin-Dong Kim, Database Center for Life Science (DBCLS)
Jung-Jae Kim, Nanyang Technological University
Martin Krallinger, National Biotechnology Center (CNB)
David McClosky, Stanford University
Roser Morante, University of Antwerp
Claire Nédellec, National Institute for Agricultural Research (INRA)
Tomoko Ohta, University of Manchester (NaCTeM)
Thierry Poibeau, National Center for Scientific Research (CNRS)
Sampo Pyysalo, University of Manchester (NaCTeM)
Rafal Rak, University of Manchester (NaCTeM)
Sebastian Riedel, University of Massachusetts
Fabio Rinaldi, University of Zurich
Yvan Saeys, Ghent University
Tapio Salakoski, University of Turku
Rune Sætre, Norwegian University of Science and Technology (NTNU)
Özlem Uzuner, State University of New York
Andreas Vlachos, University of Cambridge
Task organizers

GE task
Jin-Dong Kim (DBCLS)
Yue Wang (DBCLS)
Yasunori Yamamoto (DBCLS)
Sabine Bergler (Concordia Univ.)
Roser Morante (Univ. Antwerp)
Kevin Cohen (Univ. Colorado)

GRO task
Jung-jae Kim (NTU)
Han Xu (NTU)
Dietrich Rebholz-Schuhmann (Univ. Zurich)
Vivian Lee (EBI)

CG task
Sampo Pyysalo (NaCTeM and Univ. Manchester)
Tomoko Ohta (NaCTeM and Univ. Manchester)
Rafal Rak (NaCTeM and Univ. Manchester)
Andrew Rowley (NaCTeM and Univ. Manchester)
Jacob Carter (NaCTeM and Univ. Manchester)
Sophia Ananiadou (NaCTeM and Univ. Manchester)

GRN task
Robert Bossy (INRA)
Philippe Bessières (INRA)
Frédéric Papazian (INRA)
Claire Nédellec (INRA)

PC task
Tomoko Ohta (NaCTeM and Univ. Manchester)
Sampo Pyysalo (NaCTeM and Univ. Manchester)
Rafal Rak (NaCTeM and Univ. Manchester)
Andrew Rowley (NaCTeM and Univ. Manchester)
Jacob Carter (NaCTeM and Univ. Manchester)
Sophia Ananiadou (NaCTeM and Univ. Manchester)
Sung-Pil Choi (KISTI)
Hong-woo Chun (KISTI)
Sung-jae Jung (KISTI)
Hyun Uk Kim (KAIST)
Jinki Kim (KAIST)
Kyusang Hwang (KAIST)
Yonghwa Jo
Hyeyeon Choi

BB task
Robert Bossy (INRA)
Philippe Bessières (INRA)
Wiktoria Golik (INRA)
Frédéric Papazian (INRA)
Zorana Ratkovic (INRA)
Claire Nédellec (INRA)

Table of Contents
Overview of BioNLP Shared Task 2013
Claire Nédellec, Robert Bossy, Jin-Dong Kim, Jung-Jae Kim, Tomoko Ohta, Sampo Pyysalo and
Pierre Zweigenbaum ..... 1
The Genia Event Extraction Shared Task, 2013 Edition - Overview
Jin-Dong Kim, Yue Wang and Yamamoto Yasunori ..... 8
TEES 2.1: Automated Annotation Scheme Learning in the BioNLP 2013 Shared Task
Jari Björne and Tapio Salakoski ..... 16
EVEX in ST’13: Application of a large-scale text mining resource to event extraction and network
construction
Kai Hakala, Sofie Van Landeghem, Tapio Salakoski, Yves Van de Peer and Filip Ginter ..... 26
Extracting Biomedical Events and Modifications Using Subgraph Matching with Noisy Training Data
Andrew MacKinlay, David Martinez, Antonio Jimeno Yepes, Haibin Liu, W John Wilbur and Karin
Verspoor ..... 35
Biomedical Event Extraction by Multi-class Classification of Pairs of Text Entities
Xiao Liu, Antoine Bordes and Yves Grandvalet ..... 45
GRO Task: Populating the Gene Regulation Ontology with events and relations
Jung-Jae Kim, Xu Han, Vivian Lee and Dietrich Rebholz-Schuhmann ..... 50
Overview of the Cancer Genetics (CG) task of BioNLP Shared Task 2013
Sampo Pyysalo, Tomoko Ohta and Sophia Ananiadou ..... 58
Overview of the Pathway Curation (PC) task of BioNLP Shared Task 2013
Tomoko Ohta, Sampo Pyysalo, Rafal Rak, Andrew Rowley, Hong-Woo Chun, Sung-Jae Jung,
Sung-Pil Choi, Sophia Ananiadou and Jun’ichi Tsujii ..... 67
Generalizing an Approximate Subgraph Matching-based System to Extract Events in Molecular Biology
and Cancer Genetics
Haibin Liu, Karin Verspoor, Donald C. Comeau, Andrew MacKinlay and W John Wilbur ..... 76
Performance and limitations of the linguistically motivated Cocoa/Peaberry system in a broad biological
domain.
S V Ramanan and P. Senthil Nathan ..... 86
NaCTeM EventMine for BioNLP 2013 CG and PC tasks
Makoto Miwa and Sophia Ananiadou ..... 94
BioNLP Shared Task 2013: Supporting Resources
Pontus Stenetorp, Wiktoria Golik, Thierry Hamon, Donald C. Comeau, Rezarta Islamaj Dogan,
Haibin Liu and W John Wilbur ..... 99
A fast rule-based approach for biomedical event extraction
Quoc-Chinh Bui, David Campos, Erik van Mulligen and Jan Kors ..... 104
Improving Feature-Based Biomedical Event Extraction System by Integrating Argument Information
Lishuang Li, Yiwen Wang and Degen Huang ..... 109
UZH in BioNLP 2013
Gerold Schneider, Simon Clematide, Tilia Ellendorff, Don Tuggener, Fabio Rinaldi and Gintarė
Grigonytė ..... 116
A Hybrid approach for biomedical event extraction
Xuan Quang Pham, Minh Quang Le and Bao Quoc Ho ..... 121
Identification of Genia Events using Multiple Classifiers
Roland Roller and Mark Stevenson ..... 125
Exploring a Probabilistic Earley Parser for Event Composition in Biomedical Texts
Mai-Vu Tran, Nigel Collier, Hoang-Quynh Le, Van-Thuy Phi and Thanh-Binh Pham ..... 130
Detecting Relations in the Gene Regulation Network
Thomas Provoost and Marie-Francine Moens ..... 135
Ontology-based semantic annotation: an automatic hybrid rule-based method
Sondes Bannour, Laurent Audibert and Henry Soldano ..... 139
Building A Contrasting Taxa Extractor for Relation Identification from Assertions: BIOlogical Taxonomy
& Ontology Phrase Extraction System
Cyril Grouin ..... 144
BioNLP Shared Task 2013 – An overview of the Genic Regulation Network Task
Robert Bossy, Philippe Bessières and Claire Nédellec ..... 153
BioNLP shared Task 2013 – An Overview of the Bacteria Biotope Task
Robert Bossy, Wiktoria Golik, Zorana Ratkovic, Philippe Bessières and Claire Nédellec ..... 161
Bacteria Biotope Detection, Ontology-based Normalization, and Relation Extraction using Syntactic
Rules
İlknur Karadeniz and Arzucan Özgür ..... 170
Extracting Gene Regulation Networks Using Linear-Chain Conditional Random Fields and Rules
Slavko Zitnik, Marinka Žitnik, Blaž Zupan and Marko Bajec ..... 178
IRISA participation to BioNLP-ST13: lazy-learning and information retrieval for information extraction
tasks
Vincent Claveau ..... 188
Workshop Program
Friday, August 9, 2013
(8:30-9:00) Welcome and Introduction
08:45 Overview of BioNLP Shared Task 2013
Claire Nédellec, Robert Bossy, Jin-Dong Kim, Jung-Jae Kim, Tomoko Ohta, Sampo
Pyysalo and Pierre Zweigenbaum
Session 1: (9:00-10:30) Oral Presentations: Genia Event Extraction and Gene
Regulation Ontology
9:00 The Genia Event Extraction Shared Task, 2013 Edition - Overview
Jin-Dong Kim, Yue Wang and Yamamoto Yasunori
9:10–9:30 TEES 2.1: Automated Annotation Scheme Learning in the BioNLP 2013 Shared
Task
Jari Björne and Tapio Salakoski
9:30–9:50 EVEX in ST’13: Application of a large-scale text mining resource to event extraction
and network construction
Kai Hakala, Sofie Van Landeghem, Tapio Salakoski, Yves Van de Peer and Filip
Ginter
9:50–10:10 Extracting Biomedical Events and Modifications Using Subgraph Matching with
Noisy Training Data
Andrew MacKinlay, David Martinez, Antonio Jimeno Yepes, Haibin Liu, W John
Wilbur and Karin Verspoor
10:10–10:30 Biomedical Event Extraction by Multi-class Classification of Pairs of Text Entities
Xiao Liu, Antoine Bordes and Yves Grandvalet
(10:30-11:00) Break
Session 2: (11:00-12:30) Oral Presentations: Cancer Genetics and Pathway Curation
11:00 GRO Task: Populating the Gene Regulation Ontology with events and relations
Jung-Jae Kim, Xu Han, Vivian Lee and Dietrich Rebholz-Schuhmann
11:10 Overview of the Cancer Genetics (CG) task of BioNLP Shared Task 2013
Sampo Pyysalo, Tomoko Ohta and Sophia Ananiadou
11:20 Overview of the Pathway Curation (PC) task of BioNLP Shared Task 2013
Tomoko Ohta, Sampo Pyysalo, Rafal Rak, Andrew Rowley, Hong-Woo Chun, Sung-Jae
Jung, Sung-Pil Choi, Sophia Ananiadou and Jun’ichi Tsujii
11:30–11:50 Generalizing an Approximate Subgraph Matching-based System to Extract Events in
Molecular Biology and Cancer Genetics
Haibin Liu, Karin Verspoor, Donald C. Comeau, Andrew MacKinlay and W John Wilbur
11:50–12:10 Performance and limitations of the linguistically motivated Cocoa/Peaberry system in a
broad biological domain.
S V Ramanan and P. Senthil Nathan
12:10–12:30 NaCTeM EventMine for BioNLP 2013 CG and PC tasks
Makoto Miwa and Sophia Ananiadou
(12:30-14:00) Lunch Break
Session 3: (14:00-15:30) Posters
BioNLP Shared Task 2013: Supporting Resources
Pontus Stenetorp, Wiktoria Golik, Thierry Hamon, Donald C. Comeau, Rezarta Islamaj
Dogan, Haibin Liu and W John Wilbur
A fast rule-based approach for biomedical event extraction
Quoc-Chinh Bui, David Campos, Erik van Mulligen and Jan Kors
Improving Feature-Based Biomedical Event Extraction System by Integrating Argument
Information
Lishuang Li, Yiwen Wang and Degen Huang
UZH in BioNLP 2013
Gerold Schneider, Simon Clematide, Tilia Ellendorff, Don Tuggener, Fabio Rinaldi and
Gintarė Grigonytė