Yohei Murakami
Donghui Lin (Eds.)
Worldwide Language
Service Infrastructure
Second International Workshop, WLSI 2015
Kyoto, Japan, January 22–23, 2015
Revised Selected Papers
Lecture Notes in Artificial Intelligence 9442
Subseries of Lecture Notes in Computer Science
LNAI Series Editors
Randy Goebel
University of Alberta, Edmonton, Canada
Yuzuru Tanaka
Hokkaido University, Sapporo, Japan
Wolfgang Wahlster
DFKI and Saarland University, Saarbrücken, Germany
LNAI Founding Series Editor
Joerg Siekmann
DFKI and Saarland University, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/1244
Yohei Murakami · Donghui Lin (Eds.)
Worldwide Language
Service Infrastructure
Second International Workshop, WLSI 2015
Kyoto, Japan, January 22–23, 2015
Revised Selected Papers
Editors
Yohei Murakami, Unit of Design, Kyoto University, Kyoto, Japan
Donghui Lin, Kyoto University, Kyoto, Japan
ISSN 0302-9743 ISSN 1611-3349 (electronic)
Lecture Notes in Artificial Intelligence
ISBN 978-3-319-31467-9 ISBN 978-3-319-31468-6 (eBook)
DOI 10.1007/978-3-319-31468-6
Library of Congress Control Number: 2016934198
LNCS Sublibrary: SL7 – Artificial Intelligence
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland
Preface
Language technologies and tools (hereafter called language resources) increasingly
require sophisticated infrastructures to share, deploy as services, and combine for
supporting research, development, innovation, and collaboration. To meet this need,
several infrastructures have already been established over the past few years, such as
Language Grid, Language Application Grid, META-SHARE, MLi, and PANACEA.
The main theme of the International Workshop on Worldwide Language Service
Infrastructure (WLSI) is technological and institutional challenges that are significant
for constructing a worldwide interoperable language service infrastructure. The first
workshop focused on language service infrastructures in Asian areas, and a related
workshop, Language Technology Service Platforms: Synergies, Standards, Sharing
(LTSP 2014), was held at the ninth edition of the Language Resources and Evaluation
Conference (LREC 2014). The aim of LTSP 2014 was to provide a forum to enhance
international cooperation and sustainable collaboration among worldwide initiatives.
The second workshop was held during January 22–23, 2015, in Kyoto, Japan.
The workshop featured five prominent invited speakers: Toru Ishida from the
Department of Social Informatics, Kyoto University, who introduced intercultural
collaboration activities of the Language Grid; Nancy Ide from the Department of
Computer Science, Vassar College, who presented the Language Application Grid
framework to create custom natural language processing applications; Khalid Choukri
from the Evaluations and Language Resources Distribution Agency, who explained the
MLi Hub Project that aims at compiling the specification of the next generation of
language grids; Núria Bel from the Department of Translation and Language Sciences,
University of Pompeu Fabra, who reported characteristics of users in humanities and
social sciences in the Spanish CLARIN Center; and Nicoletta Calzolari from the
European Language Resource Association, who summarized policy issues related to
language service infrastructures. The first four invited speakers lead ongoing
projects in Asia, the USA, and Europe, and the last represents an association
that promotes language resources. The workshop included 11 oral presentations
and four posters. Participation in the workshop was by invitation only, and there
were 29 professionals from 10 countries: China, France, Greece, Indonesia, Italy,
Japan, Spain, Thailand, the USA, and Vietnam.
This volume includes 14 selected papers presented at the workshop. The papers are
categorized into four parts. The first part introduces metadata and annotations to
describe what kind of functionalities and annotations language services provide, and
how to invoke the language services and convert the output of a language service to the
input of another service. In META-SHARE, Piperidis et al. have focused on processing
language datasets with appropriate linguistic annotation services such as tokenization,
POS tagging, lemmatization, dependency parsing, and so on. On the other hand, in the
Language Application Grid, Ide and Verhagen have addressed the language service
interoperability to combine various services by defining Web Service Exchange
Vocabulary (WS-EV), which specifies a terminology for a core of linguistic objects
exchanged among linguistic annotation services, and LAPPS Interchange Format
(LIF), which represents linguistically annotated data including WS-EV for Web service
invocations.
The second part provides technologies for service platforms that compose atomic
language services across different interfaces, policies, and licenses. Ide et al. have
proposed the Language Application Grid platform that enables language service
composition using the Galaxy workflow engine in the workflow layer, LIF in the
messaging layer, and WS-EV in the vocabulary layer. To solve licensing issues, Cieri
and DiPersio have proposed the Language Application Grid license schema by
establishing two classes of enforcement: requirement and notification. Mai et al. have
tackled policy-aware language service composition by modeling the parallel execution
policy of atomic language services. Moreover, Otani et al. have introduced Language
Mashup to combine differently licensed services, commercial language services, and
open-source language services.
The third part focuses on the development of language resources and services,
especially for low-resource languages. Aili and Wushouer describe how to build Uyghur
language resources such as a dependency treebank and a grammatical information
dictionary. Martadinata et al. explain how to implement a language identification tool
with a Wikipedia corpus and Twitter data. This tool is useful for classifying social media
posts, such as those on Twitter, into several regional languages in Indonesia, which can
contribute to monolingual corpora creation in those languages.
The fourth part collects reports on language service application. Luong et al. have
developed a Vietnamese multimedia agricultural information retrieval service using a
Vietnamese agricultural thesaurus. Liu and Gao have proposed an approach to mine the
opinion polarity of songs based on song lyrics in a multilingual environment. Del Gratta
et al. have presented the Cooperative Philology WordNet Platform (CoPhiWordNet)
that connects different WordNets in both modern and classical languages such as the
Ancient Greek WordNet, the Latin WordNet, the Italian WordNet, the Croatian
WordNet, and the Arabic WordNet. Sornlertlamvanich and Kruengkrai have applied a
semantic relation extraction approach based on simple relation templates to the Thai
cultural database for generating knowledge maps and infoboxes.
We hope this book will strongly support and encourage researchers who are willing
to utilize various language services worldwide to create customized language
applications and multilingual environments. We are grateful to all the participants and those
who have supported this workshop.
January 2016 Yohei Murakami
Donghui Lin
Organization
WLSI 2015 was organized by the Language Grid Project, Ishida and Matsubara
Laboratory, Department of Social Informatics, Kyoto University.
Organizing Committee
Workshop Co-chairs
Yohei Murakami Kyoto University, Japan
Donghui Lin Kyoto University, Japan
Program Committee
Mirna Adriani University of Indonesia, Indonesia
Mairehaba Aili Xinjiang University, China
Núria Bel Universitat Pompeu Fabra, Spain
Nicoletta Calzolari CNR-ILC, Italy
Khalid Choukri ELDA, France
Luca Dini Ho2S, France
Riccardo Del Gratta CNR-ILC, Italy
Zhiqiang Gao Southeast University, China
Nancy Ide Vassar College, USA
Hitoshi Isahara Toyohashi University of Technology, Japan
Toru Ishida Kyoto University, Japan
Yoshinobu Kano NII, Japan
Monica Monachini CNR-ILC, Italy
Weinila Mushajiang Xinjiang University, China
Masayuki Otani Kyoto University, Japan
Stelios Piperidis ILSP, Greece
James Pustejovsky Brandeis University, USA
Vu Hai Quan University of Natural Sciences, Vietnam National
University, Vietnam
Virach Sornlertlamvanich SIIT, Thailand
Workshop Secretariat
Terumi Kosugi Kyoto University, Japan
Hiroko Yamaguchi Kyoto University, Japan
Sponsor
Grant-in-Aid for Scientific Research (S) (No. 24220002), JSPS
Contents
Metadata and Annotation for Language Services
Combining and Extending Data Infrastructures with Linguistic Annotation Services
    Stelios Piperidis, Dimitrios Galanis, Juli Bakagianni,
    and Sokratis Sofianopoulos
The Language Application Grid Web Service Exchange Vocabulary
    Nancy Ide, Keith Suderman, Marc Verhagen, and James Pustejovsky
The LAPPS Interchange Format
    Marc Verhagen, Keith Suderman, Di Wang, Nancy Ide, Chunqi Shi,
    Jonathan Wright, and James Pustejovsky
Service Platform and Service Management
The Language Application Grid
    Nancy Ide, James Pustejovsky, Christopher Cieri, Eric Nyberg,
    Denise DiPersio, Chunqi Shi, Keith Suderman, Marc Verhagen,
    Di Wang, and Jonathan Wright
A Policy-Aware Parallel Execution Control Mechanism for Language Application
    Mai Xuan Trang, Yohei Murakami, and Toru Ishida
A License Scheme for a Global Federated Language Service Infrastructure
    Christopher Cieri and Denise DiPersio
Language Mashup: Personal Grid for Language Resources
    Masayuki Otani, Takao Nakaguchi, Donghui Lin, Yohei Murakami,
    and Toru Ishida
Developing Language Resources and Services
Building Indonesian Local Language Detection Tools Using Wikipedia Data
    Puji Martadinata, Bayu Distiawan Trisedya, Hisar Maruli Manurung,
    and Mirna Adriani
Building Uyghur Dependency Treebank: Design Principles, Annotation Schema and Tools
    Mairehaba Aili, Aziguli Xialifu, Maihefureti, and Saimaiti Maimaitimin
Building Contemporary Uyghur Grammatical Information Dictionary
    Jiamila Wushouer, Wayiti Abulizi, Kahaerjiang Abiderexiti,
    Tuergen Yibulayin, Maierhaba Aili, and Saimaiti Maimaitimin
Language Service Applications
Vietnamese Multimedia Agricultural Information Retrieval System as an Info Service
    Thi H. Luong, Nhut M. Pham, and Quan H. Vu
Mining Opinion Polarity from Multilingual Song Lyrics
    Qian Liu and Zhiqiang Gao
Cooperative Philology on the Way to Web Services: The Case of the CoPhiWordNet Platform
    Riccardo Del Gratta, Federico Boschetti, Angelo Del Grosso,
    Fahad Khan, and Monica Monachini
Effectiveness of Keyword and Semantic Relation Extraction for Knowledge Map Generation
    Virach Sornlertlamvanich and Canasai Kruengkrai
Author Index