Lecture Notes in Computer Science 5831
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Yishai A. Feldman
Donald Kraft
Tsvi Kuflik (Eds.)
Next Generation
Information Technologies
and Systems
7th International Conference, NGITS 2009
Haifa, Israel, June 16-18, 2009
Revised Selected Papers
Volume Editors
Yishai A. Feldman
IBM Haifa Research Lab
Haifa University Campus, Mount Carmel, Haifa 31905, Israel
E-mail: [email protected]
Donald Kraft
U.S. Air Force Academy
Department of Computer Science
2354 Fairchild Drive, Suite 6G-101, Colorado Springs, CO 80840, USA
E-mail: [email protected]
Tsvi Kuflik
The University of Haifa
Management Information Systems Department
Mount Carmel, Haifa 31905, Israel
E-mail: [email protected]
Library of Congress Control Number: 2009935905
CR Subject Classification (1998): H.4, H.3, H.5, H.2, D.2.12, C.2.4
LNCS Sublibrary: SL 3 – Information Systems and Application, incl. Internet/Web
and HCI
ISSN 0302-9743
ISBN-10 3-642-04940-0 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-04940-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2009
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12772971 06/3180 543210
Foreword
Information technology is a rapidly changing field in which researchers and develop-
ers must continuously set their vision on the next generation of technologies and the
systems that they enable. The Next Generation Information Technologies and Systems
(NGITS) series of conferences provides a forum for presenting and discussing the
latest advances in information technology. NGITS conferences are international
events held in Israel; previous conferences have taken place in 1993, 1995, 1997,
1999, 2002, and 2006.
In addition to 14 reviewed papers, the conference featured two keynote lectures and
an invited talk by notable experts. The selected papers may be classified roughly in
five broad areas:
• Middleware and Integration
• Modeling
• Healthcare/Biomedical
• Service and Information Management
• Applications
NGITS 2009 also included a demonstration session and an industrial track focusing on
how to make software development more efficient by cutting expenses with technol-
ogy and infrastructures.
This event is the culmination of efforts by many talented and dedicated individuals.
We are pleased to extend our thanks to the authors of all submitted papers, the mem-
bers of the program committee, and the external reviewers. Many thanks are also due
to Nilly Schnapp for local organization and logistics, and to Eugeny Myunster for
managing the web site and all other technical matters. Finally, we are pleased to acknowledge
the support of our institutional sponsors: The University of Haifa, the
Faculty of Social Sciences and the MIS Department at the University of Haifa, the
IBM Haifa Research Lab, and the Technion.
June 2009 Yishai A. Feldman
Donald Kraft
Tsvi Kuflik
Organization
General Chair
Tsvi Kuflik
Steering Committee
Opher Etzion
Avigdor Gal
Amihai Motro
Program Committee Chairs
Yishai A. Feldman
Donald Kraft
Program Committee
Nabil Adam
Hamideh Afsarmanesh
Mathias Bauer
Iris Berger
Dan Berry
Elisa Bertino
Gloria Bordogna
Patrick Bosc
Rebecca Cathey
Jen-Yao Chung
Alessandro D'Atri
Asuman Dogac
Ophir Frieder
Mati Golani
Paolo Giorgini
Enrique Herrera-Viedma
David Konopnicki
Manolis Koubarakis
Maria Jose Martin-Bautista
Amnon Meisels
Naftaly Minsky
George Papadopoulos
Gabriella Pasi
Mor Peleg
Haggai Roitman
Doron Rotem
Steve Schach
Pnina Soffer
Bracha Shapira
Bernhard Thalheim
Eran Toch
Yair Wand
Ouri Wolfson
Amiram Yehudai
Additional Reviewers
Gunes Aluc
Joel Booth
Alessio Maria Braccini
Mariangela Contenti
Ekatarina Ermilova
Gokce Banu Laleci Erturkmen
Stefania Marrara
Simon Samwel Msanjila
Cagdas Ocalan
Aabhas Paliwal
Andrea Resca
Basit Shafiq
Michal Shmueli-Scheuer
Venkatakumar Srinivasan
Fulya Tuncer
Stefano Za
Local Arrangements
Nilly Schnapp
Website Manager
Eugeny Myunster
Table of Contents
Keynote Lectures
Searching in the “Real World”
Ophir Frieder
Structured Data on the Web
Alon Y. Halevy
1 Middleware and Integration
Worldwide Accessibility to Yizkor Books
Rebecca Cathey, Jason Soo, Ophir Frieder, Michlean Amir, and Gideon Frieder
Biomedical Information Integration Middleware for Clinical Genomics
Simona Rabinovici-Cohen
2 Modeling
Interpretation of History Pseudostates in Orthogonal States of UML State Machines
Anna Derezińska and Romuald Pilitowski
System Grokking – A Novel Approach for Software Understanding, Validation, and Evolution
Maayan Goldstein and Dany Moshkovich
Refactoring of Statecharts
Moria Abadi and Yishai A. Feldman
3 Healthcare/Biomedical
Towards Health 2.0: Mashups to the Rescue
Ohad Greenshpan, Ksenya Kveler, Boaz Carmeli, Haim Nelken, and Pnina Vortman
Semantic Warehousing of Diverse Biomedical Information
Stefano Bianchi, Anna Burla, Costanza Conti, Ariel Farkash, Carmel Kent, Yonatan Maman, and Amnon Shabo
InEDvance: Advanced IT in Support of Emergency Department Management
Segev Wasserkrug, Ohad Greenshpan, Yariv N. Marmor, Boaz Carmeli, Pnina Vortman, Fuad Basis, Dagan Schwartz, and Avishai Mandelbaum
4 Service and Information Management
Enhancing Text Readability in Damaged Documents
Gideon Frieder
ITRA under Partitions
Aviv Dagan and Eliezer Dekel
Short and Informal Documents: A Probabilistic Model for Description Enrichment
Yuval Merhav and Ophir Frieder
5 Applications
Towards a Pan-European Learning Resource Exchange Infrastructure
David Massart
Performance Improvement of Fault Tolerant CORBA Based Intelligent Transportation Systems (ITS) with an Autonomous Agent
Woonsuk Suh, Soo Young Lee, and Eunseok Lee
A Platform for Life Event Development in a eGovernment Environment: The PLEDGE Project
Luis Álvarez Sabucedo, Luis Anido Rifón, and Ruben Míguez Pérez
Online Group Deliberation for the Elicitation of Shared Values to Underpin Decision Making
Faezeh Afshar, Andrew Stranieri, and John Yearwood
Author Index
Searching in the “Real World”
(Abstract of Invited Plenary Talk)
Ophir Frieder
Information Retrieval Laboratory
Department of Computer Science
Illinois Institute of Technology
[email protected]
For many, "searching" is considered a mostly solved problem. In fact, for text process-
ing, this belief is factually based. The problem is that most "real world" search appli-
cations involve "complex documents", and such applications are far from solved.
Complex documents, or less formally, "real world documents", comprise a mixture
of images, text, signatures, tables, logos, watermarks, stamps, etc., and are often available
only in scanned hardcopy formats. Search systems for such document collections
are currently unavailable.
We describe our efforts at building a complex document information processing
(CDIP) prototype. This prototype integrates "point solution" (mature) technologies,
such as OCR capability, signature matching and handwritten word spotting tech-
niques, search and mining approaches, among others, to yield a system capable of
searching "real world documents". The described prototype demonstrates the adage
that "the whole is greater than the sum of its parts".
To evaluate our CDIP prototype as well as to provide an evaluation platform for
future CDIP systems, we also introduced a complex document benchmark. This
benchmark is currently in use by the National Institute of Standards and Technology
(NIST) Text REtrieval Conference (TREC) Legal Track. The details of our complex
document benchmark are similarly presented.
Having described the global approach, we describe some additional point solutions
developed in the IIT Information Retrieval Laboratory. These include an Arabic stem-
mer and a natural language source integration fabric called the Intranet Mediator. In
terms of stemming, we developed and commercially licensed an Arabic stemmer and
search system. Our approach was evaluated using benchmark Arabic collections and
favorably compared against the state of the art. We also focused on source integration
and ease of user interaction. By integrating structured and unstructured sources, we de-
signed, implemented, and commercially licensed our mediator technology that provides a
single, natural language interface to querying distributed sources. Rather than providing
a set of links as possible answers, the described approach actually answers the posed
question. Both the Arabic stemmer and the mediator efforts are likewise discussed.
A summary of the efforts discussed is found in [1].
Reference
1. Frieder, O.: On Searching in the ‘Real World’. In: Argamon, S., Howard, N. (eds.)
Computational Methods for Counterterrorism, ch. 1. Springer, Heidelberg (2009)
ISBN: 978-3-642-01140-5
Y.A. Feldman, D. Kraft, and T. Kuflik (Eds.): NGITS 2009, LNCS 5831, p. 1, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Structured Data on the Web
Alon Y. Halevy
Google Inc.,
1600 Amphitheatre Parkway,
Mountain View, California, 94043,
USA
[email protected]
Abstract of Plenary Talk
Though search on the World-Wide Web has focused mostly on unstructured text,
there is an increasing amount of structured data on the Web and growing interest in
harnessing such data. I will describe several current projects at Google whose overall
goal is to leverage structured data and better expose it to our users.
The first project is on crawling the deep web. The deep web refers to content that
resides in databases behind forms, but is unreachable by search engines because there
are no links to these pages. I will describe a system that surfaces pages from the deep
web by guessing queries to submit to these forms, and entering the results into the
Google index [1]. The pages that we generated using this system come from millions
of forms, hundreds of domains and over 40 languages. Pages from the deep web are
served in the top-10 results on google.com for over 1000 queries per second.
The second project considers the collection of HTML tables on the web. The WebTables
Project [2] built a corpus of over 150 million tables from HTML tables on the
Web. The WebTables System addresses the challenges of extracting these tables from
the Web, and offers search over this collection of tables. The project also illustrates the
potential of leveraging the collection of schemas of these tables.
Finally, I’ll discuss current work on computing aspects of queries in order to better
organize search results for exploratory queries.
Keywords: Deep web, structured data, heterogeneous databases, data integration.
References
1. Madhavan, J., Ko, D., Kot, L., Ganapathy, V., Rasmussen, A., Halevy, A.: Google’s
deep-web crawl. In: Proc. of VLDB, pp. 1241–1252 (2008)
2. Cafarella, M.J., Halevy, A., Zhang, Y., Wang, D.Z., Wu, E.: WebTables: Exploring
the Power of Tables on the Web. In: VLDB (2008)