Lecture Notes in Computer Science 6263
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Torben Bach Pedersen Mukesh K. Mohania
A Min Tjoa (Eds.)
Data Warehousing and
Knowledge Discovery
12th International Conference, DaWaK 2010
Bilbao, Spain, August/September 2010
Proceedings
Volume Editors
Torben Bach Pedersen
Aalborg University
Department of Computer Science
Selma Lagerløfs Vej 300
9220 Aalborg, Denmark
E-mail: [email protected]
Mukesh K. Mohania
IBM India Research Lab
4, Block C, Institutional Area, Vasant Kunj
New Delhi 110070, India
E-mail: [email protected]
A Min Tjoa
Vienna University of Technology
Institute of Software Technology and Interactive Systems
Favoritenstr. 9/188
1040 Wien, Austria
E-mail: [email protected]
Library of Congress Control Number: 2010931871
CR Subject Classification (1998): H.2, H.2.8, H.3, H.4, J.1, H.5
LNCS Sublibrary: SL 3 – Information Systems and Application, incl. Internet/Web and HCI
ISSN 0302-9743
ISBN-10 3-642-15104-3 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-15104-0 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Preface 
 
Data warehousing and knowledge discovery has been widely accepted as a key
technology for enterprises and organizations to improve their abilities in data analysis,
decision support, and the automatic extraction of knowledge from data. With the 
exponentially growing amount of information to be included in the decision-making 
process, the data to be considered become more and more complex in both structure 
and semantics. New developments such as cloud computing add to the challenges 
with massive scaling, a new computing infrastructure, and new types of data.  
Consequently, the process of retrieval and knowledge discovery from this huge 
amount of heterogeneous complex data forms the litmus test for research in the area. 
In the last decade, the International Conference on Data Warehousing and
Knowledge Discovery (DaWaK) has become one of the most important international
scientific events bringing together researchers, developers, and practitioners to discuss the
latest research issues and experiences in developing and deploying data warehousing 
and knowledge discovery systems, applications, and solutions.  
This year’s conference, the 12th International Conference on Data Warehousing 
and Knowledge Discovery (DaWaK 2010), continued the tradition by discussing and 
disseminating innovative principles, methods, algorithms, and solutions to
challenging problems faced in the development of data warehousing, knowledge discovery,
the emerging area of "cloud intelligence," and applications within these areas. In order 
to better reflect novel trends and the diversity of topics, the conference was organized 
in four tracks: Cloud Intelligence, Data Warehousing, Knowledge Discovery, and 
Industry and Applications.  
The papers presented at DaWaK 2010 covered a wide range of topics within cloud 
intelligence, data warehousing, knowledge discovery, and applications. The topics 
included data warehouse modeling, spatial data warehouses, mining social networks 
and graphs, physical data warehouse design, dependency mining, business
intelligence and analytics, outlier and image mining, pattern mining, and data cleaning and
variable selection.   
It was encouraging to see that many papers covered emerging important issues 
such as social network data, spatio-temporal data, streaming data, non-standard
pattern types, complex analytical functionality, multimedia data, as well as real-world
applications. The wide range of topics bears witness to the fact that the data
warehousing and knowledge discovery field is dynamically responding to the new
challenges posed by novel types of data and applications.
From 112 submitted abstracts, we received 89 papers from 16 countries in Europe, 
North and South America, Asia, Africa, and Oceania. The Program Committee finally 
selected 26 papers, yielding an acceptance rate of 29%.  
We would like to express our most sincere gratitude to the members of the
Program Committee and the external reviewers, who made a huge effort to review the
papers in a timely and thorough manner. Due to the tight timing constraints and the 
high number of submissions, the reviewing and discussion process was a very
challenging task, but the commitment of the reviewers ensured that a very satisfactory
result was achieved. We would like to thank Alfredo Cuzzocrea for his tireless
contributions as Track Chair and Publicity Chair. We would also like to thank all 
authors who submitted papers to DaWaK 2010, for their contribution to making the 
technical program so excellent.  
Finally, we send our warmest thanks to Gabriela Wagner for delivering an
outstanding level of support within all aspects of the practical organization of DaWaK
2010. We also thank Amin Anjomshoaa for his support with the conference
management software.
 
 
August 2010  Torben Bach Pedersen 
  Mukesh Mohania 
A Min Tjoa
Organization 
Program Chairs 
Torben Bach Pedersen  Aalborg University, Denmark 
Mukesh Mohania  IBM India Research Lab, India 
A Min Tjoa  Vienna University of Technology, Austria 
Publicity Chair 
Alfredo Cuzzocrea  ICAR-CNR & University of Calabria, Italy
Program Committee 
Alberto Abello  Universitat Politecnica de Catalunya, Spain 
Ira Assent  Aalborg University, Denmark 
Elena Baralis  Politecnico di Torino, Italy 
Ladjel Bellatreche  ENSMA, France 
Petr Berka  University of Economics, Prague, Czech Republic 
Jorge Bernardino  ISEC - Instituto Superior de Engenharia de Coimbra, 
Portugal 
Mokrane Bouzeghoub  CNRS - Université de Versailles SQY, France 
Stephane Bressan  National University of Singapore, Singapore 
Peter Brezany  University of Vienna, Austria 
Robert Bruckner  Microsoft, USA 
Jesús Cerquides  Universitat de Barcelona, Spain 
Zhiyuan Chen  University of Maryland Baltimore County, USA 
Sunil Choenni  The Netherlands Ministry of Justice, The Netherlands 
Frans Coenen  University of Liverpool, UK 
Bruno Cremilleux  Université de Caen, France 
Alfredo Cuzzocrea  ICAR-CNR & University of Calabria, Italy  
Agnieszka Dardzinska  Bialystok University of Technology, Poland
Karen Davis  University of Cincinnati, USA 
Kevin Desouza  University of Washington, USA 
Curtis Dyreson  Utah State University, USA 
Todd Eavis  Concordia University, Canada 
Johann Eder  University of Klagenfurt, Austria 
Tapio Elomaa  Tampere University of Technology, Finland 
Roberto Esposito  Università di Torino, Italy 
Vladimir Estivill-Castro  Griffith University, Australia 
Christie Ezeife  School of Computer Science, University of Windsor, 
Ontario, Canada 
Jianping Fan  UNC-Charlotte, USA 
Ling Feng  Tsinghua University
Eduardo Fernandez Medina  University of Castilla-La Mancha, Spain 
Dragan Gamberger  Ruder Boskovic Institute, Croatia 
Gyözö Gidófalvi  Royal Institute of Technology (KTH), Sweden 
Matteo Golfarelli  University of Bologna, Italy 
Eui-Hong (Sam) Han  Sears Holdings Corporation, USA 
Wook-Shin Han  Kyungpook National University, Korea 
Jaakko Hollmén  Aalto University School of Science and Technology, 
Finland 
Jimmy Huang  York University, Canada 
Farookh Hussain  Curtin University of Technology, Australia 
Ryutaro Ichise  National Institute of Informatics, Japan 
Mizuho Iwaihara  Waseda University, Japan 
Murat Kantarcioglu  University of Texas at Dallas, USA 
Jinho Kim  Kangwon National University, Korea 
Sang-Wook Kim  Hanyang University, Korea 
Jörg Kindermann  Fraunhofer Institute IAIS, Germany 
Jens Lechtenboerger  Westfälische Wilhelms-Universität Münster,  
Germany 
Wolfgang Lehner  Dresden University of Technology, Germany 
Sanjay Kumar Madria  University of Missouri-Rolla, USA 
Anirban Mondal  University of Tokyo, Japan 
Jose-Norberto Mazón  University of Alicante, Spain 
Ullas Nambiar  IBM Research, India 
Jian Pei  Simon Fraser University, Canada 
Evaggelia Pitoura  University of Ioannina, Greece 
Stefano Rizzi  University of Bologna, Italy 
Alkis Simitsis  HP Labs 
Koichi Takeda  Tokyo Research Laboratory, IBM Research, Japan 
Dimitri Theodoratos  New Jersey Institute of Technology, USA 
Christian Thomsen  Aalborg University, Denmark 
Juan-Carlos Trujillo Mondéjar  University of Alicante, Spain
Vincent Shin-Mu Tseng  National Cheng Kung University, Taiwan 
Panos Vassiliadis  University of Ioannina, Greece 
Wolfram Woess  University of Linz, Austria 
Robert Wrembel  Poznan University of Technology, Poland 
Man Lung Yiu  Hong Kong Polytechnic University, Hong Kong 
Qiankun Zhao  Telefonica, Spain 
Xiaofang Zhou  University of Queensland, Australia 
Esteban Zimányi  Université Libre de Bruxelles, Belgium 
External Reviewers 
Timo Aho  Rajkumar Bondugula 
Annalisa Appice  Panos Bouros 
Ryan Bissell-Siders  Giulia Bruno
Peggy Cellier  Christian Koncilia  
Tania Cerquitelli  Jussi Kujala 
Eugenio Cesario  Stefano Lodi 
Fabio Fassetti  Jose-Norberto Mazón 
Christina Feilmayr  Francois Rioult 
Alessandro Fiori  Paolo Serafino 
Paolo Garza  Jose Jacobo Zubcoff 
Teemu Heinimäki
Table of Contents
Data Warehouse Modeling and Spatial Data Warehouses
Logic Programming for Data Warehouse Conceptual Schema Validation
Carlo dell’Aquila, Francesco Di Tria, Ezio Lefons, and Filippo Tangorra
A Model-Driven Heuristic Approach for Detecting Multidimensional Facts in Relational Data Sources
Andrea Carmè, Jose-Norberto Mazón, and Stefano Rizzi
Physical Design and Implementation of Spatial Data Warehouses Supporting Continuous Fields
Leticia Gómez, Alejandro Vaisman, and Esteban Zimányi
Benchmarking Spatial Data Warehouses
Thiago Luís Lopes Siqueira, Ricardo Rodrigues Ciferri, Valéria Cesário Times, and Cristina Dutra de Aguiar Ciferri
Mining Social Networks and Graphs
Discovering Community-Oriented Roles of Nodes in a Social Network
Bin-Hui Chou and Einoshin Suzuki
A Graph-Based Clustering Scheme for Identifying Related Tags in Folksonomies
Symeon Papadopoulos, Yiannis Kompatsiaris, and Athena Vakali
Frequent Sub-graph Mining on Edge Weighted Graphs
Chuntao Jiang, Frans Coenen, and Michele Zito
Physical Data Warehouse Design
F&A: A Methodology for Effectively and Efficiently Designing Parallel Relational Data Warehouses on Heterogenous Database Clusters
Ladjel Bellatreche, Alfredo Cuzzocrea, and Soumia Benkrid
Yet Another Algorithms for Selecting Bitmap Join Indexes
Ladjel Bellatreche and Kamel Boukhalfa
Speeding Up Queries in Column Stores: A Case for Compression
Christian Lemke, Kai-Uwe Sattler, Franz Faerber, and Alexander Zeier
Dependency Mining
Mining Non-redundant Information-Theoretic Dependencies between Itemsets
Michael Mampaey
Discovery and Application of Functional Dependencies in Conjunctive Query Mining
Bart Goethals, Dominique Laurent, and Wim Le Page
Using Transitivity to Increase the Accuracy of Sample-Based Pearson Correlation Coefficients
Taylor Phillips, Chris GauthierDickey, and Ramki Thurimella
Business Intelligence and Analytics
The NOX Framework: Native Language Queries for Business Intelligence Applications
Todd Eavis, Hiba Tabbara, and Ahmad Taleb
Experience in Extending Query Engine for Continuous Analytics
Qiming Chen and Meichun Hsu
Development of a Business Intelligence Environment for e-Gov Using Open Source Technologies
Eduardo Zanoni Marques, Rodrigo Sanches Miani, Everton Luiz de Almeida Gago Júnior, and Leonardo de Souza Mendes
Outlier and Image Mining
A Fast Randomized Method for Local Density-Based Outlier Detection in High Dimensional Data
Minh Quoc Nguyen, Edward Omiecinski, Leo Mark, and Danesh Irani
Specialty Mining
Hanuma Kumar, Rohit Paravastu, and Vikram Pudi
Region of Interest Based Image Categorization
Ashraf Elsayed, Frans Coenen, Marta García-Fiñana, and Vanessa Sluming
Pattern Mining
Meta-learning for Post-processing of Association Rules
Petr Berka and Jan Rauch