UNIVERSITY OF TRENTO
DOCTORAL THESIS
Bridging Sensor Data Streams and Human
Knowledge
Author: Mattia ZENI
Supervisor: Prof. Fausto GIUNCHIGLIA
A thesis submitted in fulfillment of the requirements
for the degree of Doctor of Philosophy
in the
Knowdive Group
Department of Information Engineering and Computer Science
November 2, 2017
Declaration of Authorship
I, Mattia ZENI, declare that this thesis titled, “Bridging Sensor Data Streams and
Human Knowledge” and the work presented in it are my own. I confirm that:
• This work was done wholly or mainly while in candidature for a research
degree at this University.
• Where any part of this thesis has previously been submitted for a degree or
any other qualification at this University or any other institution, this has been
clearly stated.
• Where I have consulted the published work of others, this is always clearly
attributed.
• Where I have quoted from the work of others, the source is always given. With
the exception of such quotations, this thesis is entirely my own work.
• I have acknowledged all main sources of help.
• Where the thesis is based on work done by myself jointly with others, I have
made clear exactly what was done by others and what I have contributed
myself.
Signed:
Date: November 2, 2017
“I think that’s the single best piece of advice: constantly think about how you could be doing
things better and questioning yourself.”
Elon Musk
Abstract
Bridging Sensor Data Streams and Human Knowledge
by Mattia ZENI
Generating useful knowledge out of personal big data in the form of sensor streams is
a difficult task that presents multiple challenges due to the intrinsic characteristics of
this type of data, namely its volume, velocity, variety and noisiness. This is a
well-known, long-standing problem in computer science called the Semantic
Gap Problem. It was originally defined in the research area of image processing as "...
the lack of coincidence between the information that one can extract from the visual
data and the interpretation that the same data have for a user in a given situation..."
[Smeulders et al., 2000]. In the context of this work, the lack of coincidence is
between low-level raw streaming sensor data collected by sensors in a machine-readable
format and higher-level semantic knowledge that can be generated from these data and
that only humans can understand thanks to their intelligence, habits and routines.
This thesis addresses the semantic gap problem in the context above, proposing
an interdisciplinary approach able to generate human-level knowledge from
streaming sensor data in open domains. It leverages two different research
fields: the collection, management and analysis of big data, and
semantic computing, focused on ontologies. These map respectively to the
two elements of the semantic gap mentioned above.
The contributions of this thesis are:
• The definition of a methodology based on the idea that the user and the world
surrounding them can be modeled, defining most of the elements of their context
as entities (locations, people and objects, among others, together with the
relations among them) along with the attributes of each. The modeling aspects
of this ontology are outside the scope of this work. Given such a structure,
the task of bridging the semantic gap is divided into many less complex,
modular and compositional micro-tasks, each of which consists of mapping the
streaming sensor data, using contextual information, to the attribute values of
the corresponding entities. In this way we can create a structure out of the
unstructured, noisy and highly variable sensor data that can then be used by the
machine to provide personalized, context-aware services to the end user;
• The definition of a reference architecture that applies the methodology above and
addresses the semantic gap problem in streaming sensor data;
• The instantiation of the architecture above in the Stream Base System (SB),
resulting in the implementation of its main components using state-of-the-art
software solutions and technologies;
• The adoption of the Stream Base System in four use cases with very different
objectives from one another, proving that it works in open domains.
Keywords: Big Data, Ubiquitous Computing, Pervasive Computing, Context
Aware Systems, Computational Humanism, Semantic Gap, Sensor Data, Knowledge
Acknowledgements
I feel that this section is the best opportunity I have to thank all of the people who
have helped me throughout my graduate career, and probably the last good op-
portunity to express my gratitude, in writing, to the many individuals who have
supported me in this long and perilous journey.
Firstly, I would like to express my sincere gratitude to my advisor Prof. Fausto
Giunchiglia for the continuous support of my Ph.D. study and related research, for
his patience, motivation, and immense knowledge. His guidance helped me
throughout the research and writing of this thesis. I could not have imagined having
a better advisor and mentor for my Ph.D. study. He also taught me many more
things than simply scholarly matters, which this section is far too short to list in
their entirety. In the end, we had fun together.
I would like to thank all the Knowdive members for lending their devices to test my
applications and, more importantly, for the stimulating discussions, for the sleepless
nights we spent working together before deadlines, and for all the fun we have had
in the last four years.
Another important thank-you goes to my dear colleague Enrico, who helped me
many times in different situations, from both an academic and a personal point
of view. You’re a good friend and we’ve been through a lot. I hope to continue
working with you and do great things together.
On a more personal level, I must thank my patient and understanding girlfriend
Valentina, who supported me from the beginning. She has not only accepted my
self-inflicted impoverishment but has fed and clothed me on occasion. Most importantly,
I have never had difficulty in leaving work in the office, as coming home to Valentina
is coming home to the most important girl in the world. Her support in my life
outside and inside academia has been and is invaluable and cannot be properly
expressed here in a few lines. She helped me in very difficult moments of my life and
I most likely would not be here without her.
Mattia Zeni
University of Trento
December 2017
The work compiled in this thesis has been partially supported by:
• the European Union’s Horizon 2020 (H2020) research and innovation programme
under grant agreement n. 732194, QROWD - Because Big Data Integration is Humanly
Possible
http://www.qrowd-project.eu/
• the European Union’s Seventh Framework Program (FP7) under grant agreement
600584, Smart Society - Hybrid and Diversity-aware Collective Adaptive Systems:
Where People Meet Machines to Build Smarter Societies
http://www.smart-society-project.eu/