Studies in Computational Intelligence 487
Cai-Nicolas Ziegler
Social Web Artifacts for
Boosting Recommenders
Theory and Implementation
Studies in Computational Intelligence
Volume 487
Series Editor
J. Kacprzyk, Warsaw, Poland
For further volumes:
http://www.springer.com/series/7092
PD Dr. Cai-Nicolas Ziegler
PAYBACK GmbH (American Express)
Albert-Ludwigs-Universität Freiburg i.Br.
München
Germany
ISSN 1860-949X    ISSN 1860-9503 (electronic)
ISBN 978-3-319-00526-3    ISBN 978-3-319-00527-0 (eBook)
DOI 10.1007/978-3-319-00527-0
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013937342
© Springer International Publishing Switzerland 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any
errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect
to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Foreword
I first met Dr. Ziegler when he was a Ph.D. student spending a few months visiting
our GroupLens Research lab in Minnesota. From the first I could tell he was a
researcher of unusual vision, not content to work within the bounds of the previous
literature on recommenders, but wanting to understand how the early recommender
tools could be reshaped to meet needs that their users didn’t even imagine they had
yet. He was particularly interested in understanding the fit between recommenders
as these magical devices that were expected to surprise and delight their users, and
the users’ real information needs across a variety of interests.
Dr. Ziegler’s time with GroupLens was fruitful, producing an excellent paper that
characterized the problem of topic diversification. The key insight in this work is that
even once a recommender algorithm has zoomed in accurately on a user’s interests,
providing a set of results composed of the items that are individually predicted to
be the most interesting may lead to a bored user. For instance, a user who loves
Star Trek movies might like to have one new Star Trek movie recommended, but
will certainly be disappointed to have a list containing only the ten most recent Star
Trek movies. Paradoxically, the recommender must recommend items that are less
preferred in order to produce a list that is more preferred. Dr. Ziegler’s paper defined
this problem carefully, gave a firm mathematical foundation rich enough to support
a variety of approaches to diversification, and demonstrated that in practice users
prefer the more diverse lists.
This work is therefore characteristic of Dr. Ziegler. He found a core problem that
was poorly understood, gave it a strong foundation, and helped the community see
its importance. His paper on this topic is highly cited in the research
literature to this day.
This book contains a rich set of examples of this research approach in practice, in
several key domains. In addition to a framing of the recommendation problem, there
are three deep contributions of the book to a richer understanding of
recommendation: topic diversification (which we have already discussed), taxonomy-driven
filtering, and trust models.
The key idea behind taxonomy-driven filtering is that users often have different
levels of interest for different parts of the taxonomy of an information space. For
instance, one user who works with the Java programming language may be
particularly interested in new work on the type system, while another user may be
most interested in Java-based Web containers. Recommenders that are aware of the
possibility of these differences can gain predictive power. In the early days of
recommender systems taxonomies such as these were not available to use, even if the
algorithmic tools had been available. A key contribution of the book is to
demonstrate that today there are two rich sources of such information. First, several Social
Web projects are creating large, open information taxonomies, such as the category
hierarchy in Wikipedia. Second, powerful methods of text processing enable the
automatic extraction of taxonomies from textual information spaces. Looking to the
future, we can predict that such tools will soon work for music, photos, and even
movies. Mining and making use of these taxonomies opens the potential for
powerful new approaches to recommendation.
The key idea behind trust models is that humans have notions of trust that are not
always compatible with the recommendations from a “black box” recommender.
Exposing how the recommender model works, and, crucially, exposing which other
humans have contributed to a set of recommendations can have a big influence on
how much the recipient of the recommendations trusts them. This book explores trust
models based on homophily between members of a recommender community. Over
the long term we can expect trust models like these that cross communities, that
can be manipulated by the end user (“no, I don’t trust that guy!”), and that provide
explanations for why a recommendation can be trusted (“your friends Alice and
Beth both liked this quadricopter, so you probably will too”).
Dr. Ziegler is a visionary scientist, and this book demonstrates his keen insight into
new approaches to thinking about recommendation that are now being explored by
hundreds of other scientists worldwide. In reading this book you will engage with
important problems in recommendation, and will see how thinking deeply about
user needs leads to fresh insights into technological possibilities.
Minneapolis, USA    John Riedl
March 2013
Preface
Recommender systems, those software programs that learn from human behavior
and make predictions of what products or services we are expected to appreciate
and thus purchase, have become an integral part of our everyday life. They
proliferate across electronic commerce around the globe. Take, for instance, Amazon.com,
the first commercially successful and prominent example of such systems, making
use of a broad range of recommender system types: The company is said to have
experienced significant double-digit growth in sales solely through personalization,
thus representing an impressive uplift. Numerous systems have followed and today,
recommender systems exist for virtually all sorts of consumable goods, e.g., books,
movies, music, and even jokes.
At the same time, a new evolution on the Web has started to take shape, known as
“participation age”, “collective wisdom”, and – most widely used today – “Web 2.0”
or “Social Web”: Consumer-generated media and content has become rife, social
networks have emerged and are pulling significant shares of the overall Web traffic.
In line with these developments, novel information and knowledge structures have
become readily available on the Web: people’s personal ties and trust links, and large
human-crafted taxonomies for organizing and categorizing all kinds of items. One
example is the massive DMOZ Open Directory Project, which has taken on the challenge
of categorizing the entire Web with its classification system.
This textbook presents approaches to exploit the new Social Web fountain of
knowledge, zeroing in first and foremost on two information artifacts, namely
classification taxonomies and trust networks. These two are used to improve the
performance of product-focused recommender systems: While classification taxonomies
are appropriate means to fight the sparsity problem prevalent in many productive
recommender systems, interpersonal trust ties – when used as proxies for interest
similarity – are able to mitigate the recommenders’ scalability problem.
While maintaining the principal focus of improving product recommender
systems through taxonomies and trust, several digressions from this main theme are
included, such as the use of Web 2.0 taxonomies for computing the semantic
proximity of named entity pairs, or the recommending of technology synergies based
on Wikipedia and our semantic proximity framework. These slight digressions,
however, make the book even more valuable by adding perspectives of what else
can be achieved with those precious instruments of knowledge that can be created
from the Web 2.0’s overtly accessible raw material of data and information.
München, Germany    Cai-Nicolas Ziegler
March 2013
Acknowledgements
Most of the research presented in this textbook has been conducted during my Ph.D.
period at the Albert-Ludwigs-Universität Freiburg i.Br., Germany, as well as
GroupLens Research at the University of Minnesota, USA.
Above all, I would like to thank Prof. Dr. Georg Lausen, my supervisor at DBIS,
the Institute of Databases and Information Systems in Freiburg. He has been my
mentor throughout my Ph.D. period, and has continued to be so ever since. I owe
him a lot and value him not only for his work, but also for the person he is.
I would also like to thank my second supervisor, Prof. Dr. Joseph A. Konstan, as
well as Prof. Dr. John Riedl, both from the GroupLens Research lab in Minneapolis.
These two subject matter experts have provided fresh new input from a different,
more HCI-focused perspective, which makes this book even more valuable to the
reader.
A big thanks also goes to Prof. Dr. Dr. Lars Schmidt-Thieme, who has introduced
me to methods of quantifying the performance of recommender systems in offline
experiments. It is now many years ago that I first came to his office in order to
discuss collaborative filtering, and that I came out of it with a wealth of new insights
and knowledge.
My gratitude is expressed also to the researchers who helped me along the way,
particularly Dr. Paolo Massa, Zvi Topol, Ernesto Díaz-Avilés, Prof. Dr. Jennifer
Golbeck, Dr. Sean M. McNee, Prof. Dr. Dan Cosley, Dr. Maximilian Viermetz, and
Dr. Stefan Jung. It goes on to Ron Hornbaker and Erik Benson, who maintain the All
Consuming and BookCrossing communities, respectively. They provided the
community data that made the online user studies possible.
Having covered the research side of contributions, I now turn to the more
emotional ones, namely my family: my parents Klaus and Angelika, who have always
been there for me, as well as my “little” brother Chris, and for sure my beloved
wife Miriam, the best that ever happened to me in my life.
To my parents, and Chris, my “little” brother.