Optimal Resource Allocation
in Adaptive Survey Designs
Calinescu Melania, 1983 -
Optimal Resource Allocation in Adaptive Survey Designs
ISBN 978-90-820349-1-2
© I. M. Calinescu, Amsterdam 2013
All rights reserved. No part of this publication may be reproduced in any form or by
any electronic or mechanical means (including photocopying, recording or information
storage and retrieval systems) without permission in writing from the author.
Cover design by Jakub Pečànka and Melania Calinescu. The front cover visualizes
world internet connectivity using data from 2009 (sources: www.nationmaster.com
and CIA’s World Factbook). Each country is depicted as a circle with radius given
by a logarithmic transformation of the country’s population. In each pie chart the
green colour indicates the percentage of the country’s population that has access to
internet.
Printed by GVO Drukkers & Vormgevers B.V. | Ponsen & Looijen, Ede
VRIJE UNIVERSITEIT
Optimal Resource Allocation
in Adaptive Survey Designs
ACADEMIC DISSERTATION
to obtain the degree of Doctor at
the Vrije Universiteit Amsterdam,
by authority of the rector magnificus
prof.dr. F.A. van der Duyn Schouten,
to be defended in public
before the doctoral committee
of the Faculty of Exact Sciences
on Wednesday 13 November 2013 at 9.45
in the aula of the university,
De Boelelaan 1105
by
Ionela Melania Calinescu
born in Cimpulung, Romania
supervisor (promotor): prof.dr. G.M. Koole
co-supervisors (copromotoren): dr. S. Bhulai
dr.ir. J.G. Schouten
Acknowledgements
Obtaining a PhD degree is a long journey full of tough and sweet moments that
requires curiosity and enthusiasm for discovery, but frustration and desperation are
rarely far away. On this journey, many people have helped me get through the tougher
moments, but also shared with me the satisfaction of a job well done. The time has
come for me to thank them for all the support I received.
My deepest gratitude goes to my supervisors, Sandjai Bhulai and Barry Schouten.
Sandjai, if it was not for your encouragement to take on this project I would not be
in this position today. There are many things I value about working with you. I
admire your constant enthusiasm and talent for devising elegant solutions to complex
problems. It was essential to my success that I could always count on you to answer my
questions, whether they regarded research, administration or personal development.
I very much appreciated our weekly meetings, which made me work hard to raise
interesting questions and even harder to find suitable answers. I hope there will be
many occasions in the future to continue our discussions.
Barry, I have always appreciated and felt inspired by your patience and dedication
to your work. Your care for detail taught me how to be more precise while your diverse
perspectives on the project helped our research grow at a very fast pace. I have learned
a lot about survey methodology during our regular meetings “op de gang” as well as
the “do’s” and “don’ts” in survey practice during the surprisingly adventurous trips
to Heerlen. Your constant interest in promoting our results enabled me to become a
contributor to a Wiley Series book and meet some of the most influential researchers
in the field.
To my promotor, Ger Koole, thank you for giving me the opportunity to join the
OBP group and for the inspiring brainstorming sessions in the Alps.
I also want to thank the reading committee members, James Wagner, Annemieke
Luiten, Mathisca de Gunst, Bert Zwart and Rommert Dekker for their careful reading
of my thesis, interesting feedback and for dealing with the flood of emails about my
defence date.
Additional thanks go to my research group at VU for the great time we had together.
Thank you Alex for keeping a critical eye on my Dutch emails and assembling
a fluent samenvatting from the chaotic pieces I sent you. To the one-day-a-week
researchers, former and current, thank you for making Thursday the liveliest day of the
week. Masha, Demeter, thank you for fun pancake evenings. Alwin, thank you for
the pleasant three years in R-550 and lovely time together with Sylvia.
To the statistics group at VU, thank you for the joyful lunches and coffee breaks,
alas far too rarely enriched with cakes. I believe I owe at least one piece of cake to
many of you, and I promise to pay my dues at my defence party. Beata, thank you
for lovely mid-afternoon chats and for helping me find my passion by taking me to
the best dance class ever.
To my colleagues at Centraal Bureau voor de Statistiek, thank you for making
my time as a PhD student sometimes seem like a serious office job and for trying
to teach me a little Dutch as well. Nino, you brightened up my cloudiest days; I
am truly grateful for having had you there. Henk, Fatima, thank you both for the
many humorous moments in the office. Jan, Sander, thank you for solving my CBS-
beginner questions. Marriette, Martijn, thank you for clarifying some of the many
data collection mysteries.
To Targol, you have known my goods and bads for the longest, thank you for being
my best friend all this time. To my Romanian friends, thank you for making home
seem just a step away, especially on December 1st. To my dear friends from home,
thank you for having me over and finding time on short notice to dine and chat. To
my Dutch friends, Laurens, Jeanine, Max, Ivar, thank you for great parties, patient
Dutch lessons and tasty dinners. Thank you Jan, Jarda, Tomáš, Nikke, Fulvio, Raja
for fun trips and dinners together.
To Jakub, with whom I have shared my past six years, thank you for your constant
and creative help, for stirring my curiosity to learn many new things and for a spicy
combination of happy moments and tough life lessons. Last but surely not least, to
my family, Sister, Mom and Dad, thank you for your unconditional love and support
and for the goodies packages that so often brought a little bit of home all the way to
Amsterdam!
Melania Calinescu
September 2013
Contents
1 Introduction
1.1 Surveys in practice
1.2 Contributions of the thesis
1.3 Overview of the thesis
2 Adaptive survey designs and Markov decision theory: an introduction
2.1 Resource allocation problems
2.2 Markov decision theory
2.3 A brief introduction to sample surveys
2.4 Adaptive survey designs
3 The survey resource allocation problem
3.1 Problem formulation
3.2 Adaptive survey design policies
3.3 Budget and capacity constraints
3.4 Numerical examples
3.5 Concluding remarks
4 The survey resource allocation problem for multiple quality indicators
4.1 Problem formulation
4.2 The two-step algorithm
4.3 Numerical examples
4.4 Conclusions
5 The survey resource allocation problem and measurement errors
5.1 Measurement errors in surveys: an introduction
5.2 Problem formulation
5.3 Problem solving technique
5.4 Case study: the Dutch Labor Force Survey
5.5 Concluding remarks
5.6 Appendix: additional optimization results
6 Adaptive survey designs to minimize survey mode effects
6.1 Survey mode effects: an introduction
6.2 Problem formulation
6.3 Algorithm
6.4 Case study: the Dutch Labor Force Survey
6.5 Discussion
7 Dynamic learning in adaptive survey designs
7.1 Literature review multi-armed bandit problems
7.2 Problem formulation
7.3 Solving the budgeted MAB via dynamic programming
7.4 Simulation results
7.5 Discussion
8 Future research directions
Bibliography
Summary
Samenvatting
1 Introduction
How did the recent global financial crisis change the world’s economic landscape?
What are the repercussions of current economic and political policies on the future
societal development, as quantified through indicators such as the unemployment rate,
average household income, consumer confidence index? To answer such questions,
policy makers have to collect information from the population and summarize it in
a meaningful way. This is where survey organizations and statistical bureaus play
a crucial role. Collecting information from the entire population requires significant
amounts of time and money. Alternatively, a sample survey may be conducted, where
only a sample from the specified population is requested to provide information. Using
the results from the survey sample, knowledge can be obtained about the population
of interest.
1.1 Surveys in practice
Surveys are used all around the world to measure the socio-economic status and well-being
of people, to test theories, and to make policy decisions. However, the different statistics
computed from the survey data are of interest only if they accurately describe the
corresponding population attribute. Multiple factors play a role throughout the course
of a survey, from its planning to the final systematization of the results. Some factors
may disrupt the framework of statistical inference theory and sampling theory, which
provides methods to describe a population’s characteristics accurately (enough) given the
survey sample results. Such factors are people’s lack of understanding (or interest) as
to why they have been selected to participate in a survey, their attitude towards areas
such as privacy and confidentiality of personal information, and the influence exerted
by the attributes of the survey design on their decision to participate in the survey.
Addressing these factors and related social science questions is an integral part of
survey research. Tremendous effort has been invested into understanding how human
behavior and thought may impact the precision and accuracy of survey statistics and
how the effects may be reduced or adjusted (see overviews in Groves et al. 2002,
Lepkowski et al. 2007 and Bethlehem et al. 2011).
[Figure 1.1: Level of nonresponse bias for various nonresponse rates and nonresponse
means; respondent mean fixed at 0.50. Curves shown for nonresponse rates 0.05, 0.3
and 0.5. Horizontal axis: nonrespondent mean (0 to 1); vertical axis: nonrespondent
bias (-0.3 to 0.3).]
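The shape of the curves in Figure 1.1 can be reproduced from the definition of nonresponse bias as the deviation of the respondent mean from the full sample mean. The sketch below is illustrative and not taken from the thesis: the function name `nonresponse_bias` and the grid of nonrespondent means are assumptions, and it treats the full sample mean as the average of the respondent and nonrespondent means weighted by the response and nonresponse rates. Under that assumption the bias equals the nonresponse rate times the difference between the respondent and nonrespondent means.

```python
def nonresponse_bias(nonresponse_rate, respondent_mean, nonrespondent_mean):
    """Deviation of the respondent mean from the full sample mean.

    Assumption: the full sample mean is the weighted average of the
    respondent and nonrespondent means, with the nonresponse rate as the
    weight of the nonrespondents.
    """
    full_sample_mean = ((1.0 - nonresponse_rate) * respondent_mean
                        + nonresponse_rate * nonrespondent_mean)
    return respondent_mean - full_sample_mean


# Settings read from the Figure 1.1 caption and legend.
RESPONDENT_MEAN = 0.50
NONRESPONSE_RATES = (0.05, 0.3, 0.5)
# Hypothetical grid of nonrespondent means: 0.0, 0.1, ..., 1.0.
NONRESPONDENT_MEANS = [i / 10 for i in range(11)]


def bias_curves():
    """Return one list of bias values per nonresponse rate."""
    return {
        rate: [nonresponse_bias(rate, RESPONDENT_MEAN, mean)
               for mean in NONRESPONDENT_MEANS]
        for rate in NONRESPONSE_RATES
    }


if __name__ == "__main__":
    for rate, curve in bias_curves().items():
        print(f"rate={rate:.2f}: " + " ".join(f"{b:+.3f}" for b in curve))
```

With the respondent mean fixed at 0.50, the bias vanishes when nonrespondents have the same mean as respondents, and for any other nonrespondent mean its magnitude grows linearly with the nonresponse rate.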
In a perfect world, all sampled population units would be willing to participate
in the survey and provide all the requested data. In practical situations, however,
information from some sample units is missing due to factors such as those listed
above. This is called nonresponse and it is one of the most studied errors in the
survey literature. Classic inferential properties of sample estimates require these
statistics be computed from the entire sample. One example statistic is the sample
mean as an estimator of the population mean. In the presence of nonresponse the
sample mean is reduced to the respondent mean, i.e., the sample mean is obtained
based only on information coming from the pool of respondents. The deviation of
the respondent mean from the full sample mean is called nonresponse bias and it is a
function of the nonresponse rate (i.e., the proportion of nonrespondents in the entire
sample) and the difference between the respondent and nonrespondent means. Figure
1.1 (from Groves and Couper 1998) illustrates the consequences of nonresponse
rates on the precision of the survey estimates. Given the respondent mean of 0.50,