INTRODUCTORY
STATISTICS AND
ANALYTICS
A Resampling Perspective
PETER C. BRUCE
Institute for Statistics Education
Statistics.com
Arlington, VA
Copyright © 2015 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under
Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the
Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at
www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions
Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201)
748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or completeness of
the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a
particular purpose. No warranty may be created or extended by sales representatives or written sales materials.
The advice and strategies contained herein may not be suitable for your situation. You should consult with a
professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other
commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer
Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax
(317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN: 978-1-118-88135-4
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
CONTENTS
Preface
Acknowledgments
Introduction
1 Designing and Carrying Out a Statistical Study
1.1 A Small Example
1.2 Is Chance Responsible? The Foundation of Hypothesis Testing
1.3 A Major Example
1.4 Designing an Experiment
1.5 What to Measure—Central Location
1.6 What to Measure—Variability
1.7 What to Measure—Distance (Nearness)
1.8 Test Statistic
1.9 The Data
1.10 Variables and Their Flavors
1.11 Examining and Displaying the Data
1.12 Are we Sure we Made a Difference?
Appendix: Historical Note
1.13 Exercises
2 Statistical Inference
2.1 Repeating the Experiment
2.2 How Many Reshuffles?
2.3 How Odd is Odd?
2.4 Statistical and Practical Significance
2.5 When to use Hypothesis Tests
2.6 Exercises
3 Displaying and Exploring Data
3.1 Bar Charts
3.2 Pie Charts
3.3 Misuse of Graphs
3.4 Indexing
3.5 Exercises
4 Probability
4.1 Mendel’s Peas
4.2 Simple Probability
4.3 Random Variables and their Probability Distributions
4.4 The Normal Distribution
4.5 Exercises
5 Relationship between Two Categorical Variables
5.1 Two-Way Tables
5.2 Comparing Proportions
5.3 More Probability
5.4 From Conditional Probabilities to Bayesian Estimates
5.5 Independence
5.6 Exploratory Data Analysis (EDA)
5.7 Exercises
6 Surveys and Sampling
6.1 Simple Random Samples
6.2 Margin of Error: Sampling Distribution for a Proportion
6.3 Sampling Distribution for a Mean
6.4 A Shortcut—the Bootstrap
6.5 Beyond Simple Random Sampling
6.6 Absolute Versus Relative Sample Size
6.7 Exercises
7 Confidence Intervals
7.1 Point Estimates
7.2 Interval Estimates (Confidence Intervals)
7.3 Confidence Interval for a Mean
7.4 Formula-Based Counterparts to the Bootstrap
7.5 Standard Error
7.6 Confidence Intervals for a Single Proportion
7.7 Confidence Interval for a Difference in Means
7.8 Confidence Interval for a Difference in Proportions
7.9 Recapping
Appendix A: More on the Bootstrap
Resampling Procedure—Parametric Bootstrap
Formulas and the Parametric Bootstrap
Appendix B: Alternative Populations
Appendix C: Binomial Formula Procedure
7.10 Exercises
8 Hypothesis Tests
8.1 Review of Terminology
8.2 A–B Tests: The Two Sample Comparison
8.3 Comparing Two Means
8.4 Comparing Two Proportions
8.5 Formula-Based Alternative—t-Test for Means
8.6 The Null and Alternative Hypotheses
8.7 Paired Comparisons
Appendix A: Confidence Intervals Versus Hypothesis Tests
Confidence Interval
Relationship Between the Hypothesis Test and the Confidence Interval
Comment
Appendix B: Formula-Based Variations of Two-Sample Tests
Z-Test With Known Population Variance
Pooled Versus Separate Variances
Formula-Based Alternative: Z-Test for Proportions
8.8 Exercises
9 Hypothesis Testing—2
9.1 A Single Proportion
9.2 A Single Mean
9.3 More Than Two Categories or Samples
9.4 Continuous Data
9.5 Goodness-of-Fit
Appendix: Normal Approximation; Hypothesis Test of a Single Proportion
Confidence Interval for a Mean
9.6 Exercises
10 Correlation
10.1 Example: Delta Wire
10.2 Example: Cotton Dust and Lung Disease
10.3 The Vector Product and Sum Test
10.4 Correlation Coefficient
10.5 Other Forms of Association
10.6 Correlation is not Causation
10.7 Exercises
11 Regression
11.1 Finding the Regression Line by Eye
11.2 Finding the Regression Line by Minimizing Residuals
11.3 Linear Relationships
11.4 Inference for Regression
11.5 Exercises
12 Analysis of Variance—ANOVA
12.1 Comparing More Than Two Groups: ANOVA
12.2 The Problem of Multiple Inference
12.3 A Single Test
12.4 Components of Variance
12.5 Two-Way ANOVA
12.6 Factorial Design
12.7 Exercises
13 Multiple Regression
13.1 Regression as Explanation
13.2 Simple Linear Regression—Explore the Data First
13.3 More Independent Variables
13.4 Model Assessment and Inference
13.5 Assumptions
13.6 Interaction, Again
13.7 Regression for Prediction
13.8 Exercises
Index
Description: Concise, thoroughly class-tested primer that features basic statistical concepts in the context of analytics, resampling, and the bootstrap. A uniquely developed presentation of key statistical topics, Introductory Statistics and Analytics: A Resampling Perspective provides an accessib