|
1 The Big Picture 1
1.1 The importance of careful experimental design . . 3
1.2 Overview of statistical analysis . . . . . 3
1.3 What you should learn here . . . . . . . 6
2 Variable Classification 9
2.1 What makes a “good” variable? . . . . . 10
2.2 Classification by role . . . . . . 11
2.3 Classification by statistical type . . . . . 12
2.4 Tricky cases . . . . . . 16
3 Review of Probability 19
3.1 Definition(s) of probability . . . . . . . . 19
3.2 Probability mass functions and density functions . . . . . . 24
3.2.1 Reading a pdf . . . . . . 27
3.3 Probability calculations . . . . . 28
3.4 Populations and samples . . . . 34
3.5 Parameters describing distributions . . . 35
3.5.1 Central tendency: mean and median . . . 37
3.5.2 Spread: variance and standard deviation . . . . . . 38
3.5.3 Skewness and kurtosis . . . . . . 39
3.5.4 Miscellaneous comments on distribution parameters . . . . . 39
3.5.5 Examples . . . . . . . . 40
3.6 Multivariate distributions: joint, conditional, and marginal . . . . . 42
3.6.1 Covariance and Correlation . . . 46
3.7 Key application: sampling distributions . . . . . . 50
3.8 Central limit theorem . . . . . . 52
3.9 Common distributions . . . . . 54
3.9.1 Binomial distribution . . . . . . . 54
3.9.2 Multinomial distribution . . . . . 56
3.9.3 Poisson distribution . . . . . . . . 57
3.9.4 Gaussian distribution . . . . . . . 57
3.9.5 t-distribution . . . . . . 59
3.9.6 Chi-square distribution . . . . . . 59
3.9.7 F-distribution . . . . . . 60
4 Exploratory Data Analysis 61
4.1 Typical data format and the types of EDA . . . . 61
4.2 Univariate non-graphical EDA . . . . . . 63
4.2.1 Categorical data . . . . 63
4.2.2 Characteristics of quantitative data . . . . 64
4.2.3 Central tendency . . . . 67
4.2.4 Spread . . . . . 69
4.2.5 Skewness and kurtosis . . . . . . 71
4.3 Univariate graphical EDA . . . . . . . . 72
4.3.1 Histograms . . . . . . . 72
4.3.2 Stem-and-leaf plots . . . . . . . . 78
4.3.3 Boxplots . . . . . . . . . 79
4.3.4 Quantile-normal plots . . . . . . 83
4.4 Multivariate non-graphical EDA . . . . . 88
4.4.1 Cross-tabulation . . . . 89
4.4.2 Correlation for categorical data . . . . . . 90
4.4.3 Univariate statistics by category . . . . . . 91
4.4.4 Correlation and covariance . . . . 91
4.4.5 Covariance and correlation matrices . . . . 93
4.5 Multivariate graphical EDA . . . . . . . 94
4.5.1 Univariate graphs by category . . . . . . . 95
4.5.2 Scatterplots . . . . . . . 95
4.6 A note on degrees of freedom . . . . . . 98
5 Learning SPSS: Data and EDA 101
5.1 Overview of SPSS . . . . . . . . 102
5.2 Starting SPSS . . . . . 104
5.3 Typing in data . . . . . . . . . 104
5.4 Loading data . . . . . 110
5.5 Creating new variables . . . . . 116
5.5.1 Recoding . . . . . . . . 119
5.5.2 Automatic recoding . . . . . . . . 120
5.5.3 Visual binning . . . . . . 121
5.6 Non-graphical EDA . . . . . . . 123
5.7 Graphical EDA . . . . . . . . . 127
5.7.1 Overview of SPSS Graphs . . . . 127
5.7.2 Histogram . . . . . . . . 131
5.7.3 Boxplot . . . . . . . . . 133
5.7.4 Scatterplot . . . . . . . 134
5.8 SPSS convenience item: Explore . . . . . 139 |
|