This section of the site contains miscellaneous examples, explanations, and materials I have found useful in teaching statistics. When something comes up in a class that I find myself explaining more than once, and it is not otherwise covered in the curriculum, I add it to this list so I can refer students here for extra information.
Keywords: programming, R, software, workshop materials
This document summarizes the first three units of my workshop series introducing new users to the R statistical computing environment. The full workshop series focuses on statistical computing tasks commonly employed by psychologists, but I have revised this summary for a more general audience. The summary makes reference to example data and syntax files, which can be downloaded as a zip archive here: R Workshop Example Data and Syntax.
Keywords: programming, R, software
This document provides a one-page (front and back) summary of some of the basic functions and commands in R. It focuses on statistical computing tasks commonly employed by psychologists. It is designed to accompany my "R for Psychologists" workshop series, but it should also be comprehensible on its own.
Keywords: distribution, histogram, skew
This page clarifies what positive and negative skew "look like." Students sometimes find it difficult to remember which is which, so the plots on this page are accompanied by descriptions that explain in general terms what skew is and how to recognize it.
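As a numeric complement to the plots, here is a minimal Python sketch of the sign convention. The data values and the function name `sample_skewness` are made up for illustration, and the function uses the simple moment-based skewness coefficient (third central moment divided by the second central moment raised to the 3/2 power), which is only one of several common definitions.

```python
def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: most values are small, with a long tail to the right.
right_tailed = [1, 1, 2, 2, 3, 10]
# Mirror image of the same data: long tail to the left.
left_tailed = [11 - v for v in right_tailed]

positive_skew = sample_skewness(right_tailed)
negative_skew = sample_skewness(left_tailed)

print("long right tail:", round(positive_skew, 3))
print("long left tail: ", round(negative_skew, 3))
```

The long tail points in the direction of the skew: a tail stretching toward larger values gives a positive coefficient, and its mirror image gives the same magnitude with a negative sign.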
Keywords: distribution, expected value, F-test, t-test
This image illustrates the expected values of a t-distributed random variable and an F-distributed random variable. Under the null hypothesis, the expected value of t is 0, but the expected value of F is approximately 1 (it is slightly above 1 when the denominator degrees of freedom are finite). This is sometimes confusing for students, given that F is t squared when the numerator has one degree of freedom.
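The same point can be checked with a small simulation. This is a sketch under assumptions of my own choosing, not part of the original image: a one-sample t statistic, 30 standard-normal observations per sample, 10,000 simulated samples, and an arbitrary random seed. Squaring each simulated t gives an F statistic with one numerator degree of freedom.

```python
import math
import random

def one_sample_t(sample):
    """t statistic for testing whether the population mean is 0."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((v - mean) ** 2 for v in sample) / (n - 1)
    return mean / math.sqrt(variance / n)

rng = random.Random(2024)  # arbitrary seed so the run is reproducible
n_obs = 30                 # assumed sample size (29 degrees of freedom)
n_sims = 10000             # assumed number of simulated samples

# The null hypothesis is true here: every sample comes from a mean-0 normal.
t_values = [
    one_sample_t([rng.gauss(0, 1) for _ in range(n_obs)])
    for _ in range(n_sims)
]
f_values = [t * t for t in t_values]  # F with 1 numerator df is t squared

mean_t = sum(t_values) / n_sims
mean_f = sum(f_values) / n_sims

print("average t:", round(mean_t, 3))
print("average F:", round(mean_f, 3))
```

The negative and positive values of t cancel out around 0, but squaring makes every value non-negative, so the average of F cannot be 0. It lands near 1 because the expected value of t squared equals the variance of t, which is close to 1.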
Keywords: binary outcome, chi-squared, expected distribution, goodness of fit, independence
To test whether a categorical variable is related to the presence of a binary characteristic, the appropriate chi-squared test is a test of independence (or association). Some students might be curious why we cannot run a chi-squared test of goodness of fit. After all, the null hypothesis assumes that the characteristic in question will be evenly distributed across the categorical variable, and this sounds very much like an "expected distribution." This document explains why the test of independence is more appropriate and what happens if you run a goodness-of-fit test instead.
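To make the comparison concrete, the following Python sketch computes both statistics by hand for a hypothetical data set (three groups of 100 people, with 30, 40, and 50 showing the characteristic). The group labels and counts are invented for illustration and do not come from the linked document.

```python
# Hypothetical counts: three equal-sized groups and the number in each
# group who show the binary characteristic.
group_sizes = {"A": 100, "B": 100, "C": 100}
present = {"A": 30, "B": 40, "C": 50}

total_n = sum(group_sizes.values())
p_present = sum(present.values()) / total_n  # overall proportion "present"

goodness_of_fit = 0.0  # uses only the "present" counts
independence = 0.0     # uses both the "present" and "absent" counts

for group, n in group_sizes.items():
    expected_present = n * p_present
    expected_absent = n * (1 - p_present)
    observed_absent = n - present[group]

    present_term = (present[group] - expected_present) ** 2 / expected_present
    absent_term = (observed_absent - expected_absent) ** 2 / expected_absent

    goodness_of_fit += present_term
    independence += present_term + absent_term

print("goodness-of-fit chi-squared:", round(goodness_of_fit, 3))
print("independence chi-squared:   ", round(independence, 3))
```

In this setup the goodness-of-fit statistic leaves out the cells for people without the characteristic, so it comes out smaller than the independence statistic by a factor of one minus the overall proportion with the characteristic.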
Keywords: combined variance, repeated measures, SPSS
If we have two groups of n observations each, and we know the mean and sample variance of each group separately, how can we calculate the mean and sample variance of the combination of the two groups? This can become an issue when trying to report means and standard deviations for the main effects in a repeated-measures or mixed-model ANOVA using SPSS. The typical descriptive statistics in SPSS do not include the standard deviation of the dependent variable across multiple levels of a repeated-measures factor.
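One way to do the calculation is sketched below in Python. The function name `combine_two_groups` and the example scores are hypothetical. The combined sum of squares is split into a within-group part and a between-group part, and the result is checked against the variance computed directly from the pooled raw scores.

```python
import statistics

def combine_two_groups(n, mean_1, var_1, mean_2, var_2):
    """Mean and sample variance of two groups of n observations combined."""
    combined_mean = (mean_1 + mean_2) / 2
    # Sum of squares within each group, recovered from the sample variances.
    within = (n - 1) * (var_1 + var_2)
    # Sum of squares due to the two group means differing from the combined mean.
    between = n * (mean_1 - mean_2) ** 2 / 2
    combined_var = (within + between) / (2 * n - 1)
    return combined_mean, combined_var

# Hypothetical scores, used only to check the formula.
group_1 = [1, 2, 3, 4]
group_2 = [3, 5, 7, 9]

combined_mean, combined_var = combine_two_groups(
    len(group_1),
    statistics.mean(group_1), statistics.variance(group_1),
    statistics.mean(group_2), statistics.variance(group_2),
)

# Direct calculation from the pooled raw data, for comparison.
direct_mean = statistics.mean(group_1 + group_2)
direct_var = statistics.variance(group_1 + group_2)

print("from summaries:", combined_mean, round(combined_var, 4))
print("from raw data: ", direct_mean, round(direct_var, 4))
```

The combined standard deviation is the square root of the combined variance. Note that it is generally not the average of the two group variances, because the difference between the group means adds spread of its own.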
Keywords: interaction, plot
A common heuristic for interpreting three-way interaction plots is to look separately at the "simple" two-way interaction in the left half of the plot and the "simple" two-way interaction in the right half. If these two-way interactions "look different," then we say that there is a three-way interaction. This heuristic does not always work, and this document uses an example to explain why.
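As one illustration of how the heuristic can mislead (not necessarily the example used in the linked document), the Python sketch below assumes a 2 x 2 x 2 design in which each panel of the plot shows one level of the third factor. All cell means and factor labels are hypothetical. In such a design, the three-way interaction can be expressed as the difference between the two simple interaction contrasts.

```python
def interaction_contrast(cells):
    """Simple two-way interaction: (effect of B at a2) - (effect of B at a1)."""
    effect_at_a1 = cells[("a1", "b2")] - cells[("a1", "b1")]
    effect_at_a2 = cells[("a2", "b2")] - cells[("a2", "b1")]
    return effect_at_a2 - effect_at_a1

# Hypothetical cell means. In the left panel the lines fan apart.
left_panel = {
    ("a1", "b1"): 1, ("a1", "b2"): 2,
    ("a2", "b1"): 3, ("a2", "b2"): 6,
}
# In the right panel one line slopes down and the other is flat.
right_panel = {
    ("a1", "b1"): 5, ("a1", "b2"): 3,
    ("a2", "b1"): 4, ("a2", "b2"): 4,
}

left_contrast = interaction_contrast(left_panel)
right_contrast = interaction_contrast(right_panel)
three_way_contrast = right_contrast - left_contrast

print("left simple interaction: ", left_contrast)
print("right simple interaction:", right_contrast)
print("three-way contrast:      ", three_way_contrast)
```

The two panels would look quite different when plotted, yet the simple interaction contrasts are identical, so these cell means carry no three-way interaction at all.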
Keywords: B, beta, coefficient, estimate, notation, regression, slope
The notation for regression coefficients is often inconsistent. Sometimes, students with prior experience in statistics classes find the notation conventions common in psychology confusing, and vice versa. This table illustrates some of the different symbols that are used to refer to regression coefficients.
Keywords: correlation, regression, simple linear regression, slope, standardize, Z-score
When students learn that the least-squares line drawn through standardized versions of two variables has a slope equal to the correlation between those variables, they sometimes wonder how it can be that it does not matter which variable is presented as the predictor and which is presented as the outcome. After all, the unstandardized slope depends on which variable is the predictor and which is the outcome, so why would correlation (standardized slope) be the same regardless? The images on this page attempt to make this concept more intuitive.
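A quick numeric check can accompany the images. The Python sketch below uses a small made-up data set and hypothetical helper names (`slope`, `standardize`). It fits the least-squares slope in both directions, first on the raw scores and then on the Z-scores.

```python
import math

def slope(predictor, outcome):
    """Least-squares slope for regressing outcome on predictor."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_o = sum(outcome) / n
    sum_cross = sum((p - mean_p) * (o - mean_o)
                    for p, o in zip(predictor, outcome))
    sum_sq = sum((p - mean_p) ** 2 for p in predictor)
    return sum_cross / sum_sq

def standardize(values):
    """Convert raw scores to Z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical data, chosen so the two variables have different spreads.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 7]

raw_y_on_x = slope(x, y)  # x as predictor, y as outcome
raw_x_on_y = slope(y, x)  # y as predictor, x as outcome

zx, zy = standardize(x), standardize(y)
std_y_on_x = slope(zx, zy)
std_x_on_y = slope(zy, zx)

print("raw slopes:         ", round(raw_y_on_x, 4), round(raw_x_on_y, 4))
print("standardized slopes:", round(std_y_on_x, 4), round(std_x_on_y, 4))
```

The raw slopes differ because each one divides the same cross-product by a different variable's sum of squares. After standardizing, both variables have a variance of 1, so the divisor is the same in either direction and both slopes equal the correlation.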