Introduction to basic concepts of probability and statistics including randomness, probability and empirical probability distributions, the reasoning of statistical tests of significance, and simulation as a tool for understanding these concepts.
In a study published in the Journal of Personality and Social Psychology (Butler and Baumeister, 1998), researchers investigated a conjecture that having an observer with a vested interest would decrease subjects’ performance on a skill-based task. Subjects were given time to practice playing a video game that required them to navigate an obstacle course as quickly as possible. They were then told to play the game one final time with an observer present. Subjects were randomly assigned to one of two groups. One group (A) was told that the participant and observer would each win $3 if the participant beat a certain threshold time, and the other group (B) was told only that the participant would win the prize if the threshold were beaten. The threshold was chosen to be a time that they beat in 30% of their practice turns. It turned out that 3 of the 12 subjects in group A beat the threshold, while 8 of 12 subjects in group B achieved success.
| | A: observer shares prize | B: no sharing of prize | Total |
|---|---|---|---|
| Beat threshold | 3 | 8 | 11 |
| Do not beat threshold | 9 | 4 | 13 |
| Total | 12 | 12 | 24 |
We are going to analyze these results and along the way get introduced to some central concepts in probability and statistics.
The reasoning of statistical tests of significance asks how likely the sample results would have been if in fact the observer’s incentive had no effect on the subjects’ performance. One way to analyze this question is to assume that those 11 subjects who passed the threshold and 13 who did not would have achieved the same outcome regardless of which group they had been assigned to. In other words, we begin by assuming that the observer’s interest had no effect, and we will ask how likely it is to obtain the results the researchers found given this assumption.
We can then simulate the process of assigning subjects at random to the two groups, just as the researchers did at the beginning of their study. Our focus will be on noting how often we obtain a result at least as extreme as the actual sample, that is, 3 or fewer of the 11 successes assigned to group A. Repeating this a large number of times will give us a sense of how unusual it would be for the sample result to occur by chance alone.
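The random assignment described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the activity's own simulation tool: the function name `simulate_p_value` and the parameters `reps` and `seed` are made up for this example. It treats the 24 subjects as 11 fixed "successes" and 13 fixed "failures", shuffles them, sends the first 12 to group A, and records how often group A receives 3 or fewer successes.

```python
import random


def simulate_p_value(reps=10_000, seed=0):
    """Estimate how often random assignment alone puts 3 or fewer
    of the 11 successes into group A (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    # 1 = beat the threshold, 0 = did not; outcomes are fixed under
    # the assumption that the observer's incentive has no effect.
    outcomes = [1] * 11 + [0] * 13
    extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)              # random assignment of subjects
        successes_in_a = sum(outcomes[:12])  # first 12 subjects form group A
        if successes_in_a <= 3:            # as extreme as the actual study
            extreme += 1
    return extreme / reps


print(simulate_p_value())
```

The printed proportion varies a little from seed to seed; increasing `reps` should make it more stable.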
We will perform our simulation in several ways:
This activity introduces the important idea that statistical significance assesses the likelihood of a sample result by asking how often such an extreme result would occur by chance alone. When the sample result is unlikely to occur by chance, it is said to be statistically significant. The long-run proportion of times that a result as extreme as the sample would occur by chance alone is called the probability of such a result. In the case of statistical tests of significance, this probability is called the p-value of the test. Our simulations have produced empirical approximations of this probability.
The reasoning process of this activity typifies that of statistical tests of significance. One starts by assuming that there is no difference between the two experimental groups and then investigates how often data at least as extreme as those observed would occur if nothing more than the random assignment of subjects to groups were involved. If the answer is that such data are quite unlikely to arise due to chance, then the data provide evidence against the assumption of no difference between the groups, thus supporting the hypothesis that the treatment does indeed have an effect. There is no precise rule for determining how “unlikely” the data need to be in order to support the research hypothesis, but the most common standard is to have a p-value less than 0.05.
To this point we have approximated the p-value through physical and computer simulations. That is, we have computed an empirical approximation of the probability of getting a result as extreme as the data purely by chance, under the assumption that the observer has no effect. This approximation generally gets closer and closer to the true probability as one increases the number of repetitions in the simulation. Another approach to calculating the p-value is to use probability theory and the mathematics of counting techniques.
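As one illustration of the counting approach, the p-value can be computed by counting group assignments directly. The sketch below assumes that all ways of choosing 12 of the 24 subjects for group A are equally likely, and counts those in which group A contains 3 or fewer of the 11 successes (a hypergeometric calculation). The function name `exact_p_value` and its parameter names are hypothetical and chosen only for this example.

```python
from math import comb


def exact_p_value(total=24, successes=11, group_size=12, observed=3):
    """Probability that group A gets `observed` or fewer successes
    when `group_size` of `total` subjects are chosen at random."""
    # Number of equally likely ways to choose who goes to group A.
    all_assignments = comb(total, group_size)
    # For each k = 0..observed, choose k of the successes and fill the
    # rest of group A with subjects who did not beat the threshold.
    favorable = sum(
        comb(successes, k) * comb(total - successes, group_size - k)
        for k in range(observed + 1)
    )
    return favorable / all_assignments


print(round(exact_p_value(), 4))  # 134576 / 2704156, about 0.0498
```

Under these assumptions the exact value is 134576/2704156, or about 0.0498, which is the quantity the simulations approximate.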
We have covered many concepts in this activity. Below is a list of concepts related to this activity. Some of these we explicitly discussed and some we will be covering in the near future. Don't worry if any of this is still fuzzy to you - remember we are just getting started down the exciting road to learning probability!