PLEASE NOTE: This simulator requires Adobe Flash Player, which is being phased out by Adobe. If you do not see the simulator on this page, it is because you do not have Flash installed. It may be difficult or impossible to install Flash in a modern web browser. There is currently no plan to update this simulator to a modern application framework, but this page will remain active to avoid broken incoming links and to provide this message.
How to Use This Correlation Simulator
If you're an undergraduate viewing this page, just play with the sliders! If you're learning about what correlations look like, set N=200 and then slowly slide the magnitude from +1.0 down to -1.0 and back again.
If you are learning about correlation, you might also consider reading this blog post. It steps you through the basics of what correlations are, how to interpret them, and why they take values between -1.0 and +1.0.
Technical Details of the Simulator
I designed this correlation simulator after noticing that very few simple, illustrative online correlation simulators were available. Most that I could find were heavily mathematical and littered with notation. While I appreciate that level of detail (and of course, such knowledge was needed to construct this simulator), it is unnecessary for a basic introductory course in the social sciences, such as Psychology 101. With that need in mind, I designed the simulator below.
The simulator functions by:
- Generating two sets of normally distributed data, each with mean = 0 and SD = 1 (which we'll call X and RawY)
- Determining Y for each X using the following formula: Y = X * correlation + RawY * (1 - correlation^2) ^ 0.5
- Graphing all of the X/Y pairs on a coordinate plane ranging +/-3 on both axes
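The steps above can be sketched in a few lines of Python. This is only an illustration of the formula, not the simulator's actual Flash code; the function names simulate_pairs and pearson_r and the seed parameter are assumptions introduced here, and the graphing step is left out.

```python
import math
import random


def simulate_pairs(n, correlation, seed=None):
    """Return n (x, y) pairs built with the formula from the list above.

    Hypothetical helper: the name and the seed parameter are not part of
    the original simulator.
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must be between -1.0 and +1.0")
    rng = random.Random(seed)
    # (1 - correlation^2) ^ 0.5, computed once because it is the same for every pair
    scale = math.sqrt(1.0 - correlation ** 2)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # X: mean 0, SD 1
        raw_y = rng.gauss(0.0, 1.0)  # RawY: mean 0, SD 1, drawn independently of X
        y = x * correlation + raw_y * scale
        pairs.append((x, y))
    return pairs


def pearson_r(pairs):
    """Pearson correlation actually observed in a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    pairs = simulate_pairs(200, 0.5, seed=1)
    print("requested r = 0.5, observed r =", round(pearson_r(pairs), 3))
```

The weight on RawY is chosen so that Y keeps a variance of 1: correlation^2 + (1 - correlation^2) = 1. Because X and Y then both have SD = 1, the covariance between them equals the requested correlation.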
This procedure essentially draws a random sample of X/Y pairs from a population whose correlation equals the value you specify. The correlation observed in the generated data therefore varies from run to run around that value. Thus, even if you set r = 0.2, you will not necessarily get precisely r = 0.2 in the generated data. Additionally, as you would expect, smaller N's tend to produce less accurate representations of the desired correlation. In general, unless you're trying to teach about sampling error, you should instruct your students to keep the correlation simulator's sample size set to 200.
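To see how much the observed correlation can wander at different sample sizes, a rough sketch like the following repeats the procedure many times and summarizes the results. It is not part of the simulator; the function names sample_r and spread_of_r, the number of trials, and the seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def sample_r(n, correlation, rng):
    """Generate n X/Y pairs with the simulator's formula and return their observed r."""
    scale = math.sqrt(1.0 - correlation ** 2)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        raw_y = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(x * correlation + raw_y * scale)
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spread_of_r(n, correlation, trials=500, seed=0):
    """Return (mean, SD) of the observed r across repeated simulated samples.

    Hypothetical helper: trials and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    rs = [sample_r(n, correlation, rng) for _ in range(trials)]
    return statistics.fmean(rs), statistics.stdev(rs)


if __name__ == "__main__":
    for n in (10, 50, 200):
        mean_r, sd_r = spread_of_r(n, 0.2)
        print(f"N={n}: mean observed r = {mean_r:.3f}, SD of observed r = {sd_r:.3f}")
```

The SD of the observed r should shrink as N grows, which is the sampling error described above and the reason N=200 gives students a more stable picture than a small N does.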