Construct a scatterplot of the data. Examine how the points are distributed to judge whether the two variables appear to be correlated.
In this example, the points suggest a positive linear relationship between a student's ACT score and SAT score.
ACT Score | SAT Score |
---|---|
17 | 1200 |
19 | 1375 |
21 | 1410 |
23 | 1675 |
25 | 1600 |
27 | 2125 |
29 | 1940 |
31 | 2070 |
33 | 1990 |
35 | 2350 |
[Figure: scatterplot of SAT score versus ACT score]
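One way to put a number on the trend seen in the scatterplot is the Pearson correlation coefficient. The sketch below is an illustration and not part of the original lesson: the helper name `pearson_r` is an assumption, and the data are the ten pairs from the table.

```python
import math

# Data from the ACT/SAT table in this lesson.
act = [17, 19, 21, 23, 25, 27, 29, 31, 33, 35]
sat = [1200, 1375, 1410, 1675, 1600, 2125, 1940, 2070, 1990, 2350]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(act, sat)
print(f"r = {r:.3f}")
```

A value of r close to 1 is consistent with a strong positive linear relationship, which matches the visual impression from the scatterplot.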
Calculate the regression curve, or the curve of best fit.
In this example, the linear regression equation is y = 58.4091x + 254.864, where x is the ACT score and y is the corresponding SAT score.
[Figure: scatterplot of the data with the regression line]
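The regression equation can be reproduced with the ordinary least-squares formulas for a line. This is a minimal sketch, assuming a straight-line model; the function name `fit_line` is hypothetical, and the lesson does not say which tool was used to obtain its equation.

```python
# Data from the ACT/SAT table in this lesson.
act = [17, 19, 21, 23, 25, 27, 29, 31, 33, 35]
sat = [1200, 1375, 1410, 1675, 1600, 2125, 1940, 2070, 1990, 2350]

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(act, sat)
print(f"y = {slope:.4f}x + {intercept:.3f}")  # prints: y = 58.4091x + 254.864
```

The computed slope and intercept agree with the equation stated in the lesson.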
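A residual is the difference between an observed value and the value the regression curve (or line) predicts for the same x: residual = observed y − predicted y. A positive residual means the data point lies above the curve; a negative residual means it lies below.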
[Figure: residuals shown as vertical distances between the data points and the regression line]
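As a worked illustration, the residuals for the ACT/SAT data can be computed directly from the lesson's regression equation. The sketch below assumes the rounded coefficients 58.4091 and 254.864 as given; the names `predict` and `residuals` are hypothetical.

```python
# Data from the ACT/SAT table in this lesson.
act = [17, 19, 21, 23, 25, 27, 29, 31, 33, 35]
sat = [1200, 1375, 1410, 1675, 1600, 2125, 1940, 2070, 1990, 2350]

# Coefficients of the lesson's regression line y = 58.4091x + 254.864.
SLOPE = 58.4091
INTERCEPT = 254.864

def predict(x):
    """Predicted SAT score for ACT score x."""
    return SLOPE * x + INTERCEPT

# residual = observed value - predicted value
residuals = [y - predict(x) for x, y in zip(act, sat)]

for x, e in zip(act, residuals):
    print(f"ACT {x}: residual {e:+.1f}")
```

For this data set, five residuals are positive and five are negative, and they sum to approximately zero, as expected for a least-squares line.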
A residual plot graphs the residuals on the vertical axis against the independent variable on the horizontal axis.
[Figure: residual plot for the ACT/SAT regression line]
Residual plots that display a scatter of points equally above and below the horizontal axis, with no distinct pattern, signify that the regression curve is a good fit for the data.
This residual plot signifies the regression curve is a good fit.
[Figure: residual plot with points scattered above and below the horizontal axis]
This residual plot signifies the regression curve is a poor fit.
[Figure: residual plot indicating a poor fit]
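The contrast between a good and a poor fit can also be seen numerically by looking at the signs of the residuals in order of x. The sketch below fits a least-squares line to the ACT/SAT data and to a small curved data set; the curved data (y = x²) is hypothetical and chosen only for illustration, as are the helper names `fit_line`, `line_residuals`, and `signs`.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def line_residuals(xs, ys):
    """Residuals (observed - predicted) of the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def signs(values):
    """One character per residual: '+' above the line, '-' below, '0' on it."""
    return "".join("+" if v > 0 else "-" if v < 0 else "0" for v in values)

# Data from the ACT/SAT table in this lesson.
act = [17, 19, 21, 23, 25, 27, 29, 31, 33, 35]
sat = [1200, 1375, 1410, 1675, 1600, 2125, 1940, 2070, 1990, 2350]

# Hypothetical curved data (y = x^2), not from the lesson.
curved_x = [0, 1, 2, 3, 4, 5, 6]
curved_y = [x * x for x in curved_x]

linear_res = line_residuals(act, sat)
curved_res = line_residuals(curved_x, curved_y)

print(signs(linear_res))  # prints: -+-+-+-+-+
print(signs(curved_res))  # prints: +0---0+
```

For the ACT/SAT data, points fall above and below the line across the whole range of x. For the curved data, the residuals are positive at both ends and negative in the middle, a U-shaped pattern that suggests a line is the wrong model.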