Office of Educational Assessment

Understanding Correlations

ScorePak® can compute Pearson Product Moment Correlation coefficients among any number of scores of any type. The results are presented in square correlation matrices of up to ten variables each; if intercorrelations are requested among more than ten variables, several matrices are produced.

Sample Correlation Report (9K PDF)

Correlation Coefficients

Correlation coefficients index the extent to which two scores are related, and the direction of that relationship. They reflect the tendency of the variables to “co-vary”; that is, for changes in the value of one variable to be associated with changes in the value of the other. In interpreting correlation coefficients, two properties are important.

  • Magnitude. Correlation coefficients range in value from -1.00 to 1.00. The larger the absolute value of the coefficient (the size of the number without regard to its sign), the greater the magnitude of the relationship. For example, correlations of .60 and -.60 are of equal magnitude, and both are larger in magnitude than a correlation of .30. When there is no linear relationship, the correlation will be 0.00; when there is a perfect linear relationship (every pair of values falls exactly on a straight line), the correlation will be 1.00 or -1.00.
  • Direction. The direction of the relationship (positive or negative) is indicated by the sign of the coefficient. A positive correlation implies that increases in the value of one score tend to be accompanied by increases in the other. A negative correlation implies that increases in one tend to be accompanied by decreases in the other.
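The two properties above can be seen in a small calculation. The following is a minimal sketch of the standard Pearson formula in Python, not ScorePak®'s own code; the function name `pearson_r` and the score lists are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]          # deviations from the mean of x
    dev_y = [b - mean_y for b in y]          # deviations from the mean of y
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a score has no variance")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five students (illustration only)
quiz = [10, 20, 30, 40, 50]
exam = [12, 22, 32, 42, 52]      # rises in step with quiz
errors = [50, 40, 30, 20, 10]    # falls as quiz rises

print(round(pearson_r(quiz, exam), 2))    # 1.0
print(round(pearson_r(quiz, errors), 2))  # -1.0
```

Both results have the same magnitude (a perfect linear relationship); only the sign, and therefore the direction, differs.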

Interpretation

Because ScorePak® scores are generally test scores, most of the relationships among them can be expected to be positive. The greater the degree to which the tests are measuring the same thing, the stronger the relationship between them. Scores are often weighted and summed to create a composite score that is then used to assign grades. In such applications, moderately sized positive correlations (r > .30) among scores are desirable. Negative or small positive correlations (r < .20) among test scores imply that the composite score may be unreliable.
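As a rough illustration of this guideline, the sketch below builds a weighted composite and labels each pairwise correlation against the two cutoffs mentioned above. The score names, weights, and data are hypothetical, and the label for values between .20 and .30 is an assumption, since the guideline does not address that range.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def guideline(r):
    """Label a correlation using the cutoffs from the text (.30 and .20)."""
    if r > 0.30:
        return "desirable"
    if r < 0.20:
        return "composite may be unreliable"
    return "between the two cutoffs"   # assumption: not covered by the guideline

# Hypothetical scores for six students and hypothetical grading weights
scores = {
    "midterm": [70, 82, 90, 65, 78, 88],
    "final":   [72, 85, 94, 60, 80, 85],
    "project": [88, 70, 75, 92, 80, 72],
}
weights = {"midterm": 0.3, "final": 0.5, "project": 0.2}

# Weighted sum of the three scores for each student
n_students = len(scores["midterm"])
composite = [sum(weights[k] * scores[k][i] for k in scores)
             for i in range(n_students)]

# Check every pair of component scores against the guideline
names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = pearson_r(scores[a], scores[b])
        print(f"{a} vs {b}: r = {r:.2f} ({guideline(r)})")
```

In this made-up data the project scores run opposite to the two exams, which is the kind of pattern the guideline flags before such scores are combined into a grade.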

Missing Data

In computing correlations, ScorePak® includes pairs of observations for which neither test score is missing. However, ScorePak® does not delete an entire case just because data are missing on one or more scores; if you are intercorrelating several scores, test scores for a particular individual will be included in those coefficients for which both scores are present, and excluded from those coefficients for which one or both scores are missing.
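This treatment of missing data is commonly called pairwise deletion. The sketch below shows the general idea, assuming missing scores are represented as `None`; it is not ScorePak®'s implementation, and the function names and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def pairwise_r(x, y):
    """Correlate x and y using only the cases where both scores are present.

    Returns the correlation and the number of cases it is based on.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson_r(xs, ys), len(pairs)

# Hypothetical scores for five students; None marks a missing score
scores = {
    "test1": [80, 65, None, 90, 70],
    "test2": [78, 60, 85, None, 72],
    "test3": [82, 70, 88, 95, None],
}

names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r, n = pairwise_r(scores[a], scores[b])
        print(f"{a} vs {b}: n = {n}, r = {r:.2f}")
```

No student is dropped entirely: the third student, for example, is left out of the coefficients involving test1 but still contributes to the test2 and test3 coefficient. One consequence is that different coefficients in the same matrix can be based on different sets of students.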

Composite scores are created by combining scores using one or more transformation steps. A composite score may or may not be missing if one or more of the scores on which it is based is missing. Check the description of missing values for each transformation if you plan to correlate composite scores. In general, the correlations of a composite score with the scores from which it is derived tend to be relatively large because of the variance the scores share with the composite score. However, these “part-whole” correlations can be misleadingly small if many of the scores making up the composite are missing and the composite score is not set to missing when one of its component scores is missing.
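To see why the missing-value rule matters, compare two hypothetical ways of summing scores into a composite. Neither function is a ScorePak® transformation; both are assumptions made for illustration, with `None` standing for a missing score.

```python
def composite_strict(parts):
    """Sum of the scores; missing (None) if any component is missing."""
    if any(p is None for p in parts):
        return None
    return sum(parts)

def composite_lenient(parts):
    """Sum of whichever scores are present; missing only if all are missing."""
    present = [p for p in parts if p is not None]
    return sum(present) if present else None

# Hypothetical (quiz1, quiz2) scores for three students
students = [(40, 45), (35, None), (None, 30)]

for parts in students:
    print(parts, composite_strict(parts), composite_lenient(parts))
```

Under the lenient rule, the second and third students receive composites of 35 and 30 that are built from a single quiz, so they are not comparable with the first student's total of 85. Composites of this kind are one way a part-whole correlation can end up distorted.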

Attenuation

It is important to keep in mind that test scores are themselves unreliable to some extent. Only the reliable portions of two sets of scores can be correlated; the unreliable portion is random error and thus will be uncorrelated. As a result, the magnitude of the correlation between any two test scores is limited or attenuated by the unreliability of each. If the reliability of the test scores is known, the correlation can be corrected for attenuation. ScorePak® does not make this correction, because the reliability coefficient is not available at the time that the program computes the correlations. However, if you are correlating raw scores, you can use the reliability coefficients given in the ScorePak® Item Analysis to correct the correlations according to the following formula:

rxy’ = rxy / SQRT(rxx * ryy), where

rxy’ = the corrected correlation of test score “x” with test score “y”
rxy = the uncorrected correlation
rxx = the reliability of test score “x”
ryy = the reliability of test score “y”
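As a worked example of this correction, the sketch below applies the formula to made-up values. The function name `correct_for_attenuation` and the numbers are hypothetical; in practice the reliabilities would come from the Item Analysis for each test.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Return r_xy divided by the square root of (r_xx * r_yy).

    r_xy: uncorrected correlation between test scores x and y
    r_xx: reliability of test score x
    r_yy: reliability of test score y
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must be greater than 0 and at most 1")
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed r = .48, both tests have reliability .80
corrected = correct_for_attenuation(0.48, 0.80, 0.80)
print(round(corrected, 2))  # 0.6
```

Because the divisor is at most 1, the corrected value is never smaller in magnitude than the observed one, and the lower the reliabilities, the larger the adjustment. With low reliabilities and a sizable observed correlation, the corrected value can exceed 1.00, so it is best read as an estimate.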