Pearson's Product Moment Correlation Coefficient
Correlation coefficients estimate the strength and direction of association between two interval/ratio level variables. The Pearson Correlation Coefficient presented here is a summary measure of the covariation between two interval/ratio variables and can range from -1.00 to +1.00. A positive coefficient indicates the values of variable A vary in the same direction as variable B. A negative coefficient indicates the values of variable A and variable B vary in opposite directions.
The following data were collected to estimate the correlation between years of formal education and income at age 35.

                     Susan   Bill   Bob   Tracy   Joan
Education (years)       12     14    16      18     12
Income ($1000)          25     27    32      44     26
Verify Conditions for using Pearson r
Interval/ratio data must be from paired observations.
A linear relationship should exist between the variables, verified by plotting the data on a scattergram.
No extreme values should be present in the data.
[Scattergram: Income (Y, $1000; axis from 25.0 to 44.0) plotted against Education (X, years; axis from 12.0 to 18.0)]
Compute Pearson's r

         Education   Income
         (Years)     ($1000)
Name     X           Y          XY      X²      Y²
Susan    12          25         300     144     625
Bill     14          27         378     196     729
Bob      16          32         512     256     1024
Tracy    18          44         792     324     1936
Joan     12          26         312     144     676
Σ =      72          154        2294    1064    4990
n =      5
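The formula for this step is not reproduced in the text. A minimal sketch, assuming the standard raw-score (computational) formula for Pearson's r, rebuilds the column sums from the table and computes the coefficient. The function name `pearson_r` is illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r via the raw-score (computational) formula (assumed here)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))   # column XY
    sum_x2 = sum(a * a for a in x)              # column X squared
    sum_y2 = sum(b * b for b in y)              # column Y squared
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

education = [12, 14, 16, 18, 12]   # X, years
income = [25, 27, 32, 44, 26]      # Y, $1000

r = pearson_r(education, income)
print(round(r, 2))   # about 0.93
```

With these five cases the numerator is 5(2294) - (72)(154) = 382 and the denominator is the square root of (136)(1234), which gives an r of roughly .93.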




Interpret
A positive coefficient indicates the values of variable A vary in the same direction as variable B. A negative coefficient indicates the values of variable A and variable B vary in opposite directions.
Characterizations of Pearson r
.9 to 1     very high correlation
.7 to .9    high correlation
.5 to .7    moderate correlation
.3 to .5    low correlation
0 to .3     little if any correlation
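The characterizations above can be sketched as a small lookup. This is only an illustration: the function name `characterize` is hypothetical, and it assumes that the labels apply to the magnitude of r (the sign gives the direction) and that each band includes its lower bound, since the text does not say how shared boundaries such as .9 are assigned.

```python
def characterize(r):
    """Map a Pearson r to the verbal labels listed in the text."""
    size = abs(r)  # assumption: labels describe magnitude, not direction
    if size > 1:
        raise ValueError("r must lie between -1 and 1")
    if size >= 0.9:   # assumption: lower bound of each band is inclusive
        return "very high correlation"
    if size >= 0.7:
        return "high correlation"
    if size >= 0.5:
        return "moderate correlation"
    if size >= 0.3:
        return "low correlation"
    return "little if any correlation"

print(characterize(0.93))    # very high correlation
print(characterize(-0.45))   # low correlation
```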
In this example, there is a very high positive correlation between education and income. Individuals with higher levels of education tend to earn more than those with comparatively lower levels of education.
Determine Coefficient of Determination
Eighty-seven percent of the variance displayed in the income variable is associated with the variance displayed in the education variable.
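The coefficient of determination is the square of Pearson's r. As a worked sketch, using the r of about .9325 that the five-person data yield under the standard raw-score formula (a value computed here, not quoted from the text):

```latex
r^2 = (0.9325)^2 \approx 0.87
```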
Hypothesis Testing for Pearson r
Assumptions
Data originated from a random sample
Data are interval/ratio
Both variables are distributed normally
Linear relationship and homoscedasticity
Determine statistical significance based on a Pearson r of .933 for annual income and education obtained from a national random sample of 20 employed adults.
State the Hypothesis
Ho: There is no association between annual income and education for employed adults.
Ha: There is an association between annual income and education for employed adults.
Set the Rejection Criteria
Determine the degrees of freedom (df): df = n - 2, or 20 - 2 = 18
Determine the significance level, alpha (1-tailed or 2-tailed)
Use the critical values from the t distribution at df = 18
tcv @ .05 alpha (2-tailed) = 2.101
Compute Test Statistic
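The formula for this step is not reproduced in the text. A common form of the test statistic for Pearson r, assumed here, is a t ratio with n - 2 degrees of freedom, shown with the example's values substituted:

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.933\sqrt{20-2}}{\sqrt{1-0.933^{2}}}
```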
Decide Results
Since the test statistic 11.022 exceeds the critical value 2.101, the null hypothesis is rejected: there is a statistically significant association in the national population between an employed adult's education and their annual income.
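The decision can be checked with a short sketch, assuming the usual t ratio for Pearson r with df = n - 2. The function name `pearson_t` is illustrative, and the critical value is taken from the text rather than computed.

```python
from math import sqrt

def pearson_t(r, n):
    """t statistic for Ho: no association, with df = n - 2 (assumed formula)."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df

t_stat, df = pearson_t(r=0.933, n=20)
t_critical = 2.101   # 2-tailed critical value at .05 alpha, df = 18 (from the text)

reject_null = abs(t_stat) > t_critical
print(df, round(t_stat, 1), reject_null)   # 18 11.0 True
```

Carried at full precision, the statistic comes out at about 11.0. The 11.022 reported above probably reflects rounding of intermediate values, and the decision is the same either way.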
Software Output Example
