Applied Statistics Handbook

Pearson's Product Moment Correlation Coefficient

The Pearson correlation coefficient (r) estimates the strength and direction of the linear association between two interval/ratio-level variables. It is a summary measure of the covariation between the two variables and ranges from -1.00 to 1.00. A positive coefficient indicates that the values of variable A vary in the same direction as those of variable B. A negative coefficient indicates that the values of variable A and variable B vary in opposite directions.

The following data were collected to estimate the correlation between years of formal education and income at age 35.

 

 

                     Susan   Bill    Bob   Tracy   Joan
Education (years)       12     14     16      18     12
Income ($1000)          25     27     32      44     26

Verify Conditions for using Pearson r

Interval/ratio data must be from paired observations.

A linear relationship should exist between the variables, which can be checked by plotting the data on a scattergram.

No extreme values (outliers) in the data.

 

Y: Income
      44.0|                             *
          |
          |
          |
      34.5|
          |                       *
          |
          |
          | *        *
      25.0| *
           ---------------|--------------|
          12.0          15.0          18.0
          X: Education

 

Compute Pearson's r

 

         Education    Income
          (Years)     ($1000)
Name         X           Y        XY      X²      Y²
Susan       12          25       300     144     625
Bill        14          27       378     196     729
Bob         16          32       512     256    1024
Tracy       18          44       792     324    1936
Joan        12          26       312     144     676
Σ =         72         154      2294    1064    4990

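n = 5

$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$$

$$r = \frac{5(2294) - (72)(154)}{\sqrt{\left[5(1064) - 72^2\right]\left[5(4990) - 154^2\right]}} = \frac{382}{\sqrt{(136)(1234)}} \approx 0.93$$
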

Interpret

A positive coefficient indicates the values of variable A vary in the same direction as variable B. A negative coefficient indicates the values of variable A and variable B vary in opposite directions.
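As a check on the hand computation and the sign rule, the following minimal Python sketch recomputes r from the five paired observations using the column sums (ΣX, ΣY, ΣXY, ΣX², ΣY²). The function name `pearson_r` and the variable names are illustrative choices, not part of the handbook.

```python
from math import sqrt

# Paired observations from the example: years of education and income in $1000
education = [12, 14, 16, 18, 12]
income = [25, 27, 32, 44, 26]

def pearson_r(x, y):
    """Pearson r via the computational (raw-score) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(education, income)
# The sign of r gives the direction of the association
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f} ({direction})")
```

The printed coefficient should agree with the hand calculation to two decimal places; small differences in the third decimal come from rounding intermediate values by hand.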

Characterizations of Pearson r (by absolute value)

.9 to 1 very high correlation

.7 to .9 high correlation

.5 to .7 moderate correlation

.3 to .5 low correlation

0 to .3 little if any correlation

In this example, there is a very high positive correlation between education and income. Individuals with more years of education tend to earn more than those with fewer years.
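The characterization ranges above can be expressed as a small lookup. This is a sketch only: the function name `describe_strength` is hypothetical, and because the listed ranges share endpoints (for example, .9 appears in two ranges), the code assumes a boundary value belongs to the higher category.

```python
def describe_strength(r):
    """Map |r| to the handbook's descriptive labels (rule-of-thumb ranges)."""
    size = abs(r)  # the labels describe magnitude; the sign gives direction
    if size >= 0.9:   # assumption: boundary values go to the higher category
        return "very high"
    if size >= 0.7:
        return "high"
    if size >= 0.5:
        return "moderate"
    if size >= 0.3:
        return "low"
    return "little if any"

r_example = 0.933  # Pearson r reported for the education/income example
print(describe_strength(r_example))
```

A negative coefficient is labeled by its magnitude, so `describe_strength(-0.75)` returns "high".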

 

Determine Coefficient of Determination

$$r^2 = (0.9325)^2 \approx 0.87$$

Eighty-seven percent of the variance displayed in the income variable is associated with the variance displayed in the education variable. 
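The arithmetic behind the eighty-seven percent figure is simply the square of r. A minimal sketch, using the rounded r of .933 that the handbook reports (variable names are illustrative):

```python
r = 0.933  # Pearson r reported in the handbook's example
r_squared = r ** 2  # coefficient of determination

# Express the shared variance as a proportion and as a percentage
print(f"r^2 = {r_squared:.2f} ({r_squared:.0%} of the variance)")
```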

 

Hypothesis Testing for Pearson r

Assumptions

Data originated from a random sample

Data are interval/ratio

Both variables are distributed normally

Linear relationship and homoscedasticity

 

Determine statistical significance based on a Pearson r of .933 for annual income and education obtained from a national random sample of 20 employed adults.

 

State the Hypothesis

Ho: There is no association between annual income and education for employed adults.

Ha: There is an association between annual income and education for employed adults.

Set the Rejection Criteria

Determine the degrees of freedom (df): df = n - 2 = 20 - 2 = 18

Determine the significance level, alpha (1-tailed or 2-tailed)

Use the critical values from the t distribution at df = 18

Critical value of t (tcv) at alpha = .05 (2-tailed) = 2.101

Compute Test Statistic

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.933\sqrt{20-2}}{\sqrt{1-(0.933)^2}}$$

 Decide Results

Since the test statistic 11.022 exceeds the critical value 2.101, there is a statistically significant association in the national population between an employed adult's education and their annual income.
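The decision rule can be sketched in a few lines of Python. The sketch assumes the standard t test for a correlation coefficient, t = r·sqrt(n - 2)/sqrt(1 - r²), and takes the critical value 2.101 directly from the text rather than computing it; the function name `t_for_pearson_r` is hypothetical.

```python
from math import sqrt

def t_for_pearson_r(r, n):
    """t statistic for H0: no association, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

r, n = 0.933, 20
t_critical = 2.101  # two-tailed, alpha = .05, df = 18 (value taken from the text)

t_stat = t_for_pearson_r(r, n)
reject_null = abs(t_stat) > t_critical  # two-tailed comparison
print(f"df = {n - 2}, t = {t_stat:.2f}, reject H0: {reject_null}")
```

Carried out without intermediate rounding, this arithmetic gives a t of about 11.00. The 11.022 quoted above most likely reflects rounding of intermediate values, and the decision to reject Ho is the same either way.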




Copyright 2015, AcaStat Software. All Rights Reserved.