How to interpret results from the correlation test?

By Riya Jain & Priya Chetty on September 19, 2019

Correlation is a statistical measure of the extent of the relationship between two or more variables or factors. For example, growth in crime is positively correlated with growth in the sale of guns. Growth in obesity is positively correlated with growth in the consumption of junk food. However, growth in environmental degradation is negatively correlated with the rate of education and awareness. A previous article explained how to perform the correlation test in SPSS software. This article explains how to interpret the results of that test.

The table below shows a sample correlation matrix. The purpose of this analysis was to determine the relationship between social factors and crime rates. Here, availability of education, implementation of regulations and penalties, confidence in police, and promotion of illegal activities are the social factors.

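| Variable | Statistic | CR | AE | IRP | CP | PIA |
|---|---|---|---|---|---|---|
| CR | PC | 1 | .582* | -0.34 | -0.46 | .632** |
| | S | | .042 | .08 | 1.68 | .000 |
| | N | 265 | 265 | 265 | 265 | 265 |
| AE | PC | .582* | 1 | -0.736* | -0.912** | .674 |
| | S | .042 | | .03 | .000 | .07 |
| | N | 265 | 265 | 265 | 265 | 265 |
| IRP | PC | -0.46 | -0.912** | .676* | 1 | .693** |
| | S | 1.68 | .000 | .025 | | .000 |
| | N | 265 | 265 | 265 | 265 | 265 |
| CP | PC | -0.46 | -0.912** | .676* | 1 | .693** |
| | S | 1.68 | .000 | .025 | | .000 |
| | N | 265 | 265 | 265 | 265 | 265 |
| PIA | PC | .632** | .674 | -0.782** | .693** | 1 |
| | S | .000 | .07 | .000 | .000 | |
| | N | 265 | 265 | 265 | 265 | 265 |
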
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
CR: Crime Rate (dependent)
AE: Availability of Education (Independent Variable)
IRP: Implementation of regulations and penalties (Independent Variable)
CP: Confidence in Police (Independent Variable)
PIA: Promotion of Illegal Activities (Independent Variable)
PC: Pearson Correlation
S: Significance (2-tailed)
N: Number of observations

In the above table, rows 2-5 carry the same information as columns 2-5, so either set can be removed. Remove columns 2-5 and keep only the CR column, so that the table looks like the one below.

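| Variable | Statistic | CR |
|---|---|---|
| CR | PC | 1 |
| | S | |
| | N | 265 |
| AE | PC | .582* |
| | S | .042 |
| | N | 265 |
| IRP | PC | -.340 |
| | S | .08 |
| | N | 265 |
| CP | PC | -.460 |
| | S | 1.68 |
| | N | 265 |
| PIA | PC | .632** |
| | S | .000 |
| | N | 265 |
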
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
CR: Crime Rate (dependent)
AE: Availability of Education (Independent Variable)
IRP: Implementation of regulations and penalties (Independent Variable)
CP: Confidence in Police (Independent Variable)
PIA: Promotion of Illegal Activities (Independent Variable)
PC: Pearson Correlation
S: Significance (2-tailed)
N: Number of observations

Each row has three elements present in it:

  • Pearson Correlation,
  • Sig (2-tailed) and
  • N.

Pearson’s correlation value

The first element is the Pearson correlation value, which can range from -1 to 1. This value primarily determines whether a relationship exists between two factors, and how strong that relationship is.

  • 0 – no correlation
  • -0.2 to 0 / 0 to 0.2 – very weak negative/positive correlation
  • -0.4 to -0.2 / 0.2 to 0.4 – weak negative/positive correlation
  • -0.6 to -0.4 / 0.4 to 0.6 – moderate negative/positive correlation
  • -0.8 to -0.6 / 0.6 to 0.8 – strong negative/positive correlation
  • -1 to -0.8 / 0.8 to 1 – very strong negative/positive correlation
  • -1 / 1 – perfectly negative/positive correlation
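The bands above can be applied mechanically. The following is a minimal Python sketch, not part of the SPSS workflow. The function name `describe_correlation` is hypothetical. Because adjacent bands share their end points, the sketch assumes that a boundary value such as 0.4 falls into the stronger band.

```python
def describe_correlation(r: float) -> str:
    """Return the strength label for a Pearson coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    if size == 1:
        return f"perfect {direction} correlation"
    # Upper bound of each band. Assumption: a value exactly on a
    # boundary (e.g. 0.4) is placed in the stronger band.
    bands = [(0.2, "very weak"), (0.4, "weak"), (0.6, "moderate"), (0.8, "strong")]
    for upper, label in bands:
        if size < upper:
            return f"{label} {direction} correlation"
    return f"very strong {direction} correlation"


# Coefficients against crime rate, taken from the sample table.
for r in (0.582, -0.340, -0.460, 0.632):
    print(r, "->", describe_correlation(r))
```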

The Pearson coefficient in the first cell is always 1 because it represents the correlation of a variable with itself (circled in the image below). For the other variables, the Pearson coefficient varies from -1 to 1.

Table 3: 1st cell of the correlation matrix
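To see why the first cell is always 1, it helps to compute the coefficient by hand. This is a minimal sketch of the standard Pearson formula in plain Python. The helper name `pearson_r` and the data values are made up for illustration and are not taken from the sample study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical observations, for illustration only.
crime_rate = [4.1, 3.8, 5.2, 6.0, 4.7]

# A variable compared with itself: the coefficient is 1 (up to rounding).
print(pearson_r(crime_rate, crime_rate))

# A variable compared with its exact mirror image: the coefficient is -1.
print(pearson_r(crime_rate, [-value for value in crime_rate]))
```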

Significance (2-tailed) value

The second element is the Significance (2-tailed) value. It represents the risk of concluding that a correlation exists between the variables when no such relationship exists, that is, the chance of error in the result. To keep this chance of error acceptably low, set a ‘confidence interval’ (more precisely, a confidence level) for the study. Generally, this confidence interval ranges from 90% to 99%. The correlation table reports the corresponding ‘significance level’. The section below explains how to determine the ideal confidence interval for a study.

Determining the optimum confidence interval

Usually, the confidence interval is set at 99%, 95% or 90%.

| Confidence interval | Meaning | Significance level | When is it used? |
|---|---|---|---|
| 99% | Allowing only a 1% chance of error in the result. | 0.01 | Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives. |
| 95% | Allowing only a 5% chance of error in the result. | 0.05 | Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives. |
| 90% | Allowing up to a 10% chance of error in the result. | 0.10 | Secondary data-based studies such as macroeconomic data and financial results data, in cases in which the chances of error are beyond the researcher’s control. |

In the present example, the confidence interval is set at 95%. Therefore, the Significance (2-tailed) value for a variable should be less than 0.05. Next, check whether the Significance (2-tailed) value for each independent variable is less than 0.05.
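The check described above amounts to comparing each Significance (2-tailed) value with the chosen significance level. A minimal Python sketch follows. The names `ALPHA`, `p_values` and `is_significant` are hypothetical, and the values are copied as printed in the sample table.

```python
ALPHA = 0.05  # significance level for a 95% confidence interval

# Significance (2-tailed) values against crime rate, as printed in the sample table.
p_values = {
    "Availability of education": 0.042,
    "Implementation of regulations": 0.08,
    "Confidence in police": 1.68,
    "Promotion of illegal activities": 0.000,
}

# A variable passes the check when its value is below the significance level.
is_significant = {name: p < ALPHA for name, p in p_values.items()}

for name, passed in is_significant.items():
    print(f"{name}: {'acceptable' if passed else 'not acceptable'}")
```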

N value

The third element in each cell is N, the number of observations considered for the analysis. This value is not used when interpreting the strength or direction of a correlation. However, N should be uniform across the correlation matrix; otherwise the results would be biased.
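Whether N is uniform can be confirmed with a one-line check. This sketch assumes the N values have been copied into a dictionary by hand; the names `n_by_variable` and `has_uniform_n` are hypothetical.

```python
# N for each variable against crime rate, from the sample table.
n_by_variable = {"CR": 265, "AE": 265, "IRP": 265, "CP": 265, "PIA": 265}


def has_uniform_n(n_values: dict) -> bool:
    """Return True when every cell reports the same number of observations."""
    return len(set(n_values.values())) == 1


print(has_uniform_n(n_by_variable))
```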

Interpretation of Pearson’s correlation values

In the case of the above example, below are Pearson’s correlation values for the four independent variables:

| Independent variable name | Pearson correlation value | Result |
|---|---|---|
| Availability of education | 0.582 | Moderate positive correlation |
| Implementation of regulations | -.340 | Weak negative correlation |
| Confidence in police | -.460 | Moderate negative correlation |
| Promotion of illegal activities | 0.632 | Strong positive correlation |

Interpretation of Significance (2-tailed) values

| Independent variable name | Significance (2-tailed) value | Result (at 95% confidence interval) |
|---|---|---|
| Availability of education | 0.042 | Acceptable |
| Implementation of regulations | 0.08 | Not acceptable |
| Confidence in police | 1.68 | Not acceptable |
| Promotion of illegal activities | 0.000 | Acceptable |

Therefore, out of all the variables, only availability of education and promotion of illegal activities show an acceptable level of error.

Process for regression test

The next step is to determine which of these variables qualify for inclusion in the regression analysis. Consider only those variables that are significant and have a Pearson coefficient above 0.4 or below -0.4, i.e. at least a moderate relationship should exist between the variables. For the given sample, only ‘availability of education’ and ‘promotion of illegal activities’ qualify for further regression analysis with the dependent variable, i.e. crime rate.
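The two screening criteria can be combined into a single filter. The following is a minimal Python sketch using the values from the sample table. The names `ALPHA`, `MIN_ABS_R`, `results` and `selected` are hypothetical, and treating a coefficient of exactly 0.4 as qualifying is an assumption of this sketch.

```python
ALPHA = 0.05      # significance level for a 95% confidence interval
MIN_ABS_R = 0.4   # lower edge of the "moderate" band (assumed to qualify)

# (Pearson coefficient, Significance 2-tailed) against crime rate,
# as printed in the sample table.
results = {
    "Availability of education": (0.582, 0.042),
    "Implementation of regulations": (-0.340, 0.08),
    "Confidence in police": (-0.460, 1.68),
    "Promotion of illegal activities": (0.632, 0.000),
}

# Keep a variable only if it is significant AND at least moderately correlated.
selected = [
    name
    for name, (r, p) in results.items()
    if p < ALPHA and abs(r) >= MIN_ABS_R
]

print(selected)
```

Confidence in police meets the strength criterion but fails the significance check, which is why both conditions are tested together.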
