How to interpret results from the correlation test?

By Riya Jain & Priya Chetty on September 19, 2019

Correlation is a statistical measure that helps in determining the extent of the relationship between two or more variables or factors. For example, growth in crime is positively related to growth in the sale of guns. Growth in obesity is positively correlated to growth in consumption of junk food. However, growth in environmental degradation is negatively correlated with the rate of education and awareness. A previous article explained how to perform the correlation test in SPSS software. This article explains how to interpret the results of that test.

The table below shows a sample correlation matrix. The purpose of this analysis was to determine the relationship between social factors and crime rates. Here, availability of education, implementation of regulations and penalties, confidence in police, and promotion of illegal activities are the social factors.

		CR	AE	IRP	CP	PIA
CR	PC	1	.582*	-.340	-.460	.632**
	S		.042	.08	1.68	.000
	N	265	265	265	265	265
AE	PC	.582*	1	-.736*	-.912**	.674
	S	.042		.03	.000	.07
	N	265	265	265	265	265
IRP	PC	-.340	-.736*	1	.676*	-.782**
	S	.08	.03		.025	.000
	N	265	265	265	265	265
CP	PC	-.460	-.912**	.676*	1	.693**
	S	1.68	.000	.025		.000
	N	265	265	265	265	265
PIA	PC	.632**	.674	-.782**	.693**	1
	S	.000	.07	.000	.000	
	N	265	265	265	265	265

**. Correlation is significant at the 0.01 level (2-tailed).
 *. Correlation is significant at the 0.05 level (2-tailed).
CR: Crime Rate (dependent)
AE: Availability of Education (Independent Variable)
IRP: Implementation of regulations and penalties (Independent Variable)
CP: Confidence in Police (Independent Variable)
PIA: Promotion of Illegal Activities (Independent Variable)
PC: Pearson Correlation
S: Significance (2-tailed)
N: Number of observations

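The correlation matrix above is symmetric: the values below the diagonal repeat those above it, so one half is redundant. Because the aim is to relate each independent variable to the crime rate, keep only the CR column, so that the table looks like the one below.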

		CR
CR	PC	1
	S	
	N	265
AE	PC	.582*
	S	.042
	N	265
IRP	PC	-.340
	S	.08
	N	265
CP	PC	-.460
	S	1.68
	N	265
PIA	PC	.632**
	S	.000
	N	265

**. Correlation is significant at the 0.01 level (2-tailed).
 *. Correlation is significant at the 0.05 level (2-tailed).
CR: Crime Rate (dependent)
AE: Availability of Education (Independent Variable)
IRP: Implementation of regulations and penalties (Independent Variable)
CP: Confidence in Police (Independent Variable)
PIA: Promotion of Illegal Activities (Independent Variable)
PC: Pearson Correlation
S: Significance (2-tailed)
N: Number of observations

Each cell of the table contains three elements:

Pearson Correlation,
Sig (2-tailed) and
N.

Pearson’s correlation value

The first element is the Pearson correlation value. This value can range from -1 to 1. The presence, direction and strength of a relationship between two factors are primarily determined by this value.

0 – no correlation
-0.2 to 0 / 0 to 0.2 – very weak negative/positive correlation
-0.4 to -0.2 / 0.2 to 0.4 – weak negative/positive correlation
-0.6 to -0.4 / 0.4 to 0.6 – moderate negative/positive correlation
-0.8 to -0.6 / 0.6 to 0.8 – strong negative/positive correlation
-1 to -0.8 / 0.8 to 1 – very strong negative/positive correlation
-1 / 1 – perfectly negative/positive correlation
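The bands above can be sketched as a small Python helper. This is only an illustration, not part of the SPSS output: the function name `correlation_strength` is hypothetical, and because the listed bands share their end points (0.4 appears in both the weak and the moderate band), the sketch assumes a boundary value belongs to the stronger band.

```python
def correlation_strength(r: float) -> str:
    """Label a Pearson coefficient using the bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and 1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        return f"perfect {direction} correlation"
    # Assumption: a boundary value (e.g. 0.4) is placed in the stronger band.
    if size >= 0.8:
        strength = "very strong"
    elif size >= 0.6:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    elif size >= 0.2:
        strength = "weak"
    else:
        strength = "very weak"
    return f"{strength} {direction} correlation"


# Coefficients taken from the CR column of the sample table.
for name, r in [("AE", 0.582), ("IRP", -0.340), ("CP", -0.460), ("PIA", 0.632)]:
    print(name, correlation_strength(r))
```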

The Pearson coefficient in the first cell will always be 1 because it represents the relationship of a variable with itself (circled in the image below). For the subsequent variables, Pearson’s coefficient will vary from -1 to 1.

Table 3: 1st cell of the correlation matrix
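To see why a variable correlated with itself always gives 1, the coefficient can be computed by hand. The following is a minimal sketch of the standard Pearson formula in plain Python; the function name `pearson_r` and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of the products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical values, used only to illustrate the diagonal of the matrix.
crime_rate = [4.1, 3.8, 5.2, 6.0, 4.7]

# The same variable on both sides: r equals 1 (up to floating-point rounding).
print(pearson_r(crime_rate, crime_rate))
```

With the same sequence passed twice, the numerator and the denominator are the same sum of squared deviations, which is why the diagonal of a correlation matrix is always 1.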

Significance (2-tailed) value

The second element is the Significance (2-tailed) value. It represents the risk of concluding that a correlation exists between the variables when no such relationship exists, i.e. the chance of error in the result. To keep this risk within acceptable limits, set a confidence level, referred to in this article as the ‘confidence interval’. Generally, this confidence interval ranges from 90% to 99%. Its complement is the ‘significance level’ (for example, 0.05 for 95%), against which the Significance (2-tailed) value in the correlation table is compared. The section below explains how to determine the confidence interval ideal for a study.

Determining the optimum confidence interval

Usually, the confidence interval is set at 99%, 95% or 90%.

Confidence interval	Meaning	Significance level	When is it used?
99%	Allowing only a 1% chance of error in the result.	0.01	Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives.
95%	Allowing only a 5% chance of error in the result.	0.05	Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives.
90%	Allowing up to a 10% chance of error in the result.	0.10	Secondary data-based studies, such as those using macroeconomic data and financial results data, in which the chances of error are beyond the researcher’s control.

In the present example, a confidence interval of 95% is set, so the significance level is 0.05. A correlation is therefore treated as significant only if its Significance (2-tailed) value is less than 0.05. Check this value for each of the independent variables.
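The decision rule can be sketched in a few lines of Python. The function names `significance_level` and `is_significant` are hypothetical helpers written for this illustration; the Sig. (2-tailed) values are copied as printed in the CR column of the sample table.

```python
def significance_level(confidence: float) -> float:
    """Convert a confidence level such as 0.95 into its significance level (0.05)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Rounding removes floating-point noise such as 0.050000000000000044.
    return round(1 - confidence, 10)


def is_significant(p_value: float, confidence: float = 0.95) -> bool:
    """True when the Sig. (2-tailed) value is below the significance level."""
    return p_value < significance_level(confidence)


# Sig. (2-tailed) values as printed in the sample table.
sig_values = {"AE": 0.042, "IRP": 0.08, "CP": 1.68, "PIA": 0.000}
for name, p in sig_values.items():
    print(name, is_significant(p))
```

Passing `confidence=0.99` or `confidence=0.90` applies the 0.01 or 0.10 threshold from the table above instead.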

N value

The third element in each cell is N, the number of observations considered for the analysis. This value is not itself interpreted when reading the strength of a correlation. However, N should be uniform across the correlation matrix; otherwise the results would be biased.
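A quick uniformity check on N could look like the sketch below. The helper name `has_uniform_n` is hypothetical; the N values are those shown in the CR column of the sample table.

```python
def has_uniform_n(n_values) -> bool:
    """True when every cell reports the same number of observations."""
    return len(set(n_values)) <= 1


# N values from the CR column of the sample table.
n_by_variable = {"CR": 265, "AE": 265, "IRP": 265, "CP": 265, "PIA": 265}
print(has_uniform_n(n_by_variable.values()))
```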

Interpretation of Pearson’s correlation values

For the example above, Pearson’s correlation values for the four independent variables are interpreted as follows:

Independent variable name	Pearson correlation value	Result
Availability of education	0.582	Moderate positive correlation
Implementation of regulations	-0.340	Weak negative correlation
Confidence in police	-0.460	Moderate negative correlation
Promotion of illegal activities	0.632	Strong positive correlation

Interpretation of Significance (2-tailed) values

Independent variable name	Significance (2-tailed) value	Result (at 95% confidence interval)
Availability of education	0.042	Acceptable
Implementation of regulations	0.08	Not acceptable
Confidence in police	1.68	Not acceptable
Promotion of illegal activities	0.000	Acceptable

Therefore, out of all the variables, only availability of education and promotion of illegal activities show an acceptable level of error.

Process for regression test

The next step is to determine which of these variables qualify for inclusion in the regression analysis. Consider only those variables that are significant and have a Pearson coefficient of at least 0.4 in absolute value (greater than 0.4 or less than -0.4), i.e. at least a moderate relationship should exist between the variables. For the given sample, only ‘availability of education’ and ‘promotion of illegal activities’ qualify for further regression analysis with the dependent variable, i.e. crime rate.
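The screening rule can be expressed as a short Python sketch. The function name `qualifies_for_regression` and its parameters `alpha` and `min_abs_r` are hypothetical, chosen for this illustration; the coefficient and Sig. (2-tailed) pairs are copied as printed in the sample table.

```python
def qualifies_for_regression(r, p_value, alpha=0.05, min_abs_r=0.4):
    """Keep a variable only if it is significant and at least moderately correlated."""
    return p_value < alpha and abs(r) >= min_abs_r


# (Pearson correlation, Sig. 2-tailed) with crime rate, as printed in the table.
results = {
    "Availability of education": (0.582, 0.042),
    "Implementation of regulations": (-0.340, 0.08),
    "Confidence in police": (-0.460, 1.68),
    "Promotion of illegal activities": (0.632, 0.000),
}

selected = [
    name for name, (r, p) in results.items() if qualifies_for_regression(r, p)
]
print(selected)
# ['Availability of education', 'Promotion of illegal activities']
```

Confidence in police meets the 0.4 threshold in absolute value but fails the significance check, while implementation of regulations fails both, so neither is carried forward.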