Correlation is a statistical measure that helps in determining the extent of the relationship between two or more variables or factors. For example, growth in crime is positively related to growth in the sale of guns. Growth in obesity is positively correlated to growth in consumption of junk food. However, growth in environmental degradation is negatively correlated with the rate of education and awareness. A previous article explained how to perform the correlation test in SPSS software. This article explains how to interpret the results of that test.

The table below shows a sample correlation matrix. The purpose of this analysis was to determine the relationship between social factors and crime rate. Here, the social factors are availability of education, implementation of regulations and penalties, confidence in police, and promotion of illegal activities, as listed in the key under the table.

Variable | Statistic | CR | AE | IRP | CP | PIA |
---|---|---|---|---|---|---|
CR | PC S N | 1 265 | .582* .042 265 | -0.34 .08 265 | -0.46 1.68 265 | .632** .000 265 |
AE | PC S N | .582* .042 265 | 1 265 | -0.736* .03 265 | -0.912** .000 265 | .674 .07 265 |
IRP | PC S N | -0.46 1.68 265 | -0.912** .000 265 | .676* .025 265 | 1 265 | .693** .000 265 |
CP | PC S N | -0.46 1.68 265 | -0.912** .000 265 | .676* .025 265 | 1 265 | .693** .000 265 |
PIA | PC S N | .632** .000 265 | .674 .07 265 | -0.782** .000 265 | .693** .000 265 | 1 265 |

**. Correlation is significant at the 0.01 level (2-tailed).

*. Correlation is significant at the 0.05 level (2-tailed).

CR: Crime Rate (dependent)

AE: Availability of Education (Independent Variable)

IRP: Implementation of regulations and penalties (Independent Variable)

CP: Confidence in Police (Independent Variable)

PIA: Promotion of Illegal Activities (Independent Variable)

PC: Pearson Correlation

S: Significance (2-tailed)

N: Number of observations

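The correlation matrix is symmetric: the AE, IRP, CP and PIA columns repeat the information already held in the corresponding rows, so one of the two can be removed. Since the interest here is in how each factor relates to crime rate, keep only the CR column, so that the table looks as follows.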

Variable | Statistic | CR |
---|---|---|
CR | PC S N | 1 265 |
AE | PC S N | .582* .042 265 |
IRP | PC S N | -.340 .08 265 |
CP | PC S N | -.460 1.68 265 |
PIA | PC S N | .632** .000 265 |

**. Correlation is significant at the 0.01 level (2-tailed).

*. Correlation is significant at the 0.05 level (2-tailed).

CR: Crime Rate (dependent)

AE: Availability of Education (Independent Variable)

IRP: Implementation of regulations and penalties (Independent Variable)

CP: Confidence in Police (Independent Variable)

PIA: Promotion of Illegal Activities (Independent Variable)

PC: Pearson Correlation

S: Significance (2-tailed)

N: Number of observations

Each cell of the table contains three elements:

- Pearson Correlation,
- Sig. (2-tailed), and
- N.

## Pearson’s correlation value

The first element is the Pearson correlation value. It can range from -1 to 1. The presence and direction of a relationship between two factors are primarily determined by this value.

- 0 – no correlation
- -0.2 to 0 / 0 to 0.2 – very weak negative/positive correlation
- -0.4 to -0.2 / 0.2 to 0.4 – weak negative/positive correlation
- -0.6 to -0.4 / 0.4 to 0.6 – moderate negative/positive correlation
- -0.8 to -0.6 / 0.6 to 0.8 – strong negative/positive correlation
- -1 to -0.8 / 0.8 to 1 – very strong negative/positive correlation
- -1 / 1 – perfect negative/positive correlation
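As a rough illustration, the bands listed above can be written as a small lookup function. This is only a sketch: the function name `correlation_strength` is made up for this example, and because the bands share their end points, the code assumes that a boundary value such as 0.4 falls into the stronger band.

```python
def correlation_strength(r: float) -> str:
    """Label a Pearson coefficient using the bands listed in this article."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and 1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        return f"perfect {direction} correlation"
    # Assumption: a boundary value (e.g. exactly 0.4) goes to the stronger band.
    if size >= 0.8:
        label = "very strong"
    elif size >= 0.6:
        label = "strong"
    elif size >= 0.4:
        label = "moderate"
    elif size >= 0.2:
        label = "weak"
    else:
        label = "very weak"
    return f"{label} {direction} correlation"


# Pearson values taken from the CR column of the sample table.
for name, r in [("AE", 0.582), ("IRP", -0.340), ("CP", -0.460), ("PIA", 0.632)]:
    print(name, "->", correlation_strength(r))
```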

The Pearson coefficient in the first cell will always be 1, because it represents the relationship of a variable with itself. For the subsequent variables, the Pearson coefficient will vary between -1 and 1.
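To see why the diagonal cell is always 1, the textbook Pearson formula can be sketched in plain Python. This is not a description of how SPSS computes the value internally; the helper name `pearson_r` and the sample numbers are made up purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]  # deviations from the mean of x
    dy = [b - mean_y for b in y]  # deviations from the mean of y
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values, not data from the article.
crime = [3.1, 4.0, 5.2, 6.1, 7.3]
print(pearson_r(crime, crime))                # a variable against itself
print(pearson_r(crime, [-v for v in crime]))  # a perfectly inverse relationship
```

Correlating a variable with itself makes the numerator and the denominator identical, which is why the result is 1 (up to floating-point rounding).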

## Significance (2-tailed) value

The second element is the Significance (2-tailed) value. It represents the risk of concluding that a correlation exists between the variables when no such relationship exists, i.e. the chance of error in the result. To keep this risk within acceptable limits, set a confidence level (referred to in this article as the ‘confidence interval’) before reading the results. Generally, it ranges from 90% to 99%. In a correlation table it appears as the corresponding ‘significance level’, which is 1 minus the confidence level. The section below explains how to choose the confidence interval suited to a study.

### Determining the optimum confidence interval

Usually, the confidence interval is set at 99%, 95% or 90%.

Confidence interval | Meaning | Significance level | When is it used? |
---|---|---|---|
99% | Allowing only a 1% chance of error in the result. | 0.01 | Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives. |
95% | Allowing only a 5% chance of error in the result. | 0.05 | Studies on social sciences or any study involving primary data to check respondents’ opinions/perspectives. |
90% | Allowing up to a 10% chance of error in the result. | 0.10 | Secondary data-based studies such as macroeconomic data and financial results data, where the chances of error are beyond the researcher’s control. |

In the present example, a confidence interval of 95% is set. Therefore, the Significance (2-tailed) value of a variable must be less than 0.05 for its correlation to be accepted. The next step is to check, for each independent variable, whether its Significance (2-tailed) value is below 0.05.
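The check described here amounts to comparing each Sig. (2-tailed) value with the chosen significance level. A minimal sketch follows, where `is_significant` is a hypothetical helper name and the values are copied as printed from the CR column of the sample table.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """True when the Sig. (2-tailed) value is below the significance level."""
    return p_value < alpha


# Sig. (2-tailed) values as printed in the sample table. The CP value of 1.68
# is reproduced as printed; a significance value normally lies between 0 and 1.
sig_values = {"AE": 0.042, "IRP": 0.08, "CP": 1.68, "PIA": 0.000}

for name, p in sig_values.items():
    verdict = "acceptable" if is_significant(p) else "not acceptable"
    print(f"{name}: Sig. = {p} -> {verdict} at the 0.05 level")
```

Passing `alpha=0.01` or `alpha=0.10` instead applies the 99% or 90% setting.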

## N value

The third element in each cell is **N**. It gives the number of observations included in the analysis. This value is not relevant for reading the strength of a correlation. However, N should be uniform across the correlation matrix; otherwise the results would be biased.
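Whether N is uniform can be verified with a quick check once the N values have been copied out of the table. The sketch below assumes they have been typed into a dictionary by hand; `has_uniform_n` is a made-up helper name, and the second dictionary is a hypothetical counter-example.

```python
def has_uniform_n(n_by_variable: dict) -> bool:
    """True when every variable pair was computed on the same number of observations."""
    return len(set(n_by_variable.values())) <= 1


# N values from the CR column of the sample table.
sample_n = {"CR": 265, "AE": 265, "IRP": 265, "CP": 265, "PIA": 265}
print(has_uniform_n(sample_n))

# Hypothetical case: one variable has missing responses, so its N differs.
uneven_n = {"CR": 265, "AE": 265, "IRP": 251}
print(has_uniform_n(uneven_n))
```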

## Interpretation of Pearson’s correlation values

For the above example, the Pearson correlation values of the four independent variables with crime rate are:

Independent variable name | Pearson correlation value | Result |
---|---|---|
Availability of education | 0.582 | Moderate positive correlation |
Implementation of regulations | -0.340 | Weak negative correlation |
Confidence in police | -0.460 | Moderate negative correlation |
Promotion of illegal activities | 0.632 | Strong positive correlation |

## Interpretation of Significance (2-tailed) values

Independent variable name | Significance (2-tailed) value | Result (at 95% confidence interval) |
---|---|---|
Availability of education | 0.042 | Acceptable |
Implementation of regulations | 0.08 | Not acceptable |
Confidence in police | 1.68 | Not acceptable |
Promotion of illegal activities | 0.000 | Acceptable |

Therefore, out of all the variables, only availability of education and promotion of illegal activities show an acceptable level of error.

## Process for regression test

The next step is to determine which of these variables qualify for inclusion in the regression analysis. Only variables that are significant and have a Pearson coefficient of at least 0.4 in absolute value (i.e. 0.4 or above, or -0.4 or below) should be considered, so that at least a moderate relationship exists between the variables. For the given sample, only ‘availability of education’ and ‘promotion of illegal activities’ qualify for further regression analysis with the dependent variable, i.e. crime rate.
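The two screening conditions can be combined into one filter. The following is a sketch under the assumptions stated in this article (significance level 0.05, minimum absolute correlation 0.4); the `qualifies` helper is hypothetical, and the values are copied as printed from the CR column of the sample table.

```python
# (Pearson correlation, Sig. 2-tailed) for each independent variable vs. crime rate.
results = {
    "Availability of education": (0.582, 0.042),
    "Implementation of regulations": (-0.340, 0.08),
    "Confidence in police": (-0.460, 1.68),  # Sig. reproduced as printed
    "Promotion of illegal activities": (0.632, 0.000),
}


def qualifies(r: float, p: float, alpha: float = 0.05, min_abs_r: float = 0.4) -> bool:
    """Significant and at least moderately correlated with the dependent variable."""
    return p < alpha and abs(r) >= min_abs_r


selected = [name for name, (r, p) in results.items() if qualifies(r, p)]
print(selected)
# ['Availability of education', 'Promotion of illegal activities']
```

Confidence in police has a moderate coefficient but fails the significance condition, which is why it is left out.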

### Riya Jain
