Using the Pearson correlation test on secondary data in SPSS

By Priya Chetty on February 19, 2022

SPSS offers several ways to calculate correlation, of which the Pearson correlation is the most widely used. It is commonly applied to primary data in the social sciences but less often to secondary data in SPSS. Econometric data such as a country’s population, income, poverty, and mortality statistics are typically available as secondary data. Applications like STATA and R are also used for such data because of their flexibility and the wide range of tests they support. This article uses SPSS to perform the correlation test on secondary data from a small case study.

A correlation test explains two main things:

  1. Whether two factors are connected.
  2. If so, whether their association is positive (direct) or negative (inverse).
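Both questions are answered by a single number, the Pearson coefficient r, which ranges from -1 (perfectly inverse) through 0 (no linear association) to +1 (perfectly direct). The following is a minimal Python sketch of that calculation, not SPSS's own implementation. The function name pearson_r and the two toy series are assumptions made up for illustration and are not the World Bank data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]          # deviations from the mean of x
    dy = [b - mean_y for b in y]          # deviations from the mean of y
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    co_dev = sum(a * b for a, b in zip(dx, dy))
    return co_dev / math.sqrt(ss_x * ss_y)

# Hypothetical toy series (illustrative only)
rising = [1.0, 2.0, 3.0, 4.0, 5.0]
falling = [10.0, 8.0, 6.0, 4.0, 2.0]

print(pearson_r(rising, rising))    # direct relationship: r close to +1
print(pearson_r(rising, falling))   # inverse relationship: r close to -1
```

The sign of the result answers the second question, while its distance from zero indicates how strong the linear association is.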

Case description for secondary data in SPSS

Many economists have shown that population growth and the unemployment rate have an impact on a country’s Gross Domestic Product (GDP), the main indicator of economic growth. This case examines that relationship in the context of India, which has a large population and a high unemployment rate.

Therefore, there are three variables:

  • Unemployment rate (UNE), in %
  • Population growth rate (POPG), in %
  • Economic output (GDP), in US$ per year

The dataset covers the period 2012-2018 and was obtained from the World Bank website.

The first step is to compute the natural logarithm of the secondary data in SPSS

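Since the variables are measured in different units (percentages and US$), the natural logarithm of each variable is computed first. Strictly speaking, the Pearson coefficient itself does not depend on the units of measurement, so the practical benefit of the transformation is that it compresses the very large GDP values and can make the relationships between the variables closer to linear. To do this, click on ‘Transform’ and then ‘Compute Variable’.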

Figure 1: Transforming secondary data in SPSS

With this, a new dialogue box will open.

Figure 2: Procedure for computing the transformed variable in SPSS

In this dialogue box, as shown in the figure above, type ‘ln’ in the ‘Numeric Expression’ box for the natural log transformation and place the respective variable inside the brackets, i.e. ln(GDP). The ‘Target Variable’ field defines the name of the new variable; type LnGDP and click ‘OK’. SPSS then computes a new natural log-transformed variable for GDP. Repeat this procedure for the unemployment and population growth variables to create LnUNE and LnPOPG.
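For readers who want to check the transformed values outside SPSS, the same step can be sketched in Python. This is only an illustration of what the ln() expression does row by row. The helper name ln_transform and the GDP figures are hypothetical and are not taken from the World Bank dataset. Note that the natural log is undefined for zero or negative values, which matters if a growth rate is ever zero or negative.

```python
import math

def ln_transform(values):
    """Return the natural log of each value, like ln(variable) applied to every row."""
    if any(v <= 0 for v in values):
        raise ValueError("natural log is undefined for zero or negative values")
    return [math.log(v) for v in values]

# Hypothetical GDP figures in US$ (illustrative only)
gdp = [1.8e12, 2.0e12, 2.3e12]
ln_gdp = ln_transform(gdp)

for original, transformed in zip(gdp, ln_gdp):
    print(f"GDP = {original:.3e}  ->  LnGDP = {transformed:.4f}")
```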

The next step is to compute the correlation

To compute the correlation, click on ‘Analyze’ in the main menu, then ‘Correlate’ and then ‘Bivariate’, as shown below.

Figure 3: Procedure for conducting bivariate correlation test in SPSS

With this, the Bivariate correlation window will appear.

Now move the natural log-transformed variables

Move the natural log-transformed variables, i.e. LnGDP, LnUNE, and LnPOPG, into the ‘Variables’ box. Then click on ‘OK’.

Figure 4: Bivariate correlation window in SPSS

SPSS then runs the correlation analysis and opens the output window with the correlation results, as shown below.

Figure 5: Results of bivariate correlation on secondary data in SPSS
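The output table pairs every variable with every other variable, so it is symmetric and has 1 on its diagonal. A rough Python sketch of how such a matrix is assembled is given below. It is not SPSS's implementation, the helper names are assumptions, and the three series are invented for illustration, so the printed coefficients will not match Figure 5.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    co_dev = sum(a * b for a, b in zip(dx, dy))
    return co_dev / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def correlation_matrix(columns):
    """Map each pair of variable names to its Pearson r (nested dict)."""
    names = list(columns)
    # A variable is perfectly correlated with itself, so the diagonal is 1
    matrix = {a: {b: 1.0 if a == b else None for b in names} for a in names}
    for a, b in combinations(names, 2):
        r = pearson_r(columns[a], columns[b])
        matrix[a][b] = r      # r is symmetric, so fill both cells
        matrix[b][a] = r
    return matrix

# Hypothetical log-transformed series (illustrative only)
data = {
    "LnGDP":  [28.2, 28.3, 28.4, 28.5, 28.6],
    "LnUNE":  [1.70, 1.69, 1.71, 1.68, 1.67],
    "LnPOPG": [0.25, 0.22, 0.19, 0.15, 0.12],
}

matrix = correlation_matrix(data)
for row in data:
    print(row, [round(matrix[row][col], 3) for col in data])
```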

Interpreting the results from the correlation test on secondary data in SPSS

For the association between two variables to be treated as statistically significant in this case, the ‘Sig.’ value should be less than 0.10. The above figure shows that the ‘Sig.’ value for the correlation of LnGDP with LnUNE is 0.021 and with LnPOPG is 0.00, i.e. too small to show at the displayed precision. Both are below the required significance level of 0.10. Therefore, it can be concluded that GDP is significantly correlated with both the unemployment rate and the population growth rate of India for the period 2012-18. Further, the Pearson correlation values for the relationship of GDP with the unemployment rate and with population growth are -0.55 and -0.96 respectively, showing a negative association in both cases.
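The decision rule above can be sketched in Python as follows. The ‘Sig.’ value is a two-tailed p-value, which is conventionally obtained from r and the number of observations n through a t statistic with n - 2 degrees of freedom. The sketch below assumes that standard test and approximates the p-value with Simpson's rule using only the standard library; all function names are hypothetical, the inputs in the usage lines are made-up values, and the numerical integration is only suitable for small samples like the one in this case.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value using Simpson's rule on [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # probability that T lies between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def correlation_significance(r, n):
    """t statistic and two-tailed p-value for the hypothesis of no correlation."""
    df = n - 2
    if df < 1:
        raise ValueError("at least 3 observations are needed")
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r), 0.0
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, two_tailed_p(t, df)

def interpret(r, p, alpha=0.10):
    """Apply the rule from the text: significant if p < alpha, direction from the sign of r."""
    if p >= alpha:
        return "no significant association"
    direction = "positive" if r > 0 else "negative"
    return f"significant {direction} association"

# Hypothetical inputs (illustrative only): r = -0.9 from 10 observations
t_stat, p_value = correlation_significance(-0.9, 10)
print(round(t_stat, 2), interpret(-0.9, p_value))
```

Lowering alpha to 0.05 or 0.01 makes the rule stricter; the 0.10 default simply mirrors the threshold used in this article.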
