How to do the correlation analysis in STATA?

By Priya Chetty on November 23, 2016

Correlation analysis is conducted to examine the strength and direction of the linear relationship between variables, for example between a dependent and an independent variable. There are two types of correlation analysis in STATA.

  1. Pairwise correlation treats each pair of variables separately and only includes observations that have valid values for each pair in the data set.
  2. The second type is the standard correlation (the correlate command), which treats the listed variables as one set and uses only the observations that have valid values for every variable in the list.

In both cases, the linear relationship between the variables is computed; the only difference is in the way missing values are handled. In pairwise correlation, an observation is left out of the computation for a given pair of variables only when one or both of its values for that pair are missing. If no varlist is specified, the matrix is displayed for all the variables in the dataset.
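The difference in how missing values are handled can be seen by running both commands on the same variables. The sketch below assumes the variables used later in this article come from the auto dataset that ships with STATA (loaded with sysuse); that dataset name is an assumption, not something stated in the article.

```stata
* Assumption: price, mpg, rep78 and headroom come from auto.dta,
* the example dataset bundled with STATA
sysuse auto, clear

* Standard correlation: an observation missing on ANY listed variable
* is dropped from every coefficient
correlate price mpg rep78 headroom
display "Observations used by correlate: " r(N)

* Pairwise correlation: each coefficient uses every observation that is
* valid for that particular pair; obs prints the count per pair
pwcorr price mpg rep78 headroom, obs

* The listwise option makes pwcorr drop incomplete observations
* in the same way correlate does
pwcorr price mpg rep78 headroom, obs listwise
```

Because rep78 has missing values in that dataset, correlate works with fewer observations than pwcorr uses for pairs that do not involve rep78.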

Doing correlation analysis using the dropdown list


Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Correlations and covariances

Pairwise Correlation

Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Pairwise correlations

[Figure: Correlation analysis using STATA]

This article demonstrates pairwise correlation with an example, because it uses every available observation for each pair of variables. From the drop-down list, select the variables that you need to correlate.

[Figure: Various options available for correlation analysis in STATA (significance level, number of observations)]

In the graphical user interface, the analysis is run by selecting the variables in the dialog box. Next, check the boxes titled:

  • Print Number of observations for each entry.
  • Print significance level for each entry.
  • The significance level for displaying with a star.

See the image below:

[Figure: Selecting different options in a correlation analysis]

Commands used for pairwise correlation

The basic command for pairwise correlation is:

pwcorr VariableA VariableB

To have STATA report a p-value (the statistical significance level) for each coefficient, add the sig option after the comma, as shown below:

pwcorr VariableA VariableB, sig

To flag the coefficients that are significant at a specific significance level (e.g., p < .05 or p < .01), follow sig with the star() option, giving the level in parentheses:

pwcorr VariableA VariableB, sig star(.05)

To also display the number of observations (N, the sample size) used for each coefficient, add the obs option:

pwcorr VariableA VariableB, sig star(.05) obs

Output for pairwise correlation in STATA

The pairwise correlation was done between price, mileage (mpg), repair record 1978 (rep78) and headroom. The table below reports the Pearson correlation coefficient for each pair of variables, its significance value, and the number of observations used. The sample size varies by pair: it is 69 for pairs involving rep78 and 74 for the rest.

pwcorr price mpg rep78 headroom, obs sig star(5)
              price       mpg     rep78  headroom
rep78        0.0066   0.4023*   1.0000
headroom     0.1145  -0.4138*  -0.1480    1.0000

The output shows a negative correlation between mpg and the price of the car, which is significant at the 5% significance level. Similarly, a negative correlation exists between headroom and mpg (-0.4138). Further, there is a positive correlation between rep78 and mpg (0.4023). The star next to each of these coefficients marks it as significant at the 5% level.
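The significance value that the sig option reports can be reproduced by hand from the coefficient and the sample size. The sketch below does this for price and mpg, again assuming the data are the bundled auto dataset; the scalar names r_pm, n_pm, t_pm and p_pm are illustrative only. It uses the usual t test for a Pearson correlation, with n - 2 degrees of freedom.

```stata
* Assumption: price and mpg come from auto.dta, bundled with STATA
sysuse auto, clear

* correlate leaves the coefficient and the sample size in r()
quietly correlate price mpg
scalar r_pm = r(rho)
scalar n_pm = r(N)

* t statistic and two-sided p-value for the null hypothesis rho = 0
scalar t_pm = r_pm * sqrt((n_pm - 2) / (1 - r_pm^2))
scalar p_pm = 2 * ttail(n_pm - 2, abs(t_pm))

display "rho = " %7.4f r_pm "   N = " n_pm "   p = " %6.4f p_pm
```

Scalars are used here rather than local macros because expanding a negative coefficient into an expression such as 1 - `rho'^2 would apply the power before the minus sign and give the wrong result.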



