Performing pooled panel data regression in STATA

By Saptarshi Basu Roy Choudhury & Priya Chetty on October 30, 2018

Before applying panel data regression, the first step is to disregard the effects of space and time and perform a pooled regression instead. Here an ordinary OLS regression estimates the effect of the independent variables on the dependent variable, ignoring the fact that the data are both cross-sectional and time series. The underlying assumption of pooled regression is that the space and time dimensions create no distinction between observations and that there are no fixed effects in the data. This article explains how to perform pooled panel data regression in STATA.
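As a point of reference, the pooled model described above can be sketched as a plain OLS regression that ignores both the company and the time dimension. This is only a baseline illustration using the variables that appear later in the article (EBIT, LTD and INT); it is not a command the original steps rely on.

```stata
* Baseline sketch: pooled OLS that treats every company-year row as an
* independent observation, with no company dummies at all
regress EBIT LTD INT
```

Comparing these coefficients with the regression that includes company dummies gives a first impression of how much the company distinction matters.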

To start with pooled regression, first create dummies for all the cross-sectional units. In this case, the units are the companies from the previous article (Introduction to panel data analysis in STATA). To make the dummies for all 30 companies, use the below command:

tabulate compnam, gen(companies)

Note that “compnam” is the panel variable that identifies each company. The command creates one dummy per company from this variable, named companies1 to companies30. The results will appear.

Figure 1: Dummies for panel variable to perform pooled panel data regression in STATA

The figure above shows the dummies for the 30 companies in STATA. Now perform the pooled regression with these dummies, leaving out the first one, using the following command.

reg EBIT LTD INT companies2 companies3 companies4 companies5 companies6 companies7 companies8 companies9 companies10 companies11 companies12 companies13 companies14 companies15 companies16 companies17 companies18 companies19 companies20 companies21 companies22 companies23 companies24 companies25 companies26 companies27 companies28 companies29 companies30

Note that the regression uses 29 dummies, starting from company 2. One dummy is skipped so that the constant carries its effect; including all 30 dummies together with the constant would cause perfect collinearity.

The figure below shows the results.

Figure 2: Pooled regression results in STATA
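The same regression can also be written without generating the dummies by hand, using STATA's factor-variable notation. The sketch below assumes that “compnam” is stored as a string, so it is first converted to a numeric variable; the name company_id is hypothetical and not part of the original dataset. If “compnam” is already numeric, i.compnam can be used directly.

```stata
* Sketch only: company_id is a hypothetical name chosen for illustration
* encode turns the string panel variable into a labelled numeric variable
encode compnam, generate(company_id)

* i.company_id adds one indicator per company and omits the first as the base,
* so the constant again carries the effect of the omitted company
regress EBIT LTD INT i.company_id
```

Because the first company serves as the base category in both approaches, this should give the same coefficients as the long command above.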

Furthermore, to check whether the above result is appropriate for the panel dataset, confirm that the dummies have no joint effect on the results. If they do carry a joint effect, the pooled regression estimates are not valid. To test the joint hypothesis on the dummies, use the below command.

testparm companies2 companies3 companies4 companies5 companies6 companies7 companies8 companies9 companies10 companies11 companies12 companies13 companies14 companies15 companies16 companies17 companies18 companies19 companies20 companies21 companies22 companies23 companies24 companies25 companies26 companies27 companies28 companies29 companies30

The results will appear.

Figure 3: Results of the joint hypothesis of dummies for pooled panel data regression in STATA
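The joint test can be typed more compactly with a variable range, and its results can be read back from the stored values. The sketch below assumes the regression with the 29 company dummies was the last estimation command, and that the dummies sit next to each other in the dataset, which is how tabulate with gen() creates them.

```stata
* companies2-companies30 selects every variable stored between these two
* in the dataset, i.e. the 29 dummies used in the regression
testparm companies2-companies30

* testparm leaves its results in r(); show the F statistic and its p-value
display "F(" r(df) ", " r(df_r) ") = " %9.2f r(F)
display "Prob > F = " %6.4f r(p)
```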

Here, the null hypothesis is that the joint effect of all the dummies is zero, meaning that differences between the companies do not affect the model. However, the results show a p-value of 0.000, which indicates that the null hypothesis can be rejected. This confirms that the pooled regression is not free from the joint effects of the dummies.

Therefore the panel data set here carries variation due to the distinction between the companies, and a regression on this data may involve fixed effects. As a result, the pooled regression technique is unsuitable for this dataset, and the analysis should move towards either fixed or random effects panel data regression. Before starting with panel data regression, ensure the absence of a unit root problem, since this data also has a time dimension.

Stationarity or unit root in panel data

The augmented Dickey-Fuller test is designed for a single time series, so it cannot be applied directly to panel data. Instead, test for a unit root in the panel data with the Levin-Lin-Chu test using the below command.

xtunitroot llc LTD

In the above command, ‘LTD’ is the variable to test and “xtunitroot llc” calls the Levin-Lin-Chu test. The below results will appear.

Figure 4: Levin-Lin-Chu unit root test results for pooled panel data regression in STATA

The p-value of the adjusted t* statistic is 0.0004, so the null hypothesis that the panels contain a unit root is rejected. Therefore the panel data series Long Term Debt (LTD) is stationary. In a similar manner, check the stationarity of the other variables, EBIT (Earnings before Interest and Taxes) and INT (Interest payments). To do so, use the below commands.

xtunitroot llc EBIT
xtunitroot llc INT

The below results will appear.

Figure 5: Levin-Lin-Chu unit root test results for pooled panel data regression in STATA
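The three unit root tests can also be run in one pass with a loop. The sketch below assumes the data still need to be declared as a panel; company_id and year are hypothetical names for a numeric company identifier and the time variable, so replace them with the names in the actual dataset. Skip the xtset line if the panel was already declared in the previous article.

```stata
* Sketch only: company_id and year are hypothetical variable names
* xtunitroot requires the data to be declared as a panel first
xtset company_id year

* Run the Levin-Lin-Chu test on each series in turn
foreach v of varlist EBIT LTD INT {
    display "Levin-Lin-Chu test for `v'"
    xtunitroot llc `v'
}
```

Note that the Levin-Lin-Chu test needs a strongly balanced panel, so every company must have observations for the same time periods.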

This article showed that the pooled regression technique fails to analyze this panel data series correctly, and then tested the series for stationarity. The next article shows how to start with panel data analysis and explains the concepts of random effects and fixed effects in panel data.
