# Performing pooled panel data regression in STATA

Before applying panel data regression, the first step is to disregard the effects of space and time and run a pooled regression instead. Here an ordinary least squares (OLS) regression estimates the effect of the independent variables on the dependent variable while ignoring the fact that the data are both cross-sectional and time series. The underlying assumption of pooled regression is that the space and time dimensions create no distinction among the observations, so there are no fixed effects in the data. This article explains how to perform pooled panel data regression in STATA.

To start with pooled regression, first create dummies for all the cross-sectional units. In this case, these are the companies from the previous article (Introduction to panel data analysis in STATA). To make the dummies for all 30 companies, use the command below:

`tabulate compnam, gen(companies)`

Note that “compnam” is the panel variable that identifies each company. The command creates one dummy per company from this variable. The results below will appear.

Figure 1: Dummies for panel variable to perform pooled panel data regression in STATA

The figure above shows the dummies for the 30 companies in STATA. Now perform the pooled regression with these dummies using the following command.

`reg EBIT LTD INT companies2 companies3 companies4 companies5 companies6 companies7 companies8 companies9 companies10 companies11 companies12 companies13 companies14 companies15 companies16 companies17 companies18 companies19 companies20 companies21 companies22 companies23 companies24 companies25 companies26 companies27 companies28 companies29 companies30`

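Note that the regression includes 29 dummies, starting from company 2. One dummy is omitted so that the constant term carries the effect of the omitted company (company 1); including all 30 dummies together with the constant would cause perfect multicollinearity.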

The figure below shows the results.

Figure 2: Pooled regression results in STATA
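The regression above can also be written more compactly. The following is a minimal sketch, assuming the dummies created by `tabulate` sit next to each other in the dataset (which is how `gen()` adds them) and, for the second form, that “compnam” is a numeric variable with nonnegative integer values. The variable `company_id` in the comment is a hypothetical name used only for illustration.

```stata
* Variable-range shorthand: companies2 through companies30 (29 dummies),
* leaving companies1 as the base category absorbed by the constant.
regress EBIT LTD INT companies2-companies30

* Factor-variable alternative: Stata builds the dummies on the fly and
* omits the first level by default. Assumes compnam is numeric; if it is
* a string, it would first need something like:
*     encode compnam, generate(company_id)
regress EBIT LTD INT i.compnam
```

Both forms should give the same coefficients on LTD and INT as the long command, provided the same company serves as the base category.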

Furthermore, to check whether the above result is appropriate for the panel dataset, test whether the dummies have a joint effect on the results. If they do, the pooled regression estimates are not reliable. To test the joint hypothesis for the dummies, use the command below.

`testparm companies2 companies3 companies4 companies5 companies6 companies7 companies8 companies9 companies10 companies11 companies12 companies13 companies14 companies15 companies16 companies17 companies18 companies19 companies20 companies21 companies22 companies23 companies24 companies25 companies26 companies27 companies28 companies29 companies30`

The below results will appear.

Figure 3: Results of the joint hypothesis of dummies for pooled panel data regression in STATA

Here, the null hypothesis is that the joint effect of all the dummies is zero, that is, variation in the data due to differences between companies does not affect the model. However, the results show a p-value of 0.000, which indicates that the null hypothesis can be rejected. This confirms that the pooled regression is not free from the joint effects of the dummies.
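The joint test can also be written with a variable range, and its stored results can be read back to make the decision rule explicit. This is a sketch only, assuming it runs directly after the regression that includes the 29 dummies; the scalar names `F_joint` and `p_joint` and the 5% threshold are illustrative choices, not part of the original analysis.

```stata
* Joint test that the coefficients on all 29 company dummies are zero.
testparm companies2-companies30

* testparm leaves the F statistic and p-value in r(); copy them to scalars.
scalar F_joint = r(F)
scalar p_joint = r(p)
display "F = " F_joint "   p-value = " p_joint

* Illustrative decision rule at the 5% level.
if p_joint < 0.05 {
    display "Dummies are jointly significant: pooled OLS is not adequate."
}
else {
    display "No evidence of company-specific effects at the 5% level."
}
```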

Therefore the panel dataset here carries variation due to the distinction between the companies, and a regression on this data may involve some form of fixed effects. As a result, the pooled regression technique is not appropriate for this dataset, and the analysis should move towards either fixed or random effects panel data regression. Before starting with panel data regression, check for the absence of a unit root problem, since the data also have a time dimension.

## Stationarity or unit root in panel data

The augmented Dickey-Fuller test, which is designed for a single time series, cannot be applied directly to panel data. Instead, test for a unit root in the panel data with the Levin-Lin-Chu test, using the command below.

`xtunitroot llc LTD`

In the above command, “LTD” is the variable to test and “xtunitroot llc” requests the Levin-Lin-Chu test. The results below will appear.

Figure 4: Levin-Lin-Chu unit root test results for pooled panel data regression in STATA

The p-value of the adjusted t* statistic is 0.0004, so reject the null hypothesis that a unit root is present. The panel series Long Term Debt (LTD) is therefore stationary. In a similar manner, check the stationarity of the other variables, EBIT (Earnings before Interest and Taxes) and INT (Interest payments), using the commands below.

```
xtunitroot llc EBIT
xtunitroot llc INT
```

The below results will appear.

Figure 5: Levin-Lin-Chu unit root test results for pooled panel data regression in STATA
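The three unit root tests can also be run in one pass. The sketch below assumes the data can be declared as a panel with “compnam” as a numeric panel identifier; `year` is a hypothetical name for the time variable, which is not given in this article. If the data were already declared with `xtset` in the previous article, that line can be skipped.

```stata
* Declare the panel structure first; xtunitroot requires xtset data.
* "year" is an assumed name for the time variable.
xtset compnam year

* Run the Levin-Lin-Chu test on each variable in turn.
foreach v of varlist LTD EBIT INT {
    display as text _newline "Levin-Lin-Chu test for `v'"
    xtunitroot llc `v'
}
```

Note that the Levin-Lin-Chu test in STATA requires a strongly balanced panel, so every company needs an observation in every period.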

This article showed that the pooled regression technique fails to analyze this panel data series correctly, and then proceeded to the stationarity test. The next article shows how to start with panel data analysis and explains the concepts of random effects and fixed effects in panel data.

### Saptarshi Basu Roy Choudhury

Senior Research Analyst at Project Guru
Saptarshi has done his M. Phil in International Trade and Development and Masters in Economics from Jawaharlal Nehru University, New Delhi. His academic interests include issues related to economics of climate change, regulation and contemporary trade theories. He has a keen interest in current affairs and likes to read and travel in his spare time.
