Faecal coliform is an important indicator of water pollution. It represents the microorganisms present in water owing to the faecal material of humans and animals. Such contamination spreads water-borne diseases caused by bacteria and other pathogens. The Ministry of Urban Development has recommended 500 MPN/100ml as the desirable limit and 2500 MPN/100ml as the permissible limit for rivers.

However, the coliform level in sewage water in India is reported to be so high that even after treatment it cannot be brought down to the permissible limit. As a result, some water streams affected by faecal coliform have become a major source of health risks (Area & Seth, 2018; M. Kumar & Puri, 2012; Prathibha & Murulidhar, 2015).

## Presence of faecal coliform in major Indian rivers

In India, open defecation is the main reason for the high amount of faecal coliform in water bodies. Other reasons include the disposal of industrial effluents and sewage (CPCB, 2015; CSIR-NEERI, 2018; Fakhr, Gohar, & Atta, 2016).

The figure below shows the level of faecal coliform in the **Indian rivers** for the period 2002-2017. The graph shows that the amount present in Indian rivers is very high. Until 2008 the level was relatively low, but it shot up during 2008-2013. After 2013, it fell below 10000000 MPN/100ml. Ganga and Yamuna consistently have a very high amount of faecal coliform, followed by Brahmaputra, Brahmani, and Mahanadi. Mahi, Narmada, and Beas have the lowest levels of faecal coliform.

There are many factors causing a high level of faecal coliform in water bodies. The aim of this article is to study the impact of **FDI (Foreign Direct Investment)** inflows on faecal coliform in the rivers. For this purpose, data from 15 major Indian rivers were collected for the time period 2007-2017. The data on water pollution indicators were obtained from the official websites of the National Water Mission and Ministry of Statistics and Programme Implementation.

## Investigating the impact of **FDI** inflows on faecal coliform levels

The first step in a time series analysis is to check the data for variability. According to Lütkepohl & Xu (2009), a dataset should be stable, with minimum variation. The variability in this dataset was found to be high. Therefore, the dataset was first stabilized using a natural log transformation in MS Excel. This stabilized dataset was then used for further analysis.
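The log transformation described above can be sketched in a few lines of Python. This is only an illustration: the article performed the step in MS Excel, the readings in `raw_fc` are hypothetical values rather than the study's data, and the helper names are assumptions made for this example.

```python
import math
import statistics


def log_transform(values):
    """Return the natural log of each value; all values must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("natural log requires strictly positive values")
    return [math.log(v) for v in values]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (a unit-free spread measure)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical faecal coliform readings in MPN/100ml, for illustration only.
raw_fc = [120_000, 450_000, 9_800_000, 3_200_000, 760_000]
ln_fc = log_transform(raw_fc)

cv_raw = coefficient_of_variation(raw_fc)
cv_log = coefficient_of_variation(ln_fc)
print(f"CV before ln: {cv_raw:.3f}, CV after ln: {cv_log:.3f}")
```

The coefficient of variation is used here only as a simple way to compare spread before and after the transformation; the article does not state which variability measure it used.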

In order to investigate the impact of **FDI** inflows on the level of faecal coliform in Indian rivers, the following hypothesis was framed.

H_{0}: There is no significant impact of **FDI** inflows on faecal coliform levels in Indian rivers.

H_{A}: There is a significant impact of **FDI** inflows on faecal coliform levels in Indian rivers.

## Pre-condition tests

The dataset should first be tested for stationarity, cointegration, and normality of the variables. The Augmented Dickey-Fuller (ADF) test, the Johansen cointegration test, and the Shapiro-Wilk test were applied, respectively, in STATA software.

### a) Stationarity

Stationarity is the property of a time series which ensures that the mean and variance of the variable are constant over time. This assumption helps in determining the relationship reliably (Adhikari, 2013; Gujarati & Porter, 2009; Nason, 2018). The table below presents the results of the ADF test for stationarity.

Variable | Test-Statistic | 5% Critical Value | p-value |
---|---|---|---|
LnFC | -4.367 | -3.000 | 0.0003 |
LnFDI | -1.511 | -3.000 | 0.5283 |
LnFDI with drift | -1.511 | -1.771 | 0.0774 |
LnFDI with trend | -1.259 | -3.600 | 0.8976 |
DiffLnFDI | -3.584 | -3.000 | 0.0061 |

Table 1: ADF results

The above table shows that for faecal coliform, the p-value is 0.0003 < 0.05, the significance level. Furthermore, the absolute value of the test statistic is greater than the absolute critical value, i.e. 4.367 > 3.000. Thus, the null hypothesis of the data being non-stationary without intercept is rejected. Hence, LnFC is a stationary variable.

In the case of **FDI** inflows, the p-value is 0.5283 > 0.05, the significance level. Furthermore, the absolute value of the test statistic is also less than the absolute critical value. Thus, the null hypothesis of having non-stationarity without intercept is not rejected.

For the drift- and trend-adjusted forms of **FDI** inflows, which add an intercept and a deterministic trend respectively (Gujarati & Porter, 2009), non-stationarity is again not rejected, as the p-values are 0.0774 and 0.8976, both above 0.05. Stationarity is finally obtained at first-order differencing, where the p-value is 0.0061 < 0.05 and the absolute value of the test statistic exceeds the absolute critical value, i.e. 3.584 > 3.000. Thus, the null hypothesis that the differenced **FDI** data are non-stationary is rejected, and DiffLnFDI is generated to represent the stationary form of **FDI** inflows.
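The decision rule applied to Table 1 can be written out as a short sketch. The numbers are copied from Table 1; the function names are assumptions made for this example, and the code only reproduces the comparison step, not the ADF test itself, which the article ran in STATA.

```python
# Rows copied from Table 1: (variable, test statistic, 5% critical value, p-value).
TABLE_1 = [
    ("LnFC", -4.367, -3.000, 0.0003),
    ("LnFDI", -1.511, -3.000, 0.5283),
    ("LnFDI with drift", -1.511, -1.771, 0.0774),
    ("LnFDI with trend", -1.259, -3.600, 0.8976),
    ("DiffLnFDI", -3.584, -3.000, 0.0061),
]


def rejects_unit_root(statistic, critical_value, p_value, alpha=0.05):
    """ADF decision rule: reject the null of non-stationarity when the statistic
    is more negative than the critical value and the p-value is below alpha."""
    return statistic < critical_value and p_value < alpha


def first_difference(series):
    """Return x_t - x_{t-1}; the result is one observation shorter than the input."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


decisions = {
    name: rejects_unit_root(stat, crit, p) for name, stat, crit, p in TABLE_1
}
for name, stationary in decisions.items():
    print(f"{name}: {'stationary' if stationary else 'non-stationary'}")
```

`first_difference` shows how a series such as DiffLnFDI would be derived from LnFDI; applying it costs one observation, which matters for a sample as short as this one.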

### b) Cointegration

The cointegration test studies the nature of the relationship between the variables. It determines whether there is any long-run relationship or linkage between them. The existence of cointegration ensures that such a linkage exists and that no spurious or nonsense relationship is studied (Gujarati & Porter, 2009). The Johansen cointegration results testing the long-run relationship between **FDI** inflows and faecal coliform are given below.

Max. Ranks | Trace Statistic | 5% Critical Value | Max Statistic | 5% Critical Value |
---|---|---|---|---|
0 | 12.6195* | 15.41 | 10.3693 | 14.07 |
1 | 2.2502 | 3.76 | 2.2502 | 3.76 |

*\* represents significance at the 5% level*

Table 2: Johansen cointegration test results

The above table shows that the trace statistic is less than the critical value for rank 0, i.e. 12.6195 < 15.41. Furthermore, the max statistic is also less than its critical value for rank 0, i.e. 10.3693 < 14.07. Thus, the null hypothesis of no long-run cointegration between faecal coliform and **FDI** inflows is not rejected.

Hence, there are 0 cointegrating vectors, i.e. no long-run relationship between the variables. This approach to testing cointegration was also used by Kumar & Chander (2016). In their study of the impact of **FDI** inflows on environmental pollution, the trace and max statistics were compared with the critical values: if a statistic exceeds its critical value, the variables are cointegrated and move together in the long run. In Table 2, however, the trace statistic is below the critical value at rank 0, so the two variables do not move together in the long run.
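The way the rank is read off Table 2 can be illustrated with a small sketch of the sequential trace-test rule: start at rank 0 and stop at the first rank whose trace statistic does not exceed its critical value. The values come from Table 2; `select_rank` is a hypothetical helper written for this example, not a STATA or library function.

```python
# Rows copied from Table 2: (rank, trace statistic, 5% critical value).
TABLE_2 = [
    (0, 12.6195, 15.41),
    (1, 2.2502, 3.76),
]


def select_rank(rows):
    """Return the number of cointegrating vectors implied by the trace test.

    Rows must be sorted by rank. The first rank whose trace statistic does not
    exceed its critical value is selected; if every null is rejected, the rank
    equals the number of rows (full rank).
    """
    for rank, trace_stat, critical_value in rows:
        if trace_stat <= critical_value:
            return rank
    return len(rows)


rank = select_rank(TABLE_2)
print(f"Cointegrating vectors: {rank}")
```

With the Table 2 values the loop stops immediately at rank 0, which matches the conclusion that no long-run relationship is found.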

As the long-run relationship does not exist, the short-run relationship between the variables is tested using VECM (Vector Error Correction Model) (Azhagaiah & Banumathy, 2015; Zou, 2018). Results of the test are given below.

Cointegrating Equation Variable | Coefficient |
---|---|
LnFC | 1 |
DiffLnFDI | 4.65517 |
Constant | -15.60355 |

Table 3: VECM model Results

The results show that the coefficient of DiffLnFDI is 4.65517, which indicates a sizeable impact of **FDI** inflows on faecal coliform levels. Thus, the two variables move together in the short run (Zou, 2018). Hence, a short-run relationship exists.

### c) Normality

A dataset is said to be normally distributed if the values are symmetrically distributed around the mean. Normality of the variables is an assumption of classical linear regression (Casson & Farmer, 2014). Thus, to depict the impact of **FDI** inflows on the faecal coliform level, the normality of the **FDI** inflows and faecal coliform data is tested using the Shapiro-Wilk test. The result of the test is presented below.

Variable | P-value |
---|---|
LnFC | 0.32124 |
DiffLnFDI | 0.50621 |

Table 4: Shapiro-Wilk test results

The above table shows that the p-values are greater than the significance level of the study, i.e. 0.32124 and 0.50621 > 0.05. Thus, the null hypothesis that the variables are normally distributed is not rejected. Hence, the **FDI** inflows and faecal coliform data are normally distributed.

## Regression

Considering the stationary, cointegrated, and normally distributed form of the variables, the regression is performed using the equation below.

LnFC_{t} = β_{0} + β_{1} DiffLnFDI_{t} + ε_{t}

Wherein,

Variables | Nature of variable | Description |
---|---|---|
LnFC_{t} | Dependent | Stationary form of the natural log-transformation of the average faecal coliform level of Indian rivers at time period t |
DiffLnFDI_{t} | Independent | Stationary form of the natural log-transformation of net FDI inflows at time period t |
β_{0}, β_{1} | Coefficients | Intercept, slope coefficient |
ε_{t} | Error term | Residual |
t | Time | |

Table 5: Variables description of regression test

Regression results for the final model are given in the below table.

LnFC | Coefficient | t-value | p-value | R^{2} value | Adjusted R^{2} value |
---|---|---|---|---|---|
DiffLnFDI | -2.943657 | -2.58 | 0.023 | 0.3390 | 0.2881 |
Constant | 15.25517 | 34.67 | 0.000 | | |

Table 6: Regression results

Results in the above table show that the final model is statistically significant. The values of R^{2} and adjusted R^{2} are 0.3390 and 0.2881, respectively. Furthermore, the p-value of 0.023 is less than the significance level of the study, i.e. 0.05. However, before testing the hypothesis, some more tests need to be performed to avoid the presence of biases in the results (Casson & Farmer, 2014).
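The link between the two fit measures in Table 6 can be checked with the standard adjusted R^{2} formula. The sketch below assumes a sample of 15 observations, which is not stated in the article; it is simply the value for which the formula reproduces the reported adjusted R^{2} with one regressor.

```python
def adjusted_r2(r2, n_obs, n_regressors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# R^2 from Table 6; n_obs=15 is an assumption inferred for this illustration.
adj = adjusted_r2(0.3390, n_obs=15)
print(f"Adjusted R^2: {adj:.4f}")
```

The penalty term (n - 1) / (n - k - 1) is why the adjusted value sits noticeably below R^{2} in a sample this small.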

The above figure shows that the data are scattered away from the fitted line. Thus, although there is a significant impact of **FDI** inflows on the faecal coliform level, some biases still exist. Furthermore, the fitted line shows that there is a negative relationship between **FDI** inflows and the faecal coliform level. Thus, to obtain more reliable results, further diagnostic tests were applied to the residuals of the regression.

## Diagnostic tests

### a) Autocorrelation

Autocorrelation means the existence of interdependence between the error terms. The presence of autocorrelation reduces the validity and precision of the results of the model. No autocorrelation is also an assumption of classical linear regression (Huitema & Laraway, 2006). The Durbin-Watson test determines whether the residuals are interrelated. The result of the Durbin-Watson test for the data is presented below.

D-statistic | D_{L} | D_{U} | 4-D_{U} | 4-D_{L} |
---|---|---|---|---|
3.025302 | 0.946 | 1.543 | 2.457 | 3.054 |

Table 7: Durbin Watson Result

The above table shows that the D-statistic lies between 4-D_{U} and 4-D_{L}, i.e. 2.457 < 3.025302 < 3.054. Although the value lies in the indeterminate zone, it is close to the region of negative serial correlation. Okumoko, Akarara, & Opuofoni (2018) stated that values close to 2 mean the model has no autocorrelation.

Here, the Durbin-Watson statistic is far from 2, so further processing is needed to fulfil the conditions of the classical linear regression model (Casson & Farmer, 2014) and derive reliable results. Hence, in order to remove autocorrelation from the model, the Cochrane-Orcutt AR (1) regression (Wooldridge, 2002) was performed.

LnFC | Coefficient | t-value | p-value | R^{2} value | Adjusted R^{2} value |
---|---|---|---|---|---|
DiffLnFDI | -4.235418 | -5.19 | 0.000 | 0.6915 | 0.6658 |
Constant | 15.49288 | 61.90 | 0.000 | | |

Table 8: Cochrane-Orcutt AR (1) regression results for the final model
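For readers unfamiliar with the procedure, the following is a minimal sketch of an iterative Cochrane-Orcutt AR(1) estimation for a single regressor. It is not the STATA routine used in the article: the function names are assumptions made for this example, the data are synthetic (generated with a known slope of -4 and AR(1) coefficient of -0.5), and only point estimates are produced.

```python
import random


def ols(x, y):
    """Simple linear regression y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def ar1_coefficient(resid):
    """Estimate rho by regressing e_t on e_{t-1} without an intercept."""
    num = sum(prev * curr for prev, curr in zip(resid, resid[1:]))
    den = sum(prev ** 2 for prev in resid[:-1])
    return num / den if den else 0.0


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Iterate between estimating rho and re-fitting OLS on quasi-differenced data.

    Returns (b0, b1, rho). The first observation is dropped by the
    transformation, as in the classical Cochrane-Orcutt procedure.
    """
    b0, b1 = ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        new_rho = ar1_coefficient(resid)
        x_star = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a0, b1 = ols(x_star, y_star)
        b0 = a0 / (1.0 - new_rho)  # recover the intercept of the original model
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return b0, b1, rho


# Synthetic demo data (not the article's data): y = 15 - 4x + e,
# where e_t = -0.5 * e_{t-1} + u_t and u_t is standard normal noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(400)]
e, errors = 0.0, []
for _ in x:
    e = -0.5 * e + rng.gauss(0.0, 1.0)
    errors.append(e)
y = [15.0 - 4.0 * xi + ei for xi, ei in zip(x, errors)]

b0, b1, rho = cochrane_orcutt(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}, rho={rho:.3f}")
```

Because the transformation drops the first observation, the corrected model is estimated on one fewer data point than the original regression.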

The results of the Durbin-Watson test for the above Cochrane-Orcutt AR (1) regression model are presented below.

D-statistic | D_{L} | D_{U} | 4-D_{U} | 4-D_{L} |
---|---|---|---|---|
2.587558 | 0.905 | 1.551 | 2.449 | 3.095 |

Table 9: Durbin Watson Result

The result of the Durbin-Watson test shows that the D-statistic lies between 4-D_{U} and 4-D_{L}, i.e. 2.449 < 2.587558 < 3.095, so the test remains inconclusive. However, the value is now close to 4-D_{U}, the boundary of the region in which the null hypothesis of no autocorrelation is not rejected, and it is much closer to 2 than before. Thus, the problem of autocorrelation is treated as removed from the model.
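The Durbin-Watson comparisons in Tables 7 and 9 follow a fixed set of zones, which the sketch below makes explicit. The statistic and bounds are taken from the two tables; the function names and zone labels are assumptions made for this illustration.

```python
def durbin_watson(resid):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest no AR(1)."""
    num = sum((curr - prev) ** 2 for prev, curr in zip(resid, resid[1:]))
    den = sum(e ** 2 for e in resid)
    return num / den


def dw_zone(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using the lower and upper bounds."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive (positive side)"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive (negative side)"
    return "negative autocorrelation"


# Values from Table 7 (initial regression) and Table 9 (Cochrane-Orcutt model).
zone_before = dw_zone(3.025302, d_lower=0.946, d_upper=1.543)
zone_after = dw_zone(2.587558, d_lower=0.905, d_upper=1.551)
print(f"Table 7: {zone_before}")
print(f"Table 9: {zone_after}")

# Residuals that alternate in sign produce a statistic well above 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

Both statistics fall in the inconclusive band on the negative side; what changes after the correction is the distance from 2, not the zone.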

### b) Heteroscedasticity

The classical linear regression model requires the variance of the error terms to be constant in order to derive reliable results. Thus, heteroscedasticity is tested for the model before interpreting the results of the analysis (Casson & Farmer, 2014; Salkind, 2007). Bartlett’s periodogram-based white noise test was applied to test for heteroscedasticity in the Cochrane-Orcutt AR (1) regression model.

Bartlett’s (B)-Statistic | P-value |
---|---|
1.1821 | 0.1222 |

Table 10: Heteroscedasticity Test results

The above table shows that the p-value for the B-statistic is greater than the significance level of the study i.e. 0.1222 > 0.05. Thus, the null hypothesis of no heteroscedasticity in the model is not rejected. Hence, the model is homoscedastic.

## **FDI** inflows have a negative impact on the levels of faecal coliform in Indian rivers

The diagnostic tests show that the Cochrane-Orcutt AR (1) regression model derived from the final model is free from the problems of autocorrelation and heteroscedasticity. Table 8 shows that the value of R^{2} has improved in comparison to the final model in Table 6, i.e. 0.6915 > 0.3390.

Thus, about 69% of the variation in faecal coliform is explained by **FDI** inflows. Faraway (2014) stated that the value of R^{2} needs to be greater than 0 for a model to be effective. As the value in Table 8 is higher than 0.6, the model yields effective results.

Further, the p-value in Table 8 is 0.000. As this is less than the significance level, i.e. 0.000 < 0.05, the null hypothesis of no significant impact of **FDI** inflows on water pollution is rejected. The coefficient depicts the magnitude of the impact: since the regressor is the first difference of the log of **FDI** inflows, a 1 percentage point increase in the growth of **FDI** inflows is associated with a decrease of roughly 4.2% in faecal coliform.
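How a coefficient in a log-dependent model translates into a percentage effect can be shown with a short calculation. The coefficient is taken from Table 8; treating a 0.01 change in DiffLnFDI as a change of about one percentage point in FDI growth is an interpretive assumption of this sketch, and `pct_change_in_y` is a hypothetical helper.

```python
import math


def pct_change_in_y(coefficient, delta_x):
    """Exact percentage change in y implied by ln(y) = b0 + b1 * x when x moves by delta_x."""
    return (math.exp(coefficient * delta_x) - 1.0) * 100.0


# Coefficient of DiffLnFDI from Table 8; delta_x = 0.01 is an illustrative step.
effect = pct_change_in_y(-4.235418, 0.01)
print(f"Implied change in faecal coliform: {effect:.2f}%")
```

For small steps the exact figure is close to the usual approximation of coefficient times step size, which is why the coefficient is often read directly as a percentage.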

The above graph shows that there is a clear negative relationship between **FDI** inflows and the faecal coliform level. Furthermore, as the values of faecal coliform lie close to the predicted line, the model used to describe the impact of **FDI** inflows on the faecal coliform level is more appropriate and significant. Hence, with an increase in **FDI** inflows, the water pollution level decreases.

Bao, Chen, & Song (2011) also studied the relationship between **FDI** inflows and environmental degradation. In their study of China, they stated that, owing to the technological effect, the growth represented by **FDI** inflows helps in reducing the level of pollution emissions. Estimating simultaneous equations of growth and environment, they concluded that a technological spillover effect tends to exist in an economy and contributes to improving environmental quality. This technological upgradation is also depicted in Indian water reports (OSEC, 2010; Sahasranaman & Ganguly, 2018) and results in a decrease in water pollution.

#### References

- Adhikari, R., & Agrawal, R. K. (2013). *An Introductory Study on Time Series Modeling and Forecasting*. https://doi.org/10.1210/jc.2006-1327
- Area, I., & Seth, B. L. (2018). *What should be the Coliform standard in India’s sewage treatment protocol in order to promote safe reuse of reclaimed water for domestic, industrial and agricultural use; are stringent standards affordable?*
- Azhagaiah, R., & Banumathy, K. (2015). Long-Run and Short-Run Causality between Stock Price and Gold Price: Evidence of VECM Analysis from India. *Management Studies and Economic Systems*, *1*(4), 247–256. https://doi.org/10.12816/0019391
- Bao, Q., Chen, Y., & Song, L. (2011). Foreign direct investment and environmental pollution in China: A simultaneous equations estimation. *Environment and Development Economics*, *16*(1), 71–92. https://doi.org/10.1017/S1355770X10000380
- Casson, R. J., & Farmer, L. D. M. (2014). Understanding and checking the assumptions of linear regression: A primer for medical researchers. *Clinical and Experimental Ophthalmology*, *42*(6), 590–596. https://doi.org/10.1111/ceo.12358
- CPCB. (2015). *A plan on Conservation of Water Quality of River Ganga*.
- CSIR-NEERI. (2018). *Assessment of Water Quality and Sediment to understand the Special Properties of River Ganga*.
- Fakhr, A. E., Gohar, M. K., & Atta, A. H. (2016). Impact of Some Ecological Factors on Faecal Contamination of Drinking Water by Diarrheagenic Antibiotic-Resistant Escherichia coli in Zagazig City, Egypt. *International Journal of Microbiology*. https://doi.org/10.1155/2016/6240703
- Faraway, J. J. (2014). *Texts in Statistical Science Series, Linear Models with R* (M. Tanner, J. Zidek, & C. Chatfield, eds.). Taylor & Francis.
- Gujarati, D. N., & Porter, D. C. (2009). *Basic Econometrics* (5th ed.).
- Huitema, B., & Laraway, S. (2006). Autocorrelation. *Encyclopedia of Measurement and Statistics*.
- Kostakis, I., Lolos, S., & Sardianou, E. (2016). Foreign direct investment and environmental degradation: Further evidence from Brazil and Singapore. *MPRA*, (75643).
- Kumar, M., & Puri, A. (2012). A review of permissible limits of drinking water. *Indian Journal of Occupational and Environmental Medicine*, *16*(1), 40–44. https://doi.org/10.4103/0019-5278.99696
- Kumar, V., & Chander, R. (2016). Foreign Direct Investment And Air Pollution: Granger Causality Analysis. *IOSR Journal of Business and Management (IOSR-JBM)*, 12–17.
- Nason, G. P. (2018). Stationary and non-stationary time series. *Statistics in Volcanology*, (1994), 129–142. https://doi.org/10.1144/iavcei001.11
- Okumoko, T. P., Akarara, E. A., & Opuofoni, C. A. (2018). Impact of Foreign Direct Investment on Economic Growth in Nigeria. *International Journal of Humanities and Social Science*, *8*(1).
- OSEC. (2010). *Water & water treatment in India. Market opportunities for Swiss companies*.
- Prathibha, S., & Murulidhar, V. N. (2015). Diversity and density of coliform bacteria in River Tunga at Shivamogga city, Karnataka, India. *Int.J.Curr.Microbiol.App.Sci*, *4*(7), 624–631.
- Sahasranaman, M., & Ganguly, A. (2018). *Wastewater Treatment for Water Security in India*.
- Salkind, N. (2007). Heteroscedasticity and Homoscedasticity. *Encyclopedia of Measurement and Statistics*, 8–10. https://doi.org/10.4135/9781412952644.n201
- Seltman, H. J. (2018). *Experimental Design and Analysis*.
- Wooldridge. (2002). Serial correlation. In *Introductory Econometrics*.
- Zou, X. (2018). VECM Model Analysis of Carbon Emissions, GDP, and International Crude Oil Prices. *Discrete Dynamics in Nature and Society*, *2018*. https://doi.org/10.1155/2018/5350308
