What is the relevance of significant results in regression analysis?

By Riya Jain & Priya Chetty on February 28, 2020

Regression analysis is a statistical technique that links variables and measures the strength of the relationship between them. As stated in the previous article, regression analysis is used to determine the influence of independent variables on a dependent variable. It is therefore essential that the conclusions drawn about this relationship are statistically significant and reliable.

For example, consider a study to determine the impact of emotional awareness on the creativity level of students. To fulfil this purpose, the study must determine the strength of the relationship between the factors affecting emotional awareness and the creativity level of students.

The previous article discussed the process of regression analysis and how to interpret its results. However, an analysis can yield insignificant or biased results. This article therefore explains what drives significant results and why biases need to be removed from the data in order to derive accurate ones.

What influences the significance of regression analysis results?

Regression analysis results are mainly divided into three parts:

  1. Model summary,
  2. ANOVA results, and
  3. Coefficient table.

Each part provides information about how well the model captures the relationship between the independent and dependent variables.

Figure 1: Factors deriving significant results of the regression analysis

Factors affecting efficient results in the model summary

The first part of the regression results reports the coefficient of determination (R²) and the adjusted R². Both values express the proportion of variation in the dependent variable that is explained by the independent variables included in the model.

In simple linear regression the R² value, and in multiple linear regression the adjusted R² value, briefly summarizes the efficiency of the model. As a general rule of thumb, R² and adjusted R² should be greater than 0.5.

The formulas for R² and adjusted R² show that the stronger the correlation between the variables, the higher both values are. Adding independent variables never lowers R², but adjusted R² penalizes each additional variable. Adjusted R² therefore rises when independent variables that contribute little are removed, or when the number of observations grows relative to the number of independent variables.
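The relationship between R², adjusted R², the sample size and the number of independent variables can be checked numerically. The sketch below is a minimal illustration in plain Python, assuming a simple linear regression fitted by ordinary least squares. The function names and the sample scores are hypothetical and are not taken from any study.

```python
def r_squared(x, y):
    """R-squared of the least-squares line y = a + b*x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained (residual) and total variation in the dependent variable.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjust R-squared for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical data: emotional awareness score (x) and creativity score (y).
awareness = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
creativity = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

r2 = r_squared(awareness, creativity)
# Same R-squared, but penalized for one versus three independent variables.
adj_one = adjusted_r_squared(r2, n=len(awareness), k=1)
adj_three = adjusted_r_squared(r2, n=len(awareness), k=3)

print(f"R-squared:               {r2:.4f}")
print(f"Adjusted R-squared, k=1: {adj_one:.4f}")
print(f"Adjusted R-squared, k=3: {adj_three:.4f}")
```

For the same R², the adjusted value falls as the number of independent variables grows, which is why adjusted R² is the more cautious summary for multiple regression.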

Factors affecting efficient ANOVA results

The F-ratio in the ANOVA table compares the variation explained by the model with the variation left unexplained. It therefore represents the improvement in the prediction of the dependent variable relative to the inaccuracy that remains in the model. The F-ratio should be greater than 1, and the model is treated as significant when the p-value reported alongside it is below the chosen significance level. High unexplained variability in the data on the dependent and independent variables tends to reduce the F-ratio. Hence, in order to raise the F-ratio above 1, the dataset should contain less of this variability. Furthermore, a large number of observations also tends to increase the F-ratio and improves the prediction of the dependent variable from the independent variables.
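The effect of variability on the F-ratio can be seen with a small numerical sketch. It assumes a simple linear regression (one independent variable), for which the F-ratio is the explained sum of squares per independent variable divided by the residual sum of squares per residual degree of freedom. The data and the function name are hypothetical.

```python
def f_ratio(x, y):
    """F-ratio of a simple linear regression (k = 1 independent variable)."""
    n = len(x)
    k = 1
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = ss_tot - ss_res  # variation explained by the model
    return (ss_reg / k) / (ss_res / (n - k - 1))


# Hypothetical data: the same underlying relationship (y = 2x) observed
# with a small disturbance and with a disturbance ten times larger.
x = [1, 2, 3, 4, 5, 6, 7, 8]
disturbance = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
y_low_variability = [2 * xi + e for xi, e in zip(x, disturbance)]
y_high_variability = [2 * xi + 10 * e for xi, e in zip(x, disturbance)]

f_low = f_ratio(x, y_low_variability)
f_high = f_ratio(x, y_high_variability)

print(f"F-ratio with low variability:  {f_low:.1f}")
print(f"F-ratio with high variability: {f_high:.1f}")
```

Both datasets follow the same trend, yet the noisier one yields a much smaller F-ratio because more of its variation is left unexplained.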

Factors affecting the efficiency of significant coefficients

Coefficient values determine whether the independent variable has any relevant or significant impact on the dependent variable. The t-score of a coefficient, which is the coefficient divided by its standard error, is used to test its significance. The p-value associated with the t-score should be less than the error or insignificance level (1%, 5% or 10%) allowed in the model. The formula for the t-score shows that the higher the variability in the dataset, the larger the standard error and the lower the t-score. Thus, the efficiency of the t-score is influenced by the presence of high variability. Furthermore, a small number of observations or a large number of independent variables leaves fewer degrees of freedom, which also reduces the significance of the coefficients. Hence, in order to derive significant coefficients, it is necessary to increase the number of observations (sample size) and to reduce the number of independent variables and the variability of the dataset.
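A short sketch can illustrate how variability and sample size affect the t-score of a coefficient. It assumes a simple linear regression, where the t-score of the slope is the slope divided by its standard error, and it compares the result with two-sided 5% critical values of Student's t distribution taken from standard tables (2.447 for 6 degrees of freedom, 4.303 for 2). The data and names are hypothetical.

```python
import math


def slope_t_score(x, y):
    """t-score of the slope in a simple linear regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_variance = ss_res / (n - 2)  # n - 2 degrees of freedom
    std_error = math.sqrt(residual_variance / sxx)
    return slope / std_error


# Two-sided 5% critical values from standard t tables, keyed by
# degrees of freedom (n - 2 for a simple regression).
T_CRITICAL_5PCT = {2: 4.303, 6: 2.447}

# Hypothetical data: y = 2x with a small and a ten times larger disturbance.
x = [1, 2, 3, 4, 5, 6, 7, 8]
disturbance = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
y_low = [2 * xi + e for xi, e in zip(x, disturbance)]
y_high = [2 * xi + 10 * e for xi, e in zip(x, disturbance)]

t_low = slope_t_score(x, y_low)             # 8 observations, low variability
t_high = slope_t_score(x, y_high)           # 8 observations, high variability
t_small = slope_t_score(x[:4], y_high[:4])  # only 4 observations

for label, t, df in [("low variability, n=8", t_low, 6),
                     ("high variability, n=8", t_high, 6),
                     ("high variability, n=4", t_small, 2)]:
    significant = abs(t) > T_CRITICAL_5PCT[df]
    print(f"{label}: t = {t:.2f}, significant at 5%: {significant}")
```

In this illustration the slope remains significant with eight observations even when the variability is high, but it loses significance once only four observations are used.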

Why is dataset processing needed for regression analysis?

Regression analysis helps in stating the influence of independent variables on the dependent variable. It is therefore necessary to ensure that the dataset is free from anomalies or outliers. However, randomness and biases in human behaviour often lead to inadequate or inefficient results. As social science studies are based on the analysis of people's perceptions, there is a high possibility of variability in the dataset. Thus, in order to control the variability caused by respondent biases, the dataset needs to be processed before statistical analysis.
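One common way to screen a dataset for outliers before analysis is the interquartile range (IQR) rule. The sketch below is only an illustration of that rule, which is an assumption here and not a method prescribed by this article. The responses and the function name are hypothetical, and flagged values should be reviewed before any decision to remove them.

```python
import statistics


def flag_outliers_iqr(values, multiplier=1.5):
    """Return the values lying outside the IQR fences (Tukey's rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - multiplier * iqr
    upper_fence = q3 + multiplier * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical survey responses on a 1-5 scale with one data-entry error.
responses = [3, 4, 4, 5, 3, 4, 5, 4, 3, 25]

outliers = flag_outliers_iqr(responses)
cleaned = [v for v in responses if v not in outliers]

print("Flagged as outliers:", outliers)
print("Responses kept for analysis:", cleaned)
```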

I am a management graduate with specialisation in Marketing and Finance. I have over 12 years' experience in research and analysis. This includes fundamental and applied research in the domains of management and social sciences. I am well versed with academic research principles. Over the years I have developed a mastery in different types of data analysis on different applications like SPSS, Amos, and NVIVO. My expertise lies in inferring the findings and creating actionable strategies based on them.

Over the past decade I have also built a profile as a researcher on Project Guru's Knowledge Tank division. I have penned over 200 articles that have earned me 400+ citations so far. I also maintain a Google Scholar profile.

I now consult university faculty through Faculty Development Programs (FDPs) on the latest developments in the field of research. I also guide individual researchers on how they can commercialise their inventions or research findings. Other developments I am actively involved in at Project Guru include strengthening the "Publish" division as a bridge between industry and academia by bringing together experienced research persons, learners, and practitioners to collaboratively work on a common goal.

 

I am a Senior Analyst at Project Guru, a research and analytics firm based in Gurugram since 2012. I hold a master’s degree in economics from Amity University (2019). Over 4 years, I have worked on various research projects using a range of research tools like SPSS, STATA, VOSViewer, Python, EVIEWS, and NVIVO. My core strength lies in data analysis related to Economics, Accounting, and Financial Management fields.
