Understanding various hypothesis testing steps

By Riya Jain & Priya Chetty on June 17, 2020

A research hypothesis is a clear statement or assumption made by a researcher about the possible relationship between two variables, or elements, in a study. It may concern a property or characteristic of a population or an event. As stated in the previous article, testing the hypothesis is an essential step in scientific research. This article explains the steps involved in hypothesis testing.

What types of research need hypothesis testing?

A hypothesis is tested in studies that are quantitative in nature, i.e. where the data is in numeric form. Moreover, the purpose of the study must be to verify a claim or assumption of the researcher. Studies that need hypothesis testing can be of two types:

Descriptive: Identifying factors that affect a phenomenon. For example, factors affecting customer satisfaction of FMCG products.
Explanatory: Verifying the relationship between variables. For example, the relationship between the flexibility in work timings and employees’ productivity.

Not all quantitative studies require hypothesis testing. Studies that are exploratory in nature follow the inductive design and do not require hypothesis testing. Thus, the decision of opting for a hypothesis-based analysis depends on the purpose and research questions of the study.

Figure 1: Types of research that require hypothesis testing

Hypothesis testing steps that need to be followed in research

Step 1: Formulation of the hypothesis

The first step is to frame the hypothesis statement. A hypothesis is stated in two forms: null and alternative. The null hypothesis is the negative statement, i.e. that the assumption of the researcher is false. The alternative hypothesis, on the other hand, is the clear statement of the researcher’s assumption, i.e. the theory of the researcher. Therefore, before stating the hypothesis, the researcher must decide the intended outcome, including whether the expected effect has a direction. The process of stating the hypothesis is explained here.

For example, in a study examining the impact of macro-economic factors on the economic growth of a country, the hypotheses would be stated as follows:

Null hypothesis (H0): There is no significant impact of macro-economic factors on economic growth.
Alternative hypothesis (H1): There is a significant impact of macro-economic factors on economic growth.

OR

Alternative hypothesis (H1): There is a significant positive impact of macro-economic factors on economic growth.

OR

Alternative hypothesis (H1): There is a significant negative impact of macro-economic factors on economic growth.
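The three forms of the alternative hypothesis correspond to a two-tailed, a right-tailed, and a left-tailed test, and the choice changes how the probability value is computed later on. The sketch below illustrates this for a Z statistic, using only Python's standard library. The function name `z_p_value`, the labels for the alternatives, and the statistic 1.80 are assumptions made for illustration; they do not come from the study described above.

```python
from statistics import NormalDist

def z_p_value(z, alternative="two-sided"):
    """p-value of a Z statistic under the standard normal null distribution."""
    cdf = NormalDist().cdf(z)
    if alternative == "two-sided":   # H1: there is an impact (either direction)
        return 2 * min(cdf, 1 - cdf)
    if alternative == "greater":     # H1: there is a positive impact
        return 1 - cdf
    if alternative == "less":        # H1: there is a negative impact
        return cdf
    raise ValueError(f"unknown alternative: {alternative}")

z = 1.80  # hypothetical test statistic, for illustration only
for alt in ("two-sided", "greater", "less"):
    print(alt, round(z_p_value(z, alt), 4))
```

The same statistic gives a smaller p-value under the matching one-sided alternative than under the two-sided one, which is why the direction of the hypothesis should be fixed before the data are examined.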

Step 2: Collection of the data

The method of collecting the data is defined early on in the research process. After deciding the research questions, the researcher decides whether the data will be collected from primary or secondary sources. Generally, primary data is collected by surveys, focus groups, observation, or interviews, and secondary data is collected from journals, articles, books, online web portals, or government reports. The choice of data type depends upon the research aim and the resources at the researcher’s disposal.

For example, continuing the example stated in Step 1, the data for economic growth and for all the relevant macro-economic factors, such as FDI, employment rate, trade openness, or inflation, would be collected from secondary sources, because collecting such data first-hand would not yield reliable figures and would be costly and time-consuming.

Step 3: Determination of the appropriate test statistic and significance level

The ‘test statistic’ measures the degree of agreement between the sample and the null hypothesis. A relevant test statistic must be selected based on the characteristics of the sample, its distribution, and the sample size. As stated in the previous article, for samples with a normal distribution, the Z-test, t-test, χ2-test, and F-test statistics could be used, while for non-normal distributions the Wilcoxon rank-sum test, Wilcoxon signed-rank test, and Kruskal-Wallis test are applied. The non-parametric alternative to the one-sample or paired Z/t-test is the Wilcoxon signed-rank test, to the two-sample t-test it is the Wilcoxon rank-sum test, and to one-way ANOVA it is the Kruskal-Wallis test. Thus, by comparing the sample against these criteria, the test statistic can be determined. The basis for determining the appropriate test statistic for the parametric tests (normal distribution tests) is shown below.
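The conventional pairing of parametric tests with their non-parametric alternatives can be written as a small lookup. This is a simplified sketch: the function name `suggest_test` and its three inputs are assumptions made for illustration, and it does not reproduce the full selection criteria of the figure.

```python
def suggest_test(normal, groups, paired=False):
    """Suggest a test from the conventional parametric/non-parametric pairings.

    normal : True if the data can be treated as normally distributed
    groups : number of samples being compared (1, 2, or more)
    paired : True if two samples consist of matched observations
    """
    if groups < 1:
        raise ValueError("groups must be at least 1")
    if groups == 1 or (groups == 2 and paired):
        return "Z-test or t-test" if normal else "Wilcoxon signed-rank test"
    if groups == 2:
        return "two-sample t-test" if normal else "Wilcoxon rank-sum test"
    return "ANOVA (F-test)" if normal else "Kruskal-Wallis test"

print(suggest_test(normal=False, groups=3))
```

In practice, the normality input would itself come from a diagnostic check on the sample, which is outside the scope of this sketch.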

Figure 2: Parametric test statistics in hypothesis testing steps

The ‘significance level’, on the other hand, is the threshold against which the decision to reject or not reject the null hypothesis is made. It is the probability of rejecting the null hypothesis when it is actually true. A researcher who wants the least chance of wrongly rejecting the null hypothesis sets a low significance level. The generally accepted significance levels are:

For secondary data: 1% or 5%
For primary data: 1%, 5%, or 10%.

For example, if a study has a 5% significance level and the null hypothesis is actually true, there is a 5% chance that it will be wrongly rejected and a 95% chance that it will not be rejected.
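The meaning of a significance level can be checked with a small simulation: when the null hypothesis is true by construction, a test at the 5% level should reject it in roughly 5% of repeated studies. The sketch below assumes a two-sided Z-test on normally distributed data with a known standard deviation of 1. The function name, sample size, number of trials, and seed are all illustrative choices.

```python
import random
from statistics import NormalDist, mean

def rejection_rate(alpha=0.05, n=30, trials=5000, seed=1):
    """Share of simulated studies that reject a TRUE null hypothesis (mean = 0)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The null hypothesis holds by construction: true mean 0, sigma 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # (sample mean - 0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

print(rejection_rate())  # close to 0.05, up to simulation noise
```

Lowering `alpha` reduces the share of false rejections, at the cost of making genuine effects harder to detect.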

Step 4: Determining the decision-making criteria

The criteria for decision-making are determined by the rejection region of each test statistic. The threshold, or critical value, of the test statistic is determined by the specified significance level of the study, the number of independent variables, and the sample size. For every test statistic, the rejection region corresponds to the following rule:

The null hypothesis is rejected when the probability value (p-value) is less than the significance level of the study.

Thus, regardless of the method used for the study, the above stated decision rule would be used for rejecting or not rejecting the stated null hypothesis.
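The link between the critical value and the p-value rule can be sketched for a two-sided Z-test. The function name `two_sided_decision` and the statistic 2.3 are assumptions for illustration. Other test statistics would use their own distributions, which Python's standard library does not provide.

```python
from statistics import NormalDist

def two_sided_decision(z, alpha=0.05):
    """Return (critical value, p-value, reject?) for a two-sided Z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # boundary of the rejection region
    p_value = 2 * (1 - std_normal.cdf(abs(z)))   # probability of a result this extreme
    return z_crit, p_value, p_value < alpha

z = 2.3  # hypothetical test statistic, for illustration only
z_crit, p_value, reject = two_sided_decision(z)
# Both forms of the rule lead to the same decision.
print(reject, abs(z) > z_crit)
```

Comparing the statistic with the critical value and comparing the p-value with the significance level are two ways of expressing one rule, so statistical software usually reports only the p-value.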

Step 5: Choosing an application and running the test

In this step, the researcher decides which statistical software will be used for testing the hypothesis. Although a number of options are available, the popular ones in academic research are MS Excel, SPSS, Stata, EViews, and R. The selection of software is the researcher’s personal choice. MS Excel and SPSS are generally regarded as the most user-friendly for hypothesis testing, while SPSS, Stata, and R are considered the most reliable tools for deriving statistical results.
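Whichever package is chosen, running the test amounts to computing the test statistic and its p-value from the data. The sketch below uses Python, which is not one of the tools named above and is chosen only for illustration. It runs a two-sided, large-sample Z-test of a population mean. The function name `one_sample_z_test`, the simulated sample, and the hypothesised mean of 2.0 are all assumptions and do not represent real study data.

```python
import random
from statistics import NormalDist, mean, stdev

def one_sample_z_test(sample, mu0):
    """Two-sided large-sample Z-test of H0: population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / n ** 0.5
    z = (mean(sample) - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Simulated observations standing in for collected data (illustration only).
rng = random.Random(42)
sample = [rng.gauss(2.5, 1.0) for _ in range(100)]

z, p_value = one_sample_z_test(sample, mu0=2.0)
print(round(z, 2), round(p_value, 4))
```

The sample standard deviation replaces the unknown population value here, which is reasonable only for large samples. For small samples a t-test would be the usual choice.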

Step 6: Deciding the outcome (i.e. reject or do not reject the null hypothesis)

This is the final hypothesis testing step. Based on the decision rule, the final outcome of the study is decided: if the p-value is less than the significance level of the study, the null hypothesis is rejected in favour of the alternative hypothesis; otherwise, it is not rejected. For example, for the case stated in Step 1, if the p-value of the Z-test is 0.13 and the significance level of the study is 5%, the null hypothesis that there is no significant impact of macro-economic factors on economic growth would not be rejected. Hence, the outcome would be that the study found no evidence of a significant influence of macro-economic factors on the economic growth of the country.
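The decision for the example above can be written as a short check. The function name `decide` is an assumption for illustration; the p-value of 0.13 and the 5% significance level are taken from the example.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    if not 0 <= p_value <= 1 or not 0 < alpha < 1:
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"

print(decide(0.13, alpha=0.05))  # do not reject H0
```

Note that the result is worded as "do not reject" rather than "accept": a p-value above the significance level indicates insufficient evidence against the null hypothesis, not proof that it is true.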

Figure 3: Hypothesis testing outcome decision

The figure below represents the various hypothesis testing steps.

Important points to note while following the hypothesis testing steps

The final outcome of an empirical study depends on the outcome of the hypothesis test. Therefore, it is essential to note certain points in order to generate reliable results:

Clearly specify the alternative hypothesis using simple words, as it presents the theory of the researcher.
The hypothesis considered for the statistical analysis should not be generalised or ambiguous.
Quantifiable variables should be considered in the hypothesis.
For deriving reliable results, try to use a large sample, preferably of more than 200 observations.
As the statistical significance level defines the chance of wrongly rejecting a true null hypothesis, try to keep it low; 5% is generally an appropriate level.
A clear understanding of the p-value is required for interpreting the results correctly.