How to develop a questionnaire for correlation and regression tests?

There are different instruments for collecting primary data, and the most widely used is the questionnaire in the survey method. Correlation and regression tests are two basic statistical tools that are widely applied to analyze such data. In research, these tests are used when the researcher seeks to find relationships between variables, or the impact of some variables on others, in order to draw inferences. Correlation analysis measures the degree of linear association between any two variables. Regression analysis, on the other hand, helps to determine whether a set of variables (the independent variables) has any impact on another variable (the dependent variable). Here ‘impact’ means how much of the variation in the values of the dependent variable can be explained by the variation in the values of the independent variables. Correlation analysis is usually done first. Once it is found that there is a significant relationship between the dependent and the independent variables, the researcher can proceed to conduct a regression analysis.
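To make the two ideas concrete, here is a minimal sketch in plain Python of the Pearson correlation coefficient and a one-variable least-squares regression with its R-squared (the share of variation explained). The function names and the sample scores are hypothetical and only for illustration; in practice, a statistical package would do this work.

```python
from math import sqrt


def pearson_r(x, y):
    """Degree of linear association between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simple_regression(x, y):
    """Fit y = intercept + slope * x; return (intercept, slope, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return intercept, slope, 1 - ss_residual / ss_total


# Hypothetical scores on a 1-5 scale for seven respondents.
workload = [1, 2, 3, 4, 5, 2, 4]
wlb = [5, 4, 3, 2, 1, 4, 3]

r = pearson_r(workload, wlb)
intercept, slope, r_squared = simple_regression(workload, wlb)
print(f"r = {r:.2f}, slope = {slope:.2f}, R-squared = {r_squared:.2f}")
```

With a single independent variable, R-squared equals the square of the correlation coefficient, which is one reason the correlation step is a natural precursor to regression.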

Need to develop a questionnaire

The need to develop a questionnaire arises when the study is based on primary data. The questionnaire is one of the most common research instruments for primary data collection. A questionnaire can be quantitative or qualitative in nature. A qualitative questionnaire contains open-ended questions, and responses are collected through interviews. A quantitative questionnaire, in contrast, contains closed-ended questions whose responses can be recorded as numerical values. These values are then coded and entered into suitable statistical software such as SPSS, STATA or R. Following this, the data can be analyzed with correlation and regression tests. However, it is challenging to develop the questionnaire in a way that is suitable for these types of analyses.

Stages of developing a questionnaire for correlation and regression

  1. Identifying the dependent and independent variables: From the review of the literature, identify the dependent and independent variables that will establish the aim and objectives of the research. While reviewing the literature, focus on the objectives of previous studies and their findings.
  2. Framing the conceptual framework: Once identified, the variables can be presented in the form of a flow chart known as the conceptual framework, in which the independent variables are depicted as affecting the dependent variable. The conceptual framework is part of the literature review.
  3. Framing the hypotheses: This step is very important for correlation and regression tests, as it presents the conjectures in the form of testable statements. Typically, there are two types of hypotheses. The first is the null hypothesis, which states that the independent variables have ‘no effect’ or ‘no impact’ on the dependent variable. The second is the alternative hypothesis, which contradicts the null hypothesis and states that ‘there is an effect’. These hypotheses can be framed from the depiction in the conceptual framework. Hypothesis framing is part of the chapter on research methodology.
  4. Framing questions or statements in the questionnaire: A quantitative questionnaire generally contains two parts. The first part collects responses about the demographic profile and general background of the survey participants. The second part is for inferential analysis, including correlation and regression tests. For this purpose, frame a number of questions or statements that seek responses on a scale. The most commonly used scale in survey research is the Likert scale.
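Responses on a Likert scale have to be converted to numbers before any correlation or regression test can be run. The sketch below shows one way to do that in Python. The 1 to 5 coding and the names `LIKERT_CODES` and `code_responses` are assumptions made for illustration; the article does not prescribe a particular coding.

```python
# Assumed coding for a five-point Likert scale (1 = lowest agreement).
LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def code_responses(answers):
    """Map Likert labels to numeric codes; blank or unknown answers become None."""
    return [LIKERT_CODES.get(answer.strip()) for answer in answers]


# Hypothetical answers of four respondents to a single statement.
answers = ["Agree", " Strongly Disagree ", "", "Neither Agree nor Disagree"]
print(code_responses(answers))
```

Keeping missing answers as `None` rather than guessing a value makes it easy to decide later how incomplete questionnaires should be handled.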

An example of a questionnaire

Suppose the investigation is about the effect of organizational factors on work-life balance (WLB). WLB is therefore the dependent variable in the study.

Stage 1

Let’s say the following independent variables are identified from the literature review.

  1. Compensation
  2. Safe environment at work
  3. Training
  4. Job engagement
  5. Workload
  6. Scope of promotion
  7. Social security
  8. Organizational support

Stage 2

Frame the conceptual framework.

Figure 1: Conceptual framework

Stage 3

Frame the hypotheses.

Null hypothesis: Organizational factors, namely compensation, safe environment at work, training, job engagement, workload, scope of promotion, social security, and organizational support, do not have any effect on work-life balance.

Alternative hypothesis: Organizational factors, namely compensation, safe environment at work, training, job engagement, workload, scope of promotion, social security, and organizational support, have a significant effect on work-life balance.
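One common way to formalize these two hypotheses is through a multiple linear regression model. The notation below is an assumption made for illustration and is not taken from the study itself: the subscript i denotes a respondent, and each coefficient β corresponds to one of the eight factors listed in Stage 1.

```latex
\begin{align*}
\text{WLB}_i &= \beta_0 + \beta_1\,\text{Compensation}_i + \beta_2\,\text{SafeEnvironment}_i
              + \beta_3\,\text{Training}_i + \beta_4\,\text{JobEngagement}_i \\
             &\quad + \beta_5\,\text{Workload}_i + \beta_6\,\text{Promotion}_i
              + \beta_7\,\text{SocialSecurity}_i + \beta_8\,\text{OrgSupport}_i + \varepsilon_i \\[4pt]
H_0 &: \beta_1 = \beta_2 = \dots = \beta_8 = 0 \\
H_1 &: \beta_j \neq 0 \ \text{for at least one } j \in \{1, \dots, 8\}
\end{align*}
```

Under this reading, the null hypothesis says that none of the factors explains any variation in WLB, and the alternative says that at least one of them does.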

Stage 4

Frame questions and/or statements in the questionnaire as shown in the table. The survey participants can be asked to tick, against each statement, the box for the choice they consider most appropriate.

| Statements / Likert scale | Strongly Disagree | Disagree | Neither Agree nor Disagree | Agree | Strongly Agree |
|---|---|---|---|---|---|
| There is a significant impact of organizational factors on WLB. | | | | | |
| Compensation at the workplace has an effect on WLB. | | | | | |
| A safe environment in the workplace has an effect on WLB. | | | | | |
| Training received at the workplace has an effect on WLB. | | | | | |
| Job engagement of employees affects their WLB. | | | | | |
| Workload has an effect on WLB. | | | | | |
| The scope of promotion at the workplace has an effect on WLB. | | | | | |
| Social security offered by employers at the workplace has an effect on WLB. | | | | | |
| Organizational support at the workplace has an effect on WLB. | | | | | |

Table 1: Questionnaire statements for inferential analysis

In the above table, the first statement collects the responses that will serve as the data for the dependent variable (WLB). Each of the other statements gathers responses that will provide the data for the respective independent variable or factor.

Important points to note

  1. The statements in the questionnaire for correlation and regression tests need to follow the conceptual framework and the hypotheses directly.
  2. The independent variables that are found, through correlation analysis, to have a significant relationship with the dependent variable should ideally be included in the regression analysis.
  3. The example given in this article contains a small number of statements. In actual research, such as a PhD thesis, a large number of factors (independent variables) are often identified from the review of the literature. In those cases, it is advisable to perform an exploratory factor analysis (EFA) before the correlation analysis. EFA helps to group similar factors into one and to reduce the number of statements that are subsequently included in the correlation analysis.
Saptarshi Basu Roy Choudhury

Senior Research Analyst at Project Guru
Saptarshi has done his M. Phil in International Trade and Development and Masters in Economics from Jawaharlal Nehru University, New Delhi. His academic interests include issues related to economics of climate change, regulation and contemporary trade theories. He has a keen interest in current affairs and likes to read and travel in his spare time.