How to develop a questionnaire for correlation and regression test?

By Saptarshi Basu Roy Choudhury & Priya Chetty on January 25, 2019

There are different instruments for collecting primary data, and the most widely used is the questionnaire administered in a survey. Correlation and regression are two basic statistical tools that are widely applied to analyze such data. In research, these tests are used when the researcher seeks to find relationships between variables, or the impact of some variables on others, in order to draw inferences. Correlation analysis measures the degree of linear association between any two variables. Regression analysis, on the other hand, helps to determine whether a set of variables (independent variables) has any impact on another variable (dependent variable). Here ‘impact’ means how much of the variation (change) in the values of the dependent variable can be explained by the variation in the values of the independent variables. Correlation analysis is usually done first. Once a significant relationship is found between the dependent and the independent variables, the researcher can proceed to conduct a regression analysis.
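The difference between the two tests can be illustrated with a minimal sketch in plain Python. The variable names and the five pairs of values are hypothetical, chosen only for illustration; they are not survey results. With a single independent variable, the share of variation explained by the regression (R squared) equals the square of the correlation coefficient.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: degree of linear association between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simple_regression(x, y):
    """Least-squares fit y = intercept + slope * x, plus R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variation in y explained by x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical coded responses from five respondents (illustration only)
organizational_support = [1, 2, 3, 4, 5]   # independent variable
work_life_balance = [2, 4, 5, 4, 5]        # dependent variable

r = pearson_r(organizational_support, work_life_balance)
slope, intercept, r_squared = simple_regression(
    organizational_support, work_life_balance
)
print(f"r = {r:.3f}, slope = {slope:.2f}, R squared = {r_squared:.2f}")
```

In practice a statistical package reports these figures together with significance levels; the sketch only shows what the two quantities measure.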

Need to develop a questionnaire

The need for developing a questionnaire arises when the study is based on primary data. The questionnaire is one of the most common research instruments for primary data collection. A questionnaire can be quantitative or qualitative in nature. A qualitative questionnaire contains open-ended questions, and responses are typically collected through interviews. A quantitative questionnaire contains closed-ended questions whose responses can be expressed as numerical values. These values are then coded and entered into suitable statistical software such as SPSS, Stata or R, after which correlation and regression tests can be run. However, it is challenging to develop the questionnaire in a way that is suitable for these types of analyses.
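Coding closed-ended responses can be sketched as follows. The example assumes a five-point agreement scale coded from 1 to 5, which is a common convention but an assumption here; the names `LIKERT_CODES` and `code_responses` are hypothetical.

```python
# Assumed coding scheme: 1 = strongest disagreement, 5 = strongest agreement
LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

def code_responses(labels):
    """Convert ticked labels to numeric codes; blank or unknown answers become None."""
    return [LIKERT_CODES.get(label.strip()) if label else None for label in labels]

# Hypothetical answers of four respondents to one statement
raw = ["Agree", "Strongly Agree", "", "Disagree"]
coded = code_responses(raw)
print(coded)  # [4, 5, None, 2]
```

Blank answers are kept as `None` rather than coded as zero, so that missing data is not mistaken for a response during analysis.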

Stages of developing a questionnaire for correlation and regression

  1. Identifying the dependent and independent variables: From the review of the literature, identify the dependent and independent variables that will establish the aim and objectives of the research. While reviewing the literature, focus on the objectives of previous studies and their findings.
  2. Framing the conceptual framework: Once identified, these variables can be presented in the form of a flow chart known as the conceptual framework, in which the independent variables are depicted as affecting the dependent variable. The conceptual framework is a part of the literature review.
  3. Framing the hypotheses: This step is very important for correlation and regression tests as it presents the conjectures in the form of testable statements. Typically, there are two types of hypotheses. The first is a null hypothesis, which states ‘no effect’ or ‘no impact’ of the independent variables on the dependent variable. The second is an alternative hypothesis, which contradicts the null hypothesis and states that ‘there is an effect’. These hypotheses can be framed from the depiction in the conceptual framework. Hypothesis framing is a part of the chapter on research methodology.
  4. Framing questions or statements in the questionnaire: A quantitative questionnaire generally contains two parts. The first part collects responses about the demographic profile and general background of the survey participants. The second part is for inferential analysis, including correlation and regression tests. For this purpose, frame a number of questions or statements that seek responses on a scale. The most commonly used scale in survey research is the Likert scale.

An example of a questionnaire

Suppose the investigation is about the effect of organizational factors on work-life balance (WLB). WLB is therefore the dependent variable in the study.

Stage 1

Let’s say the following independent variables are identified from the literature review.

  1. Compensation
  2. Safe environment at work
  3. Training
  4. Job engagement
  5. Workload
  6. Scope of promotion
  7. Social security
  8. Organizational support

Stage 2

Frame the conceptual framework.

Figure 1: Conceptual framework

Stage 3

Frame the hypotheses.

Null hypothesis: Organizational factors, namely compensation, safe environment at work, training, job engagement, workload, scope of promotion, social security, and organizational support, do not have any effect on work-life balance.

Alternative hypothesis: Organizational factors, namely compensation, safe environment at work, training, job engagement, workload, scope of promotion, social security, and organizational support, have a significant effect on work-life balance.
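In regression terms, hypotheses of this kind are often written in terms of the model coefficients. The following is one common formalization, offered as a sketch rather than as the only possible one. The symbols $\beta_j$, $X_j$ and $\varepsilon$ are notation introduced here for illustration, with $X_1, \dots, X_8$ standing for the eight factors in the order listed in Stage 1.

```latex
\begin{align*}
\text{WLB} &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_8 X_8 + \varepsilon \\
H_0 &: \beta_1 = \beta_2 = \dots = \beta_8 = 0 \\
H_1 &: \beta_j \neq 0 \text{ for at least one } j \in \{1, \dots, 8\}
\end{align*}
```

Under this formalization, rejecting the null hypothesis shows only that at least one factor matters. The effect of each individual factor is then judged from its own coefficient.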

Stage 4

Frame questions and/or statements in the questionnaire as shown in the table. The survey participants can be asked to tick, for each statement, the box that best matches their opinion.


Statement | Strongly Disagree | Disagree | Neither Agree nor Disagree | Agree | Strongly Agree
--- | --- | --- | --- | --- | ---
There is a significant impact of organizational factors on WLB. | | | | |
Compensation at the workplace has an effect on WLB. | | | | |
A safe environment in the workplace has an effect on WLB. | | | | |
Training received at the workplace has an effect on WLB. | | | | |
Job engagement of employees affects their WLB. | | | | |
Workload has an effect on WLB. | | | | |
The scope of promotion at the workplace has an effect on WLB. | | | | |
Social security offered by employers at the workplace has an effect on WLB. | | | | |
Organizational support at the workplace has an effect on WLB. | | | | |

Table 1: Questionnaire statements for inferential analysis

In the above table, the first statement collects the responses that will serve as the data for the dependent variable (WLB). Each of the other statements gathers the responses that will provide the data for its respective independent variable or factor.
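How the collected responses map onto the dependent and independent variables can be sketched as follows. The column names are hypothetical labels for the nine statements, the 1 to 5 coding is an assumption, and the six rows are made-up values for illustration, not real survey data.

```python
from math import sqrt

# One column per statement in Table 1: the first is the dependent variable (WLB),
# the remaining eight are the independent variables.
COLUMNS = ["wlb", "compensation", "safe_environment", "training", "job_engagement",
           "workload", "promotion_scope", "social_security", "org_support"]

# Hypothetical coded responses (1-5), one row per respondent
responses = [
    [4, 4, 5, 3, 4, 2, 4, 3, 5],
    [2, 2, 3, 3, 3, 4, 2, 3, 2],
    [5, 4, 4, 4, 5, 2, 5, 4, 4],
    [3, 3, 4, 2, 3, 3, 3, 2, 3],
    [1, 2, 2, 3, 2, 5, 2, 3, 1],
    [4, 5, 4, 4, 4, 2, 3, 4, 4],
]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Turn respondent rows into one sequence of values per variable
columns = dict(zip(COLUMNS, zip(*responses)))
y = columns["wlb"]                                   # dependent variable
X = {name: columns[name] for name in COLUMNS[1:]}    # independent variables

# Correlation of each factor with WLB
correlations = {name: pearson_r(values, y) for name, values in X.items()}
for name, r in correlations.items():
    print(f"{name:18s} r = {r:+.2f}")
```

A real study would need far more than six respondents; the small sample here only keeps the layout easy to read.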


Important points to note

  1. The statements in the questionnaire for correlation and regression tests need to follow the conceptual framework and the hypotheses directly.
  2. The independent variables that have a significant relationship with the dependent variable as found through correlation analysis should ideally be included in the regression analysis.
  3. The example given in this article contains a small number of statements. In actual research, such as a PhD thesis, a large number of factors (independent variables) are often identified from the review of the literature. In those cases, it is advisable to perform an exploratory factor analysis (EFA) before the correlation analysis. EFA helps to group similar factors into one and reduce the number of statements that are subsequently included in the correlation analysis.
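The screening described in point 2 can be sketched as follows. This is one possible approach, assuming significance is judged with the usual t test for a correlation coefficient. The function names, the correlation values and the sample size of 30 are hypothetical.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for testing 'no linear association', with n - 2 degrees of freedom.

    Not defined for r equal to exactly 1 or -1 (division by zero).
    """
    return r * sqrt((n - 2) / (1 - r ** 2))

def screen_factors(correlations, n, t_critical):
    """Keep the factors whose correlation with the dependent variable is significant."""
    return [name for name, r in correlations.items()
            if abs(correlation_t_statistic(r, n)) > t_critical]

# Hypothetical correlations with WLB from a sample of 30 respondents
correlations = {"compensation": 0.50, "training": 0.10, "workload": -0.45}

# 2.048 is the two-tailed 5% critical value of t with 28 degrees of freedom,
# as read from a standard t table
selected = screen_factors(correlations, n=30, t_critical=2.048)
print(selected)  # ['compensation', 'workload']
```

Statistical packages report a p-value for each correlation directly, so in practice the comparison with a critical value is done by the software.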

