Significance of statistical analysis in epidemiological studies

By Avishek Majumder & Chandrika Kapagunta on September 16, 2017

Infectious diseases continue to pose a significant threat to humans and animals. Stringent disease control policies and advances in vaccines have not eradicated them. In 2015 alone, the 10 deadliest diseases killed 30 million people. Among these, communicable diseases such as lower respiratory infections, diarrhoea, tuberculosis and HIV were the major culprits (World Health Organization, 2017). Emerging infectious diseases now pose an even greater challenge. These include SARS, MERS, HIV/AIDS, Ebola and Zika, among others (Heesterbeek et al., 2015). This continuing threat has renewed efforts in epidemiological studies worldwide.

Consequently, researchers and public health officials have to study a wide range of factors influencing disease patterns in a population or area. They have to understand the relationships between these factors and make predictions. To do so, they use statistical and mathematical models, which help them understand patterns in the occurrence of diseases. This article explores the role and advantages of statistical and mathematical models in infectious disease control.

Need for statistical modelling in infectious disease research

A model is a simplified version of a complex phenomenon, in this case a disease event, epidemic or pandemic (Vynnycky & White, 2010). Models help researchers understand a system better by studying the interactions within it and by deriving the influence of external and internal factors on it. Epidemiological models can be defined as mathematical representations of the epidemiology of disease transmission and its associated processes (Dubé et al., 2007). Infectious diseases require close monitoring of disease spread, as well as of population parameters, rates of infection, mortality and risk factors. Depending on the purpose of the study, data on disease epidemiology can be collected using monitoring and biosurveillance methods. Biosurveillance models are abstract computational, algorithmic, statistical or mathematical representations that produce an informative output related to event detection and risk. Such models are designed based on past data.
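As a minimal sketch of event detection built on past data, the Python snippet below flags weeks whose case counts exceed a threshold derived from a historical baseline. The function name `flag_outbreak_weeks`, the rule of two standard deviations above the baseline mean and the weekly counts are all illustrative assumptions, not a method taken from the cited sources.

```python
from statistics import mean, stdev


def flag_outbreak_weeks(baseline_counts, recent_counts, z_threshold=2.0):
    """Return indices of recent weeks whose counts exceed the alert threshold.

    The threshold is the baseline mean plus `z_threshold` sample standard
    deviations. This is a deliberately simple rule chosen for illustration.
    """
    threshold = mean(baseline_counts) + z_threshold * stdev(baseline_counts)
    return [week for week, count in enumerate(recent_counts) if count > threshold]


if __name__ == "__main__":
    # Hypothetical weekly case counts, invented for this example.
    baseline = [10, 12, 11, 9, 13, 10, 11, 12]
    recent = [11, 12, 25, 10]
    print("Weeks flagged for review:", flag_outbreak_weeks(baseline, recent))
```

Operational surveillance systems usually also account for seasonality and reporting delays, which this sketch ignores.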

Difference between mathematical and statistical models in epidemiology

Biosurveillance models can be of different types (Corley et al., 2014):

  • Proactive or anticipatory (detecting or forecasting an event)
  • Risk assessment
  • Descriptive interpretation (understanding disease dynamics or drivers)

There is a considerable difference between mathematical and statistical models. Mathematical models quantitatively study the underlying dynamics of a disease epidemic. Statistical models, by contrast, formalize relationships between different variables. These relationships include causation between two or more variables, the influence of variables on the spread of disease, and hypothesis testing of assumptions (Chubb & Jacobsen, 2010).
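To make the idea of a mathematical model concrete, here is a minimal sketch of the classic SIR (susceptible, infected, recovered) compartmental model, integrated with simple Euler steps. The article does not name a specific model, so the choice of SIR, the function name `simulate_sir` and every parameter value are assumptions made purely for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with Euler steps.

    beta  : transmission rate per day (assumed value, not an estimate)
    gamma : recovery rate per day (assumed value, not an estimate)
    Returns a list of (day, susceptible, infected, recovered) tuples,
    one entry per whole day including day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = int(round(1 / dt))
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; they do not describe any real disease.
    trajectory = simulate_sir(population=10_000, initial_infected=10,
                              beta=0.3, gamma=0.1, days=160)
    peak_day, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
    print(f"Peak of about {peak_infected:.0f} infected on day {peak_day}")
```

A statistical model would instead start from observed data and estimate quantities such as `beta` or the association between a risk factor and infection.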

Advantages of statistical analysis in epidemiological studies

The major advantages of using modelling techniques in epidemiological studies are (Chubb & Jacobsen, 2010):

  • Statistical modelling helps detect disease outbreaks at an early stage and so contributes to control strategies.
  • Population risks and the factors contributing to them can be assessed. This knowledge can be used to reduce the prevalence of disease in a particular area.
  • Modelling helps predict the spread of diseases across populations and geographical regions, including the spread of vector-borne diseases across natural landscapes.
  • Stochastic models help assess multiple possible outcomes of a particular event or strategy in disease control.
  • Spatial modelling can help predict the impact of spatially targeted strategies on disease spread.
  • Simulation models can help simulate disease epidemics in silico. They enable understanding of the factors influencing spread and help identify the most probable outcome of an epidemic.
  • Prediction models can help forecast disease incidence, so that prevalence within communities and regions can be estimated.
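The point about stochastic models can be illustrated with a small simulation. The sketch below uses a Reed-Frost chain-binomial model, a choice made here for illustration rather than one named in the source, and repeats the same epidemic many times to show that identical starting conditions can lead to very different final sizes. The function names, population size and contact probability are all hypothetical.

```python
import random


def reed_frost_final_size(population, initial_infected, p_contact, rng):
    """Run one Reed-Frost epidemic and return the total number ever infected."""
    susceptible = population - initial_infected
    infected = initial_infected
    total_infected = initial_infected
    while infected > 0 and susceptible > 0:
        # Chance that a susceptible person is infected by at least one case.
        p_infection = 1 - (1 - p_contact) ** infected
        new_cases = sum(1 for _ in range(susceptible)
                        if rng.random() < p_infection)
        susceptible -= new_cases
        total_infected += new_cases
        infected = new_cases  # each case is infectious for one generation
    return total_infected


def simulate_many(runs, population, initial_infected, p_contact, seed=0):
    """Repeat the epidemic `runs` times with a seeded random generator."""
    rng = random.Random(seed)
    return [reed_frost_final_size(population, initial_infected, p_contact, rng)
            for _ in range(runs)]


if __name__ == "__main__":
    # Hypothetical setting: 100 people, 1 initial case, 2% contact probability.
    sizes = simulate_many(runs=200, population=100, initial_infected=1,
                          p_contact=0.02, seed=1)
    died_out = sum(1 for size in sizes if size <= 5)
    print(f"{died_out} of {len(sizes)} runs ended with 5 or fewer cases")
    print(f"Largest outbreak: {max(sizes)} cases")
```

Summarising the distribution of final sizes, rather than a single trajectory, is what lets a stochastic model compare control strategies under uncertainty.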

Types of variables considered in epidemiological data

Epidemiological data often consist of different types of variables, each of which is important in data analysis. Seven major classes of variables have been identified, based on the properties they represent. The table below discusses them along with examples.

Variables and examples in epidemiological studies. (Source: Dicker, Coronado, Koo, & Parrish, 2006; Ressing, Blettner, & Klug, 2010)

Types of statistical analyses in epidemiological studies

Two broad types of analyses are undertaken in epidemiology. They depend upon the study aims (Fos, 2010):

  1. Descriptive epidemiology
  2. Analytical epidemiology

Descriptive epidemiology involves the study of the amount and distribution of disease within a population. This is determined according to the following variables:

  • person (demographic variables)
  • place (country, state, city, or urban or rural area of residence)
  • time (long- or short-term trends in disease conditions)
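As a rough sketch of how a descriptive summary by one of these variables might be computed, the snippet below counts cases per group and converts them to incidence per 100,000 population. The record layout, the `place` key and all numbers are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def incidence_per_100k(case_records, population_by_group, key):
    """Return cases per 100,000 population for each group.

    case_records        : list of dicts, one per case (hypothetical layout)
    population_by_group : dict mapping group name to population size
    key                 : the record field to group by, e.g. "place"
    """
    counts = defaultdict(int)
    for record in case_records:
        counts[record[key]] += 1
    return {group: counts[group] * 100_000 / size
            for group, size in population_by_group.items()}


if __name__ == "__main__":
    # Invented case list and population sizes for illustration.
    cases = [
        {"place": "urban", "age_group": "0-14"},
        {"place": "urban", "age_group": "15-64"},
        {"place": "rural", "age_group": "15-64"},
    ]
    populations = {"urban": 20_000, "rural": 5_000}
    for place, rate in incidence_per_100k(cases, populations, "place").items():
        print(f"{place}: {rate:.1f} cases per 100,000")
```

The same function can group by a person or time field by passing a different `key` and a matching population table.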

Information from descriptive analysis reveals the overall or specific impact of a disease and can point to its probable cause. Background knowledge about trends and possible causal factors helps develop hypotheses, which can then be tested using analytical epidemiology. Here, the possible relationship between disease outcomes and risk factors is studied through hypothesis testing (Szklo & Nieto, 2014). To model disease data effectively, statistical models are selected according to the type of data, the variables and the aim of the study.
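One common way to examine the relationship between a risk factor and a disease outcome is the odds ratio from a 2×2 case-control table. The sketch below computes it with an approximate 95% confidence interval based on the standard error of the log odds ratio (the Woolf method). The source does not prescribe this particular measure, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (odds_ratio, lower, upper) for a 2x2 case-control table.

    The interval uses the normal approximation on the log scale, so all
    four cell counts must be greater than zero.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    se_log_or = math.sqrt(sum(1 / count for count in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts: 40 of 100 cases and 20 of 100 controls were exposed.
    estimate, low, high = odds_ratio_with_ci(40, 60, 20, 80)
    print(f"Odds ratio {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An interval that excludes 1 suggests an association between exposure and disease at roughly the 5% significance level, although it says nothing by itself about causation.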

The table below shows the major differences between the two types of epidemiological studies.

| Characteristics | Descriptive Epidemiology | Analytical Epidemiology |
|---|---|---|
| Definition | Study of distribution of disease | Study to explain disease occurrence or causal mechanisms of disease |
| Aim | To describe trends or distributions and potential associations | To evaluate the causality of associations |
| Questions asked | Who, What, When, Where | Why and How |
| Data used | Preexisting data | New data collected or developed |
| Hypotheses | Formulate hypotheses | Test hypotheses |
| Subtypes | Individual-level and Population (ecologic) level | Intervention and Observational (Cross-sectional, Case-control and Cohorts) |
| Variable Triad | Person, Place and Time | Agent, Host, Environment |
Characteristics of Descriptive and Analytical Epidemiology (Source: Fos, 2010)

As the table above shows, there is a stark difference between the two types of epidemiological studies, most obviously in their study aims and the types of variables considered. The statistical tests and models applied in each type of research therefore differ as well.


References

  • Chubb, M. C., & Jacobsen, K. H. (2010). Mathematical modelling and the epidemiological research process. European Journal of Epidemiology, 25(1), 13–19.
  • Corley, C. D., Pullum, L. L., Hartley, D. M., Benedum, C., Noonan, C., Rabinowitz, P. M., & Lancaster, M. J. (2014). Disease Prediction Models and Operational Readiness. PLOS ONE, 9(3), e91989.
  • Dicker, R. C., Coronado, F., Koo, D., & Parrish, R. G. (2006). Introduction to Epidemiology. In Principles of Epidemiology in Public Health Practice: An Introduction to Applied Epidemiology and Biostatistics (Third, pp. 65–88).
  • Dubé, C., Garner, G., Stevenson, M., Sanson, R., Estrada, C., & Willeberg, P. (2007). The use of epidemiological models for the management of animal diseases. In 75th General Session of the International Committee of the World Organisation for Animal Health (OIE) (pp. 13–23). Paris.
  • Fos, P. J. (2010). Epidemiology Foundations: The Science of Public Health. John Wiley & Sons.
  • Heesterbeek, H., Anderson, R. M., Andreasen, V., Bansal, S., De Angelis, D., Dye, C., … Hollingsworth, T. D. (2015). Modeling infectious disease dynamics in the complex landscape of global health. Science, 347(6227), aaa4339.
  • Ressing, M., Blettner, M., & Klug, S. J. (2010). Data analysis of epidemiological studies: Part 11 of a series on evaluation of scientific publications. Deutsches Ärzteblatt International, 107(11), 187–192.
  • Szklo, M., & Nieto, J. (2014). Basic Study Designs in Analytical Epidemiology. In M. Szklo & J. Nieto (Eds.), Epidemiology: Beyond the Basics (Third, pp. 3–44). Jones & Bartlett Learning.
  • Vynnycky, E., & White, R. (2010). Introduction. The basics: Infections, Transmission and Models. In E. Vynnycky & R. White (Eds.), An Introduction to Infectious Disease Modelling (First, pp. 1–12). OUP Oxford.
  • World Health Organization. (2017). The top 10 causes of death.

