# Significance of statistical analysis in epidemiological studies

Infectious diseases continue to pose a significant threat to humans and animals. Stringent disease control policies and advances in vaccines have not eradicated them. In 2015 alone, the 10 deadliest diseases were responsible for killing 30 million people. Among these, communicable diseases such as lower respiratory infections, diarrhea, tuberculosis and HIV were the major culprits (World Health Organization, 2017). Emerging infectious diseases now pose an even greater challenge. These include SARS, MERS, HIV/AIDS, Ebola and Zika, among others (Heesterbeek et al., 2015). This continuous threat has renewed efforts in epidemiological studies worldwide.

Consequently, researchers and public health officials have to study a wide range of factors influencing disease patterns in a population or area. They have to understand the relationships between these factors and make predictions, so they use statistical and mathematical models. These models help them understand patterns in the occurrence of diseases. This article explores the role and advantages of statistical and mathematical models in infectious disease control.

## Need for statistical modeling in infectious disease research

A model is a simplified version of a complex phenomenon. In this case, the phenomenon is a disease event, epidemic or pandemic (Vynnycky & White, 2010). Models help in understanding a system better by studying the interactions within it, and they help derive the influence of external and internal factors on the system. Epidemiological models can be defined as mathematical representations of the epidemiology of disease transmission and its associated processes (Dubé et al., 2007). Infectious diseases require close monitoring of disease spread, as well as of population parameters, rates of infection, mortality and risk factors. Data on disease epidemiology can be collected using monitoring and biosurveillance methods, with the choice depending on the purpose of the study. Biosurveillance models are abstract computational, algorithmic, statistical or mathematical representations that produce informative output related to event detection and risk. Such models are designed based on past data.
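As a rough illustration of how past data can drive event detection, the sketch below flags weeks whose case count exceeds a threshold derived from a historical baseline. The function name `flag_outbreak_weeks`, the weekly counts and the "mean plus two standard deviations" rule are all assumptions chosen for illustration. They are not taken from the cited sources, and operational biosurveillance systems typically use more refined methods.

```python
import statistics

def flag_outbreak_weeks(baseline_counts, recent_counts, z=2.0):
    """Flag recent weeks whose count exceeds mean + z * SD of the baseline.

    baseline_counts: weekly case counts from a past, outbreak-free period
    recent_counts:   weekly case counts to be screened
    Returns the threshold and a list of (week_number, count) pairs above it.
    """
    mean = statistics.mean(baseline_counts)
    sd = statistics.stdev(baseline_counts)  # sample standard deviation
    threshold = mean + z * sd
    flagged = [
        (week, count)
        for week, count in enumerate(recent_counts, start=1)
        if count > threshold
    ]
    return threshold, flagged

# Hypothetical weekly counts, invented for this example
baseline = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13]
recent = [14, 13, 17, 25, 31]

threshold, flagged = flag_outbreak_weeks(baseline, recent)
print(f"Alert threshold: {threshold:.1f} cases per week")
for week, count in flagged:
    print(f"Week {week}: {count} cases exceeds the threshold")
```

With these invented numbers the threshold is about 16.7 cases per week, so weeks 3, 4 and 5 are flagged.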

## Difference between mathematical and statistical models in epidemiology

Biosurveillance models can be of different types. These are (Corley et al., 2014):

• Proactive or anticipatory (detecting or forecasting an event),
• Risk assessment, or
• Descriptive interpretation (understanding disease dynamics or drivers).

There is a considerable difference between mathematical and statistical models. Mathematical models quantitatively study the underlying dynamics of a disease epidemic. Statistical models, in contrast, ‘formalize relationships between different variables’. These relationships include causation between two or more variables, variables influencing the spread of disease, and hypothesis testing of assumptions (Chubb & Jacobsen, 2010).
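To make the idea of a mathematical model concrete, here is a minimal sketch of the classic SIR (susceptible, infectious, recovered) compartmental model. The article does not describe this model itself, so it is offered only as one common example of studying epidemic dynamics quantitatively. The function name `simulate_sir`, the parameter values and the forward-Euler integration scheme are assumptions made for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with a simple forward-Euler scheme.

    beta:  transmission rate per day (illustrative value, not fitted to data)
    gamma: recovery rate per day (1 / average infectious period)
    Returns a list of (day, S, I, R) tuples, one per whole day.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    trajectory = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((day, s, i, r))
    return trajectory

# Hypothetical closed population of 1,000 with 10 initial cases
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
peak_day, _, peak_infectious, _ = max(trajectory, key=lambda row: row[2])
print(f"Basic reproduction number (beta / gamma): {0.3 / 0.1:.1f}")
print(f"Peak of about {peak_infectious:.0f} infectious people on day {peak_day}")
```

A statistical model, by contrast, would start from observed data and estimate quantities such as `beta` or test whether a risk factor is associated with infection, rather than assuming the dynamics in advance.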

## Advantages of statistical analysis in epidemiological studies

The major advantages of using modeling techniques in epidemiological studies are (Chubb & Jacobsen, 2010):

• Statistical modeling helps detect disease outbreaks at an early stage and so contributes to control strategies.
• Population risks and the factors contributing to them can be assessed, which helps minimise the prevalence of disease in a particular area.
• Models help predict the spread of diseases across populations and geographical regions, including the spread of vector-borne diseases across natural landscapes.
• Stochastic models help assess multiple possible outcomes of a particular event or strategy in disease control.
• Spatial modeling can help predict the impact of spatially targeted strategies on disease spread.
• Simulation models can help simulate disease epidemics in silico. They enable understanding of the factors influencing spread and can identify the most probable outcome of an epidemic.
• Prediction models can help forecast disease incidence, so that prevalence within communities and regions can be estimated.
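The point about stochastic and simulation models can be sketched with a small in silico experiment. The example below uses a Reed-Frost chain-binomial model, chosen here purely as a simple, well-known stochastic epidemic model; the article does not name a specific one. The function name `reed_frost_epidemic`, the population size and the contact probability are assumptions for illustration.

```python
import random

def reed_frost_epidemic(population, initial_infected, p_contact, rng):
    """Run one Reed-Frost chain-binomial epidemic and return its final size.

    p_contact: probability that a given infectious person infects a given
               susceptible person within one generation (illustrative value)
    """
    susceptible = population - initial_infected
    infected = initial_infected
    total_cases = initial_infected
    while infected > 0 and susceptible > 0:
        # Chance that a susceptible is infected by at least one current case
        p_infection = 1 - (1 - p_contact) ** infected
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_infection)
        susceptible -= new_cases
        total_cases += new_cases
        infected = new_cases  # previous generation is assumed to recover
    return total_cases

rng = random.Random(42)  # fixed seed so the runs are reproducible
final_sizes = [reed_frost_epidemic(100, 1, 0.02, rng) for _ in range(1000)]

minor = sum(1 for size in final_sizes if size < 10)
print(f"Runs that faded out with fewer than 10 cases: {minor} of {len(final_sizes)}")
print(f"Smallest and largest epidemics: {min(final_sizes)} and {max(final_sizes)}")
print(f"Mean final size: {sum(final_sizes) / len(final_sizes):.1f}")
```

Repeating the same scenario many times gives a distribution of outcomes rather than a single trajectory, which is what allows both the range of possible outcomes and the most probable one to be assessed.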

## Types of variables considered in epidemiological data

Epidemiological data often consist of different types of variables, each of which is important in data analysis. Seven major classes of variables have been identified, based on the properties they represent. The table below discusses them along with examples.

Variables and examples in epidemiological studies. (Source: Dicker, Coronado, Koo, & Parrish, 2006; Ressing, Blettner, & Klug, 2010)

## Types of statistical analyses in epidemiological studies

Two broad types of analyses are undertaken in epidemiology. They depend upon the study aims (Fos, 2010):

1. Descriptive epidemiology
2. Analytical epidemiology

Descriptive epidemiology involves the study of the amount and distribution of disease within a population. This is determined according to the following variables:

• person (demographic variables),
• place (country, state, city, or urban or rural area of residence), and
• time (long- or short-term trends in disease conditions).
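The person, place and time breakdown above can be sketched as a simple tabulation of case records. The line list, the population figures and the helper name `rate_per_1000` below are all hypothetical and exist only to show the shape of a descriptive summary.

```python
from collections import Counter

# Hypothetical line list: (age_group, area, week_of_onset)
cases = [
    ("0-14", "urban", 1), ("15-64", "urban", 1), ("15-64", "rural", 2),
    ("65+", "urban", 2), ("0-14", "rural", 2), ("15-64", "urban", 3),
    ("0-14", "urban", 3), ("65+", "rural", 3),
]
# Hypothetical population at risk in each area
population_by_area = {"urban": 5000, "rural": 2000}

def rate_per_1000(case_count, population):
    """Cases per 1,000 people at risk."""
    return 1000 * case_count / population

cases_by_age = Counter(age for age, _, _ in cases)     # person
cases_by_area = Counter(area for _, area, _ in cases)  # place
cases_by_week = Counter(week for _, _, week in cases)  # time

print("Cases by age group:", dict(cases_by_age))
print("Cases by week of onset:", dict(sorted(cases_by_week.items())))
for area, count in cases_by_area.items():
    rate = rate_per_1000(count, population_by_area[area])
    print(f"{area}: {count} cases, {rate:.1f} per 1,000")
```

Dividing counts by the population at risk matters here: in this invented data the urban area has more cases, but the rural area has the higher rate.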

Information from descriptive analysis can indicate the overall or specific impact of the disease and point to its probable cause. Background knowledge about trends and possible causal factors then leads to the development of hypotheses. These hypotheses are tested using analytical epidemiology, in which the possible relationship between disease outcomes and risk factors is studied through hypothesis testing (Szklo & Nieto, 2014). To model a particular disease dataset effectively, statistical models are selected according to the type of data, the variables and the aim of the study.
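As a small worked example of hypothesis testing in analytical epidemiology, the sketch below summarises a 2×2 table of exposure against disease using the relative risk, the odds ratio and a Pearson chi-square test. The counts are invented, the function name `two_by_two_summary` is hypothetical, and the choice of these particular measures is an assumption for illustration rather than a method prescribed by the cited sources.

```python
import math

def two_by_two_summary(a, b, c, d):
    """Summarise a 2x2 exposure-by-disease table.

    a: exposed cases       b: exposed non-cases
    c: unexposed cases     d: unexposed non-cases
    """
    n = a + b + c + d
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic, 1 degree of freedom, no continuity correction
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail p-value for 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi_square / 2))
    # Approximate 95% confidence interval for the odds ratio (Woolf's method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return {
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "odds_ratio_ci": (ci_low, ci_high),
        "chi_square": chi_square,
        "p_value": p_value,
    }

# Hypothetical study: 30 of 100 exposed and 10 of 100 unexposed people fell ill
result = two_by_two_summary(a=30, b=70, c=10, d=90)
print(f"Relative risk: {result['relative_risk']:.2f}")
print(f"Odds ratio: {result['odds_ratio']:.2f}")
print(f"Chi-square: {result['chi_square']:.2f}, p = {result['p_value']:.4f}")
```

For these invented counts the relative risk is 3.0 and the chi-square statistic is 12.5, so the null hypothesis of no association between exposure and disease would be rejected at the conventional 5% level.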

The table below shows the major differences between the two types of epidemiological studies.

| Characteristics | Descriptive Epidemiology | Analytical Epidemiology |
| --- | --- | --- |
| Definition | Study of distribution of disease | Study to explain disease occurrence or causal mechanisms of disease |
| Aim | To describe trends or distributions and potential associations | To evaluate causality of associations |
| Questions asked | Who, What, When, Where | Why and How |
| Data used | Preexisting data | New data collected or developed |
| Hypotheses | Formulate hypotheses | Test hypotheses |
| Subtypes | Individual level and Population (ecologic) level | Intervention and Observational (Cross-sectional, Case control and Cohorts) |
| Variable Triad | Person, Place and Time | Agent, Host, Environment |

Characteristics of Descriptive and Analytical Epidemiology (Source: Fos, 2010)

As seen from the table above, there is a stark difference between the two types of epidemiological studies. This is most obvious from the study aims and the types of variables considered. The statistical tests and models applied in each type of research are therefore different. The next article discusses the common statistical tests and models applied in descriptive and analytical epidemiological studies.

#### References

• Chubb, M. C., & Jacobsen, K. H. (2010). Mathematical modeling and the epidemiological research process. European Journal of Epidemiology, 25(1), 13–19.
• Corley, C. D., Pullum, L. L., Hartley, D. M., Benedum, C., Noonan, C., Rabinowitz, P. M., & Lancaster, M. J. (2014). Disease Prediction Models and Operational Readiness. PLOS ONE, 9(3), e91989.
• Dicker, R. C., Coronado, F., Koo, D., & Parrish, R. G. (2006). Introduction to Epidemiology. In Principles of Epidemiology in Public Health Practice: An Introduction to Applied Epidemiology and Biostatistics (Third, pp. 65–88).
• Dubé, C., Garner, G., Stevenson, M., Sanson, R., Estrada, C., & Willeberg, P. (2007). The use of epidemiological models for the management of animal diseases. In 75th General Session of the International Committee of the World Organisation for Animal Health (OIE) (pp. 13–23). Paris.
• Fos, P. J. (2010). Epidemiology Foundations: The Science of Public Health. John Wiley & Sons.
• Heesterbeek, H., Anderson, R. M., Andreasen, V., Bansal, S., De Angelis, D., Dye, C., … Hollingsworth, T. D. (2015). Modeling infectious disease dynamics in the complex landscape of global health. Science, 347(6227), aaa4339.
• Ressing, M., Blettner, M., & Klug, S. J. (2010). Data analysis of epidemiological studies: Part 11 of a series on evaluation of scientific publications. Deutsches Ärzteblatt International, 107(11), 187–192.
• Szklo, M., & Nieto, J. (2014). Basic Study Designs in Analytical Epidemiology. In M. Szklo & J. Nieto (Eds.), Epidemiology: Beyond the Basics (Third, pp. 3–44). Jones & Bartlett Learning.
• Vynnycky, E., & White, R. (2010). Introduction. The basics: Infections, Transmission and Models. In E. Vynnycky & R. White (Eds.), An Introduction to Infectious Disease Modelling (First, pp. 1–12). OUP Oxford.
• World Health Organization. (2017). The top 10 causes of death.

### Chandrika Kapagunta

Research Analyst at Project Guru
Chandrika is a nature enthusiast with special love for the marine world. Her Master’s degree in Marine Biotechnology and Scuba Diving experience has made her a strong advocate of environment and marine conservation, especially through bioremediation. She believes in finding solutions of everyday human problems in nature, be it medicines, technology or philosophy. Having worked as a volunteer at The Bombay Natural History Society and as a Senior Research Fellow at Central Institute of Fisheries Education, she has had exposure to the current state of the academic research, specifically in the field of environmental biotechnology.
