Understanding effect size in CMA

By Yashika Kapoor & Priya Chetty on April 29, 2018

The previous article, Data entry formats in CMA, showed the different data entry formats. In meta-analysis, effect size is of prime importance. An effect size is the magnitude of an effect resulting from a clinical treatment; in Comprehensive Meta Analysis (CMA), it is therefore referred to as the “treatment effect”. Because different studies generate datasets of different kinds, the effect size for a given study cannot be constrained to one particular index or measure (this article uses the term “measure”). Results may be reported as events and non-events, or as means and standard deviations, and the appropriate measure of effect size changes accordingly. The following rules govern the selection of an appropriate effect size measure for analysis:

  1. Effect sizes from different studies should measure the same phenomenon.
  2. Estimates of effect sizes should not require re-analyzing the raw data.
  3. Effect sizes should have sound technical characteristics.
  4. The reported effect sizes should be substantively interpretable.

Figure 1 below shows the different effect size measures computable in CMA, as determined by the type of outcome.

Figure 1: Different types of effect size indices depending on the type of outcome

This article defines the different effect size measures. It will help the reader to understand and select the most appropriate effect size measure for analyzing data.

Different outcomes, different effect sizes

Different study designs yield different outcomes, whether primary or secondary. When conducting a meta-analysis, it is essential to select the appropriate effect size measure, because this helps to fulfill the research goals and objectives. The types of outcome comprise continuous data (means and standard deviations), binary data and correlations. Figure 1 above shows the decision flow that can assist in selecting the appropriate effect size for the study.

Figure 2: Selecting effect size for continuous outcome

Continuous outcomes

For continuous outcomes, the CMA software can calculate the Difference in means (raw mean difference, unstandardized), the Standard difference in means, the Standard paired difference, and Hedges’ g (standardized mean differences).

Quick tip: If the studies report outcomes on the same scale of measurement, calculate the Difference in means. However, the Difference in means cannot be used if the studies measure outcomes with different instruments, such as different psychological scales or educational tests.

In such scenarios, resort to the Standardized mean difference (SMD) measures. Standardization divides the mean difference in each study by that study’s standard deviation. This creates a standard index that allows outcomes to be compared across studies. The choice of standard deviation becomes an important consideration when comparing results from different study designs with different study groups.

Quick tip: Some CMA formats require SD difference, whereas others require Common SD. SD difference (the SD of the difference) is the standard deviation of the pre-post difference scores within a single group, either treatment or control.

Common SD refers to the pooled SD of the treatment and control groups (independent/unmatched). It is calculated with the following formula:

$$S_{pooled} = \sqrt{\frac{(N - 1)\,S_1^2 + (n - 1)\,S_2^2}{N + n - 2}}$$

Where N = sample size of the first group, n = sample size of the second group, and S_1 and S_2 = standard deviations of the first and second groups, respectively.

SD difference also serves as the standardizing standard deviation, but for matched groups and one-group (pre-post) designs.
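The pooled-SD standardization for two independent groups can be illustrated with a short Python sketch. It is not CMA's implementation; the function names and the example numbers are hypothetical, and the small-sample correction used for Hedges' g is the common approximation J = 1 - 3/(4df - 1), which is assumed here rather than taken from the article.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled (common) SD of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def standardized_mean_difference(mean1, sd1, n1, mean2, sd2, n2):
    """Mean difference divided by the pooled SD (often called Cohen's d)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)

def hedges_g(d, n1, n2):
    """Apply the approximate small-sample correction J to d."""
    df = n1 + n2 - 2
    return d * (1 - 3 / (4 * df - 1))

# Hypothetical study: treatment mean 103 (SD 5.5), control mean 100 (SD 4.5),
# 50 participants per group.
d = standardized_mean_difference(103, 5.5, 50, 100, 4.5, 50)
g = hedges_g(d, 50, 50)
print(f"d = {d:.4f}, g = {g:.4f}")
```

Because the correction factor is always slightly below 1, g is marginally smaller than d, and the gap shrinks as the groups grow.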

Note: CMA does not offer the calculation of response ratio effect sizes, which is the log proportional change in the means of a treatment and control group (Lajeunesse 2011).

However, this measure is useful when outcomes are measured on a physical scale (length, weight), that is, when the outcome measurements employ a ratio scale. Response ratios are not useful when the studies use scales without natural units, such as GRE test scores. Likewise, behavioral attitude scores do not have a natural zero point.
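Since CMA does not compute response ratios, the calculation may have to be done by hand. The following Python sketch shows one way to do it; the function name and numbers are hypothetical, and the variance expression is the commonly used large-sample approximation, assumed here rather than taken from the article.

```python
import math

def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Log ratio of treatment mean to control mean, with its approximate variance.

    Only meaningful on a ratio scale, so both means must be positive.
    """
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("response ratios need positive means on a ratio scale")
    ln_rr = math.log(mean_t / mean_c)
    variance = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return ln_rr, variance

# Hypothetical example: plant height of 12 cm (SD 3) under treatment versus
# 10 cm (SD 2) in the control group, 20 plants each.
ln_rr, variance = log_response_ratio(12, 3, 20, 10, 2, 20)
print(f"ln(RR) = {ln_rr:.4f}, variance = {variance:.6f}")
```

The guard on positive means reflects the point made above: without a natural zero, the ratio of two means has no clear interpretation.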

Dichotomous outcomes

When dealing with dichotomous outcomes, such as the number of events, the Risk ratio, Odds ratio or Risk difference can be selected. CMA can also calculate several statistical variants of these classical effect measures, as shown in Figure 3 below. In simple terms, the risk ratio is the ratio of two risks, and the odds ratio is the ratio of two odds. The variants shown in Figure 3 help to draw statistical inferences that are better suited to the study design.

Odds ratio and risk ratio

These give information about the odds and risk of an event, respectively, usually for a single stratum or group, such as the odds ratio of respondents having PTSD and suffering from a disability (Byers et al. 2014). However, a study design might include subgroups, so that the observations are divided into several two-by-two (fourfold) tables, such as the MH odds ratio for circumcised men having high ALEX scores and normal men having low ALEX scores (Morris & Waskett 2012). In such a scenario, the MH (Mantel-Haenszel) odds or risk ratio is relevant. The MH ratio pools the odds or risk ratio across the strata of fourfold tables (Tripepi et al. 2010). The Peto odds ratio is another effect measure for pooling odds ratios from fourfold tables, and is known for handling rare events (Bradburn et al. 2007).
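The ratios described above can be sketched in Python for fourfold tables. Each table is written as (a, b, c, d), where a and b are events and non-events in the treated group and c and d are events and non-events in the control group. The function names and counts are hypothetical, and the pooled estimate uses the standard Mantel-Haenszel weighting rather than CMA's own code.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table: (a/b) / (c/d)."""
    return (a * d) / (b * c)

def risk_ratio(a, b, c, d):
    """Risk ratio of a single 2x2 table: a/(a+b) divided by c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio across strata, each stratum being one 2x2 table."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Two hypothetical strata of 200 participants each.
strata = [(10, 90, 5, 95), (20, 80, 10, 90)]
for table in strata:
    print(f"OR = {odds_ratio(*table):.3f}, RR = {risk_ratio(*table):.3f}")
print(f"MH pooled OR = {mantel_haenszel_odds_ratio(strata):.3f}")
```

The pooled value falls between the two stratum-specific odds ratios, as expected for a weighted combination.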

Figure 3: Variations of classical odds ratio, risk ratio and risk difference

Figure 3 above also shows the logarithms of the classical and variant odds and risk ratios. The natural logs of these ratios are taken because log ratios are better approximated by a normal distribution. On the log scale, a ratio becomes a difference of log parameters, which keeps the analysis symmetric (Balakrishnan 2014).
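The benefit of working on the log scale can be illustrated with a small Python sketch. It computes the log odds ratio of one fourfold table, its standard error from the usual large-sample formula, and a confidence interval that is built on the log scale and then back-transformed. The function name and counts are hypothetical, and all four cells are assumed to be non-zero.

```python
import math

def log_odds_ratio_ci(a, b, c, d, z=1.96):
    """Log odds ratio, its standard error, and a back-transformed 95% CI.

    Assumes every cell count is non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return log_or, se, (lower, upper)

log_or, se, (lower, upper) = log_odds_ratio_ci(10, 90, 5, 95)
print(f"ln(OR) = {log_or:.4f}, SE = {se:.4f}")
print(f"OR = {math.exp(log_or):.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")

# Swapping the two groups only flips the sign of the log odds ratio.
swapped_log_or, _, _ = log_odds_ratio_ci(5, 95, 10, 90)
print(f"swapped ln(OR) = {swapped_log_or:.4f}")
```

On the raw scale, the interval is not symmetric around the odds ratio, whereas on the log scale it is, which is the symmetry referred to above.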

Risk difference

It represents the difference between the two risks. Because it is computed in raw units, the risk difference is an absolute measure and is sensitive to baseline risks. When reporting the clinical impact of a treatment, it should be given preference over the two ratios. The MH risk difference combines the risk differences from different strata to produce a common risk difference (Klingenberg 2013).
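The sensitivity to baseline risk can be seen in a brief Python sketch with hypothetical counts. Both example studies have a risk ratio of 2, yet their risk differences are very different. The function name is an assumption, and the standard error uses the usual large-sample formula for the difference of two proportions.

```python
import math

def risk_difference(events_t, n_t, events_c, n_c):
    """Risk difference (treated minus control) and its standard error."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    rd = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return rd, se

# High baseline risk: 10% versus 5% (risk ratio 2).
rd_high, se_high = risk_difference(10, 100, 5, 100)
# Low baseline risk: 2% versus 1% (risk ratio 2 as well).
rd_low, se_low = risk_difference(20, 1000, 10, 1000)

print(f"high baseline: RD = {rd_high:.3f} (SE {se_high:.4f})")
print(f"low baseline:  RD = {rd_low:.3f} (SE {se_low:.4f})")
```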


Correlation

The correlation coefficient can serve as an effect size measure for studies having one group only. Because the variance of a correlation depends strongly on the correlation itself, the Fisher’s z score is preferred for carrying out the meta-analysis. CMA offers both Correlation and Fisher’s Z effect size measures.
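A minimal Python sketch of the Fisher's z transformation follows. The function names and the example values are hypothetical. The transformed value has an approximate variance of 1/(n - 3), which no longer depends on the correlation, and results are converted back to the correlation scale for reporting.

```python
import math

def fisher_z(r, n):
    """Fisher's z of a correlation r from n observations, with its standard error."""
    z = math.atanh(r)            # same as 0.5 * ln((1 + r) / (1 - r))
    se = 1 / math.sqrt(n - 3)    # depends only on the sample size
    return z, se

def z_to_r(z):
    """Convert a Fisher's z value back to a correlation."""
    return math.tanh(z)

# Hypothetical study: r = 0.50 observed in 100 participants.
z, se = fisher_z(0.50, 100)
lower, upper = z_to_r(z - 1.96 * se), z_to_r(z + 1.96 * se)
print(f"z = {z:.4f}, SE = {se:.4f}")
print(f"95% CI for r = ({lower:.3f}, {upper:.3f})")
```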


References

  • Balakrishnan, N., 2014. Methods and applications of statistics in clinical trials, volume 1: concepts, principles, trials, and designs, John Wiley & Sons.
  • Bradburn, M.J. et al., 2007. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Statistics in medicine, 26(1), pp.53–77.
  • Byers, A.L. et al., 2014. Chronicity of posttraumatic stress disorder and risk of disability in older persons. JAMA psychiatry, 71(5), pp.540–546.
  • Klingenberg, B., 2013. A new and improved confidence interval for the Mantel-Haenszel risk difference. Statistics in medicine, 33(17), pp.2968–2983.
  • Lajeunesse, M.J., 2011. On the meta-analysis of response ratios for studies with correlated and multi-group designs. Ecology, 92(11), pp.2049–2055.
  • Morris, B.J. & Waskett, J.H., 2012. Claims that circumcision increases Alexithymia and Erectile Dysfunction are unfounded: A critique of Bollinger and Van Howe’s Alexithymia and Circumcision Trauma: A preliminary investigation. International Journal of Men’s Health, 11(2), pp.177–184.
  • Tripepi, G. et al., 2010. Stratification for confounding – Part 1: The Mantel-Haenszel formula. Nephron Clinical Practice, 116(4), pp.317–321.
