Understanding effect size in CMA

The previous article, Data entry formats in CMA, presented the different data entry formats. In meta-analysis, the effect size is of prime importance. An effect size is the magnitude of an effect resulting from a clinical treatment; in Comprehensive Meta Analysis (CMA) it is therefore referred to as the “treatment effect”. Because different studies generate datasets of different kinds, the effect size for a given study cannot be constrained to one particular index or measure (the term “measure” is used in this article). Results may be reported as events and non-events, or as means and standard deviations, and the appropriate measure of effect size changes accordingly. The following rules govern the selection of an appropriate measure of effect size for analysis:

  1. Effect sizes from different studies should measure the same phenomenon.
  2. Estimates of effect sizes should not require re-analyzing the raw data.
  3. Effect sizes should have sound technical characteristics.
  4. The reported effect sizes should be substantively interpretable.

Figure 1 shows the different effect size measures computable in CMA, as determined by the type of outcome.

Figure 1: Different types of effect size indices depending on the type of outcome

This article defines the different effect size measures. It will help the reader to understand and select the most appropriate effect size measure for analyzing data.

Different outcomes, different effect sizes

Different study designs yield different outcomes, whether primary or secondary. When conducting a meta-analysis, it is essential to select the effect size measure that matches the outcome, since this is what allows the analysis to meet its research goals and objectives. The types of outcomes are continuous data (means and standard deviations), binary data and correlations. Figure 1 above shows the decision flow, which can assist in selecting the appropriate effect size for the study.

Figure 2: Selecting effect size for continuous outcome

Continuous outcomes

For continuous outcomes, CMA can calculate the Difference in means (the raw, unstandardized mean difference) and the standardized mean differences: the Standard difference in means, the Standard paired difference, and Hedges’ g.

Quick tip: If the studies report outcomes on the same scale of measurement, calculate the Difference in means. However, the Difference in means cannot be used when the studies measure outcomes with different instruments, as is common with psychological scales or educational tests.

In such scenarios, use a Standardized mean difference (SMD). The standardized measures divide the mean difference in each study by that study's standard deviation. This creates a standard index, which allows outcomes to be compared across studies. The choice of standard deviation becomes an important consideration when comparing results from different study designs with different study groups.

Quick tip: Some CMA formats require SD difference (SD of difference), whereas others require Common SD. SD difference is the standard deviation of the pre-post difference scores within a single group, either treatment or control.

Common SD refers to the pooled SD of the treatment and control groups (independent/unmatched). It is given by the following formula:

$$S_{pooled} = \sqrt{\frac{(N - 1)S_1^2 + (n - 1)S_2^2}{N + n - 2}}$$

Where N = sample size of the first group, n = sample size of the second group, S_1 = standard deviation of the first group, S_2 = standard deviation of the second group.
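The pooled SD and the standardized mean differences built on it can be sketched in a few lines of Python. This is a minimal illustration, not CMA's own code: the function names and the summary data are hypothetical, and the small-sample correction used for Hedges' g is the common approximation J = 1 - 3/(4df - 1).

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def standardized_mean_difference(mean1, sd1, n1, mean2, sd2, n2):
    """Return (d, g): the standardized difference in means and Hedges' g."""
    d = (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)
    # Small-sample correction factor (approximation, assumed here).
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return d, j * d

# Hypothetical summary data: treatment group vs control group.
d, g = standardized_mean_difference(103.0, 5.5, 50, 100.0, 4.5, 50)
print(f"d = {d:.4f}, g = {g:.4f}")
```

Hedges' g is always slightly closer to zero than d, and the gap shrinks as the sample sizes grow.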

SD difference plays the corresponding role for matched groups and one-group (pre-post) designs, where it is the standard deviation of the paired difference scores.
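For a pre-post design, the SD of the difference scores can be computed either from the raw pairs or from the two SDs and the pre-post correlation. The sketch below shows that both routes agree; the data and function names are hypothetical and serve only as illustration.

```python
import math
import statistics

def sd_of_difference(sd_pre, sd_post, r):
    """SD of paired differences from the two SDs and their correlation r."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical pre and post scores for the same five participants.
pre = [10.0, 12.0, 9.0, 14.0, 11.0]
post = [12.0, 15.0, 10.0, 15.0, 14.0]

# Route 1: directly from the paired difference scores.
sd_diff_raw = statistics.stdev([b - a for a, b in zip(pre, post)])

# Route 2: from summary statistics, as a published study might report them.
sd_diff_summary = sd_of_difference(
    statistics.stdev(pre), statistics.stdev(post), pearson_r(pre, post)
)
print(sd_diff_raw, sd_diff_summary)
```

The second route matters in practice because many studies report only the pre and post SDs, so the correlation has to be obtained or assumed.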

Note: CMA does not offer the calculation of response ratio effect sizes, which is the log proportional change in the means of a treatment and control group (Lajeunesse 2011).

However, this measure is useful when outcomes are measured on a ratio scale with a natural zero point, such as physical measurements (length, weight). It is not useful when studies use scales without natural units, such as GRE test scores; behavioral attitude scores, likewise, have no natural zero point.
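Since CMA does not compute it, the log response ratio can be calculated by hand before entry. The sketch below assumes the usual large-sample variance formula for independent groups; the function name and the numbers are hypothetical.

```python
import math

def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (lnRR, variance) for two independent groups."""
    if mean_t <= 0 or mean_c <= 0:
        # The log of a ratio is only defined for positive means,
        # which is why a natural zero point is required.
        raise ValueError("both means must be positive")
    lnrr = math.log(mean_t / mean_c)
    var = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return lnrr, var

# Hypothetical example: mean weight of 12 units (treatment) vs 10 units (control).
lnrr, var = log_response_ratio(12.0, 3.0, 20, 10.0, 2.5, 20)
print(f"lnRR = {lnrr:.4f}, variance = {var:.5f}")
```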

Dichotomous outcome

When dealing with dichotomous outcomes, such as the number of events, the Risk ratio, Odds ratio or Risk difference can be selected. CMA can also calculate statistical variations of these classical effect measures, as shown in Figure 3. In simple terms, the risk ratio is the ratio of two risks, and the odds ratio is the ratio of two odds. The variations shown in Figure 3 help to draw statistical inferences better suited to the study design.
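The three classical measures can all be derived from the same fourfold table. The following sketch uses hypothetical counts and a hypothetical function name, and it assumes no cell is zero.

```python
def effect_sizes_2x2(events_t, n_t, events_c, n_c):
    """Risk ratio, odds ratio and risk difference from event counts."""
    risk_t = events_t / n_t
    risk_c = events_c / n_c
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return {
        "risk_ratio": risk_t / risk_c,
        "odds_ratio": odds_t / odds_c,
        "risk_difference": risk_t - risk_c,
    }

# Hypothetical trial: 15/100 events under treatment, 30/100 under control.
result = effect_sizes_2x2(15, 100, 30, 100)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these counts the odds ratio lies further from 1 than the risk ratio, which is typical when the event is not rare.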

Odds ratio and risk ratio

These give information about the odds and risk of an event, respectively, usually for a single stratum or group, such as the odds ratio of respondents having PTSD and suffering from a disability (Byers et al. 2014). However, a study design might consist of subgroups, with the observations divided into several fourfold (2 × 2) tables, as in the MH odds ratio for circumcised men having high ALEX scores and normal men having low ALEX scores (Morris & Waskett 2012). In such a scenario the MH (Mantel-Haenszel) odds/risk ratio is relevant: it pools the odds/risk ratio across the strata of fourfold tables (Tripepi et al. 2010). The Peto odds ratio is another effect measure for pooling odds ratios from fourfold tables, and has a reputation for handling rare events (Bradburn et al. 2007).
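As a rough illustration of how the two pooled estimators combine strata, the sketch below implements the textbook Mantel-Haenszel and Peto formulas. Each stratum is a hypothetical tuple (a, b, c, d): events and non-events in the treated group, then events and non-events in the control group. It is not CMA's implementation.

```python
import math

def mantel_haenszel_or(strata):
    """Pooled odds ratio across fourfold tables (Mantel-Haenszel)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def peto_or(strata):
    """Pooled odds ratio across fourfold tables (Peto one-step method)."""
    o_minus_e = 0.0
    variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        expected = (a + b) * (a + c) / n  # expected treated events under no effect
        o_minus_e += a - expected
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    return math.exp(o_minus_e / variance)

# Two hypothetical strata that share the same stratum-specific odds ratio.
strata = [(10, 90, 20, 80), (5, 45, 10, 40)]
mh = mantel_haenszel_or(strata)
peto = peto_or(strata)
print(f"MH OR = {mh:.4f}, Peto OR = {peto:.4f}")
```

The two estimates are close but not identical here, because the Peto method is an approximation that works best for rare events and effects near 1.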

Figure 3: Variations of classical odds ratio, risk ratio and risk difference

Figure 3 also shows the logarithms of the classical and varied odds and risk ratios. The natural logs of these ratios are taken because the log ratios are better approximated by a normal distribution. On the log scale, a ratio becomes a difference between log parameters, which keeps the analysis symmetric (Balakrishnan 2014).
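The symmetry argument is easy to see numerically. In this sketch (hypothetical counts and function name, standard large-sample standard error, 1.96 as the normal critical value), the confidence interval is built on the log scale and then back-transformed.

```python
import math

def log_odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (log OR, OR, lower, upper) for a fourfold table.

    a, b: events and non-events in the treated group
    c, d: events and non-events in the control group
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return log_or, math.exp(log_or), lower, upper

log_or, odds_ratio, lower, upper = log_odds_ratio_ci(15, 85, 30, 70)
print(f"OR = {odds_ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

The interval is symmetric around the log odds ratio but asymmetric around the odds ratio itself, which is why the pooling is done on the log scale.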

Risk difference

It represents the difference between the two risks, computed in the raw units. It is therefore an absolute measure and is sensitive to baseline risk. When reporting the clinical impact of a treatment, it should be given preference over the two ratios. The MH risk difference combines the risk differences from different strata to produce a common risk difference (Klingenberg 2013).
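A minimal sketch of the Mantel-Haenszel risk difference follows, using the textbook weights n1 × n0 / N per stratum. The strata are hypothetical tuples (a, b, c, d) of events and non-events in the treated group, then in the control group; the function name is likewise an assumption.

```python
def mantel_haenszel_rd(strata):
    """Common risk difference across fourfold tables (Mantel-Haenszel weights)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in strata:
        n_t, n_c = a + b, c + d
        weight = n_t * n_c / (n_t + n_c)
        weighted_sum += weight * (a / n_t - c / n_c)
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical strata with risk differences of -0.10 and -0.08.
strata_rd = [(10, 90, 20, 80), (6, 44, 10, 40)]
common_rd = mantel_haenszel_rd(strata_rd)
print(f"MH risk difference = {common_rd:.4f}")
```

The pooled value falls between the two stratum-specific differences and sits nearer to the larger stratum.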


Correlation

The correlation coefficient can serve as an effect size measure for studies having one group only. Because the variance of a correlation depends strongly on the correlation itself, the Fisher’s z transformation is usually preferred for carrying out the meta-analysis. CMA offers both the Correlation and Fisher’s Z effect size measures.
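The transformation and its back-transformation can be sketched as follows. The studies are hypothetical (correlation, sample size) pairs, and the pooling step is a simple fixed-effect, inverse-variance average, shown only to illustrate why the z scale is convenient.

```python
import math

def fisher_z(r):
    """Fisher's z transformation of a correlation."""
    return math.atanh(r)

def fisher_z_variance(n):
    """Variance of Fisher's z; it depends only on the sample size."""
    return 1 / (n - 3)

def pooled_correlation(studies):
    """Inverse-variance average on the z scale, converted back to r."""
    weights = [1 / fisher_z_variance(n) for _, n in studies]
    z_bar = sum(w * fisher_z(r) for w, (r, _) in zip(weights, studies)) / sum(weights)
    return math.tanh(z_bar)

# Hypothetical studies: (correlation, sample size).
studies = [(0.5, 40), (0.3, 100)]
r_pooled = pooled_correlation(studies)
print(f"pooled r = {r_pooled:.4f}")
```

The analysis is run on z and the summary is converted back to r for reporting.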


References

  • Balakrishnan, N., 2014. Methods and applications of statistics in clinical trials, volume 1: concepts, principles, trials, and designs, John Wiley & Sons.
  • Bradburn, M.J. et al., 2007. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Statistics in medicine, 26(1), pp.53–77.
  • Byers, A.L. et al., 2014. Chronicity of posttraumatic stress disorder and risk of disability in older persons. JAMA psychiatry, 71(5), pp.540–546.
  • Klingenberg, B., 2013. A new and improved confidence interval for the Mantel-Haenszel risk difference. Statistics in medicine, 33(17), pp.2968–2983.
  • Lajeunesse, M.J., 2011. On the meta-analysis of response ratios for studies with correlated and multi-group designs. Ecology, 92(11), pp.2049–2055.
  • Morris, B.J. & Waskett, J.H., 2012. Claims that circumcision increases Alexithymia and Erectile Dysfunction are unfounded: A critique of Bollinger and Van Howe’s Alexithymia and Circumcision Trauma: A preliminary investigation. International Journal of Men’s Health, 11(2), pp.177–184.
  • Tripepi, G. et al., 2010. Stratification for confounding – Part 1: The Mantel-Haenszel formula. Nephron Clinical Practice, 116(4), pp.317–321.
Yashika Kapoor

Research analyst at Project Guru