Common sample sizes needed are computed and presented in this vignette.
## -- Given ---------------------------
## r = 0.2
## d = 0.4082483
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## power = 0.8
## Minimum sample size = 193
Gignac and Szodorai (2016) recommended to consider correlations of 0.10, 0.20, and 0.30 as relatively small, typical, and relatively large for individual differences research. Use 0.20 as the prior effect size of Pearson’s r, then the sample size needed is 193.
Assume that we test the significance of the indirect effect using the causal steps approach (Hayes, 2009). Assume a is the effect of X on M, and b is the effect of M on Y controlling for X, then the indirect effect of X on Y through M is a*b. For statistical inference, if a and b are significant, then we could infer the product of a and b (the indirect effect) should be significant. To plan the sample size for mediation models, we should ensure the significance of a and b in regression models.
According to Gignac and Szodorai (2016), we set PRE of a and b to be the square of 0.2. That’s 0.04.
The regression with a has two terms: intercept, and X, hence
PA
= 2. The regression without a has one term: intercept,
hence PC
= 1. Then we have:
## -- Given ---------------------------
## PRE = 0.04
## f_squared = 0.04166667
## PC = 1
## PA = 2
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## Expected power = 0.8
## Minimum sample size = 193
Accordingly, the minimum sample size for a is 193.
The regression with b has three terms: intercept, X, and M, hence
PA
= 3. The regression without b has two terms: intercept,
and X, hence PC
= 2.
## -- Given ---------------------------
## PRE = 0.04
## f_squared = 0.04166667
## PC = 2
## PA = 3
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## Expected power = 0.8
## Minimum sample size = 194
Accordingly, the minimum sample size for b is 194.
Taking both a and b into account, the minimum sample size for the indirect effect is 194.
For moderation model, we should ensure the significance of the
interaction term. Assume the effect of X on Y is moderated by W, then
the compact regression model has three terms: intercept, X, and W, hence
the argument PC
is 3. The augmented regression model has
four terms: intercept, X, W, and the interaction term X*W, hence the
argument PA
is 4. Following Gignac and Szodorai (2016), we
assume the PRE of the interaction term is 0.04, then
## -- Given ---------------------------
## PRE = 0.04
## f_squared = 0.04166667
## PC = 3
## PA = 4
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## Expected power = 0.8
## Minimum sample size = 195
Accordingly, the minimum sample size we need is 195.
When an independent variable has more-than-two levels, it’s commonly recommended to use ANOVA. Actually, we could substitute regression for ANOVA by transforming a multicategorical predictor with k levels into (k-1) predictors.
For example, we have two factors: A and B. Both A and B have three
levels. In regression analysis, we should transform A into 2 predictors,
A_code1, A_code2. B should be tranformed into B_code1 and B_code2. To
test the interaction of A and B, we should make products of A and B,
then we attain A_code1*B_code1, A_code2*B_code1, A_code2*B_code1,
A_code2*B_code1. The regression for this ANOVA finally has 9 terms:
intercept, A_code1, A_code2, B_code1, B_code2, A_code1*B_code1,
A_code2*B_code1, A_code2*B_code1, A_code2*B_code1, hence PA
= 9. To test the significance of the interaction of A and B, we should
conduct hierarchical regression to test the effect of all four products,
hence PC
= 5. For the PRE of all four products as a whole,
PRE should be larger than 0.04, and we assume the PRE is 0.13 using
Cohen’s (1988) criterion for medium effect size. Then we have,
## -- Given ---------------------------
## PRE = 0.13
## f_squared = 0.1494253
## PC = 5
## PA = 9
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## Expected power = 0.8
## Minimum sample size = 94
Accordingly, the minimum sample size we need is 94.
Assume the PRE of the focal predictor (e.g., the mediator M in the mediation model, or the interaction term in the moderation model) is the same, the addition of other covariates only changes PC and PA. For example, if we additionally control for gender and age, then for the moderation model we have:
## -- Given ---------------------------
## PRE = 0.04
## f_squared = 0.04166667
## PC = 5
## PA = 6
## sig_level = 0.05
## -- Post-hoc power analysis --------
## n = NULL
## power_post = NULL
## -- Sample size planning -----------
## Expected power = 0.8
## Minimum sample size = 197
Accordingly, the minimum sample size we need is 197.
Ideally assuming the PRE of the focal predictor(s) keeping constant, when the number of the covariate(s) increases by one, the planed sample size increases by one. In practice, the addition of covariates almost always reduces the PRE of the focal predictor(s), hence the more conservative criterion of Gignac and Szodorai (2016) is adopted here.
Two hundred is recommended as the minimal sample size for common psychological research.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74-78.
Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.