To help researchers understand the role of PRE (proportional reduction in error) in
regression, this vignette presents several ways of examining the
unique effect of problem-focused coping (cope_task1) on
depression (depr1), controlling for emotion-focused
coping (cope_emo1) and avoidance
coping (cope_avo1), using the first-wave subset of the
internal data set depress.
Four approaches are presented below.
Firstly, examine the unique effect of cope_task1 using a
t-test. Model C (the compact model) regresses depr1 on
cope_emo1 and cope_avo1. Model A (the augmented
model) regresses depr1 on cope_task1,
cope_emo1, and cope_avo1.
# multiple regression
fitC <- lm(depr1 ~ cope_emo1 + cope_avo1, depress)
fitA <- lm(depr1 ~ cope_task1 + cope_emo1 + cope_avo1, depress)
summary(fitA)
#>
#> Call:
#> lm(formula = depr1 ~ cope_task1 + cope_emo1 + cope_avo1, data = depress)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.68456 -0.22996 -0.00466 0.22131 0.98298
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.21320 0.17230 12.845 < 2e-16 ***
#> cope_task1 -0.19433 0.03414 -5.692 5.40e-08 ***
#> cope_emo1 0.22592 0.03680 6.139 5.67e-09 ***
#> cope_avo1 -0.09672 0.04221 -2.291 0.0232 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.3134 on 170 degrees of freedom
#> (11 observations deleted due to missingness)
#> Multiple R-squared: 0.3233, Adjusted R-squared: 0.3114
#> F-statistic: 27.08 on 3 and 170 DF, p-value: 2.293e-14
As shown, the partial regression coefficient of
cope_task1 is -0.19433, t(170) = -5.692, p
= 5.40e-08.
Secondly, examine the unique effect of cope_task1 using hierarchical regression and its F-test. In SPSS, this F-test is reported as the F-test for the R² change.
anova(fitC, fitA)
#> Analysis of Variance Table
#>
#> Model 1: depr1 ~ cope_emo1 + cope_avo1
#> Model 2: depr1 ~ cope_task1 + cope_emo1 + cope_avo1
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 171 19.883
#> 2 170 16.701 1 3.1827 32.397 5.4e-08 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As shown, F(1, 170) = 32.397, p = 5.4e-08. This F-test is equivalent to the t-test above, since both examine the unique effect of cope_task1. When the numerator df of F is 1, F = t² (here, (-5.692)² ≈ 32.40), and the df of t equals the denominator df of F.
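The equivalence F = t² for a single added predictor can be checked numerically. The following is a minimal sketch on simulated data, not on depress; the variable names (y, x1, x2, x3) and the simulated effect sizes are hypothetical and chosen only for illustration.

```r
# Simulated data (hypothetical; not the depress data set)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 0.5 - 0.3 * x1 + 0.4 * x2 + rnorm(n)
sim <- data.frame(y, x1, x2, x3)

# Model C omits x1; Model A adds it
fit_c <- lm(y ~ x2 + x3, data = sim)
fit_a <- lm(y ~ x1 + x2 + x3, data = sim)

# t-value of x1 in Model A and F-value of the model comparison
t_val <- coef(summary(fit_a))["x1", "t value"]
f_val <- anova(fit_c, fit_a)$F[2]

# TRUE when t^2 and F agree up to numerical tolerance
all.equal(t_val^2, f_val)
```

The df of the t-test, df.residual(fit_a), is also the denominator df of the F-test.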
Thirdly, examine the unique effect of cope_task1 using PRE.
print(compare_lm(fitC, fitA), digits = 3)
#> Baseline C A A vs. C
#> SSE 24.7 1.99e+01 1.67e+01 3.18e+00
#> n 174.0 1.74e+02 1.74e+02 1.74e+02
#> Number of parameters 1.0 3.00e+00 4.00e+00 1.00e+00
#> df 173.0 1.71e+02 1.70e+02 1.00e+00
#> R_squared NA 1.94e-01 3.23e-01 1.29e-01
#> f_squared NA 2.41e-01 4.78e-01 1.91e-01
#> R_squared_adj NA 1.85e-01 3.11e-01 NA
#> PRE NA 1.94e-01 3.23e-01 1.60e-01
#> F(PA-PC,n-PA) NA 2.06e+01 2.71e+01 3.24e+01
#> p NA 9.41e-09 2.29e-14 5.40e-08
#> PRE_adj NA 1.85e-01 3.11e-01 1.55e-01
#> power_post NA 1.00e+00 1.00e+00 1.00e+00
As shown in the A vs. C column, PRE = 0.160, F(1, 170) = 32.4, p = 5.40e-08. The F-test of PRE is equivalent to the F-test from anova() above.
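To make the link between PRE and its F-test concrete, the sketch below recomputes both by hand from the residual sums of squares printed in the anova table above (19.883 for Model C, 16.701 for Model A). It assumes the usual definitions PRE = (SSE_C - SSE_A) / SSE_C and F = [PRE / (PA - PC)] / [(1 - PRE) / (n - PA)], which match the row label F(PA-PC,n-PA) in the compare_lm output. Because the inputs are rounded, the results agree with the reported values only approximately.

```r
# Values copied from the anova() and compare_lm() output above
sse_c <- 19.883   # SSE of Model C
sse_a <- 16.701   # SSE of Model A
n     <- 174      # complete cases
p_c   <- 3        # parameters in Model C (intercept + 2 predictors)
p_a   <- 4        # parameters in Model A (intercept + 3 predictors)

# Proportional reduction in error of Model A relative to Model C
pre <- (sse_c - sse_a) / sse_c

# F-test of PRE with df1 = PA - PC and df2 = n - PA
df1 <- p_a - p_c
df2 <- n - p_a
f_val <- (pre / df1) / ((1 - pre) / df2)
p_val <- pf(f_val, df1, df2, lower.tail = FALSE)

round(c(PRE = pre, F = f_val), 3)
p_val
```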
Fourthly, examine the unique effect of cope_task1 using
residuals. Regress depr1 on cope_emo1 and
cope_avo1 and obtain the residuals of depr1,
dm_res, that is, depr1 with the effects of
cope_emo1 and cope_avo1
partialled out.
Regress cope_task1 on cope_emo1 and
cope_avo1 and obtain the residuals of
cope_task1, pm_res, that is, cope_task1 with the
effects of cope_emo1 and cope_avo1
partialled out.
Correlating dm_res with pm_res gives the
partial correlation between depr1 and
cope_task1.
dm_res <- lm(depr1 ~ cope_emo1 + cope_avo1, depress)$residuals
pm_res <- lm(cope_task1 ~ cope_emo1 + cope_avo1, depress)$residuals
resDat <- data.frame(dm_res, pm_res)
cor(dm_res, pm_res)
#> [1] -0.4000835
As shown, the partial correlation between depr1 and
cope_task1 is -0.4000835.
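The squared partial correlation is the same quantity as the PRE of the added predictor; in the output above, (-0.4000835)² ≈ 0.160, which is the PRE reported by compare_lm for A vs. C. The sketch below illustrates this identity on simulated data. The variables (y, x, z1, z2) are hypothetical and are not taken from depress.

```r
# Simulated data (hypothetical; not the depress data set)
set.seed(42)
n  <- 150
z1 <- rnorm(n)
z2 <- rnorm(n)
x  <- 0.5 * z1 + rnorm(n)
y  <- -0.4 * x + 0.3 * z1 + 0.2 * z2 + rnorm(n)

# Partial correlation of y and x, controlling for z1 and z2
y_res <- resid(lm(y ~ z1 + z2))
x_res <- resid(lm(x ~ z1 + z2))
r_partial <- cor(y_res, x_res)

# PRE of adding x to a model that already contains z1 and z2
sse_c <- sum(resid(lm(y ~ z1 + z2))^2)
sse_a <- sum(resid(lm(y ~ x + z1 + z2))^2)
pre   <- (sse_c - sse_a) / sse_c

# TRUE when the squared partial correlation equals PRE
all.equal(r_partial^2, pre)
```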
Regressing dm_res on pm_res gives the
unique effect of cope_task1 on depr1.
summary(lm(dm_res ~ pm_res, data.frame(dm_res, pm_res)))
#>
#> Call:
#> lm(formula = dm_res ~ pm_res, data = data.frame(dm_res, pm_res))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.68456 -0.22996 -0.00466 0.22131 0.98298
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1.301e-17 2.362e-02 0.000 1
#> pm_res -1.943e-01 3.394e-02 -5.725 4.51e-08 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.3116 on 172 degrees of freedom
#> Multiple R-squared: 0.1601, Adjusted R-squared: 0.1552
#> F-statistic: 32.78 on 1 and 172 DF, p-value: 4.51e-08
As shown, the regression coefficient of pm_res (-0.1943) equals
the partial regression coefficient of cope_task1 in
fitA. However, the t-values and p-values differ:
t(172) = -5.725 here versus t(170) = -5.692 in fitA. Why? Let’s examine the unique effect of
pm_res using PRE. Note that the F-test of
a single parameter’s PRE is equivalent to the t-test of
that parameter. In addition, Model A is defined relative to Model C: as the
statistical purpose changes, so do the models that Model C and Model A
refer to.
fitC <- lm(dm_res ~ 1, resDat)
fitA <- lm(dm_res ~ pm_res, resDat)
print(compare_lm(fitC, fitA), digits = 3)
#> Baseline C A A vs. C
#> SSE 19.9 19.9 1.67e+01 3.18e+00
#> n 174.0 174.0 1.74e+02 1.74e+02
#> Number of parameters 1.0 1.0 2.00e+00 1.00e+00
#> df 173.0 173.0 1.72e+02 1.00e+00
#> R_squared NA 0.0 1.60e-01 1.60e-01
#> f_squared NA 0.0 1.91e-01 1.91e-01
#> R_squared_adj NA 0.0 1.55e-01 NA
#> PRE NA 0.0 1.60e-01 1.60e-01
#> F(PA-PC,n-PA) NA NA 3.28e+01 3.28e+01
#> p NA NA 4.51e-08 4.51e-08
#> PRE_adj NA 0.0 1.55e-01 1.55e-01
#> power_post NA NA 1.00e+00 1.00e+00
Compare the PRE of pm_res with the PRE
of cope_task1. The two PREs are
equal (0.160). However, the denominator dfs (df2) differ, 172 here versus 170 before, which makes
the Fs, and hence the ps, different. In other words, although
the unique effect of cope_task1 is constant, the compact
and augmented models used to evaluate its significance
differ, which leads to different test results (i.e., different
F-test and t-test results). Revisiting the
F-test formula of PRE, we reach the following
conclusion: for a given PRE, the significance of
PRE is determined by the df of Model C and the
df change of Model A against Model C.
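The effect of df2 can be reproduced from the numbers already reported. The sketch below takes the partial correlation printed earlier (-0.4000835), squares it to obtain PRE, and evaluates F = [PRE / df1] / [(1 - PRE) / df2] once with df2 = 170 (the full model) and once with df2 = 172 (the residual regression). The helper f_from_pre is a name introduced here for illustration and is not part of any package.

```r
# PRE recovered from the partial correlation reported above
r_partial <- -0.4000835
pre <- r_partial^2

# Hypothetical helper: F-value implied by a PRE and its two dfs
f_from_pre <- function(pre, df1, df2) {
  (pre / df1) / ((1 - pre) / df2)
}

# Same PRE, different denominator df
f_full  <- f_from_pre(pre, df1 = 1, df2 = 170)  # cope_task1 in fitA
f_resid <- f_from_pre(pre, df1 = 1, df2 = 172)  # pm_res in the residual model

# F-values, the corresponding |t|-values, and p-values
round(c(full = f_full, resid = f_resid), 3)
round(sqrt(c(full = f_full, resid = f_resid)), 3)
pf(c(f_full, f_resid), 1, c(170, 172), lower.tail = FALSE)
```

Up to rounding, the two F-values match the 32.397 and 32.78 reported above, and their square roots match the absolute t-values 5.692 and 5.725. The residual regression does not account for the two parameters already spent on cope_emo1 and cope_avo1, so its df2 is too large by 2.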
Therefore, given the PRE of a specific set of predictors, the power of the test for that set is determined by the sample size n and the number of parameters (and hence the total number of predictors) in the regression model. Similarly, given the PRE of a specific set of predictors, the required power for that set, and the number of parameters (and hence the total number of predictors) in the regression model, we can compute the required sample size n.
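As a rough illustration of this reasoning, the sketch below computes power from a noncentral F distribution and searches for the smallest n that reaches a target power. It is a sketch under stated assumptions, not the procedure used by any particular package: the functions power_pre and required_n are hypothetical names, and the noncentrality parameter is assumed to be λ = f² × n with f² = PRE / (1 - PRE), a common convention that other software may define differently.

```r
# Hypothetical helper: power of the F-test of PRE.
# ASSUMPTION: noncentrality parameter lambda = f^2 * n.
power_pre <- function(pre, n, pa, pc, alpha = 0.05) {
  df1 <- pa - pc                 # parameters added by Model A
  df2 <- n - pa                  # residual df of Model A
  f2 <- pre / (1 - pre)          # effect size f-squared
  lambda <- f2 * n               # assumed noncentrality parameter
  f_crit <- qf(1 - alpha, df1, df2)
  pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

# Hypothetical helper: smallest n whose power reaches the target
required_n <- function(pre, pa, pc, power = 0.80, alpha = 0.05,
                       n_max = 1e5) {
  for (n in seq(pa + 1, n_max)) {
    if (power_pre(pre, n, pa, pc, alpha) >= power) return(n)
  }
  NA_integer_                    # target not reached within n_max
}

# Power for the example above: PRE = 0.16, n = 174, PA = 4, PC = 3
power_pre(pre = 0.16, n = 174, pa = 4, pc = 3)

# Sample size needed for 80% power with the same PRE and model sizes
n_req <- required_n(pre = 0.16, pa = 4, pc = 3, power = 0.80)
n_req
```

Holding PRE fixed, increasing the number of parameters in Model A lowers df2 and therefore raises the n needed for the same power.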