| Title: | Knock Errors Off Nice Guesses |
|---|---|
| Description: | Miscellaneous functions and data used in psychological research and teaching. Keng currently has four built-in datasets, and can (1) scale a vector; (2) divide a vector into three groups; (3) compute the cut-off values of Pearson's r with known sample size; (4) test the significance and compute the post-hoc power for Pearson's r with known sample size; (5) conduct a priori power analysis and plan the sample size for Pearson's r; (6) compare lm()'s fitted outputs using R-squared, f_squared, post-hoc power, and PRE (Proportional Reduction in Error, also called partial R-squared or partial Eta-squared); (7) calculate PRE from partial correlation, Cohen's f, or f_squared; (8) conduct a priori power analysis and plan the sample size for one or a set of predictors in regression analysis; (9) conduct post-hoc power analysis for one or a set of predictors in regression analysis with known sample size; (10) randomly pick numbers for Chinese Super Lotto and Double Color Balls; (11) assess course objective achievement in Outcome-Based Education. |
| Authors: | Qingyao Zhang [aut, cre] (ORCID: <https://orcid.org/0000-0002-6891-5982>) |
| Maintainer: | Qingyao Zhang <[email protected]> |
| License: | CC BY 4.0 |
| Version: | 2026.3.19.9000 |
| Built: | 2026-05-24 15:23:11 UTC |
| Source: | https://github.com/qyaozh/keng |
Assess course objective achievement
assess_coa(data, session_weights, objective_weights1, ...)
data |
A wide-format data.frame that contains only students' grades for each session.
|
session_weights |
A vector that weights sessions for the final grade.
The length of |
objective_weights1 |
A vector that weights course objectives for session 1.
The length of objective_weights1 is the number of course objectives.
The range of each weight should be 0-1. The sum of |
... |
objective_weights2, objective_weights3, ...
Other vectors that weight course objectives for session 2, session 3, ...
The number of objective_weights* arguments should be equal to the length of |
A data.frame containing grades of each session, final grades, and achievements of each objective. This data.frame also has an attribute named "weights" that contains a list of session_weights, objective_weights_matrix, and weighted_objective_weights_matrix
data <- data.frame(
  session1 = 60 + sample.int(40, 100, replace = TRUE),
  session2 = 60 + sample.int(40, 100, replace = TRUE),
  session3 = 60 + sample.int(40, 100, replace = TRUE)
)
session_weights <- c(0.2, 0.3, 0.5)
objective_weights1 <- c(0.1, 0.4, 0.5)
objective_weights2 <- c(0.2, 0.2, 0.6)
objective_weights3 <- c(0.3, 0, 0.7)
coa <- assess_coa(
  data,
  session_weights,
  objective_weights1,
  objective_weights2,
  objective_weights3
)
head(coa)
attr(coa, "weights")
colMeans(coa[row.names(attr(coa, "weights")[[2]])])
Calculate PRE from Cohen's f, f_squared, or partial correlation
calc_PRE(f = NULL, f_squared = NULL, r_p = NULL)
f |
Cohen's f. Cohen (1988) suggested >=0.1, >=0.25, and >=0.40 as cut-off values of f for small, medium, and large effect sizes, respectively. |
f_squared |
Cohen's f_squared. Cohen (1988) suggested >=0.02, >=0.15, and >=0.35 as cut-off values of f_squared for small, medium, and large effect sizes, respectively. |
r_p |
Partial correlation. |
A list including PRE, the absolute value of r_p (partial correlation), Cohen's f_squared, and f.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
calc_PRE(f = 0.1)
calc_PRE(f_squared = 0.02)
calc_PRE(r_p = 0.2)
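The conversions behind these calls can be sketched in base R. This is a minimal illustration, not Keng's source code: the helper names below are hypothetical, and the formulas are the standard identities PRE = r_p^2 and f_squared = PRE / (1 - PRE).

```r
# Hypothetical helpers (not part of Keng): convert effect sizes to PRE.
pre_from_f2 <- function(f_squared) f_squared / (1 + f_squared)
f2_from_pre <- function(PRE) PRE / (1 - PRE)
pre_from_f  <- function(f) pre_from_f2(f^2)   # Cohen's f is sqrt(f_squared)
pre_from_rp <- function(r_p) r_p^2            # PRE is the squared partial r

pre_from_f(0.1)
pre_from_f2(0.02)
pre_from_rp(0.2)

# Round trip: PRE -> f_squared -> PRE should return the starting value.
all.equal(pre_from_f2(f2_from_pre(0.13)), 0.13)
```

Because each conversion is one-to-one for non-negative inputs, supplying any one of f, f_squared, or the absolute value of r_p determines the other quantities.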
Compare lm()'s fitted outputs using PRE and R-squared.
compare_lm(
  fitC = NULL,
  fitA = NULL,
  n = NULL,
  PC = NULL,
  PA = NULL,
  SSEC = NULL,
  SSEA = NULL
)
fitC |
The result of |
fitA |
The result of |
n |
Sample size of the model C or model A.
Model C and model A must use the same sample, and hence have the same sample size.
Non-integer |
PC |
The number of parameters in model C.
Non-integer |
PA |
The number of parameters in model A.
Non-integer |
SSEC |
The Sum of Squared Errors (SSE) of model C. |
SSEA |
The Sum of Squared Errors of model A. |
compare_lm() compares model A with model C using PRE (Proportional Reduction in Error), R-squared, f_squared, and post-hoc power.
PRE is partial R-squared (called partial Eta-squared in ANOVA).
There are two ways of using compare_lm().
The 1st is giving compare_lm() fitC and fitA.
The 2nd is giving n, PC, PA, SSEC, and SSEA.
The 1st way is more convenient, and it minimizes precision loss by avoiding copying and pasting.
Note that the F-tests for PRE and that for R-squared change are equivalent.
Please refer to Judd et al. (2017) for more details about PRE, and refer to Aberson (2019) for more details about f_squared and post-hoc power.
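The quantities named above can be computed by hand in base R. This is a minimal sketch and not compare_lm()'s source code; the simulated data and object names are illustrative assumptions. It computes PRE from the two SSEs and shows that the F-test of PRE matches the R-squared-change test reported by anova().

```r
set.seed(1)
n  <- 193
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.3 + 0.2 * x1 + 0.1 * x2 + rnorm(n)

fitC <- lm(y ~ x1)        # model C (compact)
fitA <- lm(y ~ x1 + x2)   # model A (augmented)

SSEC <- sum(resid(fitC)^2)
SSEA <- sum(resid(fitA)^2)
PC   <- length(coef(fitC))
PA   <- length(coef(fitA))

PRE       <- (SSEC - SSEA) / SSEC
f_squared <- PRE / (1 - PRE)
F_PRE     <- (PRE / (PA - PC)) / ((1 - PRE) / (n - PA))
p_PRE     <- pf(F_PRE, PA - PC, n - PA, lower.tail = FALSE)

# The same F and p come out of the model comparison in anova().
aov_tab <- anova(fitC, fitA)
c(F_PRE = F_PRE, F_anova = aov_tab$F[2])
c(p_PRE = p_PRE, p_anova = aov_tab$`Pr(>F)`[2])
```

Only n, PC, PA, SSEC, and SSEA enter the calculation, which is why the 2nd way of calling compare_lm() needs no fitted objects.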
A matrix with 12 rows and 4 columns. The 1st column reports information for the baseline model (intercept-only model), the 2nd for model C, the 3rd for model A, and the 4th for the change (model A vs. model C). SSE (Sum of Squared Errors), sample size n, df of SSE, and the number of parameters for baseline model, model C, model A, and change (model A vs. model C) are reported in rows 1-3. The information in the 4th column is all for the change; put differently, these results quantify the effect of the one or more parameters that model A has but model C does not. If fitC and fitA are not inferior to the intercept-only model, R-squared, Adjusted R-squared, PRE, PRE_adjusted, and f_squared for the full model (compared with the baseline model) are reported for model C and model A. If model C or model A has at least one predictor, the F-test with p and the post-hoc power are computed for the corresponding full model.
Aberson, C. L. (2019). Applied power analysis for the behavioral sciences. Routledge.
Judd, C. M., McClelland, G. H., & Ryan, C. S. (2017). Data analysis: A model comparison approach to regression, ANOVA, and beyond. Routledge.
x1 <- rnorm(193)
x2 <- rnorm(193)
y <- 0.3 + 0.2*x1 + 0.1*x2 + rnorm(193)
dat <- data.frame(y, x1, x2)
# Fix the intercept to constant 1 using I().
fit1 <- lm(I(y - 1) ~ 0, dat)
# Free the intercept.
fit2 <- lm(y ~ 1, dat)
compare_lm(fit1, fit2)
# One predictor.
fit3 <- lm(y ~ x1, dat)
compare_lm(fit2, fit3)
# Fix the intercept to 0.3 using offset().
intercept <- rep(0.3, 193)
fit4 <- lm(y ~ 0 + x1 + offset(intercept), dat)
compare_lm(fit4, fit3)
# Two predictors.
fit5 <- lm(y ~ x1 + x2, dat)
compare_lm(fit2, fit5)
compare_lm(fit3, fit5)
# Fix the slope of x2 to 0.05 using offset().
fit6 <- lm(y ~ x1 + offset(0.05*x2), dat)
compare_lm(fit6, fit5)
Cut-off values of Pearson's correlation r with known sample size n.
cut_r(n)
n |
Sample size of Pearson's correlation r. |
Given n and p, t and then r could be determined. The formula used could be found in test_r()'s documentation.
A data.frame including the cut-off values of r at the significance levels of p = 0.1, 0.05, 0.01, 0.001. r with the absolute value larger than the cut-off value is significant at the corresponding significance level.
cut_r(193)
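The step "given n and p, t and then r could be determined" can be sketched in base R. The helper below is hypothetical and assumes a two-tailed test; it inverts the t-test formula t = r / sqrt((1 - r^2) / (n - 2)) to obtain the critical r.

```r
# Hypothetical helper (not Keng's code): critical |r| for a two-tailed test.
critical_r <- function(n, p = c(0.1, 0.05, 0.01, 0.001)) {
  df     <- n - 2
  t_crit <- qt(1 - p / 2, df)              # critical t at each level
  r_crit <- t_crit / sqrt(t_crit^2 + df)   # solve the t formula for r
  data.frame(p = p, r = r_crit)
}

cuts <- critical_r(193)
cuts

# Check: plugging a cut-off back into the t-test recovers its p-value.
r_05  <- cuts$r[2]
t_val <- r_05 / sqrt((1 - r_05^2) / (193 - 2))
p_back <- 2 * pt(abs(t_val), df = 193 - 2, lower.tail = FALSE)
p_back
```

The cut-offs shrink as n grows, so a small r can reach significance in a large sample.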
D refers to Depression. The D research is a three-wave longitudinal survey.
There are four data sets: depress1 only includes data collected at wave 1,
depress2 at wave 2, and depress3 at wave 3. depress was created by joining depress1,
depress2, and depress3 together. In depress, items have been reverse-coded and
variable scores such as depr1 (the depression score at wave 1) have been computed.
depress is intended for practicing plotting and statistical analysis. depress1,
depress2, and depress3 are intended for practicing data wrangling.
depress
A three-wave data frame with 185 rows and 323 columns:
Date of the response. The 1 in the hundreds place indicates the 1st-wave survey. The remaining variables also follow this naming rule.
Participant id
Gender. 1 = boy, 2 = girl.
Age in years.
Grade.
Class.
Boarder. 1 = this student lived at school during term time, 2 = this student did not live at school during term time.
Cost in yuan per month.
Father education level. 0 = no formal school education, 1 = primary school, 2 = middle school, 3 = high school or equivalent, 4 = college or equivalent, 5 = master's degree, 6 = doctor's degree.
Mother education level. The coding scheme is identical to that of fedu.
Intervene. 0 = control group, 1 = intervention group.
Attributional style questionnaire (Peterson, 1982). All items in this questionnaire were rated on a 7-point bipolar scale. The 1 in the tens place indicates the 1st situation, positive achievement situation. The 2 in the last place indicates the 2nd item, internal attribution. High scores indicate more internal attribution.
Attributional style questionnaire. The 3 in the last place indicates the 3rd item, stable attribution. High scores indicate more stable attribution.
Attributional style questionnaire. The 4 in the last place indicates the 4th item, global attribution. High scores indicate more global attribution.
Attributional style questionnaire. The 5 in the last place indicates the 5th item, the importance of this situation. High scores indicate greater importance.
Attributional style questionnaire. The 2 in the tens place indicates the 2nd situation, negative interpersonal relationship situation.
Attributional style questionnaire. The 3 in the tens place indicates the 3rd situation, positive interpersonal relationship situation.
Attributional style questionnaire. The 4 in the tens place indicates the 4th situation, negative achievement situation.
Attributional style questionnaire. The 5 in the tens place indicates the 5th situation, positive interpersonal relationship situation.
Attributional style questionnaire. The 6 in the tens place indicates the 6th situation, negative achievement situation.
Attributional style questionnaire. The 7 in the tens place indicates the 7th situation, positive achievement situation.
Attributional style questionnaire. The 8 in the tens place indicates the 8th situation, negative interpersonal relationship situation.
Coping inventory (Endler & Parker, 1994). All items in this inventory were rated on a 5-point Likert scale with 1 = rarely, 2 = seldom, 3 = moderately, 4 = often, 5 = very often. The "01" in the last two places indicates item 01. Item 01, as well as item 02, 06, 10, 16, 22, 25, 27, 28, 37, 40, 42, 43, 44, 47, and 48, measured task-oriented coping.
Coping inventory. The "05" in the last two places indicates item 05. Item 05, as well as item 07, 08, 14, 15, 17, 18, 20, 23, 26, 29, 31, 34, 35, 39, and 46, measured emotion-oriented coping.
Coping inventory. The "03" in the last two places indicates item 03. Item 03, as well as item 04, 09, 11, 12, 19, 21, 24, 30, 32, 33, 36, 38, 41, 45, and 49, measured avoidance-oriented coping.
Drinking. A coping strategy added by the D research to coping inventory.
Smoking. A coping strategy added by the D research to coping inventory.
Coping inventory, Wave 2, item 01.
Coping inventory, Wave 3, item 02.
Depression scale (Zung, 1965). All items in this scale were rated on a 4-point Likert scale with 1 = rarely, 2 = seldom, 3 = often, 4 = always.
Depression scale. Items 02, 05, 06, 11, 12, 14, 16, 17, 18, and 20 have been reverse-coded. High scores indicate severe depression.
Experience in close relationships scale (Fraley et al., 2011). All items in this scale were rated on a 7-point Likert scale with 1 = strongly disagree, 2 = disagree, 3 = somewhat disagree, 4 = neutral, 5 = somewhat agree, 6 = agree, 7 = strongly agree. Items 01, 03, 04, 06, 08, and 09 measured attachment avoidance.
Experience in close relationships scale. Item 02. Items 02, 05, and 07 measured attachment anxiety.
Experience in close relationships scale. Item 03. Items 03, 04, 06, and 09 have been reverse-coded.
Not-available waves, ranging from 1 to 3.
Attachment anxiety, computed by averaging corresponding items.
Attachment avoidance, computed by averaging corresponding items.
Internal attribution for positive situations at wave 1, computed by averaging corresponding items.
Stable attribution for positive situations at wave 1, computed by averaging corresponding items.
Global attribution for positive situations at wave 1, computed by averaging corresponding items.
Internal attribution for negative situations at wave 1, computed by averaging corresponding items.
Internal attribution for positive situations at wave 2, computed by averaging corresponding items.
Task-oriented coping at wave 1, computed by averaging corresponding items.
Emotion-oriented coping at wave 1, computed by averaging corresponding items.
Avoidance-oriented coping at wave 1, computed by averaging corresponding items.
Task-oriented coping at wave 2, computed by averaging corresponding items.
Depression at wave 1, computed by averaging corresponding items.
Keng package.
D refers to Depression. The D research is a three-wave longitudinal survey.
There are four data sets: depress1 only includes data collected at wave 1,
depress2 at wave 2, and depress3 at wave 3. depress was created by joining depress1,
depress2, and depress3 together. In depress, items have been reverse-coded and
variable scores such as depr1 (the depression score at wave 1) have been computed.
depress is intended for practicing plotting and statistical analysis. depress1,
depress2, and depress3 are intended for practicing data wrangling.
depress1
A data frame with 174 rows and 122 columns at wave 1:
Date of the response.
Student id.
Experience in close relationships scale, item 01.
Experience in close relationships scale, item 03.
Items 03, 04, 06, and 09 have NOT been reverse-coded in depress1.
Attribution style questionnaire, situation 1, item 2.
Depression Scale, item 02. Items 02, 05, 06, 11, 12, 14, 16, 17, 18,
and 20 have NOT been reverse-coded in depress1.
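Reverse-coding is a typical wrangling step to practice on the single-wave data sets. The sketch below uses a toy data frame with hypothetical column names (the real variable names in depress1 differ) and assumes the 4-point response format described for the depression scale, so a reversed score is (4 + 1) minus the raw score.

```r
# Toy data with hypothetical column names; not the real depress1 variables.
toy <- data.frame(item02 = c(1, 2, 3, 4), item05 = c(4, 4, 1, 2))

n_points      <- 4                         # number of response options
reverse_items <- c("item02", "item05")     # items to reverse-code

# Reverse-code: 1 <-> 4 and 2 <-> 3 on a 4-point scale.
toy[reverse_items] <- (n_points + 1) - toy[reverse_items]
toy
```

For a 7-point scale such as the experience in close relationships scale, the same line works with n_points set to 7.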
Keng package.
D refers to Depression. The D research is a three-wave longitudinal survey.
There are four data sets: depress1 only includes data collected at wave 1,
depress2 at wave 2, and depress3 at wave 3. depress was created by joining depress1,
depress2, and depress3 together. In depress, items have been reverse-coded and
variable scores such as depr1 (the depression score at wave 1) have been computed.
depress is intended for practicing plotting and statistical analysis. depress1,
depress2, and depress3 are intended for practicing data wrangling.
depress2
A data frame with 173 rows and 104 columns at wave 2:
Depression Scale, item 02. Items 02, 05, 06, 11, 12, 14, 16, 17, 18,
and 20 have NOT been reverse-coded in depress2.
Keng package.
D refers to Depression. The D research is a three-wave longitudinal survey.
There are four data sets: depress1 only includes data collected at wave 1,
depress2 at wave 2, and depress3 at wave 3. depress was created by joining depress1,
depress2, and depress3 together. In depress, items have been reverse-coded and
variable scores such as depr1 (the depression score at wave 1) have been computed.
depress is intended for practicing plotting and statistical analysis. depress1,
depress2, and depress3 are intended for practicing data wrangling.
depress3
A data frame with 172 rows and 72 columns at wave 3:
Depression Scale, item 02. Items 02, 05, 06, 11, 12, 14, 16, 17, 18,
and 20 have NOT been reverse-coded in depress3.
Keng package.
Divide a vector into three groups
divide(x, criterion = "1sd")
x |
A numeric vector. |
criterion |
A character value |
divide() divides a vector into three groups using the criterion of 1 SD,
or a proportion such as 0.27. Unlike cut(), the medium interval of divide()
does not include the left and right limits.
An ordered factor with levels "Low < Medium < High".
x <- 1:100
divide(x)
divide(x, 0.27)
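One way to picture the grouping rule is the base R sketch below. It is a hypothetical re-implementation, not Keng's source code. It assumes that the 1 SD limits are the mean minus and plus one standard deviation, that a proportion such as 0.27 refers to the lower and upper quantiles, and that values equal to a limit fall into the Low or High group, consistent with the medium interval being open.

```r
# Hypothetical re-implementation for illustration; not Keng's source code.
divide_sketch <- function(x, criterion = "1sd") {
  if (identical(criterion, "1sd")) {
    lower <- mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
    upper <- mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
  } else {
    # A proportion such as 0.27: lowest and highest 27% of the values.
    lower <- quantile(x, criterion, na.rm = TRUE)
    upper <- quantile(x, 1 - criterion, na.rm = TRUE)
  }
  # Medium is the open interval (lower, upper).
  group <- ifelse(x <= lower, "Low", ifelse(x >= upper, "High", "Medium"))
  factor(group, levels = c("Low", "Medium", "High"), ordered = TRUE)
}

g_sd   <- divide_sketch(1:100)
g_prop <- divide_sketch(1:100, 0.27)
table(g_sd)
table(g_prop)
```

The actual divide() may define its limits differently, so compare the two on your own data before relying on the sketch.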
Pick Double Color Balls
pick_dcb(size = 1L, verbose = TRUE)
size |
The size of sets of Double Color Balls numbers to pick. |
verbose |
A logical value. Print the numbers picked or not. |
Print the numbers picked, and invisibly return the list of balls that stores these numbers.
pick_dcb(10)
out <- pick_dcb(10, verbose = FALSE)
out
Pick Super Lotto numbers
pick_sl(size = 1L, verbose = TRUE)
size |
An integer. The size of sets of Super Lotto numbers to pick. |
verbose |
A logical value. Print the numbers picked or not. |
Print the numbers picked, and invisibly return the list of balls that stores these numbers.
# Example 1
pick_sl(10)

# Example 2
out <- pick_sl(10, verbose = FALSE)
out

# Example 3
# Create an empty ticket and the target ("lucky") ticket.
balls <- list(c(front = rep(NA, 5), back = rep(NA, 2)))
luck <- list(c(front = c(10L, 13L, 14L, 19L, 27L), back = c(6L, 10L)))
# Limit the maximum number of draws.
max_draws <- 9999
# Count the number of draws.
i <- 0
while (!identical(balls, luck)) {
  i <- i + 1
  balls <- pick_sl(verbose = FALSE)
  if (identical(balls, luck)) {
    print(i)
    print(balls)
  } else if (i == max_draws) {
    cat(i, "failed\n")
    break
  }
}
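To see why a search like Example 3 will almost always fail within 9999 draws, the draws can be sketched in base R. The helpers are hypothetical and are not Keng's source code. They assume the usual game rules: Super Lotto takes 5 front numbers from 1-35 and 2 back numbers from 1-12, and Double Color Balls takes 6 red balls from 1-33 and 1 blue ball from 1-16.

```r
# Hypothetical helpers (not Keng's code), one ticket per call.
draw_sl <- function() {
  c(front = sort(sample.int(35, 5)), back = sort(sample.int(12, 2)))
}
draw_dcb <- function() {
  c(red = sort(sample.int(33, 6)), blue = sample.int(16, 1))
}

draw_sl()
draw_dcb()

# Number of equally likely Super Lotto tickets, and the chance that a
# fixed ticket is matched at least once within 9999 independent draws.
n_tickets <- choose(35, 5) * choose(12, 2)
n_tickets
p_hit <- 1 - (1 - 1 / n_tickets)^9999
p_hit
```

The hit probability is well below one percent, which is why the loop in Example 3 needs the cap on the number of draws.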
Plot the power against the sample size for the Keng_power class
## S3 method for class 'Keng_power'
plot(x, ...)
x |
The output object of |
... |
Further arguments passed to or from other methods. |
A plot of power against sample size.
plot(power_lm())
out <- power_r(0.2, n = 193)
plot(out)
Conduct a priori power analysis, and plan the sample size for regression analysis
power_lm(
  PRE = 0.02,
  PC = 1L,
  PA = 2L,
  sig_level = 0.05,
  power = 0.8,
  n_ul = 1450000000L
)
PRE |
Proportional Reduction in Error. PRE = The square of partial correlation. Cohen (1988) suggested >=0.02, >=0.13, and >=0.26 as cut-off values of PRE for small, medium, and large effect sizes, respectively. |
PC |
Number of parameters of model C (compact model) without focal predictors of interest.
Non-integer |
PA |
Number of parameters of model A (augmented model) with focal predictors of interest.
Non-integer |
sig_level |
Expected significance level for effects of focal predictors. |
power |
Expected statistical power for effects of focal predictors. |
n_ul |
The upper limit of the sample size below which the minimum required sample size is searched for.
Non-integer |
n_ul determines the upper limit of the interval below which power_lm() searches for the minimum required sample size,
hence the number of rows of the returned power table priori and the right limit of the horizontal axis of the returned power plot.
When PRE is very small (e.g., less than 0.001) and power is larger than 0.8,
a huge increase in sample size only brings about a trivial increase in power, which is cost-ineffective.
To make power_lm() omit unnecessary searching, you could set n_ul to be a value less than 1.45e+09 (e.g., 10000).
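The search for the minimum sample size can be sketched in base R with the non-central F distribution. This is a hypothetical illustration, not power_lm()'s source code. It assumes the non-centrality parameter lambda = f_squared * df_A, where df_A = n - PA (other texts use f_squared * n), and it replaces the internal uniroot() with a plain linear search bounded by n_ul.

```r
# Hypothetical sketch. Assumption: lambda = f_squared * (n - PA).
power_at_n <- function(n, PRE = 0.02, PC = 1, PA = 2, sig_level = 0.05) {
  df1    <- PA - PC                     # numerator df
  df2    <- n - PA                      # denominator df (df of model A)
  lambda <- PRE / (1 - PRE) * df2       # assumed non-centrality parameter
  F_crit <- qf(1 - sig_level, df1, df2)
  pf(F_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

min_n <- function(PRE = 0.02, PC = 1, PA = 2, sig_level = 0.05,
                  power = 0.8, n_ul = 10000) {
  for (n in (PA + 1):n_ul) {
    if (power_at_n(n, PRE, PC, PA, sig_level) >= power) return(n)
  }
  NA_integer_  # target power not reached below n_ul
}

n_min <- min_n()
c(n = n_min, power = power_at_n(n_min), power_before = power_at_n(n_min - 1))
```

A small n_ul keeps the search short, at the cost of returning NA when the requested power lies beyond the limit.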
A Keng_power class, also a list. If sample size n is not given, the following results would be returned:
[[1]] PRE;
[[2]] f_squared, Cohen's f_squared derived from PRE;
[[3]] PC;
[[4]] PA;
[[5]] sig_level, expected significance level for effects of focal predictors;
[[6]] power, expected statistical power for effects of focal predictors;
[[7]] n_ul, the upper limit of sample size;
[[8]] root, the exact decimal n found by internal uniroot() to reach the expected power;
[[9]] minimum, the minimum sample size required for focal predictors to reach the
expected statistical power and significance level;
[[10]] priori, a priori power table with increasing sample sizes (n_i), and corresponding
df_A_C(the df of the numerator of the F-test, i.e., the difference of the dfs between model C and model A),
df_A_i(the df of the denominator of the F-test, i.e., the df of the model A at the sample size n_i),
F_i(the F-test of PRE at the sample size n_i),
p_i(the p-value of F_i),
lambda_i(the non-centrality parameter of the F-distribution for the alternative hypothesis, given PRE and n_i),
power_i(the actual power of PRE at the sample size n_i).
By default, print() prints the primary but not all contents of the Keng_power class.
To inspect more contents, use print.AsIs() or list extracting.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
power_lm()
print(power_lm())
plot(power_lm())
Conduct a priori power analysis, and plan the sample size for r.
power_r(r = 0.2, sig_level = 0.05, power = 0.8, n_ul = 1450000000L)
r |
Pearson's correlation. Cohen (1988) suggested >=0.1, >=0.3, and >=0.5 as cut-off values of Pearson's correlation r for small, medium, and large effect sizes, respectively. |
sig_level |
Expected significance level. |
power |
Expected statistical power. |
n_ul |
The upper limit of the sample size below which the minimum required sample size is searched for.
Non-integer |
power_r() follows Aberson's (2019) approach to conduct power analysis.
n_ul determines the upper limit of the interval below which power_r() searches
for the minimum required sample size, hence the number of rows of the returned
power table priori and the right limit of the horizontal axis of the returned power plot.
When r is very small and power is larger than 0.8, a huge increase of
sample size only brings about a trivial increase in power,
which is cost-ineffective. To make power_r() omit unnecessary searching,
you could set n_ul to be a value less than 1.45e+09 (e.g., 10000).
A Keng_power class, also a list. If n is not given, the following results would be returned:
[[1]] r, the given r;
[[2]] d, Cohen's d derived from r; Cohen (1988) suggested >=0.2, >=0.5, and >=0.8
as cut-off values of d for small, medium, and large effect sizes, respectively;
[[3]] sig_level, the expected significance level;
[[4]] power, the expected power;
[[5]] n_ul, the upper limit of sample size;
[[6]] root, the exact decimal n found by internal uniroot() to reach the expected power;
[[7]] minimum, the minimum required sample size;
[[8]] priori, a priori power table with increasing sample sizes (n_i), and corresponding
df_i (the df of t-test at the sample size n_i, df_i = n_i - 2),
SE_i (the SE of r at the sample size n_i),
t_i (the t-test of r),
p_i (the p-value of t_i),
delta_i (the non-centrality parameter of the t-distribution for the alternative hypothesis, given r and n_i),
power_i (the actual power of r at the sample size n_i).
If sample size n is given, the following results would also be returned:
Integer n, the t_test of r at the sample size n with
df, SE of r, p (the p-value of t-test), and the post-hoc power analysis with
delta_post (the non-centrality parameter of the t-distribution for the alternative hypothesis),
and power_post (the post-hoc power of r at the sample size n).
By default, print() prints the primary but not all contents of the Keng_power class.
To inspect more contents, use print.AsIs() or list extracting.
Aberson, C. L. (2019). Applied power analysis for the behavioral sciences. Routledge.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
power_r(0.2)
print(power_r(0.04))
plot(power_r(0.04))
Compute lm's post-hoc power
powered_lm(PRE = 0.04, PC = 1L, PA = 2L, n = 200L, sig_level = 0.05)
PRE |
Proportional Reduction in Error. PRE = The square of partial correlation. Cohen (1988) suggested >=0.02, >=0.13, and >=0.26 as cut-off values of PRE for small, medium, and large effect sizes, respectively. |
PC |
Number of parameters of model C (compact model) without focal predictors of interest.
Non-integer |
PA |
Number of parameters of model A (augmented model) with focal predictors of interest.
Non-integer |
n |
The current sample size. Integer |
sig_level |
Expected significance level for effects of focal predictors. |
Integer n, the F_test of PRE at the sample size n with
df_A_C,
df_A (the df of the model A at the sample size n),
F (the F-test of PRE at the sample size n),
p (the p-value of F-test at the sample size n), and the post-hoc power analysis with
lambda (the non-centrality parameter of F at the sample size n),
and power (the post-hoc power at the sample size n).
powered_lm()
Compute r's post-hoc power
powered_r(r = 0.2, n = 200L, sig_level = 0.05)
r |
Pearson's correlation. Cohen (1988) suggested >=0.1, >=0.3, and >=0.5 as cut-off values of Pearson's correlation r for small, medium, and large effect sizes, respectively. |
n |
The current sample size. Non-integer |
sig_level |
Expected significance level. |
Integer n, the t_test of r at the sample size n with df, SE of r,
p (the p-value of t-test), and the post-hoc power analysis with delta
(the non-centrality parameter of the t-distribution for the alternative hypothesis),
and power (the post-hoc power of r at the sample size n).
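The post-hoc power of r can be sketched with the non-central t distribution, and the same number can be reached through PRE with the non-central F distribution. The sketch below is a hypothetical illustration rather than powered_r()'s source code. It assumes delta = sqrt(f_squared * df) with f_squared = r^2 / (1 - r^2) and df = n - 2, which is this sketch's reading of Aberson's (2019) approach.

```r
r <- 0.2
n <- 200
sig_level <- 0.05
df <- n - 2

# Two-tailed power of the t-test (assumed delta, see the note above).
delta   <- r / sqrt(1 - r^2) * sqrt(df)
t_crit  <- qt(1 - sig_level / 2, df)
power_t <- pt(t_crit, df, ncp = delta, lower.tail = FALSE) +
  pt(-t_crit, df, ncp = delta)

# The same effect expressed as PRE = r^2 with PC = 1 and PA = 2.
PRE     <- r^2
lambda  <- PRE / (1 - PRE) * df        # equals delta^2
F_crit  <- qf(1 - sig_level, 1, df)
power_F <- pf(F_crit, 1, df, ncp = lambda, lower.tail = FALSE)

c(power_t = power_t, power_F = power_F)
```

The two values agree because the square of a non-central t variable with df degrees of freedom follows a non-central F distribution with 1 and df degrees of freedom and lambda = delta^2.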
Aberson, C. L. (2019). Applied power analysis for the behavioral sciences. Routledge.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
powered_r()
Print primary but not all contents of the Keng_power class
## S3 method for class 'Keng_power'
print(x, ...)
x |
The output object of |
... |
Further arguments passed to or from other methods. |
None (invisible NULL).
power_lm()
power_lm(n = 200)
print(power_lm(n = 200))
x <- power_r(0.2, n = 193)
x
Scale a vector
Scale(x, m = 0, sd = NULL)
x |
The original vector. |
m |
The expected Mean of the scaled vector. |
sd |
The expected Standard Deviation (unit) of the scaled vector. |
To scale x, its mean, or standard deviation (unit), or both, could be changed.
The missing values of x are removed if any.
If m = 0 (default) and sd = NULL (default), x would be mean-centered.
If m = NULL, the mean of x would not be changed.
If m is not NULL, the mean of x would be transformed to m.
If m = NULL or 0, and sd = 1, x would be standardized to be its z-score with m = 0 and sd = 1.
The mean and standard deviation of the standardized score are not necessarily 0 and 1.
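The rules above amount to a linear transformation, which can be sketched in base R. This is a hypothetical re-implementation, not Keng's source code. It assumes a numeric m and leaves out the m = NULL case.

```r
# Hypothetical re-implementation for illustration; not Keng's source code.
scale_sketch <- function(x, m = 0, sd = NULL) {
  x     <- x[!is.na(x)]                              # drop missing values
  ratio <- if (is.null(sd)) 1 else sd / stats::sd(x) # rescale only if sd given
  (x - mean(x)) * ratio + m                          # center, rescale, shift
}

set.seed(1)
x  <- rnorm(10, 5, 2)
xc <- scale_sketch(x)                     # mean-centered, original spread
iq <- scale_sketch(x, m = 100, sd = 15)   # mean 100, SD 15
c(mean_centered = mean(xc), mean_iq = mean(iq), sd_iq = sd(iq))
```

With sd = NULL only the location changes, so the centered vector keeps the standard deviation of x.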
The scaled vector.
(x <- rnorm(10, 5, 2))
# Mean-center x.
Scale(x)
Scale(x, m = 0)
# Transform the mean of x to 3.
Scale(x, m = 3)
# Transform x to its z-score.
Scale(x, sd = 1)
# Standardize x with m = 100 and sd = 15.
Scale(x, m = 100, sd = 15)
Test the significance, analyze the power, and plan the sample size for r.
test_r(r = NULL, n = NULL, sig_level = 0.05, power = 0.8)
r |
Pearson's correlation. Cohen (1988) suggested >=0.1, >=0.3, and >=0.5 as cut-off values of Pearson's correlation r for small, medium, and large effect sizes, respectively. |
n |
Sample size of r. Non-integer |
sig_level |
Expected significance level. |
power |
Expected statistical power. |
To test the significance of the r using the one-sample t-test,
the SE of r is determined by the following formula: SE = sqrt((1 - r^2)/(n - 2)).
Another way is transforming r to Fisher's z using the following formula:
fz = atanh(r) with the SE of fz being 1/sqrt(n - 3).
Fisher's z is commonly used to compare two Pearson's correlations from independent samples.
Fisher's transformation is presented here only to satisfy the curiosity of users who are
interested in the difference between t-test and Fisher's transformation.
The post-hoc power of r's t-test is computed through the way of Aberson (2019).
Other software and R packages like SPSS and pwr give different power estimates due to
underlying different formulas. Keng adopts Aberson's approach because this approach guarantees
the equivalence of r and PRE.
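Both tests can be reproduced in a few lines of base R. This is a minimal sketch, not test_r()'s source code; the object names are illustrative, and the standard error of Fisher's z is taken as 1/sqrt(n - 3).

```r
r <- 0.2
n <- 193

# t-test of r
SE_r <- sqrt((1 - r^2) / (n - 2))
t_r  <- r / SE_r
p_r  <- 2 * pt(abs(t_r), df = n - 2, lower.tail = FALSE)

# Fisher's z transformation and its z-test
fz    <- atanh(r)
SE_fz <- 1 / sqrt(n - 3)
z     <- fz / SE_fz
p_fz  <- 2 * pnorm(abs(z), lower.tail = FALSE)

# 95% CI of r, back-transformed from the CI of fz
CI_fz <- tanh(fz + c(-1, 1) * qnorm(0.975) * SE_fz)

c(t = t_r, p_r = p_r, z = z, p_fz = p_fz)
CI_fz
```

A CI back-transformed with tanh() always stays inside [-1, 1], whereas a CI built as r plus or minus a multiple of SE_r can cross those bounds when r is close to 1 or -1.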
A list with the following results:
[[1]] r, the given r;
[[2]] d, Cohen's d derived from r; Cohen (1988) suggested >=0.2, >=0.5, and >=0.8
as cut-off values of d for small, medium, and large effect sizes, respectively.
[[3]] Integer n;
[[4]] t-test of r (incl., r, df of r, SE_r, t, p_r),
95% CI of r based on t-test (LLCI_r_t, ULCI_r_t),
and post-hoc power of r (incl., delta_post, power_post);
[[5]] Fisher's z transformation (incl., fz of r, z-test of fz [SE_fz, z, p_fz],
and 95% CI of r derived from fz).
Note that the returned CI of r may be out of r's valid range [-1, 1].
This "error" is deliberately left to users, who should correct the CI manually in reports.
Aberson, C. L. (2019). Applied power analysis for the behavioral sciences. Routledge.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
test_r(0.2, 193)

# Compare the p-values of the t-test and Fisher's transformation.
for (i in seq(30, 200, 10)) {
  cat(c("n = ", i, ", difference between ps = ",
        format(
          abs(test_r(0.2, i)[["t_test"]]["p_r"] - test_r(0.2, i)[["Fisher_z"]]["p_fz"]),
          nsmall = 12, scientific = FALSE)),
      sep = "", fill = TRUE)
}
W refers to well-being. The W research is a three-wave longitudinal survey.
There are four data sets: well1 only includes data collected at wave 1,
well2 at wave 2, and well3 at wave 3. well was created by joining well1,
well2, and well3 together. In well, items have been reverse-coded and
variable scores such as flourish1 (the flourish score at wave 1) have been computed.
well is intended for practicing plotting and statistical analysis. well1,
well2, and well3 are intended for practicing data wrangling.
well
well: A three-wave data frame with 200 rows and 413 columns:
Participant id
Gender. 0 = boy, 1 = girl. This variable ends with "100". The 1st digit indicates the wave number, and the last two digits indicate the item number. The remaining variables also follow this naming rule.
Age in years, wave 1.
Grade, wave 1.
Monthly income (RMB) of the family, wave 1.
National student loan in the past year. Reported at wave 3. 0 = Not received, 1 = Received.
Monthly cost of the participant, wave 1. "1" = "<500RMB", "2" = "500-1000RMB", "3" = "1000-1500RMB", "4" = "1500-2000RMB", "5" = "2000-2500RMB", "6" = ">2500RMB".
Subjective Socioeconomic Status, wave 1. SSES was measured through the 10-step social ladder, with high scores indicating high status.
HEEMA scale, wave 1, item 01. HEEMA = The Hedonic, Eudaimonic, and Extrinsic Motivations for Activities (Lefebvre & Huta, 2021). HEEMA was rated on a 7-point scale (1 = not at all, 7 = very much). HEEMA measures three kinds of motivation: hedonic (item 01, 04, 06, 07, 09, 16), eudaimonic (item 02, 03, 05, 08, 10), and extrinsic (item 11, 12, 13, 14, 15).
Career Social Support from Parents, wave 1, item 01. CSS investigates four kinds of support using a 5-point scale (1 = almost none, 5 = very much; Hou et al., 2010): material (item 01, 02, 03), advice (item 04, 05, 06, 07, 08, 09), emotional (item 10, 11, 12, 13), and information (item 14, 15, 16, 17, 18, 19, 20).
Career Social Support from Siblings, wave 1, item 01.
Career Social Support from Teachers, wave 1, item 01.
Career Social Support from Friends, wave 1, item 01.
Career Social Support from Relatives, wave 1, item 01.
Grit scale (Wei & Hu, 2017), wave 1, item 01. Grit is rated on a 5-point scale (1 = not at all like me, 5 = very much like me). Items 01, 03, 05, and 06 should be reverse-scored. Grit has two dimensions: perseverance (item 02, 04, 07, 08) and passion (item 01, 03, 05, 06). After reverse scoring, high scores indicate high perseverance and passion.
CET4 score, wave 1. CET4 = College English Test Band 4. CET4 = 0 indicates participants had not taken the CET4 test.
Flourish scale, wave 1, item 01. Flourish scale (Tong & Wang, 2017) was rated on a 5-point scale (1 = very disagree, 5 = very agree), with high scores indicating high eudaimonic well-being.
SPNE scale, wave 1, item 01. SPNE = Scale of Positive and Negative Experience (Tong & Wang, 2017). SPNE measures positive (item 01, 03, 05, 07, 10, 12) and negative (item 02, 04, 06, 08, 09, 11) emotions on a 5-point scale (1 = very rarely or never, 5 = very often or always). High positive emotion and low negative emotion indicate high hedonic well-being.
Flourish, wave 1. All scale scores and dimension scores in this data are computed by averaging corresponding items.
Positive Emotions, wave 1.
Negative Emotions, wave 1.
Grit, wave 1.
Perseverance, wave 1.
Passion, wave 1.
Hedonic well-being motivation, wave 1.
Eudaimonic well-being motivation, wave 1.
Extrinsic well-being motivation, wave 1.
Career Social Support, Parents, Material, wave 1.
Career Social Support, Parents, Advice, wave 1.
Career Social Support, Parents, Emotional, wave 1.
Career Social Support, Parents, Information, wave 1.
Career Social Support, Siblings, Material, wave 1.
Career Social Support, Teachers, Material, wave 1.
Career Social Support, Friends, Material, wave 1.
Career Social Support, Relatives, Material, wave 1.
Career Social Support, Parents, wave 1.
Career Social Support, Material, wave 1.
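The naming rule for item variables (a stem, then one wave digit, then two item digits) can be unpacked with a regular expression, which is handy when reshaping well to long format. This is only a sketch: the variable names below are hypothetical examples built from the stated rule, not a verified list of the columns in well.

```r
# Hypothetical item names that follow the stated rule: stem + wave + item
vars <- c("gender100", "heema101", "cssp120", "grit308")

# The last three characters are digits: 1 wave digit, then 2 item digits
pattern <- "^(.*)([0-9])([0-9]{2})$"
parsed <- data.frame(
  variable = vars,
  stem = sub(pattern, "\\1", vars),
  wave = as.integer(sub(pattern, "\\2", vars)),
  item = as.integer(sub(pattern, "\\3", vars))
)
parsed
```

Scale scores such as flourish1 carry only the wave digit, so they would need a separate pattern.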
Keng package.
Lefebvre, A., & Huta, V. (2021). Age and gender differences in eudaimonic, hedonic, and extrinsic motivations. Journal of Happiness Studies, 22, 2299-2321.
Hou, Z., Bai, R., & Yao, Y. (2010). Development of career social support inventory for Chinese college students. Chinese Journal of Clinical Psychology, 18(4), 439-442.
Wei, Y., & Hu, J. (2017). Grit: Concept, structure, influencing factors, and consequences. Psychology: Techniques and Applications, 5(1), 52-61.
Tong, K. K., & Wang, Y. Y. (2017). Validation of the Flourishing Scale and Scale of Positive and Negative Experience in a Chinese Community Sample. PLOS ONE, 12(8), e0181616.
well1: The 1st wave data frame with 200 rows and 152 columns:
Participant id
Gender. 0 = boy, 1 = girl.
age: Age in years.
grade: Grade.
income: Monthly income of the family.
cost: Monthly cost of the participant.
SSES: Subjective Socioeconomic Status. SSES was measured through the 10-step social ladder, with high scores indicating high status.
heema01: HEEMA scale, item 01. HEEMA = The Hedonic, Eudaimonic, and Extrinsic Motivations for Activities (Lefebvre & Huta, 2021). HEEMA was rated on a 7-point scale (1 = not at all, 7 = very much). HEEMA measures three kinds of motivation: hedonic (item 01, 04, 06, 07, 09, 16), eudaimonic (item 02, 03, 05, 08, 10), and extrinsic (item 11, 12, 13, 14, 15).
cssp01: Career Social Support from Parents, item 01. CSS investigates four kinds of support using a 5-point scale (1 = almost none, 5 = very much; Hou et al., 2010): material (item 01, 02, 03), advice (item 04, 05, 06, 07, 08, 09), emotional (item 10, 11, 12, 13), and information (item 14, 15, 16, 17, 18, 19, 20).
csss01: Career Social Support from Siblings, item 01.
csst01: Career Social Support from Teachers, item 01.
cssf01: Career Social Support from Friends, item 01.
cssr01: Career Social Support from Relatives, item 01.
grit01: Grit scale, item 01. Grit (Wei & Hu, 2017) is rated on a 5-point scale (1 = not at all like me, 5 = very much like me). Items 01, 03, 05, and 06 should be reverse-scored. Grit has two dimensions: perseverance (item 02, 04, 07, 08) and passion (item 01, 03, 05, 06). After reverse scoring, high scores indicate high perseverance and passion.
CET4: CET4 score. CET4 = College English Test Band 4. CET4 = 0 indicates participants had not taken the CET4 test.
flourish01: Flourish scale, item 01. Flourish scale (Tong & Wang, 2017) was rated on a 5-point scale (1 = very disagree, 5 = very agree), with high scores indicating high eudaimonic well-being.
spne01: SPNE scale, item 01. SPNE = Scale of Positive and Negative Experience (Tong & Wang, 2017). SPNE measures positive (item 01, 03, 05, 07, 10, 12) and negative (item 02, 04, 06, 08, 09, 11) emotions on a 5-point scale (1 = very rarely or never, 5 = very often or always). High positive emotion and low negative emotion indicate high hedonic well-being.
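The Grit scoring rule described above (reverse-score four items, then average items within each dimension) can be sketched on toy data before applying it to well1. The data frame `toy` is made up, the column names grit02 to grit08 are assumed to continue the grit01 pattern, and the overall grit score as the mean of all eight items is likewise an assumption.

```r
# Toy responses on the 1-5 scale; names grit01 ... grit08 are assumed
items <- sprintf("grit%02d", 1:8)
toy <- as.data.frame(matrix(c(1, 1, 1, 1, 1, 1, 1, 1,
                              5, 5, 5, 5, 5, 5, 5, 5,
                              2, 4, 3, 5, 1, 2, 4, 3),
                            nrow = 3, byrow = TRUE))
names(toy) <- items

# Reverse-score items 01, 03, 05, and 06: on a 1-5 scale, new = 6 - old
reversed <- sprintf("grit%02d", c(1, 3, 5, 6))
toy[reversed] <- 6 - toy[reversed]

# Dimension scores are means of the corresponding items
toy$perseverance <- rowMeans(toy[sprintf("grit%02d", c(2, 4, 7, 8))])
toy$passion <- rowMeans(toy[reversed])
toy$grit <- rowMeans(toy[items])  # assumed: mean of all eight items
toy[c("perseverance", "passion", "grit")]
```

The first toy respondent answers 1 on every item, which yields low perseverance (1) but high passion (5) after reverse scoring.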
Keng package.
well1
W refers to well-being. The W research is a three-wave longitudinal survey.
There are four data sets: well1 only includes data collected at wave 1,
well2 at wave 2, and well3 at wave 3. well was created by joining well1,
well2, and well3 together. In well, items have been reverse-coded and
scale scores have been computed, such as flourish1, which indicates the flourish score at wave 1.
well is intended for practicing plotting and statistical analysis. well1,
well2, and well3 are intended for practicing data wrangling.
Lefebvre, A., & Huta, V. (2021). Age and gender differences in eudaimonic, hedonic, and extrinsic motivations. Journal of Happiness Studies, 22, 2299-2321.
Hou, Z., Bai, R., & Yao, Y. (2010). Development of career social support inventory for Chinese college students. Chinese Journal of Clinical Psychology, 18(4), 439-442.
Wei, Y., & Hu, J. (2017). Grit: Concept, structure, influencing factors, and consequences. Psychology: Techniques and Applications, 5(1), 52-61.
Tong, K. K., & Wang, Y. Y. (2017). Validation of the Flourishing Scale and Scale of Positive and Negative Experience in a Chinese Community Sample. PLOS ONE, 12(8), e0181616.
datasets
W refers to well-being. The W research is a three-wave longitudinal survey.
There are four data sets: well1 only includes data collected at wave 1,
well2 at wave 2, and well3 at wave 3. well was created by joining well1,
well2, and well3 together. In well, items have been reverse-coded and
scale scores have been computed, such as flourish1, which indicates the flourish score at wave 1.
well is intended for practicing plotting and statistical analysis. well1,
well2, and well3 are intended for practicing data wrangling.
well2
well2: The 2nd wave data frame with 200 rows and 152 columns:
Participant id
Gender. 0 = boy, 1 = girl. This variable ends with "100". The 1st digit indicates the wave number, and the last two digits indicate the item number. The remaining variables also follow this naming rule.
Age in years.
Grade.
Monthly income of the family.
Monthly cost of the participant.
Subjective Socioeconomic Status. SSES was measured through the 10-step social ladder, with high scores indicating high status.
HEEMA scale, item 01. HEEMA = The Hedonic, Eudaimonic, and Extrinsic Motivations for Activities (Lefebvre & Huta, 2021). HEEMA was rated on a 7-point scale (1 = not at all, 7 = very much). HEEMA measures three kinds of motivation: hedonic (item 01, 04, 06, 07, 09, 16), eudaimonic (item 02, 03, 05, 08, 10), and extrinsic (item 11, 12, 13, 14, 15).
Career Social Support from Parents, item 01. CSS investigates four kinds of support using a 5-point scale (1 = almost none, 5 = very much; Hou et al., 2010): material (item 01, 02, 03), advice (item 04, 05, 06, 07, 08, 09), emotional (item 10, 11, 12, 13), and information (item 14, 15, 16, 17, 18, 19, 20).
Career Social Support from Siblings, item 01.
Career Social Support from Teachers, item 01.
Career Social Support from Friends, item 01.
Career Social Support from Relatives, item 01.
Grit scale, item 01. Grit (Wei & Hu, 2017) is rated on a 5-point scale (1 = not at all like me, 5 = very much like me). Items 01, 03, 05, and 06 should be reverse-scored. Grit has two dimensions: perseverance (item 02, 04, 07, 08) and passion (item 01, 03, 05, 06). After reverse scoring, high scores indicate high perseverance and passion.
CET4 score. CET4 = College English Test Band 4. CET4 = 0 indicates participants had not taken the CET4 test.
Flourish scale, item 01. Flourish scale (Tong & Wang, 2017) was rated on a 5-point scale (1 = very disagree, 5 = very agree), with high scores indicating high eudaimonic well-being.
SPNE scale, item 01. SPNE = Scale of Positive and Negative Experience (Tong & Wang, 2017). SPNE measures positive (item 01, 03, 05, 07, 10, 12) and negative (item 02, 04, 06, 08, 09, 11) emotions on a 5-point scale (1 = very rarely or never, 5 = very often or always). High positive emotion and low negative emotion indicate high hedonic well-being.
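The SPNE item assignment above can be turned into positive and negative emotion scores by averaging the items of each subscale. The sketch below uses made-up responses, and the column names spne01 to spne12 are an assumption carried over from the spne01 name documented for well1.

```r
# Toy responses on the 1-5 scale; names spne01 ... spne12 are assumed
toy <- as.data.frame(rbind(
  c(4, 2, 4, 2, 4, 2, 4, 2, 2, 4, 2, 4),
  c(1, 5, 2, 4, 3, 3, 2, 4, 5, 1, 3, 3)
))
names(toy) <- sprintf("spne%02d", 1:12)

# Item sets taken from the SPNE description
positive_items <- sprintf("spne%02d", c(1, 3, 5, 7, 10, 12))
negative_items <- sprintf("spne%02d", c(2, 4, 6, 8, 9, 11))

toy$positive <- rowMeans(toy[positive_items])
toy$negative <- rowMeans(toy[negative_items])
toy[c("positive", "negative")]
```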
Keng package.
Lefebvre, A., & Huta, V. (2021). Age and gender differences in eudaimonic, hedonic, and extrinsic motivations. Journal of Happiness Studies, 22, 2299-2321.
Hou, Z., Bai, R., & Yao, Y. (2010). Development of career social support inventory for Chinese college students. Chinese Journal of Clinical Psychology, 18(4), 439-442.
Wei, Y., & Hu, J. (2017). Grit: Concept, structure, influencing factors, and consequences. Psychology: Techniques and Applications, 5(1), 52-61.
Tong, K. K., & Wang, Y. Y. (2017). Validation of the Flourishing Scale and Scale of Positive and Negative Experience in a Chinese Community Sample. PLOS ONE, 12(8), e0181616.
W refers to well-being. The W research is a three-wave longitudinal survey.
There are four data sets: well1 only includes data collected at wave 1,
well2 at wave 2, and well3 at wave 3. well was created by joining well1,
well2, and well3 together. In well, items have been reverse-coded and
scale scores have been computed, such as flourish1, which indicates the flourish score at wave 1.
well is intended for practicing plotting and statistical analysis. well1,
well2, and well3 are intended for practicing data wrangling.
well3
well3: The 3rd wave data frame with 200 rows and 152 columns:
Participant id
Gender. 0 = boy, 1 = girl. This variable ends with "100". The 1st digit indicates the wave number, and the last two digits indicate the item number. The remaining variables also follow this naming rule.
Age in years.
Grade.
Received national student loan in the past year.
HEEMA scale, item 01. HEEMA = The Hedonic, Eudaimonic, and Extrinsic Motivations for Activities (Lefebvre & Huta, 2021). HEEMA was rated on a 7-point scale (1 = not at all, 7 = very much). HEEMA measures three kinds of motivation: hedonic (item 01, 04, 06, 07, 09, 16), eudaimonic (item 02, 03, 05, 08, 10), and extrinsic (item 11, 12, 13, 14, 15).
Grit scale, item 01. Grit (Wei & Hu, 2017) is rated on a 5-point scale (1 = not at all like me, 5 = very much like me). Items 01, 03, 05, and 06 should be reverse-scored. Grit has two dimensions: perseverance (item 02, 04, 07, 08) and passion (item 01, 03, 05, 06). After reverse scoring, high scores indicate high perseverance and passion.
CET4 score. CET4 = College English Test Band 4. CET4 = 0 indicates participants had not taken the CET4 test.
Flourish scale, item 01. Flourish scale (Tong & Wang, 2017) was rated on a 5-point scale (1 = very disagree, 5 = very agree), with high scores indicating high eudaimonic well-being.
SPNE scale, item 01. SPNE = Scale of Positive and Negative Experience (Tong & Wang, 2017). SPNE measures positive (item 01, 03, 05, 07, 10, 12) and negative (item 02, 04, 06, 08, 09, 11) emotions on a 5-point scale (1 = very rarely or never, 5 = very often or always). High positive emotion and low negative emotion indicate high hedonic well-being.
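Because CET4 = 0 is a code for "test not taken" rather than a real score, it is usually sensible to recode it to NA before computing descriptive statistics. A minimal sketch on made-up scores follows; the data frame `toy` and the flag column `CET4_taken` are hypothetical.

```r
# Toy data; the scores are made up for illustration
toy <- data.frame(id = 1:4, CET4 = c(0, 452, 510, 0))

# Keep a flag for who took the test, then treat 0 as missing
toy$CET4_taken <- toy$CET4 != 0
toy$CET4[toy$CET4 == 0] <- NA

# The mean now reflects only participants who took the test
cet4_mean <- mean(toy$CET4, na.rm = TRUE)
cet4_mean
```

Without the recoding, the two zeros would pull the mean down from 481 to 240.5.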
Keng package.
Lefebvre, A., & Huta, V. (2021). Age and gender differences in eudaimonic, hedonic, and extrinsic motivations. Journal of Happiness Studies, 22, 2299-2321.
Hou, Z., Bai, R., & Yao, Y. (2010). Development of career social support inventory for Chinese college students. Chinese Journal of Clinical Psychology, 18(4), 439-442.
Wei, Y., & Hu, J. (2017). Grit: Concept, structure, influencing factors, and consequences. Psychology: Techniques and Applications, 5(1), 52-61.
Tong, K. K., & Wang, Y. Y. (2017). Validation of the Flourishing Scale and Scale of Positive and Negative Experience in a Chinese Community Sample. PLOS ONE, 12(8), e0181616.