PowerUpR
This document accompanies the workshop on cluster randomized trials led
by Larry V. Hedges, Sophie E. Stallasch & Martin Brunner at the
University of Potsdam on June 9-10, 2022. It provides the corresponding
R code for the power analysis exercises done during the workshop with
the ‘PowerUp!’ Excel tool by Dong & Maynard (2013), which is available
on the “Causal Evaluation” website. Here, we use the R implementation of
‘PowerUp!’, namely the PowerUpR
package built by Bulus et al. (2021).
You can install the latest version of PowerUpR
(currently, this is version 1.1.0) from CRAN. Throughout this tutorial,
we will also use several tidyverse
(Wickham et al., 2019) functions, as well as the
kableExtra
(Zhu, 2021)
package, so make sure you have installed these packages (for details on
the versions used here, see the ‘Session Info’ section at the end of
this document).
# -- load packages (or install from CRAN if not installed)
packages <- c("PowerUpR", "tidyverse", "kableExtra")
invisible(
  lapply(packages, function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      require(x, character.only = TRUE)
    }
  })
)
To plan rigorous intervention studies on student achievement with sufficient statistical power, reliable estimates of multilevel design parameters are critical. These design parameters reflect the (clustered) variance structure of the outcome, including intraclass correlation coefficients ρ that quantify achievement differences between clusters and R2 values that quantify the variance explained by covariates at the various levels. Design parameters should match the target population of the intervention, the specific hierarchical data structure, and the achievement outcome under investigation.
There are multiple sources of empirical values for ρ and R2 for student achievement. A review of existing international and German research on design parameters can be found in Stallasch et al. (2021). One useful collection of estimates for the United States is, for instance, the “Online Intraclass Correlation Database” created by Hedges and colleagues and hosted by the Institute for Policy Research at Northwestern University.
For the majority of workshop exercises, we draw on the compilation of ρ and R2 values (and corresponding standard errors) published in Stallasch et al. (2021). Based on three longitudinal German large-scale assessments (NEPS, PISA-I+, and DESI) that provide achievement data across the entire school career (grades 1 to 12), the authors generated design parameters that apply to
R2 values at each level are available for three covariate sets: (a) pretest scores, (b) sociodemographic characteristics (comprising students’ gender and migration background, as well as parents’ educational attainment, and families’ HISEI), and (c) the combination thereof.
The design parameters are provided via an interactive Excel file (Supplemental Online Material B) that comes with a detailed introduction on the application scopes of the various sets of estimates. This document can be downloaded from the OSF or the Journal’s website.
To facilitate the workflow in R (e.g., to avoid time-consuming and error-prone copy-and-pasting of estimates), an .rda file that contains the full compilation of design parameters (as a list of data frames) is shared in this course’s repository on GitHub and can be loaded directly:
# load design parameters from github
load(url("https://github.com/sophiestallasch/2022-workshop-CRT/blob/main/data/multides1.rda?raw=true"))
If you run into problems, download the data file from the repository to your local machine and then load it into R.
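For the download-then-load route, a sketch along the following lines may help; it assumes the same raw URL as in the code chunk above and writes the file to the current working directory.

```r
# Alternative: download the .rda file first, then load it from disk
url <- "https://github.com/sophiestallasch/2022-workshop-CRT/blob/main/data/multides1.rda?raw=true"
download.file(url, destfile = "multides1.rda", mode = "wb")  # binary mode for .rda
load("multides1.rda")
```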
# inspect list
summary(multides1)
## Length Class Mode
## B1_General 28 tbl_df list
## B2_General_ND 14 tbl_df list
## B3_Adjusted 28 tbl_df list
## B4_Adjusted_ND 14 tbl_df list
## B5_Academic 28 tbl_df list
## B6_Academic_ND 14 tbl_df list
## B7_Non-Academic 28 tbl_df list
## B8_Non-Academic_ND 14 tbl_df list
## B9_General-2l 20 tbl_df list
## B10_General-2l_ND 10 tbl_df list
## B11_Adjusted-2l 20 tbl_df list
## B12_Adjusted-2l_ND 10 tbl_df list
## B13_Academic-2l 20 tbl_df list
## B14_Academic-2l_ND 10 tbl_df list
## B15_Non-Academic-2l 20 tbl_df list
## B16_Non-Academic-2l_ND 10 tbl_df list
The list contains 16 data frames that can be grouped into two broad classes:
Point Estimates and Standard Errors
Data frames B1, B3, B5, B7, B9, B11, B13, and B15 contain the full sets
of (population-specific) empirical estimates of design parameters for
each domain, subdomain, and grade, along with their standard
errors.
Normative Distributions
Data frames B2, B4, B6, B8, B10, B12, B14, and B16 (i.e., data frames
ending with “_ND”) contain (population-specific) normative distributions
(i.e., minimum, 25th percentile, median, 75th percentile, and maximum)
of those design parameters summarized across domains and/or grades.
These distributions can serve as guesstimates to plan studies whose
target domain and/or grade is not covered in Stallasch et al. (2021).
Data frame | Description | Grades | Scope of application | Target design | Hierarchical structure: L1 | L2 | L3 |
---|---|---|---|---|---|---|---|
B1_General | Design parameters for the general (total) student population | 1-10 | Nationwide/ across all school types | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B2_General_ND | Normative distributions of B1 | 1-10 | Nationwide/ across all school types; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B3_Adjusted | Design parameters for the general (total) student population, adjusted for mean-level differences between school types | 5-10 | 1 school type in the non-academic track | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B4_Adjusted_ND | Normative distributions of B3 | 5-10 | 1 school type in the non-academic track; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B5_Academic | Design parameters for the academic track | 5-10 | Academic track school (“Gymnasium”) | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B6_Academic_ND | Normative distributions of B5 | 5-10 | Academic track school (“Gymnasium”); target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B7_Non-Academic | Design parameters for the non-academic track | 5-10 | 2+ different school types in the non-academic track | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B8_Non-Academic_ND | Normative distributions of B7 | 5-10 | 2+ different school types in the non-academic track; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B9_General-2l | Design parameters for the general (total) student population; for two-level designs | 1-10 | Nationwide/ across all school types | 2L-MSIRT/CRT | Students | - | Schools |
B10_General-2l_ND | Normative distributions of B9 | 1-10 | Nationwide/ across all school types; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B11_Adjusted-2l | Design parameters for the general (total) student population, adjusted for mean-level differences between school types; for two-level designs | 5-10 | 1 school type in the non-academic track | 2L-MSIRT/CRT | Students | - | Schools |
B12_Adjusted-2l_ND | Normative distributions of B11 | 5-10 | 1 school type in the non-academic track; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B13_Academic-2l | Design parameters for the academic track; for two-level designs | 5-10 | Academic track school (“Gymnasium”) | 2L-MSIRT/CRT | Students | - | Schools |
B14_Academic-2l_ND | Normative distributions of B13 | 5-10 | Academic track school (“Gymnasium”); target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B15_Non-Academic-2l | Design parameters for the non-academic track; for two-level designs | 5-10 | 2+ different school types in the non-academic track | 2L-MSIRT/CRT | Students | - | Schools |
B16_Non-Academic-2l_ND | Normative distributions of B15 | 5-10 | 2+ different school types in the non-academic track; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
Further, Stallasch et al. (2021) created a
flow chart to guide researchers through the choice of appropriate design
parameters as a function of key characteristics of the target
intervention.
Data frames that contain the point estimates and standard errors (i.e., B1, B3, B5, B7, B9, B11, B13, and B15) are structured as follows.
Variable | Description |
---|---|
domain | Domain of achievement outcome (for details, see section ‘Achievement Domains’) |
subdomain | Subdomain of achievement outcome (for details, see section ‘Achievement Domains’) |
grade | Grade of achievement outcome |
study | Large-scale assessment study (and cohort): ‘NEPS-SC2’, ‘NEPS-SC3’, ‘NEPS-SC4’, ‘PISA-I+’, ‘DESI’ |
wave | Wave of large-scale assessment study |
model | Hierarchical structure of specified multilevel model: ‘3l’, ‘2l’ |
icc_l2.est | ICC at classroom level ρL2 |
icc_l2.se | SE of ICC at classroom level SE(ρL2) |
icc_l3.est | ICC at school level ρL3 |
icc_l3.se | SE of ICC at school level SE(ρL3) |
r2_l1_pretest.est | Explained variance by a pretest at student level R2L1 |
r2_l1_pretest.se | SE of explained variance by a pretest at student level SE(R2L1) |
r2_l2_pretest.est | Explained variance by a pretest at classroom level R2L2 |
r2_l2_pretest.se | SE of explained variance by a pretest at classroom level SE(R2L2) |
r2_l3_pretest.est | Explained variance by a pretest at school level R2L3 |
r2_l3_pretest.se | SE of explained variance by a pretest at school level SE(R2L3) |
r2_l1_ses.est | Explained variance by sociodemographics at student level R2L1 |
r2_l1_ses.se | SE of explained variance by sociodemographics at student level SE(R2L1) |
r2_l2_ses.est | Explained variance by sociodemographics at classroom level R2L2 |
r2_l2_ses.se | SE of explained variance by sociodemographics at classroom level SE(R2L2) |
r2_l3_ses.est | Explained variance by sociodemographics at school level R2L3 |
r2_l3_ses.se | SE of explained variance by sociodemographics at school level SE(R2L3) |
r2_l1_pretestses.est | Explained variance by a pretest and sociodemographics at student level R2L1 |
r2_l1_pretestses.se | SE of explained variance by a pretest and sociodemographics at student level SE(R2L1) |
r2_l2_pretestses.est | Explained variance by a pretest and sociodemographics at classroom level R2L2 |
r2_l2_pretestses.se | SE of explained variance by a pretest and sociodemographics at classroom level SE(R2L2) |
r2_l3_pretestses.est | Explained variance by a pretest and sociodemographics at school level R2L3 |
r2_l3_pretestses.se | SE of explained variance by a pretest and sociodemographics at school level SE(R2L3) |
Note that three-level design parameters (students at L1 within classrooms at L2 within schools at L3) were estimated for grades 1 to 10 only. For grades 11 to 12, no L2 estimates are available because 11th and 12th graders did not attend intact classrooms; instead, the grouping of students varied with the subject taught. Two-level design parameters (students at L1 within schools at L3) were therefore estimated for these grades. Two-level equivalents for grades 1 to 10 (i.e., ignoring classroom-level clustering) are also provided; you can access them in the data frames labeled with ‘-2l’. Keep in mind that the top level (which is always the school level) is consistently indicated by ‘_l3’ across all data frames, irrespective of whether L2 estimates were computed (i.e., also for two-level designs). Detailed information on the provided design parameters and the underlying analysis models can be found in Stallasch et al. (2021).
Data frames that contain normative distributions (i.e., B2, B4, B6, B8, B10, B12, B14, and B16) are structured as follows.
Variable | Description |
---|---|
domain | Domain of summarized parameters (for details, see section ‘Achievement Domains’) |
grade_range | Grade range of summarized parameters |
statistic | Summary statistic: ‘Minimum’, ‘25th Percentile’, ‘Median’, ‘75th Percentile’, ‘Maximum’ |
icc_l2.est | ICC at classroom level ρL2 |
icc_l2.se | SE of ICC at classroom level SE(ρL2) |
icc_l3.est | ICC at school level ρL3 |
icc_l3.se | SE of ICC at school level SE(ρL3) |
r2_l1_pretest.est | Explained variance by a pretest at student level R2L1 |
r2_l1_pretest.se | SE of explained variance by a pretest at student level SE(R2L1) |
r2_l2_pretest.est | Explained variance by a pretest at classroom level R2L2 |
r2_l2_pretest.se | SE of explained variance by a pretest at classroom level SE(R2L2) |
r2_l3_pretest.est | Explained variance by a pretest at school level R2L3 |
r2_l3_pretest.se | SE of explained variance by a pretest at school level SE(R2L3) |
r2_l1_ses.est | Explained variance by sociodemographics at student level R2L1 |
r2_l1_ses.se | SE of explained variance by sociodemographics at student level SE(R2L1) |
r2_l2_ses.est | Explained variance by sociodemographics at classroom level R2L2 |
r2_l2_ses.se | SE of explained variance by sociodemographics at classroom level SE(R2L2) |
r2_l3_ses.est | Explained variance by sociodemographics at school level R2L3 |
r2_l3_ses.se | SE of explained variance by sociodemographics at school level SE(R2L3) |
r2_l1_pretestses.est | Explained variance by a pretest and sociodemographics at student level R2L1 |
r2_l1_pretestses.se | SE of explained variance by a pretest and sociodemographics at student level SE(R2L1) |
r2_l2_pretestses.est | Explained variance by a pretest and sociodemographics at classroom level R2L2 |
r2_l2_pretestses.se | SE of explained variance by a pretest and sociodemographics at classroom level SE(R2L2) |
r2_l3_pretestses.est | Explained variance by a pretest and sociodemographics at school level R2L3 |
r2_l3_pretestses.se | SE of explained variance by a pretest and sociodemographics at school level SE(R2L3) |
The design parameters cover the following achievement domains.
Domain | Subdomain |
---|---|
Mathematics | Mathematics |
Science | Science |
Verbal Skills in German (as First Language) | Reading Comprehension |
Reading Speed | |
Spelling | |
Grammar | |
Vocabulary | |
Writing | |
Argumentation | |
Listening Comprehension | |
Verbal Skills in English (as Foreign Language) | Reading Comprehension |
Text Reconstruction (C-Test) | |
Language Awareness: Sociopragmatics | |
Language Awareness: Grammar | |
Writing | |
Listening Comprehension | |
Domain-General Achievement | Declarative Metacognition |
ICT Literacy | |
Problem Solving | |
Basic Cognitive Functions: Perception Speed | |
Basic Cognitive Functions: Reasoning |
Note that the (sub)domain character strings in the data frames are named
exactly like shown here. For instance, if we want to filter design
parameters for English text reconstruction from data frame B1, we could
use the following code.
# filter English text reconstruction from data frame B1
eng_ctest <- multides1[["B1_General"]] %>%
filter(domain == "Verbal Skills in English (as Foreign Language)",
subdomain == "Text Reconstruction (C-Test)")
Tip: Avoid typing out the full strings. We
recommend using pattern-matching functions from the
grep()
family instead, which will do the job for you.
# better:
eng_ctest <- multides1[["B1_General"]] %>%
filter(grepl("Eng", domain),
grepl("C-Test", subdomain))
Let’s have a look at the filtered design parameters.
eng_ctest %>%
t() %>% # transpose for better readability
kbl() %>%
kable_styling(bootstrap_options = c("condensed", "responsive")) %>%
scroll_box(height = "350px")
domain | Verbal Skills in English (as Foreign Language) | Verbal Skills in English (as Foreign Language) | Verbal Skills in English (as Foreign Language) |
subdomain | Text Reconstruction (C-Test) | Text Reconstruction (C-Test) | Text Reconstruction (C-Test) |
grade | 9 | 9 | 9 |
study | DESI | DESI | DESI (Pooled) |
wave | 1 | 2 | NA |
model | 3l | 3l | 3l |
icc_l2.est | 0.07484727 | 0.09329552 | 0.08274759 |
icc_l2.se | 0.009634811 | 0.011132797 | 0.007285329 |
icc_l3.est | 0.5839861 | 0.5519840 | 0.5687772 |
icc_l3.se | 0.01762045 | 0.01851544 | 0.01276421 |
r2_l1_pretest.est | NA | 0.4270869 | 0.4270869 |
r2_l1_pretest.se | NA | 0.007683032 | 0.007683032 |
r2_l2_pretest.est | NA | 0.8440895 | 0.8440895 |
r2_l2_pretest.se | NA | 0.01575307 | 0.01575307 |
r2_l3_pretest.est | NA | 0.998872 | 0.998872 |
r2_l3_pretest.se | NA | 0.0006166911 | 0.0006166911 |
r2_l1_ses.est | 0.01315652 | 0.01675238 | 0.01483982 |
r2_l1_ses.se | 0.002402910 | 0.002561325 | 0.001752443 |
r2_l2_ses.est | 0.6742777 | 0.6423608 | 0.6577857 |
r2_l2_ses.se | 0.09820881 | 0.09497836 | 0.06827333 |
r2_l3_ses.est | 0.8845148 | 0.9068157 | 0.8968653 |
r2_l3_ses.se | 0.01972262 | 0.01770283 | 0.01317418 |
r2_l1_pretestses.est | NA | 0.4306028 | 0.4306028 |
r2_l1_pretestses.se | NA | 0.007663661 | 0.007663661 |
r2_l2_pretestses.est | NA | 0.8838199 | 0.8838199 |
r2_l2_pretestses.se | NA | 0.01990558 | 0.01990558 |
r2_l3_pretestses.est | NA | 0.9993725 | 0.9993725 |
r2_l3_pretestses.se | NA | 0.000219044 | 0.000219044 |
As can be seen, there are three entries for English text reconstruction
as the respective test was administered to students at two time points
in the DESI study, namely at the beginning and at the end of grade 9.
For the beginning of grade 9, no pretests were available, therefore the
corresponding cells are set to NA
. A third set of estimates
is provided here that contains the meta-analytically pooled results
across these two time points (as indicated by ‘DESI (Pooled)’). Note
that this integration strategy (applying a fixed effect model approach)
was also adopted for other domains in grade 9 in case multiple design
parameters were available (as obtained either from several studies [as
indicated by ‘All (Pooled)’] or from the two time points in DESI).
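To illustrate, the fixed-effect pooling can be reproduced by hand via inverse-variance weighting, where each estimate is weighted by the reciprocal of its squared standard error. The sketch below uses the two DESI wave estimates of the classroom-level ICC shown in the table above:

```r
# Fixed-effect (inverse-variance) pooling of the two DESI wave estimates
# of the classroom-level ICC (values taken from the table above)
est <- c(0.07484727, 0.09329552)   # wave 1, wave 2
se  <- c(0.009634811, 0.011132797)
w   <- 1 / se^2                    # inverse-variance weights
pooled_est <- sum(w * est) / sum(w)
pooled_se  <- sqrt(1 / sum(w))
round(c(pooled_est, pooled_se), 7)
# reproduces the 'DESI (Pooled)' column: 0.0827476, 0.0072853
```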
The following three exercises cover power analysis for cluster randomized trials (CRTs). In a two-level CRT (2L-CRT; see Scenario 1) students at L1 are nested within schools at L2 and entire schools are randomly assigned to the experimental conditions (i.e., in a two-arm study either the treatment or control condition). In a three-level CRT (3L-CRT; see Scenarios 2 and 3) students at L1 are nested within classrooms at L2 which are, in turn, nested within schools at L3. Here, randomization/treatment assignment occurs, again, at the school level L3.
Research Team 1 would like to conduct a 2L-CRT on the effectiveness of a school-wide intervention to improve 4th graders’ mathematical achievement in Germany. Team 1 plans to sample 40 students from each school. To improve statistical power, the researchers administer a math pretest to each participating student. How many schools are required to detect a standardized intervention effect of d = .15?
Which data frame contains the appropriate design parameters?
# -- choose data frame B9
dp_s1 <- multides1[["B9_General-2l"]] %>%
# filter mathematics in grade 4
filter(domain == "Mathematics", grade == 4)
Let’s have a look at the relevant design parameters.
dp_s1 %>%
# select ICCs and R-squared values for pretests
select(contains(c("icc", "pretest."))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
icc_l3.est | 0.12 |
icc_l3.se | 0.01 |
r2_l1_pretest.est | 0.40 |
r2_l1_pretest.se | 0.01 |
r2_l3_pretest.est | 0.64 |
r2_l3_pretest.se | 0.03 |
Based on the expression for the MDES of a 2L-CRT with randomization/treatment assignment at L2 as given in Dong & Maynard (2013, p. 51), the number of schools \(J\) is computed as \[ J = \left(\frac{M_{J-g^*_{L2}-2}}{MDES}\right)^2\left(\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)}+\frac{(1-\rho_{L2})(1-R^2_{L1})}{P(1-P)n} \right)\] with \(M_{J-g^*_{L2}-2} = t_{\alpha/2} + t_{1-\beta}\) (for a two-tailed test) denoting the multiplier based on the t distribution with \(J-g^*_{L2}-2\) degrees of freedom, \(g^*_{L2}\) the number of covariates at L2, \(P\) the proportion of the sample assigned to treatment, and \(n\) the number of students per school.
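This expression can also be solved by hand. The sketch below iterates the formula to convergence, using the rounded design parameters from the table above (ρ = .12, R²L1 = .40, R²L2 = .64) and the defaults used throughout (two-tailed test, α = .05, power = .80, P = .50); because J appears on both sides via the degrees of freedom, the computation is iterative:

```r
# Hand-rolled iterative solver for J (a sketch, not part of PowerUpR)
mdes <- .15; n <- 40; p <- .50; alpha <- .05; power <- .80
rho2 <- .12; r21 <- .40; r22 <- .64; g2 <- 1

J <- 8                                          # starting value J0
for (i in 1:100) {
  df <- J - g2 - 2
  M  <- qt(1 - alpha / 2, df) + qt(power, df)   # two-tailed multiplier
  J_new <- (M / mdes)^2 *
    (rho2 * (1 - r22) / (p * (1 - p)) +
     (1 - rho2) * (1 - r21) / (p * (1 - p) * n))
  if (abs(J_new - J) < 1e-10) break
  J <- J_new
}
ceiling(J)
# agrees with mrss.cra2() run on the rounded inputs (J = 81)
```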
The initial design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 2L-CRT: Number of schools J | mrss.cra2() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two-tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 40 | n = 40 |
Starting value for J, J0 = 8 | J0 = 8 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
ICC at L2 ρL2 = .12 | rho2 = dp_s1$icc_l3.est |
Explained variance at L1 R2L1 = .40 | r21 = dp_s1$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .64 | r22 = dp_s1$r2_l3_pretest.est |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L2 g*L2 = 1 | g2 = 1 |
Minimum detectable effect size MDES = .15 | es = .15 |
PowerUpR
# -- choose data frame B9
dp_s1 <- multides1[["B9_General-2l"]] %>%
# filter mathematics in grade 4
filter(domain == "Mathematics", grade == 4)
d.a <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 80
Given the assumptions, at least 80 schools are necessary to achieve an MDES of .15.
Of course, it is also possible to insert the numeric values for the design parameters directly, but using the variables is safer and more precise, and it prevents typos. Note that the results can differ due to rounding.
d.a1 <- mrss.cra2(es = .15, n = 40, rho2 = .12,
g2 = 1, r21 = .40, r22 = .64)
## J = 81
Everything else being equal (and using the design with a math pretest as baseline), what happens when…
# adjust the power argument to .90
d.b <- mrss.cra2(power = .90, es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 107
Tip: We could also simply update the respective
parameter in the $parms
list stored in the
mrss
object and then call the function again with the
modified arguments.
# save design (a) as a new design
d.b <- d.a
# update power argument to .90
d.b$parms$power = .90
# call the function
d.b <- exec("mrss.cra2", !!!d.b$parms)
## J = 107
Team 1 wants to explore how precision can be improved if they additionally control for key sociodemographics, such as students’ gender and migration background, as well as parents’ educational attainment and the families’ HISEI, (1) at L1 only, and (2) at both L1 and L2.
Let’s have a look at the relevant design parameters first.
dp_s1 %>%
# select R-squared values for the combination of pretest and sociodemographics
select(contains("pretestses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_pretestses.est | 0.43 |
r2_l1_pretestses.se | 0.01 |
r2_l3_pretestses.est | 0.76 |
r2_l3_pretestses.se | 0.03 |
# add 4 sociodemographics at L1 only: adjust only r21
d.c1 <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretestses.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 80
# add 4 sociodemographics at both L1 and L2
d.c2 <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 5, r21 = dp_s1$r2_l1_pretestses.est, r22 = dp_s1$r2_l3_pretestses.est)
## J = 60
# -- set all covariate-specific arguments to zero
# this can be achieved by **not** specifying the covariate-specific arguments
# (defaults: g2 = 0, r21 = 0, r22 = 0)
d.d <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est)
## J = 199
# adjust rho2 to .2
d.e <- mrss.cra2(es = .15, n = 40, rho2 = .20,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 119
The design parameters are estimates of population quantities and are
thus associated with statistical uncertainty. This uncertainty can be
incorporated by using the standard error of each design parameter.
Assuming large-sample properties, Team 1 uses the standard
normal distribution to compute the limits of 95% confidence intervals
from these standard errors.
A conservative approach for planning the sample size is to use the upper
bound estimate of the 95% CI for \(\rho\), and the lower bound estimates for
the R2 values.
d.f <- mrss.cra2(es = .15, n = 40,
rho2 = dp_s1$icc_l3.est+1.96*dp_s1$icc_l3.se,
g2 = 1,
r21 = dp_s1$r2_l1_pretest.est-1.96*dp_s1$r2_l1_pretest.se,
r22 = dp_s1$r2_l3_pretest.est-1.96*dp_s1$r2_l3_pretest.se)
## J = 106
Let’s have a look at the results. Move your cursor across the bars to learn the required number of schools for each design and the percentage difference compared to the baseline design with a single math pretest as covariate at both L1 and L2.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c1, d.c2, d.d, d.e, d.f) %>%
set_names("d.a", "d.b", "d.c1", "d.c2", "d.d", "d.e", "d.f")
delta <- function(design) (design[["J"]]-d.a[["J"]])/d.a[["J"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["J"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c1", "d.c2", "d.d", "d.e", "d.f"),
labels = c("The baseline design with a math pretest as covariate",
"... increasing power to 90%",
" ... adding sociodemographics at L1",
" ... adding sociodemographics at both L1 and L2",
"... using no covariates",
"... assuming increased achievement differences at L2 of 20%",
"... applying a conservative approach")),
" requires J = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .15 by Design")
ggplotly(p, tooltip = "text")
PowerUpR
also has built-in functionality for plotting.
We will quickly check out the plot()
function with design
(a). We can either plot statistical power as a function of the required
number of schools (which is the default in plot()
)…
plot(d.a, main = "Statistical Power as a Function of Number of Schools")
… or we can plot the MDES along with its (1-α)*100% CI as a function of the required number of schools.
# include `ypar = "mdes"` to plot the MDES
# include `locate=TRUE` to locate parameter values for the applied design
plot(d.a, ypar = "mdes",
main = "MDES as a Function of Number of Schools", locate = TRUE)
Research Team 2 would like to expand Research Team 1’s CRT to a three-level design, where classroom-level information is available. Team 2 plans to sample 3 classrooms with 20 students per classroom from every school and, again, to use a math pretest, administered to all students, as a covariate. The researchers are interested in the number of schools necessary to detect a standardized intervention effect of d = .25.
Which data frame contains the appropriate design parameters?
# -- choose data frame B1
dp_s2 <- multides1[["B1_General"]] %>%
# filter mathematics achievement in grade 4
filter(domain == "Mathematics", grade == 4)
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select ICCs and R-squared values for pretests
select(contains(c("icc", "pretest."))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
icc_l2.est | 0.05 |
icc_l2.se | 0.02 |
icc_l3.est | 0.10 |
icc_l3.se | 0.02 |
r2_l1_pretest.est | 0.40 |
r2_l1_pretest.se | 0.01 |
r2_l2_pretest.est | 0.35 |
r2_l2_pretest.se | 0.04 |
r2_l3_pretest.est | 0.76 |
r2_l3_pretest.se | 0.03 |
The number of schools \(K\) for a 3L-CRT with randomization/treatment assignment at L3 is given in Dong & Maynard (2013, p. 52) as \[ K = \left(\frac{M_{K-g^*_{L3}-2}}{MDES}\right)^2\left(\frac{\rho_{L3}(1-R^2_{L3})}{P(1-P)}+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)J}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn} \right)\]
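As a quick plausibility check, a candidate K can be plugged back into this expression and solved for the MDES. The sketch below uses K = 24 schools together with the rounded design parameters from the table above and the defaults used throughout (two-tailed test, α = .05, power = .80, P = .50):

```r
# Forward check: MDES implied by K = 24 schools (sketch; rounded inputs)
K <- 24; J <- 3; n <- 20; p <- .50; alpha <- .05; power <- .80; g3 <- 1
rho2 <- .05; rho3 <- .10; r21 <- .40; r22 <- .35; r23 <- .76

df <- K - g3 - 2
M  <- qt(1 - alpha / 2, df) + qt(power, df)   # two-tailed multiplier
V  <- rho3 * (1 - r23) / (p * (1 - p)) +
      rho2 * (1 - r22) / (p * (1 - p) * J) +
      (1 - rho3 - rho2) * (1 - r21) / (p * (1 - p) * J * n)
M * sqrt(V / K)
# close to the targeted MDES of .25
```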
The initial design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 3L-CRT: Number of schools K | mrss.cra3() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two-tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 3 | J = 3 |
Starting value for K, K0 = 8 | K0 = 8 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
ICC at L2 ρL2 = .05 | rho2 = dp_s2$icc_l2.est |
ICC at L3 ρL3 = .10 | rho3 = dp_s2$icc_l3.est |
Explained variance at L1 R2L1 = .40 | r21 = dp_s2$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .35 | r22 = dp_s2$r2_l2_pretest.est |
Explained variance at L3 R2L3 = .76 | r23 = dp_s2$r2_l3_pretest.est |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L3 g*L3 = 1 | g3 = 1 |
Minimum detectable effect size MDES = .25 | es = .25 |
PowerUpR
# -- choose data frame B1
dp_s2 <- multides1[["B1_General"]] %>%
# filter mathematics achievement in grade 4
filter(domain == "Mathematics", grade == 4)
d.a <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est,
rho3 = dp_s2$icc_l3.est,
g3 = 1,
r21 = dp_s2$r2_l1_pretest.est,
r22 = dp_s2$r2_l2_pretest.est,
r23 = dp_s2$r2_l3_pretest.est)
## K = 24
Everything else being equal (and using the design with the math pretest as baseline), what happens when…
Team 2 opts for a conservative approach (see Scenario 1, design f).
d.b <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est+1.96*dp_s2$icc_l2.se,
rho3 = dp_s2$icc_l3.est+1.96*dp_s2$icc_l3.se,
g3 = 1,
r21 = dp_s2$r2_l1_pretest.est-1.96*dp_s2$r2_l1_pretest.se,
r22 = dp_s2$r2_l2_pretest.est-1.96*dp_s2$r2_l2_pretest.se,
r23 = dp_s2$r2_l3_pretest.est-1.96*dp_s2$r2_l3_pretest.se)
## K = 36
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select R-squared values for the combination of pretest and sociodemographics
select(contains("pretestses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_pretestses.est | 0.42 |
r2_l1_pretestses.se | 0.01 |
r2_l2_pretestses.est | 0.83 |
r2_l2_pretestses.se | 0.03 |
r2_l3_pretestses.est | 0.83 |
r2_l3_pretestses.se | 0.03 |
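The conservative approach used below simply shifts each point estimate by 1.96 standard errors in the unfavorable direction: ICCs upward (more clustering), R² values downward (less explained variance). A minimal sketch with the R² estimates and SEs from the table above:

```r
# Minimal sketch of the conservative adjustment: shift each estimate by
# 1.96 SEs in the unfavorable direction
conservative_icc <- function(est, se) est + 1.96 * se  # ICCs: upward
conservative_r2  <- function(est, se) est - 1.96 * se  # R-squared: downward
conservative_r2(est = c(.42, .83, .83), se = c(.01, .03, .03))
## [1] 0.4004 0.7712 0.7712
```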
# -- point estimates
d.c1 <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est,
rho3 = dp_s2$icc_l3.est,
g3 = 5,
r21 = dp_s2$r2_l1_pretestses.est,
r22 = dp_s2$r2_l2_pretestses.est,
r23 = dp_s2$r2_l3_pretestses.est)
## K = 17
# -- conservative approach
d.c2 <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est+1.96*dp_s2$icc_l2.se,
rho3 = dp_s2$icc_l3.est+1.96*dp_s2$icc_l3.se,
g3 = 5,
r21 = dp_s2$r2_l1_pretestses.est-1.96*dp_s2$r2_l1_pretestses.se,
r22 = dp_s2$r2_l2_pretestses.est-1.96*dp_s2$r2_l2_pretestses.se,
r23 = dp_s2$r2_l3_pretestses.est-1.96*dp_s2$r2_l3_pretestses.se)
## K = 24
Now assume that the researchers’ resources allow them to sample up to 50 schools at most. Team 2 wants to explore the levels of statistical power the study design can reach with and without 4 sociodemographics as covariates, given this sample size constraint.
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select R-squared values for sociodemographics
select(contains("_ses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_ses.est | 0.10 |
r2_l1_ses.se | 0.01 |
r2_l2_ses.est | 0.84 |
r2_l2_ses.se | 0.03 |
r2_l3_ses.est | 0.69 |
r2_l3_ses.se | 0.05 |
To vectorize the functions of PowerUpR over a range of possible parameters (here: 8 to 50 schools), we need to write a custom function similar to what the package authors suggest in their vignette on vectorization.
custom_power.cra3 <- function(K, covariates = FALSE) {
parms <- list(rho2 = dp_s2$icc_l2.est, rho3 = dp_s2$icc_l3.est,
g3 = 0, r21 = 0, r22 = 0, r23 = 0,
n = 20, J = 3, K = K)
# to include 4 sociodemographic covariates
if(isTRUE(covariates)) {
parms <- list(rho2 = dp_s2$icc_l2.est, rho3 = dp_s2$icc_l3.est,
g3 = 4,
r21 = dp_s2$r2_l1_ses.est,
r22 = dp_s2$r2_l2_ses.est,
r23 = dp_s2$r2_l3_ses.est,
n = 20, J = 3, K = K)
}
design <- exec("power.cra3", !!!parms)
design$power[1]
}
Now we just have to map the function across the various sample sizes up to 50 schools.
# number of schools ranging from 8 to 50
K <- seq(8, 50, 1)
d.d <- map_dbl(K, custom_power.cra3) %>% set_names(K)
d.e <- map_dbl(K, ~custom_power.cra3(K = ., covariates = TRUE)) %>% set_names(K)
Let’s first have a look at the results for designs a to c (i.e., where the target outcome was the required number of schools). Move your cursor across the bars to learn about the required number of schools for each design and how large the difference in % is compared to the baseline design with a single math pretest as covariate.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c1, d.c2) %>%
set_names("d.a", "d.b", "d.c1", "d.c2")
delta <- function(design) (design[["K"]]-d.a[["K"]])/d.a[["K"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["K"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c1", "d.c2"),
labels = c("The baseline design with a math pretest as covariate",
"... applying a conservative approach",
" ... adding sociodemographics",
" ... adding sociodemographics and applying a conservative approach")),
" requires K = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .25 by Design")
ggplotly(p, tooltip = "text")
Now we plot the results for designs d and e (i.e., where the target outcome was the achieved power level). Move your cursor across the lines to learn about the achieved level of statistical power.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
p <- data.frame(K = K, d.d = d.d, d.e = d.e) %>%
pivot_longer(d.d:d.e) %>%
mutate(name = fct_recode(name,
"No covariates" = "d.d",
"Sociodemographics as covariates" = "d.e")) %>%
ggplot(aes(x = as.numeric(K), y = value, color = name,
text = paste(
"With " , K, " schools ",
round(value*100, 0), "% Power is achieved.", sep = ""),
group = 1)) +
geom_line() +
geom_hline(yintercept = .8, lty = "dotted") +
scale_y_continuous("Statistical Power", limits = c(0,1)) +
scale_x_continuous("Number of schools", limits = c(8, 50)) +
scale_color_manual(values = c("#2c3e50", "#18bc9c")) +
theme_minimal() +
theme(axis.ticks = element_blank(),
axis.title.x = element_text(margin = margin(0.5, 0, 0, 0)),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Statistical Power as a Function of the Number of Schools")
ggplotly(p, tooltip = "text") %>%
layout(legend = list(
orientation = "h",
x = 0.2,
y = -0.2
))
Suppose that research Team 3 plans a 3L-CRT to study the impact of an intervention that is intended to affect students’ history achievement in comprehensive schools (grades 5 to 12). Due to budgetary constraints, a fixed maximum of 40 schools (with 2 classrooms of 20 students each) is at the researchers’ disposal. Given these limits, the primary concern of Team 3 is to ensure that the attainable MDES lies within the range of effects on students’ achievement that is typically observed for this kind of intervention (i.e., 0.20 ≤ d ≤ 0.30). Team 3 considers two designs: (1) a design without covariates, and (2) a design that adjusts for a pretest and 4 sociodemographics (i.e., students’ gender and migration background, parents’ educational attainment, and families’ HISEI).
Which data frame contains the appropriate design parameters?
Team 3 consults data frame B4 outlining the normative distributions across the various achievement domains to determine small (i.e., 25th percentile), medium (i.e., median), and large (i.e., 75th percentile) values of ρ and R2.
# -- choose data frame B4
dp_s3 <- multides1[["B4_Adjusted_ND"]] %>%
# filter P25, median, and P75 for design parameters that are summarized across domains and secondary school grades
filter(domain == "All", grade_range == "5-12", !grepl("Min|Max", statistic)) %>%
# select relevant design parameters
select(statistic, contains(c("icc", "pretestses"))) %>%
# simplify variable names
# (we will need this later on; this will facilitate the vectorization over the normative distributions)
set_names(c("statistic", "rho2", "rho3", "r21", "r22", "r23"))
Let’s have a look at the design parameters.
dp_s3 %>%
mutate(across(where(is.numeric), round, 2)) %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
statistic | rho2 | rho3 | r21 | r22 | r23 |
---|---|---|---|---|---|
25th Percentile | 0.04 | 0.08 | 0.17 | 0.70 | 0.84 |
Median | 0.06 | 0.10 | 0.22 | 0.77 | 0.90 |
75th Percentile | 0.09 | 0.12 | 0.31 | 0.86 | 0.97 |
The \(MDES\) for a 3L-CRT with randomization/treatment assignment at L3 is given in Dong & Maynard (2013, p. 52) as \[ MDES = M_{K-g^*_{L3}-2}\sqrt{\frac{\rho_{L3}(1-R^2_{L3})}{P(1-P)K}+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)KJ}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)KJn}}\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum detectable effect size MDES of a 3L-CRT | mdes.cra3() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 2 | J = 2 |
Number of L3 units K = 40 | K = 40 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .04/.06/.09 (small/medium/large) | rho2 = dp_s3$icc_l2.est |
ICC at L3 ρL3 = .08/.10/.12 (small/medium/large) | rho3 = dp_s3$icc_l3.est |
Design (a) | |
Explained variance at L1 R2L1 = 0 | r21 = 0 (default) |
Explained variance at L2 R2L2 = 0 | r22 = 0 (default) |
Explained variance at L3 R2L3 = 0 | r23 = 0 (default) |
Design (b) | |
Explained variance at L1 R2L1 = .17/.22/.31 (small/medium/large) | r21 = dp_s3$r2_l1_pretestses.est |
Explained variance at L2 R2L2 = .70/.77/.86 (small/medium/large) | r22 = dp_s3$r2_l2_pretestses.est |
Explained variance at L3 R2L3 = .84/.90/.97 (small/medium/large) | r23 = dp_s3$r2_l3_pretestses.est |
Further parameters | PowerUpR conversion |
---|---|
Design (a) | |
Number of covariates at L3 g*L3 = 0 | g3 = 0 (default) |
Design (b) | |
Number of covariates at L3 g*L3 = 5 | g3 = 5 |
PowerUpR
# -- choose data frame B4
dp_s3 <- multides1[["B4_Adjusted_ND"]] %>%
# filter P25, median, and P75 for design parameters that are summarized across domains and secondary school grades
filter(domain == "All", grade_range == "5-12", !grepl("Min|Max", statistic)) %>%
# select relevant design parameters
select(statistic, contains(c("icc", "pretestses"))) %>%
# simplify variable names
set_names(c("statistic", "rho2", "rho3", "r21", "r22", "r23"))
Again, we need to write a custom function (see also Scenario 2) to vectorize over a range of possible parameters (here: small, medium, and large design parameters).
custom_mdes.cra3 <- function(rho2, rho3, g3 = 0, r21 = 0, r22 = 0, r23 = 0, ...) {
parms <- list(rho2 = rho2, rho3 = rho3,
g3 = g3,
r21 = r21, r22 = r22, r23 = r23,
n = 20, J = 2, K = 40)
design <- exec("mdes.cra3", !!!parms)
mdes <- design$mdes[1] # MDES point estimate
ci.lb <- design$mdes[2] # lower bound of 95% CI
ci.ub <- design$mdes[3] # upper bound of 95% CI
return(data.frame(mdes = mdes, ci.lb = ci.lb, ci.ub = ci.ub))
}
# apply the function to compute the MDES for small, medium, and large intraclass correlations
d.a <- pmap(dp_s3 %>% select(contains("rho")), custom_mdes.cra3) %>%
set_names(dp_s3$statistic) %>%
bind_rows(.id = "statistic")
When adding covariates, Team 3 uses the large values for ρ (i.e., the 75th percentile) to apply a conservative approach.
# use the 75th percentile of the ICCs and adjust for 5 covariates (1 pretest + 4 sociodemographics)
d.b <- pmap(dp_s3 %>%
mutate(rho2 = as.numeric(.[.$statistic == "75th Percentile", "rho2"]),
rho3 = as.numeric(.[.$statistic == "75th Percentile", "rho3"]),
g3 = 5),
custom_mdes.cra3) %>%
set_names(unique(dp_s3$statistic)) %>%
bind_rows(.id = "statistic")
Let’s have a look at the results.
p <- ggplot(bind_rows(d.a, d.b, .id = "design") %>%
mutate(statistic = factor(statistic,
levels = c("25th Percentile",
"Median",
"75th Percentile"),
labels = c("Small", "Medium", "Large")))) +
geom_pointrange(aes(x = statistic, y = mdes, ymin = ci.lb, ymax = ci.ub, col = design),
position = position_dodge(width = 0.5)) +
geom_hline(yintercept = .3, lty = "dotted") +
scale_y_continuous("MDES", breaks = seq(0, 0.7, 0.1))+
scale_color_manual(values = c("#2c3e50", "#18bc9c"),
labels = c("No covariates",
expression("Pretest + Sociodemographics (with large values of"~rho*")"))) +
theme_minimal() +
theme(axis.title.x = element_blank(),
legend.title = element_blank(),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)) +
ggtitle("MDES (with 95% CI) for Normative Distributions of Design Parameters") +
annotate("rect", xmin = -Inf, xmax = Inf, ymin = 0.3, ymax = Inf,
alpha = .2)
p
Team 3 learns that without covariates, the attainable MDES is 0.31/0.35/0.39 (point estimates) for small/medium/large values of ρL2 and ρL3. Including both a pretest and sociodemographics as covariates and using the 75th percentiles of the values for ρL2 and ρL3 (as more conservative upper bounds), the respective MDES values shrink to 0.20/0.18/0.14 for small/medium/large values of R2 at the various levels. Consequently, Team 3 can be quite confident that their CRT design offers sufficient sensitivity to detect a true intervention effect within the desired range when including both a pretest and sociodemographics.
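The medium-ρ value for the no-covariate design can also be reproduced by hand from the MDES formula. This is a minimal sketch (not PowerUpR’s internal routine) with the rounded median ICCs from data frame B4:

```r
# Minimal sketch: MDES for the no-covariate design with median ICCs
p <- .50; n <- 20; J <- 2; K <- 40; g3 <- 0
rho2 <- .06; rho3 <- .10  # medians from data frame B4
M <- qt(.975, K - g3 - 2) + qt(.80, K - g3 - 2)  # multiplier, df = 38
se <- sqrt(rho3 / (p * (1 - p) * K) +
             rho2 / (p * (1 - p) * K * J) +
             (1 - rho3 - rho2) / (p * (1 - p) * K * J * n))
round(M * se, 2)  # matches the reported medium point estimate
## [1] 0.35
```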
The following two exercises cover power analysis for multisite trials (MSTs). The MST design can be seen as an extension of the CRT design. In a two-level multisite individually randomized trial (2L-MSIRT; see Scenario 4), students at L1 are nested within schools at L2 and randomization/treatment assignment occurs at the individual student level L1 within L2 school clusters. In a three-level multisite cluster randomized trial (3L-MSCRT; see Scenario 5), students at L1 are nested within classrooms at L2 which are, in turn, nested within schools at L3. Here, entire classroom subclusters at L2 within L3 school clusters are randomly allocated to experimental conditions.
In this exercise, we employ the worked example of Bloom & Spybrook (2017).
Note that in this 2L-MSIRT, the intervention effects are assumed to vary across sites.
The MDES for a 2L-MSIRT with randomization/treatment assignment at L1 and random cross-site effects is given in Dong & Maynard (2013, p. 48) as \[ MDES = M_{J-g^*_{L2}-1}\sqrt{\frac{\rho_{L2}\omega_{T_{L2}}(1-R^2_{T_{L2}})}{J}+\frac{(1-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn}}\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum detectable effect size MDES of a 2L-MSIRT with random site effects | mdes.bira2() or
mdes.bira2r1() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 50 | n = 50 |
Number of L2 units J = 30 | J = 30 |
Proportion of sample assigned to treatment P = .60 (unbalanced) | p = .60 |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .18 | rho2 = .18 |
Explained variance at L1 R2L1 = .38 | r21 = .38 |
Design (a) | |
Treatment effect heterogeneity at L2 ωTL2 = .25^2/.18 | omega2 = .25^2/.18 |
Effect size variability at L2 τ2TL2 = .25^2 | esv2 = .25^2 |
Design (b) | |
Treatment effect heterogeneity at L2 ωTL2 = 0 | omega2 = 0 (default) |
Effect size variability at L2 τ2TL2 = 0 | esv2 = 0 (default) |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L2 g*L2 = 0 | g2 = 0 (default) |
PowerUpR
# -- use treatment effect heterogeneity argument `omega2` (equivalent to the 'PowerUp!' Excel tool)
d.a1 <- mdes.bira2r1(rho2 = 0.18, omega2 = 0.25^2/0.18,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
##
## Minimum detectable effect size:
## ---------------------------------------
## 0.171 95% CI [0.051,0.292]
## ---------------------------------------
## Degrees of freedom: 29
## Standardized standard error: 0.059
## Type I error rate: 0.05
## Type II error rate: 0.2
## Two-tailed test: TRUE
# -- use effect size variability argument `esv2` (not available in the 'PowerUp!' Excel tool)
d.a2 <- mdes.bira2r1(rho2 = 0.18, esv2 = 0.25^2,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
# check for identity
identical(d.a1[["mdes"]], d.a2[["mdes"]])
## [1] TRUE
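The reported MDES of 0.171 can likewise be reproduced by hand from the formula above (a minimal sketch, not PowerUpR’s internal routine):

```r
# Minimal sketch: MDES for the 2L-MSIRT with random site effects (design a)
rho2 <- .18; omega2 <- .25^2 / .18; r21 <- .38
p <- .60; n <- 50; J <- 30; g2 <- 0
M <- qt(.975, J - g2 - 1) + qt(.80, J - g2 - 1)  # multiplier, df = 29
se <- sqrt(rho2 * omega2 / J +                   # no L2 covariates: R2_T = 0
             (1 - rho2) * (1 - r21) / (p * (1 - p) * J * n))
round(M * se, 3)
## [1] 0.171
```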
d.b <- mdes.bira2r1(rho2 = 0.18, omega2 = 0,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
##
## Minimum detectable effect size:
## ---------------------------------------
## 0.109 95% CI [0.032,0.186]
## ---------------------------------------
## Degrees of freedom: 29
## Standardized standard error: 0.038
## Type I error rate: 0.05
## Type II error rate: 0.2
## Two-tailed test: TRUE
extract_mdes <- function(design) {
data.frame(mdes = design$mdes[1],
ci.lb = design$mdes[2],
ci.ub = design$mdes[3])
}
p <- map_df(list(d.a = d.a1, d.b = d.b), extract_mdes, .id = "design") %>%
ggplot() +
geom_pointrange(aes(x = design, y = mdes, ymin = ci.lb, ymax = ci.ub),
position = position_dodge(width = 0.5), col = "#2c3e50") +
scale_x_discrete(label = c(bquote(tau[T[L2]]~"= .25"), bquote(tau[T[L2]]~"= .0")))+
scale_y_continuous("MDES", breaks = seq(0, 0.3, 0.05), limits = c(0, 0.3))+
theme_minimal() +
theme(axis.title.x = element_blank(),
legend.title = element_blank(),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)) +
ggtitle("MDES (with 95% CI) by Design")
p
Research Team 5 would like to conduct a 3L-MSCRT to study the effects of a new teaching method involving learning software developed to enhance grade 9 students’ English listening comprehension skills in the academic track. In this 3L-MSCRT, students (L1) are nested within classrooms (L2), and classrooms are nested within schools (L3); randomization/treatment assignment occurs at L2 subclusters within L3 sites (i.e., entire classrooms within schools serving as sites are randomly assigned to experimental conditions). Team 5 plans to sample 4 classrooms with 20 students each per school. The researchers consider an intervention effect of d = 0.10 policy-relevant. Since the goal of Team 5 is to generalize the study findings to the population of German academic track schools beyond those sampled for their MSCRT, they treat the school (site) effects as random. Team 5 assumes that the effect size variability equals τ2TL3 = .15^2. In a first step, the researchers want to know how many schools would have to be sampled in the absence of covariates (design (a)). Moreover, they wonder how much precision improves when adjusting (b) for an English listening pretest, (c) for 4 sociodemographics (i.e., students’ gender, migration background, parents’ educational attainment, and families’ HISEI), and (d) for both covariate sets combined. Opting for a conservative approach, Team 5 assumes that covariates will explain considerably less variability in the intervention effect across schools than in between-school achievement differences (namely, one tenth as much). Following this rationale and given reliable empirical estimates of R2L3 for a pretest/sociodemographics/their combination of .90/.87/.97, Team 5 assumes that the proportion of τ2TL3 explained by covariates equals R2TL3 = .09/.09/.10.
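Team 5’s one-tenth rationale for the cross-site effect variability translates into a one-liner; a minimal sketch:

```r
# Minimal sketch of Team 5's rationale: covariates are assumed to explain
# one tenth as much of the cross-site effect variability as of the
# between-school achievement differences (R2 at L3 for pretest/ses/both)
round(c(.90, .87, .97) / 10, 2)
## [1] 0.09 0.09 0.10
```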
Which data frame contains the appropriate design parameters?
# -- choose data frame B5
dp_s5 <- multides1[["B5_Academic"]] %>%
# filter English Listening in grade 9
filter(grepl("English", domain), grepl("Listening", subdomain), grade == 9) %>%
# since there are multiple estimates for that very same subdomain in grade 9
# (multiple time points: beginning of grade 9 and end of grade 9),
# we use the meta-analytically pooled values
filter(grepl("Pooled", study))
Have a look at the design parameters.
dp_s5 %>%
# select ICCs and R-squared values
select(contains(c("icc", "r2"))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive")) %>%
scroll_box(height = "350px")
Variable | Value |
---|---|
icc_l2.est | 0.19 |
icc_l2.se | 0.02 |
icc_l3.est | 0.07 |
icc_l3.se | 0.02 |
r2_l1_pretest.est | 0.11 |
r2_l1_pretest.se | 0.01 |
r2_l2_pretest.est | 0.57 |
r2_l2_pretest.se | 0.05 |
r2_l3_pretest.est | 0.90 |
r2_l3_pretest.se | 0.04 |
r2_l1_ses.est | 0.01 |
r2_l1_ses.se | 0.00 |
r2_l2_ses.est | 0.72 |
r2_l2_ses.se | 0.07 |
r2_l3_ses.est | 0.87 |
r2_l3_ses.se | 0.04 |
r2_l1_pretestses.est | 0.12 |
r2_l1_pretestses.se | 0.01 |
r2_l2_pretestses.est | 0.87 |
r2_l2_pretestses.se | 0.04 |
r2_l3_pretestses.est | 0.97 |
r2_l3_pretestses.se | 0.01 |
The number of schools \(K\) for a 3L-MSCRT with randomization/treatment assignment at L2 and random cross-site effects at L3 is given in Dong & Maynard (2013, p. 54) as \[ K = \left(\frac{M_{K-g^*_{L3}-2}}{MDES}\right)^2\left(\rho_{L3}\omega_{T_{L3}}(1-R^2_{T_{L3}})+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)J}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn}\right)\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 3L-MSCRT with random site effects: Number of schools K | mrss.bcra2() or
mrss.bcra3r2() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 4 | J = 4 |
Starting value for K: K0 = 10 | K0 = 10 (default) |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .19 | rho2 = dp_s5$icc_l2.est |
ICC at L3 ρL3 = .07 | rho3 = dp_s5$icc_l3.est |
Treatment effect heterogeneity at L3 ωTL3 = .15^2/.07 | omega3 = .15^2/dp_s5$icc_l3.est |
Effect size variability at L3 τ2TL3 = .15^2 | esv3 = .15^2 |
Design (a) | |
Explained variance at L1 R2L1 = 0 | r21 = 0 (default) |
Explained variance at L2 R2L2 = 0 | r22 = 0 (default) |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = 0 | r2t3 = 0 (default) |
Design (b) | |
Explained variance at L1 R2L1 = .11 | r21 = dp_s5$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .57 | r22 = dp_s5$r2_l2_pretest.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .09 | r2t3 = .09 |
Design (c) | |
Explained variance at L1 R2L1 = .01 | r21 = dp_s5$r2_l1_ses.est |
Explained variance at L2 R2L2 = .72 | r22 = dp_s5$r2_l2_ses.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .09 | r2t3 = .09 |
Design (d) | |
Explained variance at L1 R2L1 = .12 | r21 = dp_s5$r2_l1_pretestses.est |
Explained variance at L2 R2L2 = .87 | r22 = dp_s5$r2_l2_pretestses.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .10 | r2t3 = .10 |
Further parameters | PowerUpR conversion |
---|---|
All designs | |
Minimum detectable effect size MDES = .10 | es = .10 |
Design (a) | |
Number of covariates at L3 g*L3 = 0 | g3 = 0 |
Design (b) | |
Number of covariates at L3 g*L3 = 1 | g3 = 1 |
Design (c) | |
Number of covariates at L3 g*L3 = 4 | g3 = 4 |
Design (d) | |
Number of covariates at L3 g*L3 = 5 | g3 = 5 |
PowerUpR
# -- choose data frame B5
dp_s5 <- multides1[["B5_Academic"]] %>%
# filter English Listening in grade 9
filter(grepl("English", domain), grepl("Listening", subdomain), grade == 9) %>%
# since there are multiple estimates for that very same subdomain in grade 9
# (multiple time points: beginning of grade 9 and end of grade 9),
# we use the meta-analytically pooled values
filter(grepl("Pooled", study))
# -- use treatment effect heterogeneity argument `omega3` (equivalent to the 'PowerUp!' Excel tool)
d.a <- mrss.bcra3r2(es = .10, rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
omega3 = .15^2/dp_s5$icc_l3.est,
n = 20, J = 4)
## K = 200
# -- use effect size variability argument `esv3` (not available in the 'PowerUp!' Excel tool)
d.a <- mrss.bcra3r2(es = .10, rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4)
## K = 200
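The closed-form expression for K can again be iterated by hand. This is a minimal sketch with the ICC estimates rounded to two decimals as shown in the table above; because of this rounding, it lands just below the exact result of K = 200.

```r
# Minimal sketch: fixed-point iteration for K in the 3L-MSCRT, using the
# rounded ICC estimates from the table above (rho2 = .19, rho3 = .07)
es <- .10; p <- .50; n <- 20; J <- 4; g3 <- 0
rho2 <- .19; rho3 <- .07; esv3 <- .15^2
B <- esv3 +                       # rho3 * omega3 * (1 - R2_T) = esv3 here
  rho2 / (p * (1 - p) * J) +
  (1 - rho3 - rho2) / (p * (1 - p) * J * n)
K <- 10  # starting value K0
for (i in 1:50) {
  M <- qt(.975, K - g3 - 2) + qt(.80, K - g3 - 2)
  K <- ceiling((M / es)^2 * B)
}
K
## [1] 198
```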
d.b <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 1,
r21 = dp_s5$r2_l1_pretest.est, r22 = dp_s5$r2_l2_pretest.est, r2t3 = .09)
## K = 109
d.c <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 4,
r21 = dp_s5$r2_l1_ses.est, r22 = dp_s5$r2_l2_ses.est, r2t3 = .09)
## K = 89
d.d <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 5,
r21 = dp_s5$r2_l1_pretestses.est, r22 = dp_s5$r2_l2_pretestses.est, r2t3 = .09)
## K = 63
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c, d.d) %>%
set_names("d.a", "d.b", "d.c", "d.d")
delta <- function(design) (design[["K"]]-d.a[["K"]])/d.a[["K"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["K"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c", "d.d"),
labels = c("The baseline design without covariates",
"... adjusting for a pretest",
" ... adjusting for sociodemographics",
" ... adjusting for both a pretest and sociodemographics")),
" requires K = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .10 by Design")
ggplotly(p, tooltip = "text")
if(!require(devtools)) {
install.packages("devtools"); require(devtools)}
session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.0 (2022-04-22 ucrt)
## os Windows 10 x64 (build 22000)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate German_Germany.utf8
## ctype German_Germany.utf8
## tz Europe/Berlin
## date 2022-05-30
## pandoc 2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
## backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
## brio 1.1.3 2021-11-30 [1] CRAN (R 4.2.0)
## broom 0.8.0 2022-04-13 [1] CRAN (R 4.2.0)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.2.0)
## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.2.0)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.0)
## cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
## colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
## crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
## crosstalk 1.2.0 2021-11-04 [1] CRAN (R 4.2.0)
## data.table 1.14.2 2021-09-27 [1] CRAN (R 4.2.0)
## DBI 1.1.2 2021-12-20 [1] CRAN (R 4.2.0)
## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.2.0)
## desc 1.4.1 2022-03-06 [1] CRAN (R 4.2.0)
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.2.0)
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
## dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
## evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
## fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.2.0)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
## fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
## generics 0.1.2 2022-01-31 [1] CRAN (R 4.2.0)
## ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
## glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
## haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.0)
## highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
## hms 1.1.1 2021-09-26 [1] CRAN (R 4.2.0)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
## htmlwidgets * 1.5.4 2021-09-08 [1] CRAN (R 4.2.0)
## httr 1.4.3 2022-05-04 [1] CRAN (R 4.2.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
## jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.2.0)
## kableExtra * 1.3.4 2021-02-20 [1] CRAN (R 4.2.0)
## klippy 0.0.0.9500 2022-05-18 [1] Github (rlesur/klippy@378c247)
## knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.2.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.2.0)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
## lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
## pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
## pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
## pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.2.0)
## plotly * 4.10.0 2021-10-09 [1] CRAN (R 4.2.0)
## PowerUpR * 1.1.0 2021-10-25 [1] CRAN (R 4.2.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
## processx 3.5.3 2022-03-25 [1] CRAN (R 4.2.0)
## ps 1.7.0 2022-04-23 [1] CRAN (R 4.2.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
## readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
## readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.0)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.0)
## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
## rlang 1.0.2 2022-03-04 [1] CRAN (R 4.2.0)
## rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
## rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.0)
## sass 0.4.1 2022-03-23 [1] CRAN (R 4.2.0)
## scales 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
## svglite 2.1.0 2022-02-03 [1] CRAN (R 4.2.0)
## systemfonts 1.0.4 2022-02-11 [1] CRAN (R 4.2.0)
## testthat 3.1.4 2022-04-26 [1] CRAN (R 4.2.0)
## tibble * 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
## tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
## tufte 0.12 2022-01-27 [1] CRAN (R 4.2.0)
## tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
## usethis * 2.1.5 2021-12-09 [1] CRAN (R 4.2.0)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
## vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
## viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.2.0)
## webshot 0.5.3 2022-04-14 [1] CRAN (R 4.2.0)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
## xfun 0.30 2022-03-02 [1] CRAN (R 4.2.0)
## xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.0)
## yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
##
## [1] C:/Users/Sophie Stallasch/AppData/Local/R/win-library/4.2
## [2] C:/Program Files/R/R-4.2.0/library
##
## ──────────────────────────────────────────────────────────────────────────────
© 2022 stallasch[at]uni-potsdam.de