PowerUpR
This document accompanies the workshop on cluster randomized trials led
by Larry V. Hedges, Sophie E. Stallasch & Martin Brunner at the
University of Potsdam on June 9-10, 2022. It provides the corresponding
R code for the power analysis exercises done during the workshop with
the ‘PowerUp!’ Excel tool by Dong & Maynard (2013), which is available
on the “Causal Evaluation” website. Here, we use the R implementation of
‘PowerUp!’, namely the PowerUpR
package built by Bulus et al. (2021).
You can install the latest version of PowerUpR
(currently, this is version 1.1.0) from CRAN. Throughout this tutorial,
we will also use several tidyverse
(Wickham et al., 2019) functions, as well as the
kableExtra
(Zhu, 2021)
package, so make sure you have installed these packages (for details on
the versions used here, see the ‘Session Info’ section at the end of
this document).
# -- load packages (or install from CRAN if not installed)
packages <- c("PowerUpR", "tidyverse", "kableExtra")
invisible(
  lapply(packages, function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      require(x, character.only = TRUE)
    }
  })
)
To plan rigorous intervention studies on student achievement with sufficient statistical power, reliable estimates of multilevel design parameters are critical. These design parameters reflect the (clustered) variance structure of the outcome, including intraclass correlation coefficients ρ that quantify achievement differences between clusters and R2 values that quantify the variance explained by covariates at the various levels. Design parameters should match the target population of the intervention, the specific hierarchical data structure, and the achievement outcome under investigation.
There are multiple sources of empirical values for ρ and R2 for student achievement. A review of existing international and German research on design parameters can be found in Stallasch et al. (2021). One useful collection of estimates for the United States is, for instance, the “Online Intraclass Correlation Database” created by Hedges and colleagues and hosted by the Institute for Policy Research at Northwestern University.
For the majority of workshop exercises, we draw on the compilation of ρ and R2 values (and corresponding standard errors) published in Stallasch et al. (2021). Based on three longitudinal German large-scale assessments (NEPS, PISA-I+, and DESI) that provide achievement data across the entire school career (grades 1 to 12), the authors generated design parameters that apply to
R2 values at each level are available for three covariate sets: (a) pretest scores, (b) sociodemographic characteristics (comprising students’ gender and migration background, as well as parents’ educational attainment, and families’ HISEI), and (c) the combination thereof.
The design parameters are provided via an interactive Excel file (Supplemental Online Material B) that comes with a detailed introduction on the application scopes of the various sets of estimates. This document can be downloaded from the OSF or the Journal’s website.
To facilitate the workflow in R (e.g., to avoid time-consuming and error-prone copy-and-pasting of estimates), an .rda file that contains the full compilation of design parameters (as a list of data frames) is shared in this course’s repository on GitHub and can be loaded directly:
# load design parameters from github
load(url("https://github.com/sophiestallasch/2022-workshop-CRT/blob/main/data/multides1.rda?raw=true"))
If you run into problems, download the data file from the repository to your local machine and then load it into R.
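For the download-then-load route, a sketch along the following lines may help; it assumes the same raw URL as in the code chunk above and writes the file to the current working directory.

```r
# Alternative: download the .rda file first, then load it from disk
url <- "https://github.com/sophiestallasch/2022-workshop-CRT/blob/main/data/multides1.rda?raw=true"
download.file(url, destfile = "multides1.rda", mode = "wb")  # binary mode for .rda
load("multides1.rda")
```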
# inspect list
summary(multides1)
## Length Class Mode
## B1_General 28 tbl_df list
## B2_General_ND 14 tbl_df list
## B3_Adjusted 28 tbl_df list
## B4_Adjusted_ND 14 tbl_df list
## B5_Academic 28 tbl_df list
## B6_Academic_ND 14 tbl_df list
## B7_Non-Academic 28 tbl_df list
## B8_Non-Academic_ND 14 tbl_df list
## B9_General-2l 20 tbl_df list
## B10_General-2l_ND 10 tbl_df list
## B11_Adjusted-2l 20 tbl_df list
## B12_Adjusted-2l_ND 10 tbl_df list
## B13_Academic-2l 20 tbl_df list
## B14_Academic-2l_ND 10 tbl_df list
## B15_Non-Academic-2l 20 tbl_df list
## B16_Non-Academic-2l_ND 10 tbl_df list
The list contains 16 data frames that can be grouped into two broad classes:
Point Estimates and Standard Errors
Data frames B1, B3, B5, B7, B9, B11, B13, and B15 contain the full sets
of (population-specific) empirical estimates of design parameters for
each domain, subdomain, and grade, along with their standard
errors.
Normative Distributions
Data frames B2, B4, B6, B8, B10, B12, B14, and B16 (i.e., data frames
ending with “_ND”) contain (population-specific) normative distributions
(i.e., minimum, 25th percentile, median, 75th percentile, and maximum)
of those design parameters summarized across domains and/or grades.
These distributions can serve as guesstimates to plan studies whose
target domain and/or grade is not covered in Stallasch et al. (2021).
Data frame | Description | Grades | Scope of application | Target design | Hierarchical structure: L1 | L2 | L3 |
---|---|---|---|---|---|---|---|
B1_General | Design parameters for the general (total) student population | 1-10 | Nationwide/ across all school types | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B2_General_ND | Normative distributions of B1 | 1-10 | Nationwide/ across all school types; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B3_Adjusted | Design parameters for the general (total) student population, adjusted for mean-level differences between school types | 5-10 | 1 school type in the non-academic track | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B4_Adjusted_ND | Normative distributions of B3 | 5-10 | 1 school type in the non-academic track; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B5_Academic | Design parameters for the academic track | 5-10 | Academic track school (“Gymnasium”) | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B6_Academic_ND | Normative distributions of B5 | 5-10 | Academic track school (“Gymnasium”); target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B7_Non-Academic | Design parameters for the non-academic track | 5-10 | 2+ different school types in the non-academic track | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B8_Non-Academic_ND | Normative distributions of B7 | 5-10 | 2+ different school types in the non-academic track; target domain and/or grade not covered | 3L-(MS)CRT | Students | Classrooms | Schools |
11-12 | 2L-MSIRT/CRT | Students | - | Schools | |||
B9_General-2l | Design parameters for the general (total) student population; for two-level designs | 1-10 | Nationwide/ across all school types | 2L-MSIRT/CRT | Students | - | Schools |
B10_General-2l_ND | Normative distributions of B9 | 1-10 | Nationwide/ across all school types; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B11_Adjusted-2l | Design parameters for the general (total) student population, adjusted for mean-level differences between school types; for two-level designs | 5-10 | 1 school type in the non-academic track | 2L-MSIRT/CRT | Students | - | Schools |
B12_Adjusted-2l_ND | Normative distributions of B11 | 5-10 | 1 school type in the non-academic track; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B13_Academic-2l | Design parameters for the academic track; for two-level designs | 5-10 | Academic track school (“Gymnasium”) | 2L-MSIRT/CRT | Students | - | Schools |
B14_Academic-2l_ND | Normative distributions of B13 | 5-10 | Academic track school (“Gymnasium”); target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
B15_Non-Academic-2l | Design parameters for the non-academic track; for two-level designs | 5-10 | 2+ different school types in the non-academic track | 2L-MSIRT/CRT | Students | - | Schools |
B16_Non-Academic-2l_ND | Normative distributions of B15 | 5-10 | 2+ different school types in the non-academic track; target domain and/or grade not covered | 2L-MSIRT/CRT | Students | - | Schools |
Further, Stallasch et al. (2021) created a
flow chart to guide researchers through the choice of appropriate design
parameters as a function of key characteristics of the target
intervention.
Data frames that contain the point estimates and standard errors (i.e., B1, B3, B5, B7, B9, B11, B13, and B15) are structured as follows.
Variable | Description |
---|---|
domain | Domain of achievement outcome (for details, see section ‘Achievement Domains’) |
subdomain | Subdomain of achievement outcome (for details, see section ‘Achievement Domains’) |
grade | Grade of achievement outcome |
study | Large-scale assessment study (and cohort): ‘NEPS-SC2’, ‘NEPS-SC3’, ‘NEPS-SC4’, ‘PISA-I+’, ‘DESI’ |
wave | Wave of large-scale assessment study |
model | Hierarchical structure of specified multilevel model: ‘3l’, ‘2l’ |
icc_l2.est | ICC at classroom level ρL2 |
icc_l2.se | SE of ICC at classroom level SE(ρL2) |
icc_l3.est | ICC at school level ρL3 |
icc_l3.se | SE of ICC at school level SE(ρL3) |
r2_l1_pretest.est | Explained variance by a pretest at student level R2L1 |
r2_l1_pretest.se | SE of explained variance by a pretest at student level SE(R2L1) |
r2_l2_pretest.est | Explained variance by a pretest at classroom level R2L2 |
r2_l2_pretest.se | SE of explained variance by a pretest at classroom level SE(R2L2) |
r2_l3_pretest.est | Explained variance by a pretest at school level R2L3 |
r2_l3_pretest.se | SE of explained variance by a pretest at school level SE(R2L3) |
r2_l1_ses.est | Explained variance by sociodemographics at student level R2L1 |
r2_l1_ses.se | SE of explained variance by sociodemographics at student level SE(R2L1) |
r2_l2_ses.est | Explained variance by sociodemographics at classroom level R2L2 |
r2_l2_ses.se | SE of explained variance by sociodemographics at classroom level SE(R2L2) |
r2_l3_ses.est | Explained variance by sociodemographics at school level R2L3 |
r2_l3_ses.se | SE of explained variance by sociodemographics at school level SE(R2L3) |
r2_l1_pretestses.est | Explained variance by a pretest and sociodemographics at student level R2L1 |
r2_l1_pretestses.se | SE of explained variance by a pretest and sociodemographics at student level SE(R2L1) |
r2_l2_pretestses.est | Explained variance by a pretest and sociodemographics at classroom level R2L2 |
r2_l2_pretestses.se | SE of explained variance by a pretest and sociodemographics at classroom level SE(R2L2) |
r2_l3_pretestses.est | Explained variance by a pretest and sociodemographics at school level R2L3 |
r2_l3_pretestses.se | SE of explained variance by a pretest and sociodemographics at school level SE(R2L3) |
Note that three-level design parameters (students at L1 within classrooms at L2 within schools at L3) were estimated for grades 1 to 10 only. For grades 11 to 12, no L2 estimates are available because 11th and 12th graders did not attend intact classrooms; instead, the grouping of students varied with the subject taught. Two-level design parameters (students at L1 within schools at L3) were therefore estimated for these grades. Two-level equivalents for grades 1 to 10 (i.e., ignoring classroom-level clustering) are also provided; you can access them in the data frames labeled with ‘-2l’. Keep in mind that the top level (which is always the school level) is consistently indicated by ‘_l3’ across all data frames, irrespective of whether L2 estimates were computed (i.e., also for two-level designs). Detailed information on the provided design parameters and the underlying analysis models can be found in Stallasch et al. (2021).
Data frames that contain normative distributions (i.e., B2, B4, B6, B8, B10, B12, B14, and B16) are structured as follows.
Variable | Description |
---|---|
domain | Domain of summarized parameters (for details, see section ‘Achievement Domains’) |
grade_range | Grade range of summarized parameters |
statistic | Summary statistic: ‘Minimum’, ‘25th Percentile’, ‘Median’, ‘75th Percentile’, ‘Maximum’ |
icc_l2.est | ICC at classroom level ρL2 |
icc_l2.se | SE of ICC at classroom level SE(ρL2) |
icc_l3.est | ICC at school level ρL3 |
icc_l3.se | SE of ICC at school level SE(ρL3) |
r2_l1_pretest.est | Explained variance by a pretest at student level R2L1 |
r2_l1_pretest.se | SE of explained variance by a pretest at student level SE(R2L1) |
r2_l2_pretest.est | Explained variance by a pretest at classroom level R2L2 |
r2_l2_pretest.se | SE of explained variance by a pretest at classroom level SE(R2L2) |
r2_l3_pretest.est | Explained variance by a pretest at school level R2L3 |
r2_l3_pretest.se | SE of explained variance by a pretest at school level SE(R2L3) |
r2_l1_ses.est | Explained variance by sociodemographics at student level R2L1 |
r2_l1_ses.se | SE of explained variance by sociodemographics at student level SE(R2L1) |
r2_l2_ses.est | Explained variance by sociodemographics at classroom level R2L2 |
r2_l2_ses.se | SE of explained variance by sociodemographics at classroom level SE(R2L2) |
r2_l3_ses.est | Explained variance by sociodemographics at school level R2L3 |
r2_l3_ses.se | SE of explained variance by sociodemographics at school level SE(R2L3) |
r2_l1_pretestses.est | Explained variance by a pretest and sociodemographics at student level R2L1 |
r2_l1_pretestses.se | SE of explained variance by a pretest and sociodemographics at student level SE(R2L1) |
r2_l2_pretestses.est | Explained variance by a pretest and sociodemographics at classroom level R2L2 |
r2_l2_pretestses.se | SE of explained variance by a pretest and sociodemographics at classroom level SE(R2L2) |
r2_l3_pretestses.est | Explained variance by a pretest and sociodemographics at school level R2L3 |
r2_l3_pretestses.se | SE of explained variance by a pretest and sociodemographics at school level SE(R2L3) |
The design parameters cover the following achievement domains.
Domain | Subdomain |
---|---|
Mathematics | Mathematics |
Science | Science |
Verbal Skills in German (as First Language) | Reading Comprehension |
Reading Speed | |
Spelling | |
Grammar | |
Vocabulary | |
Writing | |
Argumentation | |
Listening Comprehension | |
Verbal Skills in English (as Foreign Language) | Reading Comprehension |
Text Reconstruction (C-Test) | |
Language Awareness: Sociopragmatics | |
Language Awareness: Grammar | |
Writing | |
Listening Comprehension | |
Domain-General Achievement | Declarative Metacognition |
ICT Literacy | |
Problem Solving | |
Basic Cognitive Functions: Perception Speed | |
Basic Cognitive Functions: Reasoning |
Note that the (sub)domain character strings in the data frames are named
exactly like shown here. For instance, if we want to filter design
parameters for English text reconstruction from data frame B1, we could
use the following code.
# filter English text reconstruction from data frame B1
eng_ctest <- multides1[["B1_General"]] %>%
filter(domain == "Verbal Skills in English (as Foreign Language)",
subdomain == "Text Reconstruction (C-Test)")
Tip: Avoid typing out the full strings. We
recommend using pattern-matching functions from the
grep()
family instead, which will do the job for you.
# better:
eng_ctest <- multides1[["B1_General"]] %>%
filter(grepl("Eng", domain),
grepl("C-Test", subdomain))
Let’s have a look at the filtered design parameters.
eng_ctest %>%
t() %>% # transpose for better readability
kbl() %>%
kable_styling(bootstrap_options = c("condensed", "responsive")) %>%
scroll_box(height = "350px")
domain | Verbal Skills in English (as Foreign Language) | Verbal Skills in English (as Foreign Language) | Verbal Skills in English (as Foreign Language) |
subdomain | Text Reconstruction (C-Test) | Text Reconstruction (C-Test) | Text Reconstruction (C-Test) |
grade | 9 | 9 | 9 |
study | DESI | DESI | DESI (Pooled) |
wave | 1 | 2 | NA |
model | 3l | 3l | 3l |
icc_l2.est | 0.07484727 | 0.09329552 | 0.08274759 |
icc_l2.se | 0.009634811 | 0.011132797 | 0.007285329 |
icc_l3.est | 0.5839861 | 0.5519840 | 0.5687772 |
icc_l3.se | 0.01762045 | 0.01851544 | 0.01276421 |
r2_l1_pretest.est | NA | 0.4270869 | 0.4270869 |
r2_l1_pretest.se | NA | 0.007683032 | 0.007683032 |
r2_l2_pretest.est | NA | 0.8440895 | 0.8440895 |
r2_l2_pretest.se | NA | 0.01575307 | 0.01575307 |
r2_l3_pretest.est | NA | 0.998872 | 0.998872 |
r2_l3_pretest.se | NA | 0.0006166911 | 0.0006166911 |
r2_l1_ses.est | 0.01315652 | 0.01675238 | 0.01483982 |
r2_l1_ses.se | 0.002402910 | 0.002561325 | 0.001752443 |
r2_l2_ses.est | 0.6742777 | 0.6423608 | 0.6577857 |
r2_l2_ses.se | 0.09820881 | 0.09497836 | 0.06827333 |
r2_l3_ses.est | 0.8845148 | 0.9068157 | 0.8968653 |
r2_l3_ses.se | 0.01972262 | 0.01770283 | 0.01317418 |
r2_l1_pretestses.est | NA | 0.4306028 | 0.4306028 |
r2_l1_pretestses.se | NA | 0.007663661 | 0.007663661 |
r2_l2_pretestses.est | NA | 0.8838199 | 0.8838199 |
r2_l2_pretestses.se | NA | 0.01990558 | 0.01990558 |
r2_l3_pretestses.est | NA | 0.9993725 | 0.9993725 |
r2_l3_pretestses.se | NA | 0.000219044 | 0.000219044 |
As can be seen, there are three entries for English text reconstruction
as the respective test was administered to students at two time points
in the DESI study, namely at the beginning and at the end of grade 9.
For the beginning of grade 9, no pretests were available, therefore the
corresponding cells are set to NA
. A third set of estimates
is provided here that contains the meta-analytically pooled results
across these two time points (as indicated by ‘DESI (Pooled)’). Note
that this integration strategy (applying a fixed effect model approach)
was also adopted for other domains in grade 9 in case multiple design
parameters were available (as obtained either from several studies [as
indicated by ‘All (Pooled)’] or from the two time points in DESI).
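To illustrate, the fixed-effect pooling can be reproduced by hand via inverse-variance weighting, where each estimate is weighted by the reciprocal of its squared standard error. The sketch below uses the two DESI wave estimates of the classroom-level ICC shown in the table above:

```r
# Fixed-effect (inverse-variance) pooling of the two DESI wave estimates
# of the classroom-level ICC (values taken from the table above)
est <- c(0.07484727, 0.09329552)   # wave 1, wave 2
se  <- c(0.009634811, 0.011132797)
w   <- 1 / se^2                    # inverse-variance weights
pooled_est <- sum(w * est) / sum(w)
pooled_se  <- sqrt(1 / sum(w))
round(c(pooled_est, pooled_se), 7)
# reproduces the 'DESI (Pooled)' column: 0.0827476, 0.0072853
```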
The following three exercises cover power analysis for cluster randomized trials (CRTs). In a two-level CRT (2L-CRT; see Scenario 1) students at L1 are nested within schools at L2 and entire schools are randomly assigned to the experimental conditions (i.e., in a two-arm study either the treatment or control condition). In a three-level CRT (3L-CRT; see Scenarios 2 and 3) students at L1 are nested within classrooms at L2 which are, in turn, nested within schools at L3. Here, randomization/treatment assignment occurs, again, at the school level L3.
Research Team 1 would like to conduct a 2L-CRT on the effectiveness of a school-wide intervention to improve 4th graders’ mathematical achievement in Germany. Team 1 plans to sample 40 students from each school. To improve statistical power, the researchers administer a math pretest to each participating student. How many schools are required to detect a standardized intervention effect of d = .15?
Which data frame contains the appropriate design parameters?
# -- choose data frame B9
dp_s1 <- multides1[["B9_General-2l"]] %>%
# filter mathematics in grade 4
filter(domain == "Mathematics", grade == 4)
Let’s have a look at the relevant design parameters.
dp_s1 %>%
# select ICCs and R-squared values for pretests
select(contains(c("icc", "pretest."))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
icc_l3.est | 0.12 |
icc_l3.se | 0.01 |
r2_l1_pretest.est | 0.40 |
r2_l1_pretest.se | 0.01 |
r2_l3_pretest.est | 0.64 |
r2_l3_pretest.se | 0.03 |
Based on the expression for the MDES of a 2L-CRT with randomization/treatment assignment at L2 as given in Dong & Maynard (2013, p. 51), the number of schools \(J\) is computed as \[ J = \left(\frac{M_{J-g^*_{L2}-2}}{MDES}\right)^2\left(\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)}+\frac{(1-\rho_{L2})(1-R^2_{L1})}{P(1-P)n} \right)\] with \(M_{J-g^*_{L2}-2} = t_{\alpha/2} + t_{1-\beta}\) (for a two-tailed test) denoting the multiplier based on the t distribution with \(J-g^*_{L2}-2\) degrees of freedom, \(g^*_{L2}\) the number of covariates at L2, \(P\) the proportion of the sample assigned to treatment, and \(n\) the number of students per school.
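This expression can also be solved by hand. The sketch below iterates the formula to convergence, using the rounded design parameters from the table above (ρ = .12, R²L1 = .40, R²L2 = .64) and the defaults used throughout (two-tailed test, α = .05, power = .80, P = .50); because J appears on both sides via the degrees of freedom, the computation is iterative:

```r
# Hand-rolled iterative solver for J (a sketch, not part of PowerUpR)
mdes <- .15; n <- 40; p <- .50; alpha <- .05; power <- .80
rho2 <- .12; r21 <- .40; r22 <- .64; g2 <- 1

J <- 8                                          # starting value J0
for (i in 1:100) {
  df <- J - g2 - 2
  M  <- qt(1 - alpha / 2, df) + qt(power, df)   # two-tailed multiplier
  J_new <- (M / mdes)^2 *
    (rho2 * (1 - r22) / (p * (1 - p)) +
     (1 - rho2) * (1 - r21) / (p * (1 - p) * n))
  if (abs(J_new - J) < 1e-10) break
  J <- J_new
}
ceiling(J)
# agrees with mrss.cra2() run on the rounded inputs (J = 81)
```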
The initial design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 2L-CRT: Number of schools J | mrss.cra2() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two-tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 40 | n = 40 |
Starting value for J, J0 = 8 | J0 = 8 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
ICC at L2 ρL2 = .12 | rho2 = dp_s1$icc_l3.est |
Explained variance at L1 R2L1 = .40 | r21 = dp_s1$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .64 | r22 = dp_s1$r2_l3_pretest.est |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L2 g*L2 = 1 | g2 = 1 |
Minimum detectable effect size MDES = .15 | es = .15 |
PowerUpR
# -- choose data frame B9
dp_s1 <- multides1[["B9_General-2l"]] %>%
# filter mathematics in grade 4
filter(domain == "Mathematics", grade == 4)
d.a <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 80
Given the assumptions, at least 80 schools are necessary to achieve an MDES of .15.
Of course, it is also possible to insert the numeric values for the design parameters directly, but using the variables is safer and more precise, and it prevents typos. Note that the results can differ due to rounding.
d.a1 <- mrss.cra2(es = .15, n = 40, rho2 = .12,
g2 = 1, r21 = .40, r22 = .64)
## J = 81
Everything else being equal (and using the design with a math pretest as baseline), what happens when…
# adjust the power argument to .90
d.b <- mrss.cra2(power = .90, es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 107
Tip: We could also simply update the respective
parameter in the $parms
list stored in the
mrss
object and then call the function again with the
modified arguments.
# save design (a) as a new design
d.b <- d.a
# update power argument to .90
d.b$parms$power = .90
# call the function
d.b <- exec("mrss.cra2", !!!d.b$parms)
## J = 107
Team 1 wants to explore how precision can be improved if they additionally control for key sociodemographics, such as students’ gender and migration background, as well as parents’ educational attainment and the families’ HISEI, (1) at L1 only, and (2) at both L1 and L2.
Let’s have a look at the relevant design parameters first.
dp_s1 %>%
# select R-squared values for the combination of pretest and sociodemographics
select(contains("pretestses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_pretestses.est | 0.43 |
r2_l1_pretestses.se | 0.01 |
r2_l3_pretestses.est | 0.76 |
r2_l3_pretestses.se | 0.03 |
# add 4 sociodemographics at L1 only: adjust only r21
d.c1 <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 1, r21 = dp_s1$r2_l1_pretestses.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 80
# add 4 sociodemographics at both L1 and L2
d.c2 <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est,
g2 = 5, r21 = dp_s1$r2_l1_pretestses.est, r22 = dp_s1$r2_l3_pretestses.est)
## J = 60
# -- set all covariate-specific arguments to zero
# this can be achieved by **not** specifying the covariate-specific arguments
# (defaults: g2 = 0, r21 = 0, r22 = 0)
d.d <- mrss.cra2(es = .15, n = 40, rho2 = dp_s1$icc_l3.est)
## J = 199
# adjust rho2 to .2
d.e <- mrss.cra2(es = .15, n = 40, rho2 = .20,
g2 = 1, r21 = dp_s1$r2_l1_pretest.est, r22 = dp_s1$r2_l3_pretest.est)
## J = 119
The design parameters are estimates of population quantities and are
thus associated with statistical uncertainty. This uncertainty can be
incorporated by using the standard error of each design parameter.
Assuming large-sample properties, Team 1 uses the standard
normal distribution to compute the limits of 95% confidence intervals
from these standard errors.
A conservative approach for planning the sample size is to use the upper
bound estimate of the 95% CI for \(\rho\), and the lower bound estimates for
the R2 values.
d.f <- mrss.cra2(es = .15, n = 40,
rho2 = dp_s1$icc_l3.est+1.96*dp_s1$icc_l3.se,
g2 = 1,
r21 = dp_s1$r2_l1_pretest.est-1.96*dp_s1$r2_l1_pretest.se,
r22 = dp_s1$r2_l3_pretest.est-1.96*dp_s1$r2_l3_pretest.se)
## J = 106
Let’s have a look at the results. Move your cursor across the bars to learn the required number of schools for each design and the percentage difference compared to the baseline design with a single math pretest as covariate at both L1 and L2.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c1, d.c2, d.d, d.e, d.f) %>%
set_names("d.a", "d.b", "d.c1", "d.c2", "d.d", "d.e", "d.f")
delta <- function(design) (design[["J"]]-d.a[["J"]])/d.a[["J"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["J"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c1", "d.c2", "d.d", "d.e", "d.f"),
labels = c("The baseline design with a math pretest as covariate",
"... increasing power to 90%",
" ... adding sociodemographics at L1",
" ... adding sociodemographics at both L1 and L2",
"... using no covariates",
"... assuming increased achievement differences at L2 of 20%",
"... applying a conservative approach")),
" requires J = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .15 by Design")
ggplotly(p, tooltip = "text")
PowerUpR
also has built-in functionality for plotting.
We will quickly check out the plot()
function with design
(a). We can either plot statistical power as a function of the required
number of schools (which is the default in plot()
)…
plot(d.a, main = "Statistical Power as a Function of Number of Schools")
… or we can plot the MDES along with its (1-α)*100% CI as a function of the required number of schools.
# include `ypar = "mdes"` to plot the MDES
# include `locate=TRUE` to locate parameter values for the applied design
plot(d.a, ypar = "mdes",
main = "MDES as a Function of Number of Schools", locate = TRUE)
Research Team 2 would like to expand Research Team 1’s CRT to a three-level design, where classroom-level information is available. Team 2 plans to sample 3 classrooms with 20 students per classroom from every school and, again, to use a math pretest, administered to all students, as a covariate. The researchers are interested in the number of schools necessary to detect a standardized intervention effect of d = .25.
Which data frame contains the appropriate design parameters?
# -- choose data frame B1
dp_s2 <- multides1[["B1_General"]] %>%
# filter mathematics achievement in grade 4
filter(domain == "Mathematics", grade == 4)
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select ICCs and R-squared values for pretests
select(contains(c("icc", "pretest."))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
icc_l2.est | 0.05 |
icc_l2.se | 0.02 |
icc_l3.est | 0.10 |
icc_l3.se | 0.02 |
r2_l1_pretest.est | 0.40 |
r2_l1_pretest.se | 0.01 |
r2_l2_pretest.est | 0.35 |
r2_l2_pretest.se | 0.04 |
r2_l3_pretest.est | 0.76 |
r2_l3_pretest.se | 0.03 |
The number of schools \(K\) for a 3L-CRT with randomization/treatment assignment at L3 is given in Dong & Maynard (2013, p. 52) as \[ K = \left(\frac{M_{K-g^*_{L3}-2}}{MDES}\right)^2\left(\frac{\rho_{L3}(1-R^2_{L3})}{P(1-P)}+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)J}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn} \right)\]
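As a quick plausibility check, a candidate K can be plugged back into this expression and solved for the MDES. The sketch below uses K = 24 schools together with the rounded design parameters from the table above and the defaults used throughout (two-tailed test, α = .05, power = .80, P = .50):

```r
# Forward check: MDES implied by K = 24 schools (sketch; rounded inputs)
K <- 24; J <- 3; n <- 20; p <- .50; alpha <- .05; power <- .80; g3 <- 1
rho2 <- .05; rho3 <- .10; r21 <- .40; r22 <- .35; r23 <- .76

df <- K - g3 - 2
M  <- qt(1 - alpha / 2, df) + qt(power, df)   # two-tailed multiplier
V  <- rho3 * (1 - r23) / (p * (1 - p)) +
      rho2 * (1 - r22) / (p * (1 - p) * J) +
      (1 - rho3 - rho2) * (1 - r21) / (p * (1 - p) * J * n)
M * sqrt(V / K)
# close to the targeted MDES of .25
```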
The initial design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 3L-CRT: Number of schools K | mrss.cra3() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two-tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 3 | J = 3 |
Starting value for K, K0 = 8 | K0 = 8 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
ICC at L2 ρL2 = .05 | rho2 = dp_s2$icc_l2.est |
ICC at L3 ρL3 = .10 | rho3 = dp_s2$icc_l3.est |
Explained variance at L1 R2L1 = .40 | r21 = dp_s2$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .35 | r22 = dp_s2$r2_l2_pretest.est |
Explained variance at L3 R2L3 = .76 | r23 = dp_s2$r2_l3_pretest.est |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L3 g*L3 = 1 | g3 = 1 |
Minimum detectable effect size MDES = .25 | es = .25 |
PowerUpR
# -- choose data frame B1
dp_s2 <- multides1[["B1_General"]] %>%
# filter mathematics achievement in grade 4
filter(domain == "Mathematics", grade == 4)
d.a <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est,
rho3 = dp_s2$icc_l3.est,
g3 = 1,
r21 = dp_s2$r2_l1_pretest.est,
r22 = dp_s2$r2_l2_pretest.est,
r23 = dp_s2$r2_l3_pretest.est)
## K = 24
Everything else being equal (and using the design with the math pretest as baseline), what happens when…
Team 2 opts for a conservative approach (see Scenario 1, design f).
d.b <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est+1.96*dp_s2$icc_l2.se,
rho3 = dp_s2$icc_l3.est+1.96*dp_s2$icc_l3.se,
g3 = 1,
r21 = dp_s2$r2_l1_pretest.est-1.96*dp_s2$r2_l1_pretest.se,
r22 = dp_s2$r2_l2_pretest.est-1.96*dp_s2$r2_l2_pretest.se,
r23 = dp_s2$r2_l3_pretest.est-1.96*dp_s2$r2_l3_pretest.se)
## K = 36
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select R-squared values for the combination of pretest and sociodemographics
select(contains("pretestses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_pretestses.est | 0.42 |
r2_l1_pretestses.se | 0.01 |
r2_l2_pretestses.est | 0.83 |
r2_l2_pretestses.se | 0.03 |
r2_l3_pretestses.est | 0.83 |
r2_l3_pretestses.se | 0.03 |
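The conservative approach used below simply shifts each point estimate by 1.96 standard errors in the unfavorable direction: ICCs upward (more clustering), R² values downward (less explained variance). A minimal sketch with the R² estimates and SEs from the table above:

```r
# Minimal sketch of the conservative adjustment: shift each estimate by
# 1.96 SEs in the unfavorable direction
conservative_icc <- function(est, se) est + 1.96 * se  # ICCs: upward
conservative_r2  <- function(est, se) est - 1.96 * se  # R-squared: downward
conservative_r2(est = c(.42, .83, .83), se = c(.01, .03, .03))
## [1] 0.4004 0.7712 0.7712
```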
# -- point estimates
d.c1 <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est,
rho3 = dp_s2$icc_l3.est,
g3 = 5,
r21 = dp_s2$r2_l1_pretestses.est,
r22 = dp_s2$r2_l2_pretestses.est,
r23 = dp_s2$r2_l3_pretestses.est)
## K = 17
# -- conservative approach
d.c2 <- mrss.cra3(n = 20, J = 3, K0 = 8,
rho2 = dp_s2$icc_l2.est+1.96*dp_s2$icc_l2.se,
rho3 = dp_s2$icc_l3.est+1.96*dp_s2$icc_l3.se,
g3 = 5,
r21 = dp_s2$r2_l1_pretestses.est-1.96*dp_s2$r2_l1_pretestses.se,
r22 = dp_s2$r2_l2_pretestses.est-1.96*dp_s2$r2_l2_pretestses.se,
r23 = dp_s2$r2_l3_pretestses.est-1.96*dp_s2$r2_l3_pretestses.se)
## K = 24
Now assume that the researchers’ resources allow them to sample up to 50 schools at most. Team 2 wants to explore the levels of statistical power the study design can reach with and without 4 sociodemographics as covariates, given this sample size constraint.
Let’s have a look at the relevant design parameters.
dp_s2 %>%
# select R-squared values for sociodemographics
select(contains("_ses")) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
Variable | Value |
---|---|
r2_l1_ses.est | 0.10 |
r2_l1_ses.se | 0.01 |
r2_l2_ses.est | 0.84 |
r2_l2_ses.se | 0.03 |
r2_l3_ses.est | 0.69 |
r2_l3_ses.se | 0.05 |
To vectorize the functions of PowerUpR over a range of possible parameters (here: 8 to 50 schools), we need to write a custom function similar to what the package authors suggest in their vignette on vectorization.
custom_power.cra3 <- function(K, covariates = FALSE) {
parms <- list(rho2 = dp_s2$icc_l2.est, rho3 = dp_s2$icc_l3.est,
g3 = 0, r21 = 0, r22 = 0, r23 = 0,
n = 20, J = 3, K = K)
# to include 4 sociodemographic covariates
if(isTRUE(covariates)) {
parms <- list(rho2 = dp_s2$icc_l2.est, rho3 = dp_s2$icc_l3.est,
g3 = 4,
r21 = dp_s2$r2_l1_ses.est,
r22 = dp_s2$r2_l2_ses.est,
r23 = dp_s2$r2_l3_ses.est,
n = 20, J = 3, K = K)
}
design <- exec("power.cra3", !!!parms)
design$power[1]
}
Now we just have to map the function across the various sample sizes up to 50 schools.
# number of schools ranging from 8 to 50
K <- seq(8, 50, 1)
d.d <- map_dbl(K, custom_power.cra3) %>% set_names(K)
d.e <- map_dbl(K, ~custom_power.cra3(K = ., covariates = TRUE)) %>% set_names(K)
Let’s first have a look at the results for designs a to c (i.e., where the target outcome was the required number of schools). Move your cursor across the bars to learn about the required number of schools for each design and how large the difference in % is compared to the baseline design with a single math pretest as covariate.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c1, d.c2) %>%
set_names("d.a", "d.b", "d.c1", "d.c2")
delta <- function(design) (design[["K"]]-d.a[["K"]])/d.a[["K"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["K"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c1", "d.c2"),
labels = c("The baseline design with a math pretest as covariate",
"... applying a conservative approach",
" ... adding sociodemographics",
" ... adding sociodemographics and applying a conservative approach")),
" requires K = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .25 by Design")
ggplotly(p, tooltip = "text")
Now we plot the results for designs d and e (i.e., where the target outcome was the achieved power level). Move your cursor across the lines to learn about the achieved level of statistical power.
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
p <- data.frame(K = K, d.d = d.d, d.e = d.e) %>%
pivot_longer(d.d:d.e) %>%
mutate(name = fct_recode(name,
"No covariates" = "d.d",
"Sociodemographics as covariates" = "d.e")) %>%
ggplot(aes(x = as.numeric(K), y = value, color = name,
text = paste(
"With " , K, " schools ",
round(value*100, 0), "% Power is achieved.", sep = ""),
group = 1)) +
geom_line() +
geom_hline(yintercept = .8, lty = "dotted") +
scale_y_continuous("Statistical Power", limits = c(0,1)) +
scale_x_continuous("Number of schools", limits = c(8, 50)) +
scale_color_manual(values = c("#2c3e50", "#18bc9c")) +
theme_minimal() +
theme(axis.ticks = element_blank(),
axis.title.x = element_text(margin = margin(0.5, 0, 0, 0)),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Statistical Power as a Function of the Number of Schools")
ggplotly(p, tooltip = "text") %>%
layout(legend = list(
orientation = "h",
x = 0.2,
y = -0.2
))
Suppose that research Team 3 plans a 3L-CRT to study the impact of an intervention that is intended to affect students’ history achievement in comprehensive schools (grades 5 to 12). Due to budgetary constraints, a fixed maximum of 40 schools (with 2 classrooms of 20 students each) is at the researchers’ disposal. Given these limits, the primary concern of Team 3 is to ensure that the attainable MDES lies within the range of effects on students’ achievement that is typically observed for this kind of intervention (i.e., 0.20 ≤ d ≤ 0.30). Team 3 considers two designs: (1) a design without covariates, and (2) a design that adjusts for a pretest and 4 sociodemographics (i.e., students’ gender and migration background, parents’ educational attainment, and families’ HISEI).
Which data frame contains the appropriate design parameters?
Team 3 consults data frame B4 outlining the normative distributions across the various achievement domains to determine small (i.e., 25th percentile), medium (i.e., median), and large (i.e., 75th percentile) values of ρ and R2.
# -- choose data frame B4
dp_s3 <- multides1[["B4_Adjusted_ND"]] %>%
# filter P25, median, and P75 for design parameters that are summarized across domains and secondary school grades
filter(domain == "All", grade_range == "5-12", !grepl("Min|Max", statistic)) %>%
# select relevant design parameters
select(statistic, contains(c("icc", "pretestses"))) %>%
# simplify variable names
# (we will need this later on; this will facilitate the vectorization over the normative distributions)
set_names(c("statistic", "rho2", "rho3", "r21", "r22", "r23"))
Let’s have a look at the design parameters.
dp_s3 %>%
mutate(across(where(is.numeric), round, 2)) %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive"))
statistic | rho2 | rho3 | r21 | r22 | r23 |
---|---|---|---|---|---|
25th Percentile | 0.04 | 0.08 | 0.17 | 0.70 | 0.84 |
Median | 0.06 | 0.10 | 0.22 | 0.77 | 0.90 |
75th Percentile | 0.09 | 0.12 | 0.31 | 0.86 | 0.97 |
The \(MDES\) for a 3L-CRT with randomization/treatment assignment at L3 is given in Dong & Maynard (2013, p. 52) as \[ MDES = M_{K-g^*_{L3}-2}\sqrt{\frac{\rho_{L3}(1-R^2_{L3})}{P(1-P)K}+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)KJ}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)KJn}}\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum detectable effect size MDES of a 3L-CRT | mdes.cra3() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 2 | J = 2 |
Number of L3 units K = 40 | K = 40 |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .04/.06/.09 (small/medium/large) | rho2 = dp_s3$icc_l2.est |
ICC at L3 ρL3 = .08/.10/.12 (small/medium/large) | rho3 = dp_s3$icc_l3.est |
Design (a) | |
Explained variance at L1 R2L1 = 0 | r21 = 0 (default) |
Explained variance at L2 R2L2 = 0 | r22 = 0 (default) |
Explained variance at L3 R2L3 = 0 | r23 = 0 (default) |
Design (b) | |
Explained variance at L1 R2L1 = .17/.22/.31 (small/medium/large) | r21 = dp_s3$r2_l1_pretestses.est |
Explained variance at L2 R2L2 = .70/.77/.86 (small/medium/large) | r22 = dp_s3$r2_l2_pretestses.est |
Explained variance at L3 R2L3 = .84/.90/.97 (small/medium/large) | r23 = dp_s3$r2_l3_pretestses.est |
Further parameters | PowerUpR conversion |
---|---|
Design (a) | |
Number of covariates at L3 g*L3 = 0 | g3 = 0 (default) |
Design (b) | |
Number of covariates at L3 g*L3 = 5 | g3 = 5 |
PowerUpR
# -- choose data frame B4
dp_s3 <- multides1[["B4_Adjusted_ND"]] %>%
# filter P25, median, and P75 for design parameters that are summarized across domains and secondary school grades
filter(domain == "All", grade_range == "5-12", !grepl("Min|Max", statistic)) %>%
# select relevant design parameters
select(statistic, contains(c("icc", "pretestses"))) %>%
# simplify variable names
set_names(c("statistic", "rho2", "rho3", "r21", "r22", "r23"))
Again, we need to write a custom function (see also Scenario 2) to vectorize over a range of possible parameters (here: small, medium, and large design parameters).
custom_mdes.cra3 <- function(rho2, rho3, g3 = 0, r21 = 0, r22 = 0, r23 = 0, ...) {
parms <- list(rho2 = rho2, rho3 = rho3,
g3 = g3,
r21 = r21, r22 = r22, r23 = r23,
n = 20, J = 2, K = 40)
design <- exec("mdes.cra3", !!!parms)
mdes <- design$mdes[1] # MDES point estimate
ci.lb <- design$mdes[2] # lower bound of 95% CI
ci.ub <- design$mdes[3] # upper bound of 95% CI
return(data.frame(mdes = mdes, ci.lb = ci.lb, ci.ub = ci.ub))
}
# apply the function to compute the MDES for small, medium, and large intraclass correlations
d.a <- pmap(dp_s3 %>% select(contains("rho")), custom_mdes.cra3) %>%
set_names(dp_s3$statistic) %>%
bind_rows(.id = "statistic")
When adding covariates, Team 3 uses the large values for ρ (i.e., the 75th percentile) to apply a conservative approach.
# use the 75th percentile of the ICCs and adjust for 5 covariates (1 pretest + 4 sociodemographics)
d.b <- pmap(dp_s3 %>%
mutate(rho2 = as.numeric(.[.$statistic == "75th Percentile", "rho2"]),
rho3 = as.numeric(.[.$statistic == "75th Percentile", "rho3"]),
g3 = 5),
custom_mdes.cra3) %>%
set_names(unique(dp_s3$statistic)) %>%
bind_rows(.id = "statistic")
Let’s have a look at the results.
p <- ggplot(bind_rows(d.a, d.b, .id = "design") %>%
mutate(statistic = factor(statistic,
levels = c("25th Percentile",
"Median",
"75th Percentile"),
labels = c("Small", "Medium", "Large")))) +
geom_pointrange(aes(x = statistic, y = mdes, ymin = ci.lb, ymax = ci.ub, col = design),
position = position_dodge(width = 0.5)) +
geom_hline(yintercept = .3, lty = "dotted") +
scale_y_continuous("MDES", breaks = seq(0, 0.7, 0.1))+
scale_color_manual(values = c("#2c3e50", "#18bc9c"),
labels = c("No covariates",
expression("Pretest + Sociodemographics (with large values of"~rho*")"))) +
theme_minimal() +
theme(axis.title.x = element_blank(),
legend.title = element_blank(),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)) +
ggtitle("MDES (with 95% CI) for Normative Distributions of Design Parameters") +
annotate("rect", xmin = -Inf, xmax = Inf, ymin = 0.3, ymax = Inf,
alpha = .2)
p
Team 3 learns that without covariates, the attainable MDES is 0.31/0.35/0.39 (point estimates) for small/medium/large values of ρL2 and ρL3. Including both a pretest and sociodemographics as covariates and using the 75th percentiles of the values for ρL2 and ρL3 (as more conservative upper bounds), the respective MDES values shrink to 0.20/0.18/0.14 for small/medium/large values of R2 at the various levels. Consequently, Team 3 can be quite confident that their CRT design offers sufficient sensitivity to detect a true intervention effect within the desired range when including both a pretest and sociodemographics.
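The medium-ρ value for the no-covariate design can also be reproduced by hand from the MDES formula. This is a minimal sketch (not PowerUpR’s internal routine) with the rounded median ICCs from data frame B4:

```r
# Minimal sketch: MDES for the no-covariate design with median ICCs
p <- .50; n <- 20; J <- 2; K <- 40; g3 <- 0
rho2 <- .06; rho3 <- .10  # medians from data frame B4
M <- qt(.975, K - g3 - 2) + qt(.80, K - g3 - 2)  # multiplier, df = 38
se <- sqrt(rho3 / (p * (1 - p) * K) +
             rho2 / (p * (1 - p) * K * J) +
             (1 - rho3 - rho2) / (p * (1 - p) * K * J * n))
round(M * se, 2)  # matches the reported medium point estimate
## [1] 0.35
```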
The following two exercises cover power analysis for multisite trials (MSTs). The MST design can be seen as an extension of the CRT design. In a two-level multisite individually randomized trial (2L-MSIRT; see Scenario 4), students at L1 are nested within schools at L2 and randomization/treatment assignment occurs at the individual student level L1 within L2 school clusters. In a three-level multisite cluster randomized trial (3L-MSCRT; see Scenario 5), students at L1 are nested within classrooms at L2 which are, in turn, nested within schools at L3. Here, entire classroom subclusters at L2 within L3 school clusters are randomly allocated to experimental conditions.
In this exercise, we employ the worked example of Bloom & Spybrook (2017).
Note that in this 2L-MSIRT, the intervention effects are assumed to vary across sites.
The MDES for a 2L-MSIRT with randomization/treatment assignment at L1 and random cross-site effects is given in Dong & Maynard (2013, p. 48) as \[ MDES = M_{J-g^*_{L2}-1}\sqrt{\frac{\rho_{L2}\omega_{T_{L2}}(1-R^2_{T_{L2}})}{J}+\frac{(1-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn}}\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum detectable effect size MDES of a 2L-MSIRT with random site effects | mdes.bira2() or
mdes.bira2r1() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 50 | n = 50 |
Number of L2 units J = 30 | J = 30 |
Proportion of sample assigned to treatment P = .60 (unbalanced) | p = .60 |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .18 | rho2 = .18 |
Explained variance at L1 R2L1 = .38 | r21 = .38 |
Design (a) | |
Treatment effect heterogeneity at L2 ωTL2 = .25^2/.18 | omega2 = .25^2/.18 |
Effect size variability at L2 τ2TL2 = .25^2 | esv2 = .25^2 |
Design (b) | |
Treatment effect heterogeneity at L2 ωTL2 = 0 | omega2 = 0 (default) |
Effect size variability at L2 τ2TL2 = 0 | esv2 = 0 (default) |
Further parameters | PowerUpR conversion |
---|---|
Number of covariates at L2 g*L2 = 0 | g2 = 0 (default) |
PowerUpR
# -- use treatment effect heterogeneity argument `omega2` (equivalent to the 'PowerUp!' Excel tool)
d.a1 <- mdes.bira2r1(rho2 = 0.18, omega2 = 0.25^2/0.18,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
##
## Minimum detectable effect size:
## ---------------------------------------
## 0.171 95% CI [0.051,0.292]
## ---------------------------------------
## Degrees of freedom: 29
## Standardized standard error: 0.059
## Type I error rate: 0.05
## Type II error rate: 0.2
## Two-tailed test: TRUE
# -- use effect size variability argument `esv2` (not available in the 'PowerUp!' Excel tool)
d.a2 <- mdes.bira2r1(rho2 = 0.18, esv2 = 0.25^2,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
# check for identity
identical(d.a1[["mdes"]], d.a2[["mdes"]])
## [1] TRUE
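The reported MDES of 0.171 can likewise be reproduced by hand from the formula above (a minimal sketch, not PowerUpR’s internal routine):

```r
# Minimal sketch: MDES for the 2L-MSIRT with random site effects (design a)
rho2 <- .18; omega2 <- .25^2 / .18; r21 <- .38
p <- .60; n <- 50; J <- 30; g2 <- 0
M <- qt(.975, J - g2 - 1) + qt(.80, J - g2 - 1)  # multiplier, df = 29
se <- sqrt(rho2 * omega2 / J +                   # no L2 covariates: R2_T = 0
             (1 - rho2) * (1 - r21) / (p * (1 - p) * J * n))
round(M * se, 3)
## [1] 0.171
```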
d.b <- mdes.bira2r1(rho2 = 0.18, omega2 = 0,
p = 0.6, n = 50, J = 30,
r21 = 0.38)
##
## Minimum detectable effect size:
## ---------------------------------------
## 0.109 95% CI [0.032,0.186]
## ---------------------------------------
## Degrees of freedom: 29
## Standardized standard error: 0.038
## Type I error rate: 0.05
## Type II error rate: 0.2
## Two-tailed test: TRUE
extract_mdes <- function(design) {
data.frame(mdes = design$mdes[1],
ci.lb = design$mdes[2],
ci.ub = design$mdes[3])
}
p <- map_df(list(d.a = d.a1, d.b = d.b), extract_mdes, .id = "design") %>%
ggplot() +
geom_pointrange(aes(x = design, y = mdes, ymin = ci.lb, ymax = ci.ub),
position = position_dodge(width = 0.5), col = "#2c3e50") +
scale_x_discrete(label = c(bquote(tau[T[L2]]~"= .25"), bquote(tau[T[L2]]~"= .0")))+
scale_y_continuous("MDES", breaks = seq(0, 0.3, 0.05), limits = c(0, 0.3))+
theme_minimal() +
theme(axis.title.x = element_blank(),
legend.title = element_blank(),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)) +
ggtitle("MDES (with 95% CI) by Design")
p
Research Team 5 would like to conduct a 3L-MSCRT to study the effects of a new teaching method involving learning software developed to enhance grade 9 students’ English listening comprehension skills in the academic track. In this 3L-MSCRT, students (L1) are nested within classrooms (L2), and classrooms are nested within schools (L3); randomization/treatment assignment occurs at L2 subclusters within L3 sites (i.e., entire classrooms within schools serving as sites are randomly assigned to experimental conditions). Team 5 plans to sample 4 classrooms with 20 students each per school. The researchers consider an intervention effect of d = 0.10 policy-relevant. Since the goal of Team 5 is to generalize the study findings to the population of German academic track schools beyond those sampled for their MSCRT, they treat the school (site) effects as random. Team 5 assumes that the effect size variability equals τ2TL3 = .15^2. In a first step, the researchers want to know how many schools would have to be sampled in the absence of covariates (design (a)). Moreover, they wonder how much precision improves when adjusting (b) for an English listening pretest, (c) for 4 sociodemographics (i.e., students’ gender, migration background, parents’ educational attainment, and families’ HISEI), and (d) for both covariate sets combined. Opting for a conservative approach, Team 5 assumes that covariates will explain considerably less variability in the intervention effect across schools than in between-school achievement differences (namely, one tenth as much). Following this rationale and given reliable empirical estimates of R2L3 for a pretest/sociodemographics/their combination of .90/.87/.97, Team 5 assumes that the proportion of τ2TL3 explained by covariates equals R2TL3 = .09/.09/.10.
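Team 5’s one-tenth rationale for the cross-site effect variability translates into a one-liner; a minimal sketch:

```r
# Minimal sketch of Team 5's rationale: covariates are assumed to explain
# one tenth as much of the cross-site effect variability as of the
# between-school achievement differences (R2 at L3 for pretest/ses/both)
round(c(.90, .87, .97) / 10, 2)
## [1] 0.09 0.09 0.10
```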
Which data frame contains the appropriate design parameters?
# -- choose data frame B5
dp_s5 <- multides1[["B5_Academic"]] %>%
# filter English Listening in grade 9
filter(grepl("English", domain), grepl("Listening", subdomain), grade == 9) %>%
# since there are multiple estimates for that very same subdomain in grade 9
# (multiple time points: beginning of grade 9 and end of grade 9),
# we use the meta-analytically pooled values
filter(grepl("Pooled", study))
Have a look at the design parameters.
dp_s5 %>%
# select ICCs and R-squared values
select(contains(c("icc", "r2"))) %>%
round(2) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
kbl(linesep = "") %>%
kable_styling(full_width=FALSE, position = "left",
bootstrap_options = c("condensed", "responsive")) %>%
scroll_box(height = "350px")
Variable | Value |
---|---|
icc_l2.est | 0.19 |
icc_l2.se | 0.02 |
icc_l3.est | 0.07 |
icc_l3.se | 0.02 |
r2_l1_pretest.est | 0.11 |
r2_l1_pretest.se | 0.01 |
r2_l2_pretest.est | 0.57 |
r2_l2_pretest.se | 0.05 |
r2_l3_pretest.est | 0.90 |
r2_l3_pretest.se | 0.04 |
r2_l1_ses.est | 0.01 |
r2_l1_ses.se | 0.00 |
r2_l2_ses.est | 0.72 |
r2_l2_ses.se | 0.07 |
r2_l3_ses.est | 0.87 |
r2_l3_ses.se | 0.04 |
r2_l1_pretestses.est | 0.12 |
r2_l1_pretestses.se | 0.01 |
r2_l2_pretestses.est | 0.87 |
r2_l2_pretestses.se | 0.04 |
r2_l3_pretestses.est | 0.97 |
r2_l3_pretestses.se | 0.01 |
The number of schools \(K\) for a 3L-MSCRT with randomization/treatment assignment at L2 and random cross-site effects at L3 is given in Dong & Maynard (2013, p. 54) as \[ K = \left(\frac{M_{K-g^*_{L3}-2}}{MDES}\right)^2\left(\rho_{L3}\omega_{T_{L3}}(1-R^2_{T_{L3}})+\frac{\rho_{L2}(1-R^2_{L2})}{P(1-P)J}+\frac{(1-\rho_{L3}-\rho_{L2})(1-R^2_{L1})}{P(1-P)Jn}\right)\]
The design assumptions are as follows.
Target | PowerUpR conversion |
---|---|
Minimum required sample size of a 3L-MSCRT with random site effects: Number of schools K | mrss.bcra2() or
mrss.bcra3r2() |
Statistical test | PowerUpR conversion |
---|---|
Two-tailed test | two.tailed = TRUE (default) |
Significance level α = .05 | alpha = .05 (default) |
Power 1-β = .80 | power = .80 (default) |
Sample sizes | PowerUpR conversion |
---|---|
Harmonic mean number of L1 units per L2 unit n = 20 | n = 20 |
Harmonic mean number of L2 units per L3 unit J = 4 | J = 4 |
Starting value for K: K0 = 10 | K0 = 10 (default) |
Proportion of sample assigned to treatment P = .50 (balanced) | p = .50 (default) |
Design parameters | PowerUpR conversion |
---|---|
All designs | |
ICC at L2 ρL2 = .19 | rho2 = dp_s5$icc_l2.est |
ICC at L3 ρL3 = .07 | rho3 = dp_s5$icc_l3.est |
Treatment effect heterogeneity at L3 ωTL3 = .15^2/.07 | omega3 = .15^2/dp_s5$icc_l3.est |
Effect size variability at L3 τ2TL3 = .15^2 | esv3 = .15^2 |
Design (a) | |
Explained variance at L1 R2L1 = 0 | r21 = 0 (default) |
Explained variance at L2 R2L2 = 0 | r22 = 0 (default) |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = 0 | r2t3 = 0 (default) |
Design (b) | |
Explained variance at L1 R2L1 = .11 | r21 = dp_s5$r2_l1_pretest.est |
Explained variance at L2 R2L2 = .57 | r22 = dp_s5$r2_l2_pretest.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .09 | r2t3 = .09 |
Design (c) | |
Explained variance at L1 R2L1 = .01 | r21 = dp_s5$r2_l1_ses.est |
Explained variance at L2 R2L2 = .72 | r22 = dp_s5$r2_l2_ses.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .09 | r2t3 = .09 |
Design (d) | |
Explained variance at L1 R2L1 = .12 | r21 = dp_s5$r2_l1_pretestses.est |
Explained variance at L2 R2L2 = .87 | r22 = dp_s5$r2_l2_pretestses.est |
Proportion of τ2TL3 explained by L3 covariates R2TL3 = .10 | r2t3 = .10 |
Further parameters | PowerUpR conversion |
---|---|
All designs | |
Minimum detectable effect size MDES = .10 | es = .10 |
Design (a) | |
Number of covariates at L3 g*L3 = 0 | g3 = 0 |
Design (b) | |
Number of covariates at L3 g*L3 = 1 | g3 = 1 |
Design (c) | |
Number of covariates at L3 g*L3 = 4 | g3 = 4 |
Design (d) | |
Number of covariates at L3 g*L3 = 5 | g3 = 5 |
PowerUpR
# -- choose data frame B5
dp_s5 <- multides1[["B5_Academic"]] %>%
# filter English Listening in grade 9
filter(grepl("English", domain), grepl("Listening", subdomain), grade == 9) %>%
# since there are multiple estimates for that very same subdomain in grade 9
# (multiple time points: beginning of grade 9 and end of grade 9),
# we use the meta-analytically pooled values
filter(grepl("Pooled", study))
# -- use treatment effect heterogeneity argument `omega3` (equivalent to the 'PowerUp!' Excel tool)
d.a <- mrss.bcra3r2(es = .10, rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
omega3 = .15^2/dp_s5$icc_l3.est,
n = 20, J = 4)
## K = 200
# -- use effect size variability argument `esv3` (not available in the 'PowerUp!' Excel tool)
d.a <- mrss.bcra3r2(es = .10, rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4)
## K = 200
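The closed-form expression for K can again be iterated by hand. This is a minimal sketch with the ICC estimates rounded to two decimals as shown in the table above; because of this rounding, it lands just below the exact result of K = 200.

```r
# Minimal sketch: fixed-point iteration for K in the 3L-MSCRT, using the
# rounded ICC estimates from the table above (rho2 = .19, rho3 = .07)
es <- .10; p <- .50; n <- 20; J <- 4; g3 <- 0
rho2 <- .19; rho3 <- .07; esv3 <- .15^2
B <- esv3 +                       # rho3 * omega3 * (1 - R2_T) = esv3 here
  rho2 / (p * (1 - p) * J) +
  (1 - rho3 - rho2) / (p * (1 - p) * J * n)
K <- 10  # starting value K0
for (i in 1:50) {
  M <- qt(.975, K - g3 - 2) + qt(.80, K - g3 - 2)
  K <- ceiling((M / es)^2 * B)
}
K
## [1] 198
```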
d.b <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 1,
r21 = dp_s5$r2_l1_pretest.est, r22 = dp_s5$r2_l2_pretest.est, r2t3 = .09)
## K = 109
d.c <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 4,
r21 = dp_s5$r2_l1_ses.est, r22 = dp_s5$r2_l2_ses.est, r2t3 = .09)
## K = 89
d.d <- mrss.bcra3r2(es = .10,
rho2 = dp_s5$icc_l2.est, rho3 = dp_s5$icc_l3.est,
esv3 = .15^2,
n = 20, J = 4, g3 = 5,
r21 = dp_s5$r2_l1_pretestses.est, r22 = dp_s5$r2_l2_pretestses.est, r2t3 = .09)
## K = 63
packages <- c("plotly", "htmlwidgets")
invisible(
lapply(packages,
function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE); require(x, character.only = TRUE)}
}
)
)
design <- list(d.a, d.b, d.c, d.d) %>%
set_names("d.a", "d.b", "d.c", "d.d")
delta <- function(design) (design[["K"]]-d.a[["K"]])/d.a[["K"]]*100
p <- data.frame(design = factor(names(design)),
schools = map_dbl(design, ~.[["K"]]),
delta = map_dbl(design, delta)) %>%
ggplot(aes(design, schools, text = paste(
factor(design,
levels = c("d.a", "d.b", "d.c", "d.d"),
labels = c("The baseline design without covariates",
"... adjusting for a pretest",
" ... adjusting for sociodemographics",
" ... adjusting for both a pretest and sociodemographics")),
" requires K = ", schools, " schools.","\n",
ifelse(design == "d.a", "",
ifelse(design != "d.a" & delta > 0,
paste0("This amounts to ", round(delta, 0),
"% more schools compared to the baseline design."),
ifelse(design != "d.a" & delta == 0,
"The required number of schools does not change compared to the baseline design.",
paste0("This amounts to ", round(delta, 0)*-1,
"% fewer schools compared to the baseline design.")))),
sep = ""
))) +
geom_bar(stat="identity", width=0.9, fill = "#2c3e50") +
scale_x_discrete("Design") +
scale_y_continuous("Number of schools") +
theme_minimal() +
theme(axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5)) +
ggtitle("Number of Schools Required for an MDES of .10 by Design")
ggplotly(p, tooltip = "text")
if(!require(devtools)) {
install.packages("devtools"); require(devtools)}
session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.0 (2022-04-22 ucrt)
## os Windows 10 x64 (build 22000)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate German_Germany.utf8
## ctype German_Germany.utf8
## tz Europe/Berlin
## date 2022-05-30
## pandoc 2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
## backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
## brio 1.1.3 2021-11-30 [1] CRAN (R 4.2.0)
## broom 0.8.0 2022-04-13 [1] CRAN (R 4.2.0)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.2.0)
## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.2.0)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.0)
## cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
## colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
## crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
## crosstalk 1.2.0 2021-11-04 [1] CRAN (R 4.2.0)
## data.table 1.14.2 2021-09-27 [1] CRAN (R 4.2.0)
## DBI 1.1.2 2021-12-20 [1] CRAN (R 4.2.0)
## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.2.0)
## desc 1.4.1 2022-03-06 [1] CRAN (R 4.2.0)
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.2.0)
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
## dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
## evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
## fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.2.0)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
## fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
## generics 0.1.2 2022-01-31 [1] CRAN (R 4.2.0)
## ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
## glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
## haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.0)
## highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
## hms 1.1.1 2021-09-26 [1] CRAN (R 4.2.0)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
## htmlwidgets * 1.5.4 2021-09-08 [1] CRAN (R 4.2.0)
## httr 1.4.3 2022-05-04 [1] CRAN (R 4.2.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
## jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.2.0)
## kableExtra * 1.3.4 2021-02-20 [1] CRAN (R 4.2.0)
## klippy 0.0.0.9500 2022-05-18 [1] Github (rlesur/klippy@378c247)
## knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.2.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.2.0)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
## lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
## pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
## pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
## pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.2.0)
## plotly * 4.10.0 2021-10-09 [1] CRAN (R 4.2.0)
## PowerUpR * 1.1.0 2021-10-25 [1] CRAN (R 4.2.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
## processx 3.5.3 2022-03-25 [1] CRAN (R 4.2.0)
## ps 1.7.0 2022-04-23 [1] CRAN (R 4.2.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
## readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
## readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.0)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.0)
## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
## rlang 1.0.2 2022-03-04 [1] CRAN (R 4.2.0)
## rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
## rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.0)
## sass 0.4.1 2022-03-23 [1] CRAN (R 4.2.0)
## scales 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
## svglite 2.1.0 2022-02-03 [1] CRAN (R 4.2.0)
## systemfonts 1.0.4 2022-02-11 [1] CRAN (R 4.2.0)
## testthat 3.1.4 2022-04-26 [1] CRAN (R 4.2.0)
## tibble * 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
## tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
## tufte 0.12 2022-01-27 [1] CRAN (R 4.2.0)
## tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
## usethis * 2.1.5 2021-12-09 [1] CRAN (R 4.2.0)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
## vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
## viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.2.0)
## webshot 0.5.3 2022-04-14 [1] CRAN (R 4.2.0)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
## xfun 0.30 2022-03-02 [1] CRAN (R 4.2.0)
## xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.0)
## yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
##
## [1] C:/Users/Sophie Stallasch/AppData/Local/R/win-library/4.2
## [2] C:/Program Files/R/R-4.2.0/library
##
## ──────────────────────────────────────────────────────────────────────────────
© 2022 stallasch[at]uni-potsdam.de