Setup

R Session

Note: All code shown can be copied by clicking the icon in the top-right corner of a code block. The code blocks of the main presentation can be copied via the same icon, which becomes visible when hovering over the respective block with your mouse.

To run the exercises on your own hardware, please follow the instructions below.

Stan

Please first install the cmdstanr R package following the instructions from its Getting started with CmdStanR article. The article also covers the installation of the necessary compiler and accompanying C++ toolchain, as well as of cmdstan itself.
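As a minimal sketch of these steps (assuming no existing cmdstan installation; please refer to the article for details):

# install cmdstanr from the Stan team's R-universe repository
install.packages("cmdstanr", repos = c("https://stan-dev.r-universe.dev", getOption("repos")))
# check the C++ toolchain, then install cmdstan itself
cmdstanr::check_cmdstan_toolchain()
cmdstanr::install_cmdstan(cores = 2)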

An alternative is to use rstan, as described on the rstan wiki page to get started. In this case, please set the brms backend to rstan. Note that the cmdstanr backend is recommended over rstan, as it allows more recent versions of Stan to be used. Moreover, the instructions for installing a C++ toolchain are better described in the respective chapter of the cmdstan guide.
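For example, the backend can be set globally per session (the preliminary code below does the same, there with cmdstanr):

# use rstan as the backend for brms in this session ...
options(brms.backend = "rstan")
# ... or select the backend for a single model fit
# fit <- brm(..., backend = "rstan")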

C++ toolchain installation

  • Windows: cmdstanr can automatically install the RTools-based C++ toolchain via

cmdstanr::check_cmdstan_toolchain(fix=TRUE)

  • macOS: the macrtools package can install the required toolchain via

# install.packages("remotes")
remotes::install_github("coatless-mac/macrtools")
macrtools::macos_rtools_install()

R packages

Please also ensure that all R packages listed in the preliminary code section below are installed.

# in case packages are missing, please run:
install.packages(c("here", "ggplot2", "dplyr", "knitr", "brms", "posterior", "tidybayes", "bayesplot", "RBesT", "ggrepel", "patchwork", "ggdist", "ggthemes", "withr", "simsurv", "gt"))

Website

To download the full website complementing the material, you can use the following R code snippet:

bamdd_zip <- tempfile(fileext=".zip")
download.file("https://github.com/Novartis/bamdd/archive/refs/heads/main.zip", bamdd_zip)
## extract the website into the user's home directory
unzip(bamdd_zip, exdir=normalizePath("~"))
browseURL(normalizePath(file.path("~", "bamdd-main")))
# to install all dependencies needed to build the website, please run:
source(file.path("~", "bamdd-main", "src", "install_dependencies.R"))

Preliminary code

First load the required packages and set some options as described in the case study.

# uncomment below and set to the filename of your R file for the
# exercises
# here::i_am("your-exercise-file.R")
library(here)
library(ggplot2)
library(dplyr)
library(knitr)
library(brms)
library(posterior)
library(tidybayes)
library(bayesplot)
library(RBesT)
library(ggrepel)
library(patchwork)
library(ggdist)

options(
  # how many processor cores would you like to use?
  mc.cores = 4, 
  # how would you like to access Stan?
  brms.backend = "cmdstanr",
  # cache model binaries
  cmdstanr_write_stan_file_dir=here::here("_brms-cache"),
  # no need to normalize likelihoods
  brms.normalize = FALSE
)

# create the cache directory (no warning if it already exists)
dir.create(here::here("_brms-cache"), showWarnings=FALSE)
set.seed(254335)

# restrict output precision
options(digits=3)

# in case rstan is used as the backend, consider enabling model caching
# by uncommenting the following line
# rstan_options(auto_write = TRUE)

# Set defaults for ggplot2 ----
theme_set(theme_bw(base_size=18))

scale_colour_discrete <- function(...) {
  # Alternative: ggsci::scale_color_nejm(...)
  # scale_colour_brewer(..., palette="Set1")
  ggthemes::scale_colour_colorblind(...)
}
scale_fill_discrete <- function(...) {
  # Alternative: ggsci::scale_fill_nejm(...)
  #scale_fill_brewer(..., palette="Set1")
  ggthemes::scale_fill_colorblind(...)
}
scale_colour_continuous <- function(...) {
  scale_colour_viridis_c(..., option="turbo")
}
update_geom_defaults("point", list(size=2))
update_geom_defaults("line", list(size=1.5))

Historical Control Data

Exercise setup

Create the region dataset used in the example:

withr::with_seed(654873,
                 AS_region <- bind_cols(RBesT::AS, region=sample(c("asia", "europe", "north_america"), 8, TRUE)))
gt::gt(AS_region)
study n r region
Study 1 107 23 europe
Study 2 44 12 asia
Study 3 51 19 europe
Study 4 39 9 europe
Study 5 139 39 asia
Study 6 20 6 north_america
Study 7 78 9 europe
Study 8 35 10 asia

And fit the model as shown in the case study:

model <- bf(r | trials(n) ~ 1 + (1 | study), family=binomial)

model_prior <- prior(normal(0, 2), class=Intercept) +
    prior(normal(0, 1), class=sd, coef=Intercept, group=study)

map_mc_brms  <- brm(model, AS_region, prior = model_prior,
                    seed = 4767, control=list(adapt_delta=0.99),
                    silent = 2, refresh = 0)
map_mc_brms
 Family: binomial 
  Links: mu = logit 
Formula: r | trials(n) ~ 1 + (1 | study) 
   Data: AS_region (Number of observations: 8) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~study (Number of levels: 8) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.38      0.20     0.06     0.88 1.00     1088     1254

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.10      0.19    -1.47    -0.70 1.00     1398     1575

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Exercise 1: Posterior predictive check

Create a posterior predictive check based on the predictive distribution for the response rate.

Steps:

  1. Use posterior_predict to create samples from the predictive distribution of outcomes per trial.
  2. Use sweep(predictive, 2, AS_region$n, "/") to convert these samples from the predictive distribution of the outcome counts to samples from the predictive distribution for responder rates.
  3. Use ppc_intervals from bayesplot to create a plot showing your results. Consider using the with function to refer more easily to the data columns of the analysis data set.
  4. Redo the above using the add_predicted_rvars function from the tidybayes package, which adds the predictions directly to the analysis data set, and then repeat steps 2 and 3. To convert an rvars column to the draws format expected by bayesplot, you can use as_draws_matrix from the posterior package.

Exercise 2: Fixed vs random effect

Redo the analysis with region, but treat region as a fixed effect. Evaluate the informativeness of the obtained MAP priors. The model formula for brms should look like

region_model_fixed <- bf(r | trials(n) ~ 1 + region + (1 | study), family=binomial)

Steps:

  1. Consider the prior for the region fixed effect first. The reference region is absorbed into the intercept and is implicitly defined by the first level of the variable region when it is defined as a factor.

    • Define asia to be the reference region in the example. Also include a level other in the set of levels.
    • Assume that an odds-ratio of 2 between regions can be seen as very large, such that a prior of \text{N}(0, (\log(2)/1.96)^2) for the region main effect is adequate (see the quick check of this scaling below the steps).
  2. Obtain the MAP prior for each region by using the AS_region_all data frame defined below and apply posterior_linpred as shown in the case study.

  3. Convert the MCMC samples from the MAP prior distribution into mixture distributions with the same code as in the case study.

  4. Calculate the effective sample size (ESS) for each prior distribution with the ess function from the RBesT R package.

AS_region_all <- data.frame(region=c("asia", "europe", "north_america", "other")) |>
    mutate(study=paste("new_study", region, sep="_"), r=0, n=6)
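As referenced in step 1, a quick illustrative check that this prior scale places 95% of its mass on odds-ratios between 1/2 and 2:

# 95% central prior interval of the region effect on the odds-ratio scale
exp(qnorm(c(0.025, 0.975), mean=0, sd=log(2)/1.96))
## gives approximately 0.5 and 2.0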

Exercise 3: Normal endpoint meta-analysis

Run the analysis for the normal endpoint in the crohn data set of RBesT. Refer to the RBesT vignette for a normal endpoint for more details and context.

Steps:

  1. Use family=gaussian and the se response modifier in place of trials to specify a known standard error. More details on the additional response information specifiers can be found in the documentation of brmsformula.
  2. Use the same priors as proposed in the vignette from RBesT.
  3. Compare the obtained MAP prior (in MCMC sample form) from RBesT and brms.

Solutions Historical Control

Exercise 1: Posterior predictive check

pp_count_AS_region <- posterior_predict(map_mc_brms)

pp_rate_AS_region <- sweep(pp_count_AS_region, 2, AS_region$n, "/")

head(pp_rate_AS_region) |> kable()
0.299 0.205 0.294 0.359 0.302 0.20 0.231 0.343
0.224 0.250 0.255 0.256 0.230 0.15 0.192 0.200
0.168 0.114 0.333 0.359 0.266 0.15 0.256 0.143
0.318 0.205 0.314 0.333 0.317 0.40 0.321 0.429
0.252 0.250 0.275 0.205 0.245 0.40 0.231 0.314
0.290 0.182 0.157 0.308 0.317 0.30 0.282 0.229
with(AS_region, ppc_intervals(r / n, pp_rate_AS_region))  +
    scale_x_continuous("Study", breaks=1:nrow(AS_region), labels=AS_region$study) +
    ylab("Responder Rate") +
    coord_flip() +
    theme(legend.position="right",
          # suppress vertical grid lines for better readability of intervals
          panel.grid.major.y = element_blank())

pp_AS_region <- AS_region |>
    add_predicted_rvars(map_mc_brms, value="pp_r") |>
    mutate(pp_rate=pp_r/n)

pp_AS_region
# A tibble: 8 × 6
  study       n     r region               pp_r       pp_rate
  <chr>   <dbl> <dbl> <chr>          <rvar[1d]>    <rvar[1d]>
1 Study 1   107    23 europe         24.3 ± 5.5  0.23 ± 0.052
2 Study 2    44    12 asia           11.3 ± 3.6  0.26 ± 0.081
3 Study 3    51    19 europe         16.0 ± 4.4  0.31 ± 0.086
4 Study 4    39     9 europe          9.4 ± 3.3  0.24 ± 0.085
5 Study 5   139    39 asia           37.6 ± 7.1  0.27 ± 0.051
6 Study 6    20     6 north_america   5.3 ± 2.3  0.27 ± 0.116
7 Study 7    78     9 europe         13.5 ± 4.7  0.17 ± 0.060
8 Study 8    35    10 asia            9.4 ± 3.2  0.27 ± 0.090
with(pp_AS_region, ppc_intervals(r / n, as_draws_matrix(pp_rate)))  +
    scale_x_continuous("Study", breaks=1:nrow(AS_region), labels=AS_region$study) +
    ylab("Responder Rate") +
    coord_flip() +
    theme(legend.position="right",
          # suppress vertical grid lines for better readability of intervals
          panel.grid.major.y = element_blank())

Exercise 2: Fixed vs random effect

AS_region_all <- data.frame(region=c("asia", "europe", "north_america", "other")) |>
    mutate(study=paste("new_study", region, sep="_"), r=0, n=6)

## to get brms to include the other factor level in the model, we have
## to add a fake row with region "other" and n=0
AS_region_2 <- mutate(bind_rows(AS_region, mutate(AS_region_all, n=0)[4,]),
                      region=factor(region, levels=c("asia", "europe", "north_america", "other")))

str(AS_region_2)
'data.frame':   9 obs. of  4 variables:
 $ study : chr  "Study 1" "Study 2" "Study 3" "Study 4" ...
 $ n     : num  107 44 51 39 139 20 78 35 0
 $ r     : num  23 12 19 9 39 6 9 10 0
 $ region: Factor w/ 4 levels "asia","europe",..: 2 1 2 2 1 3 2 1 4
model_fixed <- bf(r | trials(n) ~ 1 + region + (1|study), family=binomial)

get_prior(model_fixed, AS_region_2)
                prior     class                coef group resp dpar nlpar lb ub
               (flat)         b                                                
               (flat)         b        regioneurope                            
               (flat)         b regionnorth_america                            
               (flat)         b         regionother                            
 student_t(3, 0, 2.5) Intercept                                                
 student_t(3, 0, 2.5)        sd                                            0   
 student_t(3, 0, 2.5)        sd                     study                  0   
 student_t(3, 0, 2.5)        sd           Intercept study                  0   
       source
      default
 (vectorized)
 (vectorized)
 (vectorized)
      default
      default
 (vectorized)
 (vectorized)
model_fixed_prior <- prior(normal(0, 2), class=Intercept) +
    prior(normal(0, 1), class=sd, coef=Intercept, group=study) +
    prior(normal(0, log(2)/1.96), class=b)

fixed_mc_brms  <- brm(model_fixed, AS_region_2, prior=model_fixed_prior,
                      seed=4767,
                      silent = 2, refresh = 0, control=list(adapt_delta=0.99))

fixed_mc_brms
 Family: binomial 
  Links: mu = logit 
Formula: r | trials(n) ~ 1 + region + (1 | study) 
   Data: AS_region_2 (Number of observations: 9) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~study (Number of levels: 9) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.38      0.22     0.04     0.92 1.00     1314     1612

Regression Coefficients:
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept              -1.02      0.23    -1.51    -0.57 1.00     1778     1669
regioneurope           -0.17      0.26    -0.66     0.34 1.00     2226     2381
regionnorth_america     0.03      0.31    -0.60     0.63 1.00     3351     2898
regionother             0.00      0.35    -0.68     0.66 1.00     3691     2599

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
post_regions <- posterior_linpred(fixed_mc_brms,
                                  newdata=AS_region_all,
                                  transform=TRUE,
                                  allow_new_levels=TRUE,
                                  sample_new_levels="gaussian")

head(post_regions) |> kable(digits=3)
0.342 0.257 0.193 0.332
0.292 0.179 0.307 0.318
0.222 0.213 0.325 0.228
0.288 0.219 0.231 0.231
0.274 0.264 0.224 0.196
0.241 0.268 0.270 0.161
colnames(post_regions) <- AS_region_all$region
map_region <- list()
for(r in AS_region_all$region) {
    map_region[[r]] <- mixfit(post_regions[,r], type="beta", Nc=3, constrain_gt1=TRUE)
}

kable(bind_rows(lapply(map_region, summary), .id="MAP"), digits=3)
MAP mean sd 2.5% 50.0% 97.5%
asia 0.277 0.093 0.118 0.269 0.515
europe 0.243 0.087 0.094 0.232 0.465
north_america 0.284 0.107 0.107 0.274 0.547
other 0.279 0.112 0.096 0.267 0.540
sapply(map_region, ess)
         asia        europe north_america         other 
         31.3          34.9          19.9          16.3 

Exercise 3: Normal endpoint meta-analysis

crohn <- RBesT::crohn

crohn_sigma <- 88
crohn$y.se <- crohn_sigma/sqrt(crohn$n)

library(RBesT)
set.seed(1234)
rbest_normal_map_mcmc <- gMAP(cbind(y, y.se) ~ 1 | study, 
                              weights=n, data=crohn,
                              family=gaussian,
                              beta.prior=cbind(0, crohn_sigma),
                              tau.dist="HalfNormal", tau.prior=cbind(0, crohn_sigma/2))

model_normal <- bf(y | se(y.se) ~ 1 + (1 | study), family=gaussian())

prior_normal <- prior(normal(0, 88), class=Intercept) +
    prior(normal(0, 88/2), class=sd, coef=Intercept, group=study)

brms_normal_map_mcmc <- brm(model_normal, crohn, prior=prior_normal,
                            seed=4767,
                            silent = 2, refresh = 0, control=list(adapt_delta=0.99))

## comparing the outputs we see that the random effect posterior
## matches...
rbest_normal_map_mcmc
Generalized Meta Analytic Predictive Prior Analysis

Call:  gMAP(formula = cbind(y, y.se) ~ 1 | study, family = gaussian, 
    data = crohn, weights = n, tau.dist = "HalfNormal", tau.prior = cbind(0, 
        crohn_sigma/2), beta.prior = cbind(0, crohn_sigma))

Exchangeability tau strata: 1 
Prediction tau stratum    : 1 
Maximal Rhat              : 1 
Estimated reference scale : 88 

Between-trial heterogeneity of tau prediction stratum
 mean    sd  2.5%   50% 97.5% 
14.90 10.20  1.41 12.80 41.00 

MAP Prior MCMC sample
 mean    sd  2.5%   50% 97.5% 
-49.6  19.9 -90.7 -48.3 -11.6 
brms_normal_map_mcmc
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y | se(y.se) ~ 1 + (1 | study) 
   Data: crohn (Number of observations: 6) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~study (Number of levels: 6) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    14.25     10.02     1.07    40.33 1.00      699      885

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept   -49.26      9.10   -68.11   -32.85 1.00     1061     1037

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.00      0.00     0.00     0.00   NA       NA       NA

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
brms_normal_map <- posterior_epred(brms_normal_map_mcmc,
                                   newdata=data.frame(study="new", n=1, y=0, y.se=88),
                                   allow_new_levels=TRUE,
                                   sample_new_levels="gaussian")

## ... and the MAP prior is also the same
summarise_draws(brms_normal_map, mean, sd, ~quantile2(., probs = c(0.025, 0.5, 0.975)))
# A tibble: 1 × 6
  variable  mean    sd  q2.5   q50 q97.5
  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl>
1 ...1     -49.6  19.3 -94.1 -48.2 -12.9

Dose-finding

Peanut allergy exercise overview

This exercises accompanies the dose finding case study.

In a hypothetical phase 2b trial, children and adolescents with peanut allergy were randomly assigned to placebo or one of 5 doses (5 to 300 mg) of a new drug.

After 26 weeks the patients underwent a double-blind placebo-controlled food challenge and the primary endpoint was the proportion of patients that could ingest 600 mg or more of peanut protein without experiencing dose-limiting symptoms.

The plots below show an overview of the data. Note that no placebo patients were considered “responders” per the primary endpoint definition.

Data setup code:
peanut <- tibble(
  TRT01P = structure(c(
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 5L, 5L, 5L, 5L, 5L, 5L,
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
  ),levels = c("PBO", "5 mg", "15 mg", "50 mg", "150 mg", "300 mg"), class = "factor"),
  dose = c(
    300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L,
    300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L,
    300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
    150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
    150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
    150L, 150L, 150L, 150L, 150L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
    50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
    50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L,
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L,
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
    0L, 0L),
  AVAL = c(
    1000L, 100L, 0L, 600L, 600L, 300L, 300L, 10L, 10L, 100L, 1000L, 300L, 10L, 1000L, 1000L, 1000L, 3L, 3L, 600L, 300L,
    1000L, 1000L, 100L, 1000L, 600L, 600L, 1000L, 1000L, 1000L, 300L, 600L, 1000L, 1000L, 1000L, 1000L, 1000L, 300L,
    1000L, 600L, 1000L, 300L, 1000L, 600L, 600L, 1000L, 1000L, 600L, 1000L, 1000L, 1000L, 1000L, 600L, 1000L, 1000L,
    1000L, 0L, 600L, 300L, 1000L, 1000L, 1000L, 100L, 1000L, 1000L, 600L, 1000L, 1000L, 1000L, 600L, 300L, 600L, 600L,
    100L, 600L, 1000L, 300L, 10L, 3L, 1000L, 300L, 300L, 1000L, 300L, 300L, 10L, 1000L, 1000L, 10L, 10L, 600L, 3L, 10L,
    600L, 600L, 600L, 1000L, 100L, 1000L, 1000L, 10L, 1000L, 3L, 10L, 1000L, 100L, 1000L, 100L, 10L, 300L, 10L, 100L,
    1000L, 0L, 1000L, 100L, 3L, 10L, 100L, 100L, 300L, 1000L, 1000L, 1000L, 10L, 1000L, 3L, 3L, 600L, 600L, 10L, 3L,
    600L, 600L, 300L, 300L, 1000L, 3L, 3L, 1000L, 10L, 1000L, 1000L, 600L, 100L, 300L, 600L, 10L, 100L, 0L, 100L, 3L,
    0L, 10L, 3L, 600L, 300L, 300L, 300L, 600L, 300L, 100L, 3L, 0L, 10L, 600L, 300L, 10L, 300L, 600L, 1000L, 600L, 600L,
    0L, 0L, 600L, 600L, 600L, 0L, 0L, 300L, 100L, 0L, 10L, 300L, 1000L, 300L, 600L, 600L, 300L, 10L, 600L, 100L, 100L,
    300L, 3L, 3L, 300L, 1000L, 10L, 3L, 100L, 3L, 100L, 100L, 300L, 3L, 3L, 600L, 300L, 3L, 3L, 3L, 300L, 3L, 0L, 10L,
    3L, 300L, 10L, 10L, 600L, 0L, 300L, 600L, 0L, 0L, 100L, 100L, 10L, 100L, 10L, 100L, 600L, 0L, 600L, 0L, 10L, 100L,
    0L, 0L, 0L, 0L, 3L, 10L, 3L, 300L, 600L, 0L, 0L, 300L, 10L, 10L, 100L, 300L, 0L, 0L, 0L, 3L, 0L, 0L, 10L, 0L, 300L,
    3L, 0L, 0L, 0L, 0L, 100L, 3L, 0L, 3L, 10L, 10L, 3L, 0L, 3L, 10L, 3L, 0L, 0L, 3L, 100L, 0L, 0L, 300L, 0L, 0L, 0L,
    0L, 0L, 0L, 0L, 3L, 0L, 3L, 0L, 0L, 0L, 0L),
  PARAM = structure(
    c(
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    levels = "Tolerated amount of peanut protein (mg)", class = "factor"),
  USUBJID = c(
    129L, 103L, 104L, 185L, 34L, 15L, 140L, 151L, 7L, 144L, 200L, 23L, 138L, 115L, 255L, 147L, 224L, 101L, 156L, 284L,
    136L, 248L, 179L, 298L, 168L, 295L, 289L, 241L, 36L, 27L, 123L, 266L, 11L, 291L, 236L, 130L, 173L, 195L, 203L, 160L,
    274L, 167L, 290L, 94L, 9L, 269L, 122L, 135L, 64L, 26L, 95L, 10L, 5L, 234L, 161L, 299L, 88L, 69L, 35L, 233L, 286L,
    85L, 91L, 189L, 80L, 152L, 223L, 287L, 244L, 57L, 108L, 18L, 62L, 157L, 300L, 283L, 164L, 243L, 89L, 220L, 24L, 271L,
    166L, 118L, 201L, 127L, 121L, 41L, 267L, 213L, 49L, 73L, 202L, 134L, 112L, 25L, 227L, 29L, 251L, 273L, 119L, 132L,
    74L, 270L, 83L, 37L, 181L, 258L, 253L, 48L, 120L, 54L, 277L, 176L, 65L, 264L, 107L, 171L, 262L, 162L, 187L, 272L,
    288L, 294L, 245L, 109L, 172L, 204L, 275L, 22L, 66L, 186L, 247L, 17L, 149L, 141L, 177L, 280L, 216L, 40L, 75L, 263L,
    246L, 14L, 81L, 260L, 153L, 45L, 237L, 8L, 117L, 86L, 296L, 146L, 154L, 116L, 38L, 100L, 191L, 175L, 92L, 158L, 192L,
    180L, 256L, 254L, 125L, 222L, 145L, 261L, 155L, 159L, 3L, 206L, 278L, 19L, 31L, 63L, 208L, 55L, 259L, 218L, 111L,
    226L, 33L, 2L, 44L, 297L, 53L, 87L, 16L, 28L, 90L, 207L, 56L, 137L, 128L, 178L, 142L, 143L, 148L, 193L, 229L, 265L,
    97L, 252L, 205L, 150L, 165L, 188L, 52L, 99L, 93L, 1L, 221L, 124L, 210L, 6L, 232L, 21L, 211L, 163L, 96L, 60L, 183L,
    190L, 242L, 42L, 46L, 67L, 126L, 209L, 72L, 194L, 238L, 184L, 39L, 105L, 249L, 61L, 113L, 30L, 77L, 12L, 4L, 51L,
    139L, 20L, 268L, 215L, 292L, 217L, 199L, 32L, 276L, 47L, 225L, 230L, 79L, 71L, 98L, 50L, 13L, 76L, 231L, 250L, 58L,
    68L, 239L, 198L, 293L, 212L, 110L, 59L, 182L, 133L, 170L, 43L, 282L, 281L, 131L, 114L, 196L, 214L, 285L, 70L, 102L,
    279L, 106L, 197L, 257L, 228L, 84L, 235L, 78L, 169L, 240L, 219L, 174L, 82L),
  CRIT1 = structure(c(
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    levels = "Tolerating >=600 mg of peanut protein without dose-limiting symptoms", class = "factor"),
  CRIT1FL = structure(c(
    2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
    2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
    2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
    1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
    2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
    2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    levels = c("N", "Y"), class = "factor"),
  CRIT2 = structure(c(
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    levels = "Tolerating >=1000 mg of peanut protein without dose-limiting symptoms", class = "factor"),
  CRIT2FL = structure(c(
    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L,
    2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L,
    1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
    1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
    2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    levels = c("N", "Y"), class = "factor" ) )
Code (tolerated amounts by dose group):
peanut |>
  ggplot(aes(x=factor(AVAL), fill=TRT01P)) +
  geom_bar( position=position_dodge(preserve = "single")) +
  scale_fill_manual(values=c("royalblue", "#fb6a4a", "#ef3b2c", "#cb181d", "#a50f15", "#67000d")) +
  xlab("Peanut protein tolerated without dose-limiting symptoms (mg)") +
  ylab("Patients") +
  theme(legend.position="right")

Code (proportion of responders by dose group):
peanut |>
  group_by(TRT01P) |>
  summarize(proportion = sum(CRIT1FL=="Y") / n()) |>
  ggplot(aes(x=TRT01P, y=proportion, fill=TRT01P)) +
  geom_bar(stat="identity", position=position_dodge()) +
  geom_text(aes(label=scales::percent(proportion)), size=7, vjust=c(0, rep(1.5, 5))) +
  scale_fill_manual(values=c("royalblue", "#fb6a4a", "#ef3b2c", "#cb181d", "#a50f15", "#67000d")) +
  scale_y_continuous(breaks=seq(0, 0.7, 0.1), labels=scales::percent) +
  xlab("Tolerating >=600 mg of peanut protein without dose-limiting symptoms") +
  ylab("Percentage of patients")

Exercise 1: Emax logistic regression

Fit a sigmoid Emax logistic regression model (i.e. family=binomial(link="logit")). To accelerate likelihood evaluations, first summarize the data as “responders” y out of n patients per dose. We tell brms to use this information via y | trials(n) ~ ....

We expect the true placebo proportion to be around 0.05 (-2.944 on the logit scale), with much more than 0.1 or much less than 0.02 considered unlikely. It is a priori at least possible that there is a huge treatment effect, such as 95% versus 5% responders (a difference on the logit scale close to 6), but we are at least mildly skeptical.
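These judgments translate into the E0 and Emax priors used below; a small illustrative check of the numbers:

qlogis(0.05)                                 # -2.944, the prior mean for E0
(qlogis(0.1) - qlogis(0.05)) / qnorm(0.975)  # approx. 0.38, the prior sd for E0
qlogis(0.95) - qlogis(0.05)                  # approx. 5.9, motivating the sd of 6 for Emax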

The dose response is a priori somewhat likely to follow an Emax curve (with Hill parameter near 1), but we wish to allow for the possibility of a steeper or shallower curve. The dose with half the maximal effect (ED50) might be near 15 mg, but there is considerable uncertainty around that.
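The logED50 prior below reflects this; illustratively:

exp(2.7)                         # approx. 15 mg, the prior median for ED50
exp(2.7 + c(-1.96, 1.96) * 2.5)  # wide 95% interval: approx. 0.1 mg to 2000 mg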

model1 <- bf( #---- YOUR CODE HERE --- # ~ E0 + Emax * dose^h/(dose^h + ED50^h),
              family= #---- YOUR CODE HERE --- # ,
              nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
              E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1, 
              nl=TRUE)

priors1 <- prior(normal(-2.944, 0.38), nlpar=E0) +
           prior(normal(0, 1), nlpar=logh) +
           prior(normal(0, 6), nlpar=Emax) +
           prior(normal(2.7, 2.5), nlpar=logED50)

Now fit the model

brmfit1 <- brm(
  formula = model1,
  prior = priors1, 
  data = summarize(peanut, y=sum(CRIT1FL=="Y"), n=n(), .by=c(TRT01P, dose))
)

Now use

tibble(dose = seq(0, 300, 1), n=1) |> tidybayes::add_epred_rvars(brmfit1)

to obtain predicted proportions for every dose from 0 to 300 mg.

Then, plot the curve of predicted proportions for each of these doses using ggplot2 and ggdist (map ydist=.epred in the aesthetics and use the stat_lineribbon() geom from the ggdist package).

If you prefer, replace the default fill colors, e.g. with shades of blue, using + scale_fill_brewer(palette="Blues").

Next obtain the posterior distribution for the difference in proportions for each of these dose levels compared with placebo.

Exercise 2: Ordinal outcome and interval censoring

Try to perform a more efficient statistical analysis by either using an ordinal outcome or treating the data as interval censored.

Interval censored

Try treating the logarithm of the amount of protein tolerated as interval censored (-\infty to < \log(3), \geq \log(3) to < \log(10), …) using logAVAL | cens("interval", logAVALnext) as the endpoint with family = gaussian(). For this approach you might have to set \log(0) to a very low number such as -28 (the approximate \log(\text{weight}) of a single peanut protein molecule) and to set the upper interval end, when a patient could tolerate 1000 mg, to, say, \log(30000) (approximately 100 peanut kernels). This achieves approximately the same thing as setting these to -\infty and \infty, but is handled better by brms/Stan.

How would you change the prior distributions in this case?

Ordinal outcome approach

Now use an ordinal outcome via family = cumulative(link = "logit", threshold = "flexible"). Details of this model are described in a journal article, for which there is also a PsyArXiv preprint in case you do not have access to the journal.

Note that the placebo response is described by a series of K-1 ordered thresholds \theta_1, \ldots, \theta_{K-1} on the logit scale for K ordered categories. For the placebo group, the cumulative probability of being in one of the categories 1, \ldots, k is given by \text{logit}^{-1}(\theta_k) for k=1, \ldots, K-1, and the probability of the highest category K is 1-\text{logit}^{-1}(\theta_{K-1}). In this case, these parameters will appear in summary(brmfit3) as Intercept[1], …, Intercept[6].
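For intuition, here is a small sketch (with hypothetical threshold values) of how such thresholds map to placebo category probabilities:

# hypothetical thresholds theta_1 < ... < theta_6 on the logit scale
theta <- c(0.0, 1.0, 1.8, 2.3, 3.0, 4.0)
# cumulative placebo probabilities P(category <= k) = logit^-1(theta_k)
p_cum <- plogis(theta)
# per-category probabilities for the K = 7 ordered categories
p_cat <- diff(c(0, p_cum, 1))
round(p_cat, 2)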

How would you change the model and the prior distributions in this case?

Note that if we had a baseline amount of protein tolerated, we could treat it as a monotonic covariate, e.g. using mo(BASE), or assume a particular functional form.

Solutions Dose-Finding

Exercise 1: Emax logistic regression

model1 <- bf( y | trials(n) ~ E0 + Emax * dose^h/(dose^h + ED50^h),
              family=binomial(link="logit"),
              nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
              E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1, 
              nl=TRUE)

priors1 <- prior(normal(-2.944, 0.38), nlpar=E0) +
           prior(normal(0, 1), nlpar=logh) +
           prior(normal(0, 6), nlpar=Emax) +
           prior(normal(2.7, 2.5), nlpar=logED50)


brmfit1 <- brm(
  formula = model1,
  prior = priors1, 
  data = summarize(peanut, y=sum(CRIT1FL=="Y"), n=n(), .by=c(TRT01P, dose)),
  refresh=0
)
Running MCMC with 4 parallel chains...

Chain 1 finished in 0.2 seconds.
Chain 2 finished in 0.2 seconds.
Chain 3 finished in 0.2 seconds.
Chain 4 finished in 0.2 seconds.

All 4 chains finished successfully.
Mean chain execution time: 0.2 seconds.
Total execution time: 0.3 seconds.
summary(brmfit1)
 Family: binomial 
  Links: mu = logit 
Formula: y | trials(n) ~ E0 + Emax * dose^h/(dose^h + ED50^h) 
         h ~ exp(logh)
         ED50 ~ exp(logED50)
         E0 ~ 1
         logED50 ~ 1
         logh ~ 1
         Emax ~ 1
   Data: summarize(peanut, y = sum(CRIT1FL == "Y"), n = n() (Number of observations: 6) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
E0_Intercept         -3.21      0.34    -3.88    -2.58 1.00     1467     1720
logED50_Intercept     3.63      1.19     1.94     6.50 1.00      913      954
logh_Intercept       -0.59      0.39    -1.26     0.21 1.00      897     1460
Emax_Intercept        5.46      1.58     3.37     9.44 1.00      777     1024

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
tibble(dose = seq(0, 300, 1), n=1) |> 
  tidybayes::add_epred_rvars(brmfit1) |>
  ggplot(aes(x=dose, ydist=.epred)) +
  stat_lineribbon() +
  scale_fill_brewer(palette="Blues") +
  ylab("Model predicted proportion") +
  xlab("Dose [mg]")

pr <- tibble(dose = seq(0, 300, 1), n=1) |> 
    tidybayes::add_epred_rvars(brmfit1)

left_join(pr, filter(pr, dose == 0)|> 
              dplyr::select(-dose) |> 
              rename(.pbo=.epred), 
          by="n") |>
    mutate(diff = .epred - .pbo) |>
    ggplot(aes(x=dose, ydist=diff)) +
    stat_lineribbon() +
    scale_fill_brewer(palette="Blues") +
    ylab("Model predicted difference\nin proportion vs. placebo") +
    xlab("Dose [mg]")

Exercise 2: Ordinal outcome and interval censoring

Interval censored

model2 <- bf( logAVAL | cens("interval", logAVALnext) ~ E0 + Emax * dose^h/(dose^h + ED50^h),
              family = gaussian(),
              nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
              E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1, 
              nl=TRUE )

priors2 <- prior(normal(0, 5), nlpar=E0) +       # centered on 1 mg, but very uncertain
    prior(normal(0, 1), nlpar=logh) +            # no particular reason to change the prior
                                                 # (dose response could be just as steep
                                                 # here as for the binomial outcome)
    prior(normal(0, 6.5), nlpar=Emax) +          # decent prior probability for >= 1000 mg,
                                                 # still conceivable could reach 300000
    prior(normal(2.7, 2.5), nlpar=logED50) +     # no particular reason to change the prior
                                                 # here (ED50 could be the same as for the
                                                 # binomial outcome)
    prior(lognormal(5, 10), class=sigma)         # additional parameter: the residual SD;
                                                 # clearly a decent amount of variability,
                                                 # want to allow a large SD on the log-scale

brmfit2 <- brm(
  formula = model2,
  prior = priors2,
  data = peanut |> 
    mutate(logAVAL = ifelse(AVAL==0, -28, log(AVAL)),
           logAVALnext = case_when(AVAL==0 ~ log(3),
                                   AVAL==3 ~ log(10),
                                   AVAL==10 ~ log(100),
                                   AVAL==100 ~ log(300),
                                   AVAL==300 ~ log(600),
                                   AVAL==600 ~ log(1000),
                                   TRUE ~ log(30000))),
  refresh=0
  )
Running MCMC with 4 parallel chains...

Chain 1 finished in 5.8 seconds.
Chain 2 finished in 5.9 seconds.
Chain 4 finished in 6.1 seconds.
Chain 3 finished in 6.4 seconds.

All 4 chains finished successfully.
Mean chain execution time: 6.1 seconds.
Total execution time: 6.5 seconds.
summary(brmfit2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: logAVAL | cens("interval", logAVALnext) ~ E0 + Emax * dose^h/(dose^h + ED50^h) 
         h ~ exp(logh)
         ED50 ~ exp(logED50)
         E0 ~ 1
         logED50 ~ 1
         logh ~ 1
         Emax ~ 1
   Data: mutate(peanut, logAVAL = ifelse(AVAL == 0, -28, lo (Number of observations: 300) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
E0_Intercept          0.90      0.40     0.11     1.67 1.00     2385     1902
logED50_Intercept     3.14      1.06     1.75     5.99 1.01     1064      769
logh_Intercept       -0.64      0.34    -1.25     0.05 1.00     1102     1641
Emax_Intercept        7.67      1.83     5.31    12.76 1.00      893      775

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     2.43      0.12     2.21     2.67 1.00     2397     2144

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Ordinal outcome approach

# Note that we omit E0 from the model, because the expected placebo outcome
# is already described by the intercept parameters of the distribution
model3 <- bf(AVAL ~ Emax * dose^h/(dose^h + ED50^h),
             family = cumulative(link = "logit", threshold = "flexible"),
             nlf(h ~ exp(logh)),
             nlf(ED50 ~ exp(logED50)),
             logED50 ~ 1,
             logh ~ 1,
             Emax ~ 1, 
             nl=TRUE)

priors3 <- prior(normal(0, 1), nlpar=logh) +        # no particular reason to change
                                                    # the prior here
           prior(normal(0, 6), nlpar=Emax) +        # prior could be different (odds from
                                                    # one category to the next are not
                                                    # necessarily the same as for the top
                                                    # two categories vs. the rest)
           prior(normal(2.7, 2.5), nlpar=logED50) + # no particular reason to change
                                                    # the prior here
           prior(student_t(3, 0, 2.5), class=Intercept) +
           prior(student_t(3, -0.2, 1), class=Intercept, coef=1) +
           prior(student_t(3, 2.197, 1), class=Intercept, coef=5)

brmfit3 <- brm(
  formula = model3,
  prior = priors3,
  data = mutate(peanut, AVAL = ordered(AVAL)),
  refresh=0
  )
Running MCMC with 4 parallel chains...

Chain 1 finished in 16.0 seconds.
Chain 2 finished in 16.7 seconds.
Chain 4 finished in 17.1 seconds.
Chain 3 finished in 18.1 seconds.

All 4 chains finished successfully.
Mean chain execution time: 17.0 seconds.
Total execution time: 18.3 seconds.
summary(brmfit3)
 Family: cumulative 
  Links: mu = logit; disc = identity 
Formula: AVAL ~ Emax * dose^h/(dose^h + ED50^h) 
         h ~ exp(logh)
         ED50 ~ exp(logED50)
         logED50 ~ 1
         logh ~ 1
         Emax ~ 1
   Data: mutate(peanut, AVAL = ordered(AVAL)) (Number of observations: 300) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]          0.06      0.25    -0.43     0.56 1.00     4027     2490
Intercept[2]          1.02      0.26     0.53     1.54 1.00     4363     2562
Intercept[3]          1.75      0.27     1.25     2.28 1.00     4462     3185
Intercept[4]          2.29      0.28     1.76     2.85 1.00     4671     3188
Intercept[5]          3.04      0.29     2.47     3.63 1.00     4581     2866
Intercept[6]          4.03      0.32     3.42     4.67 1.00     4743     2795
logED50_Intercept     4.00      1.17     2.31     6.83 1.00     1796     1916
logh_Intercept       -0.66      0.30    -1.17    -0.04 1.00     1924     2401
Emax_Intercept        5.71      1.60     3.58     9.63 1.00     1762     1813

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).