Overview

This document contains exploratory plots for binary response data as well as the R code that generates these graphs. The plots presented here are based on simulated data (see: PKPD Datasets). Data specifications can be accessed on Datasets and Rmarkdown template to generate this page can be found on Rmarkdown-Template. You may also download the Multiple Ascending Dose PK/PD dataset for your reference (download dataset).

Setup

library(gridExtra)
library(ggplot2)
library(dplyr)
library(xgxr)
library(tidyr)

#set seed for random number generator (for plots with jitter)
set.seed(12345)

#flag for labeling figures as draft
status = "DRAFT"

# ggplot settings
xgx_theme_set()

#directories for saving individual graphs
dirs = list(
  parent_dir = "Parent_Directory",
  rscript_dir  = "./",
  rscript_name = "Example.R",
  results_dir  = "./",
  filename_prefix   = "",
  filename     = "Example.png")

Load Dataset

The plots presented here are based on simulated data (see: PKPD Datasets). You may also download the Multiple Ascending Dose PK/PD dataset for your reference (download dataset).

data <- read.csv("../Data/Multiple_Ascending_Dose_Dataset2.csv")

DOSE_CMT = 1
PD_CMT = 6
SS_PROFDAY = 6 # steady state prof day
PD_PROFDAYS <- c(0, 2, 4, 6)

#ensure dataset has all the necessary columns
data = data %>%
  mutate(ID      = ID,     #ID   column
         TIME    = TIME,   #TIME column name 
         NOMTIME = NOMTIME,#NOMINAL TIME column name
         PROFDAY = case_when(
           NOMTIME < (SS_PROFDAY - 1)*24 ~ 1 + floor(NOMTIME / 24),
           NOMTIME >= (SS_PROFDAY - 1)*24 ~ SS_PROFDAY
         ), #PROFILE DAY day associated with profile, e.g. day of dose administration
         LIDV    = LIDV,   #DEPENDENT VARIABLE column name
         CENS    = CENS,   #CENSORING column name
         CMT     = CMT,    #COMPARTMENT column
         DOSE    = DOSE,   #DOSE column here (numeric value)
         TRTACT  = TRTACT, #DOSE REGIMEN column here (character, with units),
         DAY_label = ifelse(PROFDAY > 0, paste("Day", PROFDAY), "Baseline")
  )

#create a factor for the treatment variable for plotting
data = data %>%
  arrange(DOSE) %>%
  mutate(TRTACT_low2high = factor(TRTACT, levels = unique(TRTACT)),
         TRTACT_high2low = factor(TRTACT, levels = rev(unique(TRTACT)))) %>%
  select(-TRTACT)    

#create pd binary dataset
pd_data <- data %>%
  filter(CMT == PD_CMT) %>%
  mutate(LIDV_jitter = jitter(LIDV, amount = 0.1)) %>%
  mutate(DAY_jitter = jitter(PROFDAY, amount = 0.1))

#units and labels
  time_units_dataset = "hours"
  time_units_plot    = "days"
  trtact_label       = "Dose"
  dose_label         = "Dose (mg)"
  pd_label           = "Responder Rate (%)"
  pd2_label          = "Response"

Binary data is data that can take on one of two values. This often happens when there is a characteristic/event/response, etc. of interest that subjects either have/achieve or they don’t. Binary response data can also come out of dichotomizing continuous data. For example in the psoriasis indication the binary response variable PASI90 (yes/no) is defined as subjects achieving a PASI score of at least 90.

There are two broad categories of PK/PD exploratory plots covered on this page

  1. Exposure and Response vs Time, stratified by dose. You may also have heard these referred to as longitudinal (meaning over time).
  2. Response vs Exposure at a particular time. For binomial response vs exposure plots, fitting a logistic regression is often helpful, as you will see below.

These plots are displayed below.

Provide an overview of the data

We start with Response vs Time, or longitudinal plots

Binary data is often summarized by displaying the percent of subjects in the “yes” category, e.g. the percent of “responders,” along with 95% confidence intervals. For generating confidence intervals, it is useful to bin data points by nominal time; confidence intervals should be calculated using the binom::binom.exact() function.

Responder rate (95% CI) by nominal time, colored or faceted by dose

Questions to ask when looking at these plots

  • Does the response rate increase/decrease over time?
  • Does response change with increasing dose?
  • Do you notice a plateau in time?
  • Do you notice a plateau with dose (Emax)?

Warning: Even if you don’t see a plateau in response, that doesn’t mean there isn’t one. Be very careful about using linear models for Dose-Response relationships. Extrapolation outside of the observed dose range could indicate a higher dose is always better (even if it isn’t).

When overlaying individual binomial data, it is often helpful to stagger the dots and use transparency, so that it is easier to see individual data points. If you do so, this should be clearly stated.

  gg <- ggplot(data = pd_data, aes(x = NOMTIME, y = LIDV, color = TRTACT_high2low))
  gg <- gg + xgx_stat_ci(distribution = "binomial", position = position_dodge(width = 12), alpha = 0.5)
  gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_dataset, 
                                    units_plot    = time_units_plot)
  gg <- gg + scale_y_continuous(labels = scales::percent)
  gg <- gg + labs(y = pd_label, color = trtact_label)  
  gg <- gg + xgx_annotate_status(status)
  
  #if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
  #xgx_save(width,height,dirs,"filename_main",status)
  print(gg)

  gg <- ggplot(data = pd_data, aes(x = PROFDAY, y = LIDV))
  gg <- gg + geom_point(aes(x = DAY_jitter, y = LIDV_jitter), size = 2, alpha = 0.3)
  gg <- gg + xgx_stat_ci(conf_level = 0.95, distribution = "binomial", geom = list("line", "ribbon"))

  gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_plot, 
                                    units_plot    = time_units_plot)
  gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
  gg <- gg + labs(y = pd_label)  
  gg <- gg + facet_grid(~TRTACT_low2high)
  gg <- gg + xgx_annotate_status(status)
  
  #if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
  #xgx_save(width,height,dirs,"filename_main",status)
  print(gg)

Explore variability

Spaghetti plots of binary response over time, faceted by dose

Use spaghetti plots to visualize the extent of variability between individuals. The wider the spread of the profiles, the higher the between subject variability. Distinguish different doses by color, or separate into different panels.

Binary spaghetti plots may not be as informative as spaghetti plots for continuous data, however you can pick out the transition states from 0 to 1 and back. By looking at the transition states, ask yourself these questions: For each dose group, at what time have most of the subjects “responded”? How wide is the spread of “response time”” across subjects (do all of them respond at the exact same time, or over a range of time?) Do you detect any dampening, or transitioning of response back to no response?

gg <- ggplot(data = pd_data, 
             aes(x = DAY_jitter, y = LIDV_jitter))
gg <- gg + geom_point( size = 2, alpha = 0.3)
gg <- gg + geom_line(aes(group = ID), alpha = 0.3) 
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_plot, 
                                    units_plot    = time_units_plot)
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25))
gg <- gg + labs(y = pd2_label)  
gg <- gg + facet_grid(~TRTACT_low2high)
gg <- gg + xgx_annotate_status(status)

#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)

Explore irregularities in profiles

Plot individual profiles in order to inspect them for any unusual profiles. Look at the pattern of response/non-response over time and whether individuals switch from responder back to non-responder.

Response over time, faceted by individual, individual line plots

gg <- ggplot(data = pd_data, 
             aes(x = NOMTIME, y = LIDV))
gg <- gg + geom_point(size = 2) 
gg <- gg + geom_line(aes(group = ID)) 
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_dataset, 
                                    units_plot  = time_units_plot)
gg <- gg + facet_wrap(~ID + TRTACT_low2high, 
                ncol = 10)
gg <- gg + labs(y = pd2_label)  
gg <- gg + xgx_annotate_status(status)

#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)

Explore covariate effects on PD

Stratify by covariates of interest to explore whether any key covariates impact response. For examples of plots and code startifying by covariate, see Single Ascending Dose Covariate Section

Warning Be careful of interpreting covariate effects on PD. Covariate effects on PD could be the result of covariate effects on PK transfering to PD through the PK/PD relationship.

Explore Dose-Response relationship

One of the key questions when looking at PD markers is to determine if there is a dose-response relationship, and if there is, what dose is necessary to achieve the desired effect? Simple dose-response plots can give insight into these questions.

Responder rate (95% CI) by dose, for endpoint of interest

Both linear scale and log scale may be used for dose. Linear scale helps in seeing a plateau at large doses. Log-scale often helps to see all dose groups. If the plot on log-scale looks linear, a log-linear dose-response model could be fit, i.e. Response = Elog*log(1+C).

Questions to consider

  • Do you see any relationship?
  • Does response increase or decrease with dose?
  • Is there a plateau (Emax or Emin) with drug effect? At what dose?

Warning: Even if you don’t see an Emax, that doesn’t mean there isn’t one. Be very careful about using linear models for Dose-Response relationships. Extrapolation outside of the observed dose range could indicate a higher dose is always better (even if it isn’t).

pd_data_to_plot <- pd_data %>% subset(PROFDAY == SS_PROFDAY)

gg <- ggplot(data = pd_data_to_plot, aes(x = DOSE,y = LIDV))
gg <- gg + xgx_stat_ci(conf_level = .95,distribution = "binomial", geom = list("point","errorbar"))
gg <- gg + geom_smooth(method = "glm", method.args = list(family = binomial(link = logit)), color = "black")
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
gg <- gg + coord_cartesian(ylim = c(0, 1))
gg <- gg + labs(x = dose_label, y = pd_label)
gg <- gg + xgx_annotate_status(status)

pp1 <- gg + ggtitle("Dose on Linear Scale")


## Same plot but on a log scale
pp2 <- gg + scale_x_log10(breaks = unique(pd_data_to_plot$DOSE))+ ggtitle("Dose on Log Scale")

grid.arrange(pp1,pp2,nrow = 1)

Responder rate (95% CI) by dose, faceted by visit

The crossectional Dose-Response curve which looks only at one timepoint defined in the protocol can obscure certain characteristics of the dose-response relationship. For example, if the response variable is much delayed compared to PK the maximal PD effect could occur much later than steady state PK is achieved. Looking only at the defined clinical endpoint has the potential miss this, especially in early clinical trials before the time course of the effect has been characterized. In addtion to looking at longitudinal PD over time (as in previous sections above), plotting the cross-sectional Dose-Response curves for different time points throughout the study can also be helpful. In the figure below, we plot the exposure response relationship at a number of different visits to see how the relationsihp might change over time.

pd_data_to_plot <- pd_data %>% subset(PROFDAY %in% PD_PROFDAYS)

gg <- ggplot(data = pd_data_to_plot, aes(x = DOSE,y = LIDV))
gg <- gg + xgx_stat_ci(conf_level = .95,distribution = "binomial", geom = list("point","errorbar"))
gg <- gg + geom_smooth(method = "glm", method.args = list(family = binomial(link = logit)), color = "black")
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
gg <- gg + coord_cartesian(ylim = c(0, 1))
gg <- gg + labs(x = dose_label, y = pd_label)
gg + facet_grid(~DAY_label)

gg <- gg + xgx_annotate_status(status)

R Session Info

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux Server 7.9 (Maipo)
## 
## Matrix products: default
## BLAS/LAPACK: /CHBS/apps/EB/software/imkl/2019.1.144-gompi-2019a/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
##  [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] DT_0.26         forcats_0.5.2   stringr_1.4.1   purrr_0.3.5     readr_2.1.3     tibble_3.1.8    tidyverse_1.3.2 xgxr_1.1.1      zoo_1.8-11     
## [10] gridExtra_2.3   tidyr_1.2.1     dplyr_1.0.10    ggplot2_3.3.6  
## 
## loaded via a namespace (and not attached):
##   [1] googledrive_2.0.0   colorspace_2.0-3    deldir_1.0-6        ellipsis_0.3.2      class_7.3-19        htmlTable_2.2.1     markdown_1.2       
##   [8] base64enc_0.1-3     fs_1.5.2            gld_2.6.2           rstudioapi_0.14     proxy_0.4-26        farver_2.1.1        Deriv_4.1.3        
##  [15] fansi_1.0.3         mvtnorm_1.1-3       lubridate_1.8.0     xml2_1.3.3          codetools_0.2-18    splines_4.1.0       cachem_1.0.6       
##  [22] rootSolve_1.8.2.2   knitr_1.40          Formula_1.2-4       jsonlite_1.8.3      broom_1.0.1         binom_1.1-1         cluster_2.1.3      
##  [29] dbplyr_2.2.1        png_0.1-7           compiler_4.1.0      httr_1.4.4          backports_1.4.1     assertthat_0.2.1    Matrix_1.5-1       
##  [36] fastmap_1.1.0       gargle_1.2.1        cli_3.4.1           prettyunits_1.1.1   htmltools_0.5.3     tools_4.1.0         gtable_0.3.1       
##  [43] glue_1.6.2          lmom_2.8            Rcpp_1.0.9          cellranger_1.1.0    jquerylib_0.1.4     vctrs_0.5.0         nlme_3.1-160       
##  [50] crosstalk_1.2.0     xfun_0.34           rvest_1.0.3         lifecycle_1.0.3     googlesheets4_1.0.1 MASS_7.3-58.1       scales_1.2.1       
##  [57] hms_1.1.2           expm_0.999-6        RColorBrewer_1.1-3  yaml_2.3.6          Exact_2.1           pander_0.6.4        sass_0.4.2         
##  [64] rpart_4.1.16        reshape_0.8.8       latticeExtra_0.6-30 stringi_1.7.8       highr_0.9           e1071_1.7-8         checkmate_2.1.0    
##  [71] boot_1.3-28         rlang_1.0.6         pkgconfig_2.0.3     bitops_1.0-7        evaluate_0.17       lattice_0.20-45     htmlwidgets_1.5.4  
##  [78] labeling_0.4.2      tidyselect_1.2.0    GGally_2.1.2        plyr_1.8.7          magrittr_2.0.3      R6_2.5.1            DescTools_0.99.42  
##  [85] generics_0.1.3      Hmisc_4.7-0         DBI_1.1.3           pillar_1.8.1        haven_2.5.1         foreign_0.8-82      withr_2.5.0        
##  [92] mgcv_1.8-41         survival_3.4-0      RCurl_1.98-1.4      nnet_7.3-17         crayon_1.5.2        modelr_0.1.9        interp_1.1-2       
##  [99] utf8_1.2.2          tzdb_0.3.0          rmarkdown_2.17      progress_1.2.2      jpeg_0.1-9          grid_4.1.0          readxl_1.4.1       
## [106] minpack.lm_1.2-1    data.table_1.14.2   reprex_2.0.2        digest_0.6.30       munsell_0.5.0       bslib_0.4.0