This document contains exploratory plots for binary response data as well as the R code that generates these graphs. The plots presented here are based on simulated data (see: PKPD Datasets). Data specifications can be accessed on Datasets and Rmarkdown template to generate this page can be found on Rmarkdown-Template. You may also download the Multiple Ascending Dose PK/PD dataset for your reference (download dataset).
library(gridExtra)
library(ggplot2)
library(dplyr)
library(xgxr)
library(tidyr)
#set seed for random number generator (for plots with jitter)
set.seed(12345)
#flag for labeling figures as draft
status = "DRAFT"
# ggplot settings
xgx_theme_set()
#directories for saving individual graphs
dirs = list(
parent_dir = "Parent_Directory",
rscript_dir = "./",
rscript_name = "Example.R",
results_dir = "./",
filename_prefix = "",
filename = "Example.png")
The plots presented here are based on simulated data (see: PKPD Datasets). You may also download the Multiple Ascending Dose PK/PD dataset for your reference (download dataset).
data <- read.csv("../Data/Multiple_Ascending_Dose_Dataset2.csv")
DOSE_CMT = 1
PD_CMT = 6
SS_PROFDAY = 6 # steady state prof day
PD_PROFDAYS <- c(0, 2, 4, 6)
#ensure dataset has all the necessary columns
data = data %>%
mutate(ID = ID, #ID column
TIME = TIME, #TIME column name
NOMTIME = NOMTIME,#NOMINAL TIME column name
PROFDAY = case_when(
NOMTIME < (SS_PROFDAY - 1)*24 ~ 1 + floor(NOMTIME / 24),
NOMTIME >= (SS_PROFDAY - 1)*24 ~ SS_PROFDAY
), #PROFILE DAY day associated with profile, e.g. day of dose administration
LIDV = LIDV, #DEPENDENT VARIABLE column name
CENS = CENS, #CENSORING column name
CMT = CMT, #COMPARTMENT column
DOSE = DOSE, #DOSE column here (numeric value)
TRTACT = TRTACT, #DOSE REGIMEN column here (character, with units),
DAY_label = ifelse(PROFDAY > 0, paste("Day", PROFDAY), "Baseline")
)
#create a factor for the treatment variable for plotting
data = data %>%
arrange(DOSE) %>%
mutate(TRTACT_low2high = factor(TRTACT, levels = unique(TRTACT)),
TRTACT_high2low = factor(TRTACT, levels = rev(unique(TRTACT)))) %>%
select(-TRTACT)
#create pd binary dataset
pd_data <- data %>%
filter(CMT == PD_CMT) %>%
mutate(LIDV_jitter = jitter(LIDV, amount = 0.1)) %>%
mutate(DAY_jitter = jitter(PROFDAY, amount = 0.1))
#units and labels
time_units_dataset = "hours"
time_units_plot = "days"
trtact_label = "Dose"
dose_label = "Dose (mg)"
pd_label = "Responder Rate (%)"
pd2_label = "Response"
Binary data is data that can take on one of two values. This often happens when there is a characteristic/event/response, etc. of interest that subjects either have/achieve or they don’t. Binary response data can also come out of dichotomizing continuous data. For example in the psoriasis indication the binary response variable PASI90 (yes/no) is defined as subjects achieving a PASI score of at least 90.
There are two broad categories of PK/PD exploratory plots covered on this page
These plots are displayed below.
We start with Response vs Time, or longitudinal plots
Binary data is often summarized by displaying the percent of subjects
in the “yes” category, e.g. the percent of “responders,” along with 95%
confidence intervals. For generating confidence intervals, it is useful
to bin data points by nominal time; confidence intervals should be
calculated using the binom::binom.exact()
function.
Questions to ask when looking at these plots
Warning: Even if you don’t see a plateau in response, that doesn’t mean there isn’t one. Be very careful about using linear models for Dose-Response relationships. Extrapolation outside of the observed dose range could indicate a higher dose is always better (even if it isn’t).
When overlaying individual binomial data, it is often helpful to
stagger the dots and use transparency, so that it is easier to see
individual data points. If you do so, this should be clearly
stated.
gg <- ggplot(data = pd_data, aes(x = NOMTIME, y = LIDV, color = TRTACT_high2low))
gg <- gg + xgx_stat_ci(distribution = "binomial", position = position_dodge(width = 12), alpha = 0.5)
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_dataset,
units_plot = time_units_plot)
gg <- gg + scale_y_continuous(labels = scales::percent)
gg <- gg + labs(y = pd_label, color = trtact_label)
gg <- gg + xgx_annotate_status(status)
#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)
gg <- ggplot(data = pd_data, aes(x = PROFDAY, y = LIDV))
gg <- gg + geom_point(aes(x = DAY_jitter, y = LIDV_jitter), size = 2, alpha = 0.3)
gg <- gg + xgx_stat_ci(conf_level = 0.95, distribution = "binomial", geom = list("line", "ribbon"))
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_plot,
units_plot = time_units_plot)
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
gg <- gg + labs(y = pd_label)
gg <- gg + facet_grid(~TRTACT_low2high)
gg <- gg + xgx_annotate_status(status)
#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)
Use spaghetti plots to visualize the extent of variability between individuals. The wider the spread of the profiles, the higher the between subject variability. Distinguish different doses by color, or separate into different panels.
Binary spaghetti plots may not be as informative as spaghetti plots for continuous data, however you can pick out the transition states from 0 to 1 and back. By looking at the transition states, ask yourself these questions: For each dose group, at what time have most of the subjects “responded”? How wide is the spread of “response time”” across subjects (do all of them respond at the exact same time, or over a range of time?) Do you detect any dampening, or transitioning of response back to no response?
gg <- ggplot(data = pd_data,
aes(x = DAY_jitter, y = LIDV_jitter))
gg <- gg + geom_point( size = 2, alpha = 0.3)
gg <- gg + geom_line(aes(group = ID), alpha = 0.3)
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_plot,
units_plot = time_units_plot)
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25))
gg <- gg + labs(y = pd2_label)
gg <- gg + facet_grid(~TRTACT_low2high)
gg <- gg + xgx_annotate_status(status)
#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)
Plot individual profiles in order to inspect them for any unusual
profiles. Look at the pattern of response/non-response over time and
whether individuals switch from responder back to non-responder.
gg <- ggplot(data = pd_data,
aes(x = NOMTIME, y = LIDV))
gg <- gg + geom_point(size = 2)
gg <- gg + geom_line(aes(group = ID))
gg <- gg + xgx_scale_x_time_units(units_dataset = time_units_dataset,
units_plot = time_units_plot)
gg <- gg + facet_wrap(~ID + TRTACT_low2high,
ncol = 10)
gg <- gg + labs(y = pd2_label)
gg <- gg + xgx_annotate_status(status)
#if saving copy of figure, replace xgx_annotate lines with xgx_save() shown below:
#xgx_save(width,height,dirs,"filename_main",status)
print(gg)
Stratify by covariates of interest to explore whether any key covariates impact response. For examples of plots and code startifying by covariate, see Single Ascending Dose Covariate Section
Warning Be careful of interpreting covariate effects on PD. Covariate effects on PD could be the result of covariate effects on PK transfering to PD through the PK/PD relationship.
One of the key questions when looking at PD markers is to determine if there is a dose-response relationship, and if there is, what dose is necessary to achieve the desired effect? Simple dose-response plots can give insight into these questions.
Both linear scale and log scale may be used for dose. Linear scale helps in seeing a plateau at large doses. Log-scale often helps to see all dose groups. If the plot on log-scale looks linear, a log-linear dose-response model could be fit, i.e. Response = Elog*log(1+C).
Questions to consider
Warning: Even if you don’t see an Emax, that doesn’t mean there isn’t one. Be very careful about using linear models for Dose-Response relationships. Extrapolation outside of the observed dose range could indicate a higher dose is always better (even if it isn’t).
pd_data_to_plot <- pd_data %>% subset(PROFDAY == SS_PROFDAY)
gg <- ggplot(data = pd_data_to_plot, aes(x = DOSE,y = LIDV))
gg <- gg + xgx_stat_ci(conf_level = .95,distribution = "binomial", geom = list("point","errorbar"))
gg <- gg + geom_smooth(method = "glm", method.args = list(family = binomial(link = logit)), color = "black")
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
gg <- gg + coord_cartesian(ylim = c(0, 1))
gg <- gg + labs(x = dose_label, y = pd_label)
gg <- gg + xgx_annotate_status(status)
pp1 <- gg + ggtitle("Dose on Linear Scale")
## Same plot but on a log scale
pp2 <- gg + scale_x_log10(breaks = unique(pd_data_to_plot$DOSE))+ ggtitle("Dose on Log Scale")
grid.arrange(pp1,pp2,nrow = 1)
The crossectional Dose-Response curve which looks only at one timepoint defined in the protocol can obscure certain characteristics of the dose-response relationship. For example, if the response variable is much delayed compared to PK the maximal PD effect could occur much later than steady state PK is achieved. Looking only at the defined clinical endpoint has the potential miss this, especially in early clinical trials before the time course of the effect has been characterized. In addtion to looking at longitudinal PD over time (as in previous sections above), plotting the cross-sectional Dose-Response curves for different time points throughout the study can also be helpful. In the figure below, we plot the exposure response relationship at a number of different visits to see how the relationsihp might change over time.
pd_data_to_plot <- pd_data %>% subset(PROFDAY %in% PD_PROFDAYS)
gg <- ggplot(data = pd_data_to_plot, aes(x = DOSE,y = LIDV))
gg <- gg + xgx_stat_ci(conf_level = .95,distribution = "binomial", geom = list("point","errorbar"))
gg <- gg + geom_smooth(method = "glm", method.args = list(family = binomial(link = logit)), color = "black")
gg <- gg + scale_y_continuous(breaks = seq(0, 1, 0.25), labels = scales::percent)
gg <- gg + coord_cartesian(ylim = c(0, 1))
gg <- gg + labs(x = dose_label, y = pd_label)
gg + facet_grid(~DAY_label)
gg <- gg + xgx_annotate_status(status)
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux Server 7.9 (Maipo)
##
## Matrix products: default
## BLAS/LAPACK: /CHBS/apps/EB/software/imkl/2019.1.144-gompi-2019a/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
## [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] DT_0.26 forcats_0.5.2 stringr_1.4.1 purrr_0.3.5 readr_2.1.3 tibble_3.1.8 tidyverse_1.3.2 xgxr_1.1.1 zoo_1.8-11
## [10] gridExtra_2.3 tidyr_1.2.1 dplyr_1.0.10 ggplot2_3.3.6
##
## loaded via a namespace (and not attached):
## [1] googledrive_2.0.0 colorspace_2.0-3 deldir_1.0-6 ellipsis_0.3.2 class_7.3-19 htmlTable_2.2.1 markdown_1.2
## [8] base64enc_0.1-3 fs_1.5.2 gld_2.6.2 rstudioapi_0.14 proxy_0.4-26 farver_2.1.1 Deriv_4.1.3
## [15] fansi_1.0.3 mvtnorm_1.1-3 lubridate_1.8.0 xml2_1.3.3 codetools_0.2-18 splines_4.1.0 cachem_1.0.6
## [22] rootSolve_1.8.2.2 knitr_1.40 Formula_1.2-4 jsonlite_1.8.3 broom_1.0.1 binom_1.1-1 cluster_2.1.3
## [29] dbplyr_2.2.1 png_0.1-7 compiler_4.1.0 httr_1.4.4 backports_1.4.1 assertthat_0.2.1 Matrix_1.5-1
## [36] fastmap_1.1.0 gargle_1.2.1 cli_3.4.1 prettyunits_1.1.1 htmltools_0.5.3 tools_4.1.0 gtable_0.3.1
## [43] glue_1.6.2 lmom_2.8 Rcpp_1.0.9 cellranger_1.1.0 jquerylib_0.1.4 vctrs_0.5.0 nlme_3.1-160
## [50] crosstalk_1.2.0 xfun_0.34 rvest_1.0.3 lifecycle_1.0.3 googlesheets4_1.0.1 MASS_7.3-58.1 scales_1.2.1
## [57] hms_1.1.2 expm_0.999-6 RColorBrewer_1.1-3 yaml_2.3.6 Exact_2.1 pander_0.6.4 sass_0.4.2
## [64] rpart_4.1.16 reshape_0.8.8 latticeExtra_0.6-30 stringi_1.7.8 highr_0.9 e1071_1.7-8 checkmate_2.1.0
## [71] boot_1.3-28 rlang_1.0.6 pkgconfig_2.0.3 bitops_1.0-7 evaluate_0.17 lattice_0.20-45 htmlwidgets_1.5.4
## [78] labeling_0.4.2 tidyselect_1.2.0 GGally_2.1.2 plyr_1.8.7 magrittr_2.0.3 R6_2.5.1 DescTools_0.99.42
## [85] generics_0.1.3 Hmisc_4.7-0 DBI_1.1.3 pillar_1.8.1 haven_2.5.1 foreign_0.8-82 withr_2.5.0
## [92] mgcv_1.8-41 survival_3.4-0 RCurl_1.98-1.4 nnet_7.3-17 crayon_1.5.2 modelr_0.1.9 interp_1.1-2
## [99] utf8_1.2.2 tzdb_0.3.0 rmarkdown_2.17 progress_1.2.2 jpeg_0.1-9 grid_4.1.0 readxl_1.4.1
## [106] minpack.lm_1.2-1 data.table_1.14.2 reprex_2.0.2 digest_0.6.30 munsell_0.5.0 bslib_0.4.0