Title: | Fit and Tune Models to Detect Treatment Effect Heterogeneity |
---|---|
Description: | Implements methods to fit Virtual Twins models (Foster et al. (2011) <doi:10.1002/sim.4322>) for identifying subgroups with differential effects in the context of clinical trials while controlling the probability of falsely detecting a differential effect when the conditional average treatment effect is uniform across the study population using parameter selection methods proposed in Wolf et al. (2022) <doi:10.1177/17407745221095855>. |
Authors: | Jack Wolf [aut, cre] |
Maintainer: | Jack Wolf <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.3.2 |
Built: | 2024-10-31 18:37:09 UTC |
Source: | https://github.com/jackmwolf/tehtuner |
Finds the lowest penalty parameter for which the Step 2 model fit to the estimated CATEs from Step 1 is constant across all subjects.
get_mnpp(z, data, step2, Trt, Y, threshold)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
step2 |
a character string specifying the Step 2 model. Supports "lasso", "rtree", "classtree", or "ctree". |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
threshold |
for " |
Finds the lowest complexity parameter for a null classification tree fit
get_mnpp.classtree(z, data, Trt, Y, threshold)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
threshold |
for " |
the MNPP
Finds the lowest test statistic for a null conditional inference tree
get_mnpp.ctree(z, data, Trt, Y)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
the MNPP
Finds the lowest penalty parameter for a null lasso model.
get_mnpp.lasso(z, data, Trt, Y)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
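For a lasso fit, the smallest penalty that leaves an intercept-only model has a closed form, which the following base R sketch illustrates. It assumes a glmnet-style objective with standardized covariates; `lasso_null_lambda` is a hypothetical helper written for this illustration and is not a function of the package, whose own computation may differ.

```r
# Hypothetical helper (not part of tehtuner): smallest lasso penalty that
# zeroes every coefficient, assuming the objective
#   (1 / (2n)) * sum((z - b0 - X %*% b)^2) + lambda * sum(abs(b))
# with the columns of X standardized using the 1/n variance.
lasso_null_lambda <- function(z, X) {
  X <- as.matrix(X)
  n <- nrow(X)
  # Center and scale each column (population standard deviation)
  Xc <- sweep(X, 2, colMeans(X))
  sds <- sqrt(colMeans(Xc^2))
  Xs <- sweep(Xc, 2, sds, "/")
  # Largest absolute covariance between a covariate and the centered response
  max(abs(crossprod(Xs, z - mean(z)))) / n
}

set.seed(1)
X <- matrix(rnorm(200 * 3), ncol = 3)
z <- 0.5 * X[, 1] + rnorm(200)
lambda_null <- lasso_null_lambda(z, X)
```

Any penalty at or above `lambda_null` gives a constant fit under these assumptions, so a vector of identical CATE estimates yields a value of zero.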
Finds the lowest complexity parameter for a null regression tree fit
get_mnpp.rtree(z, data, Trt, Y)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
the MNPP
Permute a dataset under the null hypothesis and get the MNPP
get_theta_null(data, Trt, Y, zbar, step1, step2, threshold, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
zbar |
the estimated marginal treatment effect |
step1 |
character strings specifying the Step 1 model. Supports
either " |
step2 |
a character string specifying the Step 2 model. Supports "lasso", "rtree", "classtree", or "ctree". |
threshold |
for " |
... |
additional arguments to the Step 1 model call. |
the MNPP for the permuted data set
Get the appropriate Step 1 estimation function associated with a method
get_vt1(step1)
step1 |
character strings specifying the Step 1 model. Supports
either " |
a function that estimates the CATE through Step 1 of Virtual Twins
Get the appropriate Step 2 estimation function associated with a method
get_vt2(step2)
step2 |
a character string specifying the Step 2 model. Supports "lasso", "rtree", "classtree", or "ctree". |
a function that fits a model for the CATE through Step 2 of Virtual Twins
Sets the marginal treatment effect to zero and then permutes all treatment indicators.
permute(data, Trt, Y, zbar)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
zbar |
the estimated marginal treatment effect |
a permuted dataset of the same size as data
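The permutation step can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the package's source: it assumes the treatment indicator is coded 0/1 and that the marginal effect is removed by subtracting zbar from the treated subjects' responses. The name `permute_sketch` and the toy data are hypothetical.

```r
# Hypothetical re-implementation for illustration; assumes Trt is coded 0/1
# and that the marginal effect is removed by shifting treated responses.
permute_sketch <- function(data, Trt, Y, zbar) {
  treated <- data[[Trt]] == 1
  # Remove the marginal treatment effect from the treated arm
  data[[Y]][treated] <- data[[Y]][treated] - zbar
  # Shuffle treatment labels so they carry no information about Y
  data[[Trt]] <- sample(data[[Trt]])
  data
}

set.seed(42)
toy <- data.frame(Trt = rep(0:1, each = 50), V1 = rnorm(100))
toy$Y <- 1 + 2 * toy$Trt + toy$V1 + rnorm(100)
# Difference in arm means as the estimated marginal treatment effect
zbar <- mean(toy$Y[toy$Trt == 1]) - mean(toy$Y[toy$Trt == 0])
toy_null <- permute_sketch(toy, "Trt", "Y", zbar)
```

The covariates are left untouched, so the permuted data set keeps the same size and covariate distribution as the original.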
Prints a Virtual Twins model for the conditional average treatment effect with a tuned Step 2 model.
## S3 method for class 'tunevt'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
an object of class |
digits |
the number of significant digits to use when printing. |
... |
further arguments passed to or from other methods. |
An object of class "tunevt".
An object of class "tunevt" is a list containing at least the following components:
call |
the matched call |
vtmod |
the model estimated by the given |
mnpp |
the MNPP for the estimated CATEs from Step 1. |
theta_null |
a vector of the MNPPs from each permutation under the null hypothesis. |
pvalue |
the probability of observing an MNPP as extreme as or more extreme than the observed MNPP under the null hypothesis of no effect heterogeneity. |
z |
if |
Simulated data from a clinical trial with heterogeneous treatment effects where the CATE was a function of V1 and V9.
tehtuner_example
A data frame with 1000 rows and 12 columns:
Binary treatment indicator
Continuous response
Continuous covariates
Binary covariates
Fits a conditional inference tree with minimal test statistic theta
and tests if the tree has more than one terminal node.
test_null_theta_ctree(theta, z, data, Trt, Y)
theta |
a positive double |
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
a boolean. TRUE if theta is large enough to give a null conditional inference tree, FALSE otherwise.
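A TRUE/FALSE check of this kind can drive a search for the smallest parameter value that still gives a null model. The sketch below uses bisection as one possible strategy; it is an assumption for illustration, not a description of how the package searches. `find_min_null_theta` and the stand-in predicate are hypothetical, and the predicate must be monotone in theta.

```r
# Hypothetical bisection search (not the package's implementation).
# is_null(theta) must return TRUE when theta is large enough to give a
# null model and FALSE otherwise, and must be monotone in theta.
find_min_null_theta <- function(is_null, lower, upper, tol = 1e-6) {
  stopifnot(!is_null(lower), is_null(upper))
  while (upper - lower > tol) {
    mid <- (lower + upper) / 2
    # Keep the invariant: is_null(upper) is TRUE, is_null(lower) is FALSE
    if (is_null(mid)) upper <- mid else lower <- mid
  }
  upper
}

# Stand-in predicate: pretend the tree becomes null once theta reaches 3.7
theta_min <- find_min_null_theta(function(theta) theta >= 3.7, 0, 100)
```

In practice the predicate would refit the Step 2 model at each candidate value, so the tolerance trades accuracy against the number of refits.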
Permutes data under the null hypothesis of a constant treatment effect and
calculates the MNPP on each permuted data set. The 1 - alpha0 quantile of the
resulting distribution of MNPPs is returned.
tune_theta(data, Trt, Y, zbar, step1, step2, threshold, alpha0, p_reps, parallel, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
zbar |
the estimated marginal treatment effect |
step1 |
character strings specifying the Step 1 model. Supports
either " |
step2 |
a character string specifying the Step 2 model. Supports "lasso", "rtree", "classtree", or "ctree". |
threshold |
for " |
alpha0 |
the nominal Type I error rate. |
p_reps |
the number of permutations to run. |
parallel |
Should the loop over replications be parallelized? If
|
... |
additional arguments to the Step 1 model call. |
the estimated penalty parameter
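The quantile step can be illustrated with a short base R sketch. The null MNPPs here are simulated draws standing in for values that would come from refitting the models on each permuted data set, and the default quantile definition of `quantile()` is an assumption; the package may use a different one.

```r
# Hypothetical illustration: theta_null stands in for the MNPPs computed on
# p_reps permuted data sets (simulated here rather than produced by a model).
set.seed(7)
p_reps <- 200
alpha0 <- 0.05
theta_null <- rexp(p_reps, rate = 4)

# Penalty exceeded by only about alpha0 of the null MNPPs
theta_hat <- unname(quantile(theta_null, probs = 1 - alpha0))

# Share of null replicates whose MNPP exceeds the tuned penalty
exceed_rate <- mean(theta_null > theta_hat)
```

Because a Step 2 model is non-null only when its MNPP exceeds the penalty, `exceed_rate` approximates the chance of a false detection under the null, which is why it should sit close to alpha0.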
tunevt fits a Virtual Twins model to estimate factors and subgroups
associated with differential treatment effects while controlling the Type I
error rate of falsely detecting at least one heterogeneous effect when the
treatment effect is uniform across the study population.
tunevt(data, Y = "Y", Trt = "Trt", step1 = "randomforest", step2 = "rtree", alpha0, p_reps, threshold = NA, keepz = FALSE, parallel = FALSE, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Y |
a string specifying the name of the column of |
Trt |
a string specifying the name of the column of |
step1 |
character strings specifying the Step 1 model. Supports
either " |
step2 |
a character string specifying the Step 2 model. Supports "lasso", "rtree", "classtree", or "ctree". |
alpha0 |
the nominal Type I error rate. |
p_reps |
the number of permutations to run. |
threshold |
for " |
keepz |
logical. Should the estimated CATE from Step 1 be returned? |
parallel |
Should the loop over replications be parallelized? If
|
... |
additional arguments to the Step 1 model call. |
Virtual Twins is a two-step approach to detecting differential treatment
effects. Subjects' conditional average treatment effects (CATEs) are first
estimated in Step 1 using a flexible model. Then, a simple and interpretable
model is fit in Step 2 to model either (1) the expected value of these
estimated CATEs if step2 is equal to "lasso", "rtree", or "ctree" or (2) the
probability that the CATE is greater than a specified threshold if step2 is
equal to "classtree".
The Step 2 model depends on a tuning parameter. This parameter is
selected to control the Type I error rate by permuting the data under the
null hypothesis of a constant treatment effect and identifying the minimal
null penalty parameter (MNPP), which is the smallest penalty parameter that
yields a Step 2 model with no covariate effects. The 1-alpha0 quantile
of the distribution of MNPPs across permutations is then used to fit the
Step 2 model on the original data.
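As a rough illustration of the two steps described above, the following base R sketch estimates CATEs by predicting each subject's response under both treatment arms and taking the difference. It substitutes ordinary least squares with treatment-by-covariate interactions for the flexible Step 1 learners and a plain linear model for Step 2, so it is a simplified stand-in and not the package's method. The simulated data and all object names are hypothetical.

```r
# Hypothetical Step 1 using ordinary least squares in place of the flexible
# learners the package supports; column names follow the tunevt defaults.
set.seed(123)
n <- 500
dat <- data.frame(Trt = rbinom(n, 1, 0.5), V1 = rnorm(n), V2 = rnorm(n))
# True CATE is 1 + 2 * V1, so the treatment effect varies with V1
dat$Y <- dat$V2 + dat$Trt * (1 + 2 * dat$V1) + rnorm(n)

# Outcome model with treatment-by-covariate interactions
fit <- lm(Y ~ Trt * (V1 + V2), data = dat)

# Predict each subject's response under both arms (the "virtual twins")
z <- predict(fit, transform(dat, Trt = 1)) -
  predict(fit, transform(dat, Trt = 0))

# Step 2 then models the estimated CATEs using the covariates alone
step2_fit <- lm(z ~ V1 + V2, data = cbind(dat, z = z))
```

The Step 2 fit here is unpenalized; in the tuned procedure its penalty would be set from the permutation distribution of MNPPs so that covariate effects appear only when they exceed what the null hypothesis would produce.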
An object of class "tunevt".
An object of class "tunevt" is a list containing at least the following components:
call |
the matched call |
vtmod |
the model estimated by the given |
mnpp |
the MNPP for the estimated CATEs from Step 1. |
theta_null |
a vector of the MNPPs from each permutation under the null hypothesis. |
pvalue |
the probability of observing an MNPP as extreme as or more extreme than the observed MNPP under the null hypothesis of no effect heterogeneity. |
z |
if |
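The pvalue component can be read as a permutation p-value built from mnpp and theta_null. The sketch below shows one common way to compute such a value, as the share of null MNPPs at least as large as the observed one. The exact formula used by the package is not given in this manual, so treat `permutation_pvalue` as a hypothetical helper and the numbers as made-up inputs.

```r
# Hypothetical calculation; the exact formula used by the package (for
# example, whether a +1 correction is applied) is not shown in this manual.
permutation_pvalue <- function(mnpp, theta_null) {
  # Proportion of null MNPPs at least as large as the observed MNPP
  mean(theta_null >= mnpp)
}

theta_null <- c(0.10, 0.25, 0.05, 0.40, 0.15)
p <- permutation_pvalue(mnpp = 0.25, theta_null = theta_null)
```

With few permutations the attainable p-values are coarse (multiples of 1 / p_reps), which is one reason a larger p_reps is preferable outside of quick examples.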
Foster JC, Taylor JM, Ruberg SJ (2011). “Subgroup identification from randomized clinical trial data.” Statistics in Medicine, 30(24), 2867–2880. ISSN 02776715, doi:10.1002/sim.4322.
Wolf JM, Koopmeiners JS, Vock DM (2022). “A permutation procedure to detect heterogeneous treatment effects in randomized clinical trials while controlling the type I error rate.” Clinical Trials, 19(5), 512-521. ISSN 1740-7745, doi:10.1177/17407745221095855, Publisher: SAGE Publications.
Deng C, Wolf JM, Vock DM, Carroll DM, Hatsukami DK, Leng N, Koopmeiners JS (2023). “Practical guidance on modeling choices for the virtual twins method.” Journal of Biopharmaceutical Statistics. doi:10.1080/10543406.2023.2170404.
data(tehtuner_example)
# Low p_reps for example use only
tunevt(
  tehtuner_example, step1 = "lasso", step2 = "rtree",
  alpha0 = 0.2, p_reps = 5
)
Check if alpha0 is a valid input to tunevt
validate_alpha0(data, alpha0)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
alpha0 |
the nominal Type I error rate. |
TRUE if alpha0 is a valid input. Errors otherwise.
Check if p_reps is a valid input to tunevt
validate_p_reps(data, p_reps)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
p_reps |
the number of permutations to run. |
TRUE if p_reps is a valid input. Errors otherwise.
Check if Trt is a valid input to tunevt
validate_Trt(data, Trt)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
TRUE if Trt is a valid input. Errors otherwise.
Check if Y is a valid input to tunevt
validate_Y(data, Y)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Y |
a string specifying the name of the column of |
TRUE if Y is a valid input. Errors otherwise.
Estimate the CATE Using the Lasso for Step 1 of Virtual Twins
vt1_lasso(data, Trt, Y, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
... |
additional arguments to |
Estimated CATEs for each subject in data.
Other VT Step 1 functions: vt1_mars(), vt1_rf(), vt1_super()
Estimate the CATE Using MARS for Step 1 of Virtual Twins
vt1_mars(data, Trt, Y, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
... |
additional arguments to |
Estimated CATEs for each subject in data.
Other VT Step 1 functions: vt1_lasso(), vt1_rf(), vt1_super()
Estimate the CATE Using a Random Forest for Step 1 of Virtual Twins
vt1_rf(data, Trt, Y, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
... |
additional arguments to |
Estimated CATEs for each subject in data.
Other VT Step 1 functions: vt1_lasso(), vt1_mars(), vt1_super()
Estimate the CATE Using Super Learner for Step 1 of Virtual Twins
vt1_super(data, Trt, Y, SL.library, ...)
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
SL.library |
Either a character vector of prediction algorithms or a
list containing character vector. See |
... |
additional arguments to |
Estimated CATEs for each subject in data.
Other VT Step 1 functions: vt1_lasso(), vt1_mars(), vt1_rf()
Estimate the CATE using a classification tree for Step 2
vt2_classtree(z, data, Trt, Y, theta, threshold)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
theta |
tree complexity parameter ( |
threshold |
for " |
an object of class rpart. See rpart.object.
Other VT Step 2 functions: vt2_ctree(), vt2_lasso(), vt2_rtree()
Estimate the CATE using a conditional inference tree for Step 2
vt2_ctree(z, data, Trt, Y, theta)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
theta |
the value of the test statistic that must be exceeded in order
to implement a split ( |
An object of class BinaryTree-class. See BinaryTree-class.
Other VT Step 2 functions: vt2_classtree(), vt2_lasso(), vt2_rtree()
Estimate the CATE using the Lasso for Step 2
vt2_lasso(z, data, Trt, Y, theta)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
theta |
lasso penalty parameter ( |
a list of length 3 containing the following elements:
mod |
an object of class |
coefficients |
coefficients associated with the penalty parameter
|
fitted.values |
predicted values associated with the penalty parameter
|
Other VT Step 2 functions: vt2_classtree(), vt2_ctree(), vt2_rtree()
Estimate the CATE using a regression tree for Step 2
vt2_rtree(z, data, Trt, Y, theta)
z |
a numeric vector of estimated CATEs from Step 1 |
data |
a data frame containing a response, binary treatment indicators, and covariates. |
Trt |
a string specifying the name of the column of |
Y |
a string specifying the name of the column of |
theta |
tree complexity parameter ( |
an object of class rpart. See rpart.object.
Other VT Step 2 functions: vt2_classtree(), vt2_ctree(), vt2_lasso()