Title: | Super Learner for Survival Prediction from Censored Data |
---|---|
Description: | Several functions and S3 methods to construct a super learner in the presence of censored times-to-event and to evaluate its prognostic capacities. |
Authors: | Yohann Foucher [aut, cre], Camille Sabathe [aut] |
Maintainer: | Yohann Foucher <[email protected]> |
License: | GPL (>=2) |
Version: | 0.97 |
Built: | 2025-01-30 10:26:57 UTC |
Source: | https://github.com/foucher-y/survivalsl |
A data frame with 1912 French kidney transplant recipients from the DIVAT cohort.
data(dataDIVAT2)
A data frame with the 6 following variables:
age
This numeric vector provides the age of the recipient at the transplantation (in years).
hla
This numeric vector provides the indicator of transplantations with at least 4 HLA incompatibilities between the donor and the recipient (1 for high level and 0 otherwise).
retransplant
This numeric vector provides the indicator of re-transplantation (1 for more than one transplantation and 0 for first kidney transplantation).
ecd
This numeric vector provides the Expanded Criteria Donor (ECD) indicator (1 for transplantations from ECD and 0 otherwise). ECD are defined by widely accepted criteria, which include donors older than 60 years of age, or donors 50-59 years of age with two of the following characteristics: history of hypertension, cerebrovascular accident as the cause of death, or terminal serum creatinine higher than 1.5 mg/dL.
times
This numeric vector provides the follow-up time of each patient.
failures
This numeric vector is the event indicator (0=right censored, 1=event). An event is a return to dialysis or a patient death with a functioning graft.
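The ECD rule described for the `ecd` variable can be written as a small classifier. This is only a sketch: `is_ecd`, its argument names, and the toy donors are hypothetical and not part of the package. It assumes that "older than 60" is coded as an age of 60 or more (so that no gap follows the 50-59 band) and that "two of the following" means at least two.

```r
# Hypothetical helper (not part of the package): flags Expanded Criteria Donors.
# Assumptions: age >= 60 qualifies on its own; donors aged 50-59 need at least
# two of the three risk characteristics.
is_ecd <- function(donor_age, hypertension, cva_death, creatinine_mg_dl) {
  n_risk <- as.integer(hypertension) +
    as.integer(cva_death) +
    as.integer(creatinine_mg_dl > 1.5)
  as.integer(donor_age >= 60 | (donor_age >= 50 & donor_age < 60 & n_risk >= 2))
}

# Made-up donors for illustration only
donors <- data.frame(
  donor_age        = c(65, 55, 55, 40),
  hypertension     = c(FALSE, TRUE, TRUE, TRUE),
  cva_death        = c(FALSE, TRUE, FALSE, TRUE),
  creatinine_mg_dl = c(1.0, 1.0, 1.2, 2.0)
)
donors$ecd <- with(donors, is_ecd(donor_age, hypertension, cva_death, creatinine_mg_dl))
print(donors)
```

The third donor is in the 50-59 band but has a single risk characteristic, so it is not flagged; the fourth has two characteristics but is too young.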
URL: www.divat.fr
Le Borgne F, Giraudeau B, Querard AH, Giral M and Foucher Y. Comparisons of the performances of different statistical tests for time-to-event analysis with confounding factors: practical illustrations in kidney transplantation. Statistics in Medicine. 35(7):1103-16, 2016. <doi:10.1002/sim.6777>
library(survival)  # provides Surv() and coxph()
data(dataDIVAT2)

# Compute the non-adjusted Hazard Ratio related to the ECD versus SCD
cox.ecd <- coxph(Surv(times, failures) ~ ecd, data=dataDIVAT2)
summary(cox.ecd) # Hazard Ratio = 1.97
A data frame with 4267 French kidney transplant recipients.
data(dataDIVAT3)
A data frame with 4267 observations for the 8 following variables.
ageR
This numeric vector represents the age of the recipient (in years)
sexeR
This numeric vector represents the gender of the recipient (1=male, 0=female)
year.tx
This numeric vector represents the year of the transplantation
ante.diab
This numeric vector represents the diabetes status (1=yes, 0=no)
pra
This numeric vector represents the pre-graft immunization using the panel reactive antibody (1=detectable, 0=undetectable)
ageD
This numeric vector represents the age of the donor (in years)
death.time
This numeric vector represents the follow up time in days (until death or censoring)
death
This numeric vector represents the death indicator at the follow-up end (1=death, 0=alive)
URL: www.divat.fr
Le Borgne et al. Standardized and weighted time-dependent ROC curves to evaluate the intrinsic prognostic capacities of a marker by taking into account confounding factors. Stat Methods Med Res. 27(11):3397-3410, 2018. <doi:10.1177/0962280217702416>
library(survival)  # provides Surv() and survfit()
data(dataDIVAT3)

### A short summary of the recipient age at transplantation
summary(dataDIVAT3$ageR)

### Kaplan-Meier estimation of the recipient survival
plot(survfit(Surv(death.time/365.25, death) ~ 1, data = dataDIVAT3),
  xlab="Post transplantation time (in years)", ylab="Patient survival",
  mark.time=FALSE)
A data frame with 1300 simulated French patients with multiple sclerosis from the OFSEP cohort. The baseline is 1 year after the initiation of the first-line treatment.
data(dataOFSEP)
A data frame with 1300 observations for the 11 following variables:
time
This numeric vector represents the follow up time in years (until disease progression or censoring)
event
This numeric vector represents the disease progression indicator at the follow-up end (1=progression, 0=censoring)
age
This numeric vector represents the patient age (in years) at baseline.
duration
This numeric vector represents the disease duration (in days) at baseline.
period
This numeric vector represents the calendar period: 1 if between 2014 and 2018, and 0 otherwise.
gender
This numeric vector represents the gender: 1 for women.
relapse
This numeric vector represents the diagnosis of at least one relapse since treatment initiation: 1 if at least one relapse, and 0 otherwise.
edss
This character vector represents the EDSS level: "miss" for missing, "low" for an EDSS from 0 to 2, and "high" otherwise.
t1
This character vector represents new gadolinium-enhancing T1 lesions: "missing", "0", or "1+" for at least 1 lesion.
t2
This character vector represents new T2 lesions: "no" or "yes".
rio
This numeric vector represents the modified Rio score.
library(survival)  # provides Surv() and survfit()
data(dataOFSEP)

### Kaplan-Meier estimation of the disease progression free survival
plot(survfit(Surv(time, event) ~ 1, data = dataOFSEP),
  ylab="Disease progression free survival",
  xlab="Time after the first anniversary of the first-line treatment in years")
Fit an AFT parametric model with a gamma distribution.
LIB_AFTgamma(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="gamma" in the flexsurvreg function of the flexsurv package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
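The `times` and `predictions` components are meant to be read together: column j of the matrix holds the survival predicted at the j-th time. A minimal sketch of how one might read survival at arbitrary horizons follows. The helper `surv_at` and the toy numbers are hypothetical, and holding survival constant between two listed times is a simplifying assumption of this sketch, not something the package prescribes.

```r
# Hypothetical helper: read predicted survival at arbitrary horizons from a
# times vector and a subjects-by-times prediction matrix.
surv_at <- function(times, predictions, horizons) {
  stopifnot(length(times) == ncol(predictions), !is.unsorted(times))
  idx <- findInterval(horizons, times)   # index of the last time <= horizon
  out <- matrix(1, nrow = nrow(predictions), ncol = length(horizons),
                dimnames = list(NULL, paste0("t=", horizons)))
  keep <- idx > 0                        # horizons before the first time keep S = 1
  out[, keep] <- predictions[, idx[keep], drop = FALSE]
  out
}

# Toy stand-ins for model$times and model$predictions (made-up numbers)
toy_times <- c(1, 2, 5)
toy_pred  <- rbind(c(0.9, 0.8, 0.5),
                   c(0.7, 0.6, 0.2))
res <- surv_at(toy_times, toy_pred, horizons = c(0.5, 2, 3))
print(res)
```

With a fitted object, the same call would take `model$times` and `model$predictions` in place of the toy inputs.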
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTgamma(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit an AFT parametric model with a generalized gamma distribution.
LIB_AFTggamma(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="gengamma" in the flexsurvreg function of the flexsurv package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTggamma(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit an AFT parametric model with a log logistic distribution.
LIB_AFTllogis(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="llogis" in the flexsurvreg function of the flexsurv package.
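To make the shape of a log-logistic AFT prediction concrete, here is a minimal base-R sketch of its survival function. It assumes the parameterization documented for flexsurv's log-logistic distribution, S(t) = 1 / (1 + (t / scale)^shape), with covariates acting on the log of the scale. The function `llogis_surv` and every parameter value are made up for illustration and are not taken from the package.

```r
# Survival function of a log-logistic AFT model, assuming the parameterization
# S(t) = 1 / (1 + (t / scale)^shape) with covariates acting on log(scale).
# All parameter values below are invented for illustration.
llogis_surv <- function(t, shape, log_scale0, beta, x) {
  scale_i <- exp(log_scale0 + sum(beta * x))  # subject-specific scale
  1 / (1 + (t / scale_i)^shape)
}

t_grid <- c(0, 1, 5, 10)
# Reference subject (covariate at 0) and a subject 20 units higher
s_ref <- llogis_surv(t_grid, shape = 1.5, log_scale0 = log(5), beta = -0.03, x = 0)
s_old <- llogis_surv(t_grid, shape = 1.5, log_scale0 = log(5), beta = -0.03, x = 20)
print(round(rbind(reference = s_ref, older = s_old), 3))
```

Under this parameterization the scale is the median survival time, so a negative coefficient shortens the median and shifts the whole curve toward earlier times.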
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTllogis(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit an AFT parametric model with a Weibull distribution.
LIB_AFTweibull(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="weibull" in the flexsurvreg function of the flexsurv package.
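To illustrate what a Weibull AFT fit and its subject-by-time survival predictions look like, here is a minimal sketch on simulated data. It deliberately uses `survreg` from the survival package rather than `flexsurvreg`, to keep dependencies small, and it is not the package's implementation. The simulated data, the coefficient values, and the object names are all assumptions.

```r
library(survival)

# Simulated training data (invented values, for illustration only)
set.seed(1)
n <- 200
sim <- data.frame(age = rnorm(n, 50, 10), ecd = rbinom(n, 1, 0.3))
true_lp <- 3 - 0.02 * sim$age - 0.4 * sim$ecd     # log of the Weibull scale
event_time <- rweibull(n, shape = 1.5, scale = exp(true_lp))
cens_time <- runif(n, 0, 15)
sim$times <- pmin(event_time, cens_time)
sim$failures <- as.integer(event_time <= cens_time)

# Weibull accelerated failure time model
fit <- survreg(Surv(times, failures) ~ age + ecd, data = sim, dist = "weibull")

# Under survreg's parameterization: S(t | x) = exp(-(t / exp(lp))^(1 / scale))
lp <- predict(fit, type = "lp")
grid <- sort(unique(sim$times))
pred <- t(sapply(lp, function(l) exp(-(grid / exp(l))^(1 / fit$scale))))
dim(pred)   # subjects in rows, times in columns
```

The resulting matrix has the same layout as the `predictions` value described for this function: one row per subject and one column per time.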
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTweibull(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a Cox regression for a selection of covariates.
LIB_COXaic(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, final.model)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
final.model |
The covariates to consider |
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXaic(times="times", failures="failures", data=dataDIVAT2,
  final.model=c("age"), cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a Cox regression for all covariates to be used in the super learner.
LIB_COXall(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The Cox regression is obtained by using the survival package.
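As a rough illustration of how a subject-by-time survival matrix can be derived from a Cox fit in the survival package, the sketch below combines the cumulative baseline hazard with each subject's linear predictor. It runs on simulated data and is not the package's own code; the data, coefficient values, and object names are assumptions.

```r
library(survival)

# Simulated training data (invented values, for illustration only)
set.seed(2)
n <- 150
sim <- data.frame(age = rnorm(n, 50, 10), hla = rbinom(n, 1, 0.4))
event_time <- rexp(n, rate = 0.05 * exp(0.03 * (sim$age - 50) + 0.5 * sim$hla))
cens_time <- runif(n, 0, 30)
sim$times <- pmin(event_time, cens_time)
sim$failures <- as.integer(event_time <= cens_time)

fit <- coxph(Surv(times, failures) ~ age + hla, data = sim)

# Cumulative baseline hazard H0(t) for a subject with all covariates at 0
bh <- basehaz(fit, centered = FALSE)
# Linear predictor x'beta on the same (uncentered) scale
lp <- as.vector(as.matrix(sim[, c("age", "hla")]) %*% coef(fit))
# S(t | x) = exp(-H0(t) * exp(x'beta)): one row per subject, one column per time
pred <- exp(-outer(exp(lp), bh$hazard))
dim(pred)
```

Here `bh$time` plays the role of the `times` value and `bh$hazard` that of the `hazard` value described for this function, under the assumptions stated above.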
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Terry M. Therneau (2021). A Package for Survival Analysis in R. R package version 3.2-13, https://CRAN.R-project.org/package=survival.
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXall(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit an elastic net Cox regression for fixed values of the regularization parameters.
LIB_COXen(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, alpha, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
alpha |
The value of the regularization parameter alpha for penalizing the partial likelihood. |
lambda |
The value of the regularization parameter lambda for penalizing the partial likelihood. |
The elastic net Cox regression is obtained by using the glmnet package.
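For orientation, the roles of `alpha` and `lambda` can be read from the penalized criterion that glmnet maximizes for Cox models, as given in Simon et al. (2011). The symbols n (sample size), p (number of covariates), and the partial log-likelihood written as \ell(\beta) are notation introduced here, not taken from this help page.

```latex
\hat{\beta} = \arg\max_{\beta}
\left[
  \frac{2}{n}\,\ell(\beta)
  - \lambda \left( \alpha \sum_{j=1}^{p} |\beta_j|
  + \frac{1-\alpha}{2} \sum_{j=1}^{p} \beta_j^{2} \right)
\right]
```

In this form, lambda sets the overall strength of the penalty and alpha mixes its two parts: alpha = 1 leaves only the absolute-value (Lasso) term and alpha = 0 leaves only the squared (ridge) term.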
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  lambda=.1, alpha=.1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a Lasso Cox regression for a fixed value of the regularization parameter.
LIB_COXlasso(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
lambda |
The value of the regularization parameter lambda for penalizing the partial likelihood. |
The Lasso Cox regression is obtained by using the glmnet package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (rows) for each observed time (columns). |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a ridge Cox regression for a fixed value of the regularization parameter.
LIB_COXridge(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
lambda |
The value of the regularization parameter lambda for penalizing the partial likelihood. |
The ridge Cox regression is obtained by using the glmnet package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
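For a Cox-type model, the hazard and predictions values are linked by the usual proportional hazards relation S(t | x) = exp(-H0(t) exp(x'beta)). The sketch below illustrates that relation in base R. The values of H0, beta and x are hypothetical and are not taken from the output of the package.

```r
# Hypothetical cumulative baseline hazard H0(t) at four prediction times
times <- c(1, 2, 5, 10)
H0    <- c(0.02, 0.05, 0.12, 0.30)

# Hypothetical regression coefficients and covariates of two subjects (rows)
beta <- c(age = 0.03, ecd = 0.40)
x    <- rbind(c(age = 50, ecd = 0),
              c(age = 50, ecd = 1))

# Linear predictor of each subject
lp <- as.vector(x %*% beta)

# S(t | x) = exp(-H0(t) * exp(lp)): subjects in rows, times in columns
predictions <- exp(-outer(exp(lp), H0))

dim(predictions)
```

The resulting matrix has the same layout as the predictions value described above: one line per subject and one column per time.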
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a PH model with an Exponential distribution.
LIB_PHexponential(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="exp" in the flexsurvreg function of the flexsurv package.
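As an illustration of what an exponential PH model implies, the sketch below computes the survival curves S(t | x) = exp(-rate * exp(beta * x) * t) for an unexposed and an exposed subject. The values of rate and beta_ecd are hypothetical, and surv_exp is a helper written for this example only. It is not a function of the package.

```r
# Hypothetical baseline rate (events per year) and log hazard ratio
rate     <- 0.05
beta_ecd <- 0.40

surv_exp <- function(t, ecd) {
  # Constant hazard rate * exp(beta_ecd * ecd), hence S(t) = exp(-hazard * t)
  exp(-rate * exp(beta_ecd * ecd) * t)
}

times <- 0:15
s0 <- surv_exp(times, ecd = 0)   # unexposed subject
s1 <- surv_exp(times, ecd = 1)   # exposed subject
```

Under this model the ratio log(s1) / log(s0) is constant over time and equals the hazard ratio exp(beta_ecd).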
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# The estimation of the model from the first 200 rows
model <- LIB_PHexponential(times="times", failures="failures",
  data=dataDIVAT2[1:200,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a PH parametric model with a Gompertz distribution.
LIB_PHgompertz(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
The model is obtained by using dist="gompertz" in the flexsurvreg function of the flexsurv package.
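The sketch below shows the survival function implied by a Gompertz PH model, assuming the shape/rate parametrisation used by flexsurv, in which the hazard is rate * exp(shape * t). The helper surv_gompertz and the parameter values are hypothetical and serve only as an illustration.

```r
surv_gompertz <- function(t, shape, rate, lp = 0) {
  # Hazard:            rate * exp(lp) * exp(shape * t)
  # Cumulative hazard: rate * exp(lp) / shape * (exp(shape * t) - 1)
  # (shape must be different from 0)
  exp(-rate * exp(lp) / shape * (exp(shape * t) - 1))
}

times <- seq(0, 15, by = 0.5)
s.ref <- surv_gompertz(times, shape = 0.1, rate = 0.05)            # reference subject
s.exp <- surv_gompertz(times, shape = 0.1, rate = 0.05, lp = 0.4)  # linear predictor 0.4
```

A positive shape gives a hazard that increases with time, and a negative shape gives a decreasing hazard.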
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# The estimation of the model from the first 200 rows
model <- LIB_PHgompertz(times="times", failures="failures",
  data=dataDIVAT2[1:200,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a PH model in which the survival function is modelled as a natural cubic spline function.
LIB_PHspline(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, k)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
k |
Number of knots. |
The model is obtained by using scale="hazard" in the flexsurv package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
hazard |
A vector of numeric values with the values of the cumulative baseline hazard function at the prediction |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
data(dataDIVAT2)

# The estimation of the model from the first 200 rows with two knots
model <- LIB_PHspline(times="times", failures="failures",
  data=dataDIVAT2[1:200,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"), k=2)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a neural network based on the partial logistic regression.
LIB_PLANN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, inter, size, decay, maxit, MaxNWts)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
inter |
The length of the intervals. |
size |
The number of units in the hidden layer. |
decay |
The parameter for weight decay. |
maxit |
The maximum number of iterations. |
MaxNWts |
The maximum allowable number of weights. |
This function is based on survivalPLANN from the related package.
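The partial logistic approach works on follow-up times cut into intervals of length inter, with one record per subject and per interval at risk. The sketch below shows one way such an expansion could look in base R. The helper expand_intervals is hypothetical and is not the code used by the package.

```r
expand_intervals <- function(times, failures, inter) {
  # Index of the last interval entered: (0, inter], (inter, 2*inter], ...
  last <- pmax(1, ceiling(times / inter))
  rows <- lapply(seq_along(times), function(i) {
    k <- seq_len(last[i])
    data.frame(id = i,
               interval = k,
               # The event is recorded only in the last interval of a failing subject
               event = as.integer(k == last[i] & failures[i] == 1))
  })
  do.call(rbind, rows)
}

long <- expand_intervals(times = c(1.2, 0.4, 2.0), failures = c(1, 0, 1),
                         inter = 0.5)

# Survival as the cumulative product of (1 - conditional hazard) per interval
h <- c(0.10, 0.20, 0.15)   # hypothetical interval-specific hazards
S <- cumprod(1 - h)
```

A smaller value of inter gives a finer time grid but more records, which increases the size of the training data seen by the network.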
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Biganzoli E, Boracchi P, Mariani L, and et al. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Stat Med, 17:1169-86, 1998.
data(dataDIVAT2)

# The neural network fitted from the first 300 individuals of the database
model <- LIB_PLANN(times="times", failures="failures", data=dataDIVAT2[1:300,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  inter=0.5, size=32, decay=0.01, maxit=100, MaxNWts=10000)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a survival random forest for given values of the regularization parameters.
LIB_RSF(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, nodesize, mtry, ntree)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
nodesize |
The value of the node size. |
mtry |
The number of variables randomly sampled as candidates at each split. |
ntree |
The number of trees. |
The survival random forest is obtained by using the randomForestSRC package.
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

# The estimation of the model
model <- LIB_RSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  nodesize=10, mtry=2, ntree=100)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
Fit a 1-layer neural network based on the partial likelihood from a Cox proportional hazards model.
LIB_SNN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, n.nodes, decay, batch.size, epochs)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
n.nodes |
The number of hidden nodes. |
decay |
The value of the weight decay. |
batch.size |
The value of batch size. |
epochs |
The value of epochs. |
This function is based on deepsurv from the survivalmodels package. You need to call Python using reticulate. In order to use it, the required Python packages must be installed with reticulate::py_install. Therefore, before running the present LIB_SNN function, you must install and load the reticulate and survivalmodels packages, and install pycox by using the following command: install_pycox(pip = TRUE, install_torch = FALSE). The survivalSL package functions without these supplementary installations if this learner is not included in the library.
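The preparation steps described above could be scripted as in the following sketch. The helper snn_ready is hypothetical and is not part of the package. It only checks that the two R packages named in the text are installed, and the Python installation command is left commented out because it needs a working Python and network access.

```r
# Check whether the R-side dependencies of LIB_SNN are installed
snn_ready <- function() {
  needed <- c("reticulate", "survivalmodels")
  ok <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
  if (!all(ok)) {
    message("Missing R package(s): ", paste(needed[!ok], collapse = ", "))
  }
  all(ok)
}

if (snn_ready()) {
  library(reticulate)
  library(survivalmodels)
  # One-off installation of the Python side, with the command given in the text;
  # uncomment to run
  # install_pycox(pip = TRUE, install_torch = FALSE)
}
```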
model |
The estimated model. |
group |
The name of the variable related to the exposure/treatment. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. |
data |
The data frame used for learning. The first column is entitled |
times |
A vector of numeric values with the times of the |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
Katzman, J. L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (2018). DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. https://doi.org/10.1186/s12874-018-0482-1
Compute several metrics to evaluate the prognostic capacities with time-to-event data.
metrics(times, failures, data, prediction.matrix, prediction.times, metric, pro.time=NULL, ROC.precision=seq(.01, .99, by=.01))
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
data |
A data frame in which to look for the variables related to the status of the follow-up time ( |
prediction.matrix |
A matrix with the predictions of survivals of each subject (lines) for each prognostic times (columns). |
prediction.times |
A vector of numeric values with the times of the |
metric |
The metric to compute. See details. |
pro.time |
This optional value of prognostic time represents the maximum delay for which the capacity of the variable is evaluated. It is expressed in the same unit as the argument times. Not used for the following metrics: "loglik", "ibs", "bll", and "ibll". The default value is the time at which half of the subjects are still at risk. |
ROC.precision |
An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. Only used when |
The following metrics can be used: "bs" for the Brier score at the prognostic time pro.time, "ci" for the concordance index at the prognostic time pro.time, "loglik" for the log-likelihood, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the prognostic time pro.time, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.
A numeric value with the metric estimation.
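To make the roles of prediction.matrix, prediction.times and pro.time concrete, the sketch below computes a Brier score at pro.time on toy data in which every event is observed. All values are hypothetical. With censored data the calculation has to account for censoring, which this simplified sketch does not do, so it is not a substitute for the metrics function.

```r
# Toy prediction matrix: 4 subjects (lines) x 3 prediction times (columns)
prediction.times  <- c(1, 5, 10)
prediction.matrix <- rbind(c(0.95, 0.80, 0.60),
                           c(0.90, 0.60, 0.30),
                           c(0.98, 0.90, 0.80),
                           c(0.85, 0.50, 0.20))
times    <- c(12, 4, 15, 7)   # hypothetical follow-up times
failures <- c(1, 1, 1, 1)     # all events observed: no censoring in this toy case
pro.time <- 10

# Predicted survival at pro.time: last prediction time not exceeding pro.time
j     <- max(which(prediction.times <= pro.time))
s.hat <- prediction.matrix[, j]

# Observed status at pro.time (1 = still event-free)
alive <- as.numeric(times > pro.time)

# Brier score without censoring weights
bs <- mean((alive - s.hat)^2)
bs
```

Lower values indicate predictions closer to the observed status, and a constant prediction of 0.5 gives 0.25.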
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The apparent AUC at 10 years post-transplantation
metrics(times="times", failures="failures", data=dataDIVAT2,
  prediction.matrix=model$predictions, prediction.times=model$times,
  metric="auc", pro.time=10)

# The restricted integrated Brier score up to 10 years post-transplantation
metrics(times="times", failures="failures", data=dataDIVAT2,
  prediction.matrix=model$predictions, prediction.times=model$times,
  metric="ribs", pro.time=10)
A calibration plot of an object of the class libsl (library of survival super learner).
## S3 method for class 'libsl'
plot(x, n.groups=5, pro.time=NULL, newdata=NULL, times=NULL, failures=NULL, ...)
x |
An object returned by a library of survival super learner. |
n.groups |
A numeric value with the number of groups formed according to the predicted probabilities. The default is 5. |
pro.time |
The prognostic time at which the calibration plot of the survival probabilities is drawn. |
newdata |
An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is |
times |
The name of the variable related to the numeric vector with the follow-up times in |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event) in |
... |
Additional arguments affecting the plot. |
The plot represents the observed survival and the related 95% confidence intervals, which are respectively estimated by the Kaplan and Meier estimator and the Greenwood formula, against the mean of the predictive values for individuals stratified into groups of the same size according to the percentiles. The identity line is usually included for reference.
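The grouping behind such a calibration plot can be sketched in base R as follows. The data are simulated, the helper km_at is written for this example only, and the Greenwood confidence intervals are omitted for brevity. This is an illustration of the principle under these assumptions, not the code of the plot method.

```r
set.seed(1)
n        <- 200
pro.time <- 12
n.groups <- 5
pred     <- runif(n, 0.2, 0.95)                  # hypothetical predicted survival at pro.time
time     <- rexp(n, rate = -log(pred) / pro.time) # event times consistent with pred
cens     <- runif(n, 0, 25)                      # hypothetical censoring times
obs.time <- pmin(time, cens)
status   <- as.integer(time <= cens)

# Kaplan-Meier estimate of the survival at t0
km_at <- function(time, status, t0) {
  s <- 1
  for (u in sort(unique(time[status == 1 & time <= t0]))) {
    s <- s * (1 - sum(time == u & status == 1) / sum(time >= u))
  }
  s
}

# Groups of equal size defined by the percentiles of the predictions
grp <- cut(pred, quantile(pred, seq(0, 1, length.out = n.groups + 1)),
           include.lowest = TRUE, labels = FALSE)

calib <- data.frame(
  predicted = as.vector(tapply(pred, grp, mean)),
  observed  = sapply(split(seq_len(n), grp),
                     function(i) km_at(obs.time[i], status[i], pro.time))
)
calib
```

Plotting calib$observed against calib$predicted together with the identity line gives the type of display described above. Points close to the line indicate good calibration.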
No return value for this S3 method.
data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXall(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The calibration plot from the validation sample of 150 patients
plot(model, n.groups=5, pro.time=12, col=3,
  xlab="Predicted 12-year survival", ylab="Observed 12-year survival",
  newdata=dataDIVAT2[151:300,], times="times", failures="failures")
A plot of ROC curves is produced.
## S3 method for class 'rocrisca'
plot(x, ..., information=TRUE)
x |
An object of class |
... |
Additional arguments affecting the plot. |
information |
A logical value indicating whether the non-information line is plotted. The default value is TRUE. |
No return value for this S3 method.
data(dataDIVAT3)

# A subgroup analysis to reduce the time needed for this example
dataDIVAT3 <- dataDIVAT3[1:400,]

# The time-dependent ROC curve to evaluate the
# capacities of the recipient age for the prognosis of post-kidney
# transplant mortality up to 2000 days.
# Compute the raw sensitivity and specificity
roc1 <- roc(times="death.time", failures="death", variable="ageR",
  confounders=~1, data=dataDIVAT3, pro.time=2000,
  precision=seq(0.1,0.9, by=0.2))

plot(roc1, type="b", col=1, pch=2, lty=2, xlab="1-specificity",
  ylab="sensitivity")
A calibration plot of a Super Learner obtained by the function survivalSL.
## S3 method for class 'sltime'
plot(x, method, n.groups, pro.time, newdata, times, failures, ...)
x |
An object returned by the function |
method |
A character string with the name of the algorithm included in the SL for which the calibration plot is performed. The default is "sl" for the Super Learner. |
n.groups |
A numeric value with the number of groups formed according to the predicted probabilities. The default is 5. |
pro.time |
The prognostic time at which the calibration plot of the survival probabilities is drawn. |
newdata |
An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is |
times |
The name of the variable related to the numeric vector with the follow-up times in |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event) in |
... |
Additional arguments affecting the plot. |
The plot represents the observed survival and the related 95% confidence intervals, which are respectively estimated by the Kaplan and Meier estimator and the Greenwood formula, against the mean of the predictive values for individuals stratified into groups of the same size according to the percentiles. The identity line is usually included for reference.
No return value for this S3 method.
data(dataDIVAT2)

# The outcome model based on a Super Learner from the first 150 individuals
# of the database
sl1 <- survivalSL(methods=c("LIB_AFTgamma", "LIB_PHgompertz"), metric="ci",
  data=dataDIVAT2[1:150,], times="times", failures="failures", group="ecd",
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant"), cv=3)

# The calibration plot from the validation sample of 150 patients
plot(sl1, method="sl", n.groups=5, pro.time=12, col=2,
  xlab="Predicted 12-year survival", ylab="Observed 12-year survival",
  newdata=dataDIVAT2[151:300,], times="times", failures="failures")
Predict the survival based on a model or algorithm from an object of the class libsl.
## S3 method for class 'libsl'
predict(object, newdata, newtimes, ...)
object |
An object returned by the function |
newdata |
An optional data frame containing covariate values at which to produce predicted values. There must be a column for every covariate included in |
newtimes |
The times at which to produce predicted values. The default value is |
... |
For future methods. |
The model object is obtained from the flexsurvreg package.
times |
A vector of numeric values with the times of the |
predictions |
A matrix with the predictions of survivals of each subject (lines) for each observed time (columns). |
data(dataDIVAT2)

# The estimation of the model from the first 200 lines
model <- LIB_PHgompertz(times="times", failures="failures",
  data=dataDIVAT2[1:200,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"))

# Predicted survival for 2 new subjects
pred <- predict(model, newdata=data.frame(age=c(52,52), hla=c(0,1),
  retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions[1,], x=pred$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
lines(y=pred$predictions[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)
legend("bottomright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))
Predict the survival of new observations based on a super learner (SL) obtained with the survivalSL function.
## S3 method for class 'sltime'
predict(object, newdata, newtimes, ...)
object |
An object returned by the function |
newdata |
An optional data frame containing covariate values at which to produce predicted values. There must be a column for every covariate included in |
newtimes |
The times at which to produce predicted values. The default value is |
... |
For future methods. |
times |
A vector of numeric values with the times of the |
predictions |
A matrix with the predicted survival probabilities, with one row per subject and one column per time. |
data(dataDIVAT2)

# The training of the super learner from the first 150 individuals of the database
sl1 <- survivalSL(methods=c("LIB_COXridge", "LIB_AFTggamma"), metric="ci",
  data=dataDIVAT2[1:150,], times="times", failures="failures", pro.time=12,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), cv=3)

# Individual prediction for 2 new subjects
pred <- predict(sl1,
  newdata=data.frame(age=c(52,52), hla=c(0,1), retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions$sl[1,], x=pred$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
lines(y=pred$predictions$sl[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)
legend("bottomright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))
Print the model or algorithm.
## S3 method for class 'libsl'
print(x, ...)
x |
An object returned by the function |
... |
For future methods. |
No return value for this S3 method.
LIB_AFTgamma, LIB_AFTggamma, LIB_AFTllogis, LIB_AFTweibull, LIB_PHexponential, LIB_PHgompertz.
data(dataDIVAT2)

model <- LIB_AFTgamma(times="times", failures="failures", data=dataDIVAT2[1:100,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

print(model)
Print the contribution of learners included in the super learner.
## S3 method for class 'sltime'
print(x, digits=7, ...)
x |
An object returned by the function |
digits |
An optional integer for the number of digits to print when printing numeric values. |
... |
For future methods. |
No return value for this S3 method.
data(dataDIVAT2)

sl1 <- survivalSL(methods=c("LIB_COXridge", "LIB_AFTggamma"), metric="ci",
  data=dataDIVAT2[1:150,], times="times", failures="failures", pro.time=12,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), cv=3)

print(sl1, digits=4)
This function estimates a time-dependent ROC curve while accounting for possible confounding factors. The method relies on standardization and weighting based on an IPW estimator.
roc(times, failures, variable, confounders, data, pro.time, precision=seq(.01, .99, by=.01))
times |
A character string with the name of the variable in |
failures |
A character string with the name of the variable in |
variable |
A character string with the name of the variable in |
confounders |
An object of class "formula". More precisely only the right part with an expression of the form |
data |
An object of the class |
pro.time |
The prognostic time, that is, the maximum delay for which the capacity of the variable is evaluated. It must be expressed in the same unit as the one used in the argument |
precision |
The quantiles (between 0 and 1) of the prognostic variable used for computing each point of the time-dependent ROC curve. 0 (min) and 1 (max) are not allowed. |
This function computes a confounder-adjusted time-dependent ROC curve from right-censored data. We adapted the naive IPCW estimator explained by Blanche, Dartigues and Jacqmin-Gadda (2013) by considering the probability of experiencing the event of interest before the fixed prognostic time, given the possible confounding factors.
table |
This data frame presents the sensitivities and specificities associated with the cut-off values. |
auc |
The area under the time-dependent ROC curve for a prognostic up to |
Blanche et al. (2013) Review and comparison of ROC curve estimators for a time-dependent outcome with marker-dependent censoring. Biometrical Journal, 55, 687-704. <doi:10.1002/bimj.201200045>
Le Borgne et al. Standardized and weighted time-dependent ROC curves to evaluate the intrinsic prognostic capacities of a marker by taking into account confounding factors. Stat Methods Med Res. 27(11):3397-3410, 2018. <doi:10.1177/0962280217702416>
library(splines) # bs() used in the confounders formula comes from splines

# import and attach the data example
data(dataDIVAT3)

# A subgroup analysis to reduce the time needed for this example
dataDIVAT3 <- dataDIVAT3[1:400,]

# The standardized and weighted time-dependent ROC curve to evaluate the
# capacities of the recipient age for the prognosis of post kidney
# transplant mortality up to 2000 days by taking into account the
# donor age and the recipient gender.

# 1. Standardize the marker according to the covariates among the controls
lm1 <- lm(ageR ~ ageD + sexeR, data=dataDIVAT3[dataDIVAT3$death.time >= 2500,])
dataDIVAT3$ageR_std <- (dataDIVAT3$ageR - (lm1$coef[1] + lm1$coef[2] *
  dataDIVAT3$ageD + lm1$coef[3] * dataDIVAT3$sexeR)) / sd(lm1$residuals)

# 2. Compute the sensitivity and specificity from the proposed IPW estimators
roc2 <- roc(times="death.time", failures="death", variable="ageR_std",
  confounders=~bs(ageD, df=3) + sexeR, data=dataDIVAT3, pro.time=2000,
  precision=seq(0.1,0.9, by=0.2))

# The corresponding ROC graph
plot(roc2, col=2, pch=2, lty=1, type="b", xlab="1-specificity", ylab="sensitivity")

# The corresponding AUC
roc2$auc
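The link between the returned table and the auc value can be pictured with the trapezoidal rule. The sketch below is an assumption about how such an area may be obtained, not the package's implementation: the data frame, its column names se and sp, and the helper auc_trapz are hypothetical, and the numbers are invented.

```r
# Hypothetical table of sensitivities (se) and specificities (sp),
# one row per cut-off of the prognostic variable
roc_table <- data.frame(se = c(0.90, 0.70, 0.40),
                        sp = c(0.30, 0.60, 0.85))

# Area under the ROC curve by the trapezoidal rule, after adding the
# (0, 0) and (1, 1) corners and sorting by 1 - specificity
auc_trapz <- function(se, sp) {
  x <- c(0, 1 - sp, 1)
  y <- c(0, se, 1)
  o <- order(x, y)
  x <- x[o]
  y <- y[o]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

auc_value <- auc_trapz(roc_table$se, roc_table$sp)
print(auc_value)
```

With few cut-offs the trapezoidal area is a coarse approximation, which is one reason to use a fine precision grid.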
Return predictive performances of a model or algorithm obtained by a library of the class libsl.
## S3 method for class 'libsl'
summary(object, newdata=NULL, ROC.precision=seq(.01,.99,.01), digits=7, ...)
object |
An object returned by a library of the class |
newdata |
An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is |
ROC.precision |
An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. 0 (min) and 1 (max) are not allowed. By default, the precision is |
digits |
An optional integer for the number of digits to print when printing numeric values. |
... |
Additional arguments affecting the summary which are passed from |
The following metrics are returned: "brier" for the Brier score at the prognostic time pro.time, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the last observed time of event, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.
No return value for this S3 method.
LIB_AFTgamma, LIB_AFTggamma, LIB_AFTllogis, LIB_AFTweibull, LIB_PHexponential, LIB_PHgompertz.
data(dataDIVAT2)

# The training of the Gompertz model with the first 400 patients
model <- LIB_PHgompertz(times="times", failures="failures", data=dataDIVAT2[1:400,],
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

# The prognostic capacities from the same training sample
# (up to 4 years for several indicators)
summary(model, pro.time=4)

# The prognostic capacities from a validation of the next 150 patients
# (up to 4 years for several indicators)
summary(model, pro.time=4, newdata=dataDIVAT2[401:550,],
  times="times", failures="failures")
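To make the Brier score at a prognostic time more concrete, here is a minimal sketch of an inverse-probability-of-censoring-weighted (IPCW) version. It is an illustration under stated assumptions, not the code used by the package: brier_ipcw is a hypothetical helper, the censoring distribution is estimated by a Kaplan-Meier fit from the survival package, tied times are ignored, and the toy data are invented.

```r
library(survival)

# Hypothetical helper: IPCW Brier score at time t
#   time, status : follow-up times and event indicators (0 = censored, 1 = event)
#   surv_t       : predicted survival probability at time t for each subject
brier_ipcw <- function(time, status, surv_t, t) {
  # Kaplan-Meier estimate of the censoring survival function G(u)
  km_cens <- survfit(Surv(time, 1 - status) ~ 1)
  G <- stepfun(km_cens$time, c(1, km_cens$surv))

  early_event <- time <= t & status == 1   # event observed before t
  at_risk     <- time > t                  # still event-free at t

  # Subjects censored before t receive a weight of 0
  w <- numeric(length(time))
  w[early_event] <- 1 / G(time[early_event])
  w[at_risk]     <- 1 / G(t)

  mean(w * (as.numeric(at_risk) - surv_t)^2)
}

# Toy data: the second subject is censored before t = 2.5
bs <- brier_ipcw(time = c(1, 2, 3, 4), status = c(1, 0, 1, 1),
                 surv_t = c(0.2, 0.2, 0.8, 0.8), t = 2.5)
print(bs)
```

The integrated versions ("ibs", "ribs") can be thought of as averaging such a score over a range of times.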
Return goodness-of-fit indicators of a Super Learner obtained by the function survivalSL.
## S3 method for class 'sltime'
summary(object, method="sl", newdata=NULL, ROC.precision=seq(.01,.99,.01), digits=7, ...)
object |
An object returned by the function |
method |
A character string with the name of the algorithm included in the SL for which the indicators are computed. The default is "sl" for the Super Learner. |
newdata |
An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is |
ROC.precision |
An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. 0 (min) and 1 (max) are not allowed. By default, the precision is |
digits |
An optional integer for the number of digits to print when printing numeric values. |
... |
Additional arguments affecting the summary which are passed from |
The following metrics are returned: "ci" for the concordance index at the prognostic time pro.time, "bs" for the Brier score at the prognostic time pro.time, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the last observed time of event, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.
No return value for this S3 method.
data(dataDIVAT2)

dataDIVAT2$train <- 1*rbinom(n=dim(dataDIVAT2)[1], size=1, prob=1/2)

# The training of the super learner with 2 algorithms from the
# first 100 patients of the training sample
sl1 <- survivalSL(methods=c("LIB_AFTgamma", "LIB_PHgompertz"), metric="auc",
  data=dataDIVAT2[dataDIVAT2$train==1,][1:100,], times="times",
  failures="failures", pro.time=12, cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"), cv=3)

# The prognostic capacities from the same training sample
summary(sl1)
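The concordance index listed among the metrics can be pictured with a pairwise count. The following sketch is a simplified Harrell-type index that uses the predicted survival at the prognostic time as the score; the helper c_index is hypothetical, the data are invented, and the package may compute "ci" differently (for instance by restricting pairs to events before pro.time).

```r
# Hypothetical helper: proportion of comparable pairs in which the subject
# who fails first also has the lower predicted survival
c_index <- function(time, status, surv_t) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (status[i] != 1) next              # a pair needs an observed first event
    for (j in seq_len(n)) {
      if (time[i] < time[j]) {
        comparable <- comparable + 1
        if (surv_t[i] < surv_t[j]) {
          concordant <- concordant + 1
        } else if (surv_t[i] == surv_t[j]) {
          concordant <- concordant + 0.5  # ties in prediction count for half
        }
      }
    }
  }
  concordant / comparable
}

time   <- c(1, 2, 3, 4)
status <- c(1, 1, 0, 1)
ci_good <- c_index(time, status, surv_t = c(0.1, 0.3, 0.5, 0.9))
ci_bad  <- c_index(time, status, surv_t = c(0.9, 0.5, 0.3, 0.1))
print(c(ci_good, ci_bad))
```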
This function computes a Super Learner (SL) to predict survival outcomes.
survivalSL(methods, metric="ci", data, times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, cv=10, param.tune=NULL, pro.time=NULL, optim.local.min=FALSE, ROC.precision=seq(.01,.99,.01), param.weights.fix=NULL, param.weights.init=NULL, keep.predictions=TRUE, progress=TRUE)
methods |
A vector of characters with the names of the algorithms included in the SL. At least two algorithms have to be included. |
metric |
The loss function used to estimate the weights of the algorithms in the SL. See details. |
data |
A data frame in which to look for the variables related to the status of the follow-up time ( |
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
cv |
The number of splits for cross-validation. The default value is 10. |
param.tune |
A list with a length equal to the number of algorithms included in |
pro.time |
This optional value of prognostic time represents the maximum delay for which the capacity of the variable is evaluated. It must be expressed in the same unit as the one used in the argument times. Not used for the following metrics: "loglik", "ibs", "bll", and "ibll". The default value is the time at which half of the subjects are still at risk. |
optim.local.min |
An optional logical value. If |
ROC.precision |
The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. Only used when |
param.weights.fix |
A vector with the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in |
param.weights.init |
A vector with the initial values of the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in |
keep.predictions |
A logical value specifying if all the predictions for all the |
progress |
A logical value to print a progress bar in the R console. The default is |
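As stated for cov.quali, a qualitative covariate with more than two levels has to be expanded into 0/1 indicator columns before it is passed to the function. A minimal sketch with base R, assuming a hypothetical three-level variable named donor_type that does not exist in dataDIVAT2:

```r
# Toy data with a hypothetical three-level qualitative covariate
d <- data.frame(
  age = c(52, 47, 63, 58),
  donor_type = factor(c("living", "deceased_scd", "deceased_ecd", "living"))
)

# Complete disjunctive coding: one 0/1 column per level (no intercept)
ind <- model.matrix(~ donor_type - 1, data = d)
colnames(ind) <- sub("^donor_type", "donor_", colnames(ind))

d <- cbind(d, ind)
print(d)
```

Whether every indicator or all but a reference level should then be listed in cov.quali depends on the learner; unpenalized regressions usually need one level left out to avoid collinearity.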
Each element of the list declared in param.tune must have the same name as the corresponding method included in the SL. If param.tune = NULL, the tuning parameters of each algorithm are estimated by cv-fold cross-validation. Otherwise, the user can propose a tuning grid for each method, as explained in the following table. The following metrics can be used: "ci" for the concordance index at the prognostic time pro.time, "bs" for the Brier score at the prognostic time pro.time, "loglik" for the log-likelihood, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the last observed time of event, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.
The following learners are available:
Names | Description | Package |
---|---|---|
"LIB_AFTgamma" | Gamma-distributed AFT model | flexsurv |
"LIB_AFTggamma" | Generalized Gamma-distributed AFT model | flexsurv |
"LIB_AFTweibull" | Weibull-distributed AFT model | flexsurv |
"LIB_PHexponential" | Exponential-distributed PH model | flexsurv |
"LIB_PHgompertz" | Gompertz-distributed PH model | flexsurv |
"LIB_PHspline" | Spline-based PH model | flexsurv |
"LIB_COXall" | Usual Cox model | survival |
"LIB_COXaic" | Cox model with AIC-based forward selection | MASS |
"LIB_COXen" | Elastic Net Cox model | glmnet |
"LIB_COXlasso" | Lasso Cox model | glmnet |
"LIB_COXridge" | Ridge Cox model | glmnet |
"LIB_RSF" | Survival Random Forest | randomForestSRC |
"LIB_SNN" | (Python-based) Survival Neural Network | survivalmodels |
"LIB_PLANN" | (Python-based) Survival Neural Network | survivalPLANN |
The following loss functions for the estimation of the super learner weights are available (metric):
Area under the ROC curve ("auc")
Concordance index ("ci")
Brier score ("bs")
Binomial log-likelihood ("bll")
Integrated Brier score ("ibs")
Integrated binomial log-likelihood ("ibll")
Restricted integrated Brier score ("ribs")
Restricted integrated binomial log-likelihood ("ribll")
times |
A vector of numeric values with the times of the |
predictions |
A list of matrices with the predictions of survivals of each subject (lines) for each observed time (columns). Each matrix corresponds to the included |
data |
The data frame used for learning. The first column is entitled |
predictors |
A list with the predictors involved in |
ROC.precision |
The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. |
cv |
The number of splits for cross-validation. |
pro.time |
The maximum delay for which the capacity of the variable is evaluated. |
models |
A list with the estimated models/algorithms included in the SL. |
weights |
A list composed by two vectors: the regressions |
metric |
A list composed of two vectors: the loss function used to estimate the weights of the algorithms in the SL and its value. |
param.tune |
The estimated tuning parameters. |
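The weights element can be read along the following lines, under assumptions that this page does not spell out: the sketch supposes that K learners are weighted through K-1 multinomial logistic parameters, with the last learner as the reference (parameter fixed at 0), and that the SL prediction is the weighted sum of the learners' survival matrices. The function names sl_weights and combine_predictions are hypothetical and the numbers are invented.

```r
# Assumed parameterization: softmax over c(beta, 0), last learner as reference
sl_weights <- function(beta) {
  e <- exp(c(beta, 0))
  e / sum(e)
}

# Assumed combination rule: weighted sum of the prediction matrices
combine_predictions <- function(pred_list, w) {
  Reduce(`+`, Map(`*`, pred_list, w))
}

# Two mock learners, 2 subjects (rows) by 3 times (columns)
p1 <- rbind(c(1.0, 0.8, 0.6), c(1.0, 0.7, 0.4))
p2 <- rbind(c(1.0, 0.9, 0.8), c(1.0, 0.5, 0.2))

w  <- sl_weights(beta = 0)          # equal weights when the parameter is 0
sl <- combine_predictions(list(p1, p2), w)
print(w)
print(sl)
```

Because the weights are positive and sum to one, the combined values stay between the smallest and largest learner predictions.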
Polley E and van der Laan M. Super Learner In Prediction. http://biostats.bepress.com. 2010.
data(dataDIVAT2)

# The Super Learner based on the first 250 individuals of the database
sl1 <- survivalSL(methods=c("LIB_AFTgamma", "LIB_PHgompertz"), metric="ci",
  data=dataDIVAT2[1:250,], times="times", failures="failures", group="ecd",
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant"), cv=5)

# Individual prediction
pred <- predict(sl1,
  newdata=data.frame(age=c(52,52), hla=c(0,1), retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions$sl[1,], x=pred$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
lines(y=pred$predictions$sl[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)
legend("topright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))
This function finds the set of covariates that minimizes the AIC of a Cox PH model.
tuneCOXaic(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, model.min=NULL, model.max=NULL)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
model.min |
An optional argument with the minimal set of covariates. |
model.max |
An optional argument with the maximal set of covariates. |
The function runs the stepAIC function of the MASS package for covariate selection.
optimal |
The names of the covariates selected to adjust the fit. |
results |
The result of the stepAIC process. |
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
data(dataDIVAT2)

tune.model <- tuneCOXaic(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"))

tune.model$optimal$final.model # the covariates in the model with the best AIC

# The estimation of the training model with the selected covariates
model <- LIB_COXaic(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  final.model=tune.model$optimal$final.model)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
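For readers who want to see the underlying mechanism, the sketch below calls stepAIC directly on a Cox model fitted with the survival package. It runs on simulated data rather than dataDIVAT2, and the way the lower and upper scopes stand in for model.min and model.max is an assumption about how tuneCOXaic uses them, not a description of its code.

```r
library(survival)
library(MASS)

# Simulated data (invented): only age and ecd truly affect the hazard
set.seed(123)
n <- 300
sim <- data.frame(age = rnorm(n, 50, 10),
                  hla = rbinom(n, 1, 0.4),
                  retransplant = rbinom(n, 1, 0.2),
                  ecd = rbinom(n, 1, 0.3))
sim$times <- rexp(n, rate = 0.05 * exp(0.04 * (sim$age - 50) + 0.6 * sim$ecd))
sim$failures <- rbinom(n, 1, 0.7)

# Minimal model (here: age forced in) and forward search up to the maximal model
fit_min <- coxph(Surv(times, failures) ~ age, data = sim)
sel <- stepAIC(fit_min,
               scope = list(lower = ~ age,
                            upper = ~ age + hla + retransplant + ecd),
               direction = "forward", trace = FALSE)

# Names of the covariates retained by the AIC-based selection
selected <- attr(terms(sel), "term.labels")
print(selected)
```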
This function finds the optimal lambda and alpha parameters for an elastic net Cox regression.
tuneCOXen(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, parallel=FALSE, alpha, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
cv |
The number of folds for cross-validation. The default value is 10. |
parallel |
If |
alpha |
The candidate values of the regularization parameter alpha over which to optimize. |
lambda |
The candidate values of the regularization parameter lambda over which to optimize. |
The function runs the cv.glmnet function of the glmnet package.
optimal |
The values of lambda and alpha that give the minimum mean cross-validated error. |
results |
The data frame with the mean cross-validated errors for each combination of lambda and alpha values. |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

tune.model <- tuneCOXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), cv=5,
  alpha=seq(.1, 1, by=.1), lambda=seq(.1, 1, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# The estimation of the training model with the corresponding alpha and lambda values
model <- LIB_COXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  alpha=tune.model$optimal$alpha, lambda=tune.model$optimal$lambda)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
This function finds the optimal lambda parameter for a Lasso Cox regression.
tuneCOXlasso(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, parallel=FALSE, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
A data frame for training the model in which to look for the variables related to the status of the follow-up time ( |
cv |
The number of folds for cross-validation. The default value is 10. |
parallel |
If |
lambda |
The candidate values of the regularization parameter lambda over which to optimize. |
The function runs the cv.glmnet function of the glmnet package.
optimal |
The value of lambda that gives the minimum mean cross-validated error. |
results |
The data frame with the mean cross-validated errors for each lambda value. |
Simon et al. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

tune.model <- tuneCOXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), cv=5,
  lambda=seq(0, 10, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# The estimation of the training model with the corresponding lambda value
model <- LIB_COXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  lambda=tune.model$optimal$lambda)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
This function finds the optimal lambda parameter for a ridge Cox regression.
tuneCOXridge(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, parallel=FALSE, lambda)
times |
The name of the variable related to the numeric vector with the follow-up times. |
failures |
The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
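A data frame for training the model, in which to look for the variables related to the follow-up times (times), the event indicators (failures), the optional exposure/treatment (group), and the covariates (cov.quanti and cov.quali). |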
cv |
The value of the number of folds. The default value is 10. |
parallel |
If TRUE, the cross-validation folds are fitted in parallel. The default value is FALSE. |
lambda |
The values of the regularization parameter lambda optimized over. |
The function runs the cv.glmnet function of the glmnet package.
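As a rough illustration of what such a call can look like, the sketch below runs cv.glmnet directly on simulated data. It assumes that the ridge penalty corresponds to alpha = 0 and that the cross-validated error is the default partial-likelihood deviance; the simulated covariates and the object names (x, y, lambda.grid, cv.fit) are hypothetical and are not taken from the survivalSL source code.

```r
library(glmnet)

set.seed(1)
n <- 200

# Simulated covariates (hypothetical, loosely mimicking dataDIVAT2)
x <- cbind(age = rnorm(n, mean = 50, sd = 10),
           hla = rbinom(n, 1, 0.3),
           ecd = rbinom(n, 1, 0.4))

# Simulated right-censored survival times
lp <- 0.03 * (x[, "age"] - 50) + 0.5 * x[, "hla"]
event.time <- rexp(n, rate = 0.1 * exp(lp))
cens.time <- runif(n, min = 0, max = 20)
y <- cbind(time = pmin(event.time, cens.time),
           status = as.numeric(event.time <= cens.time))

# Candidate values of lambda, as in the example of this help page
lambda.grid <- seq(0, 10, by = 0.1)

# alpha = 0 requests the ridge penalty; nfolds plays the role of cv
cv.fit <- cv.glmnet(x = x, y = y, family = "cox", alpha = 0,
                    nfolds = 5, lambda = lambda.grid)

cv.fit$lambda.min                                        # lambda with the minimum mean CV error
head(data.frame(lambda = cv.fit$lambda, cvm = cv.fit$cvm)) # mean CV error per lambda
```

In this sketch, lambda.min and the (lambda, cvm) table play roles comparable to the optimal and results components described below.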
optimal |
The value of lambda that gives the minimum mean cross-validated error. |
results |
The data frame with the mean cross-validated error for each lambda value. |
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/
data(dataDIVAT2)

tune.model <- tuneCOXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"), cv=5,
  lambda=seq(0, 10, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# Fit the model on the training sample with the selected lambda value
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  lambda=tune.model$optimal$lambda)

# Predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
This function finds the optimal number of knots of the spline function.
tunePHspline(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, k)
times |
The name of the variable in data that contains the follow-up times (numeric vector). |
failures |
The name of the variable in data that contains the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable must have only two modalities, encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
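A data frame for training the model, in which to look for the variables related to the follow-up times (times), the event indicators (failures), the optional exposure/treatment (group), and the covariates (cov.quanti and cov.quali). |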
cv |
The value of the number of folds. The default value is 10. |
k |
The number of knots optimized over. |
The function runs the flexsurvspline function of the flexsurv package. The metric used in the cross-validation is the C-index.
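To make the metric concrete, here is a minimal base R sketch of a concordance index for right-censored data. It assumes Harrell's definition (pairs are comparable when the subject with the shorter time had the event); the estimator actually used inside the package may differ, and the function name c_index and the toy vectors are hypothetical.

```r
# Harrell-type C-index for right-censored data (illustrative sketch).
# risk: larger values mean a higher predicted risk,
#       e.g. 1 - predicted survival at a fixed time.
c_index <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (status[i] != 1) next          # subject i must have experienced the event
    for (j in seq_len(n)) {
      if (time[i] < time[j]) {        # subject j was still at risk when i failed
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5   # ties in risk count for one half
        }
      }
    }
  }
  concordant / comparable
}

# Toy data: four subjects, the third one is censored
time   <- c(2, 4, 6, 8)
status <- c(1, 1, 0, 1)
risk   <- c(0.9, 0.7, 0.5, 0.1)

c_index(time, status, risk)       # perfectly ordered predictions: 1
c_index(time, status, rev(risk))  # perfectly reversed predictions: 0
```

A value of 0.5 corresponds to predictions that order subjects no better than chance, which is why the tuning keeps the number of knots with the highest mean value across folds.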
optimal |
The value of k that gives the maximum mean cross-validated C-index. |
results |
The data frame with the mean cross-validated C-index according to k. |
Royston, P. and Parmar, M. (2002). Flexible parametric proportional-hazards and proportional odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21(15):2175-2197. doi: 10.1002/sim.1203
data(dataDIVAT2)

# The estimation of the hyperparameter on the first 150 patients
tune.model <- tunePHspline(times="times", failures="failures",
  data=dataDIVAT2[1:150,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"), cv=3, k=1:2)

tune.model$optimal # the estimated number of knots
tune.model$results # the C-index for each tested number of knots
This function finds the optimal inter, size, decay, maxit, and MaxNWts parameters for the survival neural network by using cross-validation and the concordance index.
tunePLANN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, inter, size, decay, maxit, MaxNWts)
times |
The name of the variable in data that contains the follow-up times (numeric vector). |
failures |
The name of the variable in data that contains the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable must have only two modalities, encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
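A data frame for training the model, in which to look for the variables related to the follow-up times (times), the event indicators (failures), the optional exposure/treatment (group), and the covariates (cov.quanti and cov.quali). |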
cv |
The value of the number of folds. The default value is 10. |
inter |
The length of the intervals. |
size |
The number of units in the hidden layer. |
decay |
The parameter for weight decay. |
maxit |
The maximum number of iterations. |
MaxNWts |
The maximum allowable number of weights. |
This function is based on the survivalPLANN package.
optimal |
The values of the hyperparameters that give the maximum mean cross-validated C-index. |
results |
The data frame with the mean cross-validated C-index according to the tested hyperparameter values. |
Biganzoli E, Boracchi P, Mariani L, et al. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Stat Med, 17:1169-86, 1998.
data(dataDIVAT2)

# In practice, the hyperparameter grid should be finer and the maximum number
# of iterations should exceed 1000. The values are reduced here so that the
# example runs in less than 5 seconds, as required for CRAN packages.
tune.model <- tunePLANN(times="times", failures="failures",
  data=dataDIVAT2[1:300,], cov.quanti=c("age"),
  cov.quali=c("hla", "retransplant", "ecd"), cv=3, inter=1,
  size=c(16, 32), decay=0.01, maxit=50, MaxNWts=10000)

tune.model$optimal # the optimal hyperparameters
tune.model$results # the C-index for the tested grid
This function finds the optimal nodesize, mtry, and ntree parameters for a random survival forest.
tuneRSF(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, nodesize, mtry, ntree)
times |
The name of the variable in data that contains the follow-up times (numeric vector). |
failures |
The name of the variable in data that contains the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable must have only two modalities, encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
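A data frame for training the model, in which to look for the variables related to the follow-up times (times), the event indicators (failures), the optional exposure/treatment (group), and the covariates (cov.quanti and cov.quali). |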
nodesize |
The values of the node size optimized over. |
mtry |
The numbers of variables randomly sampled as candidates at each split optimized over. |
ntree |
The numbers of trees optimized over. |
The function runs the tune.rfsrc function of the randomForestSRC package.
optimal |
The values of nodesize, mtry, and ntree that give the minimum error. |
results |
The data frame with the error for each tested combination of nodesize, mtry, and ntree. |
Ishwaran H. and Kogalur U.B. (2007). Random survival forests for R, R News, 7(2):25-31.
data(dataDIVAT2)

tune.model <- tuneRSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  nodesize=c(100, 250, 500), mtry=1, ntree=100)

tune.model$optimal # the estimated nodesize value

# Fit the model on the training sample with the selected nodesize value
model <- LIB_RSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"), cov.quali=c("hla", "retransplant", "ecd"),
  nodesize=tune.model$optimal$nodesize, mtry=1, ntree=100)

# Predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
  ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))
This function finds the optimal n.nodes, decay, batch.size, and epochs parameters for a survival neural network.
tuneSNN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data, cv=10, n.nodes, decay, batch.size, epochs)
times |
The name of the variable in data that contains the follow-up times (numeric vector). |
failures |
The name of the variable in data that contains the event indicators (0=right censored, 1=event). |
group |
The name of the variable related to the exposure/treatment. This variable must have only two modalities, encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL. |
cov.quanti |
The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric. |
cov.quali |
The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels. |
data |
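A data frame for training the model, in which to look for the variables related to the follow-up times (times), the event indicators (failures), the optional exposure/treatment (group), and the covariates (cov.quanti and cov.quali). |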
cv |
The value of the number of folds. The default value is 10. |
n.nodes |
The number of hidden nodes optimized over. |
decay |
The value of the weight decay optimized over. |
batch.size |
The values of the batch size optimized over. |
epochs |
The values of the number of epochs optimized over. |
This function is based on the deepsurv function from the survivalmodels package, which calls Python through reticulate. The required Python packages must be installed with reticulate::py_install. Therefore, before running tuneSNN, you must install and load the reticulate and survivalmodels packages, and install pycox with the following command: install_pycox(pip = TRUE, install_torch = FALSE). The survivalSL package works without these additional installations if this learner is not included in the library.
optimal |
The values of the hyperparameters that give the maximum mean cross-validated C-index. |
results |
The data frame with the mean cross-validated C-index according to the tested hyperparameter values. |
Katzman et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. 2018. https://doi.org/10.1186/s12874-018-0482-1