Package 'survivalSL' reference manual

Title:	Super Learner for Survival Prediction from Censored Data
Description:	Several functions and S3 methods to construct a super learner in the presence of censored times-to-event and to evaluate its prognostic capacities.
Authors:	Yohann Foucher [aut, cre], Camille Sabathe [aut]
Maintainer:	Yohann Foucher <[email protected]>
License:	GPL (>=2)
Version:	0.97
Built:	2025-03-31 06:28:17 UTC
Source:	https://github.com/foucher-y/survivalsl

A Sample from the DIVAT Data Bank.

Description

A data frame with 1912 French kidney transplant recipients from the DIVAT cohort.

Usage

data(dataDIVAT2)

Format

A data frame with 1912 observations for the 6 following variables:

age: This numeric vector provides the age of the recipient at the transplantation (in years).
hla: This numeric vector provides the indicator of transplantations with at least 4 HLA incompatibilities between the donor and the recipient (1 for high level and 0 otherwise).
retransplant: This numeric vector provides the indicator of re-transplantation (1 for more than one transplantation and 0 for first kidney transplantation).
ecd: The Expanded Criteria Donor (ECD) indicator (1 for transplantations from ECD and 0 otherwise). ECD are defined by widely accepted criteria, which include donors older than 60 years of age or 50-59 years of age with two of the following characteristics: history of hypertension, cerebrovascular accident as the cause of death or terminal serum creatinine higher than 1.5 mg/dL.
times: This numeric vector provides the follow-up time of each patient.
failures: This numeric vector provides the event indicator (0=right censored, 1=event). An event is a return to dialysis or the death of the patient with a functioning graft.

Source

URL: www.divat.fr

References

Le Borgne F, Giraudeau B, Querard AH, Giral M and Foucher Y. Comparisons of the performances of different statistical tests for time-to-event analysis with confounding factors: practical illustrations in kidney transplantation. Statistics in Medicine. 35(7):1103-16, 2016. <doi:10.1002/sim.6777>

Examples


data(dataDIVAT2)

# Compute the unadjusted hazard ratio of ECD versus SCD (standard criteria donors)
cox.ecd <- coxph(Surv(times, failures) ~ ecd, data=dataDIVAT2)
summary(cox.ecd) # Hazard Ratio = 1.97

A Sample from the DIVAT Data Bank.

Description

A data frame with 4267 French kidney transplant recipients.

Usage

data(dataDIVAT3)

Format

A data frame with 4267 observations for the 8 following variables.

ageR: This numeric vector represents the age of the recipient (in years)
sexeR: This numeric vector represents the sex of the recipient (1=male, 0=female)
year.tx: This numeric vector represents the year of the transplantation
ante.diab: This numeric vector represents the diabetes status (1=yes, 0=no)
pra: This numeric vector represents the pre-graft immunization using the panel reactive antibody (1=detectable, 0=undetectable)
ageD: This numeric vector represents the age of the donor (in years)
death.time: This numeric vector represents the follow-up time in days (until death or censoring)
death: This numeric vector represents the death indicator at the end of follow-up (1=death, 0=alive)

Source

URL: www.divat.fr

References

Le Borgne et al. Standardized and weighted time-dependent ROC curves to evaluate the intrinsic prognostic capacities of a marker by taking into account confounding factors. Stat Methods Med Res. 27(11):3397-3410, 2018. <doi:10.1177/0962280217702416>

Examples

data(dataDIVAT3)

### a short summary of the recipient age at the transplantation
summary(dataDIVAT3$ageR)

### Kaplan and Meier estimation of the recipient survival
plot(survfit(Surv(death.time/365.25, death) ~ 1, data = dataDIVAT3),
 xlab="Post transplantation time (in years)", ylab="Patient survival",
 mark.time=FALSE)

A Simulated Sample from the OFSEP Cohort.

Description

A data frame with 1300 simulated French patients with multiple sclerosis from the OFSEP cohort. The baseline is 1 year after the initiation of the first-line treatment.

Usage

data(dataOFSEP)

Format

A data frame with 1300 observations for the 11 following variables:

time: This numeric vector represents the follow-up time in years (until disease progression or censoring).
event: This numeric vector represents the disease progression indicator at the end of follow-up (1=progression, 0=censoring).
age: This numeric vector represents the patient age (in years) at baseline.
duration: This numeric vector represents the disease duration (in days) at baseline.
period: This numeric vector represents the calendar period: 1 between 2014 and 2018, and 0 otherwise.
gender: This numeric vector represents the gender: 1 for women.
relapse: This numeric vector represents the diagnosis of at least one relapse since the treatment initiation: 1 if at least one relapse, and 0 otherwise.
edss: This character vector represents the EDSS level: "miss" for missing, "low" for EDSS between 0 and 2, and "high" otherwise.
t1: This character vector represents the new gadolinium-enhancing T1 lesions: "missing", "0" or "1+" for at least 1 lesion.
t2: This character vector represents the new T2 lesions: "no" or "yes".
rio: This numeric vector represents the modified Rio score.

Examples

data(dataOFSEP)

### Kaplan and Meier estimation of the disease progression free survival
plot(survfit(Surv(time, event) ~ 1, data = dataOFSEP),
     ylab="Disease progression free survival",
     xlab="Time after the first anniversary of the first-line treatment in years")

Library of the Super Learner for an Accelerated Failure Time (AFT) Model with a Gamma Distribution

Description

Fit an AFT parametric model with a gamma distribution.

Usage

LIB_AFTgamma(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).
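
As an illustration of the complete disjunctive form required by `cov.quali`, the sketch below recodes a three-level character variable into 0/1 indicators with base R. The data frame `d` and its values are made up for the example; only the level names "miss", "low" and "high" echo the `edss` variable of `dataOFSEP`. Whether all indicators or all but one reference level should then be passed to `cov.quali` is not stated in this manual, so that choice is left open.

```r
# Toy data frame: column names and values are illustrative only
d <- data.frame(
  age  = c(34, 51, 46, 29),
  edss = c("low", "high", "miss", "low"),
  stringsAsFactors = FALSE
)

# Create one numeric 0/1 indicator per level of the qualitative variable
levels.edss <- sort(unique(d$edss))
for (lev in levels.edss) {
  d[[paste0("edss.", lev)]] <- as.numeric(d$edss == lev)
}

# The new columns are edss.high, edss.low and edss.miss
d
```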

Details

The model is fitted with the `flexsurvreg` function of the flexsurv package, using `dist="gamma"`.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).
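
The `times` and `predictions` elements can be combined to read a predicted survival at a given horizon. The sketch below works on a mock list that only mimics the documented structure (the numbers are invented), and the helper `surv.at` is hypothetical, not part of the package. It assumes that using the last prediction time not exceeding the horizon is an acceptable approximation of the survival curve.

```r
# Mock object mimicking the documented structure (values are made up)
model <- list(
  times = c(0.5, 1, 2, 5),
  predictions = rbind(c(0.98, 0.95, 0.90, 0.70),   # subject 1
                      c(0.96, 0.91, 0.80, 0.55))   # subject 2
)

# Hypothetical helper: predicted survival of every subject at a given horizon,
# taken from the column of the last prediction time <= horizon
surv.at <- function(model, horizon) {
  idx <- findInterval(horizon, model$times)
  if (idx == 0) {
    # The horizon precedes the first prediction time: survival is set to 1
    return(rep(1, nrow(model$predictions)))
  }
  model$predictions[, idx]
}

surv.at(model, horizon = 3)
```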

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTgamma(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for an Accelerated Failure Time (AFT) Model with a Generalized Gamma Distribution

Description

Fit an AFT parametric model with a generalized gamma distribution.

Usage

LIB_AFTggamma(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).

Details

The model is fitted with the `flexsurvreg` function of the flexsurv package, using `dist="gengamma"`.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTggamma(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for an Accelerated Failure Time (AFT) Model with a Log Logistic Distribution

Description

Fit an AFT parametric model with a log logistic distribution.

Usage

LIB_AFTllogis(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).

Details

The model is fitted with the `flexsurvreg` function of the flexsurv package, using `dist="llogis"`.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTllogis(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for an Accelerated Failure Time (AFT) Model with a Weibull Distribution

Description

Fit an AFT parametric model with a Weibull distribution.

Usage

LIB_AFTweibull(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).

Details

The model is fitted with the `flexsurvreg` function of the flexsurv package, using `dist="weibull"`.
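
For readers who want to see the underlying estimator, here is a minimal sketch of a direct Weibull AFT fit with flexsurv. It is not the internal code of `LIB_AFTweibull`: the built-in `lung` data of the survival package, the single covariate `age` and the prediction horizons are assumptions chosen only to keep the example short.

```r
library(survival)  # provides Surv() and the lung data
library(flexsurv)

# Built-in lung data, used only for illustration
# (time in days; status: 1=censored, 2=dead)
fit <- flexsurvreg(Surv(time, status) ~ age, data = lung, dist = "weibull")

# Estimated shape, scale and covariate effect
print(fit)

# Predicted survival at 365 and 730 days for a 60-year-old subject
summary(fit, newdata = data.frame(age = 60), t = c(365, 730),
        type = "survival", ci = FALSE)
```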

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# Estimate the model from the first 200 rows
model <- LIB_AFTweibull(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for a Cox Model with Selected Covariates

Description

Fit a Cox regression for a selection of covariates.

Usage

LIB_COXaic(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, final.model)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).
`final.model`	The name(s) of the covariate(s) to consider in the model.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).

References

Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXaic(times="times", failures="failures", data=dataDIVAT2,
  final.model=c("age"),  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)", ylab="Predicted survival",
     col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Cox Regression

Description

Fit a Cox regression for all covariates to be used in the super learner.

Usage

LIB_COXall(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL, data)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).

Details

The Cox regression is obtained by using the survival package.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).
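
Under a Cox model, a cumulative baseline hazard and a matrix of survival predictions are linked by S(t | x) = exp(-H0(t) * exp(x'beta)). The sketch below illustrates this relation with the survival package alone. It is an assumption about how such outputs relate, not the internal code of `LIB_COXall`; the built-in `lung` data and the covariates `age` and `sex` are chosen only for illustration.

```r
library(survival)

# Complete cases of a built-in dataset, used only for illustration
d <- na.omit(lung[, c("time", "status", "age", "sex")])
fit <- coxph(Surv(time, status) ~ age + sex, data = d)

# Cumulative baseline hazard H0(t) for covariates equal to zero
bh <- basehaz(fit, centered = FALSE)

# Linear predictor x'beta of each subject (no centering)
lp <- as.vector(as.matrix(d[, c("age", "sex")]) %*% coef(fit))

# Survival matrix: one row per subject, one column per time in bh$time
pred <- exp(-outer(exp(lp), bh$hazard))

dim(pred)
```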

References

Terry M. Therneau (2021). A Package for Survival Analysis in R. R package version 3.2-13, https://CRAN.R-project.org/package=survival.

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXall(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Elastic Net Cox Regression

Description

Fit an elastic net Cox regression for fixed values of the regularization parameters.

Usage

LIB_COXen(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, alpha, lambda)

Arguments

`times`	The name of the variable containing the follow-up times (numeric vector).
`failures`	The name of the variable containing the event indicators (numeric vector: 0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable must have exactly two levels, coded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it is forced into the algorithm or the related interactions are tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more than two levels.
`data`	A data frame for training the model, containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).
`alpha`	The value of the regularization parameter alpha for penalizing the partial likelihood.
`lambda`	The value of the regularization parameter lambda for penalizing the partial likelihood.

Details

The elastic net Cox regression is obtained by using the glmnet package.
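
To make the roles of `alpha` and `lambda` concrete, here is a minimal sketch of a direct glmnet call with both parameters fixed. How `LIB_COXen` calls glmnet internally is not shown in this manual, so the call below is an assumption; the built-in `lung` data and the three covariates are used only for illustration.

```r
library(survival)  # provides the lung data
library(glmnet)

# Complete cases of a built-in dataset, used only for illustration
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

# Covariate matrix and two-column response (status recoded to 1=event, 0=censored)
x <- as.matrix(d[, c("age", "sex", "ph.ecog")])
y <- cbind(time = d$time, status = d$status - 1)

# glmnet penalty: lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
# alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty
fit <- glmnet(x, y, family = "cox", alpha = 0.1, lambda = 0.1)

# Penalized regression coefficients
coef(fit)
```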

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is named `times` and contains the observed follow-up times. The second column is named `failures` and contains the event indicators. The other columns contain the predictors.
`times`	A numeric vector with the times of the `predictions`.
`hazard`	A numeric vector with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each of the `times` (columns).

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), lambda=.1, alpha=.1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Lasso Cox Regression

Description

Fit a Lasso Cox regression for a fixed value of the regularization parameter.

Usage

LIB_COXlasso(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, lambda)

Arguments

`times`	The name of the variable related the numeric vector with the follow-up times.
`failures`	The name of the variable related the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`lambda`	The value of the regularization parameter lambda for penalizing the partial likelihood.
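
As an illustration of the complete disjunctive form required by `cov.quali`, the sketch below recodes a three-level covariate into 0/1 indicators with base R. The data frame `d` and the variable `blood` are hypothetical and are not part of `dataDIVAT2`.

```r
# Hypothetical data frame with a three-level qualitative covariate
d <- data.frame(blood = factor(c("A", "B", "O", "A", "O")))

# One 0/1 column per level (complete disjunctive form); "- 1" drops the intercept
dummies <- model.matrix(~ blood - 1, data = d)
d <- cbind(d, as.data.frame(dummies))

# Names of the new numeric 0/1 columns: bloodA, bloodB, bloodO
quali_names <- colnames(dummies)
print(d)
```

The names in `quali_names` could then be passed to `cov.quali`. For unpenalized models, omitting one reference column is the usual way to avoid collinearity between the indicators.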

Details

The Lasso Cox regression is obtained by using the glmnet package.
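
For reference, glmnet documents its objective for non-Gaussian families as the negative log-likelihood divided by the number of observations plus a penalty term. Assuming that `lambda` is passed to glmnet unchanged with a pure L1 penalty (an assumption, not something stated on this page), the Lasso estimate minimizes:

```latex
\hat{\beta} = \arg\min_{\beta}
  \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \right\}
```

Here $\ell(\beta)$ denotes the Cox partial log-likelihood, $n$ the number of subjects, and $p$ the number of predictors. Larger values of `lambda` shrink more coefficients exactly to zero.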

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`hazard`	A vector of numeric values with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Ridge Cox Regression

Description

Fit a ridge Cox regression for a fixed value of the regularization parameter.

Usage

LIB_COXridge(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, lambda)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`lambda`	The value of the regularization parameter lambda for penalizing the partial likelihood.

Details

The ridge Cox regression is obtained by using the glmnet package.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`hazard`	A vector of numeric values with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)", ylab="Predicted survival",
     col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for a Proportional Hazards (PH) Model with an Exponential Distribution

Description

Fit a PH model with an Exponential distribution.

Usage

LIB_PHexponential(times, failures, group=NULL, cov.quanti=NULL,
cov.quali=NULL, data)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).

Details

The model is obtained by using dist="exp" in the flexsurvreg function of the flexsurv package.
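
Under a PH model, the survival of a subject with covariates x is S(t|x) = exp(-H0(t) exp(x'beta)), and for an exponential baseline the cumulative baseline hazard is H0(t) = rate * t. The sketch below applies this relationship in base R to show how a cumulative baseline hazard and a matrix of predicted survival relate to each other. The rate, coefficients, and covariate values are hypothetical, and the code is not claimed to reproduce the internals of `LIB_PHexponential`.

```r
# Hypothetical parameter values, chosen only for illustration
rate <- 0.05                       # constant baseline hazard
beta <- c(age = 0.02, ecd = 0.4)   # log hazard ratios
X <- rbind(c(age = 50, ecd = 0),   # subject 1
           c(age = 65, ecd = 1))   # subject 2
times <- c(0, 1, 5, 10)

hazard <- rate * times             # cumulative baseline hazard H0(t)
lp <- drop(X %*% beta)             # linear predictor of each subject

# Survival matrix: subjects in rows, times in columns
predictions <- exp(-outer(exp(lp), hazard))
print(round(predictions, 3))
```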

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`hazard`	A vector of numeric values with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# The estimation of the model from the first 200 rows
model <- LIB_PHexponential(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for a Proportional Hazards (PH) Model with a Gompertz Distribution

Description

Fit a PH parametric model with a Gompertz distribution.

Usage

LIB_PHgompertz(times, failures, group=NULL, cov.quanti=NULL,
cov.quali=NULL, data)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).

Details

The model is obtained by using dist="gompertz" in the flexsurvreg function of the flexsurv package.
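
For orientation, flexsurv parameterizes the Gompertz distribution with a shape parameter $a$ and a rate parameter $b$. Assuming covariates act multiplicatively on the rate, as the PH assumption implies, and $a \neq 0$, the hazard, cumulative hazard, and survival functions can be written as:

```latex
h(t \mid x) = b \, e^{x^\top \beta} \, e^{a t}, \qquad
H(t \mid x) = \frac{b \, e^{x^\top \beta}}{a} \left( e^{a t} - 1 \right), \qquad
S(t \mid x) = \exp\{ -H(t \mid x) \}
```

A positive shape $a$ gives a hazard that increases with time, and a negative one gives a decreasing hazard.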

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`hazard`	A vector of numeric values with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# The estimation of the model from the first 200 rows
model <- LIB_PHgompertz(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for a Survival Regression using the Royston/Parmar Spline Model

Description

Fit a PH model in which the survival function is modelled with a natural cubic spline function.

Usage

LIB_PHspline(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, k)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`k`	Number of knots.

Details

The model is obtained by using scale="hazard" with the flexsurv package.
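
As a reminder of what the hazard scale means in the Royston/Parmar approach, the log cumulative hazard is written as a natural cubic spline of log time plus a linear predictor. Assuming `k` denotes the number of internal knots, the model can be sketched as:

```latex
\log H(t \mid x) = s(\log t; \gamma) + x^\top \beta, \qquad
s(u; \gamma) = \gamma_0 + \gamma_1 u + \sum_{j=1}^{k} \gamma_{j+1} \, v_j(u), \qquad
S(t \mid x) = \exp\{ -H(t \mid x) \}
```

The $v_j$ are the natural cubic spline basis functions attached to the internal knots. Without internal knots, the spline reduces to a linear function of log time, which corresponds to a Weibull model.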

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`hazard`	A vector of numeric values with the values of the cumulative baseline hazard function at the prediction `times`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

References

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08

Examples

data(dataDIVAT2)

# The estimation of the model from the first 200 rows with two knots
model <- LIB_PHspline(times="times", failures="failures", data=dataDIVAT2[1:200,],
        cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), k=2)

# The predicted survival of the first subject of the training sample
  plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
       ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Survival Neural Network Based on the PLANN Method

Description

Fit a neural network based on the partial logistic regression.

Usage

LIB_PLANN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, inter, size, decay, maxit, MaxNWts)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`inter`	The length of the intervals.
`size`	The number of units in the hidden layer.
`decay`	The parameter for weight decay.
`maxit`	The maximum number of iterations.
`MaxNWts`	The maximum allowable number of weights.

Details

This function is based on survivalPLANN from the related package.
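
The partial logistic approach of Biganzoli et al. discretizes the follow-up into intervals and models the conditional probability of an event within each interval. The sketch below shows, in base R, how follow-up times could be expanded into one row per subject and interval of length `inter`. The helper `to_person_period` is hypothetical and is not the implementation used by `LIB_PLANN`.

```r
# Expand survival data into person-period format (one row per subject and interval)
to_person_period <- function(time, status, inter) {
  # Number of intervals during which each subject is observed (at least one)
  n_int <- pmax(1, ceiling(time / inter))
  id <- rep(seq_along(time), times = n_int)
  interval <- sequence(n_int)
  # The event indicator is 1 only in the last interval of subjects with an event
  event <- as.integer(interval == n_int[id] & status[id] == 1)
  data.frame(id = id, interval = interval, event = event)
}

# Three hypothetical subjects, intervals of half a year
pp <- to_person_period(time = c(1.2, 0.4, 2.0), status = c(1, 0, 1), inter = 0.5)
print(pp)
```

A shorter `inter` gives a finer time grid at the cost of a larger expanded data set, which is one reason why the interval length matters for the computation time.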

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

References

Biganzoli E, Boracchi P, Mariani L, et al. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Stat Med, 17:1169-86, 1998.

Examples

data(dataDIVAT2)

# The neural network fitted from the first 300 individuals of the database

model <- LIB_PLANN(times="times", failures="failures", data=dataDIVAT2[1:300,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  inter=0.5, size=32, decay=0.01, maxit=100, MaxNWts=10000)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Random Survival Forest

Description

Fit a random survival forest for given values of the regularization parameters.

Usage

LIB_RSF(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, nodesize, mtry, ntree)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`nodesize`	The value of the node size.
`mtry`	The number of variables randomly sampled as candidates at each split.
`ntree`	The number of trees.

Details

The random survival forest is obtained by using the randomForestSRC package.

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

Examples

data(dataDIVAT2)

# The estimation of the model
model <- LIB_RSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), nodesize=10,
  mtry=2, ntree=100)

# The predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Library of the Super Learner for Survival Neural Network

Description

Fit a 1-layer neural network based on the partial likelihood from a Cox proportional hazards model.

Usage

LIB_SNN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, n.nodes, decay, batch.size, epochs)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	The training data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`n.nodes`	The number of hidden nodes.
`decay`	The value of the weight decay.
`batch.size`	The batch size.
`epochs`	The number of epochs.

Details

This function is based on deepsurv from the survivalmodels package, which calls Python through reticulate. The required Python packages must be installed with reticulate::py_install. Therefore, before running LIB_SNN, you must install and load the reticulate and survivalmodels packages, and install pycox by using the following command: install_pycox(pip = TRUE, install_torch = FALSE). The survivalSL package works without these supplementary installations if this learner is not included in the library.
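
The installation steps above can be preceded by a quick check of the R-side dependencies. The helper `snn_r_packages_available` below is hypothetical (it is not part of survivalSL) and only tests whether the two R packages are installed; it does not verify the Python installation of pycox.

```r
# Returns TRUE when the R packages required by LIB_SNN are installed.
# The Python side (pycox) is installed separately and is not checked here.
snn_r_packages_available <- function() {
  all(vapply(c("reticulate", "survivalmodels"),
             requireNamespace, logical(1), quietly = TRUE))
}

if (snn_r_packages_available()) {
  message("reticulate and survivalmodels are installed.")
} else {
  message("Install them first, e.g. install.packages(c('reticulate', 'survivalmodels')).")
}

# One-time installation of pycox, as described above (not run here):
# library(reticulate)
# library(survivalmodels)
# install_pycox(pip = TRUE, install_torch = FALSE)
```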

Value

`model`	The estimated model.
`group`	The name of the variable related to the exposure/treatment.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`times`	A vector of numeric values with the times of the `predictions`.
`predictions`	A matrix with the predicted survival of each subject (rows) at each observed time (columns).

References

Katzman, J. L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (2018). DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. https://doi.org/10.1186/s12874-018-0482-1

Metrics to Evaluate the Prognostic Capacities

Description

Compute several metrics to evaluate the prognostic capacities with time-to-event data.

Usage

metrics(times, failures, data, prediction.matrix, prediction.times, metric,
pro.time=NULL, ROC.precision=seq(.01, .99, by=.01))

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`data`	A data frame in which to look for the variables related to the follow-up times (`times`) and the event indicators (`failures`).
`prediction.matrix`	A matrix with the predicted survival of each subject (rows) at each prognostic time (columns).
`prediction.times`	A vector of numeric values with the times of the predictions (same length as the number of columns of `prediction.matrix`).
`metric`	The metric to compute. See details.
`pro.time`	An optional prognostic time representing the maximum delay for which the prognostic capacity is evaluated, expressed in the same unit as the `times` variable. Not used for the following metrics: "loglik", "ibs", "bll", and "ibll". The default value is the time at which half of the subjects are still at risk.
`ROC.precision`	An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. Only used when `metric="auc"`. 0 (min) and 1 (max) are not allowed. By default, the precision is `seq(.01,.99,.01)`.

Details

The following metrics can be used:

"bs": the Brier score at the prognostic time pro.time.
"ci": the concordance index at the prognostic time pro.time.
"loglik": the log-likelihood.
"ibs": the integrated Brier score up to the last observed time of event.
"ibll": the integrated binomial log-likelihood up to the last observed time of event.
"bll": the binomial log-likelihood.
"ribs": the restricted integrated Brier score up to the prognostic time pro.time.
"ribll": the restricted integrated binomial log-likelihood up to the last observed time of event.
"auc": the area under the time-dependent ROC curve up to the prognostic time pro.time.
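
To make the roles of `prediction.matrix`, `prediction.times`, and `pro.time` concrete, the sketch below reads the predicted survival at a prognostic time from a small hypothetical matrix and computes a naive Brier score. It assumes fully observed event times, so it ignores censoring and is not the estimator returned by `metrics`; all values are invented for illustration.

```r
# Hypothetical predictions for two subjects at five prediction times
prediction.times <- c(0, 1, 2, 5, 10)
prediction.matrix <- rbind(c(1, 0.95, 0.90, 0.80, 0.60),
                           c(1, 0.90, 0.80, 0.60, 0.30))
pro.time <- 6

# Column of the last prediction time not exceeding pro.time (step function)
j <- findInterval(pro.time, prediction.times)
surv_at_pro <- prediction.matrix[, j]

# Naive Brier score for fully observed event times (no censoring)
event_times <- c(3, 8)
alive <- as.numeric(event_times > pro.time)
bs <- mean((alive - surv_at_pro)^2)
print(bs)
```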

Value

A numeric value with the metric estimation.

Examples


data(dataDIVAT2)

# The estimation of the model
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), lambda=1)

# The apparent AUC at 10 years post-transplantation
metrics(times="times", failures="failures", data=dataDIVAT2,
  prediction.matrix=model$predictions, prediction.times=model$times,
  metric="auc", pro.time=10)

# The restricted integrated Brier score up to 10 years post-transplantation
metrics(times="times", failures="failures", data=dataDIVAT2,
  prediction.matrix=model$predictions, prediction.times=model$times,
  metric="ribs", pro.time=10)

Calibration Plot for a Cox-like Model

Description

A calibration plot of an object of the class libsl (library of survival super learner).

Usage

## S3 method for class 'libsl'
plot(x, n.groups=5, pro.time=NULL,
newdata=NULL, times=NULL, failures=NULL, ...)

Arguments

`x`	An object returned by a library of survival super learner.
`n.groups`	The number of groups into which the subjects are stratified according to their predicted probabilities. The default is 5.
`pro.time`	The prognostic time at which the calibration plot of the survival probabilities is drawn.
`newdata`	An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is `NULL`: the calibration plot is then drawn from the subjects of the training sample.
`times`	The name of the variable related to the numeric vector with the follow-up times in `newdata` (optional argument only necessary when `newdata` is not `NULL`).
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event) in `newdata` (optional argument only necessary when `newdata` is not `NULL`).
`...`	Additional arguments affecting the plot.

Details

The plot represents the observed survival and the related 95% confidence intervals, respectively estimated by the Kaplan-Meier estimator and the Greenwood formula, against the mean of the predicted values for individuals stratified into groups of the same size according to the percentiles of these predicted values. The identity line is usually included for reference.
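
The construction described above can be sketched in base R on simulated data. This is an illustration under stated assumptions rather than the code of the plot method: the data are simulated, the helper `km_at` is a hypothetical hand-written Kaplan-Meier estimator with a Greenwood standard error, and the confidence limits use a plain normal approximation truncated to [0, 1].

```r
# Kaplan-Meier survival at time t with the Greenwood standard error
# (hypothetical helper written for this sketch)
km_at <- function(time, status, t) {
  ev <- sort(unique(time[status == 1 & time <= t]))
  s <- 1; v <- 0
  for (u in ev) {
    n <- sum(time >= u)                  # subjects at risk just before u
    d <- sum(time == u & status == 1)    # events at u
    s <- s * (1 - d / n)
    if (n > d) v <- v + d / (n * (n - d))
  }
  c(surv = s, se = s * sqrt(v))
}

set.seed(1)
n <- 200; t0 <- 5
pred <- runif(n, 0.2, 0.95)              # simulated predicted survival at t0
event.time <- rexp(n, rate = -log(pred) / t0)
cens.time <- runif(n, 0, 15)
time <- pmin(event.time, cens.time)
status <- as.numeric(event.time <= cens.time)

# Groups of equal size defined by the percentiles of the predictions
n.groups <- 5
grp <- cut(pred, include.lowest = TRUE, labels = FALSE,
           breaks = quantile(pred, probs = seq(0, 1, length.out = n.groups + 1)))

# Mean prediction versus Kaplan-Meier estimate in each group
calib <- t(sapply(1:n.groups, function(g) {
  km <- km_at(time[grp == g], status[grp == g], t0)
  c(predicted = mean(pred[grp == g]), observed = km[["surv"]],
    lower = max(0, km[["surv"]] - 1.96 * km[["se"]]),
    upper = min(1, km[["surv"]] + 1.96 * km[["se"]]))
}))

plot(calib[, "predicted"], calib[, "observed"], xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Predicted survival", ylab = "Observed survival", pch = 19)
segments(calib[, "predicted"], calib[, "lower"],
         calib[, "predicted"], calib[, "upper"])
abline(0, 1, lty = 2)                    # identity line
```

Points close to the identity line indicate that the mean predicted survival of a group agrees with its observed survival.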

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

# The estimation of the model from the whole sample
model <- LIB_COXall(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The calibration plot from the validation sample of 150 patients
plot(model, n.groups=5, pro.time=12, col=3,
     xlab="Predicted 12-year survival", ylab="Observed 12-year survival",
     newdata=dataDIVAT2[151:300,], times="times", failures="failures")

Plot Method for 'rocrisca' Objects

Description

A plot of ROC curves is produced.

Usage

## S3 method for class 'rocrisca'
plot(x, ..., information=TRUE)

Arguments

`x`	An object of class `rocrisca`, returned by the functions `roc.binary`, `roc.net`, `roc.summary`, and `roc.time`.
`...`	Additional arguments affecting the plot.
`information`	A logical value indicating whether the non-information line is plotted. The default value is `TRUE`.

Value

No return value for this S3 method.

Examples

data(dataDIVAT3)

# A subgroup analysis to reduce the time needed for this example

dataDIVAT3 <- dataDIVAT3[1:400,]

# The time-dependent ROC curve to evaluate the
# capacities of the recipient age for the prognosis of post-kidney
# transplant mortality up to 2000 days.

# Compute the raw sensitivity and specificity
roc1 <- roc(times="death.time", failures="death", variable="ageR",
confounders=~1, data=dataDIVAT3, pro.time=2000,
precision=seq(0.1,0.9, by=0.2))

plot(roc1, type="b", col=1, pch=2, lty=2, xlab="1-specificity", ylab="sensitivity")

Calibration Plot for Super Learner

Description

A calibration plot of a Super Learner obtained by the function survivalSL.

Usage

## S3 method for class 'sltime'
plot(x, method, n.groups, pro.time, newdata,
times, failures, ...)

Arguments

`x`	An object returned by the function `survivalSL`.
`method`	A character string with the name of the algorithm included in the SL for which the calibration plot is performed. The default is "sl" for the Super Learner.
`n.groups`	The number of groups into which the subjects are split according to their predicted probabilities. The default is 5.
`pro.time`	The prognostic time at which the calibration of the predicted survival probabilities is assessed.
`newdata`	An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is `NULL`: the calibration plot is produced from the subjects of the training sample.
`times`	The name of the variable in `newdata` that contains the follow-up times (optional; only required when `newdata` is not `NULL`).
`failures`	The name of the variable in `newdata` that contains the event indicators (0=right censored, 1=event) (optional; only required when `newdata` is not `NULL`).
`...`	Additional arguments affecting the plot.

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

# The Super Learner trained on the first 150 individuals of the database
sl1 <- survivalSL( methods=c("LIB_AFTgamma", "LIB_PHgompertz"),  metric="ci",
  data=dataDIVAT2[1:150,],  times="times", failures="failures", group="ecd",
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant"), cv=3)

# The calibration plot from the validation sample of 150 patients
plot(sl1, method="sl", n.groups=5, pro.time=12, col=2,
     xlab="Predicted 12-year survival", ylab="Observed 12-year survival",
     newdata=dataDIVAT2[151:300,], times="times", failures="failures")

Prediction from a Flexible Parametric Model

Description

Predict the survival based on a model or algorithm from an object of the class libsl.

Usage

## S3 method for class 'libsl'
predict(object, newdata, newtimes, ...)

Arguments

`object`	An object returned by the function `LIB_AFTllogis`, `LIB_AFTggamma`, `LIB_AFTgamma`, `LIB_AFTweibull`, `LIB_PHexponential`, `LIB_PHspline` or `LIB_PHgompertz`.
`newdata`	An optional data frame containing the covariate values at which to produce predicted values. There must be a column for every covariate of the training sample declared in `cov.quanti` and `cov.quali`. The default value is `NULL`: the predicted values are computed for the subjects of the training sample.
`newtimes`	The times at which to produce predicted values. The default value is `NULL`: the predicted values are computed for the observed times in the training data frame.
`...`	For future methods.

Details

The model object is obtained from the flexsurv package.

Value

`times`	A numeric vector with the times of the `predictions`.
`predictions`	A matrix with the predicted survival probabilities of each subject (rows) at each time (columns).
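
The returned structure (one row per subject, one column per time) can be queried at a time that is not one of the prediction times. The sketch below uses a mock object with hypothetical numbers and a hypothetical helper `surv_at`; it assumes that the survival curve is treated as a step function, so that the column of the latest prediction time not exceeding the requested time is used, and that survival equals 1 before the first prediction time.

```r
# Mock object with the same shape as the returned value (hypothetical numbers)
pred <- list(times = c(1, 2, 5, 10),
             predictions = rbind(c(0.95, 0.90, 0.75, 0.50),
                                 c(0.90, 0.80, 0.55, 0.30)))

# Survival probability of every subject at a requested time t
surv_at <- function(pred, t) {
  j <- findInterval(t, pred$times)       # index of the last time <= t
  if (j == 0) return(rep(1, nrow(pred$predictions)))  # before the first time
  pred$predictions[, j]
}

surv_at(pred, 5)   # column of time 5
surv_at(pred, 7)   # falls back on the column of time 5
```

When exact times are needed, passing them through `newtimes` avoids this approximation.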

Examples

data(dataDIVAT2)

# The estimation of the model from the first 200 lines
model <- LIB_PHgompertz(times="times", failures="failures", data=dataDIVAT2[1:200,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# Predicted survival for 2 new subjects
pred <- predict(model,
  newdata=data.frame(age=c(52,52), hla=c(0,1), retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions[1,], x=pred$times, xlab="Time (years)", ylab="Predicted survival",
     col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

lines(y=pred$predictions[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)

legend("bottomright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))

Prediction from a Super Learner for Censored Outcomes

Description

Predict the survival of new observations from an SL obtained with the survivalSL function.

Usage

## S3 method for class 'sltime'
predict(object, newdata, newtimes, ...)

Arguments

`object`	An object returned by the function `survivalSL`.
`newdata`	An optional data frame containing the covariate values at which to produce predicted values. There must be a column for every covariate of the training sample declared in `cov.quanti` and `cov.quali`. The default value is `NULL`: the predicted values are computed for the subjects of the training sample.
`newtimes`	The times at which to produce predicted values. The default value is `NULL`: the predicted values are computed for the observed times in the training data frame.
`...`	For future methods.

Value

`times`	A numeric vector with the times of the `predictions`.
`predictions`	The predicted survival probabilities of each subject (rows) at each time (columns). In the example of this section, the predictions of the SL are extracted with `pred$predictions$sl`.

Examples

data(dataDIVAT2)

# The training of the super learner from the first 150 individuals of the database
sl1 <- survivalSL(methods=c("LIB_COXridge", "LIB_AFTggamma"),  metric="ci",
  data=dataDIVAT2[1:150,],  times="times", failures="failures", pro.time = 12,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), cv=3)

# Individual prediction for 2 new subjects
pred <- predict(sl1,
  newdata=data.frame(age=c(52,52), hla=c(0,1), retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions$sl[1,], x=pred$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

lines(y=pred$predictions$sl[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)

legend("bottomright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))

S3 Method for Printing a 'libsl' Object

Description

Print the model or algorithm.

Usage

## S3 method for class 'libsl'
print(x, ...)

Arguments

`x`	An object of the class `libsl`, such as the one returned by the function `LIB_AFTgamma`.
`...`	For future methods.

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

model <- LIB_AFTgamma(times="times", failures="failures",  data=dataDIVAT2[1:100,],
        cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

print(model)

S3 Method for Printing an 'sltime' Object

Description

Print the contribution of learners included in the super learner.

Usage

## S3 method for class 'sltime'
print(x,  digits=7, ...)

Arguments

`x`	An object returned by the function `survivalSL`.
`digits`	An optional integer for the number of digits to print when printing numeric values.
`...`	For future methods.

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

sl1 <- survivalSL(methods=c("LIB_COXridge", "LIB_AFTggamma"),  metric="ci",
  data=dataDIVAT2[1:150,],  times="times", failures="failures", pro.time = 12,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), cv=3)

print(sl1, digits=4)

Time-Dependent ROC Curves With Right Censored Data

Description

This function estimates a time-dependent ROC curve while accounting for possible confounding factors. The method relies on standardization and on weighting based on an IPW estimator.

Usage

roc(times, failures, variable, confounders, data,
 pro.time, precision=seq(.01, .99, by=.01))

Arguments

`times`	A character string with the name of the variable in `data` which represents the follow up times.
`failures`	A character string with the name of the variable in `data` which represents the event indicator (0=right censored, 1=event).
`variable`	A character string with the name of the variable in `data` which represents the prognostic variable of interest. This variable is collected at baseline. It must be previously standardized according to the covariates among the controls, as proposed by Le Borgne et al. (2018).
`confounders`	A one-sided formula of the form `~ model`, where `model` is the linear predictor of the logistic regressions performed for each cut-off value. The user can use `~1` to obtain the crude estimation.
`data`	An object of the class `data.frame` containing the variables previously detailed.
`pro.time`	The prognostic time, i.e., the maximum delay for which the capacity of the variable is evaluated. It must be expressed in the same unit as the variable declared in `times`.
`precision`	The quantiles (between 0 and 1) of the prognostic variable used for computing each point of the time-dependent ROC curve. 0 (min) and 1 (max) are not allowed.

Details

This function computes a confounder-adjusted time-dependent ROC curve with right-censored data. We adapted the naive IPCW estimator explained by Blanche, Dartigues and Jacqmin-Gadda (2013) by considering the probability of experiencing the event of interest before the fixed prognostic time, given the possible confounding factors.

Value

`table`	This data frame presents the sensitivities and specificities associated with the cut-off values. `J` represents the Youden index.
`auc`	The area under the time-dependent ROC curve for a prognosis up to `pro.time`.
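
The two quantities returned can be illustrated from a table of sensitivities and specificities. The sketch below is a generic base R illustration with hypothetical numbers and hypothetical column names (`cut.off`, `se`, `sp`); it computes the Youden index and a trapezoidal area under the curve, without claiming that the function uses this exact integration rule.

```r
# Hypothetical sensitivities and specificities at increasing cut-off values
tab <- data.frame(cut.off = c(30, 40, 50, 60, 70),
                  se = c(0.95, 0.85, 0.70, 0.45, 0.20),
                  sp = c(0.20, 0.45, 0.65, 0.85, 0.95))

# Youden index for each cut-off: sensitivity + specificity - 1
tab$J <- tab$se + tab$sp - 1
best <- tab$cut.off[which.max(tab$J)]    # cut-off maximizing the Youden index

# Trapezoidal area after adding the corners (0,0) and (1,1) of the ROC curve
x <- c(0, rev(1 - tab$sp), 1)            # 1 - specificity, in increasing order
y <- c(0, rev(tab$se), 1)                # sensitivity
auc <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

tab
best
auc
```

A coarser grid of cut-off values, such as the one used in the examples to save time, yields fewer points and therefore a rougher approximation of the area.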

References

Blanche et al. (2013) Review and comparison of ROC curve estimators for a time-dependent outcome with marker-dependent censoring. Biometrical Journal, 55, 687-704. <doi:10.1002/bimj.201200045>

Le Borgne et al. Standardized and weighted time-dependent ROC curves to evaluate the intrinsic prognostic capacities of a marker by taking into account confounding factors. Stat Methods Med Res. 27(11):3397-3410, 2018. <doi:10.1177/0962280217702416>

Examples

# import and attach the data example
data(dataDIVAT3)

# A subgroup analysis to reduce the time needed for this example

dataDIVAT3 <- dataDIVAT3[1:400,]

# The standardized and weighted time-dependent ROC curve to evaluate the
# capacities of the recipient age for the prognosis of post kidney
# transplant mortality up to 2000 days by taking into account the
# donor age and the recipient gender.

# 1. Standardize the marker according to the covariates among the controls
lm1 <- lm(ageR ~ ageD + sexeR, data=dataDIVAT3[dataDIVAT3$death.time >= 2500,])
dataDIVAT3$ageR_std <- (dataDIVAT3$ageR - (lm1$coef[1] + lm1$coef[2] * dataDIVAT3$ageD +
 lm1$coef[3] * dataDIVAT3$sexeR)) / sd(lm1$residuals)

# 2. Compute the sensitivity and specificity from the proposed IPW estimators
roc2 <- roc(times="death.time", failures="death", variable="ageR_std",
confounders=~bs(ageD, df=3) + sexeR, data=dataDIVAT3, pro.time=2000,
precision=seq(0.1,0.9, by=0.2))

# The corresponding ROC graph
plot(roc2, col=2, pch=2, lty=1, type="b", xlab="1-specificity", ylab="sensitivity")

# The corresponding AUC
roc2$auc

Summaries of a Learner

Description

Return predictive performances of a model or algorithm obtained by a library of the class libsl.

Usage

## S3 method for class 'libsl'
summary(object, newdata=NULL, ROC.precision=seq(.01,.99,.01), digits=7, ...)

Arguments

`object`	An object returned by a library of the class `libsl`.
`newdata`	An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is `NULL`: the predictive performances are computed from the subjects of the training sample.
`ROC.precision`	An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. 0 (min) and 1 (max) are not allowed. By default, the precision is `seq(.01,.99,.01)`.
`digits`	An optional integer for the number of digits to print when printing numeric values.
`...`	Additional arguments affecting the summary, which are passed from `libsl` by default. This concerns the arguments `times`, `failures`, and `pro.time`.

Details

The following metrics are returned: "brier" for the Brier score at the prognostic time pro.time, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the prognostic time pro.time, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

# The training of the Gompertz model with the first 400 patients
model <- LIB_PHgompertz(times="times", failures="failures", data=dataDIVAT2[1:400,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

# The prognostic capacities from the same training sample
# (up to 4 years for several indicators)
summary(model, pro.time=4)

# The prognostic capacities from a validation of the next 150 patients
# (up to 4 years for several indicators)
 summary(model, pro.time=4, newdata=dataDIVAT2[401:550,], times="times",
 failures="failures")

Summaries of a Super Learner

Description

Return goodness-of-fit indicators of a Super Learner obtained by the function survivalSL.

Usage

## S3 method for class 'sltime'
summary(object,  method="sl", newdata=NULL,
ROC.precision=seq(.01,.99,.01), digits=7, ...)

Arguments

`object`	An object returned by the function `survivalSL`.
`method`	A character string with the name of the algorithm included in the SL for which the summary is computed. The default is "sl" for the Super Learner.
`newdata`	An optional data frame containing the new sample for validation with covariate values, follow-up times, and event status. The default value is `NULL`: the indicators are computed from the subjects of the training sample.
`ROC.precision`	An optional argument with the percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. 0 (min) and 1 (max) are not allowed. By default, the precision is `seq(.01,.99,.01)`.
`digits`	An optional integer for the number of digits to print when printing numeric values.
`...`	Additional arguments affecting the summary, which are passed from `libsl` by default. This concerns the arguments `times`, `failures`, and `pro.time`.

Details

The following metrics are returned: "ci" for the concordance index at the prognostic time pro.time, "bs" for the Brier score at the prognostic time pro.time, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the prognostic time pro.time, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.

Value

No return value for this S3 method.

Examples

data(dataDIVAT2)

dataDIVAT2$train <- 1*rbinom(n=dim(dataDIVAT2)[1], size = 1, prob=1/2)

# The training of the super learner with 2 algorithms from the
# first 100 patients of the training sample
sl1 <- survivalSL(methods=c("LIB_AFTgamma", "LIB_PHgompertz"),  metric="auc",
  data=dataDIVAT2[dataDIVAT2$train==1,][1:100,],  times="times", failures="failures",
  pro.time = 12,  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  cv=3)

# The prognostic capacities from the same training sample
summary(sl1)

Super Learner for Censored Outcomes

Description

This function computes a Super Learner (SL) to predict survival outcomes.

Usage

survivalSL(methods, metric="ci",  data, times, failures, group=NULL,
cov.quanti=NULL, cov.quali=NULL, cv=10, param.tune=NULL, pro.time=NULL,
optim.local.min=FALSE, ROC.precision=seq(.01,.99,.01),
param.weights.fix=NULL, param.weights.init=NULL,
keep.predictions=TRUE, progress=TRUE)

Arguments

`methods`	A vector of characters with the names of the algorithms included in the SL. At least two algorithms have to be included.
`metric`	The loss function used to estimate the weights of the algorithms in the SL. See details.
`data`	A data frame containing the follow-up times (`times`), the event indicators (`failures`), the optional treatment/exposure (`group`), and the covariates (`cov.quanti` and `cov.quali`).
`times`	The name of the variable in `data` that contains the follow-up times.
`failures`	The name of the variable in `data` that contains the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`cv`	The number of splits for cross-validation. The default value is 10.
`param.tune`	A list with a length equal to the number of algorithms included in `methods`. If `NULL`, the tuning parameters are estimated (see details).
`pro.time`	This optional prognostic time represents the maximum delay for which the capacity of the variable is evaluated. It must be expressed in the same unit as the variable declared in `times`. Not used for the following metrics: "loglik", "ibs", "bll", and "ibll". The default value is the time at which half of the subjects are still at risk.
`optim.local.min`	An optional logical value. If `TRUE`, the optimization is performed twice to better ensure the estimation of the weights. If `FALSE` (default value), the optimization is performed once.
`ROC.precision`	The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. Only used when `metric="auc"`. 0 (min) and 1 (max) are not allowed. By default: `seq(.01,.99,.01)`.
`param.weights.fix`	A vector with the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in `methods`. When completed, the related parameters are not estimated. The default value is NULL: the parameters are estimated by a `cv`-fold cross-validation. See details.
`param.weights.init`	A vector with the initial values of the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in `methods`. The default value is NULL: the initial values are set to 0. See details.
`keep.predictions`	A logical value specifying if all the predictions for all the `methods` are saved. If `FALSE`, only the predictions related to the SL are saved (for space saving). The default is `TRUE`.
`progress`	A logical value to print a progress bar in the R console. The default is `TRUE`.
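
Since `cov.quali` only accepts numeric 0/1 variables, a qualitative covariate with more than two levels has to be recoded into indicators beforehand. The following base R sketch shows one way to do so; the data frame `df`, the variable `blood` and the resulting column names are hypothetical. Dropping one indicator as the reference level is the usual choice for regression models and is an assumption here, not a documented requirement of the package.

```r
# Hypothetical data with a three-level qualitative covariate
df <- data.frame(age = c(45, 60, 52, 38),
                 blood = factor(c("A", "B", "O", "A")))

# One 0/1 indicator per level (removing the intercept keeps all the levels)
ind <- model.matrix(~ blood - 1, data = df)
colnames(ind) <- paste0("blood_", levels(df$blood))
df <- cbind(df, ind)

# Names that could then be declared in cov.quali, omitting the first level
# as the reference (assumption made for this sketch)
cov.quali <- colnames(ind)[-1]

df
cov.quali
```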

Details

Each object of the list declared in param.tune must have the same name as the corresponding method included in the SL. If param.tune = NULL, the tuning parameters of each algorithm are estimated by cv-fold cross-validation. Otherwise, the user can propose a tuning grid for each method, as explained in the following table.

The following metrics can be used: "ci" for the concordance index at the prognostic time pro.time, "bs" for the Brier score at the prognostic time pro.time, "loglik" for the log-likelihood, "ibs" for the integrated Brier score up to the last observed time of event, "ibll" for the integrated binomial log-likelihood up to the last observed time of event, "bll" for the binomial log-likelihood, "ribs" for the restricted integrated Brier score up to the prognostic time pro.time, "ribll" for the restricted integrated binomial log-likelihood up to the prognostic time pro.time, and "auc" for the area under the time-dependent ROC curve up to the prognostic time pro.time.

The following learners are available:

Names	Description	Package
`"LIB_AFTgamma"`	Gamma-distributed AFT model	flexsurv
`"LIB_AFTggamma"`	Generalized Gamma-distributed AFT model	flexsurv
`"LIB_AFTweibull"`	Weibull-distributed AFT model	flexsurv
`"LIB_PHexponential"`	Exponential-distributed PH model	flexsurv
`"LIB_PHgompertz"`	Gompertz-distributed PH model	flexsurv
`"LIB_PHspline"`	Spline-based PH model	flexsurv
`"LIB_COXall"`	Usual Cox model	survival
`"LIB_COXaic"`	Cox model with AIC-based forward selection	MASS
`"LIB_COXen"`	Elastic Net Cox model	glmnet
`"LIB_COXlasso"`	Lasso Cox model	glmnet
`"LIB_COXridge"`	Ridge Cox model	glmnet
`"LIB_RSF"`	Survival Random Forest	randomForestSRC
`"LIB_SNN"`	(Python-based) Survival Neural Network	survivalmodels
`"LIB_PLANN"`	(Python-based) Survival Neural Network	survivalPLANN

The following loss functions are available for the estimation of the super learner weights (metric):

Area under the ROC curve ("auc")
Concordance index ("ci")
Brier score ("bs")
Binomial log-likelihood ("bll")
Integrated Brier score ("ibs")
Integrated binomial log-likelihood ("ibll")
Restricted integrated Brier score ("ribs")
Restricted integrated binomial log-likelihood ("ribll")
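
The arguments `param.weights.fix` and `param.weights.init` refer to the parameters of a multinomial logistic regression that generates the weights of the algorithms. The base R sketch below shows how such parameters can map to weights and how weights can combine the predictions of several learners. It relies on assumptions: M - 1 parameters for M algorithms with the last algorithm as the reference, and an SL prediction equal to the weighted sum of the learners' survival predictions. The function `sl_weights` and the matrices `p1`, `p2`, `p3` are hypothetical.

```r
# Weights of M algorithms from M - 1 multinomial logistic parameters
# (assumption: the last algorithm is the reference, with a parameter fixed at 0)
sl_weights <- function(beta) {
  eta <- c(beta, 0)
  exp(eta) / sum(exp(eta))
}

w0 <- sl_weights(c(0, 0))        # parameters at 0: equal weights
w <- sl_weights(c(1.2, -0.5))    # arbitrary values for illustration

# Hypothetical survival predictions of 3 learners (2 subjects, 3 times)
p1 <- matrix(0.9, nrow = 2, ncol = 3)
p2 <- matrix(0.8, nrow = 2, ncol = 3)
p3 <- matrix(0.6, nrow = 2, ncol = 3)

# Weighted combination of the learners' predictions
sl.pred <- w[1] * p1 + w[2] * p2 + w[3] * p3

w0
w
sl.pred
```

Under this parametrization the weights are positive and sum to 1 whatever the values of the parameters, and initial values of 0 correspond to an equal contribution of every algorithm.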

Value

`times`	A vector of numeric values with the times of the `predictions`.
`predictions`	A list of matrices with the predicted survival probabilities of each subject (rows) at each time (columns). Each matrix corresponds to one of the included `methods`, and the last item, entitled "sl", to the resulting SL. If `keep.predictions=FALSE`, only the matrix of predictions related to the SL is returned.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`predictors`	A list with the predictors involved in `group`, `cov.quanti` and `cov.quali`.
`ROC.precision`	The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve.
`cv`	The number of splits for cross-validation.
`pro.time`	The maximum delay for which the capacity of the variable is evaluated.
`models`	A list with the estimated models/algorithms included in the SL.
`weights`	A list composed of two vectors: the regression `coefficients` of the multinomial logistic regression and the resulting weight `values`.
`metric`	A list composed of two vectors: the loss function used to estimate the weights of the algorithms in the SL and its value.
`param.tune`	The estimated tuning parameters.
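
The following sketch shows how a vector of multinomial logistic coefficients could be mapped to weights, and how such weights combine the predictions of the learners. It assumes a softmax transformation with the first learner as the reference category; this parameterization, the object names, and the toy matrices are assumptions made for illustration and are not taken from the package source.

```r
# Hypothetical coefficients for three learners (the first one is the reference)
coefficients <- c(0.5, -1.2)

# Softmax transformation: the weights are positive and sum to 1
expo   <- exp(c(0, coefficients))
values <- expo / sum(expo)

# Toy predicted survival matrices: 2 subjects (rows) x 3 times (columns)
predictions <- list(
  learner1 = matrix(c(0.95, 0.80, 0.60,
                      0.90, 0.70, 0.50), nrow = 2, byrow = TRUE),
  learner2 = matrix(c(0.97, 0.85, 0.70,
                      0.92, 0.75, 0.55), nrow = 2, byrow = TRUE),
  learner3 = matrix(c(0.90, 0.75, 0.55,
                      0.85, 0.65, 0.40), nrow = 2, byrow = TRUE)
)

# Super learner prediction: weighted average of the learners' predictions
sl <- Reduce(`+`, Map(function(w, p) w * p, values, predictions))

round(values, 3)
sl
```

Because the weights sum to 1, each super learner prediction lies between the smallest and the largest prediction of the individual learners.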

References

Polley E and van der Laan M. Super Learner in Prediction. http://biostats.bepress.com. 2010.

Examples

data(dataDIVAT2)

# The Super Learner based on the first 250 individuals of the database
sl1 <- survivalSL(methods=c("LIB_AFTgamma", "LIB_PHgompertz"),  metric="ci",
  data=dataDIVAT2[1:250,],  times="times", failures="failures", group="ecd",
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant"), cv=5)

# Individual prediction
pred <- predict(sl1, newdata=data.frame(age=c(52,52), hla=c(0,1),
retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions$sl[1,], x=pred$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

lines(y=pred$predictions$sl[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)

legend("topright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))

Tune a Cox Model with a Forward Selection Based on the AIC

Description

This function finds the set of covariates that minimizes the AIC of a Cox proportional hazards (PH) model.

Usage

tuneCOXaic(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, model.min=NULL, model.max=NULL)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`model.min`	An optional argument with the minimal set of covariates.
`model.max`	An optional argument with the maximal set of covariates.

Details

The function runs the `stepAIC` function of the MASS package for covariate selection.
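
A minimal sketch of forward selection with `stepAIC` on a Cox model is given below. It assumes the `lung` data set shipped with the survival package instead of `dataDIVAT2`, and the choice of `sex` as the forced covariate and of the candidate covariates is arbitrary. How `tuneCOXaic` builds its scope internally is not shown here.

```r
library(survival)
library(MASS)

# Complete cases only: stepAIC requires the same rows for every candidate model
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

# Minimal model (plays the role of a forced covariate set)
min.fit <- coxph(Surv(time, status) ~ sex, data = d)

# Forward selection between the minimal and the maximal set of covariates
step.fit <- stepAIC(min.fit,
                    scope = list(lower = ~ sex, upper = ~ sex + age + ph.ecog),
                    direction = "forward", trace = FALSE)

# Names of the covariates retained in the final model
selected <- attr(terms(step.fit), "term.labels")
selected
```

The forced covariate always remains in the model, and a candidate covariate is added only if it lowers the AIC.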

Value

`optimal`	The names of the covariates selected to adjust the model.
`results`	The result of the stepAIC process.

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Examples

data(dataDIVAT2)

tune.model <- tuneCOXaic(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"))

tune.model$optimal$final.model # the covariates in the model with the best AIC

# The estimation of the training model with the selected covariates
model <- LIB_COXaic(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  final.model=tune.model$optimal$final.model)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Tune Elastic Net Cox Regression

Description

This function finds the optimal lambda and alpha parameters for an elastic net Cox regression.

Usage

tuneCOXen(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, cv=10, parallel=FALSE, alpha, lambda)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`parallel`	If `TRUE`, use parallel `foreach` to fit each fold. The default is `FALSE`.
`alpha`	The candidate values of the regularization parameter alpha over which to optimize.
`lambda`	The candidate values of the regularization parameter lambda over which to optimize.

Details

The function runs the cv.glmnet function of the glmnet package.
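
The sketch below shows one way to search a grid of alpha and lambda values with `cv.glmnet`. It assumes the `lung` data set from the survival package rather than `dataDIVAT2`, and the grids, the object names, and the use of a shared `foldid` across alpha values are illustrative choices, not a description of the internals of `tuneCOXen`.

```r
library(survival)
library(glmnet)

set.seed(1)

# Complete cases of the lung data; status == 2 codes the event
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog", "ph.karno")])
x <- as.matrix(d[, c("age", "sex", "ph.ecog", "ph.karno")])
y <- cbind(time = d$time, status = as.numeric(d$status == 2))

# The same folds are used for every alpha so that the errors are comparable
foldid <- sample(rep(1:5, length.out = nrow(x)))

grid.alpha  <- c(0.1, 0.5, 1)
grid.lambda <- exp(seq(log(0.5), log(0.001), length.out = 20))

# Mean cross-validated partial likelihood deviance for each (alpha, lambda) pair
res <- do.call(rbind, lapply(grid.alpha, function(a) {
  fit <- cv.glmnet(x, y, family = "cox", alpha = a,
                   lambda = grid.lambda, foldid = foldid)
  data.frame(alpha = a, lambda = fit$lambda, cvm = fit$cvm)
}))

# Pair with the minimum mean cross-validated error
optimal <- res[which.min(res$cvm), c("alpha", "lambda")]
optimal
```

Setting `alpha = 1` alone gives the lasso penalty and `alpha = 0` the ridge penalty, so the same loop covers the three penalized Cox models.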

Value

`optimal`	The values of `alpha` and `lambda` that give the minimum mean cross-validated error.
`results`	The data frame with the mean cross-validated errors for each lambda value.

References

Examples

data(dataDIVAT2)

tune.model <- tuneCOXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), cv=5,
  alpha=seq(.1, 1, by=.1), lambda=seq(.1, 1, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# The estimation of the training model with the corresponding alpha and lambda values
model <- LIB_COXen(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  alpha=tune.model$optimal$alpha,
  lambda=tune.model$optimal$lambda)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Tune Lasso Cox Regression

Description

This function finds the optimal lambda parameter for a Lasso Cox regression.

Usage

tuneCOXlasso(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, cv=10, parallel=FALSE, lambda)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`parallel`	If `TRUE`, use parallel `foreach` to fit each fold. The default is `FALSE`.
`lambda`	The candidate values of the regularization parameter lambda over which to optimize.

Details

The function runs the cv.glmnet function of the glmnet package.

Value

`optimal`	The value of lambda that gives the minimum mean cross-validated error.
`results`	The data frame with the mean cross-validated errors for each lambda value.

References

Simon et al. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, https://www.jstatsoft.org/v39/i05/

Examples

data(dataDIVAT2)

tune.model <- tuneCOXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  cv=5, lambda=seq(0, 10, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# The estimation of the training model with the corresponding lambda value
model <- LIB_COXlasso(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  lambda=tune.model$optimal$lambda)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Tune Ridge Cox Regression

Description

This function finds the optimal lambda parameter for a ridge Cox regression.

Usage

tuneCOXridge(times, failures, group=NULL, cov.quanti=NULL,
cov.quali=NULL, data, cv=10, parallel=FALSE, lambda)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`parallel`	If `TRUE`, use parallel `foreach` to fit each fold. The default is `FALSE`.
`lambda`	The candidate values of the regularization parameter lambda over which to optimize.

Details

The function runs the cv.glmnet function of the glmnet package.

Value

`optimal`	The value of lambda that gives the minimum mean cross-validated error.
`results`	The data frame with the mean cross-validated errors for each lambda value.

References

Examples

data(dataDIVAT2)

tune.model <- tuneCOXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  cv=5, lambda=seq(0, 10, by=.1))

tune.model$optimal$lambda # the estimated lambda value

# The estimation of the training model with the corresponding lambda value
model <- LIB_COXridge(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  lambda=tune.model$optimal$lambda)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Tune a Survival Regression using the Royston/Parmar Spline Model

Description

This function finds the optimal number of knots of the spline function.

Usage

tunePHspline(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, cv=10, k)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`k`	The candidate numbers of knots over which to optimize.

Details

The function runs the flexsurvspline function of the flexsurv package. The metric used in the cross-validation is the C-index.

Value

`optimal`	The value of `k` that gives the maximum mean cross-validated C-index.
`results`	The data frame with the mean cross-validated C-index according to `k`.

References

Royston, P. and Parmar, M. (2002). Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21(15):2175-2197. doi: 10.1002/sim.1203

Examples

data(dataDIVAT2)

# The estimation of the hyperparameters on the first 150 patients

tune.model <- tunePHspline(times="times", failures="failures", data=dataDIVAT2[1:150,],
    cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
    cv=3, k=1:2)

# The selected number of knots and the cross-validated C-index for each k

tune.model$optimal
tune.model$results

Tune a Survival Neural Network Based on the PLANN Method

Description

This function finds the optimal inter, size, decay, maxit, and MaxNWts parameters for the survival neural network by using cross-validation and the concordance index.

Usage

tunePLANN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, cv=10, inter, size, decay, maxit, MaxNWts)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`inter`	The length of the intervals.
`size`	The number of units in the hidden layer.
`decay`	The parameter for weight decay.
`maxit`	The maximum number of iterations.
`MaxNWts`	The maximum allowable number of weights.

Details

This function is based on the survivalPLANN package.

Value

`optimal`	The values of `inter`, `size`, `decay`, `maxit`, and `MaxNWts` that give the maximum mean cross-validated C-index.
`results`	The data frame with the mean cross-validated C-index according to `inter`, `size`, `decay`, `maxit`, and `MaxNWts`.

References

Biganzoli E, Boracchi P, Mariani L, et al. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Stat Med, 17:1169-86, 1998.

Examples

data(dataDIVAT2)

# In practice, the hyperparameter grid should be finer and the maximum number
# of iterations should exceed 1000. The values are reduced here so that the
# example runs in less than 5 seconds, as required for CRAN packages.

tune.model <- tunePLANN(times="times", failures="failures", data=dataDIVAT2[1:300,],
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"), cv=3,
  inter=1, size=c(16, 32), decay=0.01, maxit=50, MaxNWts=10000)

tune.model$optimal # the optimal hyperparameters

tune.model$results # the C-index for the tested grid

Tune a Survival Random Forest

Description

This function finds the optimal nodesize, mtry, and ntree parameters for a survival random forest.

Usage

tuneRSF(times, failures, group=NULL, cov.quanti=NULL,
cov.quali=NULL, data, nodesize, mtry, ntree)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`nodesize`	The candidate values of the node size over which to optimize.
`mtry`	The candidate numbers of variables randomly sampled at each split over which to optimize.
`ntree`	The candidate numbers of trees over which to optimize.

Details

The function runs the tune.rfsrc function of the randomForestSRC package.

Value

`optimal`	The values of `nodesize`, `mtry`, and `ntree` that give the minimum prediction error.
`results`	The data frame with the prediction errors for each combination of `nodesize`, `mtry`, and `ntree`.

References

Ishwaran H. and Kogalur U.B. (2007). Random survival forests for R, R News, 7(2):25-31.

Examples

data(dataDIVAT2)

tune.model <- tuneRSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  nodesize=c(100, 250, 500), mtry=1, ntree=100)

tune.model$optimal # the estimated nodesize value

# The estimation of the training model with the corresponding nodesize value
model <- LIB_RSF(times="times", failures="failures", data=dataDIVAT2,
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant", "ecd"),
  nodesize=tune.model$optimal$nodesize, mtry=1, ntree=100)

# The resulting predicted survival of the first subject of the training sample
plot(y=model$predictions[1,], x=model$times, xlab="Time (years)",
ylab="Predicted survival", col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

Tune a 1-Layer Survival Neural Network

Description

This function finds the optimal n.nodes, decay, batch.size, and epochs parameters for a survival neural network.

Usage

tuneSNN(times, failures, group=NULL, cov.quanti=NULL, cov.quali=NULL,
data, cv=10, n.nodes, decay, batch.size, epochs)

Arguments

`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is `NULL`: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`data`	A data frame for training the model in which to look for the variables related to the status of the follow-up time (`times`), the event (`failures`), the optional treatment/exposure (`group`) and the covariables included in the previous model (`cov.quanti` and `cov.quali`).
`cv`	The number of cross-validation folds. The default value is 10.
`n.nodes`	The candidate numbers of hidden nodes over which to optimize.
`decay`	The candidate values of the weight decay over which to optimize.
`batch.size`	The candidate values of the batch size over which to optimize.
`epochs`	The candidate numbers of epochs over which to optimize.

Details

Value

`optimal`	The values of `n.nodes`, `decay`, `batch.size`, and `epochs` that give the maximum mean cross-validated C-index.
`results`	The data frame with the mean cross-validated C-index according to `n.nodes`, `decay`, `batch.size`, and `epochs`.

References

Katzman et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. 2018. https://doi.org/10.1186/s12874-018-0482-1
