Validate an existing prediction model by calculating its predictive performance against a new (validation) dataset.
Usage
pred_validate(
  x,
  new_data,
  binary_outcome = NULL,
  survival_time = NULL,
  event_indicator = NULL,
  time_horizon = NULL,
  level = 0.95,
  cal_plot = TRUE,
  ...
)
Arguments
- x
an object of class "predinfo" produced by calling pred_input_info.
- new_data
data.frame upon which the prediction model should be evaluated.
- binary_outcome
Character variable giving the name of the column in new_data that represents the observed binary outcomes (should be coded 0 and 1 for non-event and event, respectively). Only relevant for x$model_type = "logistic"; leave as NULL otherwise. Leave as NULL if new_data does not contain any outcomes.
- survival_time
Character variable giving the name of the column in new_data that represents the observed survival times. Only relevant for x$model_type = "survival"; leave as NULL otherwise.
- event_indicator
Character variable giving the name of the column in new_data that represents the observed event indicator (1 for event, 0 for censoring). Only relevant for x$model_type = "survival"; leave as NULL otherwise.
- time_horizon
for survival models, an integer giving the time horizon (post baseline) at which a prediction is required. Currently, this must match a time in x$cum_hazard.
- level
the confidence level required for all performance metrics. Defaults at 95%. Must be a value between 0 and 1.
- cal_plot
indicate if a flexible calibration plot should be produced (TRUE) or not (FALSE).
- ...
further plotting arguments for the calibration plot. See Details below.
Value
pred_validate returns an object of class "predvalidate", with child classes per model_type. This is a list of performance metrics, estimated by applying the existing prediction model to new_data. An object of class "predvalidate" is a list containing relevant calibration and discrimination measures. For logistic regression models, this will include the observed:expected ratio, calibration intercept, calibration slope, area under the ROC curve, R-squared, and Brier Score. For survival models, this will include the observed:expected ratio (if cum_hazard is provided to x), calibration slope, and Harrell's C-statistic. Optionally, a flexible calibration plot is also produced, along with a box plot and violin plot of the predicted risk distribution.
The summary function can be used to extract and print summary performance results (calibration and discrimination metrics). The graphical assessments of performance can be extracted using plot.
Details
This function takes an existing prediction model formatted according to pred_input_info, and calculates measures of predictive performance on new data (e.g., within an external validation study). The information about the existing prediction model should first be inputted by calling pred_input_info, before passing the resulting object to pred_validate.
new_data should be a data.frame, where each row is an observation (e.g. patient) and each variable/column is a predictor variable. The predictor variables need to include (as a minimum) all of the predictor variables that are included in the existing prediction model (i.e., each of the variable names supplied to pred_input_info, through the model_info parameter, must match the name of a variable in new_data).
Any factor variables within new_data must be converted to dummy (0/1) variables before calling this function. dummy_vars can help with this. See pred_predict for examples.
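As a minimal sketch (assuming dummy_vars accepts a data.frame and returns it with factor columns expanded; the column names used below are purely hypothetical):
# Illustrative sketch only: expanding factor columns into 0/1 dummy variables
# before validation. Assumes the package providing pred_validate()/dummy_vars()
# is loaded; "Age" and "Smoking_Status" are hypothetical columns.
dat <- data.frame(Age = c(45, 60, 52),
                  Smoking_Status = factor(c("Never", "Current", "Ex")))
dat_dummied <- dummy_vars(dat)  # factor levels become separate 0/1 columns
head(dat_dummied)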
binary_outcome, survival_time and event_indicator are used to specify the outcome variable(s) within new_data (use binary_outcome if x$model_type = "logistic", or use survival_time and event_indicator if x$model_type = "survival").
In the case of validating a logistic regression model, this function assesses the predictive performance of the predicted risks against an observed binary outcome. Various metrics of calibration (agreement between the observed risks and the predicted risks, across the full risk range) and discrimination (ability of the model to distinguish between those who develop the outcome and those who do not) are calculated. For calibration, the observed-to-expected ratio, calibration intercept and calibration slope are estimated. The calibration intercept is estimated by fitting a logistic regression model to the observed binary outcomes, with the linear predictor of the model as an offset. For the calibration slope, a logistic regression model is fit to the observed binary outcomes with the linear predictor from the model as the only covariate. For discrimination, the function estimates the area under the receiver operating characteristic curve (AUC). Various other metrics are also calculated to assess overall accuracy (Brier score, Cox-Snell R-squared).
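To make the estimation approach concrete, a minimal base-R sketch of the calibration intercept, calibration slope and observed:expected ratio follows. This is illustrative only (not the package's internal code); lp and y are simulated stand-ins for the linear predictor evaluated on the validation data and the observed outcomes.
# Illustrative sketch only (not the package's internal code).
set.seed(1)
lp <- rnorm(500)                        # linear predictor on validation data
y  <- rbinom(500, 1, plogis(0.8 * lp))  # observed 0/1 outcomes

# Calibration intercept: observed outcomes regressed on an intercept,
# with the linear predictor included as an offset (ideal value: 0)
cal_int <- glm(y ~ 1 + offset(lp), family = binomial)
coef(cal_int)[1]

# Calibration slope: observed outcomes regressed on the linear predictor
# as the only covariate (ideal value: 1)
cal_slope <- glm(y ~ lp, family = binomial)
coef(cal_slope)["lp"]

# Observed:expected ratio: observed event proportion over mean predicted risk
mean(y) / mean(plogis(lp))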
In the case of validating a survival prediction model, this function assesses the predictive performance of the linear predictor and (optionally) the predicted event probabilities at a fixed time horizon against an observed time-to-event outcome. Various metrics of calibration and discrimination are calculated. For calibration, the observed-to-expected ratio at the specified time_horizon (if predicted risks are available through specification of x$cum_hazard) and the calibration slope are produced. For discrimination, Harrell's C-statistic is calculated.
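Again purely as an illustration (not the package's internal code), the survival-model calibration slope and Harrell's C-statistic can be sketched with the survival package, using simulated stand-ins for the validation data:
# Illustrative sketch only, using the 'survival' package.
library(survival)
set.seed(2)
lp     <- rnorm(500)                             # linear predictor on validation data
stime  <- rexp(500, rate = 0.1 * exp(0.7 * lp))  # simulated survival times
status <- rbinom(500, 1, 0.7)                    # 1 = event, 0 = censored

# Calibration slope: Cox model with the linear predictor as the only covariate
cal_slope <- coxph(Surv(stime, status) ~ lp)
coef(cal_slope)["lp"]                            # ideal value: 1

# Harrell's C-statistic for the linear predictor (higher lp = higher risk)
concordance(Surv(stime, status) ~ lp, reverse = TRUE)$concordance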
For both model types, a flexible calibration plot is produced (for survival models, the cumulative baseline hazard must be available in the predinfo object, x$cum_hazard). Specify parameter cal_plot to indicate whether a calibration plot should be produced (TRUE), or not (FALSE). The calibration plot is produced by regressing the observed outcomes against a cubic spline of the logit of the predicted risks (for a logistic model) or the complementary log-log of the predicted risks (for a survival model). Users can specify the following additional parameters to pred_validate to modify the calibration plot (an example call is sketched after this list):
- xlim
a numeric vector of length 2, giving the lower and upper range of the x-axis scale; defaults at 0 and 1. Changes here should match changes to ylim such that the plot remains 'square'.
- ylim
a numeric vector of length 2, giving the lower and upper range of the y-axis scale; defaults at 0 and 1. Changes here should match changes to xlim such that the plot remains 'square'.
- xlab
string giving the x-axis label. Defaults as "Predicted Probability".
- ylab
string giving the y-axis label. Defaults as "Observed Probability".
- pred_rug
TRUE/FALSE of whether a 'rug' should be placed along the x-axis of the calibration plot showing the distribution of predicted risks. Defaults as FALSE in favour of examining the box plot/violin plot that is produced.
- cal_plot_n_sample
numeric value (less than nrow(new_data)) giving a random subset of observations to render the calibration plot over. The calibration plot is always created using all data, but for rendering speed in large datasets it can sometimes be useful to render the plot over a smaller (random) subset of observations. Final (e.g. publication-ready) plots should always show the full plot, so a warning is created if users enter a value of cal_plot_n_sample.
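For example, a call customising the calibration plot might look as follows (illustrative only; model1 and the "Y" outcome column are as created in the Examples below, and the axis limits are arbitrary):
# Illustrative sketch: passing plotting arguments through '...'
pred_validate(x = model1,
              new_data = SYNPM$ValidationData,
              binary_outcome = "Y",
              cal_plot = TRUE,
              xlim = c(0, 0.6),
              ylim = c(0, 0.6),   # match xlim so the plot stays 'square'
              xlab = "Predicted Probability",
              ylab = "Observed Probability",
              pred_rug = TRUE)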
Examples
#Example 1 - multiple existing models, with outcome specified; uses
# an example dataset within the package
model1 <- pred_input_info(model_type = "logistic",
                          model_info = SYNPM$Existing_logistic_models)
val_results <- pred_validate(x = model1,
                             new_data = SYNPM$ValidationData,
                             binary_outcome = "Y",
                             cal_plot = FALSE)
summary(val_results)
#>
#> Performance Results for Model 1
#> =================================
#> Calibration Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio 1.9006 1.8368
#> Calibration Intercept 0.7323 0.6921
#> Calibration Slope 0.6484 0.5576
#> Upper 95% Confidence Interval
#> Observed:Expected Ratio 1.9666
#> Calibration Intercept 0.7726
#> Calibration Slope 0.7392
#>
#> Also examine the calibration plot, if produced.
#>
#> Discrimination Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC 0.5814 0.5702 0.5927
#>
#>
#> Overall Performance Measures
#> ---------------------------------
#> Cox-Snell R-squared: -0.0481
#> Nagelkerke R-squared: -0.0863
#> Brier Score (CI): 0.1249 (0.1219, 0.1279)
#>
#> Also examine the distribution plot of predicted risks.
#>
#> Performance Results for Model 2
#> =================================
#> Calibration Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio 0.8945 0.8645
#> Calibration Intercept -0.1325 -0.1725
#> Calibration Slope 0.9868 0.8489
#> Upper 95% Confidence Interval
#> Observed:Expected Ratio 0.9256
#> Calibration Intercept -0.0926
#> Calibration Slope 1.1247
#>
#> Also examine the calibration plot, if produced.
#>
#> Discrimination Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC 0.5828 0.5716 0.5941
#>
#>
#> Overall Performance Measures
#> ---------------------------------
#> Cox-Snell R-squared: 0.0074
#> Nagelkerke R-squared: 0.0132
#> Brier Score (CI): 0.1206 (0.1172, 0.124)
#>
#> Also examine the distribution plot of predicted risks.
#>
#> Performance Results for Model 3
#> =================================
#> Calibration Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio 1.5945 1.5410
#> Calibration Intercept 0.5324 0.4923
#> Calibration Slope 0.7212 0.5981
#> Upper 95% Confidence Interval
#> Observed:Expected Ratio 1.6499
#> Calibration Intercept 0.5724
#> Calibration Slope 0.8443
#>
#> Also examine the calibration plot, if produced.
#>
#> Discrimination Measures
#> ---------------------------------
#> Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC 0.5613 0.5497 0.5728
#>
#>
#> Overall Performance Measures
#> ---------------------------------
#> Cox-Snell R-squared: -0.0249
#> Nagelkerke R-squared: -0.0446
#> Brier Score (CI): 0.1234 (0.1202, 0.1266)
#>
#> Also examine the distribution plot of predicted risks.
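A survival-model validation would follow the same pattern; the commented sketch below is illustrative only, with assumed dataset and column names:
#Example 2 (illustrative sketch only) - validating a survival model. The
# dataset and column names below (Existing_TTE_models, TTE_mod1_baseline,
# "ETime", "Status") are assumed placeholders and may differ in practice.
# model2 <- pred_input_info(model_type = "survival",
#                           model_info = SYNPM$Existing_TTE_models[1, ],
#                           cum_hazard = SYNPM$TTE_mod1_baseline)
# val_results2 <- pred_validate(x = model2,
#                               new_data = SYNPM$ValidationData,
#                               survival_time = "ETime",
#                               event_indicator = "Status",
#                               time_horizon = 5,
#                               cal_plot = FALSE)
# summary(val_results2)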