
Validate an existing prediction model by calculating its predictive performance against a new (validation) dataset.

Usage

pred_validate(
  x,
  new_data,
  binary_outcome = NULL,
  survival_time = NULL,
  event_indicator = NULL,
  time_horizon = NULL,
  level = 0.95,
  cal_plot = TRUE,
  ...
)

Arguments

x

an object of class "predinfo" produced by calling pred_input_info.

new_data

data.frame upon which the prediction model should be evaluated.

binary_outcome

Character variable giving the name of the column in new_data that represents the observed binary outcomes (coded 0 for non-event and 1 for event). Only relevant for x$model_type="logistic"; leave as NULL otherwise, or if new_data does not contain any outcomes.

survival_time

Character variable giving the name of the column in new_data that represents the observed survival times. Only relevant for x$model_type="survival"; leave as NULL otherwise.

event_indicator

Character variable giving the name of the column in new_data that represents the observed event indicator (1 for event, 0 for censoring). Only relevant for x$model_type="survival"; leave as NULL otherwise.

time_horizon

For survival models, an integer giving the time horizon (post-baseline) at which a prediction is required. Currently, this must match a time in x$cum_hazard.

level

The confidence level required for all performance metrics; must be a value between 0 and 1. Defaults to 0.95 (i.e. 95%).

cal_plot

Indicates whether a flexible calibration plot should be produced (TRUE) or not (FALSE).

...

further plotting arguments for the calibration plot. See Details below.

Value

pred_validate returns an object of class "predvalidate", with child classes per model_type. This is a list of performance metrics, estimated by applying the existing prediction model to new_data, containing the relevant calibration and discrimination measures. For logistic regression models, these include the observed:expected ratio, calibration intercept, calibration slope, area under the ROC curve, R-squared, and Brier score. For survival models, these include the observed:expected ratio (if cum_hazard is provided to x), calibration slope, and Harrell's C-statistic. Optionally, a flexible calibration plot is also produced, along with a box plot and violin plot of the predicted risk distribution.

The summary function can be used to extract and print summary performance results (calibration and discrimination metrics). The graphical assessments of performance can be extracted using plot.
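For example, the numerical and graphical results can be extracted after validation; a minimal sketch, assuming val_results is a "predvalidate" object returned by pred_validate (as in the Examples below):

# Extract and print the calibration and discrimination metrics
summary(val_results)
# Display the graphical assessments (calibration plot and predicted risk distribution)
plot(val_results)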

Details

This function takes an existing prediction model formatted according to pred_input_info and calculates measures of predictive performance on new data (e.g., within an external validation study). Information about the existing prediction model should first be supplied by calling pred_input_info, before passing the resulting object to pred_validate.

new_data should be a data.frame in which each row is an observation (e.g. patient) and each column is a predictor variable. The columns must include, as a minimum, all of the predictor variables in the existing prediction model (i.e., each of the variable names supplied to pred_input_info through the model_info parameter must match the name of a variable in new_data).

Any factor variables within new_data must be converted to dummy (0/1) variables before calling this function. dummy_vars can help with this. See pred_predict for examples.
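As a minimal sketch of this pre-processing step (the data.frame new_df and its factor column are hypothetical, and the call assumes dummy_vars accepts the data.frame directly):

# Hypothetical data.frame with a factor predictor that must be converted
# to 0/1 dummy variables before calling pred_validate
new_df <- data.frame(Age = c(63, 58, 71),
                     Smoking_Status = factor(c("Current", "Former", "Never")))
new_df_dummies <- dummy_vars(new_df)  # expands factor columns into 0/1 indicator columns
head(new_df_dummies)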

binary_outcome, survival_time and event_indicator are used to specify the outcome variable(s) within new_data (use binary_outcome if x$model_type = "logistic", or use survival_time and event_indicator if x$model_type = "survival").

In the case of validating a logistic regression model, this function assesses the predictive performance of the predicted risks against an observed binary outcome. Various metrics of calibration (agreement between the observed risks and the predicted risks, across the full risk range) and discrimination (ability of the model to distinguish between those who develop the outcome and those who do not) are calculated. For calibration, the observed-to-expected ratio, calibration intercept and calibration slope are estimated. The calibration intercept is estimated by fitting a logistic regression model to the observed binary outcomes, with the linear predictor of the model included as an offset. For the calibration slope, a logistic regression model is fitted to the observed binary outcomes with the linear predictor from the model as the only covariate. For discrimination, the function estimates the area under the receiver operating characteristic curve (AUC). Various other metrics are also calculated to assess overall accuracy (Brier score, Cox-Snell R-squared).

In the case of validating a survival prediction model, this function assesses the predictive performance of the linear predictor and (optionally) the predicted event probabilities at a fixed time horizon against an observed time-to-event outcome. Various metrics of calibration and discrimination are calculated. For calibration, the observed-to-expected ratio at the specified time_horizon (if predicted risks are available through specification of x$cum_hazard) and calibration slope are produced. For discrimination, Harrell's C-statistic is calculated.
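As a sketch of validating an existing time-to-event model (the example objects SYNPM$Existing_TTE_models and SYNPM$TTE_mod1_baseline, the column names "ETime" and "Status", and the time horizon of 5 are assumptions rather than guaranteed names in the package data):

# Supply the existing survival model, including its cumulative baseline hazard
surv_model <- pred_input_info(model_type = "survival",
                              model_info = SYNPM$Existing_TTE_models[1, ],
                              cum_hazard = SYNPM$TTE_mod1_baseline)
# Validate against the observed time-to-event outcome at a fixed time horizon;
# the horizon must match a time in surv_model$cum_hazard
val_surv <- pred_validate(x = surv_model,
                          new_data = SYNPM$ValidationData,
                          survival_time = "ETime",
                          event_indicator = "Status",
                          time_horizon = 5)
summary(val_surv)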

For both model types, a flexible calibration plot is produced (for survival models, the cumulative baseline hazard must be available in the predinfo object, x$cum_hazard). Specify the cal_plot parameter to indicate whether a calibration plot should be produced (TRUE) or not (FALSE). The calibration plot is produced by regressing the observed outcomes against a cubic spline of the logit of the predicted risks (for a logistic model) or the complementary log-log of the predicted risks (for a survival model). Users can specify the following additional parameters to pred_validate to modify the calibration plot (see the sketch after this list):

  • xlim a numeric vector of length 2, giving the lower and upper range of the x-axis scale; defaults to 0 and 1. Changes here should match changes to ylim so that the plot remains 'square'.

  • ylim a numeric vector of length 2, giving the lower and upper range of the y-axis scale; defaults to 0 and 1. Changes here should match changes to xlim so that the plot remains 'square'.

  • xlab string giving the x-axis label. Defaults to "Predicted Probability".

  • ylab string giving the y-axis label. Defaults to "Observed Probability".

  • pred_rug TRUE/FALSE indicating whether a 'rug' should be placed along the x-axis of the calibration plot to show the distribution of predicted risks. Defaults to FALSE in favour of examining the box plot/violin plot that is produced.

  • cal_plot_n_sample numeric value (less than nrow(new_data)) giving the number of observations over which to render the calibration plot, drawn as a random subset. The calibration plot is always estimated using all of the data, but for rendering speed in large datasets it can be useful to render the plot over a smaller random subset of observations. Final (e.g. publication-ready) plots should always show the full plot, so a warning is issued whenever a value of cal_plot_n_sample is supplied.
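For example, these plotting arguments can be passed through ... in the pred_validate call; a minimal sketch reusing model1 and the validation data from Example 1 below (the axis limits and labels are illustrative choices):

# Restrict both axes to the lower risk range and add a rug of predicted risks
pred_validate(x = model1,
              new_data = SYNPM$ValidationData,
              binary_outcome = "Y",
              cal_plot = TRUE,
              xlim = c(0, 0.5),
              ylim = c(0, 0.5),
              xlab = "Predicted Risk",
              ylab = "Observed Risk",
              pred_rug = TRUE)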

See also

Examples

#Example 1 - multiple existing models, with outcome specified; uses
#            an example dataset within the package
model1 <- pred_input_info(model_type = "logistic",
                          model_info = SYNPM$Existing_logistic_models)
val_results <- pred_validate(x = model1,
                             new_data = SYNPM$ValidationData,
                             binary_outcome = "Y",
                             cal_plot = FALSE)
summary(val_results)
#> 
#> Performance Results for Model 1 
#> ================================= 
#> Calibration Measures 
#> --------------------------------- 
#>                         Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio   1.9006                        1.8368
#> Calibration Intercept     0.7323                        0.6921
#> Calibration Slope         0.6484                        0.5576
#>                         Upper 95% Confidence Interval
#> Observed:Expected Ratio                        1.9666
#> Calibration Intercept                          0.7726
#> Calibration Slope                              0.7392
#> 
#>  Also examine the calibration plot, if produced. 
#> 
#> Discrimination Measures 
#> --------------------------------- 
#>     Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC   0.5814                        0.5702                        0.5927
#> 
#> 
#> Overall Performance Measures 
#> --------------------------------- 
#> Cox-Snell R-squared: -0.0481
#> Nagelkerke R-squared: -0.0863
#> Brier Score (CI): 0.1249 (0.1219, 0.1279)
#> 
#>  Also examine the distribution plot of predicted risks. 
#> 
#> Performance Results for Model 2 
#> ================================= 
#> Calibration Measures 
#> --------------------------------- 
#>                         Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio   0.8945                        0.8645
#> Calibration Intercept    -0.1325                       -0.1725
#> Calibration Slope         0.9868                        0.8489
#>                         Upper 95% Confidence Interval
#> Observed:Expected Ratio                        0.9256
#> Calibration Intercept                         -0.0926
#> Calibration Slope                              1.1247
#> 
#>  Also examine the calibration plot, if produced. 
#> 
#> Discrimination Measures 
#> --------------------------------- 
#>     Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC   0.5828                        0.5716                        0.5941
#> 
#> 
#> Overall Performance Measures 
#> --------------------------------- 
#> Cox-Snell R-squared: 0.0074
#> Nagelkerke R-squared: 0.0132
#> Brier Score (CI): 0.1206 (0.1172, 0.124)
#> 
#>  Also examine the distribution plot of predicted risks. 
#> 
#> Performance Results for Model 3 
#> ================================= 
#> Calibration Measures 
#> --------------------------------- 
#>                         Estimate Lower 95% Confidence Interval
#> Observed:Expected Ratio   1.5945                        1.5410
#> Calibration Intercept     0.5324                        0.4923
#> Calibration Slope         0.7212                        0.5981
#>                         Upper 95% Confidence Interval
#> Observed:Expected Ratio                        1.6499
#> Calibration Intercept                          0.5724
#> Calibration Slope                              0.8443
#> 
#>  Also examine the calibration plot, if produced. 
#> 
#> Discrimination Measures 
#> --------------------------------- 
#>     Estimate Lower 95% Confidence Interval Upper 95% Confidence Interval
#> AUC   0.5613                        0.5497                        0.5728
#> 
#> 
#> Overall Performance Measures 
#> --------------------------------- 
#> Cox-Snell R-squared: -0.0249
#> Nagelkerke R-squared: -0.0446
#> Brier Score (CI): 0.1234 (0.1202, 0.1266)
#> 
#>  Also examine the distribution plot of predicted risks.