The net reclassification improvement (NRI) is a widely used metric for assessing the relative ability of 2 risk models to distinguish between low- and high-risk individuals. However, the validity and usefulness of the NRI have been questioned. Criticism of the NRI has focused on its use in comparing nested risk models, whereas in practice it is often used to compare nonnested risk models derived from distinct data sources. In this study, we evaluated the performance of the NRI in a nonnested context by using it to compare competing cardiovascular risk-prediction models. We explored the NRI's sensitivity to variations in risk categories and to the calibration of the compared models. We found that the NRI was highly sensitive to changes in the definition of risk categories, especially when at least 1 model was miscalibrated. To address these shortcomings, we describe a novel alternative to the usual NRI that uses percentiles of risk instead of cutoffs based on absolute risk. This percentile-based NRI quantifies the relative ability of 2 models to rank patients by risk. It displays more stable behavior, and we recommend its use when there are no established risk categories or when the models being compared are miscalibrated.
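For reference, the standard categorical NRI (Pencina et al.) takes the form below; the percentile-based variant described here can be read as replacing fixed absolute-risk cutoffs with cutoffs at percentiles of each model's predicted-risk distribution. This is a sketch of the familiar definition, not the exact construction developed later in the paper.

\[
\mathrm{NRI} = \bigl[P(\text{up} \mid \text{event}) - P(\text{down} \mid \text{event})\bigr] + \bigl[P(\text{down} \mid \text{nonevent}) - P(\text{up} \mid \text{nonevent})\bigr],
\]

where "up" and "down" denote reclassification into a higher or lower risk category under the second model relative to the first.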