Benefits of biomarker selection and clinico-pathological covariate inclusion in breast cancer prognostic models
* Corresponding author: Yuval Kluger yuval.kluger@yale.edu
- Equal contributors
1 Department of Cell Biology, New York University Center for Health Informatics and Bioinformatics, New York University School of Medicine and Cancer Institute, 550 First Avenue, New York, NY 10016, USA
2 Department of Pathology, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
3 Yale Cancer Center, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
4 Computer Science Department of the Universidad Autónoma of Madrid, Calle Francisco Tomás y Valiente, 11, Cantoblanco 28049, Madrid, Spain
5 Department of Medicine, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
Breast Cancer Research 2010, 12:R66 doi:10.1186/bcr2633
Published: 1 September 2010
Abstract
Introduction
Multi-marker molecular assays have impacted management of early stage breast cancer, facilitating adjuvant chemotherapy decisions. We generated prognostic models that incorporate protein-based molecular markers and clinico-pathological variables to improve survival prediction.
Methods
We used a quantitative immunofluorescence method to study protein expression of 14 markers included in the Oncotype DX™ assay in a cohort of 638 breast cancer patients with 15-year follow-up. We performed cross-validation analyses to assess the performance of multivariate Cox models consisting of these markers and standard clinico-pathological covariates, using the average time-dependent area under the receiver operating characteristic curve (AUC), and compared it with the performance of nested Cox models obtained by robust backward selection procedures.
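To make the performance measure concrete, the following is a minimal sketch of an average time-dependent AUC for a fixed risk score. It is an illustration under stated assumptions, not the estimator used in the study: it uses a naive cumulative/dynamic definition without censoring weights, and the function names, the time grid, and the toy data are hypothetical.

```python
from typing import Optional, Sequence


def auc_at_horizon(
    risk: Sequence[float],
    time: Sequence[float],
    event: Sequence[int],
    horizon: float,
) -> Optional[float]:
    """Naive cumulative/dynamic AUC at a single time horizon.

    Cases are patients with an observed event at or before the horizon.
    Controls are patients still under follow-up after the horizon.
    Patients censored before the horizon are dropped (assumption: no
    inverse-probability-of-censoring weights are applied).
    """
    cases = [r for r, t, e in zip(risk, time, event) if e and t <= horizon]
    controls = [r for r, t in zip(risk, time) if t > horizon]
    if not cases or not controls:
        return None  # AUC is undefined without both groups

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


def average_time_dependent_auc(
    risk: Sequence[float],
    time: Sequence[float],
    event: Sequence[int],
    horizons: Sequence[float],
) -> Optional[float]:
    """Unweighted mean of the AUC over the horizons where it is defined."""
    values = []
    for horizon in horizons:
        auc = auc_at_horizon(risk, time, event, horizon)
        if auc is not None:
            values.append(auc)
    return sum(values) / len(values) if values else None


# Toy data (hypothetical): a higher risk score should mean earlier events.
toy_risk = [2.0, 1.5, 0.5, 0.1]
toy_time = [2.0, 4.0, 10.0, 12.0]   # years of follow-up
toy_event = [1, 1, 0, 0]            # 1 = event observed, 0 = censored
toy_average = average_time_dependent_auc(toy_risk, toy_time, toy_event, [3.0, 5.0])
```

In a cross-validation setting, the risk score would be the prognostic index of a Cox model fitted on the training folds and evaluated on the held-out fold; a full analysis would also need to account for censoring before each horizon.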
Results
A prognostic index derived from a multivariate Cox regression model incorporating molecular and clinico-pathological covariates (nodal status, tumor size, nuclear grade, and age) is superior to models based on molecular studies alone or clinico-pathological covariates alone. Performance of this composite model can be further improved using feature selection techniques to prune variables. When stratifying patients by Nottingham Prognostic Index (NPI), the most prognostic markers in high and low NPI groups differed. Similarly, for the node-negative, hormone receptor-positive sub-population, we derived a compact model with three clinico-pathological variables and two protein markers that was superior to the full model.
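For readers unfamiliar with the NPI, the stratification step can be sketched as follows. The index uses the standard formula (0.2 × tumor size in cm + lymph node stage + histological grade, with stage and grade each scored 1 to 3). The cutoff separating the high and low groups is not stated here, so the default of 3.4 below is only a placeholder assumption, and the function names are hypothetical.

```python
def nottingham_prognostic_index(tumor_size_cm: float, node_stage: int, grade: int) -> float:
    """Standard NPI: 0.2 * size (cm) + node stage (1-3) + grade (1-3)."""
    if node_stage not in (1, 2, 3):
        raise ValueError("node_stage must be 1, 2, or 3")
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    if tumor_size_cm < 0:
        raise ValueError("tumor_size_cm must be non-negative")
    return 0.2 * tumor_size_cm + node_stage + grade


def npi_group(npi: float, cutoff: float = 3.4) -> str:
    """Assign a patient to the 'high' or 'low' NPI group.

    The cutoff is an assumption for illustration; it is not taken from
    the source and should be replaced by the threshold actually used.
    """
    return "high" if npi > cutoff else "low"


# Hypothetical patients: (tumor size in cm, node stage, grade)
small_low_grade = nottingham_prognostic_index(1.0, 1, 1)
large_high_grade = nottingham_prognostic_index(3.0, 2, 3)
groups = [npi_group(small_low_grade), npi_group(large_high_grade)]
```

Marker selection would then be run separately within each group, which is how the most prognostic markers can differ between the high and low NPI strata.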
Conclusions
Prognostic models that include both molecular and clinico-pathological covariates can be more accurate than models based on either set of features alone. Furthermore, feature selection can decrease the number of molecular variables needed to predict outcome, potentially resulting in less expensive assays.