Breast Cancer Research

Open Access, Highly Accessed, Research article

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors

Vlad Popovici1, Weijie Chen2, Brandon D Gallas2, Christos Hatzis3, Weiwei Shi4, Frank W Samuelson2, Yuri Nikolsky4, Marina Tsyganova5, Alex Ishkin5, Tatiana Nikolskaya4,5, Kenneth R Hess6, Vicente Valero7, Daniel Booser7, Mauro Delorenzi1,8, Gabriel N Hortobagyi7, Leming Shi9, W Fraser Symmans10 and Lajos Pusztai7*

Author Affiliations

1 Bioinformatics Core Facility, Swiss Institute of Bioinformatics, Génopode Building, Quartier Sorge, Lausanne CH-1015, Switzerland

2 Center for Devices and Radiological Health, US Food and Drug Administration, 10903 New Hampshire Ave WO62-3124, Silver Spring, MD 20993-0002, USA

3 Nuvera Biosciences, 400 West Cummings Park, Woburn, MA 01801, USA

4 GeneGo, Inc., 500 Renaissance Drive, St. Joseph, MI 49085, USA

5 Department of Systems Biology, Vavilov Institute for General Genetics, Russian Academy of Sciences, Gubkina str. 3 korp. 1, Moscow 119333, Russia

6 Department of Biostatistics, P.O. Box 301439, Houston, TX 77230-1439, USA

7 Department of Breast Medical Oncology, P.O. Box 301439, Houston, TX 77230-1439, USA

8 Swiss NCCR Molecular Oncology, Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland

9 National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA

10 Department of Pathology of the University of Texas M. D. Anderson Cancer Center, P.O. Box 301439, Houston, TX 77230-1439, USA

Breast Cancer Research 2010, 12:R5 doi:10.1186/bcr2468

Published: 11 January 2010

Additional files

Additional file 1:

Supplemental Table S1. Clinical data for all the patients in the training and validation sets.

Format: XLS, Size: 79KB

Additional file 2:

Supplemental Table S2. Quality control results.

Format: XLS, Size: 71KB

Additional file 3:

Supplemental Table S3. Pathways mapping for all endpoints.

Format: XLS, Size: 41KB

Additional file 4:

Supplemental methods. Pseudo-code description of the two-level external cross-validation scheme.

Format: PDF, Size: 31KB

Additional file 5:

Supplemental Table S4. Features (probesets) selected in the 120 models.

Format: XLS, Size: 36KB

Additional file 6:

Supplemental Table S5. Estimated and validation performance of all models.

Format: XLS, Size: 53KB