Abstract

In cancer, the primary tumour’s organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumour samples and 88% and 83%, respectively, on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA.

Some cancer patients first present with metastases where the location of the primary is unidentified; these are difficult to treat. In this study, using machine learning, the authors develop a method to determine the tissue of origin of a cancer based on whole genome sequencing data.

Details

Title
A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns
Author
Jiao, Wei 1 ; Atwal Gurnit 2 ; Polak Paz 3 ; Karlic Rosa 4 ; Cuppen, Edwin 5 ; Al-Shahrour, Fatima 6 ; Bailey, Peter J 7 ; Biankin, Andrew V 8 ; Boutros, Paul C 9 ; Campbell, Peter J 10 ; Chang, David K 11 ; Cooke, Susanna L 12 ; Deshpande Vikram 13 ; Faltas, Bishoy M 14 ; Faquin William C 13 ; Garraway Levi 15 ; Getz Gad 16 ; Grimmond Sean M 17 ; Haider Syed 18 ; Hoadley, Katherine A 19 ; Kaiser, Vera B 20 ; Karlić Rosa 4 ; Kato Mamoru 21 ; Kübler, Kirsten 22 ; Lazar, Alexander J 23 ; Li, Constance H 24 ; Louis, David N 13 ; Margolin, Adam 25 ; Sancha, Martin 26 ; Nahal-Bose, Hardeep K 27 ; Petur, Nielsen G 13 ; Nik-Zainal Serena 28 ; Larsson, Omberg 29 ; P’ng Christine 18 ; Perry, Marc D 30 ; Rheinbay Esther 22 ; Rubin, Mark A 31 ; Semple, Colin A 20 ; Sgroi, Dennis C 13 ; Shibata Tatsuhiro 32 ; Siebert Reiner 33 ; Smith, Jaclyn 25 ; Stein, Lincoln D 34 ; Stobbe, Miranda D 35 ; Sun, Ren X 18 ; Thai, Kevin 27 ; Wright, Derek W 36 ; Chin-Lee, Wu 13 ; Yuan Ke 37 ; Zhang, Junjun 27 ; Danyi Alexandra 38 ; de Ridder Jeroen 38 ; van Herpen Carla 39 ; Lolkema, Martijn P 40 ; Steeghs Neeltje 41 ; Morris Quaid 42

1  Ontario Institute for Cancer Research, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X)
2  Ontario Institute for Cancer Research, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X); University of Toronto, Department of Molecular Genetics, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); Vector Institute, Toronto, Canada (GRID:grid.494618.6)
3  Icahn School of Medicine at Mount Sinai, Department of Oncological Sciences, New York, USA (GRID:grid.59734.3c) (ISNI:0000 0001 0670 2351)
4  University of Zagreb, Bioinformatics Group, Division of Molecular Biology, Department of Biology, Faculty of Science, Zagreb, Croatia (GRID:grid.4808.4) (ISNI:0000 0001 0657 4636)
5  Hartwig Medical Foundation, Amsterdam, The Netherlands (GRID:grid.4808.4); University Medical Center Utrecht, Center for Molecular Medicine and Oncode Institute, Utrecht, The Netherlands (GRID:grid.7692.a) (ISNI:0000000090126352)
6  Spanish National Cancer Research Centre (CNIO), Bioinformatics Unit, Madrid, Spain (GRID:grid.7719.8) (ISNI:0000 0000 8700 1153)
7  University of Glasgow, CRUK Beatson Institute for Cancer Research, Bearsden, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X)
8  University of NSW, South Western Sydney Clinical School, Faculty of Medicine, Liverpool, Australia (GRID:grid.1005.4) (ISNI:0000 0004 4902 0432); University of NSW, The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, Sydney, Australia (GRID:grid.1005.4) (ISNI:0000 0004 4902 0432); Glasgow Royal Infirmary, West of Scotland Pancreatic Unit, Glasgow, UK (GRID:grid.411714.6) (ISNI:0000 0000 9825 7840); University of Glasgow, Bearsden, Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X)
9  Ontario Institute for Cancer Research, Computational Biology Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X); University of Toronto, Department of Medical Biophysics, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); University of Toronto, Department of Pharmacology, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); University of California Los Angeles, Los Angeles, USA (GRID:grid.19006.3e) (ISNI:0000 0000 9632 6718)
10  Wellcome Genome Campus, Hinxton, Wellcome Sanger Institute, Cambridge, UK (GRID:grid.19006.3e); University of Cambridge, Department of Haematology, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000000121885934) 
11  University of NSW, The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, Sydney, Australia (GRID:grid.1005.4) (ISNI:0000 0004 4902 0432); University of Glasgow, Bearsden, Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X) 
12  University of Glasgow, Bearsden, Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X) 
13  Massachusetts General Hospital, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924) 
14  Weill Cornell Medical College, New York, USA (GRID:grid.5386.8) (ISNI:000000041936877X) 
15  Dana-Farber Cancer Institute, Boston, USA (GRID:grid.65499.37) (ISNI:0000 0001 2106 9910) 
16  Broad Institute of MIT and Harvard, Cambridge, USA (GRID:grid.66859.34); Massachusetts General Hospital, Center for Cancer Research, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Massachusetts General Hospital, Department of Pathology, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Harvard Medical School, Boston, USA (GRID:grid.38142.3c) (ISNI:000000041936754X) 
17  The University of Melbourne, University of Melbourne Centre for Cancer Research, Melbourne, Australia (GRID:grid.1008.9) (ISNI:0000 0001 2179 088X) 
18  Ontario Institute for Cancer Research, Computational Biology Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X) 
19  University of North Carolina at Chapel Hill, Department of Genetics, Chapel Hill, USA (GRID:grid.10698.36) (ISNI:0000000122483208); University of North Carolina at Chapel Hill, Lineberger Comprehensive Cancer Center, Chapel Hill, USA (GRID:grid.10698.36) (ISNI:0000000122483208) 
20  University of Edinburgh, MRC Human Genetics Unit, MRC IGMM, Edinburgh, UK (GRID:grid.4305.2) (ISNI:0000 0004 1936 7988) 
21  Research Institute, National Cancer Center Japan, Department of Bioinformatics, Tokyo, Japan (GRID:grid.272242.3) (ISNI:0000 0001 2168 5385) 
22  Massachusetts General Hospital, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Broad Institute of MIT and Harvard, Cambridge, USA (GRID:grid.66859.34); Harvard Medical School, Boston, USA (GRID:grid.38142.3c) (ISNI:000000041936754X) 
23  The University of Texas MD Anderson Cancer Center, Departments of Pathology, Genomic Medicine, and Translational Molecular Pathology, Houston, USA (GRID:grid.240145.6) (ISNI:0000 0001 2291 4776) 
24  Ontario Institute for Cancer Research, Computational Biology Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X); University of Toronto, Department of Medical Biophysics, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938) 
25  Oregon Health & Science University, Portland, USA (GRID:grid.5288.7) (ISNI:0000 0000 9758 5690) 
26  Wellcome Genome Campus, Hinxton, Wellcome Sanger Institute, Cambridge, UK (GRID:grid.5288.7); University of Glasgow, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X) 
27  Ontario Institute for Cancer Research, Genome Informatics Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X) 
28  Wellcome Genome Campus, Hinxton, Wellcome Sanger Institute, Cambridge, UK (GRID:grid.32224.35); University of Cambridge, Addenbrooke’s Hospital, Academic Department of Medical Genetics, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000000121885934); University of Cambridge, MRC Cancer Unit, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000000121885934); The University of Cambridge School of Clinical Medicine, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000000121885934) 
29  Sage Bionetworks, Seattle, USA (GRID:grid.430406.5) (ISNI:0000 0004 6023 5303) 
30  Ontario Institute for Cancer Research, Genome Informatics Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X); University of California San Francisco, Department of Radiation Oncology, San Francisco, USA (GRID:grid.266102.1) (ISNI:0000 0001 2297 6811) 
31  University Hospital of Bern, University of Bern, Bern Center for Precision Medicine, Bern, Switzerland (GRID:grid.5734.5) (ISNI:0000 0001 0726 5157); University of Bern, Department for Biomedical Research, Bern, Switzerland (GRID:grid.5734.5) (ISNI:0000 0001 0726 5157); Weill Cornell Medicine and NewYork Presbyterian Hospital, Englander Institute for Precision Medicine, New York, USA (GRID:grid.5386.8) (ISNI:000000041936877X); Weill Cornell Medicine, Meyer Cancer Center, New York, USA (GRID:grid.5386.8) (ISNI:000000041936877X); Weill Cornell Medical College, Pathology and Laboratory, New York, USA (GRID:grid.5386.8) (ISNI:000000041936877X) 
32  National Cancer Center Research Institute, Division of Cancer Genomics, Tokyo, Japan (GRID:grid.272242.3) (ISNI:0000 0001 2168 5385); The University of Tokyo, Minato-ku, Laboratory of Molecular Medicine, Human Genome Center, The Institute of Medical Science, Tokyo, Japan (GRID:grid.26999.3d) (ISNI:0000 0001 2151 536X) 
33  University of Kiel, Human Genetics, Kiel, Germany (GRID:grid.9764.c) (ISNI:0000 0001 2153 9986); Ulm University and Ulm University Medical Center, Institute of Human Genetics, Ulm, Germany (GRID:grid.410712.1) 
34  University of Toronto, Department of Molecular Genetics, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); Ontario Institute for Cancer Research, Computational Biology Program, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X) 
35  CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain (GRID:grid.11478.3b); Universitat Pompeu Fabra (UPF), Barcelona, Spain (GRID:grid.5612.0) (ISNI:0000 0001 2172 2676) 
36  MRC-University of Glasgow Centre for Virus Research, Glasgow, UK (GRID:grid.301713.7) (ISNI:0000 0004 0393 3981); University of Glasgow, Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, Bearsden, United Kingdom (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X) 
37  University of Glasgow, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X); University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000000121885934); University of Glasgow, School of Computing Science, Glasgow, UK (GRID:grid.8756.c) (ISNI:0000 0001 2193 314X) 
38  University Medical Center Utrecht, Center for Molecular Medicine, Utrecht, The Netherlands (GRID:grid.7692.a) (ISNI:0000000090126352) 
39  Radboud University Medical Center, Nijmegen, The Netherlands (GRID:grid.10417.33) (ISNI:0000 0004 0444 9382) 
40  University Medical Center Rotterdam, Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands (GRID:grid.5645.2) (ISNI:000000040459992X) 
41  The Netherlands Cancer Institute, Department of Medical Oncology, Amsterdam, The Netherlands (GRID:grid.430814.a) 
42  Ontario Institute for Cancer Research, Toronto, Canada (GRID:grid.419890.d) (ISNI:0000 0004 0626 690X); University of Toronto, Department of Molecular Genetics, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); Vector Institute, Toronto, Canada (GRID:grid.494618.6); University of Toronto, Department of Computer Science, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); University of Toronto, Donnelly Centre, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938) 
Publication year
2020
Publication date
2020
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2351473344
Copyright
This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.