BRIEF COMMUNICATIONS
PyClone: statistical inference of clonal population structure in cancer
© 2014 Nature America, Inc. All rights reserved.
Andrew Roth1,2, Jaswinder Khattra2, Damian Yap2, Adrian Wan2, Emma Laks2, Justina Biele2, Gavin Ha1,2, Samuel Aparicio2,3, Alexandre Bouchard-Côté4 & Sohrab P Shah2,3
We introduce PyClone, a statistical model for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy-number changes and normal-cell contamination. Single-cell sequencing validation demonstrates PyClone's accuracy.
Human cancer progresses under Darwinian evolution, in which genetic or epigenetic variation alters molecular phenotypes in individual cells1. Consequently, tumors at diagnosis often consist of multiple, genotypically distinct cell populations2 (Supplementary Fig. 1). These populations, referred to as clones, are related through a phylogeny and act as substrates for selection in tumor microenvironments or with therapeutic intervention2,3.
The prevalence of a particular clone measured over time or in anatomic space is a reflection of its growth and proliferative fitness. Thus, ascertaining the dynamic prevalence of clones can help identify precise genetic determinants of phenotypes such as acquisition of metastatic potential or chemotherapeutic resistance.
We provide a hierarchical Bayes statistical model, PyClone (Supplementary Figs. 2 and 3), for analysis of deeply sequenced (coverage >100×) mutations to identify and quantify clonal populations in tumors, which extends to modeling mutations measured in multiple samples from the same patient. Our approach uses the measurement of allelic prevalence to estimate the proportion of tumor cells harboring a mutation (referred to herein as the cellular prevalence). Owing to the cell lysis involved in the preparation of bulk samples for sequencing, we cannot determine the complete set of genomic aberrations defining a clonal population. However, assuming that clonal populations follow a perfect (that is, no site mutates more than once in its evolutionary history, and
1Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada. 2Department of Molecular Oncology, British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada. 3Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada. 4Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada. Correspondence should be addressed to S.P.S. ([email protected]).
RECEIVED 18...