
ARTICLE

Received 15 Jan 2013 | Accepted 7 Jun 2013 | Published 9 Jul 2013

Cancer cell lines are frequently used as in vitro tumour models. Recent molecular profiles of hundreds of cell lines from The Cancer Cell Line Encyclopedia and thousands of tumour samples from the Cancer Genome Atlas now allow a systematic genomic comparison of cell lines and tumours. Here we analyse a panel of 47 ovarian cancer cell lines and identify those that have the highest genetic similarity to ovarian tumours. Our comparison of copy-number changes, mutations and mRNA expression profiles reveals pronounced differences in molecular profiles between commonly used ovarian cancer cell lines and high-grade serous ovarian cancer tumour samples. We identify several rarely used cell lines that more closely resemble cognate tumour profiles than commonly used cell lines, and we propose these lines as the most suitable models of ovarian cancer. Our results indicate that the gap between cell lines and tumours can be bridged by genomically informed choices of cell line models for all tumour types.

DOI: 10.1038/ncomms3126 OPEN

Evaluating cell lines as tumour models by comparison of genomic profiles

Silvia Domcke1,2,*, Rileen Sinha1,*, Douglas A. Levine3, Chris Sander1 & Nikolaus Schultz1

1 Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 460, New York, New York 10065, USA. 2 Department of Chemistry, Technische Universität München, Lichtenbergstraße 4, 85747 Garching bei München, Germany. 3 Department of Surgery, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, New York 10065, USA. *These authors contributed equally to this work. Correspondence and requests for materials should be addressed to N.S. (email: [email protected]).

NATURE COMMUNICATIONS | 4:2126 | DOI: 10.1038/ncomms3126 | http://www.nature.com/naturecommunications


Cell lines derived from tumours are the most frequently utilized models in cancer research and their use has advanced the understanding of cancer biology tremendously over the past decades. Genomic differences between cancer cell lines and tissue samples have been pointed out in several studies1–4. However, owing to the lack of large-scale genomic data, finding the cell lines that most closely resemble the genomic alterations of a given tumour (sub)type has been difficult. Now, for the first time, a large set of molecular profiles is available for both tumour samples and cell lines: in The Cancer Genome Atlas (TCGA), the genomes and expression profiles of at least 500 tissue samples per tumour type are being comprehensively characterized5. The Broad-Novartis Cancer Cell Line Encyclopedia (CCLE) contains genomic profiles of around 1,000 cell lines that are used as models for various tumour types6. These efforts enable a systematic comparison of tumours and cell lines at the level of DNA copy-number, mutation and mRNA expression data across a diversity of tumour types. In this pilot study, we focus on high-grade serous ovarian cancer (HGSOC) and seek to identify the ovarian cancer cell lines most suitable as in vitro models based on comparison of the available genomic profiles.

Every year, >100,000 women around the globe die of ovarian cancer7. In the United States, ovarian cancer is the most lethal gynaecological malignancy and the fifth leading cause of cancer death for women8.

Epithelial ovarian cancer is traditionally divided into four major histological subtypes: serous, endometrioid, clear cell and mucinous carcinoma. Serous ovarian carcinoma is responsible for ~70% of epithelial ovarian cancers9. The most aggressive subtype, HGSOC, accounts for 90% of these serous carcinomas10 and two-thirds of all ovarian cancer deaths11, making it by far the most extensively studied ovarian carcinoma.

Until recently, all histological subtypes were believed to arise from the ovarian surface epithelium and were often not differentiated in preclinical research or clinical trials. However, the discovery that the majority of invasive tumours may stem from different non-ovarian tissues accompanied by molecular analysis of the respective subtypes has led to the recognition that ovarian cancer is extremely heterogeneous and in fact comprises several distinct diseases12,13.

The most commonly used cell line models for ovarian cancer, and implicitly for the most prevalent subtype, HGSOC, are SK-OV-3, A2780, OVCAR-3, CAOV3 and IGROV1 (quantified via Pubmed citations, see Results). However, their histopathological origin is partly unclear, and the need for well-characterized cell lines as models for the respective subtypes of ovarian cancer has been repeatedly voiced12,13.

Our comparison of data from TCGA and the CCLE reveals striking differences between some of the most commonly used cell line models and the majority of HGSOC samples. On the basis of our findings, we recommend an alternative set of ovarian cancer cell lines more suitable for in vitro studies of HGSOC. Although conclusions based on in vitro cell line experiments are not necessarily valid in a clinical setting, choosing cell lines most representative of certain subtypes should increase the value of cell line studies in preclinical research.

Results

Genomic characterization of HGSOC. The TCGA study revealed three major genomic features of HGSOC. First, copy-number alterations (CNAs) are remarkably common in HGSOC, with the median fraction of the genome altered as large as 46% (Fig. 1a, Supplementary Fig. S1A). Second, TP53 mutations are near universal (95% of samples), and the few tumours with wild-type TP53 predominantly have flat copy-number profiles (Fig. 1a). Third, the overall frequency of somatic mutation in protein-coding regions is low, with only TP53, BRCA1 and BRCA2 mutated in >10% of samples (Fig. 1b)5.
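As a rough illustration of the "fraction of the genome altered" summary mentioned above, the following sketch computes it from a list of copy-number segments. The segment layout, the function name `fraction_genome_altered` and the log2-ratio threshold of 0.2 are assumptions made for this example only; they are not taken from the TCGA or CCLE processing pipelines.

```python
from typing import List, Tuple

# A segment is (start, end, log2_ratio). This layout is a hypothetical choice.
Segment = Tuple[int, int, float]


def fraction_genome_altered(segments: List[Segment], threshold: float = 0.2) -> float:
    """Fraction of covered bases whose |log2 ratio| exceeds the threshold.

    The threshold value is an assumption for illustration, not a published cutoff.
    """
    total = sum(end - start for start, end, _ in segments)
    if total == 0:
        return 0.0
    altered = sum(
        end - start for start, end, ratio in segments if abs(ratio) > threshold
    )
    return altered / total


# Toy profile: 400 covered bases, of which 100 lie in gained or lost segments.
example = [(0, 100, 0.0), (100, 150, 0.8), (150, 200, -1.1), (200, 400, 0.05)]
print(fraction_genome_altered(example))
```

A sample with a flat profile would score close to 0 under this definition, whereas a value near 0.46 would correspond to the median reported for the HGSOC tumours.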

These features set the HGSOC subtype apart from the low-grade serous, endometrioid, clear cell and mucinous ovarian carcinomas, which have near-normal gene copy-numbers and wild-type TP53 (refs 14–16). Comprehensive information on protein mutations in these other subtypes of ovarian cancer is not yet available, but the known mutations differ strongly from the mutation spectrum of HGSOC. For instance, two-thirds of low-grade serous carcinomas carry mutations in KRAS, BRAF or ERBB2 (refs 17–19). Low-grade endometrioid carcinoma is characterized by ARID1A mutations in one-third of the tumours20, as well as CTNNB1 mutations21, PTEN mutations22 and PIK3CA mutations23. ARID1A mutations are similarly found in nearly half of clear cell carcinomas20, and PIK3CA mutations are common23. The majority of mucinous carcinomas are mutated in KRAS (ref. 18).

Interestingly, some of the HGSOC tumour samples profiled in TCGA with wild-type TP53 have mutations in one of the genes typically altered in non-HGSOC subtypes, as well as uncharacteristically flat copy-number profiles (Fig. 1a), casting doubt on their origin. Histopathological reassessment of these tumour samples should reveal whether they truly belong to the HGSOC or rather a different ovarian cancer subtype. In an independent collection, HGSOC samples with wild-type TP53 in fact showed diverse histology after pathological review or evidence of TP53 dysfunction24, implying that loss of TP53 function is truly universal in HGSOC.

Comparison of cell lines and tumour samples. At first glance, the CCLE ovarian cancer cell line panel appears to have overall genomic similarity to the HGSOC tissue samples. On the DNA copy-number level, the median fraction of the genome altered in the 47 ovarian cancer cell lines in the CCLE data set is quite similar to that of the TCGA tumours, although the distribution is wider for the cell line panel (Fig. 1a, Supplementary Fig. S1A). This is not surprising, given that the CCLE data set encompasses diverse subtypes of ovarian cancer, which are known to differ drastically in their copy-number status16. The most frequent CNAs in TCGA are all represented to some extent among the CCLE ovarian cancer cell lines (Fig. 1b). The most recurrently mutated genes in HGSOC are also mutated in a considerable fraction of the cell lines (Fig. 1b): TP53 is mutated in 62% of cell lines, and BRCA1 and BRCA2 in 6% and 9%, respectively.

However, closer inspection reveals substantial differences between some of the cell lines and the tumours. In general, more mutations were identified in the cell lines in the 1,651 genes profiled in both studies (median frequency of 4.3 per Mb in cell lines versus 1.6 per Mb in tumours; Supplementary Fig. S1B). Several factors plausibly contribute to the larger number of mutations reported for the cell line panel. First, cell lines are purer than tumour samples, which tend to be contaminated with stromal cells. Second, apart from BRCA1 and BRCA2, the TCGA study only considers somatic mutations, whereas the mutations identified in the CCLE also include private germline variants. Third, mutations acquired during in vitro culturing are a further possible contributing factor.

Apart from general differences between cell lines and tumours, some of the cell lines in the panel further differ from the HGSOC tumour samples because they probably originate from other, non-HGSOC subtypes. For example, PIK3CA is mutated in 19% of the ovarian cancer cell lines but in <1% of TCGA HGSOC samples (Fig. 1a). Although mutations in PIK3CA, KRAS, PTEN, BRAF and ARID1A are uncommon in HGSOC, they are characteristic of other ovarian cancer subtypes. Strikingly, a higher mutation frequency in one of these non-HGSOC oncogenes or tumour suppressors tends to coincide with a flat copy-number profile and wild-type TP53 (Fig. 1a), making these cell lines possible models of low-grade serous, endometrioid, clear cell or mucinous ovarian carcinoma17–26.

[Figure 1 graphic not reproduced. Panel a: subtype annotation (serous, clear cell, mucinous, endometrioid, other, mixed, NS), mutations in TP53, BRCA1, BRCA2, PIK3CA, PTEN, KRAS, BRAF and ARID1A, and CNA across chromosomes (copy-number scale −1.5 to 1.5) for 316 high-grade serous ovarian cancer tumour samples (TCGA) and 47 ovarian cancer cell lines (CCLE). Brackets mark tumours with TP53 wt and low CNA, and cell lines with TP53 wt, low CNA and mutations in non-HGSOC genes. Cell lines, in order of decreasing fraction of the genome altered: SNU119, OVCAR4, CAOV3, EFO21, JHOS4, SNU8, JHOC5, FUOV1, KURAMOCHI, SNU840, OVKATE, OVISE, COV318, ONCODG1, OVSAHO, JHOS2, OAW42, OAW28, COV644, HS571T, COV362, NIHOVCAR3, OVCAR8, COV504, JHOM1, TYKNU, RMUGS, OVMANA, CAOV4, ES2, OC316, RMGI, 59M, JHOM2B, OV90, HEYA8, COLO704, OVTOKO, OV56, SKOV3, EFO27, MCAS, OVK18, A2780, TOV21G, IGROV1, COV434. Panel b: percentage of samples with copy-number alteration (amplification or homozygous deletion; genes MYC, CCNE1, ALG8, TACC3, RB1, PTEN, MECOM, KRAS, SOX17 and NF1) and percentage of samples with mutation (genes TP53, BRCA1, BRCA2, CSMD3, NF1, CDK12 and RB1), each shown for tumours and cell lines.]

Figure 1 | Genomic comparison of TCGA HGSOC samples with CCLE ovarian cancer cell lines suggests overall genomic similarity. (a) CNA profiles (right, chromosomes 1–22) and mutations (left, in eight selected genes) of HGSOC patient samples from TCGA (top) and ovarian cancer cell lines from the CCLE (bottom). The samples are sorted according to decreasing fraction of the genome altered in DNA copy number. Somatic mutations in genes known to be commonly altered in one of the four epithelial ovarian cancer subtypes are indicated on the left, with germline mutations included for BRCA1 and BRCA2 in the tumour samples in addition to the somatic mutations. Note the samples with a low degree of CNA coinciding with wild-type TP53 copies near the bottom of each panel (square bracket). (b) The most frequent genomic alterations identified in HGSOC tumour samples and their occurrence in the ovarian cancer cell line panel: CNAs (left) and mutations (right).

Five ovarian cancer cell lines are hypermutated. Although the ovarian cancer cell lines typically have a slightly higher number of mutations than HGSOC tumour samples, they have a similar degree of CNAs (Fig. 2). However, five cell lines are outliers with opposite characteristics: IGROV1, OC316, EFO27, OVK18 and TOV21G not only have few CNAs but also surprisingly many mutations. This hypermutator genotype sets them clearly apart from the rest of the ovarian cancer cell lines and from the HGSOC tissue samples.
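The outlier pattern described here (many mutations, few CNAs) can be expressed as a simple two-threshold filter. The sketch below is only an illustration: the per-line values are those printed in Figs 2 and 3 for a subset of the panel, whereas the function name `hypermutated` and both cutoffs (more than 10 mutations per Mb, less than 0.3 of the genome altered) are assumptions chosen for this example and are not criteria stated in the text.

```python
# (name, fraction of genome altered, mutations per Mb) for a subset of the panel,
# values as printed in Figs 2 and 3.
CELL_LINES = [
    ("KURAMOCHI", 0.54, 5.32),
    ("OVSAHO", 0.43, 4.32),
    ("SKOV3", 0.14, 5.93),
    ("A2780", 0.07, 4.59),
    ("IGROV1", 0.04, 20.66),
    ("OVK18", 0.09, 14.41),
    ("EFO27", 0.13, 16.14),
    ("OC316", 0.25, 19.04),
    ("TOV21G", 0.05, 13.42),
]


def hypermutated(lines, min_mut_per_mb=10.0, max_fraction_altered=0.3):
    """Names of lines with a high mutation rate and a flat copy-number profile.

    Both thresholds are illustrative assumptions, not cutoffs from the study.
    """
    return [
        name
        for name, fraction_altered, mut_per_mb in lines
        if mut_per_mb > min_mut_per_mb and fraction_altered < max_fraction_altered
    ]


print(hypermutated(CELL_LINES))
```

With these cutoffs the filter returns the five lines named above, while flat-profile lines with ordinary mutation rates such as SKOV3 and A2780 are not flagged.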

Key genomic features of suitable cell line models of HGSOC. Although the altered fraction of the genome and total mutation count can reveal clear outliers among the cell lines, these criteria are summary properties and not sufficient for identifying appropriate tumour models. For a more detailed gene-by-gene comparison of copy-number data, we calculated the correlation of the CNA profile of each cell line to the tumours (see Methods, Supplementary Data S1). For identifying cell lines resembling the majority of HGSOC tumour samples, the correlation with the mean CNA of tumours is of most interest. This mean CNA profile takes into account amplifications or deletions consistently present in many samples, whereas conflicting or noisy copy-number values are averaged out (Fig. 3, left).
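A minimal sketch of this comparison is given below, assuming that each CNA profile is a vector of per-gene copy-number values in a shared gene order and that the correlation is Pearson's r. The text only says "correlation", so the choice of Pearson, the function names and the toy numbers are all assumptions for illustration.

```python
import math
from typing import List, Sequence


def mean_profile(profiles: List[Sequence[float]]) -> List[float]:
    """Gene-wise mean across tumour profiles (all profiles share one gene order)."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 by convention if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene copy-number values for three tumours and two cell lines.
tumours = [
    [1.0, 0.0, -1.0, 0.5],
    [0.8, 0.2, -0.6, 0.3],
    [1.2, -0.2, -0.8, 0.1],
]
tumour_mean = mean_profile(tumours)
similar_line = [0.9, 0.1, -0.7, 0.2]
flat_line = [0.0, 0.0, 0.0, 0.0]

print(round(pearson(similar_line, tumour_mean), 3))
print(pearson(flat_line, tumour_mean))
```

Averaging first and correlating second means that alterations shared by many tumours dominate the reference profile, which is the property the paragraph above emphasizes.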

In addition to these average properties, alterations in cancer genes known to have a specific functional role in diverse cancer subtypes can help distinguish more or less suitable cell line models (Fig. 3, right). To best discriminate between HGSOC and other ovarian cancer subtypes, we chose, on one hand, alterations characteristic of HGSOC, such as mutations in TP53 and BRCA1/2 as well as amplifications in C11orf30 (EMSY), CCNE1, MYC, PIK3CA and KRAS (ref. 5) and, on the other hand, mutations in a subset of genes recurrently altered by mutation only in other ovarian cancer subtypes (PIK3CA, PTEN, ERBB2, KRAS, BRAF, CTNNB1 and ARID1A).

Ranking of cell lines by suitability as HGSOC models. To evaluate the suitability of any particular cell line as a model for HGSOC tumours, we defined a set of plausible criteria. Although these criteria are not exhaustive, nor tailored to particular research questions, they can be a reasonable guide to avoid clearly unsuitable cell lines and choose those that at least resemble tumour samples in terms of overall and specifically functional criteria. We thus divided the 47 ovarian cancer cell lines into good, moderate and poor models of HGSOC using an empirical numerical score (see Methods). The suitability score is higher (1) the better the correlation between the copy-number profile of the cell line and the mean copy-number profile of HGSOC tumour samples; (2) the lower the frequency of non-synonymous mutations in protein-coding genes; (3) in the presence of a TP53 mutation; and (4) in the absence of mutations in the seven non-HGSOC genes (see above) commonly altered in other ovarian cancer subtypes (Fig. 3, Supplementary Data S1). Applying the score leads to a reasonable ordering of the cell lines from the most suitable (top, green, Fig. 3) to the least suitable (bottom, red, Fig. 3). This order is useful for selecting or deselecting cell lines, but is not meaningful as a finely graduated ranking.
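To make the direction of the four criteria concrete, the sketch below combines them into a single number. It is not the score defined in Methods, which is not reproduced in this section: the additive form, the weights (a bonus of 1 for a TP53 mutation, a penalty of 1 per mutated non-HGSOC gene, 0.05 per mutation per Mb), the `CellLine` record and the two example lines are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

# The seven non-HGSOC genes listed in the text.
NON_HGSOC_GENES = frozenset(
    {"PIK3CA", "PTEN", "ERBB2", "KRAS", "BRAF", "CTNNB1", "ARID1A"}
)


@dataclass(frozen=True)
class CellLine:
    name: str
    cna_correlation: float      # correlation with mean tumour CNA profile
    mutations_per_mb: float     # non-synonymous mutation frequency
    mutated_genes: FrozenSet[str]


def suitability(line: CellLine, mutation_penalty: float = 0.05) -> float:
    """Illustrative score; every weight here is an assumption, not the published one."""
    score = line.cna_correlation                          # criterion 1
    score -= mutation_penalty * line.mutations_per_mb     # criterion 2
    if "TP53" in line.mutated_genes:                      # criterion 3
        score += 1.0
    score -= len(NON_HGSOC_GENES & line.mutated_genes)    # criterion 4
    return score


# Two hypothetical cell lines, not taken from the CCLE panel.
panel = [
    CellLine("line_A", 0.55, 4.0, frozenset({"TP53", "BRCA2"})),
    CellLine("line_B", 0.10, 18.0, frozenset({"ARID1A", "PIK3CA", "PTEN"})),
]
ranked = sorted(panel, key=suitability, reverse=True)
print([line.name for line in ranked])
```

Any monotone combination of the four criteria would produce a similar coarse ordering, which is consistent with the caution above that the order should not be read as a finely graduated ranking.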

Good and bad cell line models. The grouping by suitability score in Fig. 3 provides a guide to cell line selection. The cell lines near the top feature the major genomic characteristics of HGSOC, and thus seem best suited as in vitro models for HGSOC. These cell lines have a TP53 mutation but no mutation in the seven non-HGSOC genes. Their copy-number profiles correlate well with the mean CNA of all tumours. They also have a high correlation with the copy-number profile of a single tumour sample (Supplementary Data S1), and their alteration pattern in the ovarian cancer-specific gene set matches the TCGA samples. Strikingly, the twelve best candidates in this analysis account for only 1% of current Pubmed citations out of the 47 analysed cell lines, although HGSOC is by far the most prevalent and extensively studied ovarian cancer subtype (Fig. 3, Supplementary Fig. S2).

[Figure 2 graphic not reproduced: scatter plot of mutations per million bases (horizontal axis, 0 to 20) against fraction of the genome altered (vertical axis, 0.0 to 1.0) for tumours and cell lines. Labelled cell lines: SNU119, COV362, OVCAR4, COV318, KURAMOCHI and OVSAHO; labelled as hypermutated: OC316, OVK18, EFO27, TOV21G and IGROV1.]

Figure 2 | Hypermutated cell lines are outliers. The comparison of mutation frequency (horizontal) and degree of CNA (vertical) for HGSOC tumour samples (blue) and ovarian cancer cell lines (red) reveals a subset of cell lines (dashed ellipse) with a hypermutator genotype (high mutation frequency, few DNA copy-number changes). The hypermutated cell lines (mutation frequency in parentheses) are: IGROV1 (20.7/Mb), OC316 (19.0/Mb), EFO27 (16.1/Mb), OVK18 (14.4/Mb) and TOV21G (13.4/Mb). Cell lines that on the contrary resemble the tumour samples in key characteristics (as shown below in Fig. 3) are also labelled.

Figure 3 | Ranking ovarian cancer cell lines by suitability as HGSOC models. Both average properties (left) and selected genetic events specific to ovarian cancer (right) can be used to distinguish better and poorer models of HGSOC. Average properties include the histological subtype as determined in the original publication (references in Supplementary Data S1), the citation frequency in the literature as estimator of frequency of use in laboratories, the altered fraction of the genome, the number of mutations per million bases and the correlation with the mean CNA profile of HGSOC tumour samples. The selected genetic events include alterations recurrently found either in HGSOC (mutation of TP53, BRCA1 or BRCA2; amplification of C11orf30 (EMSY), CCNE1, KRAS or MYC; mutation or deletion of RB1) or one of the three other major subtypes of ovarian cancer (mutation in PIK3CA, PTEN, KRAS, BRAF, CTNNB1 or ARID1A; mutation or amplification in ERBB2). The colour gradient underlying the cell line names to the left indicates better (green) versus poorer (red) models of HGSOC according to selected characteristics (TP53 status, correlation with mean CNA profile of TCGA samples, low mutation rate and absence of mutations in the seven non-HGSOC genes, see Supplementary Data S1). The hypermutated cell lines described in Fig. 2 are located at the bottom of the table. Note that although HGSOC cell lines are probably at the top and unsuitable cell lines are at the bottom of the table (vertical labels), the order does not signify an exact ranking of cell line models.

[Figure 3 graphic not reproduced. Columns in the original: cell lines; average properties (# Pubmed citations, scale 0 to 2,000; original annotation of subtype as serous, clear cell, mucinous, endometrioid, other, mixed or NS; fraction genome altered; mutations per Mb; correlation w/ CNA of HGSOC); and selected genetic events (mutation, homozygous deletion or amplification) in TP53, C11orf30, BRCA2, BRCA1, RB1, CCNE1, MYC, PIK3CA, KRAS, ERBB2, PTEN, CTNNB1, BRAF and ARID1A. Vertical group labels, top to bottom: likely high-grade serous, possibly high-grade serous, unlikely high-grade serous, hypermutated. The three numeric columns are listed below in the order of the figure.]

Cell line    Fraction genome altered    Mutations per Mb    Correlation w/ CNA of HGSOC
KURAMOCHI    0.54    5.32    0.58
OVSAHO       0.43    4.32    0.54
SNU119       0.74    4.05    0.54
COV362       0.37    4.09    0.52
OVCAR4       0.70    4.63    0.46
COV318       0.45    3.12    0.46
JHOS4        0.60    4.13    0.43
TYKNU        0.33    5.06    0.42
OVKATE       0.47    4.60    0.39
CAOV4        0.29    3.55    0.38
OAW28        0.42    4.22    0.38
JHOS2        0.43    4.31    0.34
CAOV3        0.69    3.76    0.33
59M          0.24    3.23    0.32
ONCODG1      0.44    2.71    0.32
FUOV1        0.56    5.05    0.30
NIHOVCAR3    0.35    3.40    0.30
ES2          0.28    6.00    0.44
JHOM2B       0.23    4.49    0.42
SNU8         0.60    2.99    0.41
COV504       0.34    3.16    0.25
OV90         0.22    3.88    0.19
RMUGS        0.33    4.91    0.15
JHOM1        0.34    2.66    0.13
HS571T       0.37    2.83    0.04
OVCAR8       0.34    5.80    0.18
COV644       0.39    4.06    0.41
EFO21        0.60    3.31    0.34
JHOC5        0.57    4.31    0.26
SNU840       0.52    3.04    0.36
COLO704      0.20    9.48    0.34
OVISE        0.46    5.49    0.29
OAW42        0.43    3.28    0.43
OVTOKO       0.18    5.05    0.08
OVMANA       0.33    6.03    0.35
RMGI         0.24    3.78    0.03
HEYA8        0.21    3.93    0.28
MCAS         0.11    5.37    0.25
COV434       0.00    4.15    0.00
OV56         0.15    5.09    0.14
SKOV3        0.14    5.93    0.05
A2780        0.07    4.59    0.20
IGROV1       0.04    20.66   0.23
OVK18        0.09    14.41   0.21
EFO27        0.13    16.14   0.04
OC316        0.25    19.04   0.31
TOV21G       0.05    13.42   0.05


Of the dozen cell lines with the highest suitability score, many were indeed classified as serous cell lines by the pathologists in the original publications (Fig. 3). For a sizeable group, the histological subtype was not or could not be specified in the original publication. Among these is the cell line with the highest suitability score, KURAMOCHI: its copy-number profile highly correlates with the mean CNA of HGSOC tumours and the copy-number profile of a single tumour, it has a low mutation frequency and HGSOC-specific alterations in key oncogenes and tumour suppressors (Fig. 3, Supplementary Data S1). In the original publication, however, KURAMOCHI is rather ambiguously classified as undifferentiated carcinoma25,26. As this cell line has all the major characteristics of HGSOC, our analysis implies that it was in fact derived from this tumour subtype. Interestingly, a cell line classified as endometrioid in the original publication, COV362, is among the top-ranking HGSOC-like cell lines27. Although this is surprising at first sight, high-grade endometrioid carcinomas are in fact difficult to distinguish from HGSOC at the morphological and molecular level28. As these tumours also have high CNA and mutations in TP53 (ref. 16), it has been recently suggested that they actually belong to the HGSOC subtype12,29. Taken together, these observations highlight that the subtypes assigned to cell lines at their derivation based on histopathology are not necessarily identical to their molecular subtypes.

Although several cell lines resemble HGSOC tumour samples, there are also several cell lines that have little resemblance to HGSOC and a low suitability score (Fig. 3, bottom), among them the hypermutated cell lines mentioned above. Most of these poorly matched cell lines were not classified as high-grade serous by the pathologists in the original publication. The lack of HGSOC features in these cell lines stemming from ovarian carcinomas of the endometrioid, clear cell or mucinous subtype can be explained by the substantial molecular differences between the diverse subtypes of ovarian cancer17–26. However, among the low-ranking cell lines, there are also some that were classified as serous in the original publication. Low- and high-grade serous carcinomas were not differentiated in most of the original publications. Today these subtypes are recognized as different diseases with diverse genomic profiles20–22. Some of the low-ranking cell lines whose parent tumours were described as serous but have only modest CNA and an uncharacteristic mutation profile, therefore plausibly stem from low-grade serous tumours. Again, this suggests that insights from molecular profiles can help to refine the traditional histopathological annotation of subtypes.

All cell lines were derived at least 13 years ago and have been in passage for a considerable time. A substantial number were derived not directly from primary tumours in the ovary but rather from ascitic fluid or peritoneal deposits (Supplementary Data S1). Interestingly, no correlation was observed between the time of derivation of a cell line (as substitute measure for passage number) or the specimen site and the estimated suitability as tumour model.

For some preclinical studies, cell lines with BRCA mutations are of particular interest, given the implication of this gene in the prevention and treatment of HGSOC5,12,30. The fraction of BRCA mutation carriers lies at roughly 10% for both the ovarian cancer cell line panel and the HGSOC tumour samples (Fig. 1b). However, out of the six cell lines with a BRCA mutation, two are among the hypermutated cell lines (IGROV1, OC316) and one has wild-type TP53 and uncharacteristic mutations (OVMANA). The remaining three, namely the top-ranking HGSOC-like cell line, KURAMOCHI, and two further cell lines (COV362, JHOS2), constitute possible models for in vitro investigation of BRCAness in HGSOC (Fig. 3).

Popular cell line models do not closely resemble HGSOC tumours. SK-OV-3, A2780, OVCAR-3, CAOV3 and IGROV1 are the most popular cell line models as quantified by Pubmed citations, accounting for 90% of publications mentioning at least one of the 47 CCLE ovarian cancer cell lines (Supplementary Fig. S2). Although the exact histological origin is not specified in the original reference for most of them, they are commonly used as models for HGSOC. OVCAR-3 and CAOV3 possess TP53 mutations and substantial copy-number changes, key characteristics of HGSOC. However, they are not among the top-ranking HGSOC-like cell lines owing to a lower correlation value with the mean CNA as well as lower correlation values with the CNA of individual tumours (Fig. 3, Supplementary Data S1). Strikingly, the two most frequently used cell lines, SK-OV-3 and A2780, which together account for 60% of publications on this cell line panel, are poorly suited as models for HGSOC. Both have a very flat copy-number profile, and they do not have TP53 mutations but instead mutations frequently found in other histological subtypes, such as ARID1A, BRAF, PIK3CA and PTEN mutations. This lack of HGSOC characteristics stands in stark contrast to the frequent use of these cell lines as models for this subtype.

IGROV1 is most probably not of the HGSOC subtype. IGROV1 is often quoted as being of the HGSOC subtype31–43. However, its flat copy-number profile and high mutation frequency place it among the hypermutators described above (Figs 1a, 2 and 3). The large number of mutations is most probably due to frameshift mutations in the DNA repair genes MLH1, MSH3 and MSH6. Similar loss of MLH1 or MSH2 expression has been observed in endometrioid cancers44. With frameshift mutations in ARID1A, an activating missense mutation in PIK3CA (R38C) (ref. 45) and an inactivating missense mutation in PTEN (Y155C) (ref. 46), IGROV1 not only has the overall genomic profile but also several specific signature mutations of endometrioid carcinoma. Especially, the co-occurrence of PIK3CA and PTEN mutations is rare in general but has been described in both endometrial and endometrioid carcinomas15,47.

Expression profiles of tumours and cell lines were compared to further corroborate our observations made on the copy-number and mutation level. We computed the correlation of the expression profiles of all cell line and tumour pairs, and ranked the cell lines by the average of their correlations with the tumours. The correlation between this ranking and the ordering produced by the suitability score assigned based on copy-number and mutation data is highly significant (P-value 1.27e-05, Kendall's tau rank-correlation test; Supplementary Data S2). Clustering both the ovarian cancer cell lines and the HGSOC tumours based on expression data is not as informative of the relative suitability of the 47 cell lines as tumour models, as a clear division between cell lines and tumours is observed, both by unsupervised clustering as well as by principal component analysis (Supplementary Fig. S3)48. Expression-based clustering of all CCLE cell lines from all tumour types, however, groups most cell lines according to their tissue of origin, thus providing valuable information (Fig. 4)49. Interestingly, IGROV1 clusters with endometrial and clear cell ovarian cancer cell lines. In light of the recent discovery linking both endometrioid and clear cell ovarian cancers to endometriosis, this observation is no longer surprising20. Taken together, these findings imply that IGROV1 is of endometrioid or clear cell rather than high-grade serous origin. In fact, the original publication describes the parent tumour as mainly endometrioid carcinoma with serous, clear cell and undifferentiated areas50.
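For readers unfamiliar with the statistic used to compare the two orderings, the sketch below computes Kendall's tau for two rankings of the same items. It implements the tau-a variant, which assumes no ties, and it returns only the coefficient, not a P-value; which variant and software were used in the study is not stated in this section, so this is an illustration rather than a reproduction. The example ranks are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(rank_a: Sequence[float], rank_b: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs, assuming no ties."""
    if len(rank_a) != len(rank_b):
        raise ValueError("both rankings must cover the same items")
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        direction = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ranks of five cell lines under two scoring schemes.
by_suitability = [1, 2, 3, 4, 5]
by_expression = [1, 3, 2, 4, 5]
print(kendall_tau(by_suitability, by_expression))
```

In practice a library routine such as `scipy.stats.kendalltau`, which also reports a P-value and handles ties, would be the more convenient choice.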


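[Figure 4 graphic not reproduced: dendrogram of CCLE cell lines colour-coded by tissue of origin (autonomic ganglia, biliary tract, bone, breast, CNS, endometrium, haematopoietic and lymphoid tissue, large intestine, kidney, liver, lung, oesophagus, ovary, pancreas, pleura, prostate, salivary gland, skin, small intestine, soft tissue, stomach, thyroid, upper aerodigestive tract, urinary tract), with a key for the histologic subtype of ovarian lines (other, mixed, clear cell, serous, NS) and asterisks marking ovarian lines that cluster with other tissue types. Branch labels include haematopoietic and lymphoid tissue, GIT, aerodigestive tract, breast, lung, pancreas, skin, CNS, kidney, ovary, liver and endometrium.]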

Figure 4 | Expression-based clustering of all 963 CCLE cell lines from diverse tumour types. The 5,000 most variable genes were used for unsupervised clustering of cell lines by mRNA expression data. Cell lines are colour-coded (vertical bars) according to the reported tissue of origin (a PDF version that can be enlarged at high resolution is in Supplementary Information, Supplementary Fig. S4); horizontal labels at bottom indicate the dominating tissue types within the respective branches of the dendrogram. Most ovarian cancer cell lines (magenta) cluster together, interspersed with endometrial cell lines. However, some ovarian cancer cell lines cluster with other tissue types (*). Top right panels: neighbourhoods (1) of the top cell lines in our analysis, (2) of cell line IGROV1, and (3) of cell line A2780. For the ovarian cancer cell lines in these enlarged areas, the histological subtype as assigned in the original publication is indicated by coloured letters.
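The gene-selection step mentioned in the caption can be sketched as follows. The caption states only that the 5,000 most variable genes were used; measuring variability by population variance, the dictionary layout and the function name `most_variable_genes` are assumptions made for this illustration, and the three-gene input is a toy example.

```python
from statistics import pvariance
from typing import Dict, List, Sequence


def most_variable_genes(expression: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the k genes with the largest variance across cell lines.

    `expression` maps a gene name to its expression values, one per cell line.
    Using variance as the variability measure is an assumption.
    """
    ranked = sorted(expression, key=lambda gene: pvariance(expression[gene]), reverse=True)
    return ranked[:k]


# Toy expression matrix: three genes measured in three cell lines.
toy = {
    "GENE_1": [1.0, 1.0, 1.0],   # constant, uninformative for clustering
    "GENE_2": [0.0, 5.0, 10.0],  # highly variable
    "GENE_3": [2.0, 3.0, 4.0],   # moderately variable
}
print(most_variable_genes(toy, 2))
```

Restricting the input to the most variable genes removes features that cannot separate samples, so the subsequent unsupervised clustering is driven by genes that actually differ between cell lines.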

Expression clustering suggests diverse tissues of origin. IGROV1 is not the only cell line that has acquired an inaccurate subtype label in the literature. The field has come to realize that several ovarian tumours in fact do not originate in this organ but rather constitute metastases stemming from distant primary tumours12. Interestingly, several CCLE ovarian cancer cell lines cluster with non-ovarian cancer types by mRNA expression data (Fig. 4, Supplementary Data S1). Among these is A2780, the second most commonly used ovarian cancer cell line. By expression, it clusters far from the majority of ovarian cancer cell lines with the lung, liver, stomach and small intestine cancer cell lines, and its copy-number and mutation profiles show no resemblance to the TCGA samples (Figs 1a, 3 and 4).

Some cell lines are not classified as HGSOC, although they have all hallmarks of this cancer subtype. An especially striking example of this is KURAMOCHI, which is one of the top HGSOC-like cell lines in the above analysis and clusters with serous ovarian cancer cell lines in the expression data set. Indeed, the top-ranking HGSOC-like cell lines in terms of CNA and mutation patterns all cluster together in the expression data analysis (Fig. 4). These cell lines, assigned a high suitability score based on their genomic features, therefore also share somewhat similar mRNA expression profiles, further corroborating that they stem from the same tissue type, that is, HGSOC tumours.

In short, several cell lines considerably resemble HGSOC with respect to copy-number, mutation and expression data. On the other hand, three cell lines commonly used as models for this subtype, namely SK-OV-3, A2780 and IGROV1, have little profile similarity to the tumours.

Discussion

Several publications have recently pointed out the need for good cell line models of the distinct subtypes of ovarian cancer and especially the most prevalent HGSOC12,13. Which cell line is the optimal tumour model depends on numerous factors such as the problem at hand, the specific genomic alterations of interest as well as more practical issues like growth characteristics, and thus has no single answer. However, for certain studies, such as drug sensitivity assessment, maximal molecular similarity to tissue samples is desirable. Our analysis can serve as a general guideline for choosing appropriate and avoiding poorly suited cell line models of HGSOC.

Alarmingly, this study reveals that the most frequently used cell lines seem for the most part badly suited for investigating HGSOC, whereas the cell lines that more closely resemble the tumours are rarely used in laboratories. Indeed, the dozen top-ranking HGSOC-like cell lines account for only 1% of the PubMed citations of the 47 analysed cell lines, although HGSOC is by far the most prevalent and extensively studied ovarian cancer subtype. Although limited commercial availability could have contributed to the infrequent use of some of the top-ranking HGSOC cell lines, it cannot fully explain it: the top HGSOC-like cell lines are all obtainable from one of the major commercial distributors (ref. 6). Another plausible reason for the striking discrepancy between suitability and frequency of use of cell lines is the ambiguity of subtype annotations of cell lines in the literature.

For several cell lines, the subtype assumed in publications is not mirrored by the molecular profiles. The most striking example, IGROV1, has a hypermutator genotype and is possibly of endometrioid or clear cell origin. Although the original publication describes mainly the endometrioid nature of the parent tumour, over the years the subtype annotation of IGROV1 in publications has evolved to HGSOC. Further examples of miscommunication in the literature are the most frequently used ovarian cancer cell lines, A2780 and SK-OV-3, which were not assigned any histological subtype by the originators, but today are widely assumed to be good models of HGSOC. On the other hand, there are cell lines like KURAMOCHI or OVCAR-4, which are not as frequently used and could not be assigned a histological subtype by the originators, but whose genomic features place them among the HGSOC cell lines. Taken together, these issues raise the question of potential composite use of classical histopathology and genomic profiling for subtype identification, not only of parent tumours but also of the derived cell lines. Especially when the histopathological diagnosis is ambiguous, it may be advisable to complement visual microscopic classification by quantitative evaluation of genomic attributes, which should soon be available to pathology at reasonable cost.

Cell line models for the distinct cancer subtypes that are clearly annotated and whose identity has been confirmed by a combination of targeted sequencing and copy-number profiling or single nucleotide polymorphism-fingerprinting can be particularly valuable in the clinic, especially in the age of personalized medicine. On one hand, preclinical results, for example, measurement of drug response profiles, obtained in well-characterized cell line models with known alterations may be a very useful guide to patient selection in clinical trials at a level of subdivision that would lead to higher response rates. On the other hand, in light of the advances in molecular profiling, one can envision the reverse scenario: for a given patient, determine the molecular profile of the tumour, select the most similar cell line model by means of a more refined suitability score, use this cell line to perform preclinical drug screens and as a result make a more informed choice of therapy for the patient. A more practical and straightforward form of cell line selection as in vitro models of patient tumours could already be implemented today. The realization that the distinct subtypes of ovarian cancer may be diverse diseases has prompted calls for distinguishing between these subtypes in clinical trials. Taking another step back, it is reasonable to use cell lines of the same subtype as the intended patient cohort in preclinical studies. There are examples of failed clinical trials conducted in HGSOC patients after preclinical studies in cell lines of endometrioid origin, among them IGROV1 (ref. 31). It is not guaranteed that using cell line models of the same subtype would have influenced the preclinical results. However, using cell lines with genomic background similar to patient samples at least increases the likelihood that conclusions reached in an in vitro setting will be transferable to the clinic.

Although several of the cell lines analysed here are genomically similar to HGSOC, deriving new cell lines from untreated primary ovarian tumours will probably help to further bridge the gap between cell line models and clinical tumours. The cell lines profiled by CCLE have been in passage for several years, if not decades, and some patients were treated with severe chemotherapy before the biopsy, both factors that are known to affect genomic profiles (refs 51, 52).

In summary, in this study we distinguish the good, the bad and the ugly among cell line models of HGSOC. We recommend a set of good cell lines that closely resemble tumour samples (Fig. 3). In contrast, we point out several bad cell line models of this subtype that have flat copy-number profiles, wild-type TP53 and uncharacteristic mutations. This group includes the two most frequently used ovarian cancer cell lines, SK-OV-3 and A2780. Ugly cell line models make up a third group: these cell lines resemble HGSOC at first sight, as they have TP53 mutations or a substantial degree of CNA, but closer inspection reveals striking differences. For some of these ugly cell lines, expression profiles imply they are derived from metastases from distant tissues. Others, such as IGROV1, are hypermutated and plausibly stem from a different ovarian cancer subtype.

This pilot study on HGSOC describes a methodology for selecting suitable cell lines as tumour models. Although the choice of the optimal cell line is highly context specific, our conceptual approach for identifying suitable cell line models is widely applicable. Hand in hand with the increasing availability of genomic data from studies such as the CCLE and the Sanger Cancer Cell Line project or TCGA and the International Cancer Genome Consortium, this method can be further refined and applied to a wide range of tumour types. In this way, it can help to optimize the choice of cell lines as tumour models for a broad variety of tumour types, and thus increase the value of preclinical studies.

Methods

Data acquisition. DNA copy-number, mutation and mRNA expression data were analysed for all 316 HGSOC tumour samples profiled by TCGA (ref. 5) and 47 ovarian cancer cell lines from the CCLE (ref. 6). For the remaining CCLE ovarian cancer cell lines, COLO684, TOV112D, OC314 and OC315, not all three data types were available from the CCLE, so they were excluded from the analysis. In our comparison, we consider all data types that are available for both studies: genome-wide DNA copy-number information, mutation data for 1,651 genes and mRNA expression profiles. Only recently, short-tandem repeat profiling revealed substantial redundancy and contamination in a different ovarian cancer cell line panel (ref. 53). For the CCLE ovarian cancer cell line panel, however, identity was confirmed via single nucleotide polymorphism-fingerprinting (ref. 6).

Copy-number data processing. Segmented copy-number data were obtained from the CCLE website (platform: Affymetrix SNP6) (ref. 6) and, for the TCGA data (platform: Agilent 1M array) (ref. 5), from the cBio Cancer Genomics Portal (http://www.cbioportal.org/) (ref. 54), and were used for the analysis of CNAs. Fraction genome altered (FGA) was calculated as follows:

$$\mathrm{FGA} = \frac{\sum_{i:\, CN_i > T} L_i}{\sum_i L_i} \tag{1}$$

For each segment i, CN_i is given by CN = log2(sample intensity/reference intensity), L_i is the length of segment i and T is the threshold value of CN_i above which the segments are considered altered. In other words, FGA is the ratio of the sum of the lengths of all segments with signal above the threshold to the sum of all segment lengths. A threshold T of 0.2 was used for TCGA tumour samples and 0.3 for the CCLE cell lines. Different thresholds were chosen for the tumours and cell lines as the copy-number signal for tumours is often weakened due to contamination with non-tumour material or by tumour heterogeneity, whereas cell lines are purer. Similar reasoning was used when choosing a CN value >1.0 to define high-level amplifications in CCLE cell lines.
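The FGA definition above can be sketched in a few lines. This is only an illustration in Python (the authors report working in Perl and R); the segment layout `(start, end, log2_ratio)`, the function name and the toy values are assumptions. The text describes segments "above the threshold"; whether the original formula takes the absolute value of CN_i (so that deletions also count) cannot be recovered from this text, so the sketch exposes that choice as a hypothetical `use_abs` flag.

```python
def fraction_genome_altered(segments, threshold, use_abs=False):
    """Length-weighted fraction of segments whose log2 ratio exceeds `threshold`.

    segments : iterable of (start, end, log2_ratio) tuples (hypothetical layout)
    use_abs  : if True, compare |log2_ratio| to the threshold (an assumption,
               not something the text states)
    """
    total = 0
    altered = 0
    for start, end, log2_ratio in segments:
        length = end - start
        total += length
        value = abs(log2_ratio) if use_abs else log2_ratio
        if value > threshold:
            altered += length
    if total <= 0:
        raise ValueError("segments must cover a positive total length")
    return altered / total


# Toy segments, not real data.
toy = [(0, 100, 0.05), (100, 150, 0.45), (150, 200, -0.60), (200, 400, 0.25)]
print(fraction_genome_altered(toy, threshold=0.2))  # tumour threshold -> 0.625
print(fraction_genome_altered(toy, threshold=0.3))  # cell line threshold -> 0.125
```

The same toy profile yields a lower FGA under the stricter cell line threshold, which is the intended effect of using different cut-offs for purer samples.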

To enable a gene-by-gene comparison of copy-number profiles from TCGA tumour samples and CCLE cell lines, the Bioconductor package CNTools was used to map the segmented copy-number data of all CCLE and TCGA samples to genes (ref. 55). The mean copy-number profile of the TCGA samples was obtained by computing the mean signal of each gene across all tumour samples. Correlations of copy-number profiles were calculated using Pearson's correlation coefficients.

In detail, the similarities and differences between cell lines and tumours on the copy-number level were quantified in three different ways (see Supplementary Data S1). For each cell line, the CNA profile was compared with that of single tumours and the entire group of tumour samples. To determine similarity to single tumour samples, the correlation of the copy-number profile of each cell line with the copy-number profile of each of the 316 HGSOC tumour samples was calculated over all genes. This measure is of particular interest when seeking to identify suitable cell line models for specific subgroups of patients. On the other hand, it can be desirable to find cell lines whose copy-number profiles are most similar to those of the majority of tumour samples, disregarding any diversity within the tissue samples. To determine the similarity of the CNA of each cell line with that of the entire group of tumours, we calculated the median of all the correlation values for the 316 tumour samples. In addition, we determined the mean CNA profile of the tumour samples, that is, the copy-number change for each gene averaged over all samples. In this measure, amplifications or deletions consistently present in many samples are taken into account, whereas conflicting or noisy copy-number values are averaged out. For each cell line, the correlation of its copy-number profile with this mean CNA profile was calculated over all genes. Although these three comparisons of copy-number profiles are related, which one is most informative depends on the question at hand. Although the correlation with the mean CNA profile of tumours resembles the median of the correlations with the CNA of all single tumours for all the ovarian cancer cell lines, the correlation value with the CNA of the nearest single tumour does not necessarily follow a similar trend.
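The three copy-number comparisons described above can be illustrated as follows. This is a minimal sketch, assuming gene-level profiles are already available as equal-length lists in the same gene order; the function and key names are hypothetical, and plain Python stands in for the R tooling the authors used.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def cna_similarity(cell_line, tumours):
    """Compare one cell line profile with a list of tumour profiles.

    Returns the per-tumour correlations, the best single-tumour match,
    the median over all tumours, and the correlation with the mean profile.
    """
    per_tumour = [pearson(cell_line, t) for t in tumours]
    # Mean tumour profile: average each gene across all tumour samples.
    mean_profile = [mean(gene_values) for gene_values in zip(*tumours)]
    return {
        "per_tumour": per_tumour,
        "nearest_tumour": max(per_tumour),
        "median_over_tumours": median(per_tumour),
        "vs_mean_profile": pearson(cell_line, mean_profile),
    }
```

Keeping the per-tumour list makes it possible to look up which tumour is the closest match, which is the quantity of interest when modelling a specific patient subgroup.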

Calculation of mutation frequencies. Mutation frequencies were calculated as the ratio of mutation counts to number of bases covered. To focus on the mutations most likely to be functional, mutations in introns, untranslated regions, flanking and intergenic regions, as well as silent and RNA mutations, were excluded. The CCLE provided the number of reads per base in the sequenced regions (in wig format), so the number of bases covered was given by the number of positions with one or more reads. TCGA, on the other hand, provided exon-wise coverage information, namely the length of each exon and an associated coverage per exon between 0 and 1. So the effective number of bases covered for each exon was given by the product of the length and coverage of the exon. The sum of these values is the total number of bases covered for each TCGA HGSOC sample.
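The two coverage conventions described above lead to the same ratio once the denominator is computed. The following sketch shows that arithmetic under stated assumptions: the input layouts (a list of per-base read depths, a list of `(exon_length, coverage_fraction)` pairs) and the mutation class labels are hypothetical and do not reproduce the CCLE or TCGA file formats.

```python
# Hypothetical labels for the excluded mutation classes named in the text.
EXCLUDED_CLASSES = {"intron", "utr", "flank", "intergenic", "silent", "rna"}


def count_candidate_functional(mutation_classes):
    """Count mutations whose class is not in the excluded set."""
    return sum(1 for c in mutation_classes if c.lower() not in EXCLUDED_CLASSES)


def covered_bases_from_depths(depths):
    """CCLE-style coverage: number of positions with one or more reads."""
    return sum(1 for d in depths if d >= 1)


def covered_bases_from_exons(exons):
    """TCGA-style coverage: sum of exon length times covered fraction (0 to 1)."""
    return sum(length * fraction for length, fraction in exons)


def mutation_frequency(n_mutations, bases_covered):
    """Mutations per covered base."""
    if bases_covered <= 0:
        raise ValueError("bases_covered must be positive")
    return n_mutations / bases_covered
```

For instance, two exons of 100 and 200 bases with coverage 1.0 and 0.5 count as 200 effective bases, so four retained mutations correspond to a frequency of 0.02 per base.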

Computing the cell line suitability score. The extent to which the ovarian cancer cell lines match genetic characteristics shared by the majority of TCGA high-grade serous ovarian tumours was assessed using an empirical numerical score. This suitability score S, in which selected features of HGSOC are positively weighted and characteristics of other ovarian cancer subtypes are negatively weighted, is given by

$$S = A + B - 2C - \frac{D}{7} \tag{2}$$

where A is the correlation with the mean CNA of HGSOC tumours, B is 1 for cell lines harbouring a TP53 mutation and 0 otherwise, C is 1 for hypermutated cell lines and 0 otherwise, and D is the number of genes mutated among the seven non-HGSOC genes recurrently altered only in the other ovarian cancer subtypes (ARID1A, BRAF, CTNNB1, ERBB2, KRAS, PIK3CA and PTEN). This score serves to distinguish better and poorer cell line models of HGSOC, but is not considered a finely graduated ranking (Supplementary Data S1).
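As a worked illustration, the score can be computed directly from the four terms defined above. The sketch reads the formula as S = A + B − 2C − D/7, with the signs taken from the statement that HGSOC features are weighted positively and features of other subtypes negatively; the function name and argument layout are assumptions, and the example values in the usage note are invented rather than taken from any real cell line.

```python
NON_HGSOC_GENES = ("ARID1A", "BRAF", "CTNNB1", "ERBB2", "KRAS", "PIK3CA", "PTEN")


def suitability_score(cna_correlation, mutated_genes, hypermutated):
    """Empirical HGSOC suitability score S = A + B - 2C - D/7.

    cna_correlation : A, correlation with the mean tumour CNA profile
    mutated_genes   : set of gene symbols mutated in the cell line
    hypermutated    : True if the cell line is hypermutated (C = 1)
    """
    mutated = set(mutated_genes)
    b = 1 if "TP53" in mutated else 0
    c = 1 if hypermutated else 0
    d = sum(1 for gene in NON_HGSOC_GENES if gene in mutated)
    return cna_correlation + b - 2 * c - d / len(NON_HGSOC_GENES)
```

A hypothetical line with A = 0.6 and a TP53 mutation scores 1.6, whereas a hypermutated line with A = 0.1, wild-type TP53 and three of the seven non-HGSOC genes mutated scores 0.1 − 2 − 3/7, roughly −2.33.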

Expression analysis and clustering. Robust z-scores (median-centred expression values divided by the median absolute deviation) were used for expression-based clustering of all CCLE cell lines. The top 5,000 genes by interquartile range (difference between the 25th and 75th percentile) across all cell lines were chosen, and 1 − c (where c is Pearson's correlation coefficient) was used as the distance for hierarchical clustering using Ward's agglomeration method (ref. 56).
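The preprocessing steps named above (robust z-scores, gene selection by interquartile range, and the 1 − c distance) can be sketched with the Python standard library. Several details are assumptions: the median absolute deviation is used unscaled, quartiles use the "inclusive" interpolation method, and expression data are held in a hypothetical dict mapping gene symbol to a list of values across cell lines. The Ward clustering step itself is not reproduced here; the resulting distances would be passed to a clustering library.

```python
from math import sqrt
from statistics import mean, median, quantiles


def robust_z(values):
    """Median-centred values divided by the (unscaled) median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return [(v - med) / mad for v in values]


def iqr(values):
    """Difference between the 75th and 25th percentiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def top_genes_by_iqr(expression, k):
    """Return the k gene symbols with the largest interquartile range."""
    return sorted(expression, key=lambda g: iqr(expression[g]), reverse=True)[:k]


def correlation_distance(x, y):
    """Distance 1 - c, where c is Pearson's correlation of two sample profiles."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1 - sum(a * b for a, b in zip(dx, dy)) / denom
```

The distance ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated ones, so samples with similar expression patterns are merged first regardless of absolute signal level.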

For expression-based comparison of CCLE ovarian cancer cell lines and TCGA HGSOC tumour samples, z-scores were derived separately for the two data sets before a combined analysis was performed using the 10,383 genes available on both platforms. We used data from the Affymetrix U133A platform for TCGA, although this meant missing data for one of the 316 tumour samples, as the CCLE expression data were obtained using Affymetrix U133 Plus 2.0 Arrays. The top 5,000 genes by interquartile range across the combined data set were chosen for principal component analysis as well as hierarchical clustering using 1 − c as the distance, and complete linkage for agglomeration.
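The idea of standardizing each data set on its own before merging can be shown with a small sketch. It assumes ordinary z-scores (mean and sample standard deviation per gene), which the text does not specify, and a hypothetical dict layout mapping gene symbol to a list of values per sample; genes absent from either data set are dropped, mirroring the restriction to genes available on both platforms.

```python
from statistics import mean, stdev


def zscore(values):
    """Standard z-scores of one gene within a single data set."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("zero variance; z-score is undefined")
    return [(v - m) / s for v in values]


def combine_on_shared_genes(dataset_a, dataset_b):
    """Z-score each gene within its own data set, then concatenate samples.

    Only genes present in both data sets are kept.
    """
    shared = sorted(set(dataset_a) & set(dataset_b))
    return {g: zscore(dataset_a[g]) + zscore(dataset_b[g]) for g in shared}
```

Scaling within each data set removes platform-wide offsets, so a gene measured on different arrays contributes comparable values to the combined matrix.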

Software tools. Data processing, analysis and visualization were done in the Perl and R programming environments, and statistical calculations were done using the R language (ref. 57). The copy-number profiles of TCGA samples and CCLE cell lines were visualized using the Integrative Genomics Viewer (version 1.4.2) (ref. 58) and OncoPrints were generated using the cBio Cancer Genomics Portal (http://www.cbioportal.org) (ref. 54). The R package sparcl was used to draw the coloured dendrogram (refs 55, 59).

PubMed citation analysis. The number of PubMed abstracts mentioning one of the 47 CCLE ovarian cancer cell lines was determined using the PubMed search builder (http://www.pubmed.org) on 4 June 2012 using several punctuation alternatives for the cell line names. This search method can lead to false-negative results; for example, it did not yield any hits for some cell lines such as COV318, although a few publications exist that do not refer to the cell line in the abstract.
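One way to generate punctuation alternatives for a cell line name is sketched below. The exact variants and query syntax the authors used are not given, so everything here is an assumption: the name is split into runs of letters and digits, the pieces are rejoined with no separator, a hyphen or a space, and the variants are combined into an OR query restricted with the `[Title/Abstract]` field tag.

```python
import re
from itertools import product


def name_variants(name, separators=("", "-", " ")):
    """Spelling variants of a cell line name, e.g. OVCAR-4 -> OVCAR4, OVCAR-4, OVCAR 4."""
    tokens = re.findall(r"[A-Za-z]+|\d+", name)
    if not tokens:
        return [name]
    variants = []
    for seps in product(separators, repeat=len(tokens) - 1):
        variant = tokens[0] + "".join(s + t for s, t in zip(seps, tokens[1:]))
        if variant not in variants:
            variants.append(variant)
    return variants


def pubmed_query(name):
    """Combine all variants into a single OR query on title and abstract."""
    return " OR ".join(f'"{v}"[Title/Abstract]' for v in name_variants(name))
```

A name with three parts such as SK-OV-3 expands to nine variants, which shows why a single literal spelling would miss part of the literature.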

References

1. Ertel, A., Verghese, A., Byers, S. W., Ochs, M. & Tozeren, A. Pathway-specific differences between tumor cell lines and normal and tumor tissue cells. Mol. Cancer 5, 55 (2006).

2. Stein, W. D., Litman, T., Fojo, T. & Bates, S. E. A Serial Analysis of Gene Expression (SAGE) database analysis of chemosensitivity: comparing solid tumors with cell lines and comparing solid tumors from different tissue origins. Cancer Res. 64, 2805–2816 (2004).

3. Gillet, J. P. et al. Redefining the relevance of established cancer cell lines to the study of mechanisms of clinical anti-cancer drug resistance. Proc. Natl Acad. Sci. USA 108, 18708–18713 (2011).

4. Sandberg, R. & Ernberg, I. Assessment of tumor characteristic gene expression in cell lines using a tissue similarity index (TSI). Proc. Natl Acad. Sci. USA 102, 2052–2057 (2005).

5. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011).

6. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).

7. IARC Press. World Cancer Report (eds Boyle, P. & Levin, B.) (IARC Press, Lyon, France, 2008).

8. Jemal, A., Siegel, R., Xu, J. & Ward, E. Cancer statistics, 2010. CA Cancer J. Clin. 60, 277–300 (2010).

9. Seidman, J. D. et al. The histologic type and stage distribution of ovarian carcinomas of surface epithelial origin. Int. J. Gynecol. Pathol. 23, 41–44 (2004).

10. Gershenson, D. M. et al. Clinical behavior of stage II-IV low-grade serous carcinoma of the ovary. Obstet. Gynecol. 108, 361–368 (2006).

11. Bowtell, D. D. The genesis and evolution of high-grade serous ovarian cancer. Nat. Rev. Cancer 10, 803–808 (2010).

12. Vaughan, S. et al. Rethinking ovarian cancer: recommendations for improving outcomes. Nat. Rev. Cancer 11, 719–725 (2011).

13. Berns, E. M. & Bowtell, D. D. The changing view of high-grade serous ovarian cancer. Cancer Res. 72, 2701–2704 (2012).

14. Bast, R. C. Jr & Mills, G. B. Dissecting PI3Kness: the complexity of personalized therapy for ovarian cancer. Cancer Discov. 2, 16–18 (2012).

15. Willner, J. et al. Alternate molecular genetic pathways in ovarian carcinomas of common histological types. Hum. Pathol. 38, 607–613 (2007).

16. Cho, K. R. & Shih, I. M. Ovarian cancer. Annu. Rev. Pathol. 4, 287–313 (2009).

17. Singer, G. et al. Mutations in BRAF and KRAS characterize the development of low-grade ovarian serous carcinoma. J. Natl Cancer Inst. 95, 484–486 (2003).

18. Sieben, N. L. et al. In ovarian neoplasms, BRAF, but not KRAS, mutations are restricted to low-grade serous tumours. J. Pathol. 202, 336–340 (2004).

19. Vang, R., Shih, I. M. & Kurman, R. J. Ovarian low-grade and high-grade serous carcinoma: pathogenesis, clinicopathologic and molecular biologic features, and diagnostic problems. Adv. Anat. Pathol. 16, 267–282 (2009).

20. Wiegand, K. C. et al. ARID1A mutations in endometriosis-associated ovarian carcinomas. N. Engl. J. Med. 363, 1532–1543 (2010).

21. Wright, K. et al. beta-catenin mutation and expression analysis in ovarian cancer: exon 3 mutations and nuclear translocation in 16% of endometrioid tumours. Int. J. Cancer 82, 625–629 (1999).

22. Obata, K. et al. Frequent PTEN/MMAC mutations in endometrioid but not serous or mucinous epithelial ovarian tumors. Cancer Res. 58, 2095–2097 (1998).

23. Campbell, I. G. et al. Mutation of the PIK3CA gene in ovarian and breast cancer. Cancer Res. 64, 7678–7681 (2004).

24. Ahmed, A. A. et al. Driver mutations in TP53 are ubiquitous in high grade serous carcinoma of the ovary. J. Pathol. 221, 49–56 (2010).

25. Motoyama, T. Biological characterization including sensitivity to mitomycin C of cultured human ovarian cancers (author's transl). Nihon Sanka Fujinka Gakkai Zasshi 33, 1197–1204 (1981).

26. Motoyama, T. Quantitative analysis on in vitro drug sensitivity of cultured human ovarian cancer cell lines (author's transl). Nihon Sanka Fujinka Gakkai Zasshi 34, 308–314 (1982).

27. van den Berg-Bakker, C. A. et al. Establishment and characterization of 7 ovarian carcinoma cell lines and one granulosa tumor cell line: growth features and cytogenetics. Int. J. Cancer 53, 613–620 (1993).

28. Gilks, C. B. et al. Tumor cell type can be reproducibly diagnosed and is of independent prognostic significance in patients with maximally debulked ovarian carcinoma. Hum. Pathol. 39, 1239–1251 (2008).

29. Madore, J. et al. Characterization of the molecular differences between ovarian endometrioid carcinoma and ovarian serous carcinoma. J. Pathol. 220, 392–400 (2010).

30. Rigakos, G. & Razis, E. BRCAness: finding the Achilles heel in ovarian cancer. Oncologist 17, 956–962 (2012).

31. Coward, J. et al. Interleukin-6 as a therapeutic target in human ovarian cancer. Clin. Cancer Res. 17, 6083–6096 (2011).

32. Taylor, S. A. et al. Combining the farnesyltransferase inhibitor lonafarnib with paclitaxel results in enhanced growth inhibitory effects on human ovarian cancer models in vitro and in vivo. Gynecol. Oncol. 109, 97–106 (2008).

33. Szotek, P. P. et al. Ovarian cancer side population defines cells with stem cell-like characteristics and Mullerian Inhibiting Substance responsiveness. Proc. Natl Acad. Sci. USA 103, 11154–11159 (2006).

34. Kulbe, H. et al. A dynamic inflammatory cytokine network in the human ovarian cancer microenvironment. Cancer Res. 72, 66–75 (2012).

35. Leinster, D. A. et al. The peritoneal tumour microenvironment of high-grade serous ovarian cancer. J. Pathol. 227, 136–145 (2012).

36. Galmozzi, E. et al. Exon 3 of the alpha folate receptor gene contains a 5′ splice site which confers enhanced ovarian carcinoma specific expression. FEBS Lett. 502, 31–34 (2001).


37. De Cecco, L. et al. Gene expression profiling of advanced ovarian cancer: characterization of a molecular signature involving fibroblast growth factor 2. Oncogene 23, 8171–8183 (2004).

38. Iorio, E. et al. Activation of phosphatidylcholine cycle enzymes in human epithelial ovarian cancer cells. Cancer Res. 70, 2126–2135 (2010).

39. Mangiarotti, F. et al. Functional effect of point mutations in the alpha-folate receptor gene of CABA I ovarian carcinoma cells. J. Cell Biochem. 81, 488–498 (2001).

40. Mezzanzanica, D. et al. CD95-mediated apoptosis is impaired at receptor level by cellular FLICE-inhibitory protein (long form) in wild-type p53 human ovarian carcinoma. Clin. Cancer Res. 10, 5202–5214 (2004).

41. Aldovini, D. et al. M-CAM expression as marker of poor prognosis in epithelial ovarian cancer. Int. J. Cancer 119, 1920–1926 (2006).

42. Gloss, B. S. et al. Integrative genome-wide expression and promoter DNA methylation profiling identifies a potential novel panel of ovarian cancer epigenetic biomarkers. Cancer Lett. 318, 76–85 (2012).

43. Macor, P. et al. Complement activated by chimeric anti-folate receptor antibodies is an efficient effector system to control ovarian carcinoma. Cancer Res. 66, 3876–3883 (2006).

44. Liu, J. et al. Microsatellite instability and expression of hMLH1 and hMSH2 proteins in ovarian endometrioid cancer. Mod. Pathol. 17, 75–80 (2004).

45. Oda, K. et al. PIK3CA cooperates with other phosphatidylinositol 3-kinase pathway mutations to effect oncogenic transformation. Cancer Res. 68, 8127–8136 (2008).

46. Han, S. Y. et al. Functional evaluation of PTEN missense mutations using in vitro phosphoinositide phosphatase assay. Cancer Res. 60, 3147–3151 (2000).

47. Oda, K., Stokoe, D., Taketani, Y. & McCormick, F. High frequency of coexistent mutations of PIK3CA and PTEN genes in endometrial carcinoma. Cancer Res. 65, 10669–10673 (2005).

48. Lukk, M. et al. A global map of human gene expression. Nat. Biotechnol. 28, 322–324 (2010).

49. Wang, H. et al. Comparative analysis and integrative classification of NCI60 cell lines and primary tumors using gene expression profiling data. BMC Genomics 7, 166 (2006).

50. Benard, J. et al. Characterization of a human ovarian adenocarcinoma line, IGROV1, in tissue culture and in nude mice. Cancer Res. 45, 4970–4979 (1985).

51. Wenger, S. L. et al. Comparison of established cell lines at different passages by karyotype and comparative genomic hybridization. Biosci. Rep. 24, 631–639 (2004).

52. Cooke, S. L. et al. Genomic analysis of genetic heterogeneity and evolution in high-grade serous ovarian carcinoma. Oncogene 29, 4905–4913 (2010).

53. Korch, C. et al. DNA profiling analysis of endometrial and ovarian cell lines reveals misidentification, redundancy and contamination. Gynecol. Oncol. 127, 241–248 (2012).

54. Cerami, E. et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2, 401–404 (2012).

55. Gentleman, R. C. et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5, R80 (2004).

56. Ward, J. H. Jr Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 48, 236–244 (1963).

57. R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2010).

58. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

59. Witten, D. M. & Tibshirani, R. sparcl: perform sparse hierarchical clustering and sparse k-means clustering. R package version 1.0.1 (2010). http://www.cran.r-project.org/web/packages/sparcl/index.html

Acknowledgements

We thank N. Stransky and J. Barretina for information about CCLE and providing robust z-scores for the expression data of 966 cell lines, Gordon B. Mills for comments on the manuscript and the Bioinformatics Core for maintaining computer systems. Funding for S.D., R.S., C.S. and N.S. was provided by the US National Cancer Institute as part of the TCGA Genome Data Analysis Center grant (NCI-U24CA143840) and by a Stand Up To Cancer Dream Team Translational Research Grant, a programme of the Entertainment Industry Foundation (SU2C-AACR-DT0209). D.A.L. received funding from the DoD Award W81XWH-10-1-0222 and The Chandler Cox Foundation. S.D. was supported by a study-abroad grant of the Dr Karl Wamsler Stiftung.

Author contributions

S.D., R.S., D.A.L., C.S. and N.S. conceived the project. C.S. and N.S. supervised the project. S.D., R.S. and N.S. analysed and interpreted the data. S.D., R.S., C.S. and N.S. wrote the manuscript. All authors discussed the results, and reviewed and commented on the manuscript.

Additional information

Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications

Competing financial interests: The authors declare no competing financial interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

How to cite this article: Domcke, S. et al. Evaluating cell lines as tumour models by comparison of genomic profiles. Nat. Commun. 4:2126 doi: 10.1038/ncomms3126 (2013).

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. To view a copy of this licence visit http://creativecommons.org/licenses/by/3.0/.



Details

Title: Evaluating cell lines as tumour models by comparison of genomic profiles
Author: Domcke, Silvia; Sinha, Rileen; Levine, Douglas A; Sander, Chris; Schultz, Nikolaus
Pages: 2126
Publication year: 2013
Publication date: Jul 2013
Publisher: Nature Publishing Group
e-ISSN: 2041-1723
Source type: Scholarly Journal
Language of publication: English
DOI: https://doi.org/10.1038/ncomms3126
ProQuest document ID: 1399092000
