ARTICLE
Received 1 Sep 2015 | Accepted 19 Nov 2015 | Published 7 Jan 2016
DOI: 10.1038/ncomms10256 OPEN
Label-free cell cycle analysis for high-throughput imaging flow cytometry
Thomas Blasi1,2,3, Holger Hennig1, Huw D. Summers4, Fabian J. Theis2,3, Joana Cerveira5, James O. Patterson6, Derek Davies5, Andrew Filby7, Anne E. Carpenter1 & Paul Rees1,4
Imaging flow cytometry combines the high-throughput capabilities of conventional flow cytometry with single-cell imaging. Here we demonstrate label-free prediction of DNA content and quantification of the mitotic cell cycle phases by applying supervised machine learning to morphological features extracted from brightfield and the typically ignored darkfield images of cells from an imaging flow cytometer. This method facilitates non-destructive monitoring of cells, avoiding potentially confounding effects of fluorescent stains while maximizing available fluorescence channels. The method is effective in cell cycle analysis for mammalian cells, both fixed and live, and accurately assesses the impact of a cell cycle mitotic phase blocking agent. As the same method is effective in predicting the DNA content of fission yeast, it is likely to have a broad application to other cell types.
1 Imaging Platform at the Broad Institute of Harvard and MIT, 415 Main St, Cambridge, Massachusetts 02142, USA. 2 Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany. 3 Department of Mathematics, Technische Universität München, Boltzmannstraße 3, 85748 Garching, Germany. 4 College of Engineering, Swansea University, Singleton Park, Swansea SA2 8PP, UK. 5 Flow Cytometry Facility, The Francis Crick Institute, Lincoln's Inn Fields Laboratory, 44 Lincoln's Inn Fields, London WC2A 3LY, UK. 6 Cell Cycle Laboratory, The Francis Crick Institute, 44 Lincoln's Inn Fields, Holborn WC2A 3LY, UK. 7 Newcastle Upon Tyne University, Faculty of Medical Sciences, Bioscience Centre, International Centre for Life, Newcastle Upon Tyne NE1 7RU, UK. Correspondence and requests for materials should be addressed to A.E.C. (email: [email protected]) or to P.R. (email: [email protected]).
NATURE COMMUNICATIONS | 7:10256 | DOI: 10.1038/ncomms10256 | http://www.nature.com/naturecommunications
A major challenge in many modern biological laboratories is obtaining information-rich measurements of cells in high-throughput and at single-cell resolution. Conventional flow cytometry is a widespread and powerful technique for the measurement of cell phenotype and function using targeted fluorescent stains1. It is highly suited to the study of cell populations and rare subset identification due to its high-throughput, multi-parameter nature. The fluorescent stains can be used to label cellular components or processes, revealing specific cell phenotypes in the population and quantifying the particular state of each cell2. For example, quantifying the proportion of cells in each phase of the cell cycle, including mitotic phases, is very useful in the modern biological laboratory3. It can be achieved with conventional flow cytometry using multiple stains: typically, a stoichiometric fluorescent stain for DNA reports the cell's position within the G1, S and G2 phases of the cell cycle2, and additional stains are needed to sort mitotic cells into phases. Often these stains are incompatible with live cell analysis (for example, antibodies against histone modifications3) and even if live cell reporters are available4 these may have confounding effects on the cells. For example, the commonly used Hoechst 33342 stain, which binds to the minor groove of double-stranded DNA, can induce single-strand DNA breaks5, and DRAQ5 (deep red fluorescing bisalkylaminoanthraquinone), a nuclear stain that intercalates with the cell's DNA, can influence chromatin organization and lead to histone dissociation6. Also, several different markers are usually required to unambiguously identify all cell cycle phases7. Therefore, an assay that reduces the number of stains required to identify phenotypes such as the position in the cell cycle, or even eliminates them, is particularly attractive.
In recent years, the two technologies of fluorescence microscopy and flow cytometry have been integrated to create imaging flow cytometry8, where an image is captured of each cell as it flows past an excitation source and a CCD detector. It combines conventional flow cytometry's high-throughput speed and easy identification of each individual cell with fluorescence microscopy's spatial image acquisition. Therefore, imaging flow cytometry measures not only fluorescence intensities but also the spatial image of the fluorescence together with brightfield and darkfield images of each cell in a population. The rich information captured using imaging flow cytometry makes it an ideal candidate for the use of high content approaches to identify complex cell phenotypes such as the cell cycle phase of an individual cell. We have previously demonstrated that measuring the shape of the nucleus from cells stained with a nuclear marker using imaging flow cytometry drastically improves the classification of mitotic phases9. However, the even richer morphological information that can be extracted using imaging software tools10 offers the prospect of using more advanced multivariate analysis techniques to mine the data and to identify various cell phenotypes, as has been successfully done for traditional microscopy images11–14. This type of analysis is also usually more accurate and less subjective than any manual analysis of the acquired images13 as well as more robust than typical gating strategies that rely on only a few features of the cells.
Here we report that quantitative image analysis of two largely overlooked channels, brightfield and darkfield, both readily collected by imaging flow cytometers, enables cell cycle-related assays without needing any fluorescence biomarkers. We use image analysis software9 to extract numerical measurements of cell morphology from the brightfield and darkfield images, and then we apply supervised machine-learning algorithms to identify cellular phenotypes of interest, in the present case, cell cycle phases. The designed workflow is open-source and freely available (visit http://www.cellprofiler.org/imagingflowcytometry) and accompanied by step-by-step tutorials and example data sets online. Avoiding fluorescent stains provides several benefits: it reduces effort and cost, avoids potentially confounding side effects of live cell markers and frees up the remaining available fluorescence channels of the imaging flow cytometer that can be used to investigate other biological questions.
Results
Label-free analysis workflow. The first step in the workflow of label-free cell cycle classification (Fig. 1) is to acquire brightfield and darkfield images from the cells (see Methods section). To allow visual inspection and to optimize the file size for processing, we tile individual cells' brightfield and darkfield images into 15 × 15 montages, with up to 225 cells per montage. Then, we load the montages into the open-source imaging software CellProfiler9 for processing (see Methods section). There is sufficient contrast between the cells and the flow media to robustly segment the cells in the brightfield images without the need for any stains. We extract 213 features from the segmented brightfield and the full darkfield image (Supplementary Table 1). The features can be summarized into five categories: size and shape, granularity, intensity, radial distribution and texture. These image features are the input for supervised machine learning, namely classification and regression (see Methods section), which we use to predict each cell's DNA content and the mitotic phases in the cell cycle without the need for any stains. The machine-learning algorithms have to be trained on an annotated subset of the investigated cells where the true cell state, that is, the ground truth, is known. The ground truth can be obtained either by manual identification (by a trained biologist or using software tools11) or from labelling a subset of the investigated cells with fluorescent stains (see Methods section).
[Figure 1 schematic. Panels: Imaging flow cytometry (light sources, brightfield detector, darkfield detector at 90°); Segmentation and feature extraction (preprocessing of raw images to uniform sizes by padding or deleting background pixels, tiling into montages, extraction of hundreds of morphological features such as area and shape, intensity, texture, granularity and radial distribution); Machine learning: classification (training, scoring, binning).]
Figure 1 | Label-free imaging flow cytometry workflow. First the brightfield and darkfield images of the cells are measured by an imaging flow cytometer. The brightfield and darkfield images depict the light transmitted through the cell and light scattered from the cells within a cone centered at a 90° angle, respectively. Then the images are preprocessed, where we reshape the images to have their sizes coincide and tile them to montages of 15 × 15 images. The montages are loaded into the open-source image software CellProfiler that we use to segment the cells' brightfield images and to extract morphological features from the images. Finally, we apply supervised machine learning such as classification. For this purpose we need an annotated set of cells where the actual cell state is known to train the classifier and to test its predictive power. Once the classifier is trained it is used to predict the state of unlabelled cells and to digitally sort the cells into bins.
Cell cycle analysis of fixed Jurkat cells. As an initial demonstration of our technique, we sought a label-free way to measure important cell cycle phenotypes including a continuous property (a cell's DNA content, from which G1, S and G2 phases can be estimated) and discrete phenotypes (the mitotic phase of a cell: prophase, anaphase, metaphase and telophase). We used the ImageStream platform to capture images of 32,255 asynchronously growing Jurkat cells (Supplementary Fig. 1). As controls, the cells were fixed and stained with PI (propidium iodide) to quantify DNA content and an MPM2 (mitotic protein monoclonal #2) antibody to identify mitotic cells (Supplementary Fig. 2). These fluorescent markers were used to annotate a subset of the cells with the ground truth (expected results) needed to train the machine-learning algorithms and to evaluate the predictive accuracy of our label-free approach (see Methods section). Since it is infeasible to accurately identify individual cells in the G1, S and G2 phase based only on one nuclear marker5, we do not aim to predict those phases individually but to predict each cell's DNA content. Subsequently, we use the Watson pragmatic curve fitting algorithm15 (see Methods section) to estimate the percentage of cells in each of the G1, S and G2/M phases based on the predicted DNA content.
Using only cell features measured from brightfield and darkfield images, we were able to devise a regression ensemble (using least squares boosting16) that accurately predicts each cell's DNA content, obtaining a Pearson's correlation of r = 0.896 ± 0.007 (error bars indicate the s.d. obtained via 10-fold (n = 10) cross-validation here and in all following statements of the Results section unless stated differently; see Methods section and Supplementary Note 1) between predicted and actual nuclear stain intensity (Fig. 2a). This highly accurate prediction of the DNA content can be used to further categorize G1, S and G2/M cells or to allocate each cell a time position within the cell cycle via the ergodic rate analysis, where cells are sorted according to their DNA content17. Moreover, we were able to classify mitotic phases (using random undersampling18 to compensate for the high class imbalance) with true positive rates of 55.4 ± 7.0% (for prophase), 50.2 ± 17.2% (for metaphase), 100% (for anaphase and telophase) and 93.1 ± 0.5% for the non-mitotic phases (Fig. 2b–g and Supplementary Table 2). We analysed which features have the most significant contributions for the prediction of both the nuclear stain and the mitotic phases by leave-one-out cross-validation (Supplementary Table 3). We find that leaving one feature out has only a minor effect on the results of the supervised machine-learning algorithms we used, likely because many features are highly correlated with others. The most important features are intensity, area, shape and radial distribution of the brightfield images.
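The regression and its cross-validated score can be illustrated with a minimal sketch. This is not the authors' code (they report using Matlab); it is a plain-Python stand-in that assumes decision stumps as weak learners for least-squares boosting, and it runs on toy data invented for the example. All function names and parameter values (`boost_fit`, `n_rounds`, `learning_rate`) are hypothetical.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_stump(X, residuals):
    """Find the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    return best[1:]  # (feature index, threshold, left mean, right mean)


def boost_fit(X, y, n_rounds=15, learning_rate=0.5):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = statistics.fmean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def boost_predict(model, X):
    """Sum the shrunken stump outputs on top of the baseline mean."""
    base, lr, stumps = model
    return [base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)
            for row in X]


def cross_validated_r(X, y, k=10, seed=0):
    """Mean and s.d. of Pearson's r over k held-out folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = boost_fit([X[i] for i in train], [y[i] for i in train])
        preds = boost_predict(model, [X[i] for i in fold])
        scores.append(pearson_r(preds, [y[i] for i in fold]))
    return statistics.fmean(scores), statistics.stdev(scores)


# Toy data (not from the study): feature 0 drives the target, feature 1 is noise.
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(80)]
y = [2.0 * row[0] + rng.gauss(0.0, 0.1) for row in X]
mean_r, sd_r = cross_validated_r(X, y, k=10)
print(f"r = {mean_r:.3f} ± {sd_r:.3f}")
```

The score is reported as mean ± s.d. over the folds, in the same form as the correlations quoted in the text; the exhaustive stump search is only practical for small toy inputs.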
Detection of mitotic phase block. The assessment of the therapeutic blocking of the cell cycle (in a particular phase) is of particular importance. We tested the method's ability to predict the DNA content of Jurkat cells treated with 50 mM Nocodazole, a mitotic blocking agent. To confirm the magnitude of the block of cells in mitosis, we performed three additional replicates demonstrating an average increase of cells in the G2/M phase of 19.0 ± 11.0% (error bars indicate the s.d. obtained from n = 3 replicates for each condition) compared with the control (Supplementary Fig. 3). The label-free prediction of the DNA content has a Pearson's correlation of r = 0.894 ± 0.032 with the true DNA content (PI is used as a fixed cell nuclear stain to provide the ground truth for the machine-learning algorithms) and the percentages of cells in the G1, S and G2/M phases are in excellent agreement (Fig. 3a). Therefore, the technique successfully detects the expected increase in the G2/M cells due to the blocking agent based on the predicted DNA content. Again, we were able to classify mitotic phases and found true positive rates of 65.5 ± 6.3% (for prophase), 100% (for the other mitotic phases) and 85.8 ± 1.4% for the non-mitotic phases (Fig. 3b–e and Supplementary Table 4). Treatment of the cells with the mitotic blocking agent led to an increase in the percentage of prophase cells from 1.88 to 11.07, which is confirmed by comparison with the ground truth (Supplementary Table 2 and Table 4) and in agreement with the identified magnitude of the block of cells in mitosis.
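The true positive rates quoted above are per-class fractions: of the cells that are actually in a given phase, the share that the classifier assigns to that phase. A minimal sketch of that calculation follows; the label lists are toy values invented for illustration, not data from the study, and the function name is hypothetical.

```python
from collections import Counter


def true_positive_rates(actual, predicted):
    """Return {class: fraction of cells of that actual class predicted correctly}."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels (hypothetical): 8 non-mitotic cells, 4 prophase cells, 2 other mitotic cells.
actual = ["G1/S/G2"] * 8 + ["Pro"] * 4 + ["Others"] * 2
predicted = ["G1/S/G2"] * 7 + ["Pro"] + ["Pro"] * 3 + ["G1/S/G2"] + ["Others"] * 2

rates = true_positive_rates(actual, predicted)
for label, rate in rates.items():
    print(f"{label}: {100 * rate:.1f}%")
```

Because each rate is normalized by the size of its own class, a rare class such as a mitotic phase is not swamped by the much larger non-mitotic population.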
Cell cycle analysis of live Jurkat cells. Many experimental protocols require live cells rather than fixed cells. We tested the ability of the technique to detect cell cycle changes in live Jurkat cells. To provide ground truth (that is, the expected cell cycle distribution), the cells were stained with DRAQ5, a live-cell DNA stain (Fig. 4a). Like most live-cell-compatible DNA stains, DRAQ5 is not an ideal marker because of the variability of uptake of the dye in live cells19; nonetheless, we obtain a Pearson's correlation of r = 0.786 ± 0.010 for predicting the DNA content of untreated cells. With a regression ensemble trained on the stained live cells, we are also able to predict the effect of treatment with a phase-blocking agent on an entirely unstained data set (Fig. 4b). We detect an increase of cells in the G2/M phase from 20.9 to 34.3% when the cells are treated with 50 mM Nocodazole; this is consistent with the average increase of 19.0 ± 11.0% obtained from repeating the phase block experiments with stained cells (Supplementary Fig. 3).
Cell cycle analysis of fission yeast. To explore the generality of our method for other cell types, we tested it on another species, fission yeast (Supplementary Fig. 4). The yeast cells were fixed and stained with PI to measure the DNA content of each cell (see Methods section); subsequently the cells were assigned to the G1, S, G2 or M phase by manually gating on image-based metrics from the PI channel of the ImageStream data20, which provided the ground truth (Supplementary Fig. 5). The label-free regression predicts the DNA content with a Pearson's correlation of r = 0.855 ± 0.006 (Fig. 5a), and the classification reaches accuracies of 70.2 ± 2.2% (G1), 90.1 ± 1.1% (S), 96.8 ± 0.3% (G2) and 44.0 ± 8.4% (M) (Fig. 5b–f and Supplementary Table 5).
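The class imbalance mentioned in the Results (few mitotic cells among many non-mitotic ones) is handled in the paper by random undersampling combined with boosting. The sketch below illustrates only the undersampling step, under the assumption that every class is reduced to the size of the rarest one before a learner is trained; the function name, seed and toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the size of the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        # rng.sample draws without replacement, so no cell is duplicated.
        balanced.extend((sample, label) for sample in rng.sample(members, n_min))
    rng.shuffle(balanced)
    return balanced


# Toy example (hypothetical): 90 cells labelled G2 and 10 labelled M.
labels = ["G2"] * 90 + ["M"] * 10
samples = list(range(100))  # stand-ins for per-cell feature vectors
balanced = random_undersample(samples, labels, seed=1)
print(Counter(label for _, label in balanced))
```

In a boosting setting a fresh balanced subset would typically be drawn for each round, so that different majority-class cells are seen across rounds; that loop is omitted here.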
[Figure 2 plots. (a) Scatter of actual versus predicted DNA content (r = 0.896) with marginal histograms of the frequency of cells; phase fractions printed on the panel: G1: 52.8%, S: 21.2%, G2/M: 26.0% and G1: 52.4%, S: 22.8%, G2/M: 24.9%; cell cycle schematic (G1, S, G2, M with Pro, Meta, Ana, Telo). (b–f) Bar plots of predicted mitotic phases (G1/S/G2, Pro, Meta, Ana, Telo; 0–100%) for actual phases G1/S/G2, Pro, Meta, Ana and Telo. (g) True positive rate.]
Figure 2 | Machine learning allows for robust label-free prediction of DNA content and cell cycle phases of Jurkat cells. (a) We find a Pearson's correlation of r = 0.896 ± 0.007 (error bars indicate the s.d. obtained via 10-fold cross-validation) between actual DNA content and predicted DNA content based on regression using brightfield and darkfield morphological features only (see Methods section). We used the Watson pragmatic curve fitting algorithm to specify the fraction of cells in the G1, S and G2 phases. (b–f) For cells that are actually in a particular phase (for example, b shows cells in G1/S/G2), the bar plots show the classification results based on brightfield and darkfield morphological features only (for example, b shows that the few cells in prophase (Pro), metaphase (Meta), anaphase (Ana), and telophase (Telo) are errors). (g) Bar plot of the true positive rates of the cell cycle classification.
[Figure 3 plots. (a) Scatter of actual versus predicted DNA content (r = 0.894) with marginal histograms of the frequency of cells; phase fractions printed on the panel: G1: 32.8%, S: 34.1%, G2/M: 33.2% and G1: 36.5%, S: 30.0%, G2/M: 33.5%; cell cycle schematic marking Pro (block). (b–d) Bar plots of predicted mitotic phases (G1/S/G2, Pro, Others; 0–100%) for actual phases G1/S/G2, Pro and others. (e) True positive rate.]
Figure 3 | Label-free prediction of DNA content and cell cycle phases for fixed Jurkat cells treated with a prophase blocking agent. (a) Based only on brightfield and darkfield features, we find a Pearson's correlation of r = 0.894 ± 0.032 (error bars indicate the s.d. obtained via 10-fold cross-validation) between actual DNA content and predicted DNA content using regression (see Methods section). We applied the Watson pragmatic algorithm to determine the G1, S and G2/M phases in the DNA histograms. (b–d) For cells that are actually in a particular phase (for example, b shows cells in G1/S/G2), the bar plots show the classification results (see Methods section) (for example, b shows that the few cells in prophase (Pro) and the other mitotic phases (others) are errors). Note that we grouped the cells in metaphase, anaphase and telophase into one class since we detected only very few cells in those phases after treatment with the prophase blocking agent. (e) Bar plot of the true positive rates of the cell cycle classification. Using boosting with random undersampling to compensate for class imbalances, we obtain true positive rates of 65.5 ± 6.3% (P), 85.8 ± 1.4% (G1/S/G2) and 100% (others).
[Figure 4 plots. (a) Scatter of actual versus predicted DNA content (r = 0.786) with marginal histograms of the frequency of cells; phase fractions printed on the panel: G1: 64.8%, S: 20.7%, G2/M: 14.6% and G1: 59.0%, S: 20.2%, G2/M: 20.9%. (b) Histogram of predicted DNA content with G1: 51.8%, S: 14.0%, G2/M: 34.3%; cell cycle schematic marking Pro (block).]
Figure 4 | Label-free prediction of DNA content for live Jurkat cells and detection of a phase blockage. (a) Supervised machine learning (trained using live cells stained with DRAQ5 to determine the DNA content) allows for robust label-free prediction of the DNA content of live cells based only on brightfield and darkfield images. We find a Pearson correlation of r = 0.786 ± 0.010 (error bars indicate the s.d. obtained via 10-fold cross-validation) between actual DNA content and predicted DNA content using regression (see Methods section). We believe this reduction in correlation from the value of 0.896 obtained for fixed cells to be a consequence of the greater variability of the uptake of the live DNA dye compared with the staining achieved with fixed cells. Despite the reduction in correlation, a value of 0.786 is still high enough to make this a viable method for the cell cycle analysis of live cells. As previously, we determine the fraction of cells in the G1, S and G2/M phases using the Watson pragmatic curve fitting algorithm. (b) We predict an increase of 13.4% in the G2/M phase after the cells were treated with 50 mM Nocodazole, which is in good agreement with the average increase of 19.0 ± 11.0% in G2/M as was found for three independent cell populations under the same treatment (Supplementary Fig. 3). The phase-blocked data set was not labelled with any marker. Instead, we trained our machine learning algorithm on the untreated data set, which was labelled with a DRAQ5 DNA stain (see a) and used the trained machine learning algorithm to predict the DNA stain of the blocked cells.
Discussion
We demonstrate here that it is possible to determine a cell population's DNA content and mitotic phases based entirely on features extracted from cells' brightfield and darkfield images, as obtained in high-throughput via imaging flow cytometry. The method requires an annotated data set to train the machine-learning algorithms, either by staining a subset of the investigated cells with markers, or by visual inspection and assignment of cell classes of interest. Once the machine-learning algorithm is trained for a particular cell type and phenotype, the consistency of imaging flow cytometry allows high-throughput scoring of unlabelled cells for discrete and well-defined phenotypes (for example, mitotic cell cycle phases) and continuous properties (for example, DNA content).
The same basic strategy can be readily adapted to measure other phenotypes, making this a generally useful approach for label-free, single-cell phenotyping in the modern biological laboratory. The method can also be used retrospectively on data sets that do not have the necessary stains for phenotype identification, provided an annotated data set is available to train the algorithms (see Methods section). While current imaging flow cytometers do not have physical cell-sorting capabilities, and for now our approach is suited to experimental contexts where samples are analysed only, this approach may offer the possibility to entirely avoid any fluorescent stain and opens up the perspective for a new generation of imaging flow cytometers that could operate without fluorescence channels.
The workflow we designed is open-source and freely available (http://www.cellprofiler.org/imagingflowcytometry and Supplementary Note 1). Label-free identification of phenotypes enables continuous, non-destructive monitoring of cell samples, minimizes potentially confounding influences of the stains on the cells and maximizes available fluorescence channels to investigate biological questions such as the search for novel hallmarks in cell cycle21, the identification of stem and progenitor cells22 or the proliferation of cancer cells23.
Methods
Code availability. All processing steps are described in a step-by-step tutorial hosted on an up-to-date website with guidance on carrying out the tutorial (visit http://www.cellprofiler.org/imagingflowcytometry; a static version of the tutorial can also be found in Supplementary Note 1). The code and the analysed data are freely available on the webpage. The code is also available as Supplementary Information (Supplementary Code 1–6). We used Matlab version 8.0.0.783 (R2012b) and CellProfiler version 2.1.1 for our analysis.
Cell culture and phase block. Ten million E6.1 Jurkat cells (Fred Hutchinson Cancer Research Center derived clone, Cell Services, CRUK) were cultured in RPMI media (Cat no 31870-082, Life Technologies, Inc., USA) containing 10% FBS, Penicillin/Streptomycin/Glutamine (Sigma-Aldrich G6784) at 1% and 2-Mercaptoethanol (50 mM) at 37 °C/5% CO2. For cells requiring a phase block, the cells were incubated with 50 mM Nocodazole for 20 h at 37 °C/5% CO2, counted and checked for viability using a Vi-Cell counter (Beckman Coulter, Inc., USA). Cells were washed once in PBS containing 2% FBS (wash buffer) and the cellular suspensions were divided in two.
[Figure 5 plots. (a) Scatter of actual versus predicted DNA content (r = 0.855) with marginal histograms of the frequency of cells and a cell cycle schematic (G1, S, G2, M). (b–e) Bar plots of predicted phases (G1, S, G2, M; 0–100%) for actual phases G1, S, G2 and M. (f) True positive rate.]
Figure 5 | Label-free prediction of DNA content and cell cycle phases for fission yeast cells. (a) Based only on brightfield and darkfield features, we find a Pearson's correlation of r = 0.855 ± 0.006 (error bars indicate the s.d. obtained via 10-fold cross-validation) between actual DNA content and predicted DNA content using regression (see Methods section). Note that the fission yeast cell cycle is different from the Jurkat cell cycle since the two daughter cells divide between the S and G2 phases (and not at the end of M phase as is the case for Jurkat cells). (b–e) For cells that are actually in a particular phase (for example, b shows cells in G1), the bar plots show the classification results (see Methods section) (for example, b shows that the cells in S, G2 and M are errors). (f) Bar plot of the true positive rates of the cell cycle classification. Using boosting with random undersampling to compensate for class imbalances, we obtain true positive rates of 70.2 ± 2.2% (G1), 90.1 ± 1.1% (S), 96.8 ± 0.3% (G2) and 44.0 ± 8.4% (M).
Live cells. Half of the cells were resuspended in 100 ml of wash buffer and DRAQ5 (Cat no DR50200, Biostatus) was added to a final concentration of 5 mM before running on the ImageStream X.
Fixed cells. The other half of the cells was fixed in 70% ethanol for at least 1 h. After fixation, the cells were washed once in wash buffer and treated with 0.1% Triton X-100 (Cat no X100-100 ML, Sigma, USA) for 10 min. Cells were spun down and incubated for 1 h with anti-phospho-Ser/Thr-Pro, MPM2 Cy5 conjugate MPM2 (1:100, Cat no 16-220, Millipore, USA) made up in PBS containing 0.2% Tween (Cat no 27,434-8, Sigma, USA) and 0.1% BSA (Cat no A4503-100G, Sigma, USA). Cells were washed once in wash buffer and stained with a 10 mg ml−1 PI (Cat no P4170, Sigma, USA) and 11 mg ml−1 Ribonuclease A (Cat no R5125, Sigma, USA) solution made up in PBS (100 ml). Cells were stained with PI for at least 30 min and run on the ImageStream X.
Fission yeast. Cell culture conditions and growth media were as previously described by Moreno et al.24 PN1 (wild-type haploid, strain 972 h− mating type, lab stock) cells were grown in YE4S media and maintained in exponential phase. Approximately 5 × 10⁶ cells were harvested for fixation in 70% ice-cold ethanol before storage at 4 °C. Cells were then washed and resuspended in 1 ml of 50 mM sodium citrate, treated with 0.1 mg ml−1 RNase A (Sigma-Aldrich, UK) at 37 °C overnight. Subsequently cells were stained with PI (2 mg ml−1) and FITC (2 mg ml−1) before sonication (approximately 20 s) using a sonication probe (JSP, Inc., USA). Cells were then resuspended in a volume of 500 ml, before running on the ImageStream X. Subsequent cell cycle stage assignment was performed as described by Patterson et al.20 In brief, this assignment is based on a combination of morphometric and intensity features extracted from PI images. Low PI intensity cells containing two nuclei are defined as G1. High PI intensity cells containing two nuclei are defined as S. Elongated, low intensity PI, uni-nucleate cells are defined as G2. Elongated, high intensity PI, single cells are defined as M phase (Supplementary Fig. 5).
Curve fitting for DNA histograms. We used the Watson pragmatic algorithm (Supplementary Code 6) to obtain probability distributions for the cells being in the G1, S and G2/M phases of the cell cycle.
Image acquisition by imaging flow cytometry. We used the ImageStream X platform to capture images of both live and fixed asynchronously growing Jurkat cells. For each cell, we captured images of brightfield and darkfield as well as fluorescent channels to measure the PI that quantifies DNA content and an MPM2 antibody to identify cells in mitosis. After image acquisition, we used the IDEAS analysis tool (the software that accompanies the ImageStream X) to discard multiple cells or debris, omitting them from further analysis, as described in the tutorial (Supplementary Note 1). The resulting data are provided on http://www.cellprofiler.org/imagingflowcytometry.
Typical ImageStream settings. Sample volume: 2.6 ml (extracted from the 100 ml loaded). Flow diameter: 7 mm. Velocity of flow: 44 m s−1. Resolution: 0.5 mm. Magnification: 60×. Camera sensitivity: 256 on all channels. Camera gain: 1. Brightfield LED intensity: 88 mW. Darkfield laser intensity: 1 mW. 488 nm laser intensity: 25 mW. 642 nm laser intensity: 150 mW.
Image processing. The image sizes from the ImageStream cytometer range between 30 × 30 and 60 × 60 pixels (data provided at http://www.cellprofiler.org/imagingflowcytometry). We reshape them to 55 × 55 pixel images, either by adding pixels with random values sampled from the image background (for images that are smaller) or by discarding pixels at the edge of the image (for images that are larger). We note that the discarded pixels are only from the image background and not from the segmented cell. This procedure therefore does not affect the analysis. To demonstrate this, we analysed whether discarding pixels from larger images has an effect on the results of our method (Supplementary Table 6) and found robust results over a broad parameter range. Only if we reshape the images to sizes that are smaller than the cell's diameter (that is, parts of the cells get cropped) does the quality of the method decrease. We then tile the images into 15 × 15 montages, with up to 225 cells per montage. Example montages are provided (at http://www.cellprofiler.org/imagingflowcytometry). A script to create the montages is provided (Supplementary Code 1) and its use is described in the tutorial (Supplementary Note 1) and on the webpage.
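The resizing and tiling steps can be illustrated with a minimal NumPy sketch. This is not Supplementary Code 1. The function names, the centred crop, the assumption that the outermost two pixels of each image contain only background, and the zero fill for unused montage tiles are all illustrative choices made here.

```python
import numpy as np

TARGET = 55  # side length of each resized cell image, in pixels
GRID = 15    # a montage holds GRID x GRID cell images

def resize_to_target(img, rng, border=2, target=TARGET):
    """Crop or pad a 2D image so that it becomes target x target pixels."""
    h, w = img.shape
    # Too large: discard pixels symmetrically from the image edges.
    if h > target:
        top = (h - target) // 2
        img = img[top:top + target, :]
    if w > target:
        left = (w - target) // 2
        img = img[:, left:left + target]
    h, w = img.shape
    if (h, w) == (target, target):
        return img.copy()
    # Too small: fill a canvas with values sampled from the image background.
    # Assumption: the outermost `border` pixels are background only.
    edge = np.ones((h, w), dtype=bool)
    edge[border:h - border, border:w - border] = False
    canvas = rng.choice(img[edge], size=(target, target))
    top, left = (target - h) // 2, (target - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas

def make_montages(images, grid=GRID, target=TARGET, fill=0):
    """Tile resized images row by row into grid x grid montages."""
    per_montage = grid * grid
    montages = []
    for start in range(0, len(images), per_montage):
        chunk = images[start:start + per_montage]
        montage = np.full((grid * target, grid * target), fill, dtype=chunk[0].dtype)
        for k, im in enumerate(chunk):
            r, c = divmod(k, grid)
            montage[r * target:(r + 1) * target, c * target:(c + 1) * target] = im
        montages.append(montage)
    return montages
```

With these defaults each montage is 825 × 825 pixels, and a final, partly filled montage keeps the fill value in its unused tiles.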
Segmentation and feature extraction. We load the image montages of 15 × 15 cells into the open-source image software CellProfiler (version 2.1.1). The darkfield image shows light scattered from the cells within a cone centred at a 90° angle and hence does not necessarily depict the cell's physical shape, nor does it align with the brightfield image. Therefore, we do not segment the darkfield image but instead use the full image for further analysis. In the brightfield image, there is sufficient contrast between the cells and the flow media to robustly segment the cells. We segment the cells in the brightfield image by enhancing the edges of the cells and thresholding on the pixel values. We then extract features, which we categorized into size and shape, granularity, intensity, radial distribution and texture. The CellProfiler pipeline to carry out all of these steps is provided (Supplementary Code 2). The measurements are exported in a text file, an example of which is provided (at http://www.cellprofiler.org/imagingflowcytometry). The measurements are post-processed using a script to discard cells with missing values (Supplementary Code 3). The use of these steps is described in the tutorial (Supplementary Note 1) and on the webpage.
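The idea of edge enhancement followed by thresholding can be sketched in a few lines of NumPy. This is not the CellProfiler pipeline of Supplementary Code 2. The gradient-magnitude edge measure, the threshold derived from an assumed background-only border, and the row-and-column fill of the outline are assumptions made purely for illustration.

```python
import numpy as np

def segment_cell(brightfield, n_sigma=3.0, border=3):
    """Return a boolean mask of the cell in a single brightfield image."""
    img = brightfield.astype(float)
    # Edge enhancement: magnitude of the local intensity gradient.
    gy, gx = np.gradient(img)
    edge_strength = np.hypot(gx, gy)
    # Threshold estimated from the image border (assumed to be background).
    frame = np.ones(img.shape, dtype=bool)
    frame[border:-border, border:-border] = False
    threshold = edge_strength[frame].mean() + n_sigma * edge_strength[frame].std()
    edges = edge_strength > threshold
    # Fill the outline: keep pixels that have an edge pixel on all four sides
    # (left, right, above and below), counting the pixel itself.
    from_left = np.logical_or.accumulate(edges, axis=1)
    from_right = np.logical_or.accumulate(edges[:, ::-1], axis=1)[:, ::-1]
    from_top = np.logical_or.accumulate(edges, axis=0)
    from_bottom = np.logical_or.accumulate(edges[::-1, :], axis=0)[::-1, :]
    return from_left & from_right & from_top & from_bottom
```

On noisy images a more robust threshold and a proper hole-filling step would probably be needed. The sketch only shows why good contrast between cell and flow medium makes the brightfield channel straightforward to segment.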
Determination of ground truth. To train the machine learning algorithm we need a subset of cells where the cell's true state is annotated, that is, the ground truth is known. For the experiment shown in Fig. 1, the cells were labelled with a PI and an MPM2 stain. As the ground truth (expected results) for the cells' DNA content, we extracted the integrated intensities of the nuclear PI stain with the imaging software CellProfiler (Supplementary Code 2). The mitotic cell cycle phases were identified with the IDEAS analysis tool by categorizing the MPM2-positive cells into prophase, metaphase and anaphase using a limited set of user-formulated morphometric parameters (Supplementary Figure 2) on their PI stain images, followed by manual confirmation. The telophase cells were identified using a complex set of masks (using the IDEAS analysis tool) on the brightfield images to gate doublet cells. We used those values as the ground truth to train the machine learning algorithm and to evaluate the prediction of the nuclear stain intensity.
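As a reminder of what an integrated intensity is, the short sketch below sums the stain pixel values inside a segmentation mask. It is an illustration only, not the CellProfiler measurement itself. The function name and the optional constant background offset are assumptions.

```python
import numpy as np

def integrated_intensity(stain_image, mask, background=0.0):
    """Sum the (optionally background-subtracted) pixel values inside `mask`."""
    pixels = stain_image[mask].astype(float) - background
    return float(pixels.sum())
```

Applied to the PI channel of every cell, this yields one number per cell that serves as the regression target for the DNA content.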
We note that the ground truth is measured using the same modality, that is, the ImageStream system; this preserves the consistency of the presentation of the cell for measurement. If we seeded the cells on a plate to use microscopy, the cells' shape and morphology would be very different. However, the method we describe here could equally well be used to determine cell phenotypes from brightfield and darkfield images from traditional microscopy, provided the ground truth is measured using the same microscopy platform. For the analysis of short-lived mitotic phases, large numbers of cells would be required; however, this should not be problematic given the development of high-throughput imaging systems. The advent of three-dimensional high-resolution microscopy has given rise to images with an even richer information content and, provided enough cells could be measured, these systems would make good candidates for the method proposed here.
Machine learning. For the prediction of the DNA content, we use LSboosting as implemented in Matlab's fitensemble routine (Supplementary Code 4). For the assignment of the mitotic cell cycle phases, we use RUSboosting as also implemented in Matlab's fitensemble routine (Supplementary Code 5). In both cases, we partition the cells into a training and a testing set. The brightfield and darkfield features of the training set as well as the ground truth of these cells are used to train the ensemble. Once the ensemble is trained, we evaluate its predictive power on the testing set. To demonstrate the generalizability of this approach and to obtain error bars for our results, the procedure is 10-fold cross-validated. To prevent overfitting the data, the stopping criterion of the training was determined via fivefold internal cross-validation. All of these steps are described in the tutorial (Supplementary Note 1) and on the webpage.
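The nested cross-validation scheme for the regression task can be sketched as follows. This is a minimal NumPy re-implementation of least-squares boosting with decision stumps, written here for illustration. It is neither Matlab's fitensemble nor Supplementary Code 4, and the learning rate, the quantile-based candidate splits, the maximum number of rounds and the use of the Pearson correlation as the score are all assumptions. RUSboosting for the classification task is not shown.

```python
import numpy as np

def fit_stump(X, residual, quantiles=np.linspace(0.1, 0.9, 9)):
    """Find the single-feature split that minimises squared error on `residual`."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(np.quantile(X[:, j], quantiles)):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue  # this split would leave one side empty
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = ((residual[left] - lv) ** 2).sum() + ((residual[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    if best is None:  # all features constant: fall back to the mean residual
        m = residual.mean()
        return 0, np.inf, m, m
    return best[1:]

def stump_predict(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def boost(X, y, n_rounds, lr=0.2):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y - pred)
        pred = pred + lr * stump_predict(stump, X)
        stumps.append(stump)
    return base, lr, stumps

def staged_predict(model, X):
    """Yield the ensemble prediction after each boosting round."""
    base, lr, stumps = model
    pred = np.full(X.shape[0], base)
    for stump in stumps:
        pred = pred + lr * stump_predict(stump, X)
        yield pred

def folds(n, k, rng):
    """Split shuffled indices 0..n-1 into k (held-out, training) index pairs."""
    parts = np.array_split(rng.permutation(n), k)
    return [(p, np.concatenate(parts[:i] + parts[i + 1:])) for i, p in enumerate(parts)]

def choose_rounds(X, y, rng, max_rounds, k=5):
    """Internal k-fold CV: pick the round count with the lowest validation error."""
    err = np.zeros(max_rounds)
    for val, train in folds(len(y), k, rng):
        model = boost(X[train], y[train], max_rounds)
        for r, pred in enumerate(staged_predict(model, X[val])):
            err[r] += ((y[val] - pred) ** 2).sum()
    return int(np.argmin(err)) + 1

def cross_validate(X, y, rng, k=10, max_rounds=40):
    """Outer k-fold CV: mean and s.d. of the test-set Pearson correlation."""
    scores = []
    for test, train in folds(len(y), k, rng):
        n_rounds = choose_rounds(X[train], y[train], rng, max_rounds)
        model = boost(X[train], y[train], n_rounds)
        *_, pred = staged_predict(model, X[test])
        scores.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(scores)), float(np.std(scores))
```

The stopping point is chosen using the training portion of each outer fold only, so the held-out cells never influence model selection. The spread of the scores over the outer folds provides the error bars.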
In addition, we analysed which features have the most significant contributions for the prediction of both the nuclear stain and the mitotic phases by leave-one-out cross-validation (Supplementary Table 3). We find that leaving one feature out has only a minor effect on the results of the supervised machine learning algorithms we used, likely because many features are highly correlated with others. The most important features are intensity, area and shape, and radial distribution of the brightfield images.
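The effect of correlated features on a leave-one-feature-out analysis can be illustrated with a toy sketch. An ordinary least-squares model stands in for the boosted ensembles here, and the single train/test split, the function names and the feature names used in any example call are hypothetical.

```python
import numpy as np

def holdout_r2(X, y, train, test):
    """Fit a linear least-squares model on `train`, return R^2 on `test`."""
    A = np.column_stack([np.ones(len(train)), X[train]])
    coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
    pred = np.column_stack([np.ones(len(test)), X[test]]) @ coef
    ss_res = ((y[test] - pred) ** 2).sum()
    ss_tot = ((y[test] - y[test].mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def leave_one_feature_out(X, y, names, train, test):
    """Return the full-model score and the score drop when each feature is removed."""
    full = holdout_r2(X, y, train, test)
    drops = {}
    for j, name in enumerate(names):
        keep = [c for c in range(X.shape[1]) if c != j]
        drops[name] = full - holdout_r2(X[:, keep], y, train, test)
    return full, drops
```

If two columns carry nearly the same information, removing either one barely changes the score because the other compensates, whereas removing a feature with no correlated partner produces a clear drop.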
Retrospective data analysis. The described method can be used retrospectively to analyze data that were not originally acquired with label-free phenotype identification in mind. As demonstrated in the paper, either of two requirements must be met: (1) the phenotype of interest must be recognizable by eye or quantifiable/classifiable by image analysis, given the existing label-free images, or (2) the phenotype must be recognizable by eye or quantifiable/classifiable by image analysis in a separately stained subset of images prepared at the same time. Either of these approaches will provide the ground truth required to train the algorithms for label-free identification from retrospective image data on cells that are otherwise identically prepared and imaged, but lacking any stains. This approach offers the possibility to study the properties of different cell phenotypes using data that previously did not allow the phenotypes to be distinguished.
The same overall approach could also in theory be used to carry out label-free assays (whether retrospectively or not) using the image data from conventional microscopy as opposed to imaging flow cytometry. Adherent cells that are reasonably flattened improve the visibility of morphological features by which to determine phenotypes; however, the non-uniform presentation of each cell (versus imaging flow cytometry) is a disadvantage. Whether any particular application is feasible would be an empirical question.
References
1. Brown, M. & Wittwer, C. Flow cytometry: principles and clinical applications in hematology. Clin. Chem. 46, 1221–1229 (2000).
2. Darzynkiewicz, Z. & Huang, X. Analysis of cellular DNA content by flow cytometry. Curr. Protoc. Immunol. 5, 7 (2004).
3. Hans, F. & Dimitrov, S. Histone H3 phosphorylation and cell division. Oncogene 20, 3021–3027 (2001).
4. Sakaue-Sawano, A. et al. Visualizing spatiotemporal dynamics of multicellular cell-cycle progression. Cell 132, 487–498 (2008).
5. Chen et al. DNA minor groove-binding ligands: a different class of mammalian DNA topoisomerase inhibitors. Proc. Natl Acad. Sci. USA 9, 8131–8135 (1993).
6. Wojcik, K. & Dobrucki, J. W. Interaction of a DNA intercalator DRAQ5, and a minor groove binder SYTO17, with chromatin in live cells: Influence on chromatin organization and histone-DNA interactions. Cytometry A 73, 555–562 (2008).
7. Miltenburger, H. G., Sachse, G. & Schliermann, M. S-phase cell detection with a monoclonal antibody. Dev. Biol. Stand 66, 91–99 (1987).
8. Basiji, D. A., Ortyn, W. E., Liang, L., Venkatachalam, V. & Morissey, P. Cellular image analysis and imaging by flow cytometry. Clin. Lab. Med. 27, 653–670 (2007).
9. Filby, A. et al. An imaging flow cytometric method for measuring cell division history and molecular symmetry during mitosis. Cytometry A 79, 496–506 (2011).
10. Eliceiri, K. W. et al. Biological imaging software tools. Nat. Methods 9, 697–710 (2012).
11. Kamentsky, L. et al. Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software. Bioinformatics 27, 1179–1180 (2011).
12. Rajaram, S., Pavie, B. & Altschuler, S. J. PhenoRipper: software for rapidly profiling microscopy images. Nat. Methods 9, 635–637 (2012).
13. Jones, T. R. et al. Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proc. Natl Acad. Sci. USA 106, 1826–1831 (2009).
14. Perlman, Z. E. et al. Multidimensional drug profiling by automated microscopy. Science 306, 1194–1198 (2004).
15. Watson, J. V., Chambers, S. H. & Smith, P. J. A pragmatic approach to the analysis of DNA histograms with a definable G1 peak. Cytometry 8, 1–8 (1987).
16. Hastie, T. et al. The Elements of Statistical Learning 2nd edn (Springer, 2008).
17. Kafri, R. et al. Dynamics extracted from fixed cells reveal feedback linking cell growth to cell cycle. Nature 494, 480–483 (2013).
18. Seiffert, C., Khoshgoftaar, T. M., van Hulse, J. & Napolitano, A. RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans. Syst. Man Cybern. A Syst. Humans 40, 185–197 (2010).
19. Yuan, C. M. et al. DRAQ5-Based DNA content analysis of hematolymphoid cell subpopulations discriminated by surface antigens and light scatter properties. Cytometry B 58, 47–52 (2004).
20. Patterson, J. O., Swaffer, M. & Filby, A. An Imaging Flow Cytometry-based approach to analyse the fission yeast cell cycle in fixed cells. Methods 82, 74–84 (2015).
21. Zuleta, I. A., Aranda-Diaz, A. & El-Samad, H. Dynamic characterization of growth and gene expression using high-throughput automated flow cytometry. Nat. Methods 11, 443–448 (2014).
22. Xia, X. & Wong, S. T. Concise review: a high-content screening approach to stem cell research and drug discovery. Stem Cells 30, 1800–1807 (2012).
23. Chan, K. S., Koh, C. G. & Li, H. Y. Mitosis-targeted anti-cancer therapies: where they stand. Cell Death Dis. 3, e411 (2012).
24. Moreno, S., Klar, A. & Nurse, P. Molecular genetic analysis of fission yeast Schizosaccharomyces pombe. Methods Enzymol. 194, 795–823 (1991).
Acknowledgements
P.R. was supported by the Engineering and Physical Sciences Research Council, UK International Collaboration Sabbatical scheme under grant ref: EP/J00619X/1. T.B. was supported by the Studienstiftung des deutschen Volkes. F.J.T. and T.B. were supported by the European Research Council (starting grant LatentCauses) and the Deutsche Forschungsgemeinschaft (SPP 1356 Pluripotency and Cellular Reprogramming). H.H. and A.E.C. were supported by a grant from the Human Frontiers in Science programme (co-PIs Carpenter, Chang, and Wolthuis). J.O.P. was supported by the Francis Crick Institute (grant number FCI01), which receives its core funding from Cancer Research UK, the UK Medical Research Council, and the Wellcome Trust. In addition, J.O.P. was supported by a Boehringer Ingelheim Fonds PhD fellowship. P.R. and A.E.C. acknowledge the support of the Biotechnology and Biological Sciences Research Council/National Science Foundation under grant BB/N005163/1 and NSF DBI 1458626. We are grateful to our colleagues Lee Kamentsky and Mark Anthony Bray for support with the analysis workflow and Michael Laimighofer, Florian Buettner and Carsten Marr for helpful discussions regarding the machine learning. Moreover, we thank Alison Kozol for support with the website and Leslie Gaffney for designing Fig. 1.
Author contributions
P.R., A.F., H.D.S., F.J.T. and A.E.C. conceived and designed the experiments. A.F., J.C., D.D. and J.O.P. performed biological experiments; T.B., H.H., P.R. and A.F. analysed the data. T.B., P.R., A.F. and A.E.C. wrote the manuscript. All authors approved the final version of the manuscript.
Additional information
Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications
Competing financial interests: Although the software described is completely open-source, a provisional patent application has been filed relating to the method proposed in this manuscript.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
How to cite this article: Blasi, T. et al. Label-free cell cycle analysis for high-throughput imaging flow cytometry. Nat. Commun. 7:10256 doi: 10.1038/ncomms10256 (2016).
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Copyright Nature Publishing Group Jan 2016