3Mont: A multi-omics integrative tool for breast

Introduction

Cancer, one of the leading causes of death, poses a growing threat as its incidence rises across the globe. According to American Cancer Society statistics [1], in 2024 breast cancer accounted for 32% of all new cancer cases among women, while prostate cancer accounted for 29% of new cases among men. The incidence rate of invasive breast cancer in females is currently rising by 0.6% per year, a concerning upward trend. Preventive measures and targeted treatment strategies are crucial for reducing fatality risks, and a comprehensive understanding of the mechanisms driving these cancer types is imperative for effective interventions.

Multi-omics integrative approaches facilitate the identification of differentially expressed disease-associated bio-signatures, as well as the molecular classification and prognosis estimation of diseases. Introducing additional -omics layers, such as metabolomics, proteomics, and transcription factors, into existing methodologies provides valuable insights into the molecular mechanisms underlying complex diseases. In cancer tissue analysis, integrating complementary knowledge across different layers allows for a more detailed exploration of tumor architecture and the interactions among bio-signatures at various levels [2,3]. However, following the flow of biological information through the layers with integrative approaches raises challenges such as missing values, data heterogeneity, high dimensionality, and data standardization. Various tools adopting integrative approaches have been developed to overcome these obstacles [4–6]. To achieve strong prediction capability and interpretability for the diagnosis and prognosis of diseases, machine learning (ML) algorithms have been used in several studies [7,8]. Identifying trends or signals with these ML models enhances classification performance [9–11], refines the complex patterns among interacting biomarkers [12,13], and predicts hub features for therapeutic purposes [14,15]. Moreover, approaches leveraging deep learning algorithms [16–18] have shown success in predicting promising biomarkers for prevention and treatment purposes. However, conventional approaches designed to gain new biological and biomedical insights often assess biomolecules individually while building ML models. To generate a more holistic picture, it is important to take into account the collective functionality of biomolecules. Biomolecules that operate in tandem are called biological groups. By focusing on these groups of molecules, one can conduct intricate analyses of cellular systems and unveil the molecular dynamics in living tissues. Beyond the grouping methodology used to capture the collective behavior of sets of molecules, selecting important features is itself challenging, given the need for dimensionality reduction and the heterogeneity among -omics layers.

To identify the most influential features for distinguishing healthy tissues from tumorous ones, different feature ranking algorithms, such as embedded, filter, and wrapper methods, are employed [19]. When mapping -omics data from a high-dimensional to a lower-dimensional space, it is necessary to identify and discard irrelevant features, which also reduces the running time of the ML model. To address the obstacles mentioned above, here we present a novel multi-omics integration tool called 3-Multi-Omics Network and Integration Tool (3Mont).

The 3Mont method groups features and scores the groups using statistical analysis and a machine learning model. 3Mont is a biological knowledge-based improvement over 3Mint [20]. First, the groups defined in 3Mint are upgraded to pro-groups based on the gene lists shared among the groups. In addition, a new Feature Importance Scoring (FIS) component is implemented within the 3Mont workflow to select the most important features in each pro-group. This step eliminates the effect of pro-group size on the pro-group scores while developing a model for the BRCA molecular subtype identification problem. The effectiveness of each pro-group in BRCA molecular subtype identification is measured using a Random Forest (RF) model built on 3-omics datasets, which is a major improvement over the 3Mint method that uses only gene expression data. Additionally, 3Mont assesses the importance of pro-groups using the Robust Rank Aggregation (RRA) score [21]. Important signatures (contributing features) and the feature type distributions of the pro-groups (miRNA, methylated CpG sites, or mRNA) are provided as summary statistics.

Materials and methods

Datasets and preprocessing

The Cancer Genome Atlas (TCGA) Breast Invasive Carcinoma (BRCA) datasets, including microRNA (miRNA), gene expression (mRNA), and methylation data, were obtained from Xena Public Data Hubs [22]. The tumor samples were classified based on their molecular subtypes as follows: Hormone Receptor positive (HR+) is defined as Estrogen Receptor positive (ER+) and/or Progesterone Receptor positive (PR+), and Hormone Receptor negative (HR-) is defined as Estrogen Receptor negative (ER-) and Progesterone Receptor negative (PR-) [23]. A total of 425 HR+ samples and 124 HR- samples, excluding normal-like subtypes, were used to develop a machine learning model to classify the BRCA molecular subtypes. To counter class imbalance while creating the ML model, we applied downsampling; that is, we reduced the number of samples from the majority class to match the number of samples in the minority class. Following this 1:1 ratio, the samples were randomly distributed into training (90%) and testing (10%) sets in each iteration of the cross-validation step. The preprocessing steps and statistical analyses described in [20] were followed.
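The downsampling and 90/10 split described above can be illustrated with a minimal Python sketch. It is not the 3Mont implementation: the function name, the sample identifiers, the fixed seed, and the choice to split each class separately (a stratified split) are assumptions made for illustration. Only the class sizes (425 HR+ and 124 HR-) and the 90%/10% proportions come from the text.

```python
import random


def downsample_and_split(labels, test_fraction=0.1, seed=0):
    """Balance classes by downsampling, then split each class into train/test.

    `labels` maps a sample ID to its class label (e.g. "HR+" or "HR-").
    Returns (train_ids, test_ids). Splitting per class is an assumption.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample_id, label in labels.items():
        by_class.setdefault(label, []).append(sample_id)

    # Downsample every class to the size of the smallest (minority) class.
    minority_size = min(len(ids) for ids in by_class.values())
    n_test = max(1, round(test_fraction * minority_size))

    train_ids, test_ids = [], []
    for label in sorted(by_class):
        # rng.sample returns the kept samples in random order.
        kept = rng.sample(sorted(by_class[label]), minority_size)
        test_ids.extend(kept[:n_test])
        train_ids.extend(kept[n_test:])
    return train_ids, test_ids


# Hypothetical sample IDs; only the class sizes are taken from the text.
labels = {f"pos_{i}": "HR+" for i in range(425)}
labels.update({f"neg_{i}": "HR-" for i in range(124)})
train_ids, test_ids = downsample_and_split(labels)
print(len(train_ids), len(test_ids))
```

With these class sizes, each class contributes 124 samples after downsampling, of which 12 go to the test set and 112 to the training set under the rounding used in this sketch.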

Proposed 3Mont method

This section explains the developed approach, which uses the preprocessed BRCA datasets. The main objective of 3Mont is to group the features and arrange the groups based on the gene lists shared among them. The upgraded groups, called pro-groups, have unique names consisting of miRNA(s) and CpG(s) (methylated CpG sites). Each pro-group consists of one or a few groups. Before scores are assigned to the pro-groups, their sizes are normalized by selecting a fixed number of features from each. For this process, the feature sets of the pro-groups are given as input to the FIS (feature importance scoring) component. The most significant features, also called contributing features, are ranked by their Gini importance scores, filtered with a predetermined cut-off value, and used as input for machine learning model development.

3Mont is based on the notion of decision trees, which consist of a set of nodes that recursively split the observations into subsets. The quality of the split at each node is measured using the Gini impurity, and the feature (attribute) whose split yields the smallest Gini impurity is selected. The contribution of each feature in a tree is then calculated as the decrease in impurity (information gain) that its splits achieve, and is referred to as the feature importance score (FIS). The FIS of each feature is averaged over all trees. The smaller the mean FIS value, the less the feature contributes to the RF model.
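To make the impurity-based scoring concrete, the following Python sketch computes the Gini impurity of a node and the impurity decrease achieved by one split. It covers a single split only; accumulating these decreases per feature and averaging them over all trees, as described above, is left out. The function names and toy labels are illustrative assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Toy node with five samples per class.
parent = ["HR+"] * 5 + ["HR-"] * 5
# A split that separates the classes perfectly removes all impurity.
perfect = impurity_decrease(parent, ["HR+"] * 5, ["HR-"] * 5)
# A split that keeps the class mix unchanged in both children removes none.
useless = impurity_decrease(parent, ["HR+", "HR-"] * 2, ["HR+"] * 3 + ["HR-"] * 3)
print(perfect, useless)
```

A feature whose splits behave like the first case accumulates a high importance, while one whose splits behave like the second contributes almost nothing.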

Following that, the pro-group size is fixed by sorting the features according to their FIS, keeping the most important ones up to a predetermined feature set size, and discarding the rest. In the current study, the predetermined feature set size is 10, but it can easily be changed to select the top k features within a pro-group. The 3Mont pipeline is depicted in Fig 1.
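The top-k filtering step can be sketched in a few lines of Python. The feature names and scores below are hypothetical placeholders, and the alphabetical tie-breaking rule is an assumption added only to make the result deterministic.

```python
def select_top_k(fis_scores, k=10):
    """Return the k feature names with the highest FIS (all of them if fewer than k).

    `fis_scores` maps a feature name to its feature importance score.
    Ties are broken alphabetically (an assumption for reproducibility).
    """
    ranked = sorted(fis_scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical pro-group with four features and made-up scores.
fis = {"gene_a": 0.30, "cg_b": 0.05, "mir_c": 0.25, "gene_d": 0.40}
print(select_top_k(fis, k=2))   # → ['gene_d', 'gene_a']
print(select_top_k(fis, k=10))  # pro-groups smaller than k are kept whole
```

Because pro-groups with fewer than k features are returned unchanged, the filtered sizes range from 1 to k rather than being exactly k.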

[Figure omitted. See PDF.]

The 3Mont pipeline contains four main components: ProG (pro-grouping), FIS (feature importance scoring), S (pro-group scoring), and M (model creation). The features are grouped and groups are arranged based on the shared gene lists among them. Following the normalization of the pro-group sizes using the FIS component, the pro-groups are ordered according to their S component scores. The highest scoring pro-groups are selected to build ML models for classifying BRCA molecular subtypes.

The next component, S, scores each pro-group by applying the RF classifier [24] within an internal cross-validation to the expression data reduced to the selected (contributing) features (shown in Fig 2). The RF model uses a random collection of individual decision trees, which are combined using a voting scheme to obtain a more accurate classification [24]. For each pro-group, the expression values of its features are retrieved from the original -omics datasets, and the training set restricted to these features is split into internal training and testing sets that are given as input to the classifier. Then, a score based on the bagging (bootstrap aggregating) ensemble algorithm is assigned to each pro-group. This step is repeated using shuffle splits in the internal cross-validation for each pro-group to prevent selection bias and over-fitting in the S component.

[Figure omitted. See PDF.]

The expression profiles of each feature and class labels in the ProGroups (represented as a two-class dataset, aggregated from 3 omics datasets) are given as input to the S component. Each dataset is further split into internal training (90%) and internal testing (10%) datasets (encoded by shades of gray). Random splits are repeated 10 times, and the mean accuracy is assigned as the score for each ProGroup.
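The repeated random-split scoring of a pro-group can be sketched as follows. To stay dependency-free, the sketch plugs in a simple nearest-centroid classifier as a stand-in; it is not a Random Forest, and any classifier with the same `fit_predict` shape could be substituted. The function names, the toy data, and the seed are assumptions; the 10 repetitions and the 90%/10% split follow the description above.

```python
import random


def nearest_centroid_fit_predict(X_train, y_train, X_test):
    """Stand-in classifier (NOT Random Forest): predict the closest class mean."""
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

    def predict(row):
        return min(
            sorted(centroids),
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
        )

    return [predict(row) for row in X_test]


def score_pro_group(X, y, fit_predict=nearest_centroid_fit_predict,
                    n_splits=10, test_fraction=0.1, seed=0):
    """Mean test accuracy over `n_splits` random train/test splits."""
    rng = random.Random(seed)
    n = len(y)
    n_test = max(1, round(test_fraction * n))
    accuracies = []
    for _ in range(n_splits):
        order = list(range(n))
        rng.shuffle(order)
        test_idx, train_idx = order[:n_test], order[n_test:]
        predictions = fit_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        correct = sum(pred == y[i] for pred, i in zip(predictions, test_idx))
        accuracies.append(correct / n_test)
    return sum(accuracies) / n_splits


# Toy two-feature pro-group whose classes are perfectly separable.
X = [[1.0, 0.9]] * 20 + [[-1.0, -0.8]] * 20
y = ["HR+"] * 20 + ["HR-"] * 20
score = score_pro_group(X, y)
print(score)
```

Averaging over several random splits, rather than relying on a single one, is what makes the score less sensitive to which samples happen to land in the internal test set.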

Next, the top 10 best-scoring pro-groups are passed to the M component as the training data, and an RF classifier is developed. The performance of the model is tested on a separate data set to guard against over-fitting. With the use of informative pro-groups, this study aims to distinguish the HR- cases with poor prognostic characteristics [25] from the HR+ ones. Furthermore, the detection of contributing features and the interactions among these features may lead to the identification of promising biomarkers for diseases with heterogeneous characteristics.

To enhance subtype identification, we have also integrated a community detection-based network analysis as follows. Each pro-group identified by 3Mont consists of miRNA-CpG pairs, and its associated features include miRNA(s), CpG ID(s), and/or mRNA(s). First, a network is constructed in which each associated feature within the top 10 pro-groups is represented as a node. An edge is formed between any two associated features that co-occur in the same pro-group. The average scores calculated by 3Mont define the node weights of the associated features. In the current study, we focus on the top 10 pro-groups to create the network, but this threshold can be adjusted to include any number of pro-groups. Similarly, in the current study, node size reflects the frequency of each feature in the pro-groups over iterations; other parameters can also be used to enrich the visualization of the graph.
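The co-occurrence rule for building the network can be sketched in Python as below. The pro-group names and feature names are placeholders invented for the example, not results from the study, and counting how many pro-groups contain each feature is only a simplified proxy for the frequency over iterations mentioned above.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(pro_groups):
    """Build an undirected network from pro-group memberships.

    `pro_groups` maps a pro-group name to the list of its associated features.
    Returns (nodes, edges, occurrences): edges are sorted name pairs, and
    occurrences counts the pro-groups that contain each feature.
    """
    nodes, edges, occurrences = set(), set(), Counter()
    for features in pro_groups.values():
        unique = sorted(set(features))
        nodes.update(unique)
        occurrences.update(unique)
        # Every pair of features in the same pro-group is connected.
        edges.update(combinations(unique, 2))
    return nodes, edges, occurrences


# Placeholder pro-groups; "gene_y" is shared by both.
pro_groups = {
    "pro_group_1": ["mir_a", "cg_a", "gene_x", "gene_y"],
    "pro_group_2": ["mir_b", "cg_b", "gene_y"],
}
nodes, edges, occurrences = build_cooccurrence_network(pro_groups)
print(len(nodes), len(edges), occurrences["gene_y"])
```

Features shared between pro-groups, such as `gene_y` here, are the only links between otherwise separate cliques, which is why they tend to act as bridges in the resulting graph.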

A community detection algorithm was applied to the generated network, and highly interconnected features were identified using the cluster_fast_greedy function from the igraph package [26]. The algorithm starts with each feature in its own community and iteratively merges communities, greedily choosing the merge that most increases the modularity score. In this way, clusters of related features (communities) were identified.
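The idea behind greedy modularity maximization can be illustrated with a deliberately naive Python sketch. It is not the igraph implementation: it recomputes the modularity from scratch for every candidate merge and stops at the first merge that does not improve it, which is only practical for small graphs. The toy graph and all function names are assumptions. For an unweighted graph with m edges, the modularity of a partition is the sum over communities of L_c/m - (d_c/(2m))^2, where L_c is the number of edges inside community c and d_c is the total degree of its nodes.

```python
from itertools import combinations


def modularity(edges, communities):
    """Modularity of a partition of an unweighted, undirected graph."""
    m = len(edges)
    membership = {node: idx for idx, comm in enumerate(communities) for node in comm}
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


def greedy_modularity_communities(edges):
    """Merge communities pairwise while the best merge still raises modularity."""
    nodes = sorted({node for edge in edges for node in edge})
    communities = [frozenset([node]) for node in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(range(len(communities)), 2):
            merged = [c for i, c in enumerate(communities) if i not in (a, b)]
            merged.append(communities[a] | communities[b])
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda item: item[0])
        if q <= best_q:
            break  # no merge improves the modularity any further
        best_q, communities = q, merged
    return communities, best_q


# Toy graph: two triangles joined by a single bridging edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
communities, q = greedy_modularity_communities(edges)
print(sorted(sorted(c) for c in communities), round(q, 4))
```

On this toy graph the procedure recovers the two triangles as separate communities, since merging them across the single bridge would lower the modularity.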

Results and discussion

Size distribution of the identified pro-groups

First, we evaluate the size distribution of the identified pro-groups for the BRCA molecular subtype identification problem. Here, size refers to the number of features within a pro-group. The pro-group size distributions before and after applying the Gini importance filter are presented in Fig 3. In the left panel, before filtering, the sizes of the pro-groups range from 1 to 35. The Gini importance function returns an FIS for each feature in each pro-group, and the top 10 features of each pro-group are retained based on their FIS. The right panel shows that, after filtering, the sizes of the pro-groups range from 1 to 10. Only these selected features (at most 10 per pro-group) are used for training the Random Forest classifier.

[Figure omitted. See PDF.]

The size indicates the total number of features within a pro-group. The Gini importance function is used to rank features and the best features in each pro-group are chosen based on their Feature Importance Scores (FIS). In the right panel, a predetermined cut-off value of 10 is used.

Performance evaluation of 3Mont

As shown in Fig 1, after the top 10 features within each pro-group are selected based on their FIS, the scoring component assesses the performance of each pro-group using the RF classifier. The 10 best-scoring pro-groups are given to the M component to create a model for distinguishing HR+ cases from HR- cases. The 3Mont tool was also tested across multiple cancer data sets, such as bladder, kidney, and liver cancer and breast cancer subtypes. Table 1 summarizes the performance metrics calculated across these data sets. As shown in Table 1, 3Mont achieved AUC values over 0.85 and accuracy values over 0.80 for all tested cancer datasets.

[Figure omitted. See PDF.]

The proposed tool facilitates the identification of patterns and correlations of different diseases via identifying molecular signatures. The average number of features denotes the mean feature number of the highest-scoring group in each iteration. For example, 6 features on average are found in the highest scoring group that is identified for the Her2 vs. Basal-like subtypes of BRCA data set, while around 8 features are found for the LIHC data set.

Significant pro-groups highlighted by 3Mont for BRCA molecular subtype identification (HR+ and HR-)

The top 10 most significant pro-groups and their summary statistics calculated by 3Mont are given in Table 2.

[Figure omitted. See PDF.]

In Table 2, a lower rank indicates stronger statistical significance. The frequency column indicates the number of times the pro-group appears (out of 10 iterations). The average score, which is the output of the S component, is the mean score of the pro-group over the iterations. The Robust Rank Aggregation (RRA) score [21] tracks the rank of each feature over the iterations, giving a score in the range 0–1. Lower RRA scores indicate stronger significance of the pro-groups. The average rank is the mean of the ranks assigned to each pro-group in each iteration. The RRA scores and average ranks in Table 2 are correlated to some extent but convey different useful information.
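The frequency and average rank columns can be illustrated with a short Python sketch. The RRA score itself is not reproduced here, since it relies on the statistical model of [21]. The pro-group names and rankings are hypothetical, and averaging the rank only over the iterations in which a pro-group appears is an assumption.

```python
def summarize_ranks(rankings):
    """Summarize per-iteration rankings (best first) of pro-groups.

    Returns {name: {"frequency": ..., "average_rank": ...}}, where the
    average is taken over the iterations in which the pro-group appears.
    """
    positions = {}
    for ranked in rankings:
        for rank, name in enumerate(ranked, start=1):
            positions.setdefault(name, []).append(rank)
    return {
        name: {"frequency": len(ranks), "average_rank": sum(ranks) / len(ranks)}
        for name, ranks in positions.items()
    }


# Hypothetical rankings from three iterations.
iterations = [
    ["pg1", "pg2", "pg3"],
    ["pg2", "pg1"],
    ["pg1", "pg3", "pg2"],
]
summary = summarize_ranks(iterations)
for name in sorted(summary, key=lambda n: summary[n]["average_rank"]):
    print(name, summary[name])
```

In this toy example `pg3` appears in only two of the three iterations, which shows why frequency and average rank carry different information and are best read together.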

Based on the RRA scores, the top 10 pro-groups identified for the BRCA molecular subtype data set are given in Table 2. This table indicates that the miRNA hsa-miR-135b is involved in the discrimination of the HR+/HR- classes for BRCA. It is reported in [27] that miRNA-135b is a diagnostic biomarker for Triple Negative Breast Cancer (TNBC), including basal-like subtypes, compared to non-TNBCs. Upregulation of miRNA-135b induces migration and tumorigenesis by affecting WNT and Hippo signaling regulators [28,29]. Several of the other miRNAs, CpGs, and target genes identified by 3Mont for the BRCA molecular subtype data set have also been reported in the literature. For example, Fan et al. showed that hsa-miR-577, which is also detected in our analysis, has prognostic relevance for TNBC molecular heterogeneity compared to non-TNBC phenotypes [30].

At the methylation level, the differentially methylated gene SFT2D2 contributes to the metastatic behavior of BRCA cases [31]. A methylation level change in the gene TBCD, which has a role in the post-chaperonin tubulin folding pathway [32], has been identified with 3Mont. Induced expression of the SERTAD2 gene, also known as TRIP-Br-2, showed a positive association with tumorigenesis in BRCA patients. TRIP-Brs contribute to central carbon metabolism [33]. However, the association of these potential biomarkers with BRCA subtypes has not yet been clearly established.

Relations between the detected biomarkers for BRCA molecular subtype (HR+/HR- cases) identification

To understand the relations among the identified biomarkers, we performed the network analysis explained in the methods section. Fig 4 presents the network generated from the pro-groups and their associated features identified by 3Mont when applied to the BRCA molecular subtype data set (HR+/HR- cases). The fast greedy modularity maximization algorithm is used to detect communities in the graph, where the node size represents the average score (obtained from Table 2). The network structure is formed from the connections obtained from 3Mont (features in the same pro-group are connected), and this structure is the primary driver of the community detection in this analysis. Nodes represent genes (mRNAs), miRNAs, and CpG sites (methylation regions) within the top 10 pro-groups. An edge between two nodes indicates that these associated features exist within the same pro-group. Three different colors represent the three distinct communities identified within this network.

[Figure omitted. See PDF.]

+ /HR- cases). The network visualizes associations between mRNAs, CpG IDs, miRNAs within the pro-groups. The node size represents the scaled average score of each associated feature within the top 10 pro-groups. The scores are obtained from the Average scores column of Table 2. Different colors represent distinct communities (clusters) detected by the community detection algorithm.

Several nodes in Fig 4, including AGR2, FOXA1, FOXC1, GATA3, MLPH, PPP1R14C, SFT2D2, SPDEF, SRSF12, UGT8, VGLL1, cg14621217 (B3GNTL1), and cg12427162 (SFT2D2), have been reported to be associated with TNBC [31]. These nodes are associated features identified by 3Mont for the top 10 pro-groups. It was shown that POLR3G expression is negatively correlated with FOXA1 and AR expression in TNBC [34]. Additionally, the potential therapeutic targets C9orf152 and UGT8 were reported in the onset of TNBC cases [35,36]. The blue community in Fig 4 plays a central role through its key nodes (AGR2, FOXA1, MLPH) and serves as a bridge to the other communities. The connections of these key nodes with PRR15 and FOXC1 in the green community suggest a functional interplay in distinguishing the HR+/HR- classes.

One of the most frequently appearing genes in our analysis, FOXA1, has been reported in [37] to induce endocrine resistance in ER+ BRCA. It has been reported that FOXC1, another gene commonly detected in the 3Mont analysis, induces Hedgehog signaling and serves as a potential prognostic marker in the Basal-like BRCA subtype, which displays aggressive behavior and poor prognosis [38]. Furthermore, the gene GATA3, represented in both the Estrogen-dependent gene expression and ESR-mediated signaling pathways, is mutated with high frequency in Luminal subtypes of BRCA [39]. The collaborative action of GATA3, ER-α, and FOXA1 contributes to the differentiation status of Luminal vs. Basal-like BRCA cases [40,41]. The overexpression of the gene AGR2 and its interacting biomarker FOXA1 has been shown to worsen the prognosis of ER+ BRCA cases [42].

Furthermore, functional enrichment analysis using Gene Ontology terms was performed to reveal distinct biological processes for the identified features shown in Fig 4 and Table 2. The genes AR, FOXA1, and TGFB2 play a role in gland and epithelial tube morphogenesis and in positive regulation of cell differentiation [43,44], while GATA3 is involved in mammary gland duct morphogenesis and uterus development [45]. AGR2 and SPDEF contribute to the negative regulation of epithelial to mesenchymal transition [46,47]. The CpG site cg24051242 (LRP5) has a role in mammary gland duct morphogenesis and epithelium development [48], while cg17806482 (LAMB1) and cg25979244 (DISC1) are involved in positive regulation of cell differentiation and positive regulation of developmental process [49,50] in cancer tissue. The CpG site cg14986386 (CCDC125) has roles in negative regulation of Ras protein signal transduction and positive regulation of molecular function [51]. Another CpG site, cg13975098 (RGS10), is involved in molecular function and negative regulation of signal transduction [52]. The involvement of these features in these biological processes suggests their potential contribution to tumor metastasis and disease progression mechanisms.

Comparative performance evaluation of 3Mont

Fig 5 compares the performance metrics of 3Mont, 3Mint, and several other feature selection algorithms. All algorithms are run using all feature types (mRNA expression, miRNA expression, and methylation) except 3Mint, which uses only mRNA expression profiles. While the number of features in 3Mint is defined by its highest-scoring group (15 genes), the other algorithms perform the analysis with the number of features filtered to 10. In 3Mont, after the best-scoring features within each pro-group are selected, the S component is applied to assign a score to each pro-group. Subsequently, the 10 highest-scoring pro-groups are used to train and test the RF classifier. In differentiating between HR+ and HR- cases, the 3Mont, SKB (Select K Best), and FCBF (Fast Correlation Based Filter) feature selection methods have the highest classification performance. Among these three best-performing methods, 3Mont and SKB share the features GATA3, cg23205034 (targeting the CPQ gene), cg10970143 (targeting the CFAP45 gene), and hsa-mir-934, while SKB and FCBF have hsa-mir-4766, hsa-mir-3682, and ZNF454 in common. There are no shared features between 3Mont and FCBF.

[Figure omitted. See PDF.]

The average performance metrics of the best scoring groups, along with standard deviations over iterations, are shown in panels which are labeled as accuracy, Area Under ROC Curve, sensitivity and specificity. All algorithms use mRNA, miRNA, and methylation data, but 3Mint only uses mRNA expression in training and testing the classifier. Abbreviations: SKB: SelectKBest, FCBF: Fast Correlation Based Filter, IG: Information Gain, CMIM: Conditional Mutual Information Maximization, MRMR: Minimum Redundancy Maximum Relevance.
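For reference, the four reported metrics can be computed from predictions as in the Python sketch below. The labels and scores are made-up toy values, treating HR- as the positive class is an assumption, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the ROC curve.

```python
def confusion_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true, scores, positive):
    """Probability that a positive sample outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy test set; scores are hypothetical predicted probabilities of HR-.
y_true = ["HR-", "HR-", "HR-", "HR+", "HR+", "HR+"]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = ["HR-" if s >= 0.5 else "HR+" for s in scores]

metrics = confusion_metrics(y_true, y_pred, positive="HR-")
auc = roc_auc(y_true, scores, positive="HR-")
print(metrics, auc)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the 0.5 decision threshold used here, which is one reason the metrics are reported side by side.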

Discussion

3Mont has several strengths over traditional feature selection algorithms. Our tool integrates multiple functions through the ProG (pro-grouping), FIS (feature importance scoring), S (pro-group scoring), and M (model creation) components. Initially, 3Mont selects correlated features based on their expression profiles obtained from the -omics datasets. It then groups the features if they have common gene(s). Defining groups (each with unique miRNA and CpG site names) together with their associated gene(s) distinguishes our tool and 3Mint from other feature selection methods.

Machine learning algorithms typically select features one by one and ignore grouping strategies and the identification of correlated features using statistical approaches. However, 3Mont’s novel ProG component provides a collection of groups with common gene(s) and prevents the repetition of features within groups. Another aim of 3Mont is to select the hub features in each group and identify the significant features for sub-type differentiation. To achieve this, the FIS component is applied to filter the features in each group. These features, called contributing features, are then used to develop machine learning models.

By integrating statistical and ML methods with a grouping approach, 3Mont offers a more comprehensive tool for determining the collective behaviour of features compared to traditional feature selection algorithms. Additionally, restricting the number of features in groups via the FIS component decreases the processing time for sub-type classification in 3Mont compared to 3Mint. Feature interaction networks are constructed in 3Mont to illustrate the interplay of significant biomarkers.

Conclusion

Integration of multi-omics data has enhanced our understanding of heterogeneous diseases, providing a holistic view at the molecular level. The method offered in this study provides opportunities to identify novel bio-signatures and personalized treatment approaches. The 3Mont method categorizes biologically important features across various -omics datasets into biological groups. This approach enables researchers to gain novel insights into the collective functionality of biomolecules. The new concept of pro-groups extends previously defined biological groups with the inclusion of shared features. The construction of pro-groups reduces the number of groups through merging operations. Because pro-groups with more features tend to have higher scores, the sizes of the pro-groups are equalized to eliminate the effect of pro-group size in the scoring component. These equalized pro-groups are used to develop an ML model that tackles the classification problem. Finally, the feature importance scoring component enables us to pinpoint the most impactful bio-signatures in each pro-group.

With the new diagnostic models developed in this study, we identified promising biomarkers such as miRNA-135b, GATA3, FOXA1, and the differentially methylated gene DISC1 to distinguish the BRCA subtypes (HR+/HR- cases). Although 3Mont is used to identify BRCA molecular subtypes in the current study, this approach can also be applied to predict disease progression based on the class labels provided in the data set. Future directions for 3Mont include the incorporation of additional -omics datasets, such as transcription factor and metabolomics data, to provide deeper insights into the cellular mechanisms underlying diseases. Another direction is the optimization of the feature selection and grouping approaches of 3Mont using advanced statistical techniques to improve the efficiency and performance of the tool. Expanding the tool to analyze various disease datasets will enable novel bio-signature discovery and accelerate the development of therapeutic approaches.

Supporting information

S1 File. Evaluation of the effect of downsampling with descriptive analysis.

https://doi.org/10.1371/journal.pone.0326154.s001

(DOCX)

S2 File. Comparative Evaluation using the BRCA molecular subtype dataset.

https://doi.org/10.1371/journal.pone.0326154.s002

(DOCX)

S3 File. Tissue-specific gene enrichment analysis of the identified biomarkers.

https://doi.org/10.1371/journal.pone.0326154.s003

(DOCX)

References

1. Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024. CA Cancer J Clin. 2024;74(1):12–49.

2. Hozhabri H, Moghaddam MM, Moghaddam MM, Mohammadian A. A comprehensive bioinformatics analysis to identify potential prognostic biomarkers among CC and CXC chemokines in breast cancer. Sci Rep. 2022;12(1):10374. pmid:35725915

3. Zoppi J, Guillaume J-F, Neunlist M, Chaffron S. MiBiOmics: an interactive web application for multi-omics data exploration and integration. BMC Bioinform. 2021;22(1):6. pmid:33407076

4. Chalise P, Fridley BL. Integrative clustering of multi-level ’omic data based on non-negative matrix factorization algorithm. Peddada SD, editor. PLoS One. 2017;12(5):e0176278. pmid:28459819

5. Ding J, Blencowe M, Nghiem T, Ha S min, Chen YW, Li G. Mergeomics 2.0: a web server for multi-omics data integration to elucidate disease networks and predict therapeutics. Nucleic Acids Res. 2021;49(W1):W375–87.

6. Spirko-Burns L, Devarajan K. Supervised dimension reduction for large-scale “Omics” data with censored survival outcomes under possible non-proportional hazards. IEEE/ACM Trans Comput Biol Bioinform. 2021;18(5):2032–44. pmid:31940547

7. Alshayeji MH, Ellethy H, Abed S, Gupta R. Computer-aided detection of breast cancer on the Wisconsin dataset: an artificial neural networks approach. Biomed Signal Process Control. 2022;71:103141.

8. Spooner A, Chen E, Sowmya A, Sachdev P, Kochan NA, Trollor J. A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction. Sci Rep. 2020;10(1):20410.

9. Burgos N, Colliot O. Machine learning for classification and prediction of brain diseases: recent advances and upcoming challenges. Curr Opin Neurol. 2020;33(4):439–50. pmid:32657885

10. Park D, Kim BH, Lee S-E, Kim DY, Kim M, Kwon HD, et al. Machine learning-based approach for disease severity classification of carpal tunnel syndrome. Sci Rep. 2021;11(1):17464. pmid:34465860

11. Samaras AD, Moustakidis S, Apostolopoulos ID, Papandrianos N, Papageorgiou E. Classification models for assessing coronary artery disease instances using clinical and biometric data: an explainable man-in-the-loop approach. Sci Rep. 2023;13(1):6668.

12. Usman K, Rajpoot K. Brain tumor classification from multi-modality MRI using wavelets and machine learning. Pattern Anal Appl. 2017;20(3):871–81.

13. Yu KH, Berry GJ, Rubin DL, Ré C, Altman RB, Snyder M. Association of omics features with histopathology patterns in lung adenocarcinoma. Cell Systems. 2017;5(6):620–7.

14. DeGroat W, Mendhe D, Bhusari A, Abdelhalim H, Zeeshan S, Ahmed Z. IntelliGenes: a novel machine learning pipeline for biomarker discovery and predictive analysis using multi-genomic profiles. Martelli PL, editor. Bioinformatics. 2023;39(12):btad755.

15. Kong J, Ha D, Lee J, Kim I, Park M, Im S-H, et al. Network-based machine learning approach to predict immunotherapy response in cancer patients. Nat Commun. 2022;13(1):3703. pmid:35764641

16. An G, Akiba M, Omodaka K, Nakazawa T, Yokota H. Hierarchical deep learning models using transfer learning for disease detection and classification based on small number of medical images. Sci Rep. 2021;11(1):4250. pmid:33649375

17. Leng D, Zheng L, Wen Y, Zhang Y, Wu L, Wang J, et al. A benchmark study of deep learning-based multi-omics data fusion methods for cancer. Genome Biol. 2022;23(1):171. pmid:35945544

18. Miao JH, H K. Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification. Int J Adv Comput Sci Appl. 2018;9(5).

19. Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A. A review of feature selection methods on synthetic data. Knowl Inf Syst. 2012;34(3):483–519.

20. Unlu Yazici M, Marron JS, Bakir-Gungor B, Zou F, Yousef M. Invention of 3Mint for feature grouping and scoring in multi-omics. Front Genet. 2023;14:1093326.

21. Kolde R, Laur S, Adler P, Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics. 2012;28(4):573–80.

22. Goldman MJ, Craft B, Hastie M, Repečka K, McDade F, Kamath A, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020;38(6):675–8. pmid:32444850

23. Parker JS, Mullins M, Cheang MCU, Leung S, Voduc D, Vickery T. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27(8):1160–7.

24. Breiman L. Random Forests. Mach Learn. 2001;45(1):5–32.

25. Bae SY, Kim S, Lee JH, Lee H-C, Lee SK, Kil WH, et al. Poor prognosis of single hormone receptor-positive breast cancer: similar outcome as triple-negative breast cancer. BMC Cancer. 2015;15:138. pmid:25880075

26. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006;1695.

27. Uva P, Cossu-Rocca P, Loi F, Pira G, Murgia L, Orrù S, et al. miRNA-135b contributes to triple negative breast cancer molecular heterogeneity: different expression profile in basal-like versus non-basal-like phenotypes. Int J Med Sci. 2018;15(6):536–48. pmid:29725243

28. Hua K, Jin J, Zhao J, Song J, Song H, Li D, et al. miR-135b, upregulated in breast cancer, promotes cell growth and disrupts the cell cycle by regulating LATS2. Int J Oncol. 2016;48(5):1997–2006. pmid:26934863

29. Lv Z-D, Xin H-N, Yang Z-C, Wang W-J, Dong J-J, Jin L-Y, et al. miR-135b promotes proliferation and metastasis by targeting APC in triple-negative breast cancer. J Cell Physiol. 2019;234(7):10819–26. pmid:30624764

30. Fan C, Liu N. Identification of dysregulated microRNAs associated with diagnosis and prognosis in triple-negative breast cancer: an in silico study. Oncol Rep. 2019;41(6):3313–24. pmid:30942465

31. Segaert P, Lopes MB, Casimiro S, Vinga S, Rousseeuw PJ. Robust identification of target genes and outliers in triple-negative breast cancer data. Stat Methods Med Res. 2019;28(10–11):3042–56. pmid:30146936

32. Manoochehri M, Borhani N, Gerhäuser C, Assenov Y, Schönung M, Hielscher T, et al. DNA methylation biomarkers for noninvasive detection of triple-negative breast cancer using liquid biopsy. Int J Cancer. 2023;152(5):1025–35. pmid:36305646

33. Mongre RK, Mishra CB, Jung S, Lee BS, Quynh NTN, Anh NH. Exploring the role of TRIP-Brs in human breast cancer: an investigation of expression, clinicopathological significance, and prognosis. Mol Ther Oncol. 2020;19:105.

34. Lautré W, Richard E, Feugeas J-P, Dumay-Odelot H, Teichmann M. The POLR3G subunit of human RNA polymerase III regulates tumorigenesis and metastasis in triple-negative breast cancer. Cancers (Basel). 2022;14(23):5732. pmid:36497214

35. Dong P, Yu B, Pan L, Tian X, Liu F. Identification of key genes and pathways in triple-negative breast cancer by integrated bioinformatics analysis. Biomed Res Int. 2018;2018:2760918. pmid:30175120

36. Santuario-Facio SK, Cardona-Huerta S, Perez-Paramo YX, Trevino V, Hernandez-Cabrera F, Rojas-Martinez A, et al. A new gene expression signature for triple negative breast cancer using frozen fresh tissue before neoadjuvant chemotherapy. Mol Med. 2017;23:101–11. pmid:28474731

37. Fu X, Pereira R, Liu C-C, De Angelis C, Shea MJ, Nanda S, et al. High FOXA1 levels induce ER transcriptional reprogramming, a pro-metastatic secretome, and metastasis in endocrine-resistant breast cancer. Cell Rep. 2023;42(8):112821. pmid:37467106

38. Han B, Qu Y, Jin Y, Yu Y, Deng N, Wawrowsky K, et al. FOXC1 activates smoothened-independent hedgehog signaling in basal-like breast cancer. Cell Rep. 2015;13(5):1046–58.

39. Takaku M, Grimm SA, Wade PA. GATA3 in breast cancer: tumor suppressor or oncogene? Gene Expr. 2015;16(4):163–8.

40. Albergaria A, Paredes J, Sousa B, Milanezi F, Carneiro V, Bastos J, et al. Expression of FOXA1 and GATA-3 in breast cancer: the prognostic significance in hormone receptor-negative tumours. Breast Cancer Res. 2009;11(3):R40.

41. Takaku M, Grimm SA, De Kumar B, Bennett BD, Wade PA. Cancer-specific mutation of GATA3 disrupts the transcriptional regulatory network governed by estrogen receptor alpha, FOXA1 and GATA3. Nucleic Acids Res. 2020;48(9):4756–68.

42. Zhou M, Gan X li, Ren Y xiang, Chen Q xin, Yang Y zhong, Weng Z jin. AGR2 and FOXA1 as prognostic markers in ER-positive breast cancer. BMC Cancer. 2023;23(1):743.

43. Rangel N, Fortunati N, Osella-Abate S, Annaratone L, Isella C, Catalano MG. FOXA1 and AR in invasive breast cancer: new findings on their co-expression and impact on prognosis in ER-positive patients. BMC Cancer. 2018;18:703.

44. Zhou C, Wang D, Li J, Wang Q, Wo L, Zhang X, et al. TGFB2-AS1 inhibits triple-negative breast cancer progression via interaction with SMARCA4 and regulating its targets TGFB2 and SOX2. Proc Natl Acad Sci U S A. 2022;119(39):e2117988119.

45. Takaku M, Grimm SA, Wade PA. GATA3 in breast cancer: tumor suppressor or oncogene? Gene Expr. 2015;16(4):163.

46. Sommerova L, Ondrouskova E, Vojtesek B, Hrstka R. Suppression of AGR2 in a TGF-β-induced Smad regulatory pathway mediates epithelial-mesenchymal transition. BMC Cancer. 2017;17(1):546. pmid:28810836

47. Ye T, Li J, Feng J, Guo J, Wan X, Xie D. The subtype-specific molecular function of SPDEF in breast cancer and insights into prognostic significance. J Cell Mol Med. 2021;25(15):7307–20.

48. Lindvall C, Evans NC, Zylstra CR, Li Y, Alexander CM, Williams BO. The Wnt signaling receptor Lrp5 is required for mammary ductal stem cell activity and Wnt1-induced tumorigenesis. J Biol Chem. 2006;281(46):35081–7. pmid:16973609

49. Ran T, Chen Z, Zhao L, Ran W, Fan J, Hong S. LAMB1 is related to the T stage and indicates poor prognosis in gastric cancer. Technol Cancer Res Treat. 2021;20.

50. Gao X, Mi Y, Guo N, Hu Z, Hu F, Liu D, et al. Disrupted in schizophrenia 1 (DISC1) inhibits glioblastoma development by regulating mitochondria dynamics. Oncotarget. 2016;7(52):85963–74. pmid:27852062

51. Nakajima. Role of Kenae/CCDC125 in cell motility through the deregulation of RhoGTPase. Int J Mol Med. 2009;24(05). http://www.spandidos-publications.com/ijmm/24/5/605

52. Caldiran FY, Cacan E. RGS10 suppression by DNA methylation is associated with low survival rates in colorectal carcinoma. Pathol Res Pract. 2022;236:154007. pmid:35810565

Citation: Unlu Yazici M, Marron JS, Bakir-Gungor B, Zou F, Yousef M (2025) 3Mont: A multi-omics integrative tool for breast cancer subtype stratification. PLoS One 20(6): e0326154. https://doi.org/10.1371/journal.pone.0326154

About the Authors:

Miray Unlu Yazici

Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

Affiliation: Department of Bioengineering, Abdullah Gül University, Kayseri, Turkey

J. S. Marron

Roles: Conceptualization, Formal analysis, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

E-mail: [email protected] (JSM); [email protected] (MY)

Affiliation: Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, North Carolina, United States of America

Burcu Bakir-Gungor

Roles: Supervision, Validation, Writing – review & editing

Affiliations: Department of Bioengineering, Abdullah Gül University, Kayseri, Turkey; Department of Computer Engineering, Abdullah Gül University, Kayseri, Turkey

Fei Zou

Roles: Conceptualization, Formal analysis, Methodology, Supervision, Validation, Visualization, Writing – review & editing

Affiliations: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America

Malik Yousef

Roles: Conceptualization, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

E-mail: [email protected] (JSM); [email protected] (MY)

Affiliations: Department of Information Systems, Zefat Academic College, Zefat, Israel; Galilee Digital Health Research Center, Zefat Academic College, Zefat, Israel

ORCID: https://orcid.org/0000-0001-8780-6303

References

1. Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024. CA Cancer J Clin. 2024;74(1):12–49.

2. Hozhabri H, Moghaddam MM, Moghaddam MM, Mohammadian A. A comprehensive bioinformatics analysis to identify potential prognostic biomarkers among CC and CXC chemokines in breast cancer. Sci Rep. 2022;12(1):10374. pmid:35725915

3. Zoppi J, Guillaume J-F, Neunlist M, Chaffron S. MiBiOmics: an interactive web application for multi-omics data exploration and integration. BMC Bioinform. 2021;22(1):6. pmid:33407076

4. Chalise P, Fridley BL. Integrative clustering of multi-level ’omic data based on non-negative matrix factorization algorithm. Peddada SD, editor. PLoS One. 2017;12(5):e0176278. pmid:28459819

5. Ding J, Blencowe M, Nghiem T, Ha S min, Chen YW, Li G. Mergeomics 2.0: a web server for multi-omics data integration to elucidate disease networks and predict therapeutics. Nucleic Acids Res. 2021;49(W1):W375-87.

6. Spirko-Burns L, Devarajan K. supervised dimension reduction for large-scale “Omics” data with censored survival outcomes under possible non-proportional hazards. IEEE/ACM Trans Comput Biol Bioinform. 2021;18(5):2032–44. pmid:31940547

7. Alshayeji MH, Ellethy H, Abed S, Gupta R. Computer-aided detection of breast cancer on the Wisconsin dataset: an artificial neural networks approach. Biomed Signal Process Control. 2022;71:103141.

8. Spooner A, Chen E, Sowmya A, Sachdev P, Kochan NA, Trollor J. A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction. Sci Rep. 2020;10(1):20410.

9. Burgos N, Colliot O. Machine learning for classification and prediction of brain diseases: recent advances and upcoming challenges. Curr Opin Neurol. 2020;33(4):439–50. pmid:32657885

10. Park D, Kim BH, Lee S-E, Kim DY, Kim M, Kwon HD, et al. Machine learning-based approach for disease severity classification of carpal tunnel syndrome. Sci Rep. 2021;11(1):17464. pmid:34465860

11. Samaras AD, Moustakidis S, Apostolopoulos ID, Papandrianos N, Papageorgiou E. Classification models for assessing coronary artery disease instances using clinical and biometric data: an explainable man-in-the-loop approach. Sci Rep. 2023;13(1):6668.

12. Usman K, Rajpoot K. Brain tumor classification from multi-modality MRI using wavelets and machine learning. Pattern Anal Appl. 2017;20(3):871–81.

13. Yu KH, Berry GJ, Rubin DL, Ré C, Altman RB, Snyder M. Association of omics features with histopathology patterns in lung adenocarcinoma. Cell Systems. 2017;5(6):620–7.

14. DeGroat W, Mendhe D, Bhusari A, Abdelhalim H, Zeeshan S, Ahmed Z. IntelliGenes: a novel machine learning pipeline for biomarker discovery and predictive analysis using multi-genomic profiles. Martelli PL, editor. Bioinformatics. 2023;39(12):btad755.

15. Kong J, Ha D, Lee J, Kim I, Park M, Im S-H, et al. Network-based machine learning approach to predict immunotherapy response in cancer patients. Nat Commun. 2022;13(1):3703. pmid:35764641

16. An G, Akiba M, Omodaka K, Nakazawa T, Yokota H. Hierarchical deep learning models using transfer learning for disease detection and classification based on small number of medical images. Sci Rep. 2021;11(1):4250. pmid:33649375

17. Leng D, Zheng L, Wen Y, Zhang Y, Wu L, Wang J, et al. A benchmark study of deep learning-based multi-omics data fusion methods for cancer. Genome Biol. 2022;23(1):171. pmid:35945544

18. Miao JH, H K. Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification. Int J Adv Comput Sci Appl. 2018;9(5).

19. Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A. A review of feature selection methods on synthetic data. Knowl Inf Syst. 2012;34(3):483–519.

20. Unlu Yazici M, Marron JS, Bakir-Gungor B, Zou F, Yousef M. Invention of 3Mint for feature grouping and scoring in multi-omics. Front Genet. 2023;14:1093326.

21. Kolde R, Laur S, Adler P, Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics. 2012;28(4):573–80.

22. Goldman MJ, Craft B, Hastie M, Repečka K, McDade F, Kamath A, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020;38(6):675–8. pmid:32444850

23. Parker JS, Mullins M, Cheang MCU, Leung S, Voduc D, Vickery T. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27(8):1160–7.

24. Breiman L. Random Forests. Mach Learn. 2001;45(1):5–32.

25. Bae SY, Kim S, Lee JH, Lee H-C, Lee SK, Kil WH, et al. Poor prognosis of single hormone receptor- positive breast cancer: similar outcome as triple-negative breast cancer. BMC Cancer. 2015;15:138. pmid:25880075

26. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006;1695.

27. Uva P, Cossu-Rocca P, Loi F, Pira G, Murgia L, Orrù S, et al. miRNA-135b contributes to triple negative breast cancer molecular heterogeneity: different expression profile in basal-like versus non-basal-like phenotypes. Int J Med Sci. 2018;15(6):536–48. pmid:29725243

28. Hua K, Jin J, Zhao J, Song J, Song H, Li D, et al. miR-135b, upregulated in breast cancer, promotes cell growth and disrupts the cell cycle by regulating LATS2. Int J Oncol. 2016;48(5):1997–2006. pmid:26934863

29. Lv Z-D, Xin H-N, Yang Z-C, Wang W-J, Dong J-J, Jin L-Y, et al. miR-135b promotes proliferation and metastasis by targeting APC in triple-negative breast cancer. J Cell Physiol. 2019;234(7):10819–26. pmid:30624764

30. Fan C, Liu N. Identification of dysregulated microRNAs associated with diagnosis and prognosis in triple‑negative breast cancer: an in silico study. Oncol Rep. 2019;41(6):3313–24. pmid:30942465

31. Segaert P, Lopes MB, Casimiro S, Vinga S, Rousseeuw PJ. Robust identification of target genes and outliers in triple-negative breast cancer data. Stat Methods Med Res. 2019;28(10–11):3042–56. pmid:30146936

32. Manoochehri M, Borhani N, Gerhäuser C, Assenov Y, Schönung M, Hielscher T, et al. DNA methylation biomarkers for noninvasive detection of triple-negative breast cancer using liquid biopsy. Int J Cancer. 2023;152(5):1025–35. pmid:36305646

33. Mongre RK, Mishra CB, Jung S, Lee BS, Quynh NTN, Anh NH. Exploring the role of TRIP-Brs in human breast cancer: an investigation of expression, clinicopathological significance, and prognosis. Mol Ther Oncol. 2020;19:105.

34. Lautré W, Richard E, Feugeas J-P, Dumay-Odelot H, Teichmann M. The POLR3G subunit of human RNA polymerase III regulates tumorigenesis and metastasis in triple-negative breast cancer. Cancers (Basel). 2022;14(23):5732. pmid:36497214

35. Dong P, Yu B, Pan L, Tian X, Liu F. Identification of key genes and pathways in triple-negative breast cancer by integrated bioinformatics analysis. Biomed Res Int. 2018;2018:2760918. pmid:30175120

36. Santuario-Facio SK, Cardona-Huerta S, Perez-Paramo YX, Trevino V, Hernandez-Cabrera F, Rojas-Martinez A, et al. A new gene expression signature for triple negative breast cancer using frozen fresh tissue before neoadjuvant chemotherapy. Mol Med. 2017;23:101–11. pmid:28474731

37. Fu X, Pereira R, Liu C-C, De Angelis C, Shea MJ, Nanda S, et al. High FOXA1 levels induce ER transcriptional reprogramming, a pro-metastatic secretome, and metastasis in endocrine-resistant breast cancer. Cell Rep. 2023;42(8):112821. pmid:37467106

38. Han B, Qu Y, Jin Y, Yu Y, Deng N, Wawrowsky K, et al. FOXC1 activates smoothened-independent hedgehog signaling in basal-like breast cancer. Cell Rep. 2015;13(5):1046–58.

39. Takaku M, Grimm SA, Wade PA. GATA3 in breast cancer: tumor suppressor or oncogene?. Gene Expr. 2015;16(4):163–8.

40. Albergaria A, Paredes J, Sousa B, Milanezi F, Carneiro V, Bastos J, et al. Expression of FOXA1 and GATA-3 in breast cancer: the prognostic significance in hormone receptor-negative tumours. Breast Cancer Res. 2009;11(3):R40.

41. Takaku M, Grimm SA, De Kumar B, Bennett BD, Wade PA. Cancer-specific mutation of GATA3 disrupts the transcriptional regulatory network governed by estrogen receptor alpha, FOXA1 and GATA3. Nucleic Acids Res. 2020;48(9):4756–68.

42. Zhou M, Gan X li, Ren Y xiang, Chen Q xin, Yang Y zhong, Weng Z jin. AGR2 and FOXA1 as prognostic markers in ER-positive breast cancer. BMC Cancer. 2023;23(1):743.

43. Rangel N, Fortunati N, Osella-Abate S, Annaratone L, Isella C, Catalano MG. FOXA1 and AR in invasive breast cancer: new findings on their co-expression and impact on prognosis in ER-positive patients. BMC Cancer. 2018;18:703.

44. Zhou C, Wang D, Li J, Wang Q, Wo L, Zhang X, et al. TGFB2-AS1 inhibits triple-negative breast cancer progression via interaction with SMARCA4 and regulating its targets TGFB2 and SOX2. Proc Natl Acad Sci U S A. 2022;119(39):e2117988119.

45. Takaku M, Grimm SA, Wade PA. GATA3 in breast cancer: tumor suppressor or oncogene?. Gene Expr. 2015;16(4):163.

46. Sommerova L, Ondrouskova E, Vojtesek B, Hrstka R. Suppression of AGR2 in a TGF-β-induced Smad regulatory pathway mediates epithelial-mesenchymal transition. BMC Cancer. 2017;17(1):546. pmid:28810836

47. Ye T, Li J, Feng J, Guo J, Wan X, Xie D. The subtype‐specific molecular function of SPDEF in breast cancer and insights into prognostic significance. J Cell Mol Med. 2021;25(15):7307–20.

48. Lindvall C, Evans NC, Zylstra CR, Li Y, Alexander CM, Williams BO. The Wnt signaling receptor Lrp5 is required for mammary ductal stem cell activity and Wnt1-induced tumorigenesis. J Biol Chem. 2006;281(46):35081–7. pmid:16973609

49. Ran T, Chen Z, Zhao L, Ran W, Fan J, Hong S. LAMB1 is related to the T stage and indicates poor prognosis in gastric cancer. Technol Cancer Res Treat. 2021;20.

50. Gao X, Mi Y, Guo N, Hu Z, Hu F, Liu D, et al. Disrupted in schizophrenia 1 (DISC1) inhibits glioblastoma development by regulating mitochondria dynamics. Oncotarget. 2016;7(52):85963–74. pmid:27852062

51. Nakajima. Role of Kenae/CCDC125 in cell motility through the deregulation of RhoGTPase. Int J Mol Med. 2009;24(05). http://www.spandidos-publications.com/ijmm/24/5/605

52. Caldiran FY, Cacan E. RGS10 suppression by DNA methylation is associated with low survival rates in colorectal carcinoma. Pathol Res Pract. 2022;236:154007. pmid:35810565


This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication: https://creativecommons.org/publicdomain/zero/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Breast cancer (BRCA) is a heterogeneous disease and one of the most prevalent cancer types among women. Developing effective treatment strategies that address the diverse types of BRCA is crucial. Notably, among the BRCA molecular subtypes, hormone receptor-negative (HR-) cases, especially the Basal-like subtype, lack estrogen and progesterone receptors and exhibit a higher tumor growth rate than HR+ cases. Improving survival time and predicting prognosis for distinct molecular profiles is therefore of substantial importance. In this study, we propose a novel approach called the 3-Multi-Omics Network and Integration Tool (3Mont), which integrates various -omics data by applying a grouping function, detecting pro-groups, and assigning a score to each pro-group using a feature importance scoring (FIS) component. Machine learning (ML) models are then constructed from the prominent pro-groups, which enables the extraction of promising biomarkers for distinguishing BRCA subtypes. The tool allows users to analyze the collective behavior of the features in each pro-group (biological group) using ML algorithms. In addition, by constructing the pro-groups and equalizing the number of features in each pro-group with the FIS component, 3Mont achieves a 20% speedup over the 3Mint tool. In contrast to conventional methods, 3Mont generates networks that illustrate the interplay of prominent biomarkers from different -omics data. Exploring the concerted actions of features in pro-groups thus facilitates understanding the dynamics of the biomarkers within the generated networks and developing effective strategies for better cancer subtype stratification. The 3Mont tool, along with all supporting materials, can be found at https://github.com/malikyousef/3Mont.git.
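The abstract describes two steps that a small example may make more concrete: equalizing the number of features per pro-group by feature importance, and then scoring each pro-group. The following is a minimal sketch of that idea only, not the 3Mont implementation. The function names `trim_groups` and `rank_groups`, the feature and group names, the importance values, and the choice of mean importance as the group score are all assumptions made for illustration; in practice the importance values would come from a trained ML model.

```python
from statistics import mean


def trim_groups(groups, importance, k):
    """Keep the k highest-importance features of every group.

    Assumption: 'equalizing' is read here as a simple top-k cut per group.
    """
    trimmed = {}
    for name, features in groups.items():
        ranked = sorted(features, key=lambda f: importance[f], reverse=True)
        trimmed[name] = ranked[:k]
    return trimmed


def rank_groups(groups, importance):
    """Rank groups by the mean importance of their member features.

    Assumption: mean importance stands in for the group score.
    """
    scores = {
        name: mean(importance[f] for f in features)
        for name, features in groups.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy input: feature names and scores are invented.
importance = {
    "geneA": 0.9, "geneB": 0.2, "geneC": 0.1,
    "mirX": 0.7, "cpg1": 0.4, "cpg2": 0.3,
}
groups = {
    "group1": ["geneA", "mirX", "cpg1", "geneB"],
    "group2": ["geneC", "cpg2", "geneB"],
}

trimmed = trim_groups(groups, importance, k=2)
ranking = rank_groups(trimmed, importance)

for name, score in ranking:
    print(name, trimmed[name], round(score, 3))
```

After trimming, every group holds the same number of features, so downstream models are trained on inputs of equal width. This is one plausible reading of why equalizing group sizes could reduce run time.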

Details

Title: 3Mont: A multi-omics integrative tool for breast cancer subtype stratification

Author: Unlu Yazici, Miray; Marron, J S; Bakir-Gungor, Burcu; Zou, Fei; Yousef, Malik

First page: e0326154

Section: Research Article

Publication year: 2025

Publication date: Jun 2025

Publisher: Public Library of Science

e-ISSN: 1932-6203

Source type: Scholarly Journal

Language of publication: English

DOI: https://doi.org/10.1371/journal.pone.0326154

ProQuest document ID: 3224904746
