Integrate multi-omics data with biological

Abstract

Background

Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of -omics data corresponds to one feature space, such as gene expression, miRNA expression, DNA methylation, etc. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial to elucidate molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promises in uncovering intricate relationships among molecular features. However, due to the “big p, small n” problem (i.e., small sample sizes with high-dimensional features), training a large-scale generalizable deep learning model with multi-omics data alone is very challenging.

Results

We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to certain constraints specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduced a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method for predicting target clinical variables.

Conclusions

To alleviate the overfitting problem in deep learning on multi-omics data with the “big p, small n” problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. It is very promising to design machine learning models that facilitate the seamless integration of large-scale multi-omics data and biomedical domain knowledge for uncovering intricate relationships among molecular features and clinical features.

Details

Title

Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE)

Author

Ma, Tianle; Zhang, Aidong

Pages

1-11

Section

Research

Publication year

2019

Publication date

2019

Publisher

Springer Nature B.V.

e-ISSN

14712164

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1186/s12864-019-6285-x

ProQuest document ID

2340837505

© 2019. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE)

Jump to:

Abstract

Details

Full text options

Suggested sources