Abstract

Dimensionality reduction greatly facilitates the exploration of cellular heterogeneity in single-cell RNA sequencing data. While most such approaches are data-driven, it can be useful to incorporate biologically plausible assumptions about the underlying structure or the experimental design. We propose the boosting autoencoder (BAE) approach, which combines the advantages of unsupervised deep learning for dimensionality reduction with those of boosting for formalizing assumptions. Specifically, our approach selects small sets of genes that explain latent dimensions. As illustrative applications, we explore the diversity of neural cell identities and temporal patterns of embryonic development.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Added experiments to investigate consistency in variable selection and robustness to removing important variables from the data. Added a model comparison with other unsupervised variable selection approaches. Added a new strategy for determining the most important variables selected by the model during training.

Details

Title
Infusing structural assumptions into dimensionality reduction for single-cell RNA sequencing data to identify small gene sets
Author
Hackenberg, Maren; Brunn, Niklas; Vogel, Tanja; Binder, Harald
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2025
Publication date
Jan 25, 2025
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2928432859
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.