Abstract

Convolutional neural networks (CNNs) have been successfully used in many applications where important information about the data is embedded in the order of features, such as speech and imaging. However, most tabular data have no inherent spatial relationship between features and are therefore unsuitable for modeling with CNNs. To meet this challenge, we develop a novel algorithm, image generator for tabular data (IGTD), which transforms tabular data into images by assigning features to pixel positions so that similar features are close to each other in the image. The algorithm searches for an optimized assignment by minimizing the difference between the ranking of pairwise distances between features and the ranking of pairwise distances between their assigned pixels in the image. We apply IGTD to transform gene expression profiles of cancer cell lines (CCLs) and molecular descriptors of drugs into their respective image representations. Compared with existing transformation methods, IGTD generates compact image representations that better preserve the feature neighborhood structure. Evaluated on benchmark drug screening datasets, CNNs trained on IGTD image representations of CCLs and drugs predict anti-cancer drug response better than both CNNs trained on alternative image representations and prediction models trained on the original tabular data.
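The rank-matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean feature distance, the absolute-difference error, the random greedy swap search, and all function names (`rank_matrix`, `assignment_error`, `greedy_swap_assign`) are assumptions chosen for illustration. Only the overall objective, matching the ranking of feature distances to the ranking of pixel distances, comes from the abstract.

```python
import numpy as np

def pairwise_euclidean(points):
    """Pairwise Euclidean distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rank_matrix(dist):
    """Replace each off-diagonal distance by its rank among all pairs.
    Ties are broken by position (stable sort), which is an assumption."""
    n = dist.shape[0]
    iu = np.triu_indices(n, k=1)
    ranks = np.empty(len(iu[0]))
    ranks[np.argsort(dist[iu], kind="stable")] = np.arange(1, len(iu[0]) + 1)
    out = np.zeros((n, n))
    out[iu] = ranks
    return out + out.T  # symmetric, zero diagonal

def assignment_error(feat_rank, pix_rank, order):
    """Sum of absolute rank differences; order[p] is the feature at pixel p."""
    permuted = feat_rank[np.ix_(order, order)]
    return np.abs(permuted - pix_rank).sum() / 2  # each pair counted once

def greedy_swap_assign(data, n_rows, n_cols, n_iter=2000, seed=0):
    """Assign the columns of `data` to an n_rows x n_cols pixel grid by
    random pairwise swaps, keeping a swap only if it lowers the error."""
    n_feat = data.shape[1]
    assert n_feat == n_rows * n_cols, "grid must hold exactly one feature per pixel"
    rng = np.random.default_rng(seed)
    # Assumed feature distance: Euclidean distance between feature columns.
    feat_rank = rank_matrix(pairwise_euclidean(data.T))
    coords = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)],
                      dtype=float)
    pix_rank = rank_matrix(pairwise_euclidean(coords))
    order = np.arange(n_feat)
    err = assignment_error(feat_rank, pix_rank, order)
    for _ in range(n_iter):
        i, j = rng.choice(n_feat, size=2, replace=False)
        trial = order.copy()
        trial[[i, j]] = trial[[j, i]]
        trial_err = assignment_error(feat_rank, pix_rank, trial)
        if trial_err < err:
            order, err = trial, trial_err
    return order.reshape(n_rows, n_cols), err
```

Under these assumptions, calling `greedy_swap_assign(data, 3, 4)` on a samples-by-12-features array returns a 3 x 4 grid of feature indices together with the final rank-matching error. One sample can then be rendered as an image with `data[k][layout]`.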

Details

Title
Converting tabular data into images for deep learning with convolutional neural networks
Author
Zhu, Yitan 1 ; Brettin, Thomas 1 ; Xia, Fangfang 1 ; Partin, Alexander 1 ; Shukla, Maulik 1 ; Yoo, Hyunseung 1 ; Evrard, Yvonne A. 2 ; Doroshow, James H. 3 ; Stevens, Rick L. 4 

1 Argonne National Laboratory, Computing, Environment and Life Sciences, Lemont, USA (GRID:grid.187073.a) (ISNI:0000 0001 1939 4845)
2 Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, USA (GRID:grid.418021.e) (ISNI:0000 0004 0535 8394)
3 National Cancer Institute, Developmental Therapeutics Branch, Bethesda, USA (GRID:grid.48336.3a) (ISNI:0000 0004 1936 8075)
4 Argonne National Laboratory, Computing, Environment and Life Sciences, Lemont, USA (GRID:grid.187073.a) (ISNI:0000 0001 1939 4845); The University of Chicago, Department of Computer Science, Chicago, USA (GRID:grid.170205.1) (ISNI:0000 0004 1936 7822)
Publication year
2021
Publication date
2021
Publisher
Nature Publishing Group
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2534802878
Copyright
© This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.