Extracting key features for phenotype classification from high-dimensional, complex mass spectrometry (MS) data is a significant challenge. Conventional data representations, such as peak lists or grid-based imaging strategies, often suffer from information loss and compromised signal integrity, which limits the performance of downstream deep learning models. To address this, we propose a data representation framework named MSIMG. Inspired by object detection in computer vision, MSIMG introduces a data-driven, “density-peak-centric” patch selection strategy. The strategy uses density map estimation and non-maximum suppression to locate the centers of signal-dense regions, which serve as anchors for dynamic, content-aware patch extraction. This process transforms raw MS data into a multi-channel image representation with higher information fidelity. Extensive experiments on two public clinical MS datasets show that MSIMG significantly outperforms both the traditional peak list method and the grid-based MetImage approach. These results indicate that content-aware patch selection gives deep learning models a more information-dense and discriminative data representation. Our findings highlight the strong impact of data representation on model performance and demonstrate the potential of applying computer vision strategies to analytical chemistry data, paving the way for more robust and precise clinical diagnostic models.
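The density-peak-centric selection described in the abstract can be illustrated with a minimal sketch. This is not the MSIMG implementation. It assumes peaks arrive as (retention time, m/z, intensity) tuples, uses a small Gaussian splat as a stand-in for density map estimation, and applies a greedy grid-distance form of non-maximum suppression. All function names, grid sizes, and parameters below are hypothetical.

```python
import math

def density_map(peaks, n_rt, n_mz, rt_range, mz_range):
    """Bin (rt, mz, intensity) peaks into an n_rt x n_mz grid of log-intensity sums."""
    rt_lo, rt_hi = rt_range
    mz_lo, mz_hi = mz_range
    grid = [[0.0] * n_mz for _ in range(n_rt)]
    for rt, mz, inten in peaks:
        if not (rt_lo <= rt < rt_hi and mz_lo <= mz < mz_hi):
            continue  # ignore peaks outside the grid
        i = int((rt - rt_lo) / (rt_hi - rt_lo) * n_rt)
        j = int((mz - mz_lo) / (mz_hi - mz_lo) * n_mz)
        grid[i][j] += math.log1p(inten)
    return grid

def gaussian_smooth(grid, sigma=1.0, radius=2):
    """Spread each non-empty cell over its neighbours with Gaussian weights
    (assumed stand-in for a proper kernel density estimate)."""
    n, m = len(grid), len(grid[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if grid[i][j] == 0.0:
                continue
            for a in range(max(0, i - radius), min(n, i + radius + 1)):
                for b in range(max(0, j - radius), min(m, j + radius + 1)):
                    w = math.exp(-((a - i) ** 2 + (b - j) ** 2) / (2 * sigma ** 2))
                    out[a][b] += grid[i][j] * w
    return out

def nms_anchors(density, k, min_dist):
    """Greedy non-maximum suppression: visit cells from densest to sparsest and
    keep up to k that lie at least min_dist cells (Chebyshev) from every kept cell."""
    cells = sorted(((v, i, j) for i, row in enumerate(density)
                    for j, v in enumerate(row) if v > 0), reverse=True)
    kept = []
    for _, i, j in cells:
        if all(max(abs(i - a), abs(j - b)) >= min_dist for a, b in kept):
            kept.append((i, j))
            if len(kept) == k:
                break
    return kept

def extract_patch(grid, center, half):
    """Cut a (2*half+1) x (2*half+1) patch around center, zero-padded at the borders."""
    ci, cj = center
    n, m = len(grid), len(grid[0])
    return [[grid[i][j] if 0 <= i < n and 0 <= j < m else 0.0
             for j in range(cj - half, cj + half + 1)]
            for i in range(ci - half, ci + half + 1)]

# Toy data: two signal-dense regions in a 10 x 10 (rt, m/z) grid.
peaks = [(1.4, 140.0, 1e4), (1.5, 150.0, 8e3), (1.6, 160.0, 9e3),
         (8.5, 850.0, 5e3), (8.6, 860.0, 4e3)]
raw = density_map(peaks, 10, 10, (0.0, 10.0), (0.0, 1000.0))
anchors = nms_anchors(gaussian_smooth(raw), k=2, min_dist=3)
patches = [extract_patch(raw, a, half=2) for a in anchors]
```

In this sketch each anchor yields one fixed-size patch cut from the unsmoothed grid; stacking such patches is one plausible way to form a multi-channel input, though the abstract does not specify how MSIMG sizes patches or assembles channels.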
Details
Gao Boyong 1; Wang Yinchu 2; Guo Lin 2; Zhang Wei 2; Xiong Xingchuang 2
1 College of Information Engineering, China Jiliang University, Hangzhou 310018, China; [email protected] (F.Z.);
2 National Institute of Metrology, Beijing 100029, China; Key Laboratory of Metrology Digitalization and Digital Metrology for State Market Regulation, State Administration for Market Regulation, Beijing 100029, China; National Metrology Data Center, Beijing 100029, China