Abstract

Radiomic approaches in precision medicine are promising, but variation associated with image acquisition factors can result in severe biases and low generalizability. Multicenter datasets used in these studies are often heterogeneous in multiple imaging parameters and/or have missing information, resulting in multimodal radiomic feature distributions. ComBat is a promising harmonization tool, but it only harmonizes by single/known variables and assumes standardized input data are normally distributed. We propose a procedure that sequentially harmonizes for multiple batch effects in an optimized order, called OPNested ComBat. Furthermore, we propose to address bimodality by employing a Gaussian Mixture Model (GMM) grouping considered as either a batch variable (OPNested + GMM) or as a protected clinical covariate (OPNested − GMM). Methods were evaluated on features extracted with CapTK and PyRadiomics from two public lung computed tomography (CT) datasets. We found that OPNested ComBat improved harmonization performance over standard ComBat. OPNested + GMM ComBat exhibited the best harmonization performance but the lowest predictive performance, while OPNested − GMM ComBat showed poorer harmonization performance, but the highest predictive performance. Our findings emphasize that improved harmonization performance is no guarantee of improved predictive performance, and that these methods show promise for superior standardization of datasets heterogeneous in multiple or unknown imaging parameters and greater generalizability.
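The GMM grouping step described in the abstract can be illustrated with a minimal sketch. It assumes a single radiomic feature with a bimodal distribution and fits a two-component, one-dimensional Gaussian mixture by expectation-maximization using only the Python standard library. The function names (`fit_two_component_gmm`, `assign_groups`) and the synthetic data are hypothetical and are not taken from the authors' implementation. The resulting group labels are the kind of variable that could then be supplied to a harmonization step either as a batch variable (as in OPNested + GMM) or as a protected covariate (as in OPNested − GMM).

```python
import math
import random


def fit_two_component_gmm(values, n_iter=200, tol=1e-8):
    """Fit a 1-D, two-component Gaussian mixture by EM.

    Returns (weights, means, variances). Component 0 is initialised
    from the lower half of the sorted data, component 1 from the upper half.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    prev_ll = -math.inf

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        ll = 0.0
        for x in values:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
            ll += math.log(total)

        # M-step: update weights, means, and variances from the posteriors.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a collapsed component

        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return weights, means, variances


def assign_groups(values, params):
    """Label each value 0 or 1 according to the more probable mixture component."""
    weights, means, variances = params
    groups = []
    for x in values:
        scores = [
            w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        groups.append(0 if scores[0] >= scores[1] else 1)
    return groups


# Synthetic bimodal feature (an assumption for illustration only):
# two acquisition-like modes centred at 0 and 6.
random.seed(0)
feature = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(6, 1) for _ in range(200)]
params = fit_two_component_gmm(feature)
groups = assign_groups(feature, params)
```

In this sketch, `groups` holds one label per sample. Whether that label is treated as a nuisance batch effect to remove or as a covariate to preserve is the design choice that distinguishes the two GMM variants named in the abstract.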

Details

Title
Improved generalized ComBat methods for harmonization of radiomic features
Author
Horng, Hannah 1 ; Singh, Apurva 2 ; Yousefi, Bardia 2 ; Cohen, Eric A. 2 ; Haghighi, Babak 2 ; Katz, Sharyn 2 ; Noël, Peter B. 3 ; Kontos, Despina 2 ; Shinohara, Russell T. 4 

1 University of Pennsylvania, Center for Biomedical Image Computing and Analysis (CBICA), Department of Radiology, Philadelphia, USA (GRID:grid.25879.31) (ISNI:0000 0004 1936 8972); University of Pennsylvania, Penn Statistics in Imaging and Visualization Endeavor (PennSIVE), Department of Biostatistics, Epidemiology, and Informatics, Philadelphia, USA (GRID:grid.25879.31) (ISNI:0000 0004 1936 8972)
2 University of Pennsylvania, Center for Biomedical Image Computing and Analysis (CBICA), Department of Radiology, Philadelphia, USA (GRID:grid.25879.31) (ISNI:0000 0004 1936 8972)
3 University of Pennsylvania, Laboratory for Advanced Computed Tomography Imaging, Department of Radiology, Philadelphia, USA (GRID:grid.25879.31) (ISNI:0000 0004 1936 8972)
4 University of Pennsylvania, Penn Statistics in Imaging and Visualization Endeavor (PennSIVE), Department of Biostatistics, Epidemiology, and Informatics, Philadelphia, USA (GRID:grid.25879.31) (ISNI:0000 0004 1936 8972)
Publication year
2022
Publication date
2022
Publisher
Nature Publishing Group
e-ISSN
2045-2322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2733868646
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.