Abstract

Background and objective: Lung cancer is one of the most common malignant tumors worldwide and a leading cause of cancer-related death. Early-stage lung cancer often presents as pulmonary nodules, and accurate assessment of their malignancy risk is crucial for prolonging survival and avoiding overdiagnosis and overtreatment. This study aimed to construct models based on imaging feature parameters automatically extracted by artificial intelligence (AI) and to evaluate their effectiveness in predicting the malignancy of part-solid nodules (PSNs).

Methods: This retrospective study analyzed 229 PSNs from 222 patients who underwent pulmonary nodule resection at Lanzhou University Second Hospital between October 2020 and February 2025. According to the pathological results, 45 benign lesions and precursor glandular lesions were assigned to the non-malignant group, and 184 pulmonary malignancies were assigned to the malignant group. All patients underwent preoperative chest computed tomography (CT), and AI software was used to extract imaging feature parameters. Univariate analysis was used to screen for significant variables; the variance inflation factor (VIF) was calculated to exclude highly collinear variables; LASSO regression was then applied to identify key features; and multivariate logistic regression was used to determine independent risk factors. Based on the selected variables, five models were constructed: logistic regression, random forest, XGBoost, LightGBM, and support vector machine (SVM). Receiver operating characteristic (ROC) curves were used to assess model performance.

Results: The independent risk factors for malignancy of PSNs were coarseness (ngtdm), dependence variance (gldm), and short run low gray-level emphasis (glrlm). Logistic regression achieved areas under the curve (AUCs) of 0.86 and 0.89 in the training and testing sets, respectively, showing good performance. XGBoost had AUCs of 0.78 and 0.77, respectively, a relatively balanced result but with lower accuracy. SVM had an AUC of 0.93 in the training set, which fell to 0.80 in the testing set, indicating some overfitting. LightGBM performed very well in the training set, with an AUC of 0.94, but declined to 0.88 in the testing set. Random forest performed consistently in both sets, with AUCs of 0.89 in the training set and 0.91 in the testing set, indicating high stability and good generalizability.

Conclusion: The random forest model constructed from the independent risk factors performed best in predicting the malignancy of PSNs and could provide clinicians with effective auxiliary predictions to support individualized treatment decisions.
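The comparison in the Results rests on two ideas: what an AUC measures, and how the difference between training and testing AUCs signals overfitting. The following is a minimal sketch of both, not the authors' code. The function names `roc_auc` and `auc_gaps`, the constant `REPORTED_AUC`, and the four-case toy data are hypothetical and introduced only for illustration; the AUC values themselves are the ones reported in the abstract.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen malignant case (label 1)
    receives a higher score than a randomly chosen non-malignant case
    (label 0). Ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# AUCs as reported in the abstract: model -> (training set, testing set).
REPORTED_AUC: Dict[str, Tuple[float, float]] = {
    "Logistic regression": (0.86, 0.89),
    "Random forest": (0.89, 0.91),
    "XGBoost": (0.78, 0.77),
    "LightGBM": (0.94, 0.88),
    "SVM": (0.93, 0.80),
}


def auc_gaps(aucs: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Return (model, training AUC minus testing AUC), largest drop first.
    A large positive gap suggests the model fits the training set better
    than it generalizes."""
    gaps = [(name, round(train - test, 2)) for name, (train, test) in aucs.items()]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print("toy AUC:", roc_auc(toy_labels, toy_scores))

    for name, gap in auc_gaps(REPORTED_AUC):
        print(f"{name}: train - test = {gap:+.2f}")

    best = max(REPORTED_AUC, key=lambda m: REPORTED_AUC[m][1])
    print("highest testing AUC:", best)
```

Ranked this way, SVM shows the largest drop from training to testing and LightGBM the next largest, while random forest combines the highest testing AUC with a negligible gap, which matches the abstract's reading of the results.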

Details

Title
Application Value of an AI-based Imaging Feature Parameter Model for Predicting the Malignancy of Part-solid Pulmonary Nodule
Author
LIN, Mingzhi; HUI, Yiming; LI, Bin; ZHAO, Peilin; ZHENG, Zhizhong; YANG, Zhuowen; SU, Zhipeng; MENG, Yuqi; SONG, Tieniu
Pages
281-290
Section
Clinical Research
Publication year
2025
Publication date
2025
Publisher
Chinese Anti-Cancer Association; Chinese Antituberculosis Association
ISSN
1009-3419
e-ISSN
1999-6187
Source type
Scholarly Journal
Language of publication
Chinese
ProQuest document ID
3211981679
Copyright
Copyright © 2025. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.