Abstract

Scanning electron microscopy images can reveal detailed microstructural and compositional information across many fields, yet they are challenging to label and process because of the large volumes generated, the presence of noise and artifacts, and the reliance on domain expertise. The lack of scalable, automated, and interpretable methods for analyzing scanning electron microscopy images has prompted this research, which focuses on three primary objectives. First, semi-supervised learning techniques, including pseudo-labeling and consistency regularization, exploit both labeled and unlabeled scanning electron microscopy data by generating pseudo-labels for the unlabeled images and enforcing consistent predictions under input perturbations. Second, this study introduces a hybrid Vision Transformer (ViT-ResNet50) model, which combines the representational power of ViT with the feature extraction capabilities of ResNet50. Third, SHapley Additive exPlanations (SHAP) enhance the model's interpretability by revealing the image regions that contribute most to its predictions. Performance is evaluated using confusion matrices, test accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (ROC-AUC), model fitting time, and trainable parameter count, together with a comparative analysis demonstrating competitiveness against state-of-the-art models in both semi-supervised and fully supervised (completely labeled) settings. The semi-supervised ViT-ResNet50 model achieved accuracies of 93.65% and 84.76% on the Aversa scanning electron microscopy dataset and the UltraHigh Carbon Steel Database, respectively, with notable interpretability, surpassing baseline models such as ResNet101, InceptionV3, InceptionResNetV2, and InceptionV4. The findings highlight the potential of semi-supervised learning to improve model performance when labeled data are limited, though challenges such as class imbalance and increased computational cost point to areas for further optimization.
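
To make the training scheme described above concrete, the sketch below shows one plausible reading of it in PyTorch: a ResNet50 feature extractor whose feature map is fed as tokens to a small Transformer encoder, trained with a FixMatch-style step that combines pseudo-labeling and consistency regularization. The layer sizes, confidence threshold, weak/strong augmentation pairing, and helper names are illustrative assumptions, not the authors' exact configuration.

# Hedged sketch (PyTorch, torchvision >= 0.13): hybrid ResNet50 + Transformer classifier
# trained with pseudo-labeling and consistency regularization. Hyperparameters are
# illustrative assumptions, not the paper's reported settings.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class HybridViTResNet50(nn.Module):
    def __init__(self, num_classes, embed_dim=256, depth=4, heads=8):
        super().__init__()
        backbone = models.resnet50(weights="IMAGENET1K_V2")
        # Keep the convolutional stages only (drop avgpool and fc): 2048 x 7 x 7 for 224 x 224 input.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])
        self.proj = nn.Conv2d(2048, embed_dim, kernel_size=1)   # project CNN features to token dim
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        enc_layer = nn.TransformerEncoderLayer(embed_dim, heads,
                                               dim_feedforward=4 * embed_dim,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        f = self.proj(self.cnn(x))                    # (B, D, H, W)
        tokens = f.flatten(2).transpose(1, 2)         # (B, H*W, D) patch-like tokens
        cls = self.cls_token.expand(x.size(0), -1, -1)
        z = self.encoder(torch.cat([cls, tokens], dim=1))   # positional embeddings omitted for brevity
        return self.head(z[:, 0])                     # classify from the [CLS] token

def semi_supervised_step(model, optimizer, xl, yl, xu_weak, xu_strong,
                         threshold=0.95, lambda_u=1.0):
    # One FixMatch-style update: supervised loss on labeled images plus a consistency
    # loss on unlabeled images whose weak-view pseudo-labels are sufficiently confident.
    model.train()
    loss_sup = F.cross_entropy(model(xl), yl)

    with torch.no_grad():
        probs = F.softmax(model(xu_weak), dim=1)      # pseudo-labels from the weakly augmented view
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()            # keep only confident pseudo-labels

    logits_u = model(xu_strong)                       # enforce consistency on the strongly augmented view
    loss_unsup = (F.cross_entropy(logits_u, pseudo, reduction="none") * mask).mean()

    loss = loss_sup + lambda_u * loss_unsup
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In the full pipeline summarized in the abstract, a trained model of this kind would additionally be passed to a SHAP explainer to attribute predictions to image regions.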

Details

Title
Hybrid vision transformer framework for efficient and explainable SEM image-based nanomaterial classification
Author
Kaur, Manpreet; Valderrama, Camilo E; Liu, Qian

Department of Applied Computer Science and Society, The University of Winnipeg, Winnipeg, Canada
First page
015066
Publication year
2025
Publication date
Mar 2025
Publisher
IOP Publishing
e-ISSN
2632-2153
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3179842722
Copyright
© 2025 The Author(s). Published by IOP Publishing Ltd. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.