© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Vision Transformer (ViT) is emerging as a new leader in computer vision, with outstanding performance on many tasks. However, its success relies on pretraining on large datasets (e.g., ImageNet-22k, JFT-300M), which makes it difficult to train ViT from scratch on a small-scale, imbalanced capsule endoscopy image dataset. This paper adopts a Transformer neural network with a spatial pooling configuration. The Transformer's self-attention mechanism enables it to capture long-range information effectively, and exploiting the spatial structure of ViT through pooling further improves its performance on our small-scale capsule endoscopy dataset. We trained from scratch on two publicly available datasets for capsule endoscopy disease classification, obtaining 79.15% accuracy on the multi-class classification task of the Kvasir-Capsule dataset and 98.63% accuracy on the binary classification task of the Red Lesion Endoscopy dataset.
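
The spatial pooling idea mentioned in the abstract can be illustrated with a minimal sketch. This record does not describe the paper's actual pooling layer, so the example below assumes plain average pooling over a row-major grid of patch tokens; the function name `pool_tokens` and its parameters are hypothetical and chosen only for illustration.

```python
def pool_tokens(tokens, height, width, stride=2):
    """Average-pool a row-major grid of patch tokens (illustrative assumption).

    tokens: list of height*width token vectors (lists of floats).
    Returns (pooled_tokens, new_height, new_width).
    """
    if len(tokens) != height * width:
        raise ValueError("token count must equal height * width")
    if height % stride or width % stride:
        raise ValueError("grid size must be divisible by stride")
    dim = len(tokens[0])
    new_h, new_w = height // stride, width // stride
    pooled = []
    for i in range(new_h):
        for j in range(new_w):
            # Collect the stride x stride window of tokens for this output cell.
            window = [
                tokens[(i * stride + di) * width + (j * stride + dj)]
                for di in range(stride)
                for dj in range(stride)
            ]
            # Average the window channel by channel.
            pooled.append(
                [sum(t[k] for t in window) / len(window) for k in range(dim)]
            )
    return pooled, new_h, new_w
```

With a stride of 2, each pooling stage cuts the number of tokens by a factor of four, so later self-attention layers operate on a shorter sequence while still covering the whole image.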

Details

Title
Transformer-Based Disease Identification for Small-Scale Imbalanced Capsule Endoscopy Dataset
Author
Long, Bai 1; Wang, Liangyu 1; Chen, Tong 2; Zhao, Yuanhao 1; Ren, Hongliang 3

1 Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong 999077, China
2 School of Instrument Science and Opto-Electronics Engineering, Beijing Information Science and Technology University, Beijing 100101, China; School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
3 Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong 999077, China; Shun Hing Institute of Advanced Engineering, The Chinese University of Hong Kong, Hong Kong 999077, China; Department of Biomedical Engineering, National University of Singapore, Singapore 117583, Singapore; NUS (Suzhou) Research Institute, Suzhou 215000, China
First page
2747
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2711287705