Abstract

Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiologists with 22 local labels of rectangles surrounding abnormalities and 6 global labels of suspected diseases. The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. We designed and built a labeling platform for DICOM images to facilitate these annotation procedures. All images are made publicly available in DICOM format along with the labels of both the training set and the test set.
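The annotation scheme described above (rectangles drawn around abnormalities, with each training scan labeled independently by 3 radiologists) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `LocalLabel` record, its field names, the identifiers, the placeholder finding name, and the pixel coordinates are hypothetical and do not reflect the released label files. Intersection over union (IoU) is used here only as a common way to compare two readers' rectangles; the abstract does not state that the authors use it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class LocalLabel:
    image_id: str   # hypothetical image identifier
    reader_id: str  # hypothetical radiologist identifier
    finding: str    # placeholder name standing in for one local label
    box: Box        # rectangle surrounding the abnormality


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def readers_per_image(labels: List[LocalLabel]) -> Dict[str, int]:
    """Count the distinct readers who drew at least one box on each image."""
    readers = defaultdict(set)
    for label in labels:
        readers[label.image_id].add(label.reader_id)
    return {image_id: len(ids) for image_id, ids in readers.items()}


# Made-up example: three readers mark the same finding on one training scan.
labels = [
    LocalLabel("img_001", "R1", "finding_A", (100, 100, 200, 200)),
    LocalLabel("img_001", "R2", "finding_A", (150, 150, 250, 250)),
    LocalLabel("img_001", "R3", "finding_A", (100, 100, 200, 200)),
]

print(readers_per_image(labels))
print(iou(labels[0].box, labels[1].box))
```

The sketch covers only scans on which a reader drew a rectangle; a scan with no abnormality would need a separate representation, which is left out here.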

Measurement(s)

diseases and abnormal findings from chest X-ray scans

Technology Type(s)

AI is used to detect diseases and abnormal findings

Sample Characteristic - Location

Vietnam

Details

Title
VinDr-CXR: An open dataset of chest X-rays with radiologist’s annotations
Author
Nguyen, Ha Q. 1 ; Lam, Khanh 2 ; Le, Linh T. 3 ; Pham, Hieu H. 4 ; Tran, Dat Q. 5 ; Nguyen, Dung B. 5 ; Le, Dung D. 6 ; Pham, Chi M. 6 ; Tong, Hang T. T. 6 ; Dinh, Diep H. 6 ; Do, Cuong D. 6 ; Doan, Luu T. 3 ; Nguyen, Cuong N. 3 ; Nguyen, Binh T. 3 ; Nguyen, Que V. 3 ; Hoang, Au D. 3 ; Phan, Hien N. 3 ; Nguyen, Anh T. 3 ; Ho, Phuong H. 7 ; Ngo, Dat T. 8 ; Nguyen, Nghia T. 8 ; Nguyen, Nhan T. 8 ; Dao, Minh 9 ; Vu, Van 10 

1  Vingroup Big Data Institute, Hanoi, Vietnam; VinBigData JSC, Smart Health Center, Hanoi, Vietnam 
2  Hospital 108, Department of Radiology, Hanoi, Vietnam 
3  Hanoi Medical University Hospital, Department of Radiology, Hanoi, Vietnam (GRID:grid.488446.2) 
4  Vingroup Big Data Institute, Hanoi, Vietnam (GRID:grid.488446.2); VinBigData JSC, Smart Health Center, Hanoi, Vietnam (GRID:grid.488446.2); VinUniversity, College of Engineering and Computer Science, Hanoi, Vietnam (GRID:grid.507915.f) (ISNI:0000 0004 8341 3037); VinUniversity, VinUni-Illinois Smart Health Center, Hanoi, Vietnam (GRID:grid.507915.f) (ISNI:0000 0004 8341 3037) 
5  Vingroup Big Data Institute, Hanoi, Vietnam (GRID:grid.507915.f) 
6  Hospital 108, Department of Radiology, Hanoi, Vietnam (GRID:grid.507915.f) 
7  Tam Anh General Hospital, Department of Radiology, Ho Chi Minh City, Vietnam (GRID:grid.488446.2) 
8  VinBigData JSC, Smart Health Center, Hanoi, Vietnam (GRID:grid.488446.2) 
9  Vingroup Big Data Institute, Hanoi, Vietnam (GRID:grid.488446.2) 
10  Vingroup Big Data Institute, Hanoi, Vietnam (GRID:grid.488446.2); Yale University, Department of Mathematics, New Haven, USA (GRID:grid.47100.32) (ISNI:0000000419368710) 
Publication year
2022
Publication date
2022
Publisher
Nature Publishing Group
e-ISSN
20524463
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2691948564
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.