VinDr-BodyPartXR: An Open Dataset for Classification of Body Parts from DICOM X-ray Scans

Dataset Description

X-ray imaging, stored in Digital Imaging and Communications in Medicine (DICOM) format, is the most commonly used imaging modality in clinical practice, resulting in vast, non-normalized databases. This is an obstacle to deploying artificial intelligence (AI) solutions for medical image analysis, which often require identifying the correct body part before an image is fed into a specific AI model. This challenge calls for an automated and efficient approach to classifying body parts from X-ray scans. To this end, the Vingroup Big Data Institute (VinBigData) introduces and releases the VinDr-BodyPartXR dataset, which includes 16,093 X-ray images that were collected and manually annotated.
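The routing step described above (identify the body part first, then pass the image to the matching model) can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the label strings, the handler functions, and `route_scan` itself are hypothetical and do not come from the dataset or from any released code. The sketch only shows the dispatch logic, not the classifier.

```python
from typing import Callable, Dict

# A handler stands in for a body-part-specific AI model (hypothetical).
Handler = Callable[[str], str]


def route_scan(
    scan_id: str,
    predicted_part: str,
    handlers: Dict[str, Handler],
    fallback: Handler,
) -> str:
    """Send a scan to the handler registered for its predicted body part.

    `predicted_part` is assumed to come from a body part classifier;
    scans with no registered handler go to `fallback`.
    """
    handler = handlers.get(predicted_part, fallback)
    return handler(scan_id)


# Placeholder downstream models (illustrative names only).
def chest_model(scan_id: str) -> str:
    return f"{scan_id}: sent to chest model"


def spine_model(scan_id: str) -> str:
    return f"{scan_id}: sent to spine model"


def no_model(scan_id: str) -> str:
    return f"{scan_id}: no matching model, skipped"


# Placeholder label names; the dataset's actual class list is not given here.
handlers: Dict[str, Handler] = {"chest": chest_model, "spine": spine_model}

if __name__ == "__main__":
    print(route_scan("scan-001", "chest", handlers, no_model))
    print(route_scan("scan-002", "skull", handlers, no_model))
```

Keeping an explicit fallback means a scan of an unsupported body part is never silently passed to a model that was not trained for it.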

To the best of our knowledge, VinDr-BodyPartXR is the largest dataset to date that provides annotations for developing supervised-learning classification algorithms. We believe that it will serve as a benchmark for accelerating the development and evaluation of new machine learning models for body part classification from X-ray scans.

Figure 1. Examples of VinDr-BodyPartXR scans.

Data statistics

Table 1. Dataset statistics of VinDr-BodyPartXR.


To download the VinDr-BodyPartXR dataset, please sign our Data Use Agreement (DUA) and send the signed DUA to v.md@vinbigdata.org to obtain the download link.


For any publication that uses this resource, the authors must cite the original paper:

Hieu H. Pham, Dung V. Do, and Ha Q. Nguyen, “DICOM Imaging Router: An Open Deep Learning Framework for Classification of Body Parts from DICOM X-ray Scans,” in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCV Workshop 2021).


Correspondence should be addressed to v.md@vinbigdata.org.