Dataset SpineXR


VinDr-SpineXR: An open dataset for spinal lesions detection and classification from radiographs

Dataset description

Radiographs are the most important imaging tool for identifying spine anomalies in clinical practice. To the best of our knowledge, no existing study has been devoted to developing and evaluating a comprehensive system for classifying and localizing multiple spine lesions from X-ray scans. The key obstacle is the lack of large datasets with high-quality images and expert annotations. To fill this gap, Vingroup Big Data Institute (VinBigdata) has created and made freely available VinDr-SpineXR, a large-scale X-ray dataset for spinal lesion detection and classification. VinDr-SpineXR contains 10,469 images from 5,000 studies, manually annotated with 13 types of abnormalities. Each scan was annotated by an expert radiologist.

To the best of our knowledge, VinDr-SpineXR is currently the largest dataset that provides radiologists’ bounding-box annotations for developing supervised object detection algorithms. We believe it will serve as a benchmark for accelerating the development and evaluation of new machine learning models for spinal X-ray interpretation.

Figure 1. Examples of spine X-ray scans with radiologist’s annotations. Abnormal findings (local labels) marked by radiologists are plotted on the original images for visualization purposes.

Dataset Statistics

Table 2. Characteristics of patients in the training and test datasets.

Download Dataset

The full version of VinDr-SpineXR will be submitted to PhysioNet for review and public access. We are actively working on this and aim to publish the dataset as soon as possible.


The image and annotation quality of the dataset can be examined via VinDr Laboratory, our hub for all public datasets. To access the data hub, users must complete an access request form.

Author List and Affiliations

Hieu T. Nguyen [1,2], Hieu H. Pham [1,3], Nghia T. Nguyen [1], Ha Q. Nguyen [1,3], Thang Q. Huynh [2], Minh Dao [1], and Van Vu [1,4]

[1] Medical Imaging Center, Vingroup Big Data Institute, Hanoi, Vietnam

[2] School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam

[3] College of Engineering & Computer Science, VinUniversity, Hanoi, Vietnam

[4] Department of Mathematics, Yale University, New Haven, USA

* Corresponding author: Hieu H. Pham



For any publication that uses this resource, the authors must cite the original paper as follows:


     title={VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs}, 
     author={Hieu T. Nguyen and Hieu H. Pham and Nghia T. Nguyen and Ha Q. Nguyen and Thang Q. Huynh and Minh Dao and Van Vu},


We welcome any comments, suggestions, or feedback that help improve the dataset. Correspondence should be addressed to Hieu H. Pham.


