VinDr-SpineXR: An open dataset for spinal lesions detection and classification from radiographs
Dataset Description
Radiographs are the most important imaging tool for identifying spine anomalies in clinical practice. To the best of our knowledge, no existing studies have been devoted to the development and evaluation of a comprehensive system for classifying and localizing multiple spine lesions from X-ray scans. The key obstacle is the lack of large datasets with high-quality images and human experts’ annotations. To fill this gap, Vingroup Big Data Institute (VinBigdata) has created and made freely available the VinDr-SpineXR, a large-scale X-ray dataset for spinal lesions detection and classification. The VinDr-SpineXR contains 10,469 images from 5,000 studies, manually annotated with 13 types of abnormalities. Each scan was annotated by an expert radiologist.
To the best of our knowledge, the VinDr-SpineXR is the largest dataset to date that provides radiologist’s bounding-box annotations for developing supervised-learning object detection algorithms. We believe that it will serve as a benchmark for accelerating the development and evaluation of new machine learning models for spinal X-ray interpretation.
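As an illustration of how bounding-box annotations of this kind are typically consumed by an object detection pipeline, the following minimal sketch groups boxes by image. The CSV layout, the column names (`image_id`, `lesion_type`, `xmin`, `ymin`, `xmax`, `ymax`), and the label values are assumptions made for the example and are not taken from the dataset documentation; check the files distributed with the dataset for the actual schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotation rows. Column names and label values are placeholders
# chosen for this sketch, not the dataset's documented schema.
SAMPLE_CSV = """image_id,lesion_type,xmin,ymin,xmax,ymax
img_001,lesion_a,100,200,180,260
img_001,lesion_b,90,300,210,340
img_002,none,,,,
"""


def load_boxes(csv_text):
    """Group bounding boxes by image; rows without coordinates add no box."""
    boxes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Touching the key registers the image even when it has no boxes.
        image_boxes = boxes[row["image_id"]]
        coords = [row[key] for key in ("xmin", "ymin", "xmax", "ymax")]
        if all(coords):  # skip rows whose coordinate fields are empty
            image_boxes.append((row["lesion_type"], *map(float, coords)))
    return dict(boxes)


boxes_by_image = load_boxes(SAMPLE_CSV)
for image_id, image_boxes in boxes_by_image.items():
    print(image_id, len(image_boxes), "box(es)")
```

Keeping images without boxes in the result, rather than dropping them, lets a training loop treat them as negative examples if desired.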
Table 1. Overview of publicly available MSK image datasets.
| Dataset | Year | Study type | Label | Number of images |
|---|---|---|---|---|
| Digital Hand Atlas [1] | 2007 | Left hand | Bone age | 1,390 |
| Osteoarthritis Initiative [2] | 2013 | Knee | K&L Grade | 8,892 |
| MURA [3] | 2017 | Upper body | Abnormalities | 40,561 |
| RSNA Pediatric Bone Age [4] | 2019 | Hand | Bone age | 14,236 |
| Kuok et al. [5] | 2018 | Spine | Lumbar vertebrae mask | 60 |
| Kim et al. [6] | 2020 | Spine | Spine position | 797 |
| Ours | 2021 | Spine | Multiple abnormalities | 10,469 |
Figure 1. Examples of spine X-ray scans with radiologist’s annotations. Abnormal findings (local labels) marked by radiologists are plotted on the original images for visualization purposes.
Dataset Statistics
Table 2. Characteristics of patients in the training and test datasets.
Download
The full version of the VinDr-SpineXR can be downloaded from PhysioNet. Note that only credentialed users who sign the specified Data Use Agreement (DUA) can access the files.
Visualization
The images and annotations of the dataset can be visualized via VinDr Laboratory – our hub for all public datasets.
Citation
For any publication that explores this resource, the authors must cite this original paper:
Hieu T. Nguyen, Hieu H. Pham, Nghia T. Nguyen, Ha Q. Nguyen, Thang Q. Huynh, Minh Dao, and Van Vu, “VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs,” in Proceedings of the 2021 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021).
[1] Gertych, A., Zhang, A., Sayre, J., Pospiech-Kurkowska, S., Huang, H.: Bone age assessment of children using a digital hand atlas. Computerized Medical Imaging and Graphics 31(4-5), 322–331 (2007).
[2] Osteoarthritis initiative: A multi-center observational study of men and women. https://oai.epi-ucsf.org/datarelease/, accessed: 2021-02-22.
[3] Rajpurkar, P., Irvin, J., Bagul, A., Ding, D., Duan, T., Mehta, H., Yang, B., Zhu, K., Laird, D., Ball, R.L., et al.: MURA: Large dataset for abnormality detection in musculoskeletal radiographs. arXiv preprint arXiv:1712.06957 (2017).
[4] Halabi, S.S., Prevedello, L.M., Kalpathy-Cramer, J., Mamonov, A.B., Bilbily, A., Cicero, M., Pan, I., Pereira, L.A., Sousa, R.T., Abdala, N., et al.: The RSNA pediatric bone age machine learning challenge. Radiology 290(2), 498–503 (2019).
[5] Kuok, C.P., Fu, M.J., Lin, C.J., Horng, M.H., Sun, Y.N.: Vertebrae segmentation from X-ray images using convolutional neural network. In: International Conference on Information Hiding and Image Processing (IHIP). pp. 57–61 (2018).