A wide range of diagnostic tasks can benefit from an automatic system able to segment and label individual ribs on chest X-ray (CXR) images. Recently, deep learning (DL) has shown superior performance to other methods in the segmentation and labeling of individual ribs [1]. However, developing DL algorithms for this task requires pixel-level annotations for each rib structure. To the best of our knowledge, no such benchmark dataset or annotation protocol exists. Hence, we introduce a new benchmark dataset, namely VinDr-RibCXR, for the automatic segmentation and labeling of individual ribs from CXR scans. VinDr-RibCXR contains 245 CXRs with corresponding ground-truth annotations provided by human experts.
The raw images in DICOM format were sourced from the VinDr-CXR dataset [2], in which all scans have been de-identified to protect patient privacy. Each image was assigned to an expert, who manually segmented and annotated each of the 20 ribs, denoted L1–L10 (left ribs) and R1–R10 (right ribs). The rib masks (see Figure 1) were then stored in a JSON file that can later be used for training instance segmentation models.
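To illustrate how per-rib polygon annotations stored in JSON can be turned into binary masks for training, the sketch below parses a small example and rasterizes each polygon. The schema (an `annotations` list with `label` and `polygon` fields) and the sample values are assumptions for illustration only, not the official VinDr-RibCXR format; it uses only the standard library.

```python
import json

# Hypothetical annotation schema (assumed, NOT the official VinDr-RibCXR format):
# each annotation maps a rib label ("L1".."L10", "R1".."R10") to a polygon,
# given as a list of [x, y] vertices in image coordinates.
sample = json.dumps({
    "image_id": "example_image",
    "annotations": [
        {"label": "L1", "polygon": [[2, 1], [8, 1], [8, 4], [2, 4]]},
    ],
})

def point_in_polygon(x, y, poly):
    """Even-odd ray-casting test: is point (x, y) inside the polygon?"""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Count crossings of a horizontal ray extending to the right.
        if (y1 > y) != (y2 > y):
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def masks_from_json(doc, width, height):
    """Rasterize each rib polygon into a binary mask (list of rows)."""
    record = json.loads(doc)
    masks = {}
    for ann in record["annotations"]:
        poly = ann["polygon"]
        # Sample each pixel at its center (x + 0.5, y + 0.5).
        mask = [[1 if point_in_polygon(x + 0.5, y + 0.5, poly) else 0
                 for x in range(width)] for y in range(height)]
        masks[ann["label"]] = mask
    return masks

masks = masks_from_json(sample, width=10, height=6)
```

In practice one would rasterize with an imaging library (e.g. polygon fill) into arrays matching the DICOM image size; the pure-Python loop above only shows the mapping from JSON annotation to per-rib mask.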
To the best of our knowledge, VinDr-RibCXR is the first publicly released dataset that provides segmentation annotations for individual ribs, covering both anterior and posterior ribs. To support the development and evaluation of segmentation algorithms, we divided the dataset into a training set of 196 images and a validation set of 49 images.
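If the official split files are unavailable, a fixed-seed shuffle is one way to reproduce a deterministic 196/49 partition of the 245 images. This is only a hedged sketch with dummy image identifiers; the dataset's published split, where provided, should take precedence.

```python
import random

# Dummy identifiers standing in for the 245 CXR image IDs.
image_ids = [f"img_{i:03d}" for i in range(245)]

# Fixed seed so the split is reproducible across runs (seed value is arbitrary).
rng = random.Random(42)
shuffled = image_ids[:]
rng.shuffle(shuffled)

train_ids = sorted(shuffled[:196])  # 196 training images
val_ids = sorted(shuffled[196:])    # 49 validation images
```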