VinDr Lab: Open-source Data Platform for Medical AI
Building high-quality datasets and algorithms with a lean process
Why choose VinDr Lab?
Manage the full medical data cycle at the study level
Control workflow with blind and/or open annotation
Track project progress and status of each task
Customize preset label groups or create a new one
Allow hierarchical labels
Arrange the order in which labels appear to labelers
View DICOM images with full-fledged toolboxes
Annotate with Bounding Box, Polygon, Brush
Elaborate annotations with notes and comments
Re-assign tasks if the results are unsatisfactory
Monitor the distribution of labels in a project
Control versions of exported labels
More than 100 experienced radiologists have collaborated with us to create high-quality datasets across multiple imaging modalities.
Open Source software
VinDr Lab is available under an open-source, commercially permissive software license (MIT). The license does not impose restrictions on the use of the software.
Open Source application
VinDr Lab documentation
Our public demo
(a demo account is provided on the GitHub page)
Our full demo
Please send us a request for access.
Our use cases
VinDr-CXR is a large-scale dataset of chest X-ray (CXR) images that was created with the VinDr Lab platform. It contains more than 18,000 CXR scans collected from two major hospitals in Vietnam. The images were labeled for the presence of 28 different radiographic findings and diagnoses in collaboration with a total of 17 experienced radiologists. To the best of our knowledge, VinDr-CXR is currently the largest chest X-ray dataset with radiologist-generated annotations. The dataset was also used to organize a competition hosted on the Kaggle platform.
Vingroup Big Data Institute (VinBigdata) has created and made freely available VinDr-SpineXR, a large-scale X-ray dataset for spinal lesion detection and classification. VinDr-SpineXR contains 10,469 images from 5,000 studies that were manually annotated with 13 types of abnormalities. Each scan was annotated by an expert radiologist.
To the best of our knowledge, VinDr-SpineXR is currently the largest dataset that provides radiologists' bounding-box annotations for developing supervised object detection algorithms.
VinDr-RibCXR is a dataset for automatic segmentation and labeling of individual ribs from chest X-ray (CXR) scans. It contains 245 CXRs with corresponding ground-truth annotations provided by human experts. Each image was assigned to an expert, who manually segmented and annotated each of the 20 ribs, denoted L1→L10 (left ribs) and R1→R10 (right ribs). The rib masks (see Figure 1) were then stored in a JSON file that can later be used for training instance segmentation models.
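As a rough illustration of the rib naming scheme described above, the following Python sketch builds the 20 rib labels and round-trips one annotation record through JSON. The actual schema of the VinDr-RibCXR JSON file is not described here, so the record layout, the field names (`image_id`, `ribs`, `label`, `polygon`), and the helper `make_record` are all hypothetical.

```python
import json

# The 20 rib identifiers described above: L1..L10 (left) and R1..R10 (right).
RIB_LABELS = [f"{side}{i}" for side in ("L", "R") for i in range(1, 11)]


def make_record(image_id, polygons):
    """Build one annotation record (hypothetical layout, not the real schema).

    `polygons` maps a rib label to a list of [x, y] points outlining that rib.
    """
    unknown = sorted(set(polygons) - set(RIB_LABELS))
    if unknown:
        raise ValueError(f"unknown rib labels: {unknown}")
    return {
        "image_id": image_id,
        # Keep ribs in a fixed L1..L10, R1..R10 order.
        "ribs": [
            {"label": label, "polygon": polygons[label]}
            for label in RIB_LABELS
            if label in polygons
        ],
    }


# Made-up coordinates, for illustration only.
record = make_record(
    "example_cxr",
    {
        "R1": [[200, 40], [230, 45], [228, 60]],
        "L1": [[10, 20], [30, 25], [28, 40]],
    },
)

# Serialize to JSON text and read it back.
restored = json.loads(json.dumps(record))
```

A training pipeline could then convert each polygon into a per-rib binary mask; that step is omitted here because it depends on the image size and the chosen framework.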