hirl@diu.edu.bd +8801672580748

Available Datasets

Explore our curated datasets for research and development.

DentIRO: A High-Quality Multi-Class Single-Tooth Intraoral Radiograph Dataset for Automated Dental Diagnosis
Dataset resource

Intraoral periapical radiographs are routinely used in clinical dentistry to evaluate tooth health, restorations, and endodontic treatments. Despite growing interest in automated dental image analysis, publicly available datasets focusing on single-tooth intraoral radiographs remain limited. DentIRO is a publicly released, de-identified dataset consisting of 5,300 tooth-centered intraoral radiographs collected from 3,243 patients at two dental clinics in Bangladesh between 2025 and 2026. Each image corresponds to a single diagnostic tooth and is categorized into one of four clinically relevant classes: Healthy, Caries, Crowned, and Root Canal. Diagnostic labels were assigned by experienced dental practitioners through independent review followed by consensus resolution. Along with the radiographs, DentIRO provides structured metadata describing patient demographics, acquisition site, and diagnostic labels. The dataset is intended to support reproducible research and benchmarking of mac

BDCXR-3257: A Comprehensive Chest Radiography Dataset with Multi-Etiological Pneumonia Labels from a Clinical Setting
Dataset resource

Pneumonia remains a leading cause of illness and death among children worldwide, particularly in low- and middle-income countries where access to expert radiological interpretation is limited. We introduce the BDCXR-3257 dataset, a chest X-ray dataset collected from Dr. M.R. Khan Shishu Hospital & Institute of Child Health, Dhaka, Bangladesh. The dataset contains 3,257 clinically verified chest X-ray images categorized into three classes: Normal, Bacterial Pneumonia, and Viral Pneumonia. All images were independently annotated by two expert clinicians using a duplicate labelling process, achieving near-perfect inter-annotator agreement (Fleiss’ κ > 0.95). To improve image quality and consistency, a preprocessing pipeline including text removal, Brightness Preserving Dynamic Fuzzy Histogram Equalization (BPDFHE), and manual cropping was applied. This dataset provides a clinically relevant and demographically diverse resource for developing, benchmarking, and validating AI-based diagnost
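The agreement statistic quoted in this description can be computed from a table of per-image rating counts. The following is a minimal sketch of the standard Fleiss' κ formula; the function name `fleiss_kappa` and the toy ratings are illustrative assumptions and are not taken from BDCXR-3257.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    # Agreement expected by chance (undefined if only one category is ever used).
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: 10 images, 2 raters, 3 classes
# (Normal, Bacterial Pneumonia, Viral Pneumonia), one disagreement.
toy_counts = (
    [[2, 0, 0]] * 4 + [[0, 2, 0]] * 3 + [[0, 0, 2]] * 2 + [[1, 1, 0]]
)
print(round(fleiss_kappa(toy_counts), 3))
```

With a single disagreement among ten toy images, κ is already well below 1, which gives a sense of how little disagreement a value above 0.95 allows.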

KUB-StoneX: Annotated Kidney-Ureter-Bladder X-ray Dataset with Expert Stone Segmentations and Radiomic Features
Dataset resource

The KUB-StoneX dataset is a publicly available, de-identified clinical imaging resource designed to support reproducible research in urinary stone detection, localization, segmentation, and radiomic analysis using kidney–ureter–bladder (KUB) X-rays. It comprises 2,570 anteroposterior KUB radiographs from 1,703 patients, retrospectively collected from two tertiary care hospitals in Bangladesh. Each image includes expert-verified pixel-level stone segmentation masks, stone-focused region-of-interest (ROI) patches, ROI metadata, and pre-extracted radiomic features. Data are organized in a standardized directory structure with consistent file naming, ensuring direct correspondence between images, annotations, and features. KUB-StoneX aims to facilitate the development, evaluation, and benchmarking of machine learning, deep learning, and hybrid radiomics models for low-cost X-ray–based urinary stone diagnosis, especially in resource-limited clinical settings.
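Because the 2,570 radiographs come from 1,703 patients, some patients contribute more than one image, and benchmarks usually keep all images from one patient in the same split to avoid leakage. The sketch below shows one way to do that with a deterministic hash of the patient identifier. The `(image_name, patient_id)` record layout, the identifiers, and the function names are hypothetical; the actual KUB-StoneX metadata fields are not described here.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def assign_split(patient_id: str, test_fraction: float = 0.2) -> str:
    """Map a patient ID to 'train' or 'test' deterministically."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_by_patient(
    records: Iterable[Tuple[str, str]], test_fraction: float = 0.2
) -> Dict[str, List[str]]:
    """Group image names into splits so that no patient spans both."""
    splits: Dict[str, List[str]] = {"train": [], "test": []}
    for image_name, patient_id in records:
        splits[assign_split(patient_id, test_fraction)].append(image_name)
    return splits


# Hypothetical records: 200 images spread over 50 made-up patient IDs.
records = [(f"img_{i:04d}.png", f"P{i % 50:04d}") for i in range(200)]
splits = split_by_patient(records)
print(len(splits["train"]), len(splits["test"]))
```

Hashing the identifier rather than shuffling means the assignment stays the same across runs and machines. The realized test share only approximates `test_fraction`, since it depends on how the hashes fall.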

OrthoFrac-XR: A Clinically Validated Multimodal High-Resolution X-ray Imaging Dataset for Bone Fracture Detection and Localization
Dataset resource

Bone fractures are a common orthopedic problem that requires prompt and accurate diagnosis to ensure proper treatment, healing, and the prevention of long-term consequences. Delayed or missed diagnoses can result in deformity, chronic discomfort, and a lower quality of life. Although deep learning (DL) approaches have demonstrated promising performance in automated fracture detection from X-ray images, their reliability is sometimes limited by a lack of clinically verified, high-quality, multimodal datasets. To close this gap, we present OrthoFrac-XR, a clinically validated, high-resolution, multimodal X-ray dataset designed to support studies in bone fracture detection, classification, and clinical analysis. The dataset includes 1,493 fully classified orthopedic X-rays collected from several Bangladeshi institutions, representing roughly 1,300 distinct patients. Distal fracture (314 images), proximal fracture (254 images), post-fracture (349 images), and non-fractur

FD3611: A Multi-Class Color Fundus Image Dataset for Retinal Disease Classification
Dataset resource

The FD3611 dataset is a clinically validated collection of 3,611 color fundus photographs designed to support machine learning and deep learning research in automated retinal disease diagnosis. Collected from the National Institute of Ophthalmology and Hospital (NIOH), Dhaka, Bangladesh, the dataset comprises retinal images obtained during routine ophthalmic examinations using a smartphone-assisted fundus imaging setup, enabling low-cost yet diagnostically meaningful image acquisition. Each image was manually reviewed and annotated by expert ophthalmologists to ensure clinical reliability. The dataset preserves real-world variability in illumination, blur, and contrast, making it a valuable benchmark for developing robust artificial intelligence systems for retinal disease classification, screening, and decision support, particularly in resource-limited clinical environments. The class-wise distribution is as follows: Diabetic Retinopathy (349), Media Hazy (464), Myopic Retinopathy (29

TriHemo-MCV: A Hemoglobin-MCV-Based Tri-Class Dataset for Anemia and Compensated Microcytosis
Dataset resource

The dataset contains 5,037 patient records retrospectively collected from four tertiary hospitals in Bangladesh, ensuring demographic and instrumental diversity. Each record includes Complete Blood Count (CBC) parameters such as hemoglobin (HGB), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red blood cell count (RBC), hematocrit (HTC), and red cell distribution width (RDW-CV), along with three structured lifestyle questionnaire variables. Samples were categorized into three classes: Non-Anemic (1,664), Anemic (1,732), and Compensated Microcytosis (1,641). Labels were assigned based on predefined hematologic

SSMCH-ECG: A Validated Dataset of 12-Lead Paper-Based ECG Images for Cardiac Abnormality Detection
Dataset resource

The SSMCH-ECG dataset is a clinically validated collection of 849 12-lead paper-based ECG images designed to bridge the gap between traditional paper-based diagnostics and modern deep learning applications. Collected from Shaheed Suhrawardy Medical College and Hospital (SSMCH) in Dhaka, Bangladesh, the dataset comprises three distinct clinical classes: Normal (434), Abnormal (344), and Myocardial Infarction (MI) (71). The ground-truth labels were established in collaboration with a clinical specialist, ensuring high reliability in cardiac abnormality detection. This dataset serves as a vital resource for researchers developing algorithms to digitize and classify paper-based ECGs, particularly in resource-limited settings where digital ECG infrastructure may be lacking.
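The class counts above are noticeably imbalanced, with Myocardial Infarction making up 71 of 849 images. One common way to account for this during training is inverse-frequency class weighting. The sketch below applies the usual "balanced" heuristic, weight = N / (K × n_c), to the published counts; the function name is illustrative, and nothing here is claimed to be the dataset authors' own method.

```python
from typing import Dict

# Class counts as stated in the dataset description.
CLASS_COUNTS: Dict[str, int] = {
    "Normal": 434,
    "Abnormal": 344,
    "Myocardial Infarction": 71,
}


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


weights = balanced_class_weights(CLASS_COUNTS)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
# Roughly 0.65 (Normal), 0.82 (Abnormal), 3.99 (Myocardial Infarction)
```

With these weights each class contributes equally to a weighted loss in aggregate, so the rare MI class is not swamped by the two larger classes.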

A Dataset on the Mental Health of Undergraduate Students in Bangladesh
Dataset resource

This mental health dataset was collected through a survey of undergraduate students at Daffodil International University. A Google Forms survey gathered responses from 500 participants, aged 18 to 28, between May 18 and June 15, 2024. There were 143 female and 357 male respondents. Four internationally validated psychometric measures were used to assess mental health, and a standardized questionnaire was used to gather general demographic data. The 20-item Beck Depression Inventory (BDI-II) and the 9-item Patient Health Questionnaire (PHQ-9) were used to assess depressive symptoms. Loneliness was measured with the 8-item UCLA Loneliness Scale (Version 3), alongside the 21-item Centre for Epidemiological Studies-Depression Scale (CES-D). For future research in psychometric evaluation utilizing the previously indicated scales, the dataset, which contains descriptive statistics on the psychometric variables for the respondents, will be an invaluable resour

A dataset of color fundus images for the detection and classification of eye diseases
Dataset resource

Worldwide, eye ailments are recognized as significant contributors to nonfatal disabling conditions. In Bangladesh, 1.5% of adults suffer from blindness, while 21.6% experience low vision. Therefore, eye disease detection is crucial for preserving vision, preventing blindness, and maintaining overall health. Early detection allows for prompt intervention and treatment, preventing irreversible damage and preserving quality of life. By analyzing the dataset, researchers will be able to identify trends, develop algorithms for diagnosis, assess treatment effectiveness, and inform preventive measures.