BRACS
BReAst Carcinoma Subtyping (BRACS) is a new dataset of hematoxylin and eosin (H&E) stained histopathological images of breast carcinoma.
Whole-slide images (WSIs) of H&E-stained breast tissue were generated with an Aperio AT2 scanner at 0.25 µm/pixel (40× magnification). Regions of Interest (RoIs) are provided for a subset of the WSIs. See example figure below. Both WSIs and RoIs were annotated according to seven classes (N, PB, UDH, FEA, ADH, DCIS, IC) by three expert pathologists of the Complex Structure Pathological Anatomy and Cytopathology of the National Cancer Institute – IRCCS Fondazione Pascale, Naples, Italy.
While the full BRACS dataset contains 547 WSIs collected from 189 patients, the BRACS_RoI subset used in this benchmark contains 4539 extracted RoIs (patches).
Raw data
Key stats
Property | Value |
---|---|
Modality | Vision (WSI patches) |
Task | Multiclass classification (7 classes) |
Cancer type | Breast |
Data size | 52 GB |
Image dimension | variable |
Magnification (μm/px) | 40× (0.25) |
File format | png |
Number of images | 4539 |
Splits
The data source provides predefined train/validation/test splits:
Splits | Train | Validation | Test |
---|---|---|---|
#Samples | 3657 (80.57%) | 312 (6.87%) | 570 (12.56%) |
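As a quick sanity check, the percentages in the table can be recomputed from the sample counts. The snippet below is a minimal sketch; the names `split_sizes` and `split_percentages` are illustrative and not part of any library.

```python
# Sample counts per split, copied from the table above.
split_sizes = {"train": 3657, "val": 312, "test": 570}


def split_percentages(sizes):
    """Return each split's share of the total, in percent, rounded to 2 decimals."""
    total = sum(sizes.values())
    return {name: round(100 * count / total, 2) for name, count in sizes.items()}


total_images = sum(split_sizes.values())
percentages = split_percentages(split_sizes)
```

The three counts add up to the 4539 images listed under Key stats.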
Organization
The BRACS data is organized as follows:
BRACS_RoI
├── train
│ ├── 0_N # 1 folder per class
│ ├── 1_PB
│ ├── ...
├── val
│ ├── 0_N
│ ├── ...
├── test
│ ├── 0_N
│ ├── ...
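The layout above can be indexed with a few lines of Python. This is only a sketch under two assumptions: class folders are named `<index>_<class>` (as in `0_N` and `1_PB`), and images are `.png` files stored directly inside them. The helper `index_bracs_roi` is hypothetical and is not the dataset class used by the benchmark.

```python
from pathlib import Path


def index_bracs_roi(root, split):
    """List (image_path, label_index, class_name) tuples for one split.

    Assumes class folders are named "<index>_<class>", e.g. "0_N".
    """
    split_dir = Path(root) / split
    samples = []
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        index_str, _, class_name = class_dir.name.partition("_")
        if not index_str.isdigit():
            continue  # skip folders that do not follow the assumed naming pattern
        for image_path in sorted(class_dir.glob("*.png")):
            samples.append((image_path, int(index_str), class_name))
    return samples
```

For example, `index_bracs_roi("path/to/BRACS_RoI", "train")` would return one tuple per training image, with the label taken from the numeric prefix of the folder name.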
Download and preprocessing
The BRACS dataset class does not download the data at runtime; the data must be downloaded manually from the official source.
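Because the download is manual, it can help to verify the local copy before running anything on it. The check below is a minimal sketch that only looks for the three split folders shown under Organization; `missing_splits` is a hypothetical helper, not part of the dataset class.

```python
from pathlib import Path

# Split folder names as shown in the directory tree under "Organization".
EXPECTED_SPLITS = ("train", "val", "test")


def missing_splits(root):
    """Return the expected split folders that are absent under `root`."""
    root = Path(root)
    return [split for split in EXPECTED_SPLITS if not (root / split).is_dir()]
```

An empty list means all three split folders were found; anything else points to an incomplete or misplaced download.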