Oncology FM Evaluation Framework by kaiko.ai
eva currently supports performance evaluation of vision Foundation Models ("FMs") and supervised machine learning models on whole-slide image (WSI) tasks, at both patch and slide level, as well as on radiology image tasks.
With eva we provide the open-source community with an easy-to-use framework that follows industry best practices to deliver a robust, reproducible and fair evaluation benchmark across FMs of different sizes and architectures.
Support for additional modalities and tasks will be added soon.
Use cases
1. Evaluate your own FMs on public benchmark datasets
With a specified FM as input, you can run eva on several publicly available datasets & tasks. One evaluation run will download (if supported) and preprocess the relevant data, compute embeddings, fit and evaluate a downstream head and report the mean and standard deviation of the relevant performance metrics.
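The steps of such a run can be sketched in plain Python. This is only an illustration of the general pattern (frozen model, embeddings, small downstream head, metrics aggregated over repeated runs), not eva's implementation: `frozen_backbone`, the synthetic data, the nearest-centroid head and the choice of five seeds are all assumptions made for the example.

```python
import random
import statistics


def frozen_backbone(image):
    """Hypothetical stand-in for a frozen FM: maps an image to a 2-d embedding."""
    return [sum(image) / len(image), max(image) - min(image)]


def make_toy_dataset(rng, n_per_class=50):
    """Synthetic two-class 'images' (lists of floats); class 1 is brighter on average."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            image = [rng.gauss(label * 2.0, 1.0) for _ in range(16)]
            data.append((image, label))
    rng.shuffle(data)
    return data


def fit_centroid_head(embeddings, labels):
    """Fit a nearest-centroid head: one mean embedding per class."""
    centroids = {}
    for cls in set(labels):
        rows = [e for e, y in zip(embeddings, labels) if y == cls]
        centroids[cls] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, embedding):
    """Return the class whose centroid is closest to the embedding."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(embedding, centroid))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def run_once(seed):
    """One evaluation run: embed, fit the head on the train split, score on the test split."""
    rng = random.Random(seed)
    data = make_toy_dataset(rng)
    split = int(0.8 * len(data))
    train, test = data[:split], data[split:]
    train_embeddings = [frozen_backbone(x) for x, _ in train]
    head = fit_centroid_head(train_embeddings, [y for _, y in train])
    correct = sum(predict(head, frozen_backbone(x)) == y for x, y in test)
    return correct / len(test)


# Repeat the run with different seeds and aggregate the metric.
accuracies = [run_once(seed) for seed in range(5)]
mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```

The backbone is never updated here; only the small head is fitted, which is what keeps the comparison between FMs of different sizes cheap and consistent.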
Supported datasets & tasks include:
WSI patch-level pathology datasets
- Patch Camelyon: binary breast cancer classification
- BACH: multiclass breast cancer classification
- CRC: multiclass colorectal cancer classification
- MHIST: binary colorectal polyp cancer classification
- MoNuSAC: multi-organ nuclei segmentation
- CoNSeP: segmentation of colorectal nuclei and phenotypes
WSI slide-level pathology datasets
- Camelyon16: binary breast cancer classification
- PANDA: multiclass prostate cancer classification
Radiology datasets
- TotalSegmentator: segmentation of anatomical structures in CT scans
- LiTS: segmentation of the liver and tumors in CT scans
To evaluate FMs, eva supports several model formats, including models trained with PyTorch, models available on HuggingFace, and ONNX models. For other formats, you can implement a custom wrapper.
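As a rough idea of what a custom wrapper does, the sketch below adapts an arbitrary per-image function to a single common interface. The `EmbeddingModel` protocol, the `CallableWrapper` class and the `embed` method are hypothetical names chosen for this example; they are not eva's wrapper API, for which the User Guide is the reference.

```python
from typing import Callable, Protocol, Sequence

Vector = list[float]


class EmbeddingModel(Protocol):
    """Hypothetical common interface: a batch of images in, one embedding per image out."""

    def embed(self, images: Sequence[Sequence[float]]) -> list[Vector]: ...


class CallableWrapper:
    """Adapts any per-image function to the hypothetical EmbeddingModel interface."""

    def __init__(self, forward_fn: Callable[[Sequence[float]], Sequence[float]]):
        self._forward_fn = forward_fn

    def embed(self, images: Sequence[Sequence[float]]) -> list[Vector]:
        # Call the wrapped model once per image and normalise the output type.
        return [[float(v) for v in self._forward_fn(image)] for image in images]


def mean_and_max(pixels: Sequence[float]) -> list[float]:
    """Toy 'model' in an unsupported format: returns two summary statistics."""
    return [sum(pixels) / len(pixels), max(pixels)]


model: EmbeddingModel = CallableWrapper(mean_and_max)
features = model.embed([[0.0, 1.0, 2.0], [4.0, 4.0, 4.0]])
```

Downstream code then only depends on the shared interface, so the same evaluation logic can be reused whatever the original model format was.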
2. Evaluate ML models on your own dataset & task
If you have your own labeled dataset, all you need to do is implement a dataset class tailored to your source data. Start from one of the dataset classes we provide out of the box, adapt it to your data, and run eva to see how different FMs perform on your task.
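A dataset class of this kind usually boils down to two methods, `__len__` and `__getitem__`, as in a PyTorch map-style dataset. The sketch below shows that shape only; it does not derive from eva's base classes, and the class name `CsvPatchDataset`, the manifest columns `path` and `label`, and the file names are all made up for the example.

```python
import csv
import io


class CsvPatchDataset:
    """Hypothetical map-style dataset backed by a CSV manifest (one row per sample)."""

    def __init__(self, manifest_file, class_names):
        # Map class names to integer targets in the order given.
        self._class_to_index = {name: i for i, name in enumerate(class_names)}
        reader = csv.DictReader(manifest_file)
        self._samples = [
            (row["path"], self._class_to_index[row["label"]]) for row in reader
        ]

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        path, target = self._samples[index]
        # A real implementation would load and transform the image at `path` here.
        return path, target


# In-memory manifest so the example runs without any files on disk.
manifest = io.StringIO("path,label\npatches/a.png,benign\npatches/b.png,tumor\n")
dataset = CsvPatchDataset(manifest, class_names=["benign", "tumor"])
```

Keeping the label mapping inside the class means the rest of the pipeline only ever sees integer targets, whatever naming scheme the source data uses.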
Evaluation results
Check out our Leaderboards to inspect evaluation results of publicly available FMs.
License
eva is distributed under the terms of the Apache-2.0 license.
Next steps
Check out the User Guide to get started with eva.