Replicate evaluations

To reproduce the evaluation results presented here, run eva with the settings below.

Make sure to replace <task> in the commands below with bach, crc, mhist or patch_camelyon.

Note that before running the commands below you need to download the data. BACH, CRC and PatchCamelyon can be downloaded automatically by setting the argument download: true in their respective config files. For MHIST, you need to download the data manually by following the instructions provided here.

DINO ViT-S16 (random weights)

Evaluating the backbone with randomly initialized weights serves as a baseline to compare the pretrained FMs to an FM that produces embeddings without any prior learning on image tasks. To evaluate, run:

PRETRAINED=false \
EMBEDDINGS_ROOT="./data/embeddings/dino_vits16_random" \
eva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml

DINO ViT-S16 (ImageNet)

The next baseline model uses a pretrained ViT-S16 backbone with ImageNet weights. To evaluate, run:

EMBEDDINGS_ROOT="./data/embeddings/dino_vits16_imagenet" \
eva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml
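
To cover all four tasks without editing the command by hand, the <task> placeholder can be filled in by a loop. The following is a minimal sketch, not part of eva: print_eva_command is a hypothetical helper that only prints the command for each task (a dry run), and it assumes that the same EMBEDDINGS_ROOT can be shared across tasks, as the commands on this page suggest.

```shell
#!/usr/bin/env bash
# Sketch: print the ImageNet-baseline command for every task (dry run).
# print_eva_command is a hypothetical helper, not part of eva.

TASKS="bach crc mhist patch_camelyon"
EMBEDDINGS_ROOT="./data/embeddings/dino_vits16_imagenet"

print_eva_command() {
  # $1: task name, substituted for <task> in the config path
  printf 'EMBEDDINGS_ROOT="%s" eva predict_fit --config configs/vision/dino_vit/offline/%s.yaml\n' \
    "$EMBEDDINGS_ROOT" "$1"
}

# Emit one command line per task; nothing is executed here.
for task in $TASKS; do
  print_eva_command "$task"
done
```

Piping the output to sh would execute the printed commands one after another, provided the data for each task has already been downloaded.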

DINO ViT-B8 (ImageNet)

To evaluate performance on the larger ViT-B8 backbone pretrained on ImageNet, run:

EMBEDDINGS_ROOT="./data/embeddings/dino_vitb8_imagenet" \
DINO_BACKBONE=dino_vitb8 \
IN_FEATURES=768 \
eva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml
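
The IN_FEATURES values used on this page (768 here, 1024 for the ViT-L backbones) coincide with the embedding width of the corresponding ViT size. The lookup below is a small sketch of that correspondence. in_features_for is a hypothetical helper, and two things are assumptions not stated on this page: that IN_FEATURES is the input dimension of the classification head, and that 384 (the ViT-S embedding width) is what the configs use when IN_FEATURES is left unset.

```shell
#!/usr/bin/env bash
# Sketch: embedding width per backbone name used on this page.
# in_features_for is a hypothetical helper, not part of eva.

in_features_for() {
  case "$1" in
    dino_vits16)       echo 384 ;;   # ViT-S; assumed to be the config default
    dino_vitb8)        echo 768 ;;   # ViT-B
    dinov2_vitl14_reg) echo 1024 ;;  # ViT-L
    *) echo "unknown backbone: $1" >&2; return 1 ;;
  esac
}

# Example: derive IN_FEATURES from the chosen backbone instead of typing it.
DINO_BACKBONE=dino_vitb8
IN_FEATURES="$(in_features_for "$DINO_BACKBONE")"
echo "DINO_BACKBONE=$DINO_BACKBONE IN_FEATURES=$IN_FEATURES"
```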

DINOv2 ViT-L14 (ImageNet)

To evaluate performance on the DINOv2 ViT-L14 backbone pretrained on ImageNet, run:

PRETRAINED=true \
EMBEDDINGS_ROOT="./data/embeddings/dinov2_vitl14_kaiko" \
REPO_OR_DIR=facebookresearch/dinov2:main \
DINO_BACKBONE=dinov2_vitl14_reg \
FORCE_RELOAD=true \
IN_FEATURES=1024 \
eva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml

Lunit - DINO ViT-S16 (TCGA)

Lunit released the weights for a DINO ViT-S16 backbone pretrained on TCGA data on GitHub. To evaluate, run:

PRETRAINED=false \
EMBEDDINGS_ROOT="./data/embeddings/dino_vits16_lunit" \
CHECKPOINT_PATH="https://github.com/lunit-io/benchmark-ssl-pathology/releases/download/pretrained-weights/dino_vit_small_patch16_ep200.torch" \
NORMALIZE_MEAN=[0.70322989,0.53606487,0.66096631] \
NORMALIZE_STD=[0.21716536,0.26081574,0.20723464] \
eva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml

Owkin - iBOT ViT-B16 (TCGA)

Owkin released the weights for "Phikon", an FM trained with iBOT on TCGA data, via HuggingFace. To evaluate, run:

EMBEDDINGS_ROOT="./data/embeddings/dino_vitb16_owkin" \
eva predict_fit --config configs/vision/owkin/phikon/offline/<task>.yaml

Note: since eva provides the config files to evaluate tasks with the Phikon FM in "configs/vision/owkin/phikon/offline", it is not necessary to set the environment variables needed for the runs above.

UNI - DINOv2 ViT-L16 (Mass-100k)

The UNI FM, introduced in [1], is available on HuggingFace. Note that access needs to be requested.

Unlike the other FMs evaluated for our leaderboard, the UNI model is loaded with the vision library timm. To accommodate this, you will need to modify the config files (see also Model Wrappers).

Make a copy of the task-config you'd like to run, and replace the backbone section with:

backbone:
    class_path: eva.models.ModelFromFunction
    init_args:
        path: timm.create_model
        arguments:
            model_name: vit_large_patch16_224
            patch_size: 16
            init_values: 1e-5
            num_classes: 0
            dynamic_img_size: true
        checkpoint_path: <path/to/pytorch_model.bin>

Now evaluate the model by running:

EMBEDDINGS_ROOT="./data/embeddings/dinov2_vitl16_uni" \
IN_FEATURES=1024 \
eva predict_fit --config path/to/<task>.yaml