Skip to content

How to use eva

Before starting to use eva, it's important to get familiar with the different workflows, subcommands and configurations.

eva subcommands

To run an evaluation, we call:

eva <subcommand> --config <path-to-config-file>

The eva interface supports the subcommands: predict, fit and predict_fit.

  • fit: is used to train a decoder for a specific task and subsequently evaluate the performance. This can be done online or offline *
  • predict: is used to compute embeddings for input images with a provided FM-checkpoint. This is the first step of the offline workflow
  • predict_fit: runs predict and fit sequentially. Like the fit-online run, it runs a complete evaluation with images as input.

* online vs. offline workflows

We distinguish between the online and offline workflow:

  • online: This mode uses raw images as input and generates the embeddings using a frozen FM backbone on the fly to train a downstream head network.
  • offline: In this mode, embeddings are pre-computed and stored locally in a first step, and loaded in a 2nd step from disk to train the downstream head network.

The online workflow can be used to quickly run a complete evaluation without saving and tracking embeddings. The offline workflow runs faster (only one FM-backbone forward pass) and is ideal to experiment with different decoders on the same FM-backbone.

Run configurations

Config files

The setup for an eva run is provided in a .yaml config file which is defined with the --config flag.

A config file specifies the setup for the trainer (including callback for the model backbone), the model (setup of the trainable decoder) and data module.

The config files for the datasets and models that eva supports out of the box, you can find on GitHub. We recommend that you inspect some of them to get a better understanding of their structure and content.

Environment variables

To customize runs, without the need of creating custom config-files, you can overwrite the config-parameters listed below by setting them as environment variables.

Type Description
OUTPUT_ROOT str The directory to store logging outputs and evaluation results
EMBEDDINGS_ROOT str The directory to store the computed embeddings
CHECKPOINT_PATH str Path to the FM-checkpoint to be evaluated
IN_FEATURES int The input feature dimension (embedding)
NUM_CLASSES int Number of classes for classification tasks
N_RUNS int Number of fit runs to perform in a session, defaults to 5
MAX_STEPS int Maximum number of training steps (if early stopping is not triggered)
BATCH_SIZE int Batch size for a training step
PREDICT_BATCH_SIZE int Batch size for a predict step
LR_VALUE float Learning rate for training the decoder
MONITOR_METRIC str The metric to monitor for early stopping and final model checkpoint loading
MONITOR_METRIC_MODE str "min" or "max", depending on the MONITOR_METRIC used
REPO_OR_DIR str GitHub repo with format containing model implementation, e.g. "facebookresearch/dino:main"
DINO_BACKBONE str Backbone model architecture if a facebookresearch/dino FM is evaluated
FORCE_RELOAD bool Whether to force a fresh download of the github repo unconditionally
PRETRAINED bool Whether to load FM-backbone weights from a pretrained model