Callbacks

Writers

`eva.core.callbacks.writers.EmbeddingsWriter`

Bases: BasePredictionWriter

Callback for writing generated embeddings to disk.

This callback writes the embedding files in a separate process to avoid blocking the main process where the model forward pass is executed.
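The idea of moving disk writes off the main process can be sketched with the standard library alone. This is a minimal illustration of the pattern, not eva's implementation: `_write_worker` and `write_in_background` are hypothetical names, the `None` sentinel is an assumed shutdown convention, and the `fork` start method is chosen only to keep the example short (it is POSIX-only).

```python
import multiprocessing
import os


def _write_worker(queue, output_dir):
    """Hypothetical worker: writes queued items to disk until a None sentinel arrives."""
    while True:
        item = queue.get()
        if item is None:  # assumed shutdown convention: None means "no more work"
            break
        name, payload = item
        with open(os.path.join(output_dir, name), "wb") as handle:
            handle.write(payload)


def write_in_background(items, output_dir):
    """Hypothetical helper: sends (filename, bytes) pairs to a separate writer process."""
    # "fork" is an assumption made for brevity; it is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    process = ctx.Process(target=_write_worker, args=(queue, output_dir))
    process.start()
    for item in items:
        # The producer only enqueues; the file write itself runs in the worker.
        queue.put(item)
    queue.put(None)  # tell the worker to stop once the queue is drained
    process.join()
    return process.exitcode
```

In a real callback the producer side would be the prediction loop, which enqueues each batch of embeddings and continues with the next forward pass while the worker handles the file writes.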

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `output_dir` | `str` | The directory where the embeddings will be saved. | required |
| `backbone` | `Module \| None` | A model to be used as feature extractor. If `None`, the input batch is expected to already contain the features. | `None` |
| `dataloader_idx_map` | `Dict[int, str] \| None` | A dictionary mapping dataloader indices to their respective names (e.g. train, val, test). | `None` |
| `group_key` | `str \| None` | The metadata key to group the embeddings by. If specified, the key must be present in the metadata of the input batch, and the embedding files are saved in subdirectories named after the group_key. | `None` |
| `overwrite` | `bool` | Whether to overwrite the output directory. | `True` |
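To make the `group_key` behaviour concrete, here is a small sketch of how a grouped output path might be built. It is an assumption-laden illustration, not eva's code: `embedding_save_path` is a hypothetical helper, and it assumes the subdirectory is named after the metadata value stored under `group_key`, which the description above does not spell out.

```python
import os
from typing import Any, Dict, Optional


def embedding_save_path(
    output_dir: str,
    filename: str,
    metadata: Dict[str, Any],
    group_key: Optional[str] = None,
) -> str:
    """Hypothetical helper: returns the path an embedding file would be saved to."""
    if group_key is None:
        # No grouping requested: save directly under the output directory.
        return os.path.join(output_dir, filename)
    if group_key not in metadata:
        # Mirrors the documented requirement that the key is present in the metadata.
        raise KeyError(f"group_key '{group_key}' not found in the batch metadata")
    # Assumption: the subdirectory takes the metadata value stored under group_key.
    return os.path.join(output_dir, str(metadata[group_key]), filename)
```

For example, under these assumptions, `embedding_save_path("out", "patch_0.pt", {"wsi_id": "slide_1"}, group_key="wsi_id")` places the file in a `slide_1` subdirectory of `out`, while omitting `group_key` leaves it directly in `out`. The names `wsi_id`, `slide_1`, and `patch_0.pt` are illustrative only.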

Source code in src/eva/core/callbacks/writers/embeddings.py

def __init__(
    self,
    output_dir: str,
    backbone: nn.Module | None = None,
    dataloader_idx_map: Dict[int, str] | None = None,
    group_key: str | None = None,
    overwrite: bool = True,
) -> None:
    """Initializes a new EmbeddingsWriter instance.

    This callback writes the embedding files in a separate process to avoid blocking the
    main process where the model forward pass is executed.

    Args:
        output_dir: The directory where the embeddings will be saved.
        backbone: A model to be used as feature extractor. If `None`,
            it will be expected that the input batch returns the features directly.
        dataloader_idx_map: A dictionary mapping dataloader indices to their respective
            names (e.g. train, val, test).
        group_key: The metadata key to group the embeddings by. If specified, the
            embedding files will be saved in subdirectories named after the group_key.
            If specified, the key must be present in the metadata of the input batch.
        overwrite: Whether to overwrite the output directory. Defaults to True.
    """
    super().__init__(write_interval="batch")

    self._output_dir = output_dir
    self._backbone = backbone
    self._dataloader_idx_map = dataloader_idx_map or {}
    self._group_key = group_key
    self._overwrite = overwrite

    self._write_queue: multiprocessing.Queue
    self._write_process: eva_multiprocessing.Process