Datasets

VisionDataset

eva.vision.data.datasets.VisionDataset

Bases: Dataset, ABC, Generic[DataSample]

Base dataset class for vision tasks.

filename abstractmethod

Returns the filename of the index'th data sample.

Note that this is the relative file path to the root.

Parameters:

index (int, required): The index of the data-sample to select.

Returns:

str: The filename of the index'th data sample.

Source code in src/eva/vision/data/datasets/vision.py
@abc.abstractmethod
def filename(self, index: int) -> str:
    """Returns the filename of the `index`'th data sample.

    Note that this is the relative file path to the root.

    Args:
        index: The index of the data-sample to select.

    Returns:
        The filename of the `index`'th data sample.
    """

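As a minimal sketch of how a subclass might satisfy this contract, the stand-in below mirrors the abstract `filename` method without importing eva. `SketchVisionDataset` and `FolderDataset` are hypothetical names used only for illustration; they are not part of the library.

```python
import abc
import os
from typing import Generic, List, TypeVar

DataSample = TypeVar("DataSample")


class SketchVisionDataset(abc.ABC, Generic[DataSample]):
    """Hypothetical stand-in for the abstract base; not eva's class."""

    @abc.abstractmethod
    def filename(self, index: int) -> str:
        """Returns the path of the `index`'th sample, relative to the root."""


class FolderDataset(SketchVisionDataset[bytes]):
    """Hypothetical dataset that stores full paths internally."""

    def __init__(self, root: str, paths: List[str]) -> None:
        self._root = root
        self._paths = paths

    def filename(self, index: int) -> str:
        # Report the path relative to the dataset root, as the contract asks.
        return os.path.relpath(self._paths[index], self._root)


dataset = FolderDataset(
    root=os.path.join("data", "bach"),
    paths=[os.path.join("data", "bach", "Benign", "b001.tif")],
)
relative = dataset.filename(0)
```

Because `filename` is abstract, the base class cannot be instantiated until a subclass provides it.
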
Classification datasets

eva.vision.data.datasets.BACH

Bases: ImageClassification

Dataset class for BACH images and corresponding targets.

The dataset is split into train and validation by taking into account the patient IDs to avoid any data leakage.

Parameters:

root (str, required): Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.
split (Literal['train', 'val'] | None, default: None): Dataset split to use. If None, the entire dataset is used.
download (bool, default: False): Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the prepare_data method and if the data does not yet exist on disk.
image_transforms (Callable | None, default: None): A function/transform that takes in an image and returns a transformed version.
target_transforms (Callable | None, default: None): A function/transform that takes in the target and transforms it.

Source code in src/eva/vision/data/datasets/classification/bach.py
def __init__(
    self,
    root: str,
    split: Literal["train", "val"] | None = None,
    download: bool = False,
    image_transforms: Callable | None = None,
    target_transforms: Callable | None = None,
) -> None:
    """Initialize the dataset.

    The dataset is split into train and validation by taking into account
    the patient IDs to avoid any data leakage.

    Args:
        root: Path to the root directory of the dataset. The dataset will
            be downloaded and extracted here, if it does not already exist.
        split: Dataset split to use. If `None`, the entire dataset is used.
        download: Whether to download the data for the specified split.
            Note that the download will be executed only by additionally
            calling the :meth:`prepare_data` method and if the data does
            not yet exist on disk.
        image_transforms: A function/transform that takes in an image
            and returns a transformed version.
        target_transforms: A function/transform that takes in the target
            and transforms it.
    """
    super().__init__(
        image_transforms=image_transforms,
        target_transforms=target_transforms,
    )

    self._root = root
    self._split = split
    self._download = download

    self._samples: List[Tuple[str, int]] = []
    self._indices: List[int] = []
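
The patient-aware split mentioned above can be illustrated with a small sketch. This section does not show how BACH actually derives its split, so the function below is an assumption: it demonstrates the general idea of assigning whole patients, rather than individual samples, to a split. `split_by_patient` and the patient IDs are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def split_by_patient(
    patient_ids: List[str], val_fraction: float = 0.25, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Splits sample indices so each patient lands in exactly one split."""
    # Group sample indices by patient.
    by_patient: Dict[str, List[int]] = {}
    for index, patient in enumerate(patient_ids):
        by_patient.setdefault(patient, []).append(index)

    # Shuffle patients (not samples) and hold out a fraction of them.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])

    train = [i for p in patients if p not in val_patients for i in by_patient[p]]
    val = [i for p in patients if p in val_patients for i in by_patient[p]]
    return sorted(train), sorted(val)


# Hypothetical patient IDs, one per sample.
ids = ["p1", "p1", "p2", "p3", "p3", "p4"]
train_indices, val_indices = split_by_patient(ids)
```

Since no patient contributes samples to both splits, a model cannot score well on validation merely by recognizing a patient it saw during training.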

eva.vision.data.datasets.PatchCamelyon

Bases: ImageClassification

Dataset class for PatchCamelyon images and corresponding targets.

Parameters:

root (str, required): The path to the dataset root. This path should contain the uncompressed h5 files and the metadata.
split (Literal['train', 'val', 'test'], required): The dataset split for training, validation, or testing.
download (bool, default: False): Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the prepare_data method.
image_transforms (Callable | None, default: None): A function/transform that takes in an image and returns a transformed version.
target_transforms (Callable | None, default: None): A function/transform that takes in the target and transforms it.

Source code in src/eva/vision/data/datasets/classification/patch_camelyon.py
def __init__(
    self,
    root: str,
    split: Literal["train", "val", "test"],
    download: bool = False,
    image_transforms: Callable | None = None,
    target_transforms: Callable | None = None,
) -> None:
    """Initializes the dataset.

    Args:
        root: The path to the dataset root. This path should contain
            the uncompressed h5 files and the metadata.
        split: The dataset split for training, validation, or testing.
        download: Whether to download the data for the specified split.
            Note that the download will be executed only by additionally
            calling the :meth:`prepare_data` method.
        image_transforms: A function/transform that takes in an image
            and returns a transformed version.
        target_transforms: A function/transform that takes in the target
            and transforms it.
    """
    super().__init__(
        image_transforms=image_transforms,
        target_transforms=target_transforms,
    )

    self._root = root
    self._split = split
    self._download = download
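
A `Literal` annotation such as the one on `split` is only checked by static type checkers, not at runtime. Whether the dataset validates the value itself is not shown in this section; the sketch below is one possible way a caller could guard against a misspelled split. `Split` and `check_split` are hypothetical helpers.

```python
from typing import Literal, get_args

# Mirrors the annotation in the constructor signature above.
Split = Literal["train", "val", "test"]


def check_split(split: str) -> str:
    """Raises if `split` is not one of the values listed in the annotation."""
    valid = get_args(Split)
    if split not in valid:
        raise ValueError(f"Invalid split '{split}'. Expected one of {valid}.")
    return split


checked = check_split("val")
```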

eva.vision.data.datasets.TotalSegmentatorClassification

Bases: ImageClassification

TotalSegmentator multi-label classification dataset.

Parameters:

root (str, required): Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.
split (Literal['train', 'val'] | None, required): Dataset split to use. If None, the entire dataset is used.
version (Literal['small', 'full'], default: 'small'): The version of the dataset to initialize.
download (bool, default: False): Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the prepare_data method and if the data does not exist yet on disk.
image_transforms (Callable | None, default: None): A function/transform that takes in an image and returns a transformed version.
target_transforms (Callable | None, default: None): A function/transform that takes in the target and transforms it.

Source code in src/eva/vision/data/datasets/classification/total_segmentator.py
def __init__(
    self,
    root: str,
    split: Literal["train", "val"] | None,
    version: Literal["small", "full"] = "small",
    download: bool = False,
    image_transforms: Callable | None = None,
    target_transforms: Callable | None = None,
) -> None:
    """Initialize dataset.

    Args:
        root: Path to the root directory of the dataset. The dataset will
            be downloaded and extracted here, if it does not already exist.
        split: Dataset split to use. If None, the entire dataset is used.
        version: The version of the dataset to initialize.
        download: Whether to download the data for the specified split.
            Note that the download will be executed only by additionally
            calling the :meth:`prepare_data` method and if the data does not
            exist yet on disk.
        image_transforms: A function/transform that takes in an image
            and returns a transformed version.
        target_transforms: A function/transform that takes in the target
            and transforms it.
    """
    super().__init__(
        image_transforms=image_transforms,
        target_transforms=target_transforms,
    )

    self._root = root
    self._split = split
    self._version = version
    self._download = download

    self._samples_dirs: List[str] = []
    self._indices: List[int] = []
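
In a multi-label setting, a sample's target typically marks every class that is present instead of naming a single class. The following is a minimal sketch of that encoding, assuming a 0/1 vector indexed by class; the exact target format eva returns is not shown here, and the class names are hypothetical.

```python
from typing import Dict, List


def multi_hot(present: List[str], class_to_idx: Dict[str, int]) -> List[int]:
    """Encodes the classes present in a sample as a 0/1 vector."""
    target = [0] * len(class_to_idx)
    for name in present:
        target[class_to_idx[name]] = 1
    return target


# Hypothetical class names; the real label set is not listed in this section.
class_to_idx = {"liver": 0, "spleen": 1, "kidney_left": 2}
target = multi_hot(["liver", "kidney_left"], class_to_idx)
```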

Segmentation datasets

eva.vision.data.datasets.ImageSegmentation

Bases: VisionDataset[Tuple[ndarray, ndarray]], ABC

Image segmentation abstract dataset.

Parameters:

image_transforms (Callable | None, default: None): A function/transform that takes in an image and returns a transformed version.
target_transforms (Callable | None, default: None): A function/transform that takes in the target and transforms it.
image_target_transforms (Callable | None, default: None): A function/transform that takes in an image and a label and returns the transformed versions of both. This transform happens after the image_transforms and target_transforms.

Source code in src/eva/vision/data/datasets/segmentation/base.py
def __init__(
    self,
    image_transforms: Callable | None = None,
    target_transforms: Callable | None = None,
    image_target_transforms: Callable | None = None,
) -> None:
    """Initializes the image segmentation base class.

    Args:
        image_transforms: A function/transform that takes in an image
            and returns a transformed version.
        target_transforms: A function/transform that takes in the target
            and transforms it.
        image_target_transforms: A function/transform that takes in an
            image and a label and returns the transformed versions of both.
            This transform happens after the `image_transforms` and
            `target_transforms`.
    """
    super().__init__()

    self._image_transforms = image_transforms
    self._target_transforms = target_transforms
    self._image_target_transforms = image_target_transforms
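
The ordering described in the docstring can be sketched as follows: the per-item transforms run first, and the joint transform runs last on both outputs. This is an illustrative stand-in, not the library's code; `apply_transforms`, `double`, and `hflip_both` are hypothetical, and nested lists replace arrays to keep the example dependency-free.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]
Mask = List[List[int]]


def apply_transforms(
    image: Image,
    mask: Mask,
    image_transforms: Optional[Callable] = None,
    target_transforms: Optional[Callable] = None,
    image_target_transforms: Optional[Callable] = None,
) -> Tuple[Image, Mask]:
    """Applies the per-item transforms first, then the joint transform."""
    if image_transforms is not None:
        image = image_transforms(image)
    if target_transforms is not None:
        mask = target_transforms(mask)
    if image_target_transforms is not None:
        # The joint transform sees both items, so a flip stays aligned.
        image, mask = image_target_transforms(image, mask)
    return image, mask


def double(image: Image) -> Image:
    """Scales pixel values; applies to the image only."""
    return [[2 * value for value in row] for row in image]


def hflip_both(image: Image, mask: Mask) -> Tuple[Image, Mask]:
    """Flips the image and the mask horizontally, together."""
    return [row[::-1] for row in image], [row[::-1] for row in mask]


image, mask = apply_transforms(
    [[1, 2], [3, 4]],
    [[0, 1], [0, 1]],
    image_transforms=double,
    image_target_transforms=hflip_both,
)
```

Geometric augmentations such as flips or crops usually belong in the joint transform, since applying them to the image alone would misalign it with its mask.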

classes: List[str] | None property

Returns the list of class names of the dataset.

class_to_idx: Dict[str, int] | None property

Returns a mapping of the class name to its target index.
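
A common relationship between these two properties is that the mapping is derived from the position of each name in the list. That relationship is an assumption here, since the section does not state how a subclass should build them, and the class names below are hypothetical.

```python
from typing import Dict, List

# Hypothetical class names for illustration only.
classes: List[str] = ["background", "liver", "spleen"]

# Map each class name to its position in the list.
class_to_idx: Dict[str, int] = {name: index for index, name in enumerate(classes)}

# The inverse mapping is handy when decoding predictions.
idx_to_class: Dict[int, str] = {index: name for name, index in class_to_idx.items()}
```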

load_metadata

Returns the dataset metadata.

Parameters:

index (int | None, required): The index of the data sample to return the metadata of. If None, it will return the metadata of the current dataset.

Returns:

Dict[str, Any] | List[Dict[str, Any]] | None: The sample metadata.

Source code in src/eva/vision/data/datasets/segmentation/base.py
def load_metadata(self, index: int | None) -> Dict[str, Any] | List[Dict[str, Any]] | None:
    """Returns the dataset metadata.

    Args:
        index: The index of the data sample to return the metadata of.
            If `None`, it will return the metadata of the current dataset.

    Returns:
        The sample metadata.
    """

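The three possible return types suggest three cases: one sample's metadata, the whole dataset's metadata, or nothing. The sketch below is one reading of that signature, not the library's implementation (the method body is not shown above); `MetadataSketch` and its records are hypothetical.

```python
from typing import Any, Dict, List, Optional, Union


class MetadataSketch:
    """Hypothetical holder of per-sample metadata; not eva's implementation."""

    def __init__(self, records: List[Dict[str, Any]]) -> None:
        self._records = records

    def load_metadata(
        self, index: Optional[int]
    ) -> Union[Dict[str, Any], List[Dict[str, Any]], None]:
        if not self._records:
            return None  # no metadata available
        if index is None:
            return list(self._records)  # metadata of the whole dataset
        return self._records[index]  # metadata of a single sample


sketch = MetadataSketch([{"slice": 0}, {"slice": 1}])
single = sketch.load_metadata(1)
everything = sketch.load_metadata(None)
```
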
load_image abstractmethod

Loads and returns the index'th image sample.

Parameters:

index (int, required): The index of the data sample to load.

Returns:

ndarray: The image as a numpy array.

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod
def load_image(self, index: int) -> np.ndarray:
    """Loads and returns the `index`'th image sample.

    Args:
        index: The index of the data sample to load.

    Returns:
        The image as a numpy array.
    """

load_mask abstractmethod

Returns the index'th target mask sample.

Parameters:

index (int, required): The index of the data sample target mask to load.

Returns:

ndarray: The sample mask as a stack of binary mask arrays (label, height, width).

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod
def load_mask(self, index: int) -> np.ndarray:
    """Returns the `index`'th target mask sample.

    Args:
        index: The index of the data sample target mask to load.

    Returns:
        The sample mask as a stack of binary mask arrays (label, height, width).
    """

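To make the (label, height, width) layout concrete, the sketch below converts a single integer label map into one binary mask per label. It uses nested lists instead of numpy arrays to stay dependency-free, and it assumes the masks start from an integer label map, which the source does not state. `to_binary_stack` is a hypothetical helper.

```python
from typing import List

LabelMap = List[List[int]]


def to_binary_stack(label_map: LabelMap, num_labels: int) -> List[LabelMap]:
    """Converts a (height, width) label map to (label, height, width) masks."""
    return [
        [[1 if value == label else 0 for value in row] for row in label_map]
        for label in range(num_labels)
    ]


label_map = [[0, 1], [2, 1]]
stack = to_binary_stack(label_map, num_labels=3)
```

Each entry along the first axis is a full-size mask that is 1 where its label occurs and 0 elsewhere.
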
eva.vision.data.datasets.TotalSegmentator2D

Bases: ImageSegmentation

TotalSegmentator 2D segmentation dataset.

Parameters:

root (str, required): Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.
split (Literal['train', 'val'] | None, required): Dataset split to use. If None, the entire dataset is used.
version (Literal['small', 'full'], default: 'small'): The version of the dataset to initialize.
download (bool, default: False): Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the prepare_data method and if the data does not exist yet on disk.
image_transforms (Callable | None, default: None): A function/transform that takes in an image and returns a transformed version.
target_transforms (Callable | None, default: None): A function/transform that takes in the target and transforms it.
image_target_transforms (Callable | None, default: None): A function/transform that takes in an image and a label and returns the transformed versions of both. This transform happens after the image_transforms and target_transforms.

Source code in src/eva/vision/data/datasets/segmentation/total_segmentator.py
def __init__(
    self,
    root: str,
    split: Literal["train", "val"] | None,
    version: Literal["small", "full"] = "small",
    download: bool = False,
    image_transforms: Callable | None = None,
    target_transforms: Callable | None = None,
    image_target_transforms: Callable | None = None,
) -> None:
    """Initialize dataset.

    Args:
        root: Path to the root directory of the dataset. The dataset will
            be downloaded and extracted here, if it does not already exist.
        split: Dataset split to use. If `None`, the entire dataset is used.
        version: The version of the dataset to initialize.
        download: Whether to download the data for the specified split.
            Note that the download will be executed only by additionally
            calling the :meth:`prepare_data` method and if the data does not
            exist yet on disk.
        image_transforms: A function/transform that takes in an image
            and returns a transformed version.
        target_transforms: A function/transform that takes in the target
            and transforms it.
        image_target_transforms: A function/transform that takes in an
            image and a label and returns the transformed versions of both.
            This transform happens after the `image_transforms` and
            `target_transforms`.
    """
    super().__init__(
        image_transforms=image_transforms,
        target_transforms=target_transforms,
        image_target_transforms=image_target_transforms,
    )

    self._root = root
    self._split = split
    self._version = version
    self._download = download

    self._samples_dirs: List[str] = []
    self._indices: List[int] = []
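
The `download` flag only records a request; per the docstring, the download runs when `prepare_data` is called and the data is not yet on disk. A minimal sketch of that deferred pattern follows. `LazyDownloadDataset` and `_fetch` are hypothetical, and the "download" merely creates a folder.

```python
import os
import tempfile


class LazyDownloadDataset:
    """Hypothetical dataset showing the deferred-download pattern."""

    def __init__(self, root: str, download: bool = False) -> None:
        # The constructor only stores the request; no I/O happens here.
        self._root = root
        self._download = download

    def prepare_data(self) -> None:
        # Fetch only when requested and when the data is missing on disk.
        if self._download and not os.path.isdir(self._root):
            self._fetch()

    def _fetch(self) -> None:
        # Placeholder for a real download; this sketch just creates the folder.
        os.makedirs(self._root)


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "total_segmentator")
    dataset = LazyDownloadDataset(root, download=True)
    exists_after_init = os.path.isdir(root)
    dataset.prepare_data()
    exists_after_prepare = os.path.isdir(root)
```

Keeping I/O out of the constructor means a dataset object can be built cheaply, for example from a configuration file, before any data is fetched.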