Datasets
Reference information for the multimodal data Datasets API.
eva.multimodal.data.datasets.TextImageDataset
Bases: MultimodalDataset[TextImageSample[TargetType]], TextDataset, ABC, Generic[TargetType]
Base dataset class for text-image tasks.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`*args` | | Positional arguments for the base class. | `()` |
`transforms` | `TransformsSchema \| None` | The transforms to apply to the text, image and target when loading the samples. | `None` |
`**kwargs` | | Keyword arguments for the base class. | `{}` |
Source code in src/eva/multimodal/data/datasets/text_image.py
load_image (abstractmethod)
Returns the image content.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`index` | `int` | The index of the data sample. | required |
Returns:
Type | Description |
---|---|
`Image` | The image content. |
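The subclassing pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the eva implementation: `SketchTextImageDataset` is a hypothetical stand-in for `TextImageDataset`, the `load_text`, `load_target` and `__len__` hooks are assumptions (only `load_image` is documented here), and a nested list of integers stands in for the returned `Image`.

```python
import abc
from typing import Any, Callable, Generic, Optional, Tuple, TypeVar

TargetType = TypeVar("TargetType")

# Assumed shape of a transform: takes and returns (text, image, target).
Transform = Callable[[str, Any, Any], Tuple[str, Any, Any]]


class SketchTextImageDataset(abc.ABC, Generic[TargetType]):
    """Hypothetical stand-in for TextImageDataset (not the eva class)."""

    def __init__(self, transforms: Optional[Transform] = None) -> None:
        self.transforms = transforms

    @abc.abstractmethod
    def __len__(self) -> int:
        """Returns the number of samples (assumed hook)."""

    @abc.abstractmethod
    def load_text(self, index: int) -> str:
        """Returns the text content (assumed hook)."""

    @abc.abstractmethod
    def load_image(self, index: int) -> Any:
        """Returns the image content (the documented abstract method)."""

    @abc.abstractmethod
    def load_target(self, index: int) -> TargetType:
        """Returns the target (assumed hook)."""

    def __getitem__(self, index: int) -> Tuple[str, Any, TargetType]:
        text = self.load_text(index)
        image = self.load_image(index)
        target = self.load_target(index)
        # Transforms, if given, are applied to text, image and target together.
        if self.transforms is not None:
            text, image, target = self.transforms(text, image, target)
        return text, image, target


class ToyDataset(SketchTextImageDataset[int]):
    """Two tiny 2x2 'images': one dark (label 0), one bright (label 1)."""

    def __init__(self, transforms: Optional[Transform] = None) -> None:
        super().__init__(transforms)
        self._images = [[[0, 0], [0, 0]], [[255, 255], [255, 255]]]

    def __len__(self) -> int:
        return len(self._images)

    def load_text(self, index: int) -> str:
        return "Is the patch bright? A) no B) yes"

    def load_image(self, index: int) -> Any:
        return self._images[index]

    def load_target(self, index: int) -> int:
        return int(self._images[index][0][0] > 127)


def upper_text(text: str, image: Any, target: Any) -> Tuple[str, Any, Any]:
    """Example transform that only modifies the text."""
    return text.upper(), image, target


dataset = ToyDataset(transforms=upper_text)
text, image, target = dataset[1]
```

Because `load_image` is abstract, a subclass that omits it cannot be instantiated, which is the main reason to declare the hook on the base class rather than rely on convention.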
eva.multimodal.data.datasets.PatchCamelyon
Bases: TextImageDataset[int], PatchCamelyon
PatchCamelyon image classification using a multiple choice text prompt.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`root` | `str` | The path to the dataset root. This path should contain the uncompressed h5 files and the metadata. | required |
`split` | `Literal['train', 'val', 'test']` | The dataset split for training, validation, or testing. | required |
`download` | `bool` | Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth: | `False` |
`transforms` | `TransformsSchema \| None` | A function/transform which returns a transformed version of the raw data samples. | `None` |
`prompt` | `str \| None` | The text prompt to use for classification (multiple choice). | `None` |
`max_samples` | `int \| None` | Maximum number of samples to use. If None, use all samples. | `None` |
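A constructor call using the parameters listed above might look like the sketch below. It assumes eva is installed and that `root` already contains the uncompressed h5 files and metadata. The `build_patch_camelyon` helper and the prompt wording are hypothetical and only illustrate the keyword arguments. The import is deferred into the function so the snippet can be loaded without eva present.

```python
from typing import Optional


def build_patch_camelyon(
    root: str, split: str = "val", max_samples: Optional[int] = 16
):
    """Builds the dataset using only the parameters documented above."""
    # Imported lazily; requires eva to be installed when the function is called.
    from eva.multimodal.data.datasets import PatchCamelyon

    return PatchCamelyon(
        root=root,
        split=split,  # one of 'train', 'val', 'test'
        download=False,  # data is expected to exist under `root` already
        transforms=None,
        # Hypothetical multiple-choice prompt; the wording is an assumption.
        prompt="Does this patch contain tumor tissue? A) No B) Yes",
        max_samples=max_samples,  # None would use all samples
    )
```

Capping `max_samples` at a small value is a convenient way to smoke-test a pipeline before running on the full split.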