Dataloaders

Reference information for the Dataloader classes.

eva.data.DataLoader dataclass

The DataLoader combines a dataset and a sampler.

It provides an iterable over the given dataset.

batch_size: int | None = 1 class-attribute instance-attribute

How many samples per batch to load.

Set to None for an iterable dataset that already yields batches.

shuffle: bool = False class-attribute instance-attribute

Whether to shuffle the data at every epoch.

sampler: samplers.Sampler | None = None class-attribute instance-attribute

Defines the strategy to draw samples from the dataset.

Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.
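
To illustrate the "any Iterable with __len__" requirement, here is a minimal sketch of a custom sampler in plain Python. The class name EveryOtherSampler is hypothetical and not part of eva; it only shows the shape such an object is assumed to need.

```python
from typing import Iterator


class EveryOtherSampler:
    """Hypothetical sampler (not part of eva): yields every second index."""

    def __init__(self, dataset_size: int) -> None:
        self._indices = range(0, dataset_size, 2)

    def __iter__(self) -> Iterator[int]:
        # Yield dataset indices one at a time, in a fixed order.
        return iter(self._indices)

    def __len__(self) -> int:
        # Number of samples drawn in one pass over the sampler.
        return len(self._indices)


sampler = EveryOtherSampler(dataset_size=7)
print(list(sampler), len(sampler))
```

Because the sampler fixes the order in which indices are drawn, it would conflict with shuffle, which is consistent with the rule that the two are not specified together.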

batch_sampler: samplers.Sampler | None = None class-attribute instance-attribute

Like sampler, but returns a batch of indices at a time.

Mutually exclusive with batch_size, shuffle, sampler and drop_last.
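
A batch sampler differs from a sampler only in that each iteration step yields a whole list of indices. The sketch below is plain Python under that assumption; ChunkBatchSampler is a hypothetical name used for illustration, not an eva class.

```python
from typing import Iterator, List


class ChunkBatchSampler:
    """Hypothetical batch sampler (not part of eva): consecutive index chunks."""

    def __init__(self, dataset_size: int, batch_size: int) -> None:
        self._dataset_size = dataset_size
        self._batch_size = batch_size

    def __iter__(self) -> Iterator[List[int]]:
        # Each step yields the indices of one complete batch.
        for start in range(0, self._dataset_size, self._batch_size):
            stop = min(start + self._batch_size, self._dataset_size)
            yield list(range(start, stop))

    def __len__(self) -> int:
        # Ceiling division: the final chunk may be shorter than batch_size.
        return -(-self._dataset_size // self._batch_size)


batch_sampler = ChunkBatchSampler(dataset_size=5, batch_size=2)
batches = list(batch_sampler)
print(batches)
```

Since the batch sampler already decides the batch size, the ordering, and what happens to a short final batch, the options batch_size, shuffle, sampler and drop_last have nothing left to control.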

num_workers: int = multiprocessing.cpu_count() class-attribute instance-attribute

How many workers to use for loading the data.

By default, it will use the number of CPUs available.

collate_fn: Callable | None = None class-attribute instance-attribute

The function that merges a list of samples into a batch.
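
As a rough sketch of what a collate function does, the example below merges a list of (features, label) samples into one batch using only the standard library. The name collate_pairs and the sample layout are assumptions made for illustration; a real collate function would typically stack tensors instead of building lists.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]
Batch = Tuple[List[List[float]], List[int]]


def collate_pairs(samples: List[Sample]) -> Batch:
    """Hypothetical collate function: regroups samples field by field."""
    # Gather the features of every sample into one list.
    features = [list(sample_features) for sample_features, _ in samples]
    # Gather the labels in the same order.
    labels = [label for _, label in samples]
    return features, labels


batch = collate_pairs([([0.1, 0.2], 0), ([0.3, 0.4], 1)])
print(batch)
```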

pin_memory: bool = True class-attribute instance-attribute

If True, copies Tensors into CUDA pinned memory before returning them.

drop_last: bool = False class-attribute instance-attribute

Whether to drop the last batch if it is incomplete.
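
The effect of drop_last on the number of batches per epoch can be worked out with simple arithmetic. The helper below is a hypothetical illustration, assuming a dataset of known size and a fixed batch_size; it is not an eva function.

```python
import math


def num_batches(dataset_size: int, batch_size: int, drop_last: bool) -> int:
    """Hypothetical helper: batches per epoch for a fixed-size dataset."""
    if drop_last:
        # Only full batches are kept.
        return dataset_size // batch_size
    # The trailing partial batch, if any, counts as one more batch.
    return math.ceil(dataset_size / batch_size)


print(num_batches(10, 4, drop_last=False))
print(num_batches(10, 4, drop_last=True))
```

With 10 samples and a batch size of 4, keeping the last batch gives two full batches plus one batch of 2 samples, while dropping it leaves only the two full batches.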

persistent_workers: bool = True class-attribute instance-attribute

If True, keeps the worker processes alive after a dataset has been consumed once.

prefetch_factor: int | None = 2 class-attribute instance-attribute

Number of batches loaded in advance by each worker.