Image Utilities

Reference information for the multimodal image utilities API.

eva.multimodal.utils.image.encode_image

Encodes an image tensor into a string format.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| image | Image | The image tensor to encode. | required |
| encoding | Literal['base64'] | The encoding format to use. Currently only supports "base64". | required |

Returns:

| Type | Description |
|------|-------------|
| str | The image serialized as PNG and base64-encoded, returned as a text string. |
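
One way to sanity-check the returned string is to decode it and inspect the PNG header. The sketch below is illustrative only and uses just the standard library. `make_stand_in` is a hypothetical helper that builds a small PNG by hand so the snippet runs without eva or torchvision installed; in practice the string would come from `encode_image(image, encoding="base64")`. `png_size_from_base64` is likewise a name invented for this example.

```python
import base64
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Builds one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_stand_in(width: int = 3, height: int = 2) -> str:
    """Hypothetical stand-in for encode_image(...): a base64-encoded black RGB PNG."""
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), then three zero flags.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each row is one filter byte (0 = none) followed by 3 bytes per pixel.
    rows = (b"\x00" + b"\x00" * 3 * width) * height
    png = (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows))
        + _chunk(b"IEND", b"")
    )
    return base64.b64encode(png).decode("utf-8")


def png_size_from_base64(encoded: str) -> tuple[int, int]:
    """Decodes the string and reads (width, height) from the PNG IHDR chunk."""
    raw = base64.b64decode(encoded, validate=True)
    if raw[:8] != PNG_SIGNATURE:
        raise ValueError("Decoded data is not a PNG stream")
    # Bytes 8-15 hold the IHDR chunk length and type; width and height follow.
    width, height = struct.unpack(">II", raw[16:24])
    return width, height


encoded = make_stand_in(width=3, height=2)
print(png_size_from_base64(encoded))  # (3, 2)
```

Reading the size from the header avoids a full image decode, which can be handy when only validating payloads before sending them on.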

Source code in src/eva/multimodal/utils/image/encode.py
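# Imports inferred from the names used below; the exact import lines in
# encode.py are not shown on this page and are an assumption.
import base64
import io
from typing import Literal

from torchvision import tv_tensors
from torchvision.transforms.v2 import functional as F


def encode_image(image: tv_tensors.Image, encoding: Literal["base64"]) -> str: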
    """Encodes an image tensor into a string format.

    Args:
        image: The image tensor to encode.
        encoding: The encoding format to use. Currently only supports "base64".

    Returns:
        An encoded string representation of the image.
    """
    match encoding:
        case "base64":
            image_bytes = io.BytesIO()
            F.to_pil_image(image).save(image_bytes, format="PNG", optimize=True)
            image_bytes.seek(0)
            return base64.b64encode(image_bytes.getvalue()).decode("utf-8")
        case _:
            raise ValueError(f"Unsupported encoding type: {encoding}. Supported: 'base64'")
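
Because the listing above always saves with `format="PNG"`, the returned string can reasonably be paired with the `image/png` media type. A common way to pass such a string along is a data URL. The sketch below assumes that use case; `to_data_url` is a hypothetical helper, not part of eva, and the `encoded` value is a stand-in for the function's return value.

```python
import base64


def to_data_url(encoded: str, mime_type: str = "image/png") -> str:
    """Hypothetical helper: wraps a base64 string in a data URL."""
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for the return value of encode_image(image, encoding="base64");
# here it is just the 8-byte PNG signature, base64-encoded.
encoded = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("utf-8")
url = to_data_url(encoded)
print(url)  # data:image/png;base64,iVBORw0KGgo=
```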