VertexAIImageCaptioner
VertexAIImageCaptioner enables image captioning (text generation from images) using the Google Vertex AI `imagetext` generative model.
| | |
| --- | --- |
| Mandatory run variables | `image`: A `ByteStream` object storing an image |
| Output variables | `captions`: A list of strings generated by the model |
| API reference | Google Vertex |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/google_vertex |
Parameters Overview
VertexAIImageCaptioner uses Google Cloud Application Default Credentials (ADC) for authentication. For more information on how to set up ADC, see the official documentation.
Keep in mind that it’s essential to use an account that has access to a project authorized to use Google Vertex AI endpoints.
You can find your project ID in the GCP resource manager or locally by running `gcloud projects list` in your terminal. For more information on the gcloud CLI, see its official documentation.
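For example, assuming you have the gcloud CLI installed, `gcloud auth application-default login` is one common way to create ADC on your machine, and `gcloud projects list` shows the project IDs your account can access:

```shell
# Create Application Default Credentials for your user account
gcloud auth application-default login

# List accessible projects; the PROJECT_ID column holds the value you need
gcloud projects list
```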
Usage
You need to install the `google-vertex-haystack` package to use VertexAIImageCaptioner:
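```shell
pip install google-vertex-haystack
```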
On its own
Basic usage:
```python
import requests

from haystack.dataclasses.byte_stream import ByteStream
from haystack_integrations.components.generators.google_vertex import VertexAIImageCaptioner

captioner = VertexAIImageCaptioner()

# Download an example image and wrap it in a ByteStream
image = ByteStream(data=requests.get("https://raw.githubusercontent.com/silvanocerza/robots/main/robot1.jpg").content)

result = captioner.run(image=image)
for caption in result["captions"]:
    print(caption)
```
You can also set the caption language and the number of results:
```python
import requests

from haystack.dataclasses.byte_stream import ByteStream
from haystack_integrations.components.generators.google_vertex import VertexAIImageCaptioner

captioner = VertexAIImageCaptioner(
    number_of_results=3,  # Can't be greater than 3
    language="it",
)

# Download an example image and wrap it in a ByteStream
image = ByteStream(data=requests.get("https://raw.githubusercontent.com/silvanocerza/robots/main/robot1.jpg").content)

result = captioner.run(image=image)
for caption in result["captions"]:
    print(caption)
```