Structured outputs with Vertex AI: a complete guide with Instructor

Google Cloud's Vertex AI provides enterprise-grade AI capabilities with robust scaling and security features. This guide shows you how to use Instructor with Vertex AI for type-safe, validated responses.

Migration Notice

The direct from_vertexai integration is being deprecated in favor of the unified google-genai SDK. Please use from_provider or from_genai with vertexai=True for new projects. See the migration guide below.

Quick Start

Install Instructor with Google GenAI support (which includes Vertex AI):

pip install "instructor[google-genai]"

Simple User Example (Sync)

import instructor
from pydantic import BaseModel
import os

# Set your project ID and location
os.environ["GOOGLE_CLOUD_PROJECT"] = "your-project-id"
os.environ["GOOGLE_CLOUD_LOCATION"] = "us-central1"


class User(BaseModel):
    name: str
    age: int


# Using from_provider (recommended)
client = instructor.from_provider(
    "vertexai/gemini-3-flash",
)

resp = client.create(
    response_model=User,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
)

print(resp)
#> User(name='Jason', age=25)
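
To make "validated" concrete, here is a minimal sketch of the Pydantic check that a response_model relies on. It does not call Vertex AI or Instructor; the two JSON strings are hypothetical stand-ins for raw model output, and this is an illustration of Pydantic's behavior rather than a description of Instructor's internals.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# Well-formed JSON with the right types validates into a typed object.
user = User.model_validate_json('{"name": "Jason", "age": 25}')
print(user.name, user.age)

# A value that cannot be coerced to int is rejected rather than passed through.
try:
    User.model_validate_json('{"name": "Jason", "age": "twenty-five"}')
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

A rejected payload raises ValidationError, which is the signal a caller can act on instead of receiving a malformed object.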

Simple User Example (Async)

import asyncio
import instructor
import vertexai  # type: ignore
from pydantic import BaseModel

vertexai.init()


class User(BaseModel):
    name: str
    age: int


client = instructor.from_provider(
    "vertexai/gemini-1.5-pro-preview-0409",
    async_client=True,
    mode=instructor.Mode.TOOLS,
)

async def extract_user():
    user = await client.create(
        messages=[
            {
                "role": "user",
                "content": "Extract Jason is 25 years old.",
            }
        ],
        response_model=User,
    )
    return user


# Run async function
user = asyncio.run(extract_user())
print(user)  # User(name='Jason', age=25)

Streaming Support

The v2 Vertex AI provider supports partial streaming in Mode.TOOLS and Mode.MD_JSON. It does not currently expose public create_iterable() streaming; use partial streaming for incremental results, or switch to GenAI when iterable streaming is required.

Streaming Partial Responses

import vertexai  # type: ignore
import instructor
from pydantic import BaseModel
from instructor.dsl.partial import Partial

vertexai.init()

class UserExtract(BaseModel):
    name: str
    age: int

client = instructor.from_provider(
    "vertexai/gemini-1.5-pro-preview-0409",
    mode=instructor.Mode.TOOLS,
)

# Stream partial responses
response_stream = client.create(
    response_model=Partial[UserExtract],
    stream=True,
    messages=[
        {"role": "user", "content": "Anibal is 23 years old"},
    ],
)

for partial_user in response_stream:
    print(f"Received update: {partial_user}")
# Output might show:
# Received update: UserExtract(name='Anibal', age=None)
# Received update: UserExtract(name='Anibal', age=23)
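
Each update in a partial stream is a snapshot of the object so far, so the last snapshot is the most complete one. The sketch below shows one way to keep only that final value. It makes no API call: UserSnapshot, fake_partial_stream, and final_update are hypothetical names, and the generator is a stand-in for the real response_stream.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class UserSnapshot:
    # Stand-in for a partial UserExtract: fields may still be None.
    name: Optional[str] = None
    age: Optional[int] = None


def fake_partial_stream() -> Iterator[UserSnapshot]:
    # Hypothetical replacement for the real response_stream; no API call is made.
    yield UserSnapshot(name="Anibal")
    yield UserSnapshot(name="Anibal", age=23)


def final_update(stream: Iterable[UserSnapshot]) -> Optional[UserSnapshot]:
    """Consume the stream and return the last snapshot, or None if it was empty."""
    last = None
    for snapshot in stream:
        last = snapshot
    return last


result = final_update(fake_partial_stream())
print(result)
# UserSnapshot(name='Anibal', age=23)
```

Returning None for an empty stream lets the caller distinguish "no updates arrived" from a partially filled object.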

Async Partial Streaming

The async client exposes the same partial streaming contract:

import asyncio
import vertexai  # type: ignore
import instructor
from pydantic import BaseModel
from instructor.dsl.partial import Partial

vertexai.init()

class UserExtract(BaseModel):
    name: str
    age: int

client = instructor.from_provider(
    "vertexai/gemini-1.5-pro-preview-0409",
    async_client=True,
    mode=instructor.Mode.TOOLS,
)

async def stream_partial():
    response_stream = await client.create(
        response_model=Partial[UserExtract],
        stream=True,
        messages=[
            {"role": "user", "content": "Anibal is 23 years old"},
        ],
    )

    async for partial_user in response_stream:
        print(f"Received update: {partial_user}")

# Run the async function
asyncio.run(stream_partial())

Migration to Google GenAI

The legacy from_vertexai method is being deprecated in favor of the unified Google GenAI SDK. Here's how to migrate:

Old Way (Deprecated)

import instructor
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project", location="us-central1")

# Legacy pattern: wrap a Vertex AI GenerativeModel directly
client = instructor.from_vertexai(
    GenerativeModel("gemini-2.5-flash"),
    mode=instructor.Mode.TOOLS,
)

New Way (Recommended)

import instructor

# Option 1: Using from_provider (simplest)
client = instructor.from_provider(
    "vertexai/gemini-3-flash",
    project="your-project",  # Optional if set in environment
    location="us-central1"   # Optional, defaults to us-central1
)

# Option 2: Using from_genai with Google GenAI SDK
from google import genai
from instructor import from_genai

# genai.Client does not take a model argument; pass model="gemini-3-flash"
# on each create call instead.
client = from_genai(
    genai.Client(
        vertexai=True,
        project="your-project",
        location="us-central1",
    )
)

Environment Variables

You can also set these environment variables to avoid passing project/location each time:

export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
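
The precedence implied above (explicit arguments first, then environment variables, then the us-central1 default for location) can be sketched as follows. resolve_vertex_settings is a hypothetical helper written for illustration, not part of Instructor or the Google SDK, and raising ValueError when no project is found is an assumption of this sketch.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_vertex_settings(
    project: Optional[str] = None,
    location: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Hypothetical helper: explicit arguments win, then environment variables."""
    resolved_project = project or env.get("GOOGLE_CLOUD_PROJECT")
    if not resolved_project:
        # Assumption of this sketch: a missing project is treated as an error.
        raise ValueError("Set GOOGLE_CLOUD_PROJECT or pass project explicitly")
    # Location falls back to us-central1 when neither source provides it.
    resolved_location = location or env.get("GOOGLE_CLOUD_LOCATION") or "us-central1"
    return resolved_project, resolved_location


# Usage with an explicit mapping so the example does not depend on the real environment.
settings = resolve_vertex_settings(env={"GOOGLE_CLOUD_PROJECT": "your-project-id"})
print(settings)
# ('your-project-id', 'us-central1')
```

Passing env as a parameter keeps the helper easy to test without mutating os.environ.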

Updates and Compatibility

Instructor maintains compatibility with Vertex AI's latest API versions. Check the changelog for updates.

Partial streaming is available through both synchronous and asynchronous interfaces. Public iterable streaming is not yet part of the VertexAI capability contract.