Multi-Provider AI: Building Model-Agnostic Agents with pydantic-ai

One of the most frustrating aspects of building with LLMs is vendor lock-in. You build a system around Claude, then Gemini releases a faster model. Or you want to use Groq for speed but Claude for quality. Switching means rewriting API calls, parsing logic, and error handling.

I solved this problem by building a model-agnostic agent layer using pydantic-ai. This post explains how.

The Problem: Every Provider Is Different

Here's what calling different LLM providers looks like without abstraction:

# OpenAI
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
result = response.choices[0].message.content

# Anthropic
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}]
)
result = response.content[0].text

# Google
import google.generativeai as genai

model = genai.GenerativeModel("gemini-2.5-flash")
response = model.generate_content(prompt)
result = response.text

Three different APIs. Three different response structures. Three times the maintenance burden.

The Solution: pydantic-ai as Abstraction Layer

pydantic-ai provides a unified interface across providers:

from pydantic_ai import Agent

# Create an agent with any supported model
agent = Agent("google-gla:gemini-3-flash-preview")

# Run it the same way regardless of provider
result = agent.run_sync("Extract the company name from this job posting...")
print(result.output)

The magic is in the model string format: provider:model-name. Switch providers by changing a single string.
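For example, pointing the same agent at Claude is a one-string change (the model name here is shorthand from the model menu later in this post):

# Same Agent API, different provider: only the model string changes
agent = Agent("anthropic:claude-sonnet-4")
result = agent.run_sync("Extract the company name from this job posting...")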

My Model Normalization Layer

In my CV build system, I wanted even simpler model names. Instead of google-gla:gemini-3-flash-preview, I wanted flash-3. So I built a normalization layer: shortcuts resolve to full model names, and this function then adds the provider prefix pydantic-ai expects:

def normalize_model_name(model: str) -> str:
    """Convert friendly model names to pydantic-ai format.

    Mapping:
      gemini-*       → google-gla:gemini-*
      claude-*       → anthropic:claude-*
      openai/*       → openai:* (strip prefix, add colon)
      qwen/*         → openai:qwen/* (use OpenAI-compatible endpoint)
    """
    if model.startswith('gemini-'):
        return f"google-gla:{model}"
    elif model.startswith('claude-'):
        return f"anthropic:{model}"
    elif model.startswith('openai/'):
        # OpenAI-compatible models via Groq
        return f"openai:{model.removeprefix('openai/')}"
    elif model.startswith('qwen/'):
        # Qwen models via OpenAI-compatible endpoint
        return f"openai:{model}"
    else:
        # Already in pydantic-ai format or unknown
        return model
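The flash-3-style shortcuts resolve to full model names before this function runs. A minimal sketch of that first step, assuming a plain lookup table (SHORTCUTS and resolve_model are illustrative names; the entries mirror the model menu below):

# Hypothetical shortcut table; entries mirror the model menu in this post
SHORTCUTS = {
    "flash-3": "gemini-3-flash-preview",
    "flash-2-5": "gemini-2.5-flash",
    "pro-3": "gemini-3-pro-preview",
    "haiku-4": "claude-haiku-4",
    "sonnet-4": "claude-sonnet-4",
    "opus-4": "claude-opus-4",
    "gpt-oss-120b": "openai/gpt-oss-120b",
    "qwen3-32b": "qwen/qwen3-32b",
}

def resolve_model(name: str) -> str:
    """Shortcut to full name, then to pydantic-ai format."""
    return normalize_model_name(SHORTCUTS.get(name, name))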

Now my build commands accept simple names:

just add https://company.com/job flash-3     # Gemini 3 Flash
just add https://company.com/job sonnet-4    # Claude Sonnet 4
just add https://company.com/job gpt-oss-120b  # Groq's open model

The Full Model Menu

My system supports multiple providers and models:

Shortcut       Full Model Name          Provider    Use Case
flash-3        gemini-3-flash-preview   Google      Fast, cheap, good for extraction
flash-2-5      gemini-2.5-flash         Google      Production flash model
pro-3          gemini-3-pro-preview     Google      Higher quality, slower
haiku-4        claude-haiku-4           Anthropic   Fast Claude
sonnet-4       claude-sonnet-4          Anthropic   Balanced quality/speed
opus-4         claude-opus-4            Anthropic   Maximum quality
gpt-oss-120b   openai/gpt-oss-120b      Groq        Open-source via Groq
qwen3-32b      qwen/qwen3-32b           Groq        Qwen via OpenAI-compatible

The workflow dispatch in GitHub Actions presents these as a dropdown:

inputs:
  model:
    description: "LLM model to use"
    required: true
    type: choice
    options:
      # Google Gemini models
      - gemini-2.5-flash
      - gemini-3-flash-preview
      - gemini-2.5-pro
      - gemini-3-pro-preview
      # OpenAI-compatible models via Groq
      - openai/gpt-oss-120b
      - qwen/qwen3-32b
      # Anthropic Claude models
      - claude-haiku-4
      - claude-sonnet-4
      - claude-opus-4
    default: gemini-3-flash-preview

Environment Variable Management

Each provider needs its own API key:

# .env file
GEMINI_API_KEY=your-gemini-key
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-groq-key  # Groq uses OpenAI-compatible API

pydantic-ai automatically reads these based on the model prefix. No manual key routing needed.
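One caveat on the Groq route: the openai: prefix targets OpenAI's API by default, so the OpenAI-compatible client has to be pointed at Groq's endpoint. A minimal sketch, assuming the standard OPENAI_BASE_URL variable that the OpenAI SDK reads:

# .env, assumed addition for routing openai:* models to Groq
OPENAI_BASE_URL=https://api.groq.com/openai/v1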

For GitHub Actions, I set all three as repository secrets and pass them to the Python script:

- name: Extract job details
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: uv run python .github/scripts/add.py

The script doesn't need to know which key to use — pydantic-ai handles it based on the model string.

Structured Output: JSON Schema Enforcement

Beyond provider abstraction, pydantic-ai gives me structured output. Instead of parsing free-form text, I define a schema and the LLM returns data that matches:

from pydantic import BaseModel
from pydantic_ai import Agent

class JobPosting(BaseModel):
    company: str
    title: str
    location: str | None
    requirements: list[str]
    nice_to_have: list[str]

agent = Agent(
    "google-gla:gemini-3-flash-preview",
    output_type=JobPosting,  # Enforce this schema
)

result = agent.run_sync(f"Extract job details from: {job_text}")
job: JobPosting = result.output  # Strongly typed!

No regex parsing. No JSON extraction from markdown code blocks. Just typed data.
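The same run result also carries token counts, which is what makes the usage tracking mentioned later essentially free. A small sketch, assuming a recent pydantic-ai where run results expose a usage() method:

# Token accounting comes with every run, no extra calls needed
usage = result.usage()
print(usage)  # per-run request/response token counts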

Why This Matters for Production

  1. A/B Testing Models — Try Gemini vs Claude on the same task, compare results
  2. Cost Optimization — Use cheap models for simple tasks, expensive ones for complex
  3. Fallback Chains — If one provider is down, switch to another (see the sketch after this list)
  4. Speed Tuning — Use Groq for latency-sensitive tasks, Claude for quality-sensitive
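
The fallback chain in point 3 doesn't need anything fancy. A minimal sketch built only from the Agent API shown above (the model list and the broad exception handling are illustrative, not my production code):

from pydantic_ai import Agent

def run_with_fallback(prompt: str, models: list[str]) -> str:
    """Try each model in order; return the first successful output."""
    last_error: Exception | None = None
    for model in models:
        try:
            return Agent(model).run_sync(prompt).output
        except Exception as exc:  # deliberately broad: any provider failure
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

output = run_with_fallback(
    "Extract the company name from this job posting...",
    ["google-gla:gemini-3-flash-preview", "anthropic:claude-sonnet-4"],
)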

In my CV system, I use:

  • Gemini Flash for job posting extraction (fast, cheap)
  • Gemini Pro or Claude Sonnet for CV tailoring (needs quality)
  • Any model for iterative feedback (user choice)

The Abstraction Tax

There's always a cost to abstraction. With pydantic-ai:

  • Pros: Clean code, easy switching, structured output, usage tracking
  • Cons: Another dependency, slight overhead, may lag behind latest provider features

For my use case, the benefits far outweigh the costs. I can experiment with new models in minutes instead of hours.

Key Takeaways

  1. Abstract early — Don't couple your code to a specific provider
  2. Use friendly names — Build a normalization layer for your team
  3. Structured output — Define schemas, not prompts that ask for JSON
  4. Track usage — pydantic-ai gives you token counts for free

The LLM landscape changes fast. Building model-agnostic from day one means you're always ready to switch.


Tools mentioned: