Provider Comparison

This library currently ships first-party adapters for Anthropic, OpenAI, and Google Gemini.

Capability Matrix

| Provider | Selected seeded completion models | Streaming | Tool calling | Vision inputs | Session persistence | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropic | claude-sonnet-4-6, claude-haiku-4-5, claude-opus-4-6 | Yes | Yes | Yes | Via Conversation + session stores | Anthropic cache read/write pricing is modeled separately, including block-level and request-level cache_control. |
| OpenAI | gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-4o, gpt-4o-mini, o3 | Yes | Yes | Yes | Via Conversation + session stores | Uses the stateless Responses API with store: false and library-owned history replay. |
| Google Gemini | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview | Yes | Yes | Yes | Via Conversation + session stores | Streaming uses the dedicated streamGenerateContent endpoint, and explicit caches are managed with client.googleCaches. |

These are the seeded completion models checked into the library, not each provider's full live catalog. To discover models against a provider's current list, call client.models.listRemote({ provider }).
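As a rough illustration of combining the seeded list with remote discovery, the sketch below prefers a requested model when the provider still lists it and otherwise falls back to the first seeded model that is available. Only the call client.models.listRemote({ provider }) comes from this page; the provider string values, the RemoteModel shape, the ModelLister interface, and the resolveModel helper are assumptions made for the example.

```typescript
// Assumed provider identifiers; the library's real values may differ.
type Provider = "anthropic" | "openai" | "google";

// Assumed minimal shape of a remote model entry.
interface RemoteModel {
  id: string;
}

// Structural stand-in for the library client. Only listRemote({ provider })
// is documented; the return type here is a guess.
interface ModelLister {
  models: {
    listRemote(args: { provider: Provider }): Promise<RemoteModel[]>;
  };
}

// Hypothetical helper: pick `preferred` if the provider lists it, otherwise
// the first seeded model the provider lists, otherwise fail loudly.
async function resolveModel(
  client: ModelLister,
  provider: Provider,
  preferred: string,
  seeded: string[],
): Promise<string> {
  const remote = await client.models.listRemote({ provider });
  const available = new Set(remote.map((m) => m.id));
  if (available.has(preferred)) {
    return preferred;
  }
  const fallback = seeded.find((id) => available.has(id));
  if (fallback === undefined) {
    throw new Error(`No seeded ${provider} model is currently listed`);
  }
  return fallback;
}
```

A helper like this keeps the seeded list as the source of defaults while still detecting models that a provider has retired.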

Translation Differences

| Concern | Anthropic | OpenAI | Gemini |
| --- | --- | --- | --- |
| System prompt handling | Lifted into system blocks | Flattened into top-level instructions | Lifted into systemInstruction |
| Assistant role name | assistant | assistant | model |
| Tool call payload | tool_use blocks | function_call.arguments JSON string with call_id | functionCall.args object |
| Tool result payload | tool_result block in a user turn | function_call_output item keyed by call_id | functionResponse part in a user turn |
| Streaming terminator | SSE close / message_stop | response.completed / stream close | SSE close on dedicated stream endpoint |
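The tool call and tool result rows can be made concrete with a small sketch. The wire shapes follow each provider's public API, but the NeutralToolCall and NeutralToolResult types and the converter function names are hypothetical and are not the library's actual internals. Wrapping the Gemini output in a { result } object is likewise only a convention chosen for this example.

```typescript
// Hypothetical provider-neutral shapes, invented for this example.
interface NeutralToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface NeutralToolResult {
  callId: string;
  name: string;
  output: string;
}

// Anthropic: arguments travel as a structured `input` on a tool_use block.
function toAnthropicToolCall(call: NeutralToolCall) {
  return { type: "tool_use" as const, id: call.id, name: call.name, input: call.args };
}

// OpenAI Responses: arguments are a JSON *string*, correlated by call_id.
function toOpenAIToolCall(call: NeutralToolCall) {
  return {
    type: "function_call" as const,
    call_id: call.id,
    name: call.name,
    arguments: JSON.stringify(call.args),
  };
}

// Gemini: arguments are a plain object on a functionCall part.
function toGeminiToolCall(call: NeutralToolCall) {
  return { functionCall: { name: call.name, args: call.args } };
}

// Anthropic: the result is a tool_result block inside a user turn.
function toAnthropicToolResult(result: NeutralToolResult) {
  return {
    role: "user" as const,
    content: [
      { type: "tool_result" as const, tool_use_id: result.callId, content: result.output },
    ],
  };
}

// OpenAI Responses: the result is a standalone item keyed by call_id.
function toOpenAIToolResult(result: NeutralToolResult) {
  return {
    type: "function_call_output" as const,
    call_id: result.callId,
    output: result.output,
  };
}

// Gemini: the result is a functionResponse part inside a user turn.
// The { result: ... } wrapper is an assumption of this sketch.
function toGeminiToolResult(result: NeutralToolResult) {
  return {
    role: "user" as const,
    parts: [{ functionResponse: { name: result.name, response: { result: result.output } } }],
  };
}
```

The main asymmetry is that OpenAI expects stringified arguments while Anthropic and Gemini take objects, and that Gemini correlates results by function name rather than by a call identifier in this shape.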

Choosing a Provider

  • Choose OpenAI when you want Responses-first model coverage, automatic prompt caching on supported models, and the broadest ecosystem compatibility.
  • Choose Anthropic when long-context tool workflows or prompt caching behavior is central to the workload.
  • Choose Gemini when you need a single provider surface that is comfortable with mixed text, vision, document, and audio inputs.

Operational Notes

  • All three adapters normalize token usage into a shared UsageMetrics shape and estimate cost from the model registry.
  • All three adapters map auth, rate-limit, context-window, and generic provider failures into typed LLMError subclasses.
  • OpenAI and Gemini cached-read usage is priced separately from uncached input when the provider returns cached-token counts.
  • Live-provider smoke tests can be executed with LIVE_TESTS=1 pnpm test:live after populating .env.
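To illustrate the usage normalization and cached-read pricing notes, here is a minimal sketch. The usage field names on the provider side (input_tokens_details.cached_tokens for OpenAI, cachedContentTokenCount for Gemini) follow the providers' public responses, but the UsageMetrics fields, the ModelPrice shape, and the function names are assumptions, since this page does not show the library's real definitions. Reasoning or thinking token counts are ignored here for brevity.

```typescript
// Assumed shared shape; the library's real UsageMetrics may differ.
interface UsageMetrics {
  inputTokens: number;      // uncached input tokens only
  cachedReadTokens: number; // input tokens served from the provider cache
  outputTokens: number;
}

// Hypothetical registry entry: USD per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  cachedReadPerMTok: number;
  outputPerMTok: number;
}

// OpenAI reports cached tokens as a subset of input_tokens.
function fromOpenAIUsage(u: {
  input_tokens: number;
  output_tokens: number;
  input_tokens_details?: { cached_tokens?: number };
}): UsageMetrics {
  const cached = u.input_tokens_details?.cached_tokens ?? 0;
  return {
    inputTokens: u.input_tokens - cached,
    cachedReadTokens: cached,
    outputTokens: u.output_tokens,
  };
}

// Gemini reports cached tokens as a subset of promptTokenCount.
function fromGeminiUsage(u: {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  cachedContentTokenCount?: number;
}): UsageMetrics {
  const cached = u.cachedContentTokenCount ?? 0;
  return {
    inputTokens: (u.promptTokenCount ?? 0) - cached,
    cachedReadTokens: cached,
    outputTokens: u.candidatesTokenCount ?? 0,
  };
}

// Price cached reads separately from uncached input.
function estimateCostUsd(usage: UsageMetrics, price: ModelPrice): number {
  return (
    (usage.inputTokens * price.inputPerMTok +
      usage.cachedReadTokens * price.cachedReadPerMTok +
      usage.outputTokens * price.outputPerMTok) /
    1_000_000
  );
}
```

When a provider omits the cached-token count, both converters treat all input as uncached, which matches the note that separate pricing applies only when the count is returned.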

Provider-agnostic LLM tooling for TypeScript applications.