Session API Reference
This page documents the built-in, framework-agnostic session API exposed by createSessionApi().
Use this when you want your application or frontend to work with conversations over HTTP instead of instantiating Conversation directly inside the same process.
Overview
The library exposes a session API through createSessionApi({ client, sessionStore, ... }).
It is intentionally built on top of standard web Request and Response objects, so it can be mounted in:
- Next.js route handlers
- Hono
- Cloudflare Workers
- Express or Fastify adapters
- plain Node HTTP wrappers
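Because the API is built on standard Request and Response objects, mounting it mostly means handing the host's request to a single function. The sketch below assumes a hypothetical handler of the shape (request: Request) => Promise<Response>. This page does not show the actual method name on the object returned by createSessionApi(), so a stub named handleSessionRequest stands in for it.

```typescript
// Hypothetical shape: one function that maps a web Request to a Response.
type SessionHandler = (request: Request) => Promise<Response>;

// Stub standing in for the real session API handler (an assumption, not the library's API).
const handleSessionRequest: SessionHandler = async (request) => {
  const { pathname } = new URL(request.url);
  return new Response(JSON.stringify({ method: request.method, path: pathname }), {
    status: 200,
    headers: { 'content-type': 'application/json' },
  });
};

// Cloudflare Workers style: the runtime calls fetch(request) on the exported object.
const worker = {
  fetch: (request: Request): Promise<Response> => handleSessionRequest(request),
};

// Next.js route handler style: each HTTP verb is a function of the same shape.
const routeHandlers = {
  GET: handleSessionRequest,
  POST: handleSessionRequest,
  DELETE: handleSessionRequest,
};
```

Frameworks with their own request type usually expose the underlying web Request (in Hono, for example, it is available as c.req.raw), so the same handler can be reused there.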
Construction
import {
  LLMClient,
  PostgresSessionStore,
  createSessionApi,
} from 'unified-llm-client';
const sessionStore = PostgresSessionStore.fromEnv();
const client = LLMClient.fromEnv({
  defaultModel: 'gpt-4o',
  sessionStore,
});
const sessionApi = createSessionApi({
  client,
  sessionStore,
});
Important Runtime Notes
- A session store is required for durable API behavior.
- If the client already has a store configured, SessionApi can reuse it.
- contextManager and executable tools are server-side configuration, not client-submitted payloads.
- Middleware can resolve tenant identity or reject requests before any model call runs.
- withRequestContext(context, execute) is the hook for request-local DB scoping or RLS-style session setup.
Endpoints
POST /sessions
Creates and persists a new session snapshot.
Example body:
{
  "sessionId": "support-123",
  "system": "Be concise.",
  "messages": [
    { "role": "user", "content": "Initial history item" }
  ],
  "model": "gpt-4o"
}
Returns 201 with:
- session.id
- session.createdAt
- session.updatedAt
- session.messages
- session.totals
POST /sessions/{id}/message
Loads the stored session, appends one user input, runs the model, persists the updated snapshot, and returns the canonical response plus updated session state.
Example body:
{
  "content": "Summarize the latest ticket."
}
If stream=true is passed in the query string or request body, the endpoint returns text/event-stream.
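A minimal client-side sketch of this call, showing where the stream flag goes. The helper name buildMessageRequest and the base URL are hypothetical. The helper returns plain url and init values that can be passed straight to fetch(url, init).

```typescript
interface MessageRequestParts {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the pieces of a POST /sessions/{id}/message call.
// baseUrl is expected to be an origin such as http://localhost:3000 (assumption).
function buildMessageRequest(
  baseUrl: string,
  sessionId: string,
  content: string,
  stream = false,
): MessageRequestParts {
  const url = new URL(`/sessions/${encodeURIComponent(sessionId)}/message`, baseUrl);
  // The flag is sent in the query string here; the body is the other documented option.
  if (stream) url.searchParams.set('stream', 'true');
  return {
    url: url.toString(),
    init: {
      method: 'POST',
      headers: {
        'content-type': 'application/json',
        // Streaming responses arrive as text/event-stream, others as JSON.
        accept: stream ? 'text/event-stream' : 'application/json',
      },
      body: JSON.stringify({ content }),
    },
  };
}
```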
GET /sessions/{id}
Returns the session metadata and full history.
The query parameter include accepts comma-separated values:
- messages
- usage
- cost
By default, this route includes messages and cost. usage is included when explicitly requested and a usage logger with aggregation support is configured.
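The include rules above can be sketched as a small resolver. This is an illustration rather than the library's implementation. It assumes that an explicit include list replaces the defaults and that unknown values are ignored, neither of which is stated on this page. The name resolveInclude and the hasUsageAggregation flag are hypothetical.

```typescript
const SECTIONS = ['messages', 'usage', 'cost'] as const;
type Section = (typeof SECTIONS)[number];

// Decides which sections GET /sessions/{id} would return for a given include value.
function resolveInclude(param: string | null, hasUsageAggregation: boolean): Section[] {
  // No include parameter: fall back to the documented defaults.
  const requested: string[] =
    param === null || param.trim() === ''
      ? ['messages', 'cost']
      : param.split(',').map((value) => value.trim());
  // usage needs both an explicit request and a usage logger with aggregation support.
  return SECTIONS.filter(
    (section) => requested.includes(section) && (section !== 'usage' || hasUsageAggregation),
  );
}
```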
GET /sessions/{id}/messages
Returns paginated session history.
Supported query parameters:
- cursor: Zero-based offset encoded as a string
- limit: Range 1..100
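To make the cursor and limit semantics concrete, here is a hedged sketch of offset pagination over an in-memory history. The nextCursor field, the default limit of 20, and the choice to clamp an out-of-range limit rather than reject it are assumptions; this page only states that cursor is a zero-based offset string and that limit ranges from 1 to 100.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // hypothetical response field
}

function paginate<T>(history: readonly T[], cursor?: string, limit = 20): Page<T> {
  // The cursor is a zero-based offset encoded as a string; absent means start at 0.
  if (cursor !== undefined && !/^\d+$/.test(cursor)) {
    throw new RangeError('cursor must be a non-negative integer string');
  }
  const offset = cursor === undefined ? 0 : Number.parseInt(cursor, 10);
  // Clamp the page size into the documented 1..100 range (assumed behavior).
  const size = Math.min(100, Math.max(1, Math.trunc(limit)));
  const items = history.slice(offset, offset + size);
  const next = offset + items.length;
  // A null cursor signals that the history is exhausted.
  return { items, nextCursor: next < history.length ? String(next) : null };
}
```

A client would keep calling with the returned nextCursor until it is null.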
DELETE /sessions/{id}
Deletes the stored session snapshot.
POST /sessions/{id}/compact
Performs manual compaction against the stored session snapshot.
Options:
- Provide maxMessages and/or maxTokens in the body to use a one-off SlidingWindowStrategy
- Or configure contextManager when constructing SessionApi
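As a rough illustration of what a maxMessages window does to a history, the sketch below keeps the most recent messages. It is not the library's SlidingWindowStrategy. It assumes system messages are always preserved and count toward maxMessages, and it leaves out maxTokens because token counting is model-specific.

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Keeps system messages plus the newest remaining messages, up to maxMessages in total.
function slidingWindow(messages: readonly Message[], maxMessages: number): Message[] {
  if (!Number.isInteger(maxMessages) || maxMessages < 1) {
    throw new RangeError('maxMessages must be a positive integer');
  }
  const system = messages.filter((message) => message.role === 'system');
  const rest = messages.filter((message) => message.role !== 'system');
  const keep = Math.max(0, maxMessages - system.length);
  // slice(-0) would return everything, so the zero case is handled explicitly.
  const recent = keep === 0 ? [] : rest.slice(-keep);
  return [...system, ...recent];
}
```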
POST /sessions/{id}/fork
Creates a new session from an earlier point in the conversation.
Example body:
{
  "fromMessageIndex": 3,
  "newSessionId": "support-123-branch",
  "resetUsage": true
}
Notes:
- fromMessageIndex is evaluated against the full session history, including the system message when present
- resetUsage defaults to true
- When resetUsage is false, the stored aggregate totals are copied into the fork unchanged
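The notes above can be illustrated with a small in-memory fork. This is a sketch under stated assumptions, not the library's code. It treats fromMessageIndex as inclusive (the message at that index is kept), which this page does not confirm, and the fields of Totals are hypothetical placeholders.

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}
// Hypothetical totals shape; the real field names are not listed on this page.
interface Totals {
  inputTokens: number;
  outputTokens: number;
  cost: number;
}
interface SessionSnapshot {
  id: string;
  messages: Message[];
  totals: Totals;
}
interface ForkOptions {
  fromMessageIndex: number;
  newSessionId: string;
  resetUsage?: boolean;
}

function forkSession(source: SessionSnapshot, options: ForkOptions): SessionSnapshot {
  const { fromMessageIndex, newSessionId, resetUsage = true } = options;
  const inRange =
    Number.isInteger(fromMessageIndex) &&
    fromMessageIndex >= 0 &&
    fromMessageIndex < source.messages.length;
  if (!inRange) throw new RangeError('fromMessageIndex is outside the session history');
  return {
    id: newSessionId,
    // The index counts the system message too, so index 0 is the system message when present.
    // Assumption: the cut is inclusive.
    messages: source.messages.slice(0, fromMessageIndex + 1),
    // resetUsage defaults to true; otherwise the totals are copied unchanged.
    totals: resetUsage ? { inputTokens: 0, outputTokens: 0, cost: 0 } : { ...source.totals },
  };
}
```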
GET /sessions
Lists tenant-scoped sessions with pagination.
Supported query parameters:
- cursor
- limit
- model
- provider
- tenantId
Streaming Event Mapping
POST /sessions/{id}/message?stream=true emits canonical SSE events:
- session.message.started
- response.text.delta
- response.tool_call.start
- response.tool_call.delta
- response.tool_call.result
- response.completed
- response.error
These are canonical library events, not raw provider transport frames.
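A client reading the stream needs to split the text/event-stream body into events. The sketch below is a minimal parser for already-buffered SSE text, following the standard event:/data: framing. It assumes the canonical event name travels in the event: field, which this page implies but does not spell out. Payload shapes are not documented here, so the data is returned as a raw string.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses complete SSE frames from a text buffer. Frames are separated by a blank line.
function parseSseFrames(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let event = 'message'; // SSE default when no event: field is present
    const data: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith('event:')) {
        event = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        // The SSE format strips one optional leading space after the colon.
        data.push(line.slice('data:'.length).replace(/^ /, ''));
      }
      // Lines starting with ':' are comments and are ignored, as are other fields.
    }
    // Frames without data (including the trailing empty one) are skipped.
    if (data.length > 0) events.push({ event, data: data.join('\n') });
  }
  return events;
}
```

A consumer would typically switch on event, append response.text.delta payloads to the visible output, and stop on response.completed or response.error.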
Tenant Auth And RLS Context
middleware is the authentication and tenant-resolution layer.
Example:
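const sessionApi = createSessionApi({
  client,
  sessionStore,
  middleware: [
    // Resolve the tenant from a header, or reject the request before any model call runs.
    async (request) => {
      const tenantId = request.headers.get('x-tenant-id');
      return tenantId
        ? { tenantId }
        : Response.json({ error: 'Unauthorized' }, { status: 401 });
    },
  ],
  // Set up request-local DB scoping (for example, RLS session state) here, then run the handler.
  withRequestContext: async (context, execute) => {
    return execute();
  },
});
Operational rule: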
- If middleware resolves a tenant id, it overrides any tenant id the caller attempted to send in body or query parameters
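The precedence in this rule can be sketched in a few lines. The function name effectiveTenantId is hypothetical. Falling back to the caller-supplied value when middleware resolves nothing is an assumption, since this page only defines the override case.

```typescript
// Returns the tenant id a request should be scoped to.
function effectiveTenantId(
  middlewareTenantId: string | undefined,
  callerTenantId: string | null | undefined,
): string | undefined {
  // A middleware-resolved tenant always wins over body or query values.
  if (middlewareTenantId !== undefined) return middlewareTenantId;
  // Assumed fallback: use the caller's value only when middleware resolved nothing.
  return callerTenantId ?? undefined;
}

// Example: the caller tries to read another tenant's sessions through the query string.
const listUrl = new URL('http://localhost/sessions?tenantId=other-tenant');
const scopedTenant = effectiveTenantId('acme', listUrl.searchParams.get('tenantId'));
```

Here scopedTenant is 'acme', so the spoofed query value has no effect.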
OpenAI Responses API Mapping
Previous Response Chains
POST /sessions/{id}/message is the provider-agnostic equivalent of building continuity from a prior response id.
Instead of resending a previous-response handle, the library anchors continuity to the stored sessionId.
Conversation Identity
sessionId is the practical equivalent of a conversation or thread identifier.
Context Management
maxContextTokens + contextManager maps to the same operational problem as compact-threshold style context management:
- maxContextTokens is the threshold
- SlidingWindowStrategy or SummarisationStrategy is the trimming policy
- POST /sessions/{id}/compact is the explicit manual trigger
Future Async Handling
The current endpoint design is synchronous.
The intended future async shape is:
- keep POST /sessions/{id}/message for sync flows
- add an async mode that returns a job id immediately
- persist job state separately from the conversation snapshot
- merge the final completion back into the session history when done
Include Semantics
GET /sessions/{id}?include=messages,usage,cost is the equivalent of asking for expanded response metadata in one request.