0xtrace
SDK v1.0.4

0xtrace Documentation

AI observability for LLM applications. Intercept every call, visualize prompt deltas, and kill context bloat before it kills your budget.

Quickstart

Get your first trace into the dashboard in under 5 minutes.

1. Install the SDK
2. Wrap your client
3. See your traces

Installation

bash
npm install 0xtrace

The SDK works with any OpenAI-compatible provider — OpenAI, Groq, Together AI, Mistral, and any service that exposes the chat.completions.create interface.

Basic Setup

typescript
import OpenAI from "openai";
import { Tracer, wrapOpenAI } from "0xtrace";

const tracer = new Tracer({
  ingestUrl: "https://your-app.vercel.app/api/ingest",
  apiKey:    process.env.INGEST_API_KEY,
  sessionId: crypto.randomUUID(), // groups calls into one agent run
});

const client = wrapOpenAI(new OpenAI(), tracer);

// Use exactly like the original client — nothing else changes
const response = await client.chat.completions.create({
  model:    "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});

With Groq

typescript
import OpenAI from "openai";
import { Tracer, wrapOpenAI } from "0xtrace";

const tracer = new Tracer({
  ingestUrl: "https://your-app.vercel.app/api/ingest",
  apiKey:    process.env.INGEST_API_KEY,
  sessionId: crypto.randomUUID(),
});

// Pass Groq as an OpenAI-compatible client
const client = wrapOpenAI(
  new OpenAI({
    apiKey:  process.env.GROQ_API_KEY,
    baseURL: "https://api.groq.com/openai/v1",
  }),
  tracer,
);

Tip: The ingest URL and API key are found in your project Settings page. Set them as environment variables; never hardcode them.

SDK Reference

Tracer Options

Pass these options when constructing a Tracer instance.

Option | Type | Required | Description
ingestUrl | string | required | Full URL of your /api/ingest endpoint.
apiKey | string | required | Your project API key from the Settings page.
sessionId | string | optional | Groups multiple calls into one agent session. Auto-generated if omitted.
metadata | Record<string, string> | optional | Arbitrary key/value pairs attached to every trace (e.g. userId, environment).
enabled | boolean | optional | Set false to disable telemetry entirely, e.g. in unit tests. Default: true.
timeoutMs | number | optional | Max ms to wait for the ingest POST before aborting. Default: 5000.
onError | (err, payload) => void | optional | Called when ingest fails after all retries. Defaults to console.warn.
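
As a rough illustration of how these options fit together, the sketch below builds an options object from environment-style variables. The TracerOptions interface is transcribed from the options listed above, not imported from the SDK, so the real exported type may differ. The buildTracerOptions helper and the INGEST_URL variable name are assumptions made for this example.

```typescript
// Shape transcribed from the documented options; the SDK's own type may differ.
interface TracerOptions {
  ingestUrl: string;
  apiKey: string;
  sessionId?: string;
  metadata?: Record<string, string>;
  enabled?: boolean;
  timeoutMs?: number;
  onError?: (err: unknown, payload: unknown) => void;
}

// Hypothetical helper: builds options from an env-like map and fails fast
// when a required value is missing.
function buildTracerOptions(
  env: Record<string, string | undefined>,
): TracerOptions {
  const ingestUrl = env["INGEST_URL"]; // assumed variable name
  const apiKey = env["INGEST_API_KEY"];
  if (!ingestUrl || !apiKey) {
    throw new Error("INGEST_URL and INGEST_API_KEY must both be set");
  }
  const environment = env["NODE_ENV"] ?? "development";
  return {
    ingestUrl,
    apiKey,
    metadata: { environment },
    enabled: environment !== "test", // disable telemetry in unit tests
    timeoutMs: 5000, // the documented default, written out for clarity
  };
}
```

The result would presumably be passed straight to the constructor, e.g. new Tracer(buildTracerOptions(process.env)).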

wrapOpenAI(client, tracer)

Wraps an OpenAI client instance with a transparent telemetry proxy. Returns a drop-in replacement with the same types and the same streaming behaviour. Telemetry is sent off the request path, so the measured overhead stays under 2 ms. Works with both standard and streaming responses.

typescript
import OpenAI from "openai";
import { Tracer, wrapOpenAI } from "0xtrace";

const tracer = new Tracer({ ingestUrl: "...", apiKey: "..." });
const ai     = wrapOpenAI(new OpenAI(), tracer);

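// Example conversation; substitute your own messages
const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
  { role: "user", content: "Hello" },
];

// Streaming works identically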
const stream = await ai.chat.completions.create({
  model:    "gpt-4o",
  messages,
  stream:   true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

tracer.flush()

Waits for all buffered and in-flight payloads to be delivered. Call this before process exit or at the end of integration tests to ensure no traces are dropped.

typescript
// In tests
afterAll(async () => {
  await tracer.flush();
});

// On process exit
process.on("SIGTERM", async () => {
  await tracer.flush();
  process.exit(0);
});

Note: In serverless environments (Vercel, AWS Lambda) the process may be frozen before flush completes. The SDK batches and retries automatically, so most traces will arrive, but calling flush() before returning from a long-running route is good practice.
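
One way to apply this advice in a serverless route without letting a slow ingest endpoint hold the response open is to bound the wait. The sketch below is not part of the SDK: flushWithDeadline and the Flushable interface are hypothetical names, and the only assumption about the tracer is that flush() returns a promise, as described above.

```typescript
// Anything with an async flush(); the Tracer described above fits this shape.
interface Flushable {
  flush(): Promise<void>;
}

// Hypothetical helper: waits for flush() but gives up after `limitMs`.
function flushWithDeadline(
  tracer: Flushable,
  limitMs: number,
): Promise<"flushed" | "timed-out"> {
  return new Promise<"flushed" | "timed-out">((resolve, reject) => {
    // If the timer fires first, report a timeout; a later settle is ignored.
    const timer = setTimeout(() => resolve("timed-out"), limitMs);
    tracer.flush().then(
      () => {
        clearTimeout(timer);
        resolve("flushed");
      },
      (err: unknown) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

In a route handler this might be called as await flushWithDeadline(tracer, 2000) just before returning the response. The 2000 ms budget is an arbitrary example value.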

Dashboard Guide

Sessions

Every agent run grouped by session ID. Each row shows the model, number of steps, total token usage, cost, average latency, and anomaly status. Click a row to open the session detail.

Explorer

Raw call browser — one row per LLM call, not per session. Filter by model, status, or session ID prefix. Sort by any column. Use this to find a specific call when you know its rough timestamp.

Diff X-Ray

Step-by-step prompt delta visualizer. Green lines were added to the context, red lines were removed. The metadata panel shows tokens, cost, latency, and context window usage for that specific call.

Cost Analysis

14-day spend chart, model breakdown table (cost, tokens, calls, avg latency per model), top 10 sessions by cost, and a latency distribution histogram.

Anomalies

Auto-detected and SDK-flagged issues. Four detection types: token explosion (context grew >2.5× session avg), high latency (>5s), session cost spike (>5× account avg), and explicit SDK flags.
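
The four rules can be written down as a small classifier. This is only a sketch of the thresholds quoted above (2.5× session average context, 5 s latency, 5× account average session cost, explicit flag). The type names, field names, and the way baselines are supplied are assumptions for illustration, not the dashboard's actual implementation.

```typescript
type AnomalyKind =
  | "token_explosion"
  | "high_latency"
  | "cost_spike"
  | "sdk_flag";

// Hypothetical per-call sample; field names are illustrative only.
interface CallSample {
  contextTokens: number; // prompt tokens sent with this call
  latencyMs: number; // wall-clock latency of the call
  sessionCostUsd: number; // running cost of the call's session
  flagged: boolean; // explicit flag raised through the SDK
}

// Hypothetical baselines the thresholds are measured against.
interface Baselines {
  sessionAvgContextTokens: number;
  accountAvgSessionCostUsd: number;
}

// Returns every rule the call trips, in the order the rules are listed above.
function detectAnomalies(call: CallSample, base: Baselines): AnomalyKind[] {
  const found: AnomalyKind[] = [];
  if (call.contextTokens > 2.5 * base.sessionAvgContextTokens) {
    found.push("token_explosion");
  }
  if (call.latencyMs > 5000) {
    found.push("high_latency");
  }
  if (call.sessionCostUsd > 5 * base.accountAvgSessionCostUsd) {
    found.push("cost_spike");
  }
  if (call.flagged) {
    found.push("sdk_flag");
  }
  return found;
}
```

For example, a call with 3,000 context tokens in a session averaging 1,000 trips the token explosion rule, because 3,000 exceeds 2.5 × 1,000 = 2,500.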

Replay Engine

Re-fire any captured prompt against any model. Edit the messages, switch the model, and compare outputs side by side. Useful for prompt optimization without re-running your entire agent.

Settings

Manage API keys for the active project — generate new keys, revoke compromised ones. The ingest URL and a code snippet are shown here for easy copy-paste into your environment variables.

Self-Hosting

0xtrace is MIT-licensed and fully self-hostable. You need a Supabase project, an Upstash Redis database, and a Vercel account (free tier works).

Environment Variables

bash
# .env.local
NEXT_PUBLIC_SUPABASE_URL=         # Supabase → Project Settings → API
NEXT_PUBLIC_SUPABASE_ANON_KEY=    # Supabase → Project Settings → API
SUPABASE_SERVICE_ROLE_KEY=        # Supabase → Project Settings → API (secret)
UPSTASH_REDIS_REST_URL=           # Upstash → database → REST API
UPSTASH_REDIS_REST_TOKEN=         # Upstash → database → REST API
CRON_SECRET=                      # Any long random string you generate
NEXT_PUBLIC_APP_URL=              # Your deployment URL, e.g. https://0xtrace-mu.vercel.app
OPENAI_API_KEY=                   # Required for Replay Engine → OpenAI models
GROQ_API_KEY=                     # Required for Replay Engine → Groq models

Database Setup

Run the migrations in your Supabase SQL Editor. The schema creates:

profiles: Auto-created on GitHub OAuth signup via trigger
projects: Workspace isolation boundary, one per developer team
api_keys: Hashed keys scoped to a project
llm_calls: One row per LLM call with full metrics
prompt_snapshots: Keyframe + delta storage for prompt arrays

Info: Row Level Security is enabled on all tables. Users can only access rows where their auth.uid() matches the user_id on the projects table. The ingestion pipeline uses the service role key to bypass RLS after validating the API key.

Cron Setup

The drain-queue route runs every minute. On Vercel Hobby, cron jobs are limited to daily — use cron-job.org (free) to call it every minute instead.

text
# cron-job.org configuration
URL:      https://your-app.vercel.app/api/cron/drain-queue
Schedule: Every 1 minute
Header:   Authorization: Bearer YOUR_CRON_SECRET
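
On the receiving side, the route has to check that header before draining anything. The source does not show how the real drain-queue route validates it, so the following is a minimal sketch under that caveat: isAuthorizedCronRequest and constantTimeEqual are hypothetical helpers, and the only grounded detail is the "Authorization: Bearer YOUR_CRON_SECRET" format shown above.

```typescript
// Compares two strings without returning early on the first mismatch,
// so the comparison time does not reveal how many characters matched.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` maps that to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Hypothetical check for the cron route: the header must be exactly
// "Bearer <CRON_SECRET>". Missing header or missing secret fails closed.
function isAuthorizedCronRequest(
  authorizationHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  if (!authorizationHeader || !cronSecret) {
    return false;
  }
  return constantTimeEqual(authorizationHeader, `Bearer ${cronSecret}`);
}
```

A route handler would typically respond with 401 when this returns false and only then proceed to drain the queue.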

Architecture

text
Your AI App
  └── wrapOpenAI(client, tracer)
        │  Proxy intercepts chat.completions.create
        │  Telemetry fires in microtask — <2ms overhead
        ▼
  POST /api/ingest
        │  Validates API key (SHA-256 hash lookup)
        │  Resolves project_id from key
        │  Pushes trace to Redis queue
        ▼
  Upstash Redis (trace:queue)
        │  Absorbs burst traffic & infinite agent loops
        │  Never blocks the caller
        ▼
  GET /api/cron/drain-queue  (runs every minute)
        │  Pops up to 100 traces atomically
        │  Step 1 → stores full prompt snapshot
        │  Step 2+ → stores JSON diff only (~85% storage reduction)
        │  Inserts into llm_calls + prompt_snapshots
        ▼
  Supabase PostgreSQL
        │  All rows scoped to project_id
        │  RLS enforces tenant isolation
        ▼
  Dashboard
        └── Scoped to active project via oxtr_project cookie
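
The "SHA-256 hash lookup" step in the diagram can be illustrated as follows. This is a sketch only: an in-memory Map stands in for the api_keys table, the key and project values are made up, and the function names are hypothetical. The hex encoding of the digest is also an assumption, since the source only names the hash function.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a plaintext API key.
function hashApiKey(plaintextKey: string): string {
  return createHash("sha256").update(plaintextKey, "utf8").digest("hex");
}

// Stand-in for the api_keys table: key hash -> project_id (made-up data).
// Only the hash is kept; the plaintext key is never stored.
const keyHashToProject = new Map<string, string>([
  [hashApiKey("demo-key-123"), "project-a"],
]);

// Returns the project_id for a presented key, or null if the key is unknown.
function resolveProjectId(presentedKey: string): string | null {
  return keyHashToProject.get(hashApiKey(presentedKey)) ?? null;
}
```

Because the lookup is by hash, a leaked copy of the table does not reveal usable keys, which matches the FAQ statement that the plaintext is shown once and never persisted.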

Keyframe + Delta Storage

Most observability tools store the full prompt array on every step. For a 10-step agent, that's 10 copies of an ever-growing JSON blob. 0xtrace uses a keyframe + delta model instead:

Without 0xtrace
Step 1: 500 tokens stored
Step 2: 2,800 tokens stored
Step 3: 12,400 tokens stored
Step 4: 34,000 tokens stored
Step 5: 84,200 tokens stored
Total: ~134k tokens in DB

With 0xtrace
Step 1: 500 tokens stored (keyframe)
Step 2: ~40 tokens stored (delta)
Step 3: ~60 tokens stored (delta)
Step 4: ~80 tokens stored (delta)
Step 5: ~90 tokens stored (delta)
Total: ~770 tokens in DB
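
To make the keyframe + delta idea concrete, here is a minimal sketch. It assumes the prompt array mostly grows by appending messages, and encodes each step as "keep the first N messages of the previous step, then append these". The actual JSON diff format used by 0xtrace is not shown in the source, so the Snapshot type and both functions are illustrative only.

```typescript
interface Message {
  role: string;
  content: string;
}

type Snapshot =
  | { kind: "keyframe"; messages: Message[] }
  | { kind: "delta"; keep: number; append: Message[] };

const sameMessage = (a: Message | undefined, b: Message | undefined): boolean =>
  a !== undefined &&
  b !== undefined &&
  a.role === b.role &&
  a.content === b.content;

// Encode `next` relative to `prev`: count the shared prefix, store only the rest.
// The first step of a session (prev === null) is stored in full.
function encodeStep(prev: Message[] | null, next: Message[]): Snapshot {
  if (prev === null) {
    return { kind: "keyframe", messages: next };
  }
  let keep = 0;
  while (
    keep < prev.length &&
    keep < next.length &&
    sameMessage(prev[keep], next[keep])
  ) {
    keep++;
  }
  return { kind: "delta", keep, append: next.slice(keep) };
}

// Replay a chain of snapshots to rebuild the prompt array at the last step.
function reconstruct(chain: Snapshot[]): Message[] {
  let current: Message[] = [];
  for (const snap of chain) {
    current =
      snap.kind === "keyframe"
        ? [...snap.messages]
        : [...current.slice(0, snap.keep), ...snap.append];
  }
  return current;
}
```

With this encoding, a step that only appends two messages to a long conversation stores just those two messages plus one integer, which is where the storage saving in the comparison above comes from.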

FAQ

Does the SDK add latency to my LLM calls?

Not measurably. Telemetry fires in a microtask (Promise.resolve().then()) after your code continues, and the measured overhead is under 2 ms even in high-frequency loops.
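
The ordering this relies on can be seen in a small standalone sketch. The tracedCall function below is a hypothetical, synchronous stand-in for the wrapped client, not SDK code. It only demonstrates that work scheduled with Promise.resolve().then() runs after the caller's current synchronous code has finished.

```typescript
const events: string[] = [];

// Hypothetical stand-in for a wrapped call: produces the response right away
// and defers the telemetry work to a microtask.
function tracedCall(prompt: string): string {
  const response = `echo: ${prompt}`;
  Promise.resolve().then(() => {
    // Runs only after the current synchronous code has finished.
    events.push("telemetry sent");
  });
  events.push("response returned");
  return response;
}

tracedCall("Hello");
events.push("caller continued");
// Once the microtask queue has drained, `events` is:
// ["response returned", "caller continued", "telemetry sent"]
```

The caller gets its response and keeps going before any telemetry code runs, so the telemetry work never sits between the call and its result.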

What happens if the ingest endpoint is down?

The SDK retries up to 3 times with exponential backoff (200ms → 400ms → 800ms). If all retries fail, the onError callback fires and your application continues normally. Your agent is never blocked.
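
The retry behaviour described here can be sketched as below. This is an illustration under stated assumptions, not the SDK's code: it reads "retries up to 3 times" as one initial attempt plus three retries, waits 200, 400, and 800 ms before each retry, and takes the sleep function as a parameter so it can be replaced in tests. The sendWithRetry name and signature are hypothetical.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// One initial attempt plus one retry per entry in `delaysMs`.
// Resolves to true on success, false after the final failure.
async function sendWithRetry(
  send: () => Promise<void>,
  onError: (err: unknown) => void,
  delaysMs: number[] = [200, 400, 800],
  sleep: Sleep = realSleep,
): Promise<boolean> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delaysMs.length; attempt++) {
    try {
      await send();
      return true;
    } catch (err) {
      lastError = err;
      const delay = delaysMs[attempt];
      // No delay is defined after the last attempt, so the loop just ends.
      if (delay !== undefined) {
        await sleep(delay);
      }
    }
  }
  // The error is reported, never rethrown, so the application keeps running.
  onError(lastError);
  return false;
}
```

Because the failure is routed to onError instead of being thrown, a dead ingest endpoint costs at most the backoff time in the background and never surfaces as an exception in the agent.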

What happens if my agent loops infinitely?

The Redis queue absorbs the traffic. The cron drains up to 100 traces per run, once a minute, regardless of how many arrive. User traffic never hits your Supabase database directly.
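
A quick way to reason about this is to model the queue as an array and the cron as a fixed-size batch pop. The sketch below uses an in-memory array in place of Redis, and both function names are hypothetical. It only illustrates the 100-per-run cap and what that implies for backlog drain time.

```typescript
// Removes and returns up to `maxBatch` items from the front of the queue.
// In-memory stand-in for the atomic pop the cron performs against Redis.
function drainBatch<T>(queue: T[], maxBatch: number = 100): T[] {
  return queue.splice(0, maxBatch);
}

// Number of one-minute cron runs needed to clear a backlog,
// assuming no new traces arrive in the meantime.
function minutesToDrain(backlog: number, batchSize: number = 100): number {
  return Math.ceil(backlog / batchSize);
}
```

For instance, a runaway loop that enqueues 250 traces is cleared in three runs (100, 100, then 50), and the database sees at most 100 traces per minute throughout.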

Is my prompt data stored in plaintext?

Prompt arrays are stored in Supabase under your own project's service role key. API keys are SHA-256 hashed before storage — the plaintext is shown once and never persisted. You control the database.

Can I use this with Anthropic / Gemini / other providers?

Any provider that exposes an OpenAI-compatible API (chat.completions.create) works out of the box. Native Anthropic SDK support is on the roadmap.

How do I group calls from one agent run together?

Pass a sessionId when constructing the Tracer. All calls from that instance are grouped under the same session in the dashboard. Use crypto.randomUUID() per agent invocation.
