Why We Built Tracia

Every LLM observability tool made us do too much work before we could do any work. So we built one that doesn't.

Daniel Marchuk

Every LLM app I've built has started the same way. I write a prompt, call an API, get a response. It works. Then I think: I should probably track what's happening here.

And that's where things go sideways.

I've tried the existing tools. Every time, it's the same pattern: setting environment variables, wrapping clients, adding decorators to my functions. All before I can write a single line of product code. Then I realize tracing and prompt management are separate things I need to wire together myself. And if I want to change a prompt or switch models, that's a code change and a redeploy.

I kept thinking: I just want to see what my prompts are doing. Why does this require so much plumbing?

The landscape today

The tools that exist today are good at what they do. But every one of them forced a tradeoff, and none of them got out of the way fast enough.

Setup tax. Environment variables, client wrappers, decorators. Some tools need multiple API calls just to fetch and run a prompt. All of this before I can write a single line of product code.

Model support gaps. Playgrounds that let you configure parameter combinations specific models and providers don't actually support. No guardrails, just silent failures or confusing errors when you hit the provider's real constraints.

Built for engineers only. When a PM or non-technical teammate wanted to test a prompt, the interface was a wall of config. These tools are clearly built by engineers for engineers.

Prompt and tracing as separate features. You set up tracing in one place, prompt management in another, then wire them together yourself.

Proxy or nothing. Some tools route all your LLM traffic through their proxy. Simple to set up, but if you don't want that dependency in your call path, you need a different tool entirely.

Price shock. Some platforms start at $500/month. For a team that just wants to see what their prompts are doing, that's hard to justify.

OpenAI-centric. Documentation and examples focused almost entirely on OpenAI. If you're using Anthropic, Google, or Bedrock, you're often on your own.

Every tool I tried solved part of the problem. Tracing over here, prompt management over there, evaluations somewhere else. I wanted one tool that did all of it, with less setup than any of them.

The insight

Most LLM applications aren't complex agent systems with branching chains and dozens of tools. They're prompt-in, response-out. A customer support bot. A content summarizer. A code reviewer. Simple workflows that need simple observability.

But current tools are built for the complex case. They assume you want fine-grained control over every span. They assume you'll invest time in SDK instrumentation. They assume you're running an infrastructure team that can manage self-hosted deployments.

Most developers don't have that. Most developers want to add observability in 5 minutes and move on to building their actual product.

That's the gap. That's why we built Tracia.

One line of code

With Tracia, your prompts live in the dashboard, not scattered across your codebase. You write your prompt once in the editor, define your variables, pick your model, and call it like this:

import { Tracia } from 'tracia';

const tracia = new Tracia({ apiKey: 'tr_xxx' });

const response = await tracia.prompts.run('welcome-email', {
  name: 'Alice',
  product: 'Tracia'
});
// ✓ Prompt fetched, rendered, executed, and traced automatically

One API key. No decorators. No environment variables beyond the key itself. Tracia fetches the prompt, renders your variables, executes it against your chosen provider (OpenAI, Anthropic, Google, or Amazon Bedrock), and traces the call automatically. Every trace links back to its prompt and version.

Want to change the prompt? Edit it in the dashboard. It's live instantly. No code change, no redeploy. Want to switch from GPT-4o to Claude? Change the model in the dropdown. Same thing. Live instantly.

This is what we mean by zero-configuration tracing. You call prompts.run() and everything else is handled.

"But I don't want a proxy"

This was the objection that kept coming up when I talked to developers. They liked the simplicity, but they didn't want another service sitting between them and their LLM provider: added latency, another point of failure, one more vendor to trust with their data. Fair enough.

Most tools make you choose here. Proxy-based tools route everything through their servers. SDK-based tools add wrappers and decorators but still leave you managing your own provider calls and prompt storage separately.

We wanted to offer both. So we built runLocal().

import { Tracia } from 'tracia';

const tracia = new Tracia({ apiKey: 'tr_xxx' });

const response = await tracia.runLocal({
  model: 'gpt-4o', // or claude-sonnet-4, gemini-2.0-flash, +100 more
  messages: [{ role: 'user', content: 'Hello!' }]
});
// ✓ Your prompt, your infrastructure, traced automatically

Your API key. Your infrastructure. Zero added latency. Tracia submits traces in the background. You get full observability without giving up control. It works across 100+ models from OpenAI, Anthropic, Google, and Amazon Bedrock.

Two modes. Same dashboard. Same traces. Same evaluations. You pick the tradeoff that works for you.

What you get

Tracia isn't just a simpler wrapper. Once your traces are flowing, you get a full observability platform.

Prompt management. Version history with instant rollback. A playground for testing across models. A prompt library with public templates you can fork in one click.

Trace visibility. A waterfall view showing every span, its duration, tokens, cost, and the full input/output. Click any trace and you're looking at the exact prompt version that generated it.

Automated evaluations. Define rules that score every response automatically. 11 built-in rules (contains, regex match, JSON validation, length limits, word count, and more), plus LLM-as-judge evaluators. No manual review needed.

Analytics. Cost trends by model and prompt. Latency percentiles (P50/P95). Token usage over time. Period-over-period comparisons. Know exactly what you're spending and where.

Test cases. Define expected inputs and outputs for your prompts, run them in batch, and validate changes before deploying to production.

All of this works with both prompts.run() and runLocal(), in both the TypeScript and Python SDKs.
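To make the evaluation rules concrete: checks like contains and JSON validation boil down to small predicates over the response text. Here's a standalone TypeScript sketch of what two of those rule types do; this is an illustration of the idea, not Tracia's actual implementation:

```typescript
// Illustrative sketches of two built-in rule types (not Tracia's internals).

// "contains": pass if the response includes an expected substring.
function containsRule(response: string, expected: string): boolean {
  return response.includes(expected);
}

// "JSON validation": pass if the response parses as valid JSON.
function jsonValidRule(response: string): boolean {
  try {
    JSON.parse(response);
    return true;
  } catch {
    return false;
  }
}

// Score a response against both rules.
const response = '{"subject": "Welcome to Tracia, Alice!"}';
const results = {
  contains: containsRule(response, 'Alice'),
  jsonValid: jsonValidRule(response),
};
console.log(results); // { contains: true, jsonValid: true }
```

The point of running checks like these on every trace is that regressions surface automatically: if a prompt edit makes responses stop returning valid JSON, the rule fails on live traffic instead of waiting for a human to notice.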

What Tracia doesn't do (yet)

We're not trying to replace every tool in the stack. We don't have OpenTelemetry export, and we don't have deep framework-specific agent integrations. Self-hosting is on the roadmap but not available today, so if you need open-source self-hosted observability, other tools cover that better for now.

What we do offer is a faster path from zero to full observability. If you're building prompt-based LLM features and want to spend your time on your product instead of your tooling, that's what Tracia is for.

Who this is for

If you've ever spent more time setting up tracing than writing your actual prompt. If you wished you could edit a prompt without redeploying your app. If you wanted to know how much a specific prompt costs you per day. If you needed to test prompt changes before they hit production.

Tracia was built for you.

Start with prompts.run() for the fastest path, or use runLocal() if you want to keep your existing provider setup. Either way, you'll have full observability in under a minute.

We're in public beta and shipping fast. 16 releases since December 27 and counting.

Try it free. No credit card required. You'll be tracing in under a minute.

Get started at tracia.io
