Developer Documentation

Opik is open source (GitHub: comet-ml, ~19k stars) and can be self-hosted or used through the managed Comet cloud platform. The core feature set (tracing, evaluation, and experiment management) is included at no cost in the open-source code.

Integration & Setup:

Install the Opik Python SDK with pip (pip install opik), then configure your API key or point the SDK at a local instance.
Use decorators or context managers to instrument LLM calls, tool invocations, and retrieval steps; traces are automatically structured as a hierarchy.
Define Test Suites with global and item-level assertions; results surface as clear pass/fail outputs without requiring individual eval metric definitions.
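
To make the idea of decorator-based instrumentation concrete, the following is a minimal standard-library sketch of how a decorator can record nested calls as a trace hierarchy. It is an illustration of the concept only and is not the Opik SDK: the names traced, Span, TRACES, and render, as well as the three example functions, are hypothetical and invented for this example.

```python
import contextvars
import functools
import time
from dataclasses import dataclass, field


@dataclass
class Span:
    """One recorded call; children are calls made while it was running."""
    name: str
    children: list = field(default_factory=list)
    duration_s: float = 0.0


# The span that is currently executing, if any (hypothetical helper state).
_current_span = contextvars.ContextVar("current_span", default=None)
# Root spans, one per top-level traced call.
TRACES = []


def traced(func):
    """Record each call as a span nested under the caller's span."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        span = Span(name=func.__name__)
        parent = _current_span.get()
        if parent is None:
            TRACES.append(span)           # top-level call starts a new trace
        else:
            parent.children.append(span)  # nested call becomes a child span
        token = _current_span.set(span)
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            span.duration_s = time.perf_counter() - start
            _current_span.reset(token)    # restore the caller's span
    return wrapper


@traced
def retrieve(query):
    # Stand-in for a retrieval step.
    return ["doc about " + query]


@traced
def call_llm(prompt):
    # Stand-in for an LLM call.
    return "answer based on: " + prompt


@traced
def answer(query):
    docs = retrieve(query)
    return call_llm(query + " | " + docs[0])


def render(span, depth=0):
    """Return the span tree as indented lines, one per span."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines
```

Calling answer("opik") and then render(TRACES[0]) yields one root span named answer with retrieve and call_llm nested beneath it, which is the kind of parent/child structure the instrumentation step describes.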

Evaluation Pipeline:

Choose from 30+ built-in LLM-as-a-Judge metrics: answer relevance, context precision, hallucination detection, task completion, and more.
Run evaluations against development traces, CI test datasets, or live production traffic for continuous quality gates.
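
A quality gate of the kind described above can be sketched as "score every item, average, compare to a threshold". The example below is an assumption-laden illustration rather than Opik's evaluation API: keyword_coverage is a toy deterministic metric standing in for an LLM-as-a-Judge metric, and EvalItem, quality_gate, DATASET, and the threshold values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalItem:
    question: str
    answer: str
    expected_keywords: list


def keyword_coverage(item):
    """Toy metric: fraction of expected keywords present in the answer."""
    if not item.expected_keywords:
        return 1.0
    text = item.answer.lower()
    hits = sum(1 for kw in item.expected_keywords if kw.lower() in text)
    return hits / len(item.expected_keywords)


def quality_gate(items, metric, threshold):
    """Return (passed, mean_score, per_item_scores) for a dataset."""
    if not items:
        raise ValueError("cannot evaluate an empty dataset")
    scores = [metric(item) for item in items]
    avg = mean(scores)
    return avg >= threshold, avg, scores


# Hypothetical two-item dataset, invented for this example.
DATASET = [
    EvalItem(
        "What is Opik?",
        "Opik is an open-source LLM evaluation tool.",
        ["open-source", "evaluation"],
    ),
    EvalItem("How do I install it?", "Use pip.", ["pip", "opik"]),
]

if __name__ == "__main__":
    passed, avg, scores = quality_gate(DATASET, keyword_coverage, threshold=0.7)
    print(f"mean score {avg:.2f}, gate {'passed' if passed else 'failed'}")
    # In CI, a failed gate would typically map to a non-zero exit code.
    raise SystemExit(0 if passed else 1)
```

The same gate function can be pointed at development traces, a CI dataset, or a sample of production traffic; only the source of the items changes.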

Ollie Coding Assistant:

Ollie reads failing traces, identifies root causes, proposes code diffs, and applies them with version control integration.
Each fix auto-generates a new regression test case to prevent recurrence.

Production Monitoring:

Real-time evaluation of production traces with configurable alerting thresholds.
Guardrails API to proactively block content that violates policy or exposes PII.
Cost Intelligence dashboard tracks token usage and spend per developer/team for coding agents like Claude Code and Codex.
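
As a rough illustration of what a PII guardrail does, the sketch below checks text against two regular expressions and blocks it when either matches. This is not the Guardrails API itself: PII_PATTERNS, check_text, and guard are hypothetical names, and the two patterns (email address and US Social Security number format) are deliberately simplistic assumptions that would miss many real cases.

```python
import re

# Deliberately simple patterns, assumed for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def check_text(text):
    """Return the sorted names of all PII patterns found in the text."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


def guard(text):
    """Block the text if any pattern matches, otherwise pass it through."""
    violations = check_text(text)
    if violations:
        return {"allowed": False, "violations": violations, "text": None}
    return {"allowed": True, "violations": [], "text": text}


if __name__ == "__main__":
    print(guard("The sky is blue."))
    print(guard("Reach me at jane.doe@example.com"))
```

Running the check before a response leaves the application is what makes the guardrail proactive: flagged content is withheld rather than reported after the fact.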