Prerequisites: You’ve completed the Quickstart and understand the Context & Event Schema.
What You Get
- Evaluator-defined Recall on unlabeled traffic (flag queries with recall < 1.0)
- Precision (proxy): ratio of supporting passages to total context (detects context bloat)
- F1 & nDCG derived from recall + precision
- P95 Latency tracking for evaluation time
- Trend charts to catch drift over time
- Environment filtering to compare prod vs staging vs dev
- Evaluator Accuracy (when ground truth is provided)
Enable Monitoring
Control volume with `sample_rate` to manage costs, as in the sketch below.
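A minimal Python sketch follows; a TypeScript variant works the same way. The `seer` module, the `SeerClient` name, and its parameters are illustrative assumptions, not a confirmed API:

```python
# A minimal sketch, assuming a hypothetical `seer` SDK; the import path,
# client name, and parameter names are illustrative assumptions.
from seer import SeerClient

client = SeerClient(
    api_key="YOUR_API_KEY",
    environment="prod",   # enables environment filtering in the dashboard
    sample_rate=0.05,     # evaluate ~5% of production traffic
)

passages = [
    {"text": "Rotate keys from Settings > API Keys.", "score": 0.92},
    {"text": "Keys expire after 90 days.", "score": 0.71},
]

# Sampled-out events are dropped client-side, so cost scales with sample_rate.
client.log_retrieval(
    query="How do I rotate an API key?",
    passages=passages,
    metadata={"tenant_id": "acme", "collection": "docs"},
)
```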
Sampling Guidance
| Use Case | Recommended `sample_rate` |
|---|---|
| Change testing | 1.0 (100%) for test queries |
| High-volume production | 0.05 - 0.10 (5-10%) |
| Low-volume or critical | 0.25 - 0.50 (25-50%) |
| Debugging | 1.0 temporarily |
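One way to wire this up is a static per-environment map following the guidance above. This sketch reuses the hypothetical `SeerClient` from the earlier example:

```python
# Sketch: choose a sample_rate per environment, per the table above.
import os

from seer import SeerClient  # hypothetical SDK, as in the earlier sketch

SAMPLE_RATES = {
    "prod": 0.05,     # high-volume production
    "staging": 0.25,  # low-volume, pre-release validation
    "dev": 1.0,       # full coverage while testing changes
}

env = os.getenv("APP_ENV", "dev")
client = SeerClient(
    api_key=os.environ["SEER_API_KEY"],
    environment=env,
    sample_rate=SAMPLE_RATES.get(env, 1.0),
)
```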
Smart Sampling with Decorator
Use the decorator with dynamic sampling based on metadata, as in the sketch below.
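The decorator name `monitor_retrieval` and its acceptance of a callable `sample_rate` are assumptions about the SDK; treat this as a sketch of the pattern, not the confirmed API:

```python
# Sketch of metadata-driven sampling via a decorator (assumed API).
from seer import monitor_retrieval

def dynamic_rate(metadata: dict) -> float:
    # Always evaluate flagged test traffic; lightly sample the rest.
    if metadata.get("is_test"):
        return 1.0
    if metadata.get("tier") == "critical":
        return 0.5
    return 0.05

@monitor_retrieval(sample_rate=dynamic_rate)
def retrieve(query: str, metadata: dict) -> list[dict]:
    ...  # your retrieval logic; returned passages are logged for evaluation
```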
The Monitoring Dashboard
KPI Cards
At the top, you’ll see summary metrics for the selected period:

| Metric | Description |
|---|---|
| Recall | Average fraction of requirements covered |
| Precision | Average fraction of supporting documents |
| F1 | Harmonic mean of recall and precision |
| nDCG | Ranking quality (if scores provided) |
| P95 Latency | 95th percentile evaluation time |
| Evaluator Accuracy | F1 against your ground truth (shown when gold data exists) |
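For concreteness, F1 is the harmonic mean of the recall and precision averages above; the numbers here are illustrative:

```python
# How the quality KPIs relate: F1 is the harmonic mean of recall and precision.
recall = 0.90     # fraction of requirements covered
precision = 0.60  # fraction of context passages that support the answer

f1 = 2 * recall * precision / (recall + precision)
print(round(f1, 3))  # 0.72
```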
Trend Charts
The trend chart has three tabs:

| Tab | Metrics | Scale |
|---|---|---|
| Quality | Recall, Precision, F1, nDCG, Subquery Effectiveness | 0-100% |
| Latency | P50, P95 evaluation latency | milliseconds |
| Structure | Trace Depth (multi-hop only) | count |
Filtering
Filter your view using:

- Environment: Select a specific env (prod, staging, dev)
- Period: 24 hours, 7 days, or 30 days
Cost Management
- Sampling keeps evaluation cost predictable. Start at `0.05` and tune up if needed.
- SDK batching minimizes request overhead (events are queued and sent in batches).
- Async mode (default): logging never blocks your request path (see the flush sketch below).
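Because events are queued and sent in batches, you may want to drain the queue at shutdown. A sketch, assuming the `client` from the earlier example exposes a `flush()` method (an assumed name, not a confirmed API):

```python
# Drain any queued events before the process exits; `flush()` is an
# assumed method name for the hypothetical SDK used above.
import atexit

atexit.register(client.flush)
```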
Estimated Costs
| Monthly Evaluations | Seer Cost |
|---|---|
| 100k | ~$16-20 |
| 1M | ~$160-200 |
| 10M | ~$1,600-2,000 |
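These figures work out to roughly $0.00016-0.0002 per evaluation, so a back-of-envelope estimate is straightforward; the per-evaluation cost below is an illustrative midpoint, not a quoted price:

```python
# Rough cost estimate derived from the table above
# (~$16-20 per 100k evaluations, i.e. ~$0.00016-0.0002 each).
def monthly_seer_cost(monthly_queries: int, sample_rate: float,
                      cost_per_eval: float = 0.00018) -> float:
    return monthly_queries * sample_rate * cost_per_eval

# 2M queries/month sampled at 5% -> 100k evaluations -> ~$18
print(f"${monthly_seer_cost(2_000_000, 0.05):.2f}")
```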
Privacy Considerations
- You control what you send. If passages are sensitive, include only what’s needed for evaluation.
- Use `metadata` to tag records with access boundaries (e.g., `collection`, `tenant_id`) for future filtering.
- Consider truncating or summarizing very long passages, as in the sketch below.
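A sketch of the truncation approach; the 2,000-character cap and helper name are illustrative choices, and the passage structure matches the earlier sketches:

```python
MAX_CHARS = 2000  # illustrative cap; tune to what evaluation actually needs

def truncate_passages(passages: list[dict]) -> list[dict]:
    """Trim passage text before it leaves your infrastructure."""
    return [{**p, "text": p["text"][:MAX_CHARS]} for p in passages]

# Log the truncated passages instead of the raw ones:
# client.log_retrieval(query=query, passages=truncate_passages(passages), ...)
```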