Documentation Index
Fetch the complete documentation index at: https://docs.litigationlabs.io/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Every courtroom session in CaseSim involves dozens of AI decisions: the opposing counsel chooses whether to object, the judge weighs a ruling, a witness formulates an answer. Trace observability is the system that records each of these decisions — what went in, what came out, and how long it took — so you can review the full reasoning chain after a session ends.
Think of it as a court reporter for the AI. Where the session transcript shows you what happened in the courtroom, trace observability shows you why it happened — the exact instructions each agent received and the raw responses it produced.
Key Concepts
Traces
A trace is a single request-level record. When you ask a question during cross-examination, the system creates one trace for that entire interaction. The trace captures:
- Input — What triggered the request (your question, an objection, a procedural action).
- Output — The final result (the witness answer, the judge’s ruling, the decision of the opposing counsel agent (OCA)).
- Metadata — Session ID, scenario, witness, trial phase, and timing.
Each courtroom action maps to a named trace:
| Action | Trace Name | What It Records |
|---|---|---|
| You ask a witness a question | courtroom/turn-stream | Your question, OCA’s objection decision, witness response, score changes |
| You object to OCA’s question | courtroom/objection | Your objection type and basis, judge’s ruling, whether the witness answered |
| You reply to OCA’s objection | courtroom/reply | Your reply strategy, OCA’s counter-argument, judge’s ruling |
| You allow or proceed past an OCA question | courtroom/proceed | The procedural action, plus the resulting judge ruling or witness response |
| OCA generates a new question | courtroom/oca-question | The generated question, whether it was intentionally defective, linked facts |
| Cross-examination begins | courtroom/begin-cross | Which side examines first, the witness on the stand |
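As a concrete illustration, a single courtroom/turn-stream trace might be represented as a record like the following. The field names and values here are illustrative sketches, not the exact Langfuse schema:

```python
# Illustrative trace record; field names are hypothetical, not the
# exact Langfuse schema.
trace = {
    "name": "courtroom/turn-stream",
    "session_id": "sess_abc123",  # hypothetical session ID
    "input": {"question": "Isn't it true that you saw the defendant at 9 PM?"},
    "output": {
        "objection": "leading",
        "ruling": "sustained",
        "witness_answered": False,
    },
    "metadata": {"scenario": "state-v-doe", "phase": "direct", "witness": "w-01"},
    "tags": ["courtroom", "direct"],
}

print(trace["name"])  # → courtroom/turn-stream
```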
Generations
Nested inside each trace are one or more generations — individual calls to the language model. A single trace for a player question might contain three generations:
- OCA objection check — Did opposing counsel object?
- Judge ruling — If so, was it sustained or overruled?
- Witness answer — What did the witness say?
Each generation records the system prompt, the conversation messages sent to the model, the model’s raw response, token usage, and latency.
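A minimal sketch of how generations nest inside a trace, and how per-generation latency data can pinpoint which agent call dominated a slow turn. The data structures here are illustrative, not the Langfuse SDK:

```python
from dataclasses import dataclass, field

@dataclass
class Generation:
    function_id: str   # e.g. "judge-ruling" or "witness-answer"
    latency_ms: int
    tokens_in: int = 0
    tokens_out: int = 0

@dataclass
class Trace:
    name: str
    generations: list[Generation] = field(default_factory=list)

    def slowest(self) -> Generation:
        # The generation that contributed most to this trace's duration.
        return max(self.generations, key=lambda g: g.latency_ms)

# The three generations from the player-question example above:
turn = Trace("courtroom/turn-stream", [
    Generation("oca-objection-check", latency_ms=420),
    Generation("judge-ruling", latency_ms=310),
    Generation("witness-answer", latency_ms=1650),
])
print(turn.slowest().function_id)  # → witness-answer
```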
Sessions and Users
Traces are grouped by session ID, which corresponds to your CaseSim session. This means you can open a single session in the dashboard and see every AI interaction from start to finish, in order — a complete audit trail of the entire simulated trial.
What You Can See
The Full Decision Chain
For any courtroom moment, you can trace the complete chain of AI reasoning. For example, when you ask a leading question during direct examination:
- Input: Your question (“Isn’t it true that you saw the defendant at 9 PM?”)
- OCA generation: The opposing counsel agent receives the question, the trial context, and its instructions. It decides to object for leading.
- Judge generation: The judge agent receives the objection, the question, and the applicable rules. It rules to sustain.
- Output: The objection was sustained, the witness did not answer, and your score reflects the sustained objection.
Every step is visible. You can read the exact prompt the judge received and understand precisely why it ruled the way it did.
Each generation includes latency data. If a session felt slow, you can identify which agent call took the longest and whether the delay was in the model response or in processing.
Patterns Across Sessions
Because traces carry metadata (scenario, phase, witness, tags), you can filter across sessions to answer questions like:
- How often does the judge sustain objections in the direct examination phase?
- Which witness generates the longest model responses?
- Are OCA questions becoming repetitive in longer sessions?
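The first question above could be answered with a filter-and-aggregate pass like this. The trace dicts are illustrative; in practice you would pull them from the Langfuse dashboard or an export rather than hard-coding them:

```python
# Hypothetical exported traces; the shapes mirror the fields described above.
traces = [
    {"name": "courtroom/objection", "metadata": {"phase": "direct"},
     "output": {"ruling": "sustained"}},
    {"name": "courtroom/objection", "metadata": {"phase": "direct"},
     "output": {"ruling": "overruled"}},
    {"name": "courtroom/objection", "metadata": {"phase": "cross"},
     "output": {"ruling": "sustained"}},
]

# Filter to objection traces in the direct-examination phase.
direct = [t for t in traces
          if t["name"] == "courtroom/objection"
          and t["metadata"]["phase"] == "direct"]

sustain_rate = sum(t["output"]["ruling"] == "sustained" for t in direct) / len(direct)
print(f"{sustain_rate:.0%}")  # → 50%
```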
Accessing the Dashboard
The trace dashboard is hosted on Langfuse, a dedicated observability platform. Access it through LLabs Connect at the URL provided by your team administrator.
Navigating a Session
- Filter by session — Enter the session ID (visible in your browser URL during a CaseSim session) to see all traces for that session, ordered chronologically.
- Select a trace — Click any trace to expand it. The top level shows the input and output summary.
- Inspect generations — Expand the nested generation spans to see the full prompt, model response, and token counts.
Reading a Trace
Each trace displays:
| Field | What It Means |
|---|---|
| Name | The courtroom action (e.g., courtroom/turn-stream) |
| Input | The triggering data — your question, objection details, or procedural action |
| Output | The result — witness response, ruling, score changes, or flow outcome |
| Tags | Categorization labels like courtroom, the phase (direct, cross), and action type |
| Duration | Total time from request start to stream completion |
| Metadata | Scenario ID, witness ID, player side, and trial phase |
Reading a Generation
Inside each trace, generation spans show:
| Field | What It Means |
|---|---|
| Function ID | Which agent produced this generation (e.g., judge-ruling, witness-answer, oca-objection-check) |
| Input messages | The system prompt and conversation history sent to the model |
| Output | The model’s raw response text |
| Model | Which language model was used |
| Tokens | Input and output token counts |
| Latency | Time for this specific model call |
Traced Routes
Beyond courtroom interactions, the following operations are also traced:
| Route | Trace Name | Purpose |
|---|---|---|
| Session insights generation | insights/generate | Post-session analysis and performance summary |
| Automated evaluations | evals/automated | Batch quality scoring of agent responses |
| Prompt tuning | prompt-tuning/generate | Generating prompt variations for agent improvement |
| Scenario generation | generate-scenario | AI-assisted creation of new case scenarios |
Practical Uses
Reviewing a Specific Ruling
If a judge ruling felt incorrect during your session, locate the trace for that turn. Expand the judge generation to read the full prompt and the model’s reasoning. This reveals whether the issue was in the prompt instructions, the context provided, or the model’s interpretation.
Understanding OCA Behavior
If opposing counsel seemed to object too frequently or not enough, filter traces by courtroom/turn-stream for your session. Each trace’s output includes the OCA’s decision and reasoning, letting you see the pattern across the full examination.
Validating Score Calculations
Turn-stream traces include score deltas in their output — points awarded, elicits unlocked, and rebuttal coverage. If a score seems off, the trace shows exactly which facts the system matched and what points were assigned.
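For example, if the turn-stream output carries per-fact score deltas, the session total can be recomputed and cross-checked against the displayed score. The delta field names below are hypothetical:

```python
# Hypothetical score-delta entries as they might appear in a trace's output.
deltas = [
    {"fact": "defendant-at-scene", "points": 10},
    {"fact": "time-of-arrival", "points": 5},
    {"fact": "leading-question-penalty", "points": -3},
]

# Recompute the total and compare it with the score shown in the session UI.
recomputed = sum(d["points"] for d in deltas)
print(recomputed)  # → 12
```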
Comparing Sessions
Run the same scenario twice with different questioning strategies. Filter each session’s traces side by side to see how different approaches affected OCA behavior, judge rulings, and witness responses.
Architecture Summary
The observability system operates in two layers:
- Trace layer — Each route handler wraps its logic in a trace context that records input, output, session metadata, and tags. This is the high-level “what happened” record.
- Generation layer — Each call to the language model (via the AI SDK) emits telemetry that Langfuse captures as a nested generation span. This is the low-level “what the model saw and said” record.
Both layers flush their data asynchronously after the response is sent to your browser, so tracing adds no perceptible latency to your session.
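In outline, the trace layer can be pictured as a context manager wrapped around each route handler. This is a simplified sketch of the pattern, not the actual CaseSim implementation; the `pending` queue stands in for the asynchronous flush described above:

```python
import time
from contextlib import contextmanager

pending = []  # records queued here; a background task would flush them later

@contextmanager
def trace(name, session_id, **metadata):
    # Open a trace record, hand it to the route handler, and finalize it
    # regardless of whether the handler succeeds.
    record = {"name": name, "session_id": session_id,
              "metadata": metadata, "start": time.monotonic()}
    try:
        yield record
    finally:
        record["duration_s"] = time.monotonic() - record["start"]
        pending.append(record)  # queued for asynchronous flush

# A route handler wraps its logic in the trace context:
with trace("courtroom/turn-stream", "sess_abc123", phase="direct") as t:
    t["input"] = {"question": "Where were you at 9 PM?"}
    t["output"] = {"objection": None, "answered": True}

print(pending[0]["name"])  # → courtroom/turn-stream
```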