
Documentation Index

Fetch the complete documentation index at: https://docs.litigationlabs.io/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Every courtroom session in CaseSim involves dozens of AI decisions: the opposing counsel chooses whether to object, the judge weighs a ruling, a witness formulates an answer. Trace observability is the system that records each of these decisions — what went in, what came out, and how long it took — so you can review the full reasoning chain after a session ends. Think of it as a court reporter for the AI. Where the session transcript shows you what happened in the courtroom, trace observability shows you why it happened — the exact instructions each agent received and the raw responses it produced.

Key Concepts

Traces

A trace is a single request-level record. When you ask a question during cross-examination, the system creates one trace for that entire interaction. The trace captures:
  • Input — What triggered the request (your question, an objection, a procedural action).
  • Output — The final result (the witness answer, the judge’s ruling, the opposing counsel agent’s (OCA’s) decision).
  • Metadata — Session ID, scenario, witness, trial phase, and timing.
Each courtroom action maps to a named trace:
  • You ask a witness a question (courtroom/turn-stream): your question, the OCA’s objection decision, the witness response, and score changes
  • You object to an OCA question (courtroom/objection): your objection type and basis, the judge’s ruling, and whether the witness answered
  • You reply to an OCA objection (courtroom/reply): your reply strategy, the OCA’s counter-argument, and the judge’s ruling
  • You allow or proceed past an OCA question (courtroom/proceed): the procedural action, and the judge’s ruling or witness response
  • The OCA generates a new question (courtroom/oca-question): the generated question, whether it was intentionally defective, and the linked facts
  • Cross-examination begins (courtroom/begin-cross): which side examines first and the witness on the stand
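The trace record described above can be modeled as a plain data type. The sketch below is illustrative only; the field names are assumptions based on this page, not the actual CaseSim or Langfuse schema:

```typescript
// Illustrative shape of a trace record. All names are assumptions,
// not the real schema.
interface TraceMetadata {
  sessionId: string;
  scenarioId: string;
  witnessId: string;
  phase: "direct" | "cross";
}

interface Trace {
  name: string;           // e.g. "courtroom/turn-stream"
  input: unknown;         // what triggered the request
  output: unknown;        // the final result
  metadata: TraceMetadata;
  durationMs: number;     // request start to stream completion
}

// Example: a trace for a player question during direct examination.
const example: Trace = {
  name: "courtroom/turn-stream",
  input: { question: "Where were you at 9 PM?" },
  output: { objected: false, answered: true },
  metadata: {
    sessionId: "sess-123",
    scenarioId: "sc-1",
    witnessId: "w-1",
    phase: "direct",
  },
  durationMs: 2400,
};
```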

Generations

Nested inside each trace are one or more generations — individual calls to the language model. A single trace for a player question might contain three generations:
  1. OCA objection check — Did opposing counsel object?
  2. Judge ruling — If so, was it sustained or overruled?
  3. Witness answer — What did the witness say?
Each generation records the system prompt, the conversation messages sent to the model, the model’s raw response, token usage, and latency.
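The nesting can be sketched as records within a record. The functionId values mirror the ones this page lists later; the other shapes are assumptions for illustration:

```typescript
// Illustrative nesting of generations inside one trace.
// Shapes are assumptions, not the real schema.
interface Generation {
  functionId: string;   // which agent produced this generation
  model: string;
  inputMessages: { role: string; content: string }[];
  output: string;
  tokens: { input: number; output: number };
  latencyMs: number;
}

interface TurnTrace {
  name: string;
  generations: Generation[];
}

const turn: TurnTrace = {
  name: "courtroom/turn-stream",
  generations: [
    // messages elided for brevity; real generations carry the full prompt
    { functionId: "oca-objection-check", model: "model-a", inputMessages: [],
      output: "OBJECT: leading", tokens: { input: 900, output: 12 }, latencyMs: 400 },
    { functionId: "judge-ruling", model: "model-a", inputMessages: [],
      output: "SUSTAINED", tokens: { input: 700, output: 5 }, latencyMs: 300 },
    { functionId: "witness-answer", model: "model-a", inputMessages: [],
      output: "(no answer: objection sustained)", tokens: { input: 0, output: 0 }, latencyMs: 0 },
  ],
};

// The order of generations mirrors the courtroom flow:
const order = turn.generations.map(g => g.functionId);
// → ["oca-objection-check", "judge-ruling", "witness-answer"]
```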

Sessions and Users

Traces are grouped by session ID, which corresponds to your CaseSim session. This means you can open a single session in the dashboard and see every AI interaction from start to finish, in order — a complete audit trail of the entire simulated trial.
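What the dashboard’s session view does can be sketched as a grouping pass over a flat list of traces. The field names here are illustrative assumptions:

```typescript
// Group traces by session ID and order each session's traces
// chronologically. Field names are illustrative.
interface TraceRow {
  sessionId: string;
  name: string;
  startedAt: number; // epoch milliseconds
}

function groupBySession(traces: TraceRow[]): Map<string, TraceRow[]> {
  const sessions = new Map<string, TraceRow[]>();
  for (const t of traces) {
    const list = sessions.get(t.sessionId) ?? [];
    list.push(t);
    sessions.set(t.sessionId, list);
  }
  // Sort each session's traces by start time to recover the trial order.
  for (const list of sessions.values()) {
    list.sort((a, b) => a.startedAt - b.startedAt);
  }
  return sessions;
}
```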

What You Can See

The Full Decision Chain

For any courtroom moment, you can trace the complete chain of AI reasoning. For example, when you ask a leading question during direct examination:
  1. Input: Your question (“Isn’t it true that you saw the defendant at 9 PM?”)
  2. OCA generation: The opposing counsel agent receives the question, the trial context, and its instructions. It decides to object for leading.
  3. Judge generation: The judge agent receives the objection, the question, and the applicable rules. It rules to sustain.
  4. Output: The objection was sustained, the witness did not answer, and your score reflects the sustained objection.
Every step is visible. You can read the exact prompt the judge received and understand precisely why it ruled the way it did.

Timing and Performance

Each generation includes latency data. If a session felt slow, you can identify which agent call took the longest and whether the delay was in the model response or in processing.
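Pinpointing the slow agent call is a simple scan over a trace’s generations. A minimal sketch, assuming each generation carries a latencyMs field as described above:

```typescript
// Find the slowest agent call in a trace. Shapes are illustrative.
interface Gen {
  functionId: string;
  latencyMs: number;
}

function slowestGeneration(gens: Gen[]): Gen | undefined {
  return gens.reduce<Gen | undefined>(
    (worst, g) => (worst === undefined || g.latencyMs > worst.latencyMs ? g : worst),
    undefined,
  );
}
```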

Patterns Across Sessions

Because traces carry metadata (scenario, phase, witness, tags), you can filter across sessions to answer questions like:
  • How often does the judge sustain objections in the direct examination phase?
  • Which witness generates the longest model responses?
  • Are OCA questions becoming repetitive in longer sessions?
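The first question above, the judge’s sustain rate in a given phase, can be sketched as a filter-and-count over trace metadata. The metadata and output field names are assumptions:

```typescript
// Compute how often the judge sustains objections in a given phase,
// across many traces. Field names are assumptions about the schema.
interface ObjectionTrace {
  metadata: { phase: string };
  output: { ruling: "sustained" | "overruled" };
}

function sustainRate(traces: ObjectionTrace[], phase: string): number {
  const inPhase = traces.filter(t => t.metadata.phase === phase);
  if (inPhase.length === 0) return 0;
  const sustained = inPhase.filter(t => t.output.ruling === "sustained").length;
  return sustained / inPhase.length;
}
```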

Accessing the Dashboard

The trace dashboard is hosted on Langfuse, a dedicated observability platform. Access it through LLabs Connect at the URL provided by your team administrator.
  1. Filter by session — Enter the session ID (visible in your browser URL during a CaseSim session) to see all traces for that session, ordered chronologically.
  2. Select a trace — Click any trace to expand it. The top level shows the input and output summary.
  3. Inspect generations — Expand the nested generation spans to see the full prompt, model response, and token counts.

Reading a Trace

Each trace displays:
  • Name: the courtroom action (e.g., courtroom/turn-stream)
  • Input: the triggering data, such as your question, objection details, or procedural action
  • Output: the result, such as the witness response, ruling, score changes, or flow outcome
  • Tags: categorization labels such as courtroom, the phase (direct, cross), and the action type
  • Duration: total time from request start to stream completion
  • Metadata: scenario ID, witness ID, player side, and trial phase

Reading a Generation

Inside each trace, generation spans show:
  • Function ID: which agent produced this generation (e.g., judge-ruling, witness-answer, oca-objection-check)
  • Input messages: the system prompt and conversation history sent to the model
  • Output: the model’s raw response text
  • Model: which language model was used
  • Tokens: input and output token counts
  • Latency: time for this specific model call

Traced Routes

Beyond courtroom interactions, the following operations are also traced:
  • Session insights generation (insights/generate): post-session analysis and performance summary
  • Automated evaluations (evals/automated): batch quality scoring of agent responses
  • Prompt tuning (prompt-tuning/generate): generating prompt variations for agent improvement
  • Scenario generation (generate-scenario): AI-assisted creation of new case scenarios

Practical Uses

Reviewing a Specific Ruling

If a judge ruling felt incorrect during your session, locate the trace for that turn. Expand the judge generation to read the full prompt and the model’s reasoning. This reveals whether the issue was in the prompt instructions, the context provided, or the model’s interpretation.

Understanding OCA Behavior

If opposing counsel seemed to object too frequently or not enough, filter traces by courtroom/turn-stream for your session. Each trace’s output includes the OCA’s decision and reasoning, letting you see the pattern across the full examination.
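Measuring that pattern amounts to counting objections across a session’s turn outputs. A sketch, assuming each turn-stream output exposes a boolean objection flag (a hypothetical field name):

```typescript
// Tally how often the OCA objected across a session's turn-stream
// traces. The ocaObjected field is a hypothetical output shape.
interface TurnOutput {
  ocaObjected: boolean;
}

function objectionFrequency(outputs: TurnOutput[]): number {
  if (outputs.length === 0) return 0;
  return outputs.filter(o => o.ocaObjected).length / outputs.length;
}
```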

Validating Score Calculations

Turn-stream traces include score deltas in their output — points awarded, elicits unlocked, and rebuttal coverage. If a score seems off, the trace shows exactly which facts the system matched and what points were assigned.
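The check reduces to summing the points of the matched facts and comparing against the reported delta. A sketch with hypothetical field names and point values:

```typescript
// Cross-check a turn's score change against its matched facts.
// Field names and point values are hypothetical, for illustration only.
interface MatchedFact {
  factId: string;
  points: number;
}

function expectedDelta(facts: MatchedFact[]): number {
  return facts.reduce((sum, f) => sum + f.points, 0);
}

// If the trace output reports a different delta than the sum of its
// matched facts, the mismatch points at where scoring went wrong.
```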

Comparing Sessions

Run the same scenario twice with different questioning strategies. Filter each session’s traces side by side to see how different approaches affected OCA behavior, judge rulings, and witness responses.

Architecture Summary

The observability system operates in two layers:
  1. Trace layer — Each route handler wraps its logic in a trace context that records input, output, session metadata, and tags. This is the high-level “what happened” record.
  2. Generation layer — Each call to the language model (via the AI SDK) emits telemetry that Langfuse captures as a nested generation span. This is the low-level “what the model saw and said” record.
Both layers flush their data asynchronously after the response is sent to your browser, so tracing adds no perceptible latency to your session.
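The trace-layer pattern can be sketched as a wrapper around a route handler: produce the response first, then hand the recorded data to a sink without blocking the caller. This is illustrative only, not the actual implementation:

```typescript
// Sketch of the trace layer: wrap a handler, record input/output and
// duration, and flush telemetry off the response path. Illustrative.
type Sink = (record: object) => Promise<void>;

async function withTrace<T>(
  name: string,
  input: unknown,
  sink: Sink,
  handler: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  const output = await handler();          // respond to the caller first
  // Fire-and-forget: flushing does not delay the returned response.
  void sink({ name, input, output, durationMs: Date.now() - start });
  return output;
}
```

The `void` on the sink call is the key design choice: the record is emitted after the handler’s result is ready, so tracing cost is not on the critical path.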