Definition
The session record count test monitors the number of records (typically turns) per session. It’s a numeric test (no LLM evaluator involved) that aggregates the row count per session ID and lets you alert on pathological cases: runaway sessions with hundreds of turns, or sessions that never went past one.
Taxonomy
- Task types: LLM.
- Availability: and .
- Evaluation level: session.
- Computation: deterministic aggregation.
Why it matters
- Runaway sessions (too many turns) often signal tool-call loops, clarification loops, or frustrated users hammering on the same question.
- Very short sessions (one turn then drop-off) may signal users bouncing off the product before the assistant could help.
- Track both tails: a single runaway session can inflate the mean while the median stays flat, so mean and percentile views of session length often tell very different stories.
Available measurements
| Measurement | What it means |
|---|---|
| `totalSessions` | Number of sessions in the window |
| `meanRecordsPerSession` | Average number of records per session |
| `medianRecordsPerSession` | Median number of records per session |
| `stdRecordsPerSession` | Standard deviation of records per session |
| `minRecordsPerSession` | Shortest session in the window |
| `maxRecordsPerSession` | Longest session in the window |
| `p90RecordsPerSession` | 90th-percentile session length |
| `p95RecordsPerSession` | 95th-percentile session length |
| `p99RecordsPerSession` | 99th-percentile session length |
| `shortSessionCount` | Count of sessions below a short-session threshold |
| `mediumSessionCount` | Count of sessions in the medium-length band |
| `longSessionCount` | Count of sessions above a long-session threshold |
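
To make the aggregation concrete, here is a minimal Python sketch of how these measurements could be derived from raw rows. It is an illustration under stated assumptions, not the product's implementation: the `session_id` key, the nearest-rank percentile method, the use of population standard deviation, and the `short_max` / `long_min` band boundaries are all hypothetical choices.

```python
import statistics
from collections import Counter


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    n = len(sorted_values)
    rank = max(1, -(-pct * n // 100))  # integer ceil(pct * n / 100)
    return sorted_values[rank - 1]


def session_record_stats(records, session_key="session_id", short_max=1, long_min=20):
    """Aggregate row counts per session ID into summary measurements.

    short_max and long_min are hypothetical band boundaries, not product defaults.
    """
    counts = sorted(Counter(r[session_key] for r in records).values())
    if not counts:
        raise ValueError("no records in the window")
    return {
        "totalSessions": len(counts),
        "meanRecordsPerSession": statistics.fmean(counts),
        "medianRecordsPerSession": statistics.median(counts),
        "stdRecordsPerSession": statistics.pstdev(counts),  # population std: an assumption
        "minRecordsPerSession": counts[0],
        "maxRecordsPerSession": counts[-1],
        "p90RecordsPerSession": nearest_rank(counts, 90),
        "p95RecordsPerSession": nearest_rank(counts, 95),
        "p99RecordsPerSession": nearest_rank(counts, 99),
        "shortSessionCount": sum(c <= short_max for c in counts),
        "mediumSessionCount": sum(short_max < c < long_min for c in counts),
        "longSessionCount": sum(c >= long_min for c in counts),
    }


# Illustrative data: nine 3-turn sessions and one 200-turn runaway session.
records = [{"session_id": f"s{i}"} for i in range(9) for _ in range(3)]
records += [{"session_id": "runaway"}] * 200
stats = session_record_stats(records)
```

With this toy input the median stays at 3 while the mean and the maximum are pulled up by the single runaway session, which is the kind of divergence between views that the test is meant to surface.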
Required columns
- Session ID: Groups turns belonging to the same conversation.
Test configuration examples
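
The product's own configuration syntax is not reproduced here. As a stand-in, the following Python sketch shows how thresholds on the measurements listed above might be expressed and checked. The rule format, the `evaluate` helper, and every limit value are hypothetical and chosen only for illustration.

```python
import operator

# Hypothetical threshold rules: (measurement, comparison, limit).
# Measurement names come from the table of available measurements;
# the limits are illustrative, not recommended defaults.
RULES = [
    ("maxRecordsPerSession", "<=", 100),    # flag runaway sessions
    ("p95RecordsPerSession", "<=", 30),     # keep the long tail bounded
    ("medianRecordsPerSession", ">=", 2),   # flag windows dominated by one-turn sessions
]

OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def evaluate(measurements, rules=RULES):
    """Return the rules that fail for the given measurement values."""
    return [
        (name, op, limit)
        for name, op, limit in rules
        if not OPS[op](measurements[name], limit)
    ]


# Illustrative values, not real data.
window = {
    "maxRecordsPerSession": 200,
    "p95RecordsPerSession": 12,
    "medianRecordsPerSession": 3,
}
failures = evaluate(window)
```

In this sketch only the `maxRecordsPerSession` rule fails, so an alert would point at a single runaway session rather than a shift in typical session length.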
Related
- Session duration — wall-clock view of session length.
- Session cost, Session token count — cost and token aggregates per session.

