Call Insights Service
Language: Python 3.10+
Framework: FastAPI 0.115+
Source: ~/PycharmProjects/AI/call-insights-service
Purpose
The Call Insights Service ingests real-time call session reports from Atlas, stores full session data (events, audio segments, conversation turns), runs automated diagnostics, and provides APIs for STT analysis and quality reporting.
Session Ingestion
Atlas publishes session reports to the Redis stream call-insights:sessions during and after each call. The InsightsStreamConsumer runs as a background task consuming this stream via a consumer group (call-insights-service).
On each message:
1. Parse the message payload against the SessionReportIngest schema
2. Persist session, events, audio segments, and conversation turns to PostgreSQL
3. Run automatic diagnosis (diagnosis.py) to detect issues
4. Store audio files in MinIO/S3
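The ingestion steps above might be wired together roughly as follows. This is a minimal sketch, not the service's actual InsightsStreamConsumer: the client is assumed to behave like redis-py's `Redis` object (`xreadgroup`, `xack`), and `make_handler`, `consume_once`, and the four step callables are hypothetical names introduced for illustration.

```python
from typing import Any, Callable

STREAM = "call-insights:sessions"   # stream name from the text
GROUP = "call-insights-service"     # consumer group from the text


def make_handler(parse, persist, diagnose, store_audio) -> Callable[[dict], None]:
    """Chain the four ingestion steps in the order listed above (hypothetical callables)."""
    def handle(fields: dict) -> None:
        report = parse(fields)          # 1. validate against SessionReportIngest
        session_id = persist(report)    # 2. write session rows to PostgreSQL
        diagnose(session_id)            # 3. run the automatic diagnosis
        store_audio(report)             # 4. upload audio files to MinIO/S3
    return handle


def consume_once(client: Any, consumer: str, handle: Callable[[dict], None],
                 count: int = 10, block_ms: int = 5000) -> int:
    """Read one batch of new messages, process each, and ack on success.

    Returns the number of messages acknowledged.
    """
    acked = 0
    # ">" asks for messages never delivered to any consumer in this group.
    batches = client.xreadgroup(GROUP, consumer, {STREAM: ">"},
                                count=count, block=block_ms)
    for _stream, messages in batches or []:
        for message_id, fields in messages:
            try:
                handle(fields)
            except Exception:
                # Leave the message pending so it can be retried or claimed later.
                continue
            client.xack(STREAM, GROUP, message_id)
            acked += 1
    return acked
```

Acknowledging only after all four steps succeed keeps a failed message in the group's pending list, which is one common way to get at-least-once processing; whether the real consumer does this is not stated in this document.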
Data Model
CallSession
| Field | Description |
|---|---|
| call_id, room_id, job_id | LiveKit identifiers |
| tenant_id, agent_id, channel_type | Call context |
| stt_provider/model, llm_provider/model, tts_provider/model | Pipeline config at call time |
| vad_threshold, noise_suppression_enabled, rnnoise_enabled | VAD config |
| started_at, ended_at, duration_seconds | Timing |
| close_reason, was_transferred, error | Outcome |
| total_user_turns, total_agent_turns, avg_e2e_latency_ms | Metrics |
| has_near_miss, has_issues, issue_summary | Diagnostic flags |
ConversationTurn
| Field | Description |
|---|---|
| role | user or assistant |
| content | Transcript text |
| stt_confidence | STT confidence score |
| transcription_delay_ms | Time from end-of-speech to transcript |
| end_of_turn_delay_ms | VAD end-of-turn detection latency |
| llm_ttft_ms | LLM time-to-first-token |
| tts_ttfb_ms | TTS time-to-first-byte |
| e2e_latency_ms | Full end-to-end latency for this turn |
| tool_calls | Tool calls made this turn (JSON) |
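To illustrate how the per-turn fields could roll up into the CallSession metrics, here is a small sketch. It assumes that avg_e2e_latency_ms is the arithmetic mean over turns that carry an e2e_latency_ms value; the document does not state the actual aggregation. `Turn` and `summarize_turns` are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Turn:
    role: str                               # "user" or "assistant"
    content: str                            # transcript text
    e2e_latency_ms: Optional[float] = None  # may be missing on some turns


def summarize_turns(turns: list[Turn]) -> dict:
    """Derive session-level counters and average latency from turns."""
    latencies = [t.e2e_latency_ms for t in turns if t.e2e_latency_ms is not None]
    return {
        "total_user_turns": sum(t.role == "user" for t in turns),
        "total_agent_turns": sum(t.role == "assistant" for t in turns),
        # Assumption: plain mean over turns that reported a latency.
        "avg_e2e_latency_ms": mean(latencies) if latencies else None,
    }
```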
AudioSegment
| Field | Description |
|---|---|
| segment_type | stt_input, full_pre_rnnoise, full_post_rnnoise, near_miss |
| s3_key | MinIO/S3 object key |
| rms_mean, rms_peak, snr_estimate | Audio quality metrics |
| duration_ms | Segment duration |
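The audio quality metrics are not defined further in this document. As a rough sketch under stated assumptions, the block below treats rms_mean and rms_peak as the mean and maximum of per-frame RMS over 16-bit mono PCM, and reports the sample peak in dBFS with a simple clipping flag. The frame size, the clipping threshold, and the names `to_dbfs` and `audio_metrics` are all assumptions.

```python
import math
from array import array

FULL_SCALE = 32768.0  # magnitude of the most negative 16-bit PCM sample


def to_dbfs(amplitude: float) -> float:
    """Convert a linear amplitude (0..FULL_SCALE) to dB relative to full scale."""
    if amplitude <= 0:
        return float("-inf")
    return 20.0 * math.log10(amplitude / FULL_SCALE)


def audio_metrics(samples: array, frame_size: int = 320) -> dict:
    """Compute per-frame RMS statistics and peak level for 16-bit mono PCM."""
    frame_rms = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        frame_rms.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    peak = max((abs(s) for s in samples), default=0)
    return {
        "rms_mean": sum(frame_rms) / len(frame_rms) if frame_rms else 0.0,
        "rms_peak": max(frame_rms, default=0.0),
        "peak_dbfs": to_dbfs(peak),
        # Assumption: a sample at the 16-bit limit counts as clipped.
        "clipped": peak >= 32767,
    }
```

A frame size of 320 samples corresponds to 20 ms at 16 kHz, a common choice for speech pipelines; the service's real sample rate is not given here.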
SttReport (user-reported issue)
| Field | Description |
|---|---|
| stt_output | What the STT produced |
| expected_output | What the user expected |
| issue_type | Classification of the error |
| source_s3_key | MinIO key (short-term) |
| archive_s3_key | DigitalOcean Spaces key (long-term) |
Analysis
Full-Call STT Analysis
Full-call STT analysis replays every user turn's audio through multiple STT providers in parallel and compares:
- Transcript accuracy
- Confidence scores
- Latency per provider
- Clipping detection (dBFS measurement)
The result is a grid: turns × audio renditions × STT providers.
Rendition types per turn: pre_rnnoise, post_rnnoise, stt_input
Supported analysis providers: Groq, Deepgram, Cartesia, ElevenLabs, Gladia
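The turns × renditions × providers grid can be sketched as a fan-out with `asyncio.gather`. This is an illustration under assumptions, not the service's implementation: each provider is modeled as an async callable returning `(text, confidence)`, and `analyze_call` and `_run_cell` are hypothetical names. Real provider SDK calls are deliberately left out.

```python
import asyncio
import time
from typing import Awaitable, Callable

# Hypothetical provider interface: audio bytes in, (transcript, confidence) out.
Transcriber = Callable[[bytes], Awaitable[tuple[str, float]]]
RENDITIONS = ("pre_rnnoise", "post_rnnoise", "stt_input")  # from the text


async def _run_cell(turn_idx, rendition, provider, transcribe, audio) -> dict:
    """Transcribe one (turn, rendition, provider) cell and time it."""
    started = time.perf_counter()
    try:
        text, confidence = await transcribe(audio)
        error = None
    except Exception as exc:  # one failing provider should not sink the grid
        text, confidence, error = None, None, str(exc)
    return {
        "turn": turn_idx, "rendition": rendition, "provider": provider,
        "text": text, "confidence": confidence, "error": error,
        "latency_ms": (time.perf_counter() - started) * 1000.0,
    }


async def analyze_call(turn_audio: list[dict[str, bytes]],
                       providers: dict[str, Transcriber]) -> list[dict]:
    """turn_audio[i] maps rendition name -> audio bytes for user turn i."""
    tasks = [
        _run_cell(i, rendition, name, transcribe, renditions[rendition])
        for i, renditions in enumerate(turn_audio)
        for rendition in RENDITIONS if rendition in renditions
        for name, transcribe in providers.items()
    ]
    # gather preserves task order, so cells come back turn by turn.
    return await asyncio.gather(*tasks)
```

Catching exceptions per cell means a single provider outage shows up as an `error` entry in that cell rather than aborting the whole analysis.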
Diagnosis
diagnosis.py runs automatically post-ingestion and classifies issues by severity:
| Severity | Examples |
|---|---|
| critical | Dropped utterances, STT total failure |
| warning | High e2e latency, RNNoise over-attenuation |
| info | Near-miss speech detected, low confidence turns |
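A rule-based classifier along the lines of the severity table could look like the sketch below. It is not the content of diagnosis.py: the thresholds (2000 ms, 0.6), the issue codes, and the way "STT total failure" is detected are all assumptions chosen for illustration, and only a subset of the example issues is covered.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}


@dataclass(frozen=True)
class Issue:
    severity: str   # "critical", "warning", or "info"
    code: str       # hypothetical machine-readable identifier
    detail: str


def diagnose(session: dict, turns: list[dict],
             latency_warn_ms: float = 2000.0,    # assumed threshold
             low_confidence: float = 0.6) -> list[Issue]:  # assumed threshold
    """Return detected issues, most severe first."""
    issues = []
    user_turns = [t for t in turns if t.get("role") == "user"]

    # Assumption: "total failure" means no user turn produced any transcript.
    if user_turns and all(not t.get("content") for t in user_turns):
        issues.append(Issue("critical", "stt_total_failure",
                            "no user turn has a transcript"))

    avg = session.get("avg_e2e_latency_ms")
    if avg is not None and avg > latency_warn_ms:
        issues.append(Issue("warning", "high_e2e_latency", f"avg {avg:.0f} ms"))

    if session.get("has_near_miss"):
        issues.append(Issue("info", "near_miss", "near-miss speech detected"))

    low = [t for t in user_turns
           if t.get("stt_confidence") is not None
           and t["stt_confidence"] < low_confidence]
    if low:
        issues.append(Issue("info", "low_confidence",
                            f"{len(low)} low-confidence turn(s)"))

    # sorted() is stable, so issues of equal severity keep detection order.
    return sorted(issues, key=lambda i: SEVERITY_ORDER[i.severity])
```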
REST API
Sessions
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/sessions | List sessions (tenant, call_id, has_issues, date range) |
| GET | /api/v1/sessions/{session_id} | Session detail |
| GET | /api/v1/sessions/by-call/{call_id} | Resolve call_id → session |
| GET | /api/v1/sessions/{session_id}/timeline | Events and audio segments |
| GET | /api/v1/sessions/{session_id}/conversation | Conversation turns |
| GET | /api/v1/sessions/{session_id}/diagnosis | Detected issues |
| GET | /api/v1/sessions/{session_id}/audio/{segment_id} | Presigned audio URL (1 h TTL) |
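As a client-side illustration, the helper below builds a filtered list request for /api/v1/sessions. The path comes from the table above, but the query parameter names (`tenant_id`, `started_after`, `started_before`, and so on) and the base URL in the usage note are assumptions; check the service's OpenAPI schema for the real names.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def sessions_url(base_url: str, *,
                 tenant_id: Optional[str] = None,
                 call_id: Optional[str] = None,
                 has_issues: Optional[bool] = None,
                 started_after: Optional[date] = None,
                 started_before: Optional[date] = None) -> str:
    """Build a list-sessions URL, omitting filters that are not set."""
    params = {
        # All parameter names below are hypothetical.
        "tenant_id": tenant_id,
        "call_id": call_id,
        "has_issues": None if has_issues is None else str(has_issues).lower(),
        "started_after": started_after.isoformat() if started_after else None,
        "started_before": started_before.isoformat() if started_before else None,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/api/v1/sessions"
    return f"{url}?{query}" if query else url
```

For example, `sessions_url("http://localhost:8000", tenant_id="acme", has_issues=True)` yields a URL that could be passed to any HTTP client.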
Analysis
| Method | Path | Purpose |
|---|---|---|
| POST | /api/v1/analysis/{session_id} | Run full-call STT analysis |
| GET | /api/v1/analysis/{session_id} | Get cached analysis |
| GET | /api/v1/analysis | List analyses (paginated) |
| POST | /api/v1/analysis/batch | Enqueue batch (up to 50 sessions) |
| GET | /api/v1/analysis/batch/{job_id} | Batch job progress |
| POST | /api/v1/analysis/batch/{job_id}/cancel | Cancel batch |
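Because the batch endpoint accepts at most 50 sessions per request, a caller with a longer list has to split it. The sketch below shows one way to do that; the `session_ids` body field and the `batch_payloads` helper are assumptions, since the request schema is not documented here.

```python
from typing import Iterator

MAX_BATCH_SIZE = 50  # limit stated in the endpoint table


def batch_payloads(session_ids: list[str],
                   size: int = MAX_BATCH_SIZE) -> Iterator[dict]:
    """Yield one request body per batch of at most `size` unique sessions."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(session_ids))
    for start in range(0, len(unique), size):
        # "session_ids" is a hypothetical field name for the POST body.
        yield {"session_ids": unique[start:start + size]}
```

Each yielded body would be sent as its own POST to /api/v1/analysis/batch, giving one job_id per batch to poll or cancel.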
STT Reports
| Method | Path | Purpose |
|---|---|---|
| POST | /api/v1/reports | Submit transcription issue report |
| GET | /api/v1/reports | List reports |
| GET | /api/v1/reports/{report_id}/audio | Download reported audio |
| POST | /api/v1/reports/{report_id}/analyze | Run per-turn provider comparison |
| GET | /api/v1/reports/{report_id}/analysis | Get cached report analysis |
External Dependencies
| Dependency | Purpose |
|---|---|
| PostgreSQL | Sessions, turns, events, analysis, reports |
| Redis / Valkey | call-insights:sessions stream ingestion |
| MinIO / S3 | Short-term audio storage (presigned URLs, 1 h TTL) |
| DigitalOcean Spaces | Long-term archive for reported audio clips |
| STT Providers | Groq, Deepgram, Cartesia, ElevenLabs, Gladia (analysis only) |