Transcript Service

The Transcript Service handles the ingestion, distribution, and persistence of real-time call transcripts. It captures live speech-to-text (STT) output from Azure Communication Services (ACS) and LiveKit, publishes each segment to Redis for live clients, and persists the full transcript, including sentiment data, to PostgreSQL.

Overview

| Property | Value |
| --- | --- |
| Service | transcript-service |
| Image | communication-services/transcript-service |
| Namespace | nexivo |
| Replicas | 3 |
| Language | Java 21 |
| Framework | Spring Boot 3.4.4 |
| Storage | PostgreSQL (persistence), Redis (pub/sub) |

Tech Stack

| Component | Library |
| --- | --- |
| Framework | Spring Boot 3.4.4 |
| Language | Java 21 |
| Database | PostgreSQL |
| Real-time messaging | Redis pub/sub |
| ACS integration | Azure ACS CallAutomation SDK |
| Internal libraries | msl-core, multi-tenancy-core, communication-core |

Core Data Model

CallTranscript Entity

| Field | Type | Description |
| --- | --- | --- |
| id | UUID | Primary key |
| callId | String | Associated call identifier |
| text | String | Transcribed text segment |
| confidence | Double | STT confidence score |
| offset | Long | Start offset in milliseconds |
| duration | Long | Duration in milliseconds |
| isFinal | Boolean | Whether the segment is a final (non-interim) result |
| participantId | String | Resolved participant identifier |
| sentiment | JSON | Sentiment data attached to the segment |
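
As a rough illustration, the fields above could be mirrored by a plain Java record like the one below. The name `CallTranscriptSketch`, the choice to keep `sentiment` as a raw JSON string, and the `endOffset()` helper are assumptions made for this example; the service's actual persistence mapping is not shown in this document.

```java
import java.util.UUID;

// Hypothetical plain-Java mirror of the CallTranscript fields listed above.
// Persistence annotations are omitted because the real mapping is not documented here.
record CallTranscriptSketch(
        UUID id,
        String callId,
        String text,
        Double confidence,
        Long offset,        // start offset in milliseconds
        Long duration,      // duration in milliseconds
        Boolean isFinal,
        String participantId,
        String sentiment    // assumption: JSON kept as a raw string
) {
    // Illustrative helper, not part of the documented model: end offset in milliseconds.
    long endOffset() {
        return offset + duration;
    }
}
```

A segment that starts at 1000 ms and lasts 500 ms would report an end offset of 1500 ms.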

Key DTOs

CallTranscriptInfo — raw per-utterance transcription returned by the REST API:

| Field | Type |
| --- | --- |
| text | String |
| confidence | Double |
| offset | Long |
| duration | Long |
| isFinal | Boolean |
| participantId | String |
| sentiment | JSON |

Utterance — AI-processed speaker aggregation:

| Field | Type |
| --- | --- |
| speaker_id | String |
| speaker_role | String |
| text | String |
| start_time | Long |
| end_time | Long |

ParticipantInfo — resolved participant identity:

| Field | Type |
| --- | --- |
| id | String |
| roleId | String |
| displayName | String |
| kind | Enum: AI / Human |

REST API

| Method | Path | Response | Description |
| --- | --- | --- | --- |
| GET | /calls/{callId}/transcripts | List<CallTranscriptInfo> | Raw per-utterance transcriptions for a call |
| GET | /calls/{callId}/utterances | List<Utterance> | Aggregated utterances with speaker roles |
| GET | /health | | Health check |
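
The call-scoped endpoints can be exercised with the JDK's built-in HTTP client. The sketch below is only an illustration: the class name `TranscriptApiClient` and the base URL passed to it are placeholders, and authentication or tenant headers are left out because this document does not describe them.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the two /calls/{callId}/... endpoints listed above.
final class TranscriptApiClient {
    private final String baseUrl;
    private final HttpClient http = HttpClient.newHttpClient();

    TranscriptApiClient(String baseUrl) {
        // Drop a trailing slash so path concatenation stays predictable.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    // Builds GET /calls/{callId}/{resource}, where resource is "transcripts" or "utterances".
    HttpRequest buildRequest(String callId, String resource) {
        // Form-style encoding; adequate for simple identifiers without spaces.
        String encoded = URLEncoder.encode(callId, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/calls/" + encoded + "/" + resource))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Returns the raw response body; parsing into CallTranscriptInfo is left to the caller.
    String fetchTranscripts(String callId) throws IOException, InterruptedException {
        return http.send(buildRequest(callId, "transcripts"),
                HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Separating `buildRequest` from the network call keeps the URL construction easy to check without a running service.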

WebSocket Endpoints

| Path | Feed Source | Description |
| --- | --- | --- |
| /call-transcripts/{callId} | Azure ACS | Live ACS transcription feed for a call |
| /livekit-transcripts | LiveKit | Live LiveKit transcription feed |
| /transcripts/{transcriptId} | Redis | Redis-backed generic transcript stream |

Transcription Flow

```mermaid
sequenceDiagram
    participant ACS as Azure ACS
    participant LK as LiveKit
    participant ACSH as AcsCallTranscriptHandler
    participant LKH as LivekitTranscriptHandler
    participant CP as ContactService /<br/>AiProvisioningService
    participant Redis
    participant TH as TranscriptHandler<br/>(Redis Listener)
    participant WS as WebSocket Subscribers
    participant PG as PostgreSQL

    ACS->>ACSH: TranscriptionData (WebSocket)
    LK->>LKH: Transcription event (WebSocket)

    ACSH->>CP: Resolve participantId
    LKH->>CP: Resolve participantId
    CP-->>ACSH: ParticipantInfo (Human / AI)
    CP-->>LKH: ParticipantInfo (Human / AI)

    ACSH->>Redis: Publish to /transcripts/{callId}
    LKH->>Redis: Publish to /transcripts/{callId}

    ACSH->>PG: Persist CallTranscript
    LKH->>PG: Persist CallTranscript

    Redis->>TH: Message received
    TH->>WS: Broadcast to /transcripts/{transcriptId} subscribers
```

Handlers

AcsCallTranscriptHandler

Receives TranscriptionData events from Azure ACS over the /call-transcripts/{callId} WebSocket. For each event, the handler:

  1. Sets TenantContextHolder for the request scope.
  2. Resolves the participant via ContactService.findContactById() (human) or AiProvisioningService.findAiAgent() (AI).
  3. Publishes the enriched segment to the Redis channel /transcripts/{callId}.
  4. Persists the CallTranscript to PostgreSQL.
  5. Clears TenantContextHolder.
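
The five steps above can be sketched as follows. This is a simplified illustration, not the service's code: `TenantContext` is a thread-local stand-in for TenantContextHolder, and `ParticipantLookup`, `SegmentPublisher`, and `SegmentStore` are hypothetical interfaces standing in for the contact/AI services, the Redis publisher, and the PostgreSQL repository. Clearing the context in a `finally` block is an assumed design choice so that the context is removed even when a step fails.

```java
// Stand-in for TenantContextHolder (assumed here to be thread-local based).
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
    static void set(String tenantId) { CURRENT.set(tenantId); }
    static String get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

// Hypothetical collaborators used only for this sketch.
interface ParticipantLookup { String resolve(String rawParticipantId); }
interface SegmentPublisher { void publish(String channel, String payload); }
interface SegmentStore { void save(String callId, String participantId, String text); }

final class HandlerFlowSketch {
    private final ParticipantLookup lookup;
    private final SegmentPublisher publisher;
    private final SegmentStore store;

    HandlerFlowSketch(ParticipantLookup lookup, SegmentPublisher publisher, SegmentStore store) {
        this.lookup = lookup;
        this.publisher = publisher;
        this.store = store;
    }

    void onSegment(String tenantId, String callId, String rawParticipantId, String text) {
        TenantContext.set(tenantId);                                  // step 1
        try {
            String participantId = lookup.resolve(rawParticipantId); // step 2
            publisher.publish("/transcripts/" + callId,
                    participantId + ": " + text);                    // step 3
            store.save(callId, participantId, text);                 // step 4
        } finally {
            TenantContext.clear();                                   // step 5, also runs on failure
        }
    }
}
```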

LivekitTranscriptHandler

Mirrors the ACS handler flow for LiveKit transcription events received on /livekit-transcripts.

TranscriptHandler

Redis pub/sub listener. On message receipt, broadcasts the transcript segment to all active WebSocket subscribers on /transcripts/{transcriptId}.
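
The fan-out performed by this listener might look roughly like the sketch below. It is an in-memory illustration only: `Consumer<String>` stands in for a WebSocket session's send operation, the class name `BroadcastSketch` is hypothetical, and deriving the transcript ID from the part of the channel name after `/transcripts/` is an assumption about how channels map to subscriber paths.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// One Redis message in, N WebSocket sends out.
final class BroadcastSketch {
    private static final String PREFIX = "/transcripts/";

    // transcriptId -> active subscribers (each Consumer stands in for a WebSocket session).
    private final Map<String, Set<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String transcriptId, Consumer<String> session) {
        subscribers.computeIfAbsent(transcriptId, k -> ConcurrentHashMap.newKeySet()).add(session);
    }

    void unsubscribe(String transcriptId, Consumer<String> session) {
        Set<Consumer<String>> sessions = subscribers.get(transcriptId);
        if (sessions != null) {
            sessions.remove(session);
        }
    }

    // Invoked when a message arrives on a Redis channel.
    // Returns the number of subscribers the payload was delivered to.
    int onRedisMessage(String channel, String payload) {
        if (!channel.startsWith(PREFIX)) {
            return 0; // not a transcript channel
        }
        String transcriptId = channel.substring(PREFIX.length());
        Set<Consumer<String>> sessions = subscribers.getOrDefault(transcriptId, Set.of());
        sessions.forEach(session -> session.accept(payload));
        return sessions.size();
    }
}
```

Only subscribers registered for the matching transcript ID receive the payload; messages for other IDs are not delivered to them.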

Participant Resolution

| Source | Service | Kind |
| --- | --- | --- |
| Human contacts | ContactService.findContactById() | Human |
| AI agents | AiProvisioningService.findAiAgent() | AI |
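
One way to picture the resolution step is a two-stage lookup. In the sketch below, the two `Function` fields stand in for ContactService.findContactById() and AiProvisioningService.findAiAgent(); their `Optional`-returning shape, the order in which they are tried, and the class name `ParticipantResolverSketch` are all assumptions, since this document does not specify how the handler chooses between the two services.

```java
import java.util.Optional;
import java.util.function.Function;

enum ParticipantKind { AI, HUMAN }

// Mirrors the ParticipantInfo fields: id, roleId, displayName, kind.
record ParticipantInfo(String id, String roleId, String displayName, ParticipantKind kind) {}

final class ParticipantResolverSketch {
    // Stand-ins for the contact and AI agent lookups (hypothetical signatures).
    private final Function<String, Optional<ParticipantInfo>> contactLookup;
    private final Function<String, Optional<ParticipantInfo>> aiAgentLookup;

    ParticipantResolverSketch(Function<String, Optional<ParticipantInfo>> contactLookup,
                              Function<String, Optional<ParticipantInfo>> aiAgentLookup) {
        this.contactLookup = contactLookup;
        this.aiAgentLookup = aiAgentLookup;
    }

    // Assumed order: try human contacts first, then fall back to AI agents.
    Optional<ParticipantInfo> resolve(String participantId) {
        return contactLookup.apply(participantId)
                .or(() -> aiAgentLookup.apply(participantId));
    }
}
```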

Multi-Tenancy

Each handler sets TenantContextHolder at the start of an invocation and clears it at the end. Paths matching /transcripts/.* are excluded from the global tenant filter so that the Redis listener can operate without a tenant context.
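
The exclusion rule can be illustrated with a small path check. This is a sketch under assumptions: the class name `TenantFilterExclusionSketch` is hypothetical, and treating /transcripts/.* as a regular expression that must match the whole request path is a guess at how the real filter evaluates it.

```java
import java.util.regex.Pattern;

final class TenantFilterExclusionSketch {
    // Exclusion pattern taken from the text above; full-path matching is assumed.
    private static final Pattern EXCLUDED = Pattern.compile("/transcripts/.*");

    // Returns true when the tenant filter should apply to the given request path.
    static boolean requiresTenant(String path) {
        return !EXCLUDED.matcher(path).matches();
    }
}
```

Under this reading, /transcripts/{transcriptId} bypasses the filter, while /call-transcripts/{callId}, /livekit-transcripts, and the REST paths under /calls/ remain tenant-scoped.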