Voice AI Pipeline — Overview

The Voice AI pipeline handles real-time AI-powered voice calls end-to-end. It sits inside the Nexivo platform as a specialised subsystem dedicated to voice interactions.

Pipeline Architecture

graph LR
    CS[Call Service] -->|agent_id| LK[LiveKit]
    LK -->|media + agent_id| AT[Atlas\nAdapter]
    AT <-->|agent config| CP[Compass]
    AT --> STT[STT Service]
    STT --> LLM[LLM Service]
    LLM --> TTS[TTS Service]
    TTS --> AT
    AT -->|post-call data| BL[Billing Service]
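
The diagram can also be read as a plain directed graph. The sketch below is only an illustration: it copies the edges from the diagram into a hypothetical `EDGES` list (the helper names `successors` and `follow` are assumptions, not part of the platform) and shows that the STT → LLM → TTS chain returns to Atlas.

```python
# Edges copied from the diagram above: (source, destination, label).
# The bidirectional Atlas <-> Compass link is written as two edges.
EDGES = [
    ("Call Service", "LiveKit", "agent_id"),
    ("LiveKit", "Atlas", "media + agent_id"),
    ("Atlas", "Compass", "agent config"),
    ("Compass", "Atlas", "agent config"),
    ("Atlas", "STT Service", ""),
    ("STT Service", "LLM Service", ""),
    ("LLM Service", "TTS Service", ""),
    ("TTS Service", "Atlas", ""),
    ("Atlas", "Billing Service", "post-call data"),
]


def successors(node: str) -> list[str]:
    """Return the services that `node` sends data to, sorted by name."""
    return sorted({dst for src, dst, _ in EDGES if src == node})


def follow(start: str, end: str) -> list[str]:
    """Walk single-successor hops from `start` until `end` is reached."""
    path = [start]
    while path[-1] != end:
        nxt = successors(path[-1])
        if len(nxt) != 1:
            raise ValueError(f"{path[-1]} does not have exactly one successor")
        path.append(nxt[0])
    return path


if __name__ == "__main__":
    print(successors("Atlas"))
    print(follow("STT Service", "Atlas"))
```

Atlas is the only node with more than one outgoing edge, which matches its role as the hub.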

Services

| Service | Role |
| --- | --- |
| Call Service | Handles inbound/outbound calls; passes agent_id to LiveKit |
| LiveKit | Media and session layer |
| Atlas | Central adapter; orchestrates the AI pipeline |
| Compass | Agent Provisioning; returns agent config by agent_id |
| STT Service | Speech-to-Text: VAD → Language Detection → Model Router → Turn Detection |
| LLM Service | Language Model: Guardrails → Model Router → Tool Polling → LangChain |
| TTS Service | Text-to-Speech: Model Router → TTS Server → Language Router |
| Billing Service | Receives and stores call data published post-call |
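
The arrows in the STT, LLM and TTS rows describe an ordered chain of internal stages. As a rough sketch of that idea, assuming each stage is a function from payload to payload, the stage order from the Services section can be held as data and applied in sequence. `SERVICE_STAGES`, `make_stage` and `run_service` are hypothetical names, and the stages here only record that they ran; they do no real audio or language processing.

```python
from typing import Callable

Stage = Callable[[dict], dict]

# Stage order copied from the Services section above.
SERVICE_STAGES: dict[str, list[str]] = {
    "STT": ["VAD", "Language Detection", "Model Router", "Turn Detection"],
    "LLM": ["Guardrails", "Model Router", "Tool Polling", "LangChain"],
    "TTS": ["Model Router", "TTS Server", "Language Router"],
}


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that appends its name to the payload trace."""
    def stage(payload: dict) -> dict:
        return {**payload, "trace": payload.get("trace", []) + [name]}
    return stage


def run_service(service: str, payload: dict) -> dict:
    """Pass the payload through every stage of a service, in order."""
    for name in SERVICE_STAGES[service]:
        payload = make_stage(name)(payload)
    return payload


if __name__ == "__main__":
    print(run_service("STT", {"audio": b"..."})["trace"])
```

Keeping the order as data makes it easy to see, and to test, that a stage such as Guardrails runs before the Model Router.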

Key Responsibilities of Atlas

Atlas is the hub of the pipeline. It:

  1. Receives the media session and agent_id from LiveKit
  2. Fetches agent configuration from Compass
  3. Streams audio to STT and receives transcripts
  4. Sends transcripts and context to LLM and receives response text
  5. Sends response text to TTS and streams back the synthesised audio
  6. Publishes call data to Billing after the call ends
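
The six steps above can be sketched as a single orchestration loop. This is a minimal illustration under stated assumptions, not the actual Atlas implementation: every class, method name and payload field below (`get_agent_config`, `transcribe`, `respond`, `synthesise`, `publish`, the `turns` counter) is hypothetical, and the fake services stand in for the real network clients.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class Atlas:
    """Hypothetical hub: each field is a client for one downstream service."""
    compass: Any
    stt: Any
    llm: Any
    tts: Any
    billing: Any

    def handle_call(self, agent_id: str, audio_in: Iterable[bytes]) -> Iterator[bytes]:
        # Steps 1-2: receive the session and fetch the agent configuration.
        config = self.compass.get_agent_config(agent_id)
        context: list[str] = []
        turns = 0
        try:
            # Step 3: stream audio to STT and receive transcripts.
            for transcript in self.stt.transcribe(audio_in, config):
                # Step 4: send transcript and context to the LLM.
                reply = self.llm.respond(transcript, context, config)
                context += [transcript, reply]
                turns += 1
                # Step 5: stream the synthesised audio back to the caller.
                yield from self.tts.synthesise(reply, config)
        finally:
            # Step 6: publish call data once the call has ended.
            self.billing.publish({"agent_id": agent_id, "turns": turns})


# Fake services, for illustration only.
class FakeCompass:
    def get_agent_config(self, agent_id: str) -> dict:
        return {"agent_id": agent_id}


class FakeSTT:
    def transcribe(self, audio: Iterable[bytes], config: dict) -> Iterator[str]:
        for chunk in audio:
            yield chunk.decode()


class FakeLLM:
    def respond(self, transcript: str, context: list[str], config: dict) -> str:
        return f"echo: {transcript}"


class FakeTTS:
    def synthesise(self, text: str, config: dict) -> Iterator[bytes]:
        yield text.encode()


class FakeBilling:
    def __init__(self) -> None:
        self.published: list[dict] = []

    def publish(self, call_data: dict) -> None:
        self.published.append(call_data)


if __name__ == "__main__":
    billing = FakeBilling()
    atlas = Atlas(FakeCompass(), FakeSTT(), FakeLLM(), FakeTTS(), billing)
    print(list(atlas.handle_call("agent-1", [b"hello", b"bye"])))
    print(billing.published)
```

The `try`/`finally` in this sketch is one way to make sure the billing step runs even if the caller hangs up and the stream is closed early.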