Voice AI Pipeline — Overview
The Voice AI pipeline handles real-time AI-powered voice calls end-to-end. It sits inside the Nexivo platform as a specialised subsystem dedicated to voice interactions.
Pipeline Architecture
graph LR
    CS[Call Service] -->|agent_id| LK[LiveKit]
    LK -->|media + agent_id| AT[Atlas\nAdapter]
    AT <-->|agent config| CP[Compass]
    AT --> STT[STT Service]
    STT --> LLM[LLM Service]
    LLM --> TTS[TTS Service]
    TTS --> AT
    AT -->|post-call data| BL[Billing Service]
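
To make the data flow in the diagram easier to trace, the same edges can be written as an adjacency map and walked programmatically. This is only an illustrative sketch: the edge list is copied from the diagram, while the `EDGES` layout and the `find_path` helper are hypothetical and not part of the Nexivo platform.

```python
from collections import deque

# Edges copied from the diagram; the dict representation is an assumption
# made for illustration only.
EDGES = {
    "Call Service": ["LiveKit"],
    "LiveKit": ["Atlas"],
    "Atlas": ["Compass", "STT Service", "Billing Service"],
    "Compass": ["Atlas"],
    "STT Service": ["LLM Service"],
    "LLM Service": ["TTS Service"],
    "TTS Service": ["Atlas"],
}


def find_path(start, goal):
    """Breadth-first search for the shortest hop sequence from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal is not reachable from start
```

For example, `find_path("STT Service", "Atlas")` walks the per-turn loop STT Service, LLM Service, TTS Service, Atlas, and `find_path("Billing Service", "Atlas")` returns `None` because Billing only receives data.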
Services
| Service | Role |
|---|---|
| Call Service | Handles inbound/outbound calls; passes agent_id to LiveKit |
| LiveKit | Media and session layer |
| Atlas | Central adapter; orchestrates the AI pipeline |
| Compass | Agent Provisioning; returns agent config by agent_id |
| STT Service | Speech-to-Text: VAD → Language Detection → Model Router → Turn Detection |
| LLM Service | Language Model: Guardrails → Model Router → Tool Polling → LangChain |
| TTS Service | Text-to-Speech: Model Router → TTS Server → Language Router |
| Billing Service | Receives and stores call data published post-call |
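
The stage chains listed for STT, LLM and TTS can be read as ordered pipelines in which each stage hands its output to the next. The sketch below only models that ordering. The stage names come from the table, but `STAGES`, `make_stage` and `build_chain` are hypothetical helpers, and real stages would transform audio or text rather than append to a trace.

```python
from functools import reduce

# Stage order copied from the Services table.
STAGES = {
    "STT": ["VAD", "Language Detection", "Model Router", "Turn Detection"],
    "LLM": ["Guardrails", "Model Router", "Tool Polling", "LangChain"],
    "TTS": ["Model Router", "TTS Server", "Language Router"],
}


def make_stage(name):
    """Return a placeholder stage that records its name in the payload trace."""
    def stage(payload):
        return {**payload, "trace": payload["trace"] + [name]}
    return stage


def build_chain(service):
    """Compose the stages of one service into a single callable, in table order."""
    stages = [make_stage(name) for name in STAGES[service]]

    def run(payload):
        # Feed the payload through each stage from left to right.
        return reduce(lambda acc, stage: stage(acc), stages, payload)

    return run
```

Calling `build_chain("STT")({"trace": []})` returns a payload whose `trace` lists the four STT stages in the order given in the table.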
Key Responsibilities of Atlas
Atlas is the hub of the pipeline. It:
- Receives the media session and agent_id from LiveKit
- Fetches agent configuration from Compass
- Streams audio to STT and receives transcripts
- Sends transcripts and context to LLM and receives response text
- Sends response text to TTS and streams back the synthesised audio
- Publishes call data to Billing after the call ends
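
The responsibilities above could be sketched roughly as follows. This is a minimal, synchronous illustration under stated assumptions, not the actual Atlas implementation: every class and method name here (`Atlas.handle_call`, `get_agent_config`, `transcribe`, `respond`, `synthesise`, `publish`, `CallRecord`, `StubBackend`) is hypothetical, and a real deployment would stream audio asynchronously through LiveKit rather than iterate over byte chunks.

```python
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    """Hypothetical post-call payload published to Billing."""
    agent_id: str
    turns: list = field(default_factory=list)


class Atlas:
    """Illustrative orchestrator; the client interfaces are assumptions."""

    def __init__(self, compass, stt, llm, tts, billing):
        self.compass, self.stt, self.llm = compass, stt, llm
        self.tts, self.billing = tts, billing

    def handle_call(self, agent_id, audio_chunks):
        # Fetch the agent configuration from Compass once per call.
        config = self.compass.get_agent_config(agent_id)
        record = CallRecord(agent_id)
        for chunk in audio_chunks:
            transcript = self.stt.transcribe(chunk)             # audio -> text
            reply = self.llm.respond(transcript, record.turns, config)
            record.turns.append((transcript, reply))
            yield self.tts.synthesise(reply)                    # text -> audio
        # Runs only after the caller has consumed all audio, i.e. post-call.
        self.billing.publish(record)


class StubBackend:
    """Stand-in for all five services so the sketch runs without a network."""

    def __init__(self):
        self.published = []

    def get_agent_config(self, agent_id):
        return {"agent_id": agent_id}

    def transcribe(self, chunk):
        return chunk.decode("utf-8")

    def respond(self, transcript, context, config):
        return f"echo: {transcript}"

    def synthesise(self, text):
        return text.encode("utf-8")

    def publish(self, record):
        self.published.append(record)
```

With a single `StubBackend` passed in for every dependency, `list(Atlas(b, b, b, b, b).handle_call("agent-1", [b"hi"]))` yields the synthesised reply and leaves one `CallRecord` in `b.published`. Writing `handle_call` as a generator keeps the Billing publish after the final audio chunk, which mirrors the post-call step in the list.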