Billing Service
Language: Java 21
Framework: Spring Boot 3.5.5
Source: ~/IdeaProjects/NEXIVO/billing-service
Purpose
The Billing Service receives AI usage events from Atlas after every call, calculates costs against configurable pricing tables, stores per-call and aggregated records, and exposes APIs for cost reporting and billing period management.
Event Ingestion
Atlas sends a POST /usage/events request to the Billing Service at the end of every call.
AtlasUsageEventDto

    {
      "callId": "uuid",
      "tenantId": "string",
      "channelId": "string",
      "agentId": "string",
      "timestamp": "2026-06-21T10:05:32Z",
      "metrics": {
        "stt": {
          "provider": "openai",
          "model": "gpt-4o-transcribe",
          "durationSeconds": 45,
          "transcriptChars": 1200
        },
        "llm": {
          "provider": "google",
          "model": "gemini-2.5-flash",
          "inputTokens": 500,
          "outputTokens": 150,
          "turnCount": 5
        },
        "tts": {
          "provider": "openai",
          "model": "gpt-4o-mini-tts",
          "characters": 800,
          "responseChars": 800
        },
        "realtime": null
      },
      "metadata": {
        "language": "en",
        "wasTransferred": false,
        "isRealtimeMode": false
      }
    }

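The payload above could be bound to Java records along the following lines. This is a minimal sketch, not the service's actual class: the nested record names, the numeric field types, the opaque `realtime` field, and the choice of which fields are mandatory are all assumptions inferred from the JSON example.

```java
import java.time.Instant;
import java.util.Objects;

// Sketch only: record names and field types are assumptions inferred from the JSON example.
record SttMetrics(String provider, String model, long durationSeconds, long transcriptChars) {}

record LlmMetrics(String provider, String model, long inputTokens, long outputTokens, int turnCount) {}

record TtsMetrics(String provider, String model, long characters, long responseChars) {}

// "realtime" is null in the example, so its shape is unknown and kept as an opaque Object here.
record UsageMetrics(SttMetrics stt, LlmMetrics llm, TtsMetrics tts, Object realtime) {}

record EventMetadata(String language, boolean wasTransferred, boolean isRealtimeMode) {}

record AtlasUsageEventDto(String callId, String tenantId, String channelId, String agentId,
                          Instant timestamp, UsageMetrics metrics, EventMetadata metadata) {
    // Compact constructor: reject events missing the identifiers that billing keys on.
    // Which fields are mandatory is an assumption.
    AtlasUsageEventDto {
        Objects.requireNonNull(callId, "callId");
        Objects.requireNonNull(tenantId, "tenantId");
        Objects.requireNonNull(timestamp, "timestamp");
    }
}
```

Records keep the DTO immutable, and validating in the compact constructor means a malformed event fails at binding time rather than during cost calculation.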
Cost Calculation
CostCalculationService applies pricing rules from the pricing_configs table:
| Component | Formula |
|---|---|
| STT | (durationSeconds / 60) × pricePerMinute |
| LLM | (inputTokens / 1000) × pricePerKInputTokens + (outputTokens / 1000) × pricePerKOutputTokens |
| TTS | (characters / 1000) × pricePerKCharacters |
| Realtime | Same token formula as LLM; falls back to LLM pricing if no REALTIME entry exists |
All costs use BigDecimal at scale 6 with HALF_UP rounding.
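The formulas in the table can be sketched with BigDecimal as follows. The class and method names are hypothetical, and so is the order of operations: this sketch multiplies first and divides last so that rounding to scale 6 happens once per component. The real CostCalculationService may round at a different step.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper illustrating the table's formulas; not the service's actual code.
final class CostFormulas {
    private static final int SCALE = 6;
    private static final BigDecimal SIXTY = BigDecimal.valueOf(60);
    private static final BigDecimal THOUSAND = BigDecimal.valueOf(1000);

    private CostFormulas() {}

    // STT: (durationSeconds / 60) x pricePerMinute
    static BigDecimal sttCost(long durationSeconds, BigDecimal pricePerMinute) {
        return BigDecimal.valueOf(durationSeconds)
                .multiply(pricePerMinute)
                .divide(SIXTY, SCALE, RoundingMode.HALF_UP);
    }

    // LLM: (inputTokens / 1000) x input rate + (outputTokens / 1000) x output rate
    static BigDecimal llmCost(long inputTokens, long outputTokens,
                              BigDecimal pricePerKInputTokens, BigDecimal pricePerKOutputTokens) {
        BigDecimal input = BigDecimal.valueOf(inputTokens).multiply(pricePerKInputTokens);
        BigDecimal output = BigDecimal.valueOf(outputTokens).multiply(pricePerKOutputTokens);
        return input.add(output).divide(THOUSAND, SCALE, RoundingMode.HALF_UP);
    }

    // TTS: (characters / 1000) x pricePerKCharacters
    static BigDecimal ttsCost(long characters, BigDecimal pricePerKCharacters) {
        return BigDecimal.valueOf(characters)
                .multiply(pricePerKCharacters)
                .divide(THOUSAND, SCALE, RoundingMode.HALF_UP);
    }
}
```

With made-up rates (not real provider prices), the sample event's 45 seconds of STT at 0.006 per minute gives `sttCost(45, new BigDecimal("0.006"))` = 0.004500, and its 800 TTS characters at 0.015 per 1K characters give 0.012000.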
Pricing is looked up by (provider, model, usageType) at the event's timestamp, so a rate change takes effect for events that fall inside its effectiveFrom / effectiveTo window.
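A time-windowed lookup of this kind might look like the sketch below, which works on an in-memory list instead of the pricing_configs table. Several details are assumptions: the record and method names, treating effectiveFrom as inclusive and effectiveTo as exclusive, treating a null effectiveTo as open-ended, and letting the most recent effectiveFrom win when windows overlap. The REALTIME-to-LLM fallback follows the formula table.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified pricing row; the other rate columns are omitted for brevity.
record PricingConfig(String provider, String model, String usageType,
                     BigDecimal pricePerKInputTokens, Instant effectiveFrom, Instant effectiveTo) {

    // Assumption: [effectiveFrom, effectiveTo) with a null effectiveTo meaning "still active".
    boolean isEffectiveAt(Instant t) {
        return !t.isBefore(effectiveFrom) && (effectiveTo == null || t.isBefore(effectiveTo));
    }
}

final class PricingLookup {
    private PricingLookup() {}

    static Optional<PricingConfig> find(List<PricingConfig> configs, String provider,
                                        String model, String usageType, Instant at) {
        return configs.stream()
                .filter(c -> c.provider().equals(provider)
                        && c.model().equals(model)
                        && c.usageType().equals(usageType))
                .filter(c -> c.isEffectiveAt(at))
                // If windows overlap, prefer the entry that started most recently (assumption).
                .max(Comparator.comparing(PricingConfig::effectiveFrom));
    }

    // Realtime usage falls back to LLM pricing when no REALTIME entry matches.
    static Optional<PricingConfig> findRealtime(List<PricingConfig> configs, String provider,
                                                String model, Instant at) {
        return find(configs, provider, model, "REALTIME", at)
                .or(() -> find(configs, provider, model, "LLM", at));
    }
}
```

Returning an Optional forces the caller to decide what an unpriced event means (reject, price at zero, or flag for review); the source does not say which of these the service does.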
Data Model
usage_events
| Field | Description |
|---|---|
| id | UUID primary key |
| call_id | Call identifier |
| tenant_id | Multi-tenant isolation |
| STT fields | stt_provider, stt_model, stt_duration_seconds, user_transcript_chars |
| LLM fields | llm_provider, llm_model, llm_input_tokens, llm_output_tokens, turn_count |
| TTS fields | tts_provider, tts_model, tts_characters, agent_response_chars |
| Metadata | language, was_transferred, is_realtime_mode |
| Timestamps | event_time, created_at |
aggregated_usage
Pre-computed hourly / daily / monthly summaries per tenant. Includes:
- Total calls, durations, tokens, characters
- Provider breakdown maps (JSONB)
- Pre-computed estimated_stt_cost, estimated_llm_cost, estimated_tts_cost, estimated_total_cost
Aggregation jobs run on schedule:
| Period | Cron | Retention |
|---|---|---|
| Hourly | 0 5 * * * * | 90 days |
| Daily | 0 15 0 * * * | 1 year |
| Monthly | 0 30 0 1 * * | 3 years |
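Read as Spring-style six-field cron expressions (second, minute, hour, day of month, month, day of week), the schedules above fire at minute 5 of every hour, at 00:15 every day, and at 00:30 on the first day of every month. The hourly roll-up itself can be sketched as a group-by on the truncated event hour. All type and method names below are hypothetical, only token totals are shown, and the real job presumably reads from usage_events and writes to aggregated_usage.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, trimmed-down view of a usage_events row.
record UsageSample(String tenantId, Instant eventTime, long llmInputTokens, long llmOutputTokens) {}

// One aggregate row is identified by tenant plus the start of the hour.
record BucketKey(String tenantId, Instant hourStart) {}

record HourlyTotals(long calls, long inputTokens, long outputTokens) {
    HourlyTotals plus(HourlyTotals other) {
        return new HourlyTotals(calls + other.calls,
                inputTokens + other.inputTokens,
                outputTokens + other.outputTokens);
    }
}

final class HourlyAggregator {
    private HourlyAggregator() {}

    static Map<BucketKey, HourlyTotals> aggregate(List<UsageSample> events) {
        Map<BucketKey, HourlyTotals> totals = new HashMap<>();
        for (UsageSample e : events) {
            // Truncating the Instant to the hour buckets events in UTC.
            BucketKey key = new BucketKey(e.tenantId(), e.eventTime().truncatedTo(ChronoUnit.HOURS));
            HourlyTotals single = new HourlyTotals(1, e.llmInputTokens(), e.llmOutputTokens());
            totals.merge(key, single, HourlyTotals::plus);
        }
        return totals;
    }
}
```

Running a few minutes past the boundary, as the cron offsets do, gives late-arriving events for the previous period a chance to land before the roll-up runs.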
pricing_configs
| Field | Description |
|---|---|
| provider | e.g. openai, google, elevenlabs |
| model | e.g. gpt-4o-transcribe, gemini-2.5-flash |
| usage_type | STT, LLM, TTS, REALTIME |
| price_per_minute | STT rate (per minute) |
| price_per_k_input_tokens | LLM input rate (per 1K tokens) |
| price_per_k_output_tokens | LLM output rate (per 1K tokens) |
| price_per_k_characters | TTS rate (per 1K characters) |
| effective_from / effective_to | Time-bound rate validity |
| currency | Default USD |
billing_periods
Tracks per-tenant billing cycles.
| Status | Description |
|---|---|
| OPEN | Current active period |
| CLOSED | Period ended, totals locked |
| INVOICED | Invoice generated |
| PAID | Payment received |
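The four statuses suggest a lifecycle that could be modelled as an enum with explicit transitions. The sketch below assumes a strictly linear flow (OPEN, then CLOSED, then INVOICED, then PAID); the source lists the statuses but does not state which transitions are allowed, so the transition rules and method names here are illustrative only.

```java
import java.util.EnumSet;
import java.util.Set;

enum BillingPeriodStatus {
    OPEN, CLOSED, INVOICED, PAID;

    // Assumption: each status can only move to the next one in the list.
    Set<BillingPeriodStatus> allowedNext() {
        return switch (this) {
            case OPEN -> EnumSet.of(CLOSED);
            case CLOSED -> EnumSet.of(INVOICED);
            case INVOICED -> EnumSet.of(PAID);
            case PAID -> EnumSet.noneOf(BillingPeriodStatus.class);
        };
    }

    boolean canTransitionTo(BillingPeriodStatus target) {
        return allowedNext().contains(target);
    }
}
```

Under this model, `BillingPeriodStatus.OPEN.canTransitionTo(BillingPeriodStatus.CLOSED)` is true, while a jump straight from OPEN to PAID is rejected.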
REST API
Usage
| Method | Path | Purpose |
|---|---|---|
| POST | /usage/events | Ingest Atlas call event |
| GET | /usage/summary | Total usage + costs for tenant + period |
| GET | /usage/by-channel | Aggregate by channel |
| GET | /usage/by-agent | Aggregate by agent |
| GET | /usage/calls/{callId} | Single call usage record |
| GET | /usage/trends | Time-series usage (HOURLY / DAILY / MONTHLY) |
| GET | /usage/realtime/daily | Live daily counters from Redis |
| GET | /usage/realtime/monthly | Live monthly counters from Redis |
| GET | /usage | Paginated usage events |
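For orientation, requests to two of these endpoints could be built with the JDK's java.net.http client as sketched below. The helper class, the base URL passed in by the caller, and the five-second timeout are assumptions. Authentication headers and the query parameters of the reporting endpoints are not documented here, so they are left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical client-side helper; only the paths come from the table above.
final class UsageEventRequests {
    private UsageEventRequests() {}

    // POST /usage/events with a JSON body such as the AtlasUsageEventDto example.
    static HttpRequest ingest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/usage/events"))
                .header("Content-Type", "application/json")
                .timeout(Duration.ofSeconds(5)) // assumed timeout
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // GET /usage/calls/{callId}
    static HttpRequest callUsage(String baseUrl, String callId) {
        String encoded = URLEncoder.encode(callId, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/usage/calls/" + encoded))
                .GET()
                .build();
    }
}
```

A caller would pass the result to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the response shape is not described in this document.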
Costs
| Method | Path | Purpose |
|---|---|---|
| GET | /costs/breakdown | STT / LLM / TTS cost breakdown |
| GET | /costs/by-provider | Costs grouped by provider |
| GET | /costs/by-channel | Costs grouped by channel |
| GET | /costs/calls/{callId} | Single call cost |
| GET | /costs/forecast | 30-day cost projection |
Pricing
| Method | Path | Purpose |
|---|---|---|
| GET | /pricing | All active pricing configs |
| GET | /pricing/{provider}/{model} | Pricing for specific provider/model |
| POST | /pricing | Create pricing config |
| PUT | /pricing/{id} | Update pricing config |
| DELETE | /pricing/{id} | Deactivate pricing config |
| GET | /pricing/history/{provider}/{model} | Historical pricing versions |
Billing Periods
| Method | Path | Purpose |
|---|---|---|
| GET | /billing-periods | List periods for tenant |
| GET | /billing-periods/current | Get open period |
| POST | /billing-periods/{id}/close | Close billing period |
Real-Time Counters (Redis)
RealtimeUsageCounterService maintains fast Redis hash counters incremented on every event:
| Key pattern | TTL |
|---|---|
| billing:usage:daily:{tenantId}:{yyyy-MM-dd} | 7 days |
| billing:usage:monthly:{tenantId}:{yyyy-MM} | 90 days |
Fields: calls, duration, llm_input_tokens, llm_output_tokens, tts_characters, stt_duration
Used by the Agent Desktop AI dashboard for live counters.
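The key patterns and TTLs above can be captured in a small helper like the one below. The class and method names are hypothetical, and deriving the date from the event timestamp in UTC is an assumption; the service may use a tenant-specific time zone. The increment itself would typically be a Redis HINCRBY per field followed by EXPIRE on the key; it is omitted here because it needs a Redis client.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical key builder for the real-time usage counters.
final class UsageCounterKeys {
    static final Duration DAILY_TTL = Duration.ofDays(7);
    static final Duration MONTHLY_TTL = Duration.ofDays(90);

    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyy-MM");

    private UsageCounterKeys() {}

    static String daily(String tenantId, LocalDate day) {
        return "billing:usage:daily:" + tenantId + ":" + DAY.format(day);
    }

    static String monthly(String tenantId, YearMonth month) {
        return "billing:usage:monthly:" + tenantId + ":" + MONTH.format(month);
    }

    // Assumption: counters are bucketed by the event time interpreted in UTC.
    static String dailyFor(String tenantId, Instant eventTime) {
        return daily(tenantId, eventTime.atZone(ZoneOffset.UTC).toLocalDate());
    }
}
```

For the sample event, `UsageCounterKeys.dailyFor("acme", Instant.parse("2026-06-21T10:05:32Z"))` yields `billing:usage:daily:acme:2026-06-21`, where `acme` is a made-up tenant ID.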
External Dependencies
| Dependency | Purpose |
|---|---|
| PostgreSQL 17 | Usage events, aggregates, pricing, billing periods |
| Redis 7 | Real-time usage counters |
| RabbitMQ | Async event ingestion (alongside REST) |
| Spring Cloud Kubernetes | Service discovery |