Billing Service

Language: Java 21
Framework: Spring Boot 3.5.5
Source: ~/IdeaProjects/NEXIVO/billing-service

Purpose

The Billing Service receives AI usage events from Atlas after every call, calculates costs against configurable pricing tables, stores per-call and aggregated records, and exposes APIs for cost reporting and billing period management.

Event Ingestion

Atlas sends a POST /usage/events request to the Billing Service at the end of every call. The request body is an AtlasUsageEventDto.

`AtlasUsageEventDto`

{
  "callId": "uuid",
  "tenantId": "string",
  "channelId": "string",
  "agentId": "string",
  "timestamp": "2026-06-21T10:05:32Z",
  "metrics": {
    "stt": {
      "provider": "openai",
      "model": "gpt-4o-transcribe",
      "durationSeconds": 45,
      "transcriptChars": 1200
    },
    "llm": {
      "provider": "google",
      "model": "gemini-2.5-flash",
      "inputTokens": 500,
      "outputTokens": 150,
      "turnCount": 5
    },
    "tts": {
      "provider": "openai",
      "model": "gpt-4o-mini-tts",
      "characters": 800,
      "responseChars": 800
    },
    "realtime": null
  },
  "metadata": {
    "language": "en",
    "wasTransferred": false,
    "isRealtimeMode": false
  }
}
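As an illustration of the ingestion call, here is a minimal sketch that builds (but does not send) the POST /usage/events request with the JDK's built-in HTTP client types. The base URL, the class name `UsageEventRequestSketch`, and the identifier values in the payload are hypothetical; the metric values are copied from the example above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class UsageEventRequestSketch {

    // Hypothetical base URL; the real host and port depend on the deployment.
    static final String BASE_URL = "http://billing-service.example:8080";

    /** Builds the POST /usage/events request for a JSON payload without sending it. */
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/usage/events"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder identifiers; metric values mirror the documented example.
        String body = """
                {
                  "callId": "00000000-0000-0000-0000-000000000000",
                  "tenantId": "tenant-1",
                  "channelId": "channel-1",
                  "agentId": "agent-1",
                  "timestamp": "2026-06-21T10:05:32Z",
                  "metrics": {
                    "stt": {"provider": "openai", "model": "gpt-4o-transcribe", "durationSeconds": 45, "transcriptChars": 1200},
                    "llm": {"provider": "google", "model": "gemini-2.5-flash", "inputTokens": 500, "outputTokens": 150, "turnCount": 5},
                    "tts": {"provider": "openai", "model": "gpt-4o-mini-tts", "characters": 800, "responseChars": 800},
                    "realtime": null
                  },
                  "metadata": {"language": "en", "wasTransferred": false, "isRealtimeMode": false}
                }
                """;
        HttpRequest request = buildRequest(BASE_URL, body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The response shape is not documented here, so the sketch stops at building the request.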

Cost Calculation

CostCalculationService applies pricing rules from the pricing_configs table:

Component	Formula
STT	`(durationSeconds / 60) × pricePerMinute`
LLM	`(inputTokens / 1000) × pricePerKInputTokens + (outputTokens / 1000) × pricePerKOutputTokens`
TTS	`(characters / 1000) × pricePerKCharacters`
Realtime	Same token-based formula as LLM, priced from the `REALTIME` entry; falls back to the `LLM` entry if no `REALTIME` entry exists

All costs use BigDecimal at scale 6 with HALF_UP rounding.
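The formulas and rounding rule can be sketched as follows. This is not the actual CostCalculationService code: the class name `CostFormulaSketch`, the method names, and the prices in `main` are hypothetical, and rounding once per component (rather than once on the call total) is an assumption. Multiplying before dividing keeps the intermediate value exact, so only the final division rounds.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class CostFormulaSketch {

    static final int SCALE = 6;
    static final BigDecimal SIXTY = BigDecimal.valueOf(60);
    static final BigDecimal THOUSAND = BigDecimal.valueOf(1000);

    /** STT: (durationSeconds / 60) x pricePerMinute. */
    static BigDecimal sttCost(long durationSeconds, BigDecimal pricePerMinute) {
        return pricePerMinute.multiply(BigDecimal.valueOf(durationSeconds))
                .divide(SIXTY, SCALE, RoundingMode.HALF_UP);
    }

    /** LLM: (inputTokens / 1000) x input rate + (outputTokens / 1000) x output rate. */
    static BigDecimal llmCost(long inputTokens, long outputTokens,
                              BigDecimal pricePerKInputTokens, BigDecimal pricePerKOutputTokens) {
        BigDecimal unscaled = pricePerKInputTokens.multiply(BigDecimal.valueOf(inputTokens))
                .add(pricePerKOutputTokens.multiply(BigDecimal.valueOf(outputTokens)));
        return unscaled.divide(THOUSAND, SCALE, RoundingMode.HALF_UP);
    }

    /** TTS: (characters / 1000) x pricePerKCharacters. */
    static BigDecimal ttsCost(long characters, BigDecimal pricePerKCharacters) {
        return pricePerKCharacters.multiply(BigDecimal.valueOf(characters))
                .divide(THOUSAND, SCALE, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Usage figures come from the example event; the prices are made up for illustration.
        BigDecimal stt = sttCost(45, new BigDecimal("0.006"));
        BigDecimal llm = llmCost(500, 150, new BigDecimal("0.0003"), new BigDecimal("0.0025"));
        BigDecimal tts = ttsCost(800, new BigDecimal("0.015"));
        System.out.println(stt.toPlainString()); // 0.004500
        System.out.println(llm.toPlainString()); // 0.000525
        System.out.println(tts.toPlainString()); // 0.012000
    }
}
```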

Pricing is looked up by (provider, model, usageType) at the event's timestamp, which supports time-windowed rate changes via effectiveFrom / effectiveTo.
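A time-windowed lookup with the Realtime fallback might look like the sketch below. It works on an in-memory list rather than the pricing_configs table, and several details are assumptions not confirmed by this document: `effectiveFrom` is treated as inclusive, `effectiveTo` as exclusive, and a null `effectiveTo` as open-ended. The names `PricingLookupSketch`, `find`, and `findRealtime` are hypothetical.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

final class PricingLookupSketch {

    enum UsageType { STT, LLM, TTS, REALTIME }

    record PricingConfig(String provider, String model, UsageType usageType,
                         Instant effectiveFrom, Instant effectiveTo) {

        /** Assumed window semantics: [effectiveFrom, effectiveTo), null effectiveTo = open-ended. */
        boolean covers(Instant at) {
            return !at.isBefore(effectiveFrom)
                    && (effectiveTo == null || at.isBefore(effectiveTo));
        }
    }

    /** Finds the config whose validity window contains the event timestamp. */
    static Optional<PricingConfig> find(List<PricingConfig> configs, String provider,
                                        String model, UsageType type, Instant at) {
        return configs.stream()
                .filter(c -> c.provider().equals(provider)
                        && c.model().equals(model)
                        && c.usageType() == type
                        && c.covers(at))
                .findFirst();
    }

    /** Realtime usage: prefer a REALTIME entry, otherwise fall back to the LLM entry. */
    static Optional<PricingConfig> findRealtime(List<PricingConfig> configs, String provider,
                                                String model, Instant at) {
        return find(configs, provider, model, UsageType.REALTIME, at)
                .or(() -> find(configs, provider, model, UsageType.LLM, at));
    }
}
```

With two LLM entries for the same model, one valid until 2026-06-01 and one valid from that date, an event stamped 2026-06-21T10:05:32Z resolves to the second entry under these assumptions.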

Data Model

`usage_events`

Field	Description
`id`	UUID primary key
`call_id`	Call identifier
`tenant_id`	Multi-tenant isolation
STT fields	`stt_provider`, `stt_model`, `stt_duration_seconds`, `user_transcript_chars`
LLM fields	`llm_provider`, `llm_model`, `llm_input_tokens`, `llm_output_tokens`, `turn_count`
TTS fields	`tts_provider`, `tts_model`, `tts_characters`, `agent_response_chars`
Metadata	`language`, `was_transferred`, `is_realtime_mode`
Timestamps	`event_time`, `created_at`

`aggregated_usage`

Pre-computed hourly / daily / monthly summaries per tenant. Includes:

- Total calls, durations, tokens, characters
- Provider breakdown maps (JSONB)
- Pre-computed estimated_stt_cost, estimated_llm_cost, estimated_tts_cost, estimated_total_cost

Aggregation jobs run on schedule:

Period	Cron	Retention
Hourly	`0 5 * * * *`	90 days
Daily	`0 15 0 * * *`	1 year
Monthly	`0 30 0 1 * *`	3 years
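The cron expressions appear to use Spring's six-field format (second, minute, hour, day of month, month, day of week), so the hourly job fires at minute 5 of every hour, the daily job at 00:15, and the monthly job at 00:30 on the 1st. How the jobs assign events to a bucket is not described here; the sketch below shows one plausible approach, truncating an event timestamp to the start of its bucket. The use of UTC and the names `AggregationBucketSketch` and `bucketStart` are assumptions.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

final class AggregationBucketSketch {

    enum Period { HOURLY, DAILY, MONTHLY }

    /** Returns the start of the bucket that contains eventTime, assuming UTC boundaries. */
    static Instant bucketStart(Instant eventTime, Period period) {
        ZonedDateTime utc = eventTime.atZone(ZoneOffset.UTC);
        ZonedDateTime start = switch (period) {
            case HOURLY -> utc.truncatedTo(ChronoUnit.HOURS);
            case DAILY -> utc.truncatedTo(ChronoUnit.DAYS);
            case MONTHLY -> utc.truncatedTo(ChronoUnit.DAYS).withDayOfMonth(1);
        };
        return start.toInstant();
    }

    public static void main(String[] args) {
        Instant event = Instant.parse("2026-06-21T10:05:32Z");
        System.out.println(bucketStart(event, Period.HOURLY));  // 2026-06-21T10:00:00Z
        System.out.println(bucketStart(event, Period.DAILY));   // 2026-06-21T00:00:00Z
        System.out.println(bucketStart(event, Period.MONTHLY)); // 2026-06-01T00:00:00Z
    }
}
```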

`pricing_configs`

Field	Description
`provider`	e.g. `openai`, `google`, `elevenlabs`
`model`	e.g. `gpt-4o-transcribe`, `gemini-2.5-flash`
`usage_type`	`STT`, `LLM`, `TTS`, `REALTIME`
`price_per_minute`	STT rate (per minute)
`price_per_k_input_tokens`	LLM input rate (per 1K tokens)
`price_per_k_output_tokens`	LLM output rate (per 1K tokens)
`price_per_k_characters`	TTS rate (per 1K characters)
`effective_from` / `effective_to`	Time-bound rate validity
`currency`	Default `USD`

`billing_periods`

Tracks per-tenant billing cycles.

Status	Description
`OPEN`	Current active period
`CLOSED`	Period ended, totals locked
`INVOICED`	Invoice generated
`PAID`	Payment received
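The statuses read as a linear lifecycle. A minimal sketch of that idea follows, assuming a period may only advance one step at a time in the order OPEN, CLOSED, INVOICED, PAID. This document does not state that rule, and the method `canTransitionTo` is hypothetical.

```java
enum BillingPeriodStatus {
    OPEN, CLOSED, INVOICED, PAID;

    /** True only for the next status in the assumed OPEN -> CLOSED -> INVOICED -> PAID order. */
    boolean canTransitionTo(BillingPeriodStatus next) {
        return next.ordinal() == ordinal() + 1;
    }
}

final class BillingPeriodStatusDemo {
    public static void main(String[] args) {
        System.out.println(BillingPeriodStatus.OPEN.canTransitionTo(BillingPeriodStatus.CLOSED)); // true
        System.out.println(BillingPeriodStatus.OPEN.canTransitionTo(BillingPeriodStatus.PAID));   // false
    }
}
```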

REST API

Usage

Method	Path	Purpose
`POST`	`/usage/events`	Ingest Atlas call event
`GET`	`/usage/summary`	Total usage and costs for a tenant and period
`GET`	`/usage/by-channel`	Aggregate by channel
`GET`	`/usage/by-agent`	Aggregate by agent
`GET`	`/usage/calls/{callId}`	Single call usage record
`GET`	`/usage/trends`	Time-series usage (HOURLY / DAILY / MONTHLY)
`GET`	`/usage/realtime/daily`	Live daily counters from Redis
`GET`	`/usage/realtime/monthly`	Live monthly counters from Redis
`GET`	`/usage`	Paginated usage events

Costs

Method	Path	Purpose
`GET`	`/costs/breakdown`	STT / LLM / TTS cost breakdown
`GET`	`/costs/by-provider`	Costs grouped by provider
`GET`	`/costs/by-channel`	Costs grouped by channel
`GET`	`/costs/calls/{callId}`	Single call cost
`GET`	`/costs/forecast`	30-day cost projection

Pricing

Method	Path	Purpose
`GET`	`/pricing`	All active pricing configs
`GET`	`/pricing/{provider}/{model}`	Pricing for specific provider/model
`POST`	`/pricing`	Create pricing config
`PUT`	`/pricing/{id}`	Update pricing config
`DELETE`	`/pricing/{id}`	Deactivate pricing config
`GET`	`/pricing/history/{provider}/{model}`	Historical pricing versions

Billing Periods

Method	Path	Purpose
`GET`	`/billing-periods`	List periods for tenant
`GET`	`/billing-periods/current`	Get open period
`POST`	`/billing-periods/{id}/close`	Close billing period

Real-Time Counters (Redis)

RealtimeUsageCounterService maintains Redis hash counters that are incremented on every ingested event:

billing:usage:daily:{tenantId}:{yyyy-MM-dd}    TTL 7 days
billing:usage:monthly:{tenantId}:{yyyy-MM}      TTL 90 days

Fields: calls, duration, llm_input_tokens, llm_output_tokens, tts_characters, stt_duration

Used by the Agent Desktop AI dashboard for live counters.
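The key patterns and TTLs above can be expressed as a small helper. This sketch only builds key strings and does not talk to Redis; deriving the date parts in UTC is an assumption, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class UsageCounterKeys {

    static final Duration DAILY_TTL = Duration.ofDays(7);
    static final Duration MONTHLY_TTL = Duration.ofDays(90);

    // Assumption: counters roll over on UTC day and month boundaries.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter MONTH =
            DateTimeFormatter.ofPattern("yyyy-MM").withZone(ZoneOffset.UTC);

    /** billing:usage:daily:{tenantId}:{yyyy-MM-dd} */
    static String dailyKey(String tenantId, Instant eventTime) {
        return "billing:usage:daily:" + tenantId + ":" + DAY.format(eventTime);
    }

    /** billing:usage:monthly:{tenantId}:{yyyy-MM} */
    static String monthlyKey(String tenantId, Instant eventTime) {
        return "billing:usage:monthly:" + tenantId + ":" + MONTH.format(eventTime);
    }

    public static void main(String[] args) {
        Instant event = Instant.parse("2026-06-21T10:05:32Z");
        System.out.println(dailyKey("tenant-1", event));   // billing:usage:daily:tenant-1:2026-06-21
        System.out.println(monthlyKey("tenant-1", event)); // billing:usage:monthly:tenant-1:2026-06
    }
}
```

A Redis client would then typically increment each hash field with HINCRBY and set the key's expiry with EXPIRE using the matching TTL.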

External Dependencies

Dependency	Purpose
PostgreSQL 17	Usage events, aggregates, pricing, billing periods
Redis 7	Real-time usage counters
RabbitMQ	Async event ingestion (alongside REST)
Spring Cloud Kubernetes	Service discovery