Lago Billing Integration

Lago is an open-source, API-first billing infrastructure platform. This integration syncs cost data from the Koku REST API to Lago, enabling service providers to generate itemized invoices for their customers based on actual cloud and OpenShift resource consumption.

Source code: lago-integration-sample

Use Case

A service provider hosts infrastructure (cloud accounts, OpenShift clusters) on behalf of multiple customers. Koku aggregates the costs; this integration routes each customer’s share to Lago for automatic invoice generation.

    graph LR
    A["Cloud Providers<br/>AWS, Azure, GCP"] --> B["Koku<br/>Cost Management"]
    C["OpenShift Clusters<br/>Operator"] --> B
    B -->|REST API| D[lago-sync]
    D -->|Events API| E["Lago<br/>Invoicing"]
    E --> F[Customer Invoices]
  

How It Works

  1. Koku processes billing data from cloud providers and OpenShift operators into its PostgreSQL/Trino summary tables
  2. lago-sync fetches daily cost reports from the Koku API with dimensional grouping (account, service, project, cluster, etc.)
  3. For each cost line item, customer filters determine which customer is billed (based on account IDs, namespace patterns, tags, etc.)
  4. Matched costs are pushed to Lago as usage events with deterministic transaction IDs for deduplication
  5. Lago aggregates events per billing period and generates itemized invoices with per-dimension line items
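Step 4 relies on deterministic transaction IDs. The following is a minimal sketch of one way to derive such an ID, assuming it is a hash over the customer, provider, day, and dimension values. The function name and the exact fields hashed are hypothetical and may differ from what lago-sync actually does.

```python
import hashlib


def make_transaction_id(customer_id: str, provider: str, day: str,
                        dimensions: dict[str, str]) -> str:
    """Derive a stable ID from the fields that identify one cost line item."""
    # Sort the dimensions so key order in the report cannot change the ID.
    dim_part = "|".join(f"{k}={v}" for k, v in sorted(dimensions.items()))
    raw = f"{customer_id}|{provider}|{day}|{dim_part}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


first = make_transaction_id("customer_acme", "openshift", "2024-01-15",
                            {"project": "acme-web", "cluster": "prod-cluster-01"})
again = make_transaction_id("customer_acme", "openshift", "2024-01-15",
                            {"cluster": "prod-cluster-01", "project": "acme-web"})
print(first == again)  # True: same inputs give the same ID, whatever the key order
```

Because the ID depends only on the inputs, re-running a sync for the same day produces the same IDs, which is what lets Lago reject the repeats.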

Billing Lifecycle

    sequenceDiagram
    participant CM as Cost Management
    participant Sync as lago-sync
    participant Lago as Lago
    participant Invoice as Invoice

    Note over CM: Cloud providers send<br/>billing data daily
    CM->>CM: Process & summarize costs
    
    rect rgb(240, 248, 255)
    Note over Sync: Daily sync (cron)
    Sync->>CM: GET /reports/{provider}/costs/
    CM-->>Sync: Nested cost data (JSON)
    Sync->>Sync: Route costs to customers<br/>via filter matching
    Sync->>Lago: POST /events/batch<br/>(deterministic transaction_ids)
    Lago-->>Sync: 200 OK (or 422 duplicate)
    end

    Note over Lago: Events accumulate in<br/>current billing period

    rect rgb(255, 248, 240)
    Note over Lago: Month-end
    Lago->>Lago: Close billing period
    Lago->>Invoice: Generate draft invoice<br/>(per-project line items)
    Invoice-->>Lago: PDF + webhook
    end

    rect rgb(240, 255, 240)
    Note over Sync: Pre-invoice check
    Sync->>CM: Reconcile totals
    Sync->>Lago: Compare with past_usage
    Note over Sync: OK / MISMATCH
    end
  

Data Mapping

| Koku Concept | Lago Entity | Description |
| --- | --- | --- |
| Provider (aws, azure, gcp, openshift) | Billable Metric | One metric per provider, aggregates cost_amount |
| Daily cost for a dimension combination | Event | One event per leaf per day per customer |
| Customer’s resources for a provider | Subscription | Links customer to the billing plan |
| group_by dimensions (account, project, etc.) | pricing_group_keys | Produces per-dimension invoice line items |
| cost.total.value | cost_amount property | The billable amount (pass-through at 1:1) |
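As an illustration of the mapping above, here is a sketch of how one leaf cost could be turned into a Lago event payload. The top-level field names (transaction_id, external_subscription_id, code, timestamp, properties) follow Lago's event schema as I understand it; the metric code, subscription ID, and the build_event helper are hypothetical.

```python
from datetime import datetime, timezone


def build_event(transaction_id: str, subscription_id: str, metric_code: str,
                day: str, cost_amount: float, dimensions: dict[str, str]) -> dict:
    """Map one (day, dimension combination, cost) leaf to an event payload."""
    # Use midnight UTC of the usage day as the event timestamp (an assumption).
    ts = int(datetime.strptime(day, "%Y-%m-%d")
             .replace(tzinfo=timezone.utc).timestamp())
    return {
        "transaction_id": transaction_id,
        "external_subscription_id": subscription_id,
        "code": metric_code,
        "timestamp": ts,
        # cost_amount is what the billable metric aggregates; the dimensions
        # ride along so charges can group on them.
        "properties": {"cost_amount": str(cost_amount), **dimensions},
    }


event = build_event("abc123", "sub_acme_openshift", "openshift_cost",
                    "2024-01-15", 14.25,
                    {"project": "acme-web", "cluster": "prod-cluster-01"})
print(event["properties"])
```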

Invoice Itemization

Charges are configured with pricing_group_keys so that each unique combination of dimensions becomes a separate invoice line item:

OpenShift invoice (grouped by project + cluster):

OCP Daily Cost (project=frontend, cluster=prod-01)  .... $  420.00
OCP Daily Cost (project=backend, cluster=prod-01)   .... $  890.00
OCP Daily Cost (project=monitoring, cluster=prod-01) ... $  150.00
OCP Daily Overhead (project=frontend, cluster=prod-01).. $   63.00
────────────────────────────────────────────────────────────────────
Total                                                    $1,523.00

AWS invoice (grouped by account + service):

AWS Daily Cost (account=123456789012, service=AmazonEC2)  $2,340.00
AWS Daily Cost (account=123456789012, service=AmazonS3)   $   89.50
AWS Daily Cost (account=123456789012, service=AmazonRDS)  $  620.00
────────────────────────────────────────────────────────────────────
Total                                                     $3,049.50

Grouping dimensions are configurable per provider.
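To make the grouping concrete, the sketch below reproduces the idea locally: events are summed per unique combination of group keys, and each combination becomes one line item. This models the effect of pricing_group_keys; it is not Lago's implementation, and the itemize helper is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def itemize(events: list[dict], group_keys: list[str]) -> dict[tuple, Decimal]:
    """Sum cost_amount per unique combination of the grouping dimensions."""
    totals: dict[tuple, Decimal] = defaultdict(Decimal)
    for ev in events:
        props = ev["properties"]
        key = tuple(props.get(k, "") for k in group_keys)
        totals[key] += Decimal(props["cost_amount"])
    return dict(totals)


events = [
    {"properties": {"cost_amount": "200.00", "project": "frontend", "cluster": "prod-01"}},
    {"properties": {"cost_amount": "220.00", "project": "frontend", "cluster": "prod-01"}},
    {"properties": {"cost_amount": "890.00", "project": "backend", "cluster": "prod-01"}},
]
for key, total in itemize(events, ["project", "cluster"]).items():
    print(key, total)
```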

Sample Invoice

Here is a real invoice generated by Lago from OpenShift cost data synced by this integration (May 2026, 353 events across a 3-day sync):

Invoice: PCO-202605-003              Customer: E2E Test Customer
Period:  May 2026                     Currency: USD

OCP Daily Cost
──────────────────────────────────────────────────────────────────
  project=Worker unallocated .................... $  1,316.63
  project=analytics ............................. $    607.43
  project=cost-management ....................... $    607.13
  project=Platform unallocated .................. $    530.69
  project=fall .................................. $    299.08
  project=snowdown .............................. $    299.08
  project=openshift ............................. $    258.36
  ... (82 more line items)
                                      Subtotal:   $  5,513.27

OCP Daily Overhead
──────────────────────────────────────────────────────────────────
  project=analytics ............................. $    970.47
  project=cost-management ....................... $    800.11
  project=netobserv ............................. $    466.66
  project=fall .................................. $    452.36
  ... (34 more line items)
                                      Subtotal:   $  5,513.24

──────────────────────────────────────────────────────────────────
                                      Tax:        $      0.00
                                      TOTAL:      $ 11,026.51

Each project is a separate invoice line item (driven by pricing_group_keys). The full PDF and an Excel billing report are available in the source repository’s docs/ folder.

Customer-to-Resource Mapping

Each customer in the configuration defines which Koku resources they own via filters that match against the report API’s dimensional data:

customers:
  - external_id: "customer_acme"
    name: "Acme Corp"
    currency: "USD"
    tax_identification_number: "US12-3456789"
    address:
      country: "US"
      state: "CA"
      zipcode: "94105"
    resources:
      - provider: aws
        filter:
          account: ["123456789012", "234567890123"]
      - provider: openshift
        filter:
          project: ["acme-*"]
          cluster: ["prod-cluster-01"]

Filter values support glob patterns (* and ?), enabling flexible matching like acme-* for all namespaces starting with “acme-”.
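A minimal sketch of glob-based filter matching, using Python's standard fnmatch module. It assumes that all dimensions in a filter must match (AND) and that any one pattern within a dimension is enough (OR); the source does not spell out these semantics, so treat them as an assumption. Note that fnmatch also accepts [seq] character classes, which go beyond the * and ? documented here.

```python
from fnmatch import fnmatchcase


def matches(filter_spec: dict[str, list[str]], item: dict[str, str]) -> bool:
    """Return True if a cost line item satisfies a customer's resource filter."""
    return all(
        # Within one dimension, any pattern may match.
        any(fnmatchcase(item.get(dim, ""), pattern) for pattern in patterns)
        # Across dimensions, every one must match.
        for dim, patterns in filter_spec.items()
    )


acme_openshift = {"project": ["acme-*"], "cluster": ["prod-cluster-01"]}
print(matches(acme_openshift, {"project": "acme-web", "cluster": "prod-cluster-01"}))  # True
print(matches(acme_openshift, {"project": "other", "cluster": "prod-cluster-01"}))     # False
```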

Supported Filter Dimensions

| Provider | Dimensions |
| --- | --- |
| AWS | account, service, region, tag:<key> |
| Azure | subscription_guid, service_name, resource_location, tag:<key> |
| GCP | account, service, region, tag:<key> |
| OpenShift | cluster, project, node, vm_name, tag:<key> |

All providers support tag:<key> for tag-based filtering and grouping.

Koku API Usage

The integration calls the Koku report API endpoints:

| Endpoint | Purpose |
| --- | --- |
| GET /api/cost-management/v1/reports/aws/costs/ | AWS costs |
| GET /api/cost-management/v1/reports/azure/costs/ | Azure costs |
| GET /api/cost-management/v1/reports/gcp/costs/ | GCP costs |
| GET /api/cost-management/v1/reports/openshift/costs/ | OpenShift costs |

Key parameters used:

| Parameter | Value | Purpose |
| --- | --- | --- |
| filter[resolution] | daily | One data point per day |
| start_date | YYYY-MM-DD | Start of date range |
| end_date | YYYY-MM-DD | End of date range |
| cost_type | calculated_amortized_cost (AWS only) | Amortized RI/SP costs |
| group_by[<dimension>] | * | Group results by dimension |
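The parameters above can be assembled into a request URL roughly as follows. This is a sketch using only the standard library; the report_url helper is hypothetical, and host name and authentication are left out.

```python
from urllib.parse import parse_qs, urlencode

BASE_PATH = "/api/cost-management/v1/reports/{provider}/costs/"


def report_url(provider: str, start: str, end: str, group_by: list[str]) -> str:
    """Build the path and query string for a daily cost report."""
    params = [
        ("filter[resolution]", "daily"),
        ("start_date", start),
        ("end_date", end),
    ]
    if provider == "aws":
        # Amortized costs apply to AWS only.
        params.append(("cost_type", "calculated_amortized_cost"))
    # One group_by[<dimension>]=* entry per requested dimension.
    params += [(f"group_by[{dim}]", "*") for dim in group_by]
    return BASE_PATH.format(provider=provider) + "?" + urlencode(params)


url = report_url("aws", "2024-01-01", "2024-01-31", ["account", "service"])
query = parse_qs(url.split("?", 1)[1])  # decode again to inspect the parameters
print(query)
```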

Authentication: The integration supports both OAuth2 service accounts (for the SaaS API at console.redhat.com) and x-rh-identity headers (for local development).

The response is a nested JSON tree grouped by the requested dimensions. The integration walks this tree recursively to reach leaf cost values.
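A recursive walk of this kind might look like the sketch below. The response shape used here (a data list of nodes, each grouping level holding a string dimension value plus a nested list, with leaves under a values key) is an assumption based on how such grouped reports are typically structured; iter_leaves is a hypothetical helper.

```python
from typing import Iterator


def iter_leaves(node, dims=None) -> Iterator[tuple[dict, dict]]:
    """Yield (dimensions, leaf) pairs from a nested report tree."""
    dims = dict(dims or {})
    if isinstance(node, list):
        for child in node:
            yield from iter_leaves(child, dims)
        return
    if not isinstance(node, dict):
        return
    # String fields on a grouping node carry its dimension value,
    # e.g. {"project": "frontend", ...}.
    for key, value in node.items():
        if isinstance(value, str):
            dims[key] = value
    if "values" in node:
        # Leaf level: each entry holds the cost figures.
        for leaf in node["values"]:
            yield dict(dims), leaf
        return
    for value in node.values():
        if isinstance(value, list):
            yield from iter_leaves(value, dims)


sample = {"data": [{"date": "2024-01-15", "projects": [
    {"project": "frontend", "clusters": [
        {"cluster": "prod-01",
         "values": [{"cost": {"total": {"value": 14.0, "units": "USD"}}}]},
    ]},
]}]}
for dims, leaf in iter_leaves(sample["data"]):
    print(dims, leaf["cost"]["total"]["value"])
```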

AWS Cost Type

For AWS, the integration uses cost_type=calculated_amortized_cost. This spreads Reserved Instance upfront payments and Savings Plan discounts across the reservation period, so each day reflects its true economic cost rather than the timing of cash payments.

OpenShift Cost Breakdown

OpenShift costs include both direct costs and distributed overhead:

| Cost Field | Meaning |
| --- | --- |
| cost.raw | Base infrastructure cost |
| cost.markup | Markup from cost models |
| cost.usage | Usage-based cost model rates |
| cost.total | Sum of all of the above |
| cost.platform_distributed | Platform overhead allocated to project |
| cost.worker_unallocated_distributed | Unallocated worker cost distributed |

The integration generates separate events for direct costs and overhead, allowing them to appear as distinct line items on invoices.
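One way to picture the split is sketched below. It assumes that the direct amount is cost.total and that overhead is the sum of the two distributed fields; the source does not state the exact formula, so this is an illustration rather than the integration's logic. The split_costs helper and the {"value": ...} nesting of each field are also assumptions.

```python
from decimal import Decimal


def split_costs(cost: dict) -> dict[str, Decimal]:
    """Separate an OpenShift cost leaf into direct and overhead amounts."""
    def value(field: str) -> Decimal:
        # Missing fields count as zero; go through str to avoid float noise.
        return Decimal(str(cost.get(field, {}).get("value", 0)))

    return {
        "direct": value("total"),  # assumption: raw + markup + usage
        "overhead": value("platform_distributed")
                    + value("worker_unallocated_distributed"),
    }


leaf_cost = {
    "total": {"value": 420.0, "units": "USD"},
    "platform_distributed": {"value": 40.0, "units": "USD"},
    "worker_unallocated_distributed": {"value": 23.0, "units": "USD"},
}
print(split_costs(leaf_cost))
```

Each of the two amounts would then become its own event, which is how direct cost and overhead end up as separate line items.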

Taxes

The integration pushes pre-tax cost amounts to Lago. Tax calculation is handled entirely by Lago based on customer configuration.

Tax Options

| Option | Best For | How It Works |
| --- | --- | --- |
| Manual rates | Fixed rate, few jurisdictions | Create tax objects in Lago, assign per customer via tax_codes |
| Lago EU Taxes | EU B2B with reverse charge | Auto-detects VAT rate from customer country + VAT ID (VIES validated) |
| Avalara | US multi-state, global compliance | Full tax engine; calculates per line item based on addresses |
| Anrok | US + international, audit trail | Similar to Avalara, alternative provider |

Tax Hierarchy in Lago

Billing Entity default tax
  → overridden by Customer tax_codes
    → overridden by Plan-level tax
      → overridden by Charge-level tax
        → overridden by Tax provider (Avalara/Anrok)
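The override chain can be read as "the most specific configured level wins". The sketch below models that rule locally; the level names and the effective_tax helper are hypothetical and do not correspond to Lago API calls.

```python
# Ordered from least to most specific, mirroring the hierarchy above.
LEVELS = ["billing_entity", "customer", "plan", "charge", "tax_provider"]


def effective_tax(configured: dict) -> tuple[str, object]:
    """Return the most specific level that has a tax configured, and its value."""
    for level in reversed(LEVELS):
        if configured.get(level) is not None:
            return level, configured[level]
    raise LookupError("no tax configured at any level")


print(effective_tax({"billing_entity": ["vat_20"], "customer": ["vat_0"]}))
```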

Customer address and tax_identification_number are provisioned during bootstrap from config.yaml, enabling Lago’s tax engine to calculate correctly.

Reliability Features

| Feature | Implementation |
| --- | --- |
| Idempotent sync | Deterministic transaction_id per event; Lago deduplicates |
| Cost correction | Detects cost changes via state DB; pushes delta correction events |
| State tracking | SQLite database records what has been synced and event cost fingerprints |
| Retry with backoff | Transient errors (429, 5xx, timeouts) retried 3× with exponential backoff |
| Partial failure handling | Failed batches don’t block remaining batches; errors reported |
| Currency validation | Verifies Koku report currency matches customer config; errors on mismatch |
| Reconciliation | Cross-system comparison of Koku totals vs Lago usage |
| Dry-run mode | Preview events without pushing (--dry-run) |
| Pre-flight validation | lago-sync validate checks connectivity, credentials, entities |
| Billing boundary protection | Warns when syncing close to month-end billing period closure |
| Config validation | Actionable error messages for malformed configuration |
| Non-zero exit codes | Process exits with code 1 on failures (cron/CI-friendly) |
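The retry behaviour in the table can be sketched as follows. The base delay, the doubling factor, and the TransientError class are assumptions for illustration; only the "three retries with exponential backoff" part comes from the table.

```python
import time


class TransientError(Exception):
    """Stands in for a 429, a 5xx response, or a timeout."""


def with_retry(call, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Run call(); on a transient error, retry up to `retries` times."""
    attempt = 0
    while True:
        try:
            return call()
        except TransientError:
            if attempt >= retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
            attempt += 1


# Usage: a call that fails twice, then succeeds. Delays are recorded, not slept.
state = {"calls": 0}
delays: list[float] = []


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("503")
    return "ok"


result = with_retry(flaky, sleep=delays.append)
print(result, delays)
```

Passing the sleep function in as a parameter keeps the sketch testable without real waiting.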

Quick Start

Prerequisites

  • Python 3.11+
  • A running Koku instance with processed cost data
  • A running Lago instance (self-hosted or cloud)

Install and Configure

git clone https://github.com/pgarciaq/lago-integration-sample.git
cd lago-integration-sample
pip install -e ".[dev]"
cp config.example.yaml config.yaml
# Edit config.yaml with your credentials and customer definitions

Validate Configuration

Check connectivity, credentials, and Lago entity existence before syncing:

lago-sync validate

Bootstrap Lago Entities

Creates billable metrics, plan, charges, customers, and subscriptions:

lago-sync bootstrap

# Update existing charges after config changes (e.g. invoice_group_by)
lago-sync bootstrap --update

Sync Cost Data

# Preview first (no data pushed)
lago-sync sync --month 2024-01 --dry-run

# Sync a month
lago-sync sync --month 2024-01

# Re-sync after data reprocessing (detects changes, pushes corrections)
lago-sync sync --month 2024-01 --force

# Daily sync (default: yesterday)
lago-sync sync

Reconcile

Compare Koku totals against Lago usage before finalizing invoices:

lago-sync reconcile --month 2024-01
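Conceptually, reconciliation compares two totals and reports OK or MISMATCH. The sketch below shows that comparison with a small rounding tolerance; the tolerance value and the reconcile function are assumptions, not the tool's actual logic.

```python
from decimal import Decimal


def reconcile(koku_total: str, lago_total: str,
              tolerance: Decimal = Decimal("0.01")) -> str:
    """Compare the Koku total with Lago usage, allowing for rounding."""
    diff = abs(Decimal(koku_total) - Decimal(lago_total))
    return "OK" if diff <= tolerance else "MISMATCH"


print(reconcile("11026.51", "11026.51"))  # OK
print(reconcile("11026.51", "11000.00"))  # MISMATCH
```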

Scheduling

For production use, schedule syncs via cron:

# Daily: sync yesterday's data at 6 AM
0 6 * * * cd /path/to/lago-integration-sample && lago-sync sync

# Monthly: full re-sync on the 2nd (after cloud providers finalize data)
0 8 2 * * cd /path/to/lago-integration-sample && lago-sync sync --month $(date -d "last month" +\%Y-\%m) --force

The monthly --force sync detects and corrects any costs that were reprocessed after the daily sync originally ran (e.g., AWS Reserved Instance adjustments).
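Change detection against a state database could work along these lines. This sketch uses an in-memory SQLite table keyed by a hypothetical event key and returns the delta to push; the schema and function names are assumptions and will differ from the project's state.py.

```python
import sqlite3
from decimal import Decimal


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that remembers the last synced amount."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS synced "
               "(event_key TEXT PRIMARY KEY, amount TEXT NOT NULL)")
    return db


def record_and_diff(db: sqlite3.Connection, event_key: str,
                    amount: Decimal) -> Decimal:
    """Store the latest amount and return the change since the last sync."""
    row = db.execute("SELECT amount FROM synced WHERE event_key = ?",
                     (event_key,)).fetchone()
    previous = Decimal(row[0]) if row else Decimal("0")
    db.execute("INSERT OR REPLACE INTO synced (event_key, amount) VALUES (?, ?)",
               (event_key, str(amount)))
    db.commit()
    return amount - previous


db = open_state()
key = "customer_acme|aws|2024-01-15|account=123456789012"
print(record_and_diff(db, key, Decimal("100.00")))  # first sync: full amount
print(record_and_diff(db, key, Decimal("100.00")))  # unchanged: zero delta
print(record_and_diff(db, key, Decimal("95.50")))   # reprocessed: negative delta
```

A non-zero delta is what a correction event would carry. Such an event would presumably need its own transaction_id, since reusing the original one would be deduplicated.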

Architecture

    graph TD
    subgraph "lago-sync"
        CONFIG["config.yaml<br/>Customer-Resource mapping"]
        KOKU_CLIENT["koku_client.py<br/>Fetches report data"]
        LAGO_SYNC["lago_sync.py<br/>Routes costs to customers"]
        BOOTSTRAP["bootstrap.py<br/>Provisions Lago entities"]
        STATE["state.py<br/>SQLite sync tracking"]
        RECONCILE["reconcile.py<br/>Cross-system verification"]
    end

    subgraph "Koku"
        API["Report API<br/>/reports/provider/costs/"]
    end

    subgraph "Lago"
        EVENTS["Events API<br/>/api/v1/events/batch"]
        INVOICES[Invoice Engine]
    end

    CONFIG --> KOKU_CLIENT
    CONFIG --> LAGO_SYNC
    KOKU_CLIENT --> API
    API --> KOKU_CLIENT
    LAGO_SYNC --> EVENTS
    EVENTS --> INVOICES
    STATE --> LAGO_SYNC
  

Limitations

  • Batch, not real-time: designed for daily or monthly sync cycles, not streaming.
  • Credits/refunds: not handled by the integration. Use Lago’s credit note feature for adjustments.
  • Currency conversion: Koku reports in the provider’s native currency. Multi-currency customers require exchange rate configuration in Lago.
  • Data latency: OpenShift data can arrive hourly. Cloud provider data (AWS, Azure, GCP) is pulled once per day, which matches how often the hyperscalers update their billing data. End-of-month data may continue to be refined for 1–3 days.

Further Reading