Architecture

This page describes the high-level architecture of the Koku platform, the data flow from metering to reporting, and the responsibilities of each component.

High-Level Data Flow

    flowchart TD
    subgraph sources [Data Sources]
        direction LR
        AWS[AWS CUR]
        Azure[Azure Exports]
        GCP[GCP BigQuery]
        OCP[OpenShift Clusters]
    end

    subgraph kokuBackend [Koku Backend]
        Masu[Masu Pipeline · Celery Workers]
        S3[S3 / MinIO · Parquet Storage]
        Trino[Trino · cloud only]
        PG[(PostgreSQL)]
        API[REST API · Django]
    end

    subgraph consumers [Consumers]
        direction LR
        UI[koku-ui · React]
        ExtAPI[External Systems · Billing / ERP / BI]
    end

    AWS --> Masu
    Azure --> Masu
    GCP --> Masu
    OCP -->|metrics upload| Masu

    Masu --> S3
    S3 --> Trino
    Trino --> PG
    Masu -->|on-prem path| PG

    PG --> API
    API --> UI
    API --> ExtAPI
  

The Masu pipeline ingests billing data from the cloud providers (AWS CUR, Azure exports, GCP BigQuery). OpenShift clusters upload Prometheus-based metrics via the koku-metrics-operator. The pipeline converts all incoming data to Parquet, stores it in S3/MinIO, and aggregates it either through Trino (cloud deployments) or directly in PostgreSQL (on-prem deployments). The REST API then serves cost and usage data from PostgreSQL to the koku-ui frontend and to external systems such as billing, ERP, and BI tools.
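
The cloud and on-prem branches described above can be summarized in a few lines of Python. This is a minimal sketch, not Koku code: the `Deployment` enum, the `plan_stages` function, and the stage names are hypothetical labels chosen for illustration.

```python
from enum import Enum


class Deployment(Enum):
    """Hypothetical deployment flavors matching the two paths in the diagram."""

    CLOUD = "cloud"
    ON_PREM = "on-prem"


def plan_stages(deployment: Deployment) -> list[str]:
    """Return the ordered stages a downloaded report passes through."""
    # Both paths convert raw data to Parquet and store it in S3/MinIO.
    stages = ["convert_to_parquet", "store_in_object_storage"]
    if deployment is Deployment.CLOUD:
        # Cloud deployments aggregate through Trino.
        stages.append("aggregate_with_trino")
    else:
        # On-prem deployments aggregate directly in PostgreSQL.
        stages.append("aggregate_in_postgresql")
    # Either way, the aggregated results end up in PostgreSQL summary tables.
    stages.append("write_summary_tables")
    return stages


print(plan_stages(Deployment.CLOUD))
```

Calling `plan_stages(Deployment.ON_PREM)` yields the same list with `aggregate_in_postgresql` in place of the Trino stage.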

OpenShift Metering Flow

    sequenceDiagram
    participant Cluster as OpenShift Cluster
    participant Operator as koku-metrics-operator
    participant Prom as Prometheus / Thanos
    participant Ingress as Koku Ingress API
    participant Pipeline as Masu Pipeline

    Operator->>Prom: Query CPU, memory, storage,<br/>GPU, VM metrics (hourly)
    Prom-->>Operator: Metric results
    Operator->>Operator: Generate CSV reports
    Operator->>Operator: Package as tar.gz with manifest
    Operator->>Ingress: Upload tar.gz
    Ingress->>Pipeline: Trigger processing
    Pipeline->>Pipeline: Convert to Parquet, summarize,<br/>apply cost models
  

The koku-metrics-operator runs on each monitored OpenShift cluster. On every upload cycle, it queries Prometheus/Thanos for node, pod, storage, GPU, and VM metrics, writes the results to CSV reports, packages them with a manifest in a tar.gz archive, and uploads the archive to the Koku ingress endpoint, which triggers processing in the Masu pipeline.
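
The packaging step can be sketched in Python using only the standard library. The function name `build_payload`, the `manifest.json` file name, and the manifest fields are assumptions made for this example; the operator's actual archive layout and manifest schema may differ.

```python
import csv
import io
import json
import tarfile
import uuid
from datetime import datetime, timezone


def build_payload(reports: dict[str, list[dict]]) -> bytes:
    """Package CSV reports plus a manifest into an in-memory tar.gz archive."""
    # Hypothetical manifest fields, for illustration only.
    manifest = {
        "uuid": str(uuid.uuid4()),
        "date": datetime.now(timezone.utc).isoformat(),
        "files": sorted(reports),
    }
    members = {"manifest.json": json.dumps(manifest).encode()}

    # Render each report (a list of row dicts) as CSV text.
    for name, rows in reports.items():
        text = io.StringIO()
        fieldnames = list(rows[0]) if rows else []
        writer = csv.DictWriter(text, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        members[name] = text.getvalue().encode()

    # Write every member into a gzip-compressed tar archive.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


payload = build_payload(
    {"pod_usage.csv": [{"node": "n1", "cpu_core_seconds": 120}]}
)
```

The resulting bytes are what an uploader would send to the ingress endpoint; the upload call itself is omitted because the endpoint URL and authentication are not specified here.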

Component Overview

    flowchart TB
    subgraph apiLayer [API Layer]
        DRF[Django REST Framework]
        RBAC[RBAC Middleware]
        ProvMap[ProviderMap<br/>Query Engine]
    end

    subgraph pipeline [Data Pipeline]
        Orch[Orchestrator]
        DL[Downloaders<br/>AWS / Azure / GCP / OCP]
        Parquet[Parquet Processors]
        Summary[Summary Updater]
        CostModel[Cost Model Engine]
    end

    subgraph storage [Storage]
        Redis[(Redis<br/>Cache + Broker)]
        PG2[(PostgreSQL<br/>Multi-Tenant)]
        S3_2[S3 / MinIO]
        Trino2[Trino / Hive<br/>cloud only]
    end

    DRF --> RBAC
    RBAC --> ProvMap
    ProvMap --> PG2

    Orch --> DL
    DL --> Parquet
    Parquet --> S3_2
    S3_2 --> Trino2
    Trino2 --> Summary
    Parquet -->|on-prem| Summary
    Summary --> PG2
    CostModel --> PG2

    Orch -.-> Redis
    DL -.-> Redis
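
The API-layer path in the diagram above (Django REST Framework, RBAC middleware, ProviderMap, PostgreSQL) begins by working out which tenant a request belongs to. The sketch below assumes the `x-rh-identity` header carries base64-encoded JSON with an `identity.org_id` field, and it uses a hypothetical `org<id>` schema naming rule; neither the helper names nor the naming rule are taken from the Koku code base.

```python
import base64
import json


def tenant_schema_from_identity(header_value: str) -> str:
    """Decode an x-rh-identity header value and derive a tenant schema name."""
    # Assumption: the header is base64-encoded JSON with an "identity" object.
    identity = json.loads(base64.b64decode(header_value))["identity"]
    # Hypothetical naming rule: one PostgreSQL schema per organization.
    return f"org{identity['org_id']}"


def encode_identity(org_id: str) -> str:
    """Build a sample header value for local experiments."""
    payload = {"identity": {"org_id": org_id}}
    return base64.b64encode(json.dumps(payload).encode()).decode()


header = encode_identity("1234567")
print(tenant_schema_from_identity(header))
```

In a schema-per-tenant setup, a name resolved this way would select the PostgreSQL schema that the rest of the request queries.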
  

Component Responsibilities

| Component | Purpose |
| --- | --- |
| Django REST Framework | HTTP API for reports, tags, resource types, forecasts, cost models, and sources |
| RBAC Middleware | Tenant isolation and fine-grained access control via the x-rh-identity header |
| ProviderMap | Maps query parameters to database columns and SQL fragments per provider |
| Orchestrator | Celery beat task that polls providers and starts manifest processing |
| Downloaders | Provider-specific modules that fetch billing data (CUR, exports, BigQuery, tar.gz) |
| Parquet Processors | Convert raw data to Parquet format and upload it to S3/MinIO |
| Summary Updater | Aggregates Parquet data (via Trino or PostgreSQL) into summary tables |
| Cost Model Engine | Applies tiered rates, tag rates, markup, and distribution to OpenShift usage |
| PostgreSQL | Multi-tenant database (schema-per-tenant via django-tenants) |
| Redis | Celery broker, result backend, RBAC cache, and API cache |
| S3 / MinIO | Object storage for Parquet files |
| Trino / Hive | Distributed SQL engine for heavy aggregation (cloud deployment only) |
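
To make the Cost Model Engine's "tiered rates" and "markup" more concrete, here is a small worked example. It assumes graduated tiers (each slice of usage is priced at its own tier's rate) and a percentage markup applied to the total; the functions `tiered_cost` and `with_markup` and the sample rates are hypothetical and may not match how Koku actually evaluates cost models.

```python
from __future__ import annotations

from decimal import Decimal


def tiered_cost(
    usage: Decimal, tiers: list[tuple[Decimal | None, Decimal]]
) -> Decimal:
    """Price usage across tiers given as (upper_bound, rate).

    An upper bound of None marks the final, unbounded tier.
    """
    cost = Decimal("0")
    lower = Decimal("0")
    for upper, rate in tiers:
        if usage <= lower:
            break  # No usage left to price in this or later tiers.
        top = usage if upper is None else min(usage, upper)
        cost += (top - lower) * rate
        if upper is None:
            break
        lower = upper
    return cost


def with_markup(cost: Decimal, markup_percent: Decimal) -> Decimal:
    """Apply a percentage markup on top of a base cost."""
    return cost * (Decimal("1") + markup_percent / Decimal("100"))


# 150 core-hours: the first 100 at 0.02 each, the remaining 50 at 0.01 each.
tiers = [(Decimal("100"), Decimal("0.02")), (None, Decimal("0.01"))]
base = tiered_cost(Decimal("150"), tiers)
total = with_markup(base, Decimal("10"))
print(base, total)
```

With these sample numbers the base cost is 100 × 0.02 + 50 × 0.01 = 2.50, and a 10% markup brings the total to 2.75. `Decimal` is used instead of `float` so that currency arithmetic stays exact.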