Skip to content

enajski/materialize-sql-risk

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Materialize SQL Risk POC

Proof of concept high-performance Clojure HTTP service demonstrating Risk Management logic using Postgres for system-of-record and Materialize for real-time headroom/exposure calculation and simulation. Values are modeled as fixed-size timeseries (295 doubles) and aggregated element-wise.

Project has an expected 10000 Requests per day for each read / write but with an invariant of max 250ms latency.

Architecture

graph TD
    App[Clojure App] -->|Writes| PG[(Postgres)]
    PG -->|Replication| MZ[(Materialize)]
    App -->|Reads/Simulate| MZ
    MZ -->|Views| App
Loading
  • Postgres: Stores limits and deals in long-format (295 rows per entity). Logical replication enabled.
  • Materialize: Ingests limits and deals, calculates deal_allocations, exposure_by_client and headroom_available per index.
  • Clojure App:
    • Writes to Postgres (HikariCP).
    • Reads from Materialize (HikariCP).
    • reitit + pohjavirta for high-performance HTTP handling.
    • JSON via jsonista.

Running

Prerequisites

  • Docker & Docker Compose
  • Java 21+ (if running locally without docker)
  • Clojure CLI

Start

docker compose up --build

Wait for the mz-init container to complete initialization (it waits for Materialize to be ready and applies views).

Verification

Run the integration test script:

./scripts/integration_test.sh

Load Testing

Use wrk or hey to load test:

# Get Headroom
wrk -t4 -c100 -d30s http://localhost:8080/headroom/A

# Simulate
wrk -t4 -c100 -d30s -s scripts/wrk_post.lua http://localhost:8080/headroom/simulate
./scripts/low_load_sim_pg.sh

Tracing (Telemere + OpenTelemetry + Jaeger)

The app emits OpenTelemetry spans via Telemere and exports them to Jaeger using OTLP gRPC.

Docker Compose starts a Jaeger all-in-one container and configures the app with:

  • OTEL_ENABLED=true
  • OTEL_SERVICE_NAME=materialize-sql-risk
  • OTEL_OTLP_ENDPOINT=http://jaeger:4318/v1/traces
  • OTEL_SAMPLE_RATIO=1.0
  • APP_ENV=dev (or prod for lower default sampling)

Open the Jaeger UI:

http://localhost:16686

You can disable tracing by setting OTEL_ENABLED=false. If OTEL_SAMPLE_RATIO is not set, it defaults to 1.0 in dev and 0.1 in prod.

Performance Notes

  1. Dual Pools: Separate HikariCP pools for Postgres (Write) and Materialize (Read) ensure that slow reads don't block writes and vice versa.
  2. Row Decoding: Using rs/as-unqualified-lower-maps (or porsas/rs->compiled-record logic if further optimized) reduces allocation overhead during result set processing.
  3. Minimal Middleware: reitit with minimal middleware stack avoids per-request overhead.
  4. Materialize vs. Postgres Simulation:
    • /headroom/simulate runs on Materialize and can use strict serializable snapshots with a total time budget.
    • /headroom/simulate-pg runs on Postgres directly for fresh, low-latency results.
  5. Pre-aggregated Exposure (PG): exposure_by_client is maintained via triggers so PG simulate avoids scanning an ever-growing deals table. This stabilizes p95 over long runs.
  6. Eventual Consistency: There is a small lag (ms to seconds) between writing to Postgres and data appearing in Materialize. The app is robust to this for reads, but sequential write-then-immediate-read flows need to account for it.
  7. Strict Budgeting: Strict-serializable simulate attempts are bounded by SIMULATE_TOTAL_TIMEOUT_MS and MZ_STRICT_STATEMENT_TIMEOUT_MS to keep latency within the SLO.

How we measure

  • ./scripts/low_load_sim.sh: exercises Materialize simulate under low load.
  • ./scripts/low_load_sim_pg.sh: exercises the pure-Postgres simulate path under low load.
  • ./scripts/integration_test.sh: validates the Materialize simulate path end-to-end.
  • ./scripts/integration_test_pg.sh: validates the Postgres simulate path end-to-end.

Metrics & Dashboards (Prometheus + Grafana)

The app exposes Prometheus metrics at /metrics:

  • app_http_request_duration_seconds (method/route/status)
  • app_db_query_duration_seconds (system/op)
  • hikari_* connection gauges

Docker Compose includes Prometheus and Grafana with a pre-provisioned dashboard.

Start the stack:

docker compose up --build

Open:

  • Prometheus: http://localhost:9090
  • Grafana: http://localhost:3000 (admin/admin)

Dashboard: “Materialize SQL Risk - App”.

API

All curves are fixed-size arrays of 295 doubles.

  • PUT /limits/:clientId
    • Body: {"limit_curve":[...295 doubles...]}
  • PUT /deals/:dealId
    • Body: {"borrower_id":"A","deal_amount_curve":[...295...],"guarantor_id":"G","guarantee_amount_curve":[...295...]}
    • guarantee_amount_curve is optional (defaults to zeros).
  • GET /headroom/:clientId
    • Response: {"client_id":"A","limit_curve":[...],"exposure_curve":[...],"headroom_curve":[...]}
  • POST /headroom/simulate
    • Body: {"borrower_id":"A","deal_amount_curve":[...],"guarantor_id":"G","guarantee_amount_curve":[...]}
    • Response: {"overall_ok":true,"clients":[{"client_id":"A","headroom_available_curve":[...],"draft_exposure_curve":[...],"headroom_simulated_curve":[...],"ok":true}]}
  • POST /headroom/simulate-pg
    • Same request/response shape as /headroom/simulate, but computed directly in Postgres.

About

Differential Dataflow for risk management

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors