
coord-T9–T12: integration tests, eligibility, config validation, reliability scoring #1

Open
CodeByMAB wants to merge 8 commits into main from claude/decentralized-ai-bitcoin-waZYz

@CodeByMAB
Owner

Summary

  • Gap #3 — Integration test suite (coord-T9): testutil package with MustDB/TruncateAll/NodeFixture; integration tests for registry, scheduler, FL rounds, and stake verifier; migration 000009 fixes critical fl_rounds.round_number DEFAULT bug
  • Gap #4 — Task eligibility validation (coord-T10): filterEligible now enforces task-type compatibility via supportsTaskType; migration 000010 adds supported_task_types TEXT[]; 6 unit tests in eligibility_test.go
  • Gap #5 — Config validation hardening (coord-T11): validate() enforces BRS-POS-02 tier floors, stake operational bounds (SRS-STAKE-04), FL numerical constraints, and S3 all-or-nothing group; 31 tests in validate_test.go
  • Gap #6 — Reliability scoring with uptime fraction (coord-T12): SRS-SCHED-04 reliability = task_success_fraction × uptime_fraction implemented in computeReliability; migration 000011 adds heartbeat_log table; RecordHeartbeat now appends log rows; 10 unit + 2 integration tests in reliability_test.go

Test plan

  • go test ./internal/... passes (unit tests; no DB required)
  • OWM_TEST_DSN=<dsn> go test ./internal/... passes all integration tests against a real Postgres instance with migrations applied
  • go build ./... compiles cleanly
  • Migration sequence 000009 → 000010 → 000011 applies without error

🤖 Generated with Claude Code

CodeByMAB and others added 8 commits May 17, 2026 20:40
- Add internal/testutil package: MustDB (skips when OWM_TEST_DSN unset,
  runs migrations, returns pgxpool), TruncateAll, Logger, NodeFixture/Sign
- Fill in 7 skipped integration tests across 4 packages:
    registry: TestRegisterIntegration (register → activate → re-register)
    scheduler: TestScheduleIntegration, TestRequeueTimedOutIntegration
    fl: TestOpenRoundIntegration, TestSubmitGradientIntegration,
        TestTryAggregateIntegration (nil storage → placeholder fedAvg)
    stake: TestSlash_MockForceClose (mock LN, real DB, verifies
           slashing_events + node status + stake_status)
- Add migration 000009: fl_rounds.round_number now has an auto-increment
  default via a dedicated sequence (INSERT in OpenRound omitted the column;
  without a DEFAULT the insert would violate NOT NULL)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Migration 000010: add supported_task_types TEXT[] DEFAULT '{}' to nodes
- registry.Node: add SupportedTaskTypes []string field; persist it through
  Register upsert and return it from GetByPublicKey / ListActive
- scheduler.filterEligible: enforce task type match in addition to tier —
  a node with an empty SupportedTaskTypes list accepts all task types
  (backward-compatible); non-empty list must contain the requested task type
- New supportsTaskType() helper (pure, no imports)
- eligibility_test.go (package scheduler): 6 unit tests covering tier-only,
  empty list, mismatch, combined tier+type, nil input, and supportsTaskType table
- TestScheduleIntegration_TaskTypeMismatch: integration test confirms Schedule
  errors when no node supports the requested type and succeeds when one does
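A minimal sketch of the eligibility rule described above, for reviewers who want to see the empty-list semantics in code. The trimmed `Node` struct, the `Tier >= minTier` comparison, and the task type names are assumptions for illustration; only `SupportedTaskTypes`, `supportsTaskType`, and `filterEligible` are names taken from this commit, and the real signatures may differ.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down stand-in for registry.Node.
// Only SupportedTaskTypes is a field name confirmed by the commit message.
type Node struct {
	ID                 string
	Tier               int
	SupportedTaskTypes []string
}

// supportsTaskType reports whether a node advertising `supported` may run
// taskType. An empty or nil list means the node accepts every task type,
// which keeps nodes registered before migration 000010 schedulable.
func supportsTaskType(supported []string, taskType string) bool {
	if len(supported) == 0 {
		return true
	}
	for _, s := range supported {
		if s == taskType {
			return true
		}
	}
	return false
}

// filterEligible keeps nodes that meet the tier requirement (assumed here
// to be Tier >= minTier) and that support the requested task type.
func filterEligible(nodes []Node, minTier int, taskType string) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Tier >= minTier && supportsTaskType(n.SupportedTaskTypes, taskType) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{ID: "a", Tier: 1}, // tier too low
		{ID: "b", Tier: 2, SupportedTaskTypes: []string{"inference"}},
		{ID: "c", Tier: 2, SupportedTaskTypes: []string{"training"}}, // type mismatch
	}
	for _, n := range filterEligible(nodes, 2, "inference") {
		fmt.Println(n.ID) // prints "b"
	}
}
```

Treating an empty list as "accepts all" avoids a data backfill for existing rows, at the cost that a node cannot express "accepts nothing" through this column.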

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extend validate() with four new rule groups:

Tier minimums (BRS-POS-02):
  - Each of t1/t2/t3 must meet its BRS floor (100k/500k/2M sats)
  - Ordering t1 ≤ t2 ≤ t3 is enforced; equal boundaries are allowed
  - Nil map is caught cleanly (all values read as 0)
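The tier rules above could be sketched roughly as follows. The function name `validateTiers`, the map shape, and the error wording are hypothetical; only the floor values and the ordering rule come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// tierFloors holds the BRS-POS-02 minimums quoted in the commit message, in sats.
var tierFloors = []struct {
	name  string
	floor int64
}{
	{"t1", 100_000},
	{"t2", 500_000},
	{"t3", 2_000_000},
}

// validateTiers checks each tier against its floor, then the t1 <= t2 <= t3
// ordering. Reading from a nil map yields 0 for every key, so a missing map
// fails the floor checks instead of panicking.
func validateTiers(minStake map[string]int64) error {
	var errs []error
	for _, t := range tierFloors {
		if got := minStake[t.name]; got < t.floor {
			errs = append(errs, fmt.Errorf("tier %s: %d sats is below the floor of %d", t.name, got, t.floor))
		}
	}
	// Equal boundaries are allowed, so only strict inversions are rejected.
	if minStake["t1"] > minStake["t2"] || minStake["t2"] > minStake["t3"] {
		errs = append(errs, errors.New("tiers must satisfy t1 <= t2 <= t3"))
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	cfg := map[string]int64{"t1": 100_000, "t2": 400_000, "t3": 2_000_000}
	fmt.Println(validateTiers(cfg)) // reports the t2 floor violation
}
```

Collecting every violation with `errors.Join` (Go 1.20+) is one design option; the actual validate() may instead return on the first failure.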

Stake operational config:
  - verify_interval_hours, degraded_grace_period_hours, slash_cooldown_days ≥ 1
  - t1_auto_slash_signals ≥ 3 and t2t3_maintainer_acks ≥ 2 (SRS-STAKE-04)

FL config bounds:
  - min_participants ≥ 2, gradient_l2_clip_norm > 0,
    anomaly_std_dev_threshold > 0, round_interval_minutes ≥ 1,
    top_k_sparsification_pct ∈ [0, 1]

S3 group check:
  - If any of endpoint/bucket/access_key/secret_key is non-empty,
    all four must be set (partial config is rejected)
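As an illustration of the all-or-nothing rule, here is a small sketch. The `S3Config` struct, `validateS3`, and the error text are assumptions; the four field names mirror the keys listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// S3Config is a hypothetical struct holding the four settings named in the
// commit message.
type S3Config struct {
	Endpoint, Bucket, AccessKey, SecretKey string
}

// validateS3 accepts a fully empty group (S3 disabled) or a fully populated
// one, and rejects anything in between, naming the missing fields.
func validateS3(c S3Config) error {
	fields := []struct{ name, value string }{
		{"endpoint", c.Endpoint},
		{"bucket", c.Bucket},
		{"access_key", c.AccessKey},
		{"secret_key", c.SecretKey},
	}
	var missing []string
	for _, f := range fields {
		if f.value == "" {
			missing = append(missing, f.name)
		}
	}
	if len(missing) == 0 || len(missing) == len(fields) {
		return nil
	}
	return fmt.Errorf("s3: partial config, missing %s", strings.Join(missing, ", "))
}

func main() {
	fmt.Println(validateS3(S3Config{Endpoint: "http://localhost:9000"}))
}
```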

Add validate_test.go: 31 targeted tests (29 new), each mutating exactly one
field of a passing base config to verify the specific error message substring.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements SRS-SCHED-04: reliability = task_success_fraction × uptime_fraction
over a rolling 7-day window (clamped to node registration time).

Migration 000011: heartbeat_log table (node_id, recorded_at) with a DESC index
on (node_id, recorded_at) for efficient 7-day window queries. Includes a
pruning note for rows older than 8 days.

RecordHeartbeat: now appends to heartbeat_log on every heartbeat, preserving
the existing last_heartbeat update in the nodes table.

UpdateReliability: replaced the placeholder with three focused scalar queries
plus the Go-side computeReliability helper:
  - task_success_fraction = completed / max(total, 1)  [1.0 if no tasks yet]
  - uptime_fraction       = received_hb / expected_hb  [1.0 if window < 60 s]
  - expected_hb           = windowDuration / 60 s      (SRS-NODE-04 rate)
  - reliability           = clamp(task × uptime, 0, 1)

computeReliability is a pure function — no DB, fully deterministic.

reliability_test.go (package registry): 10 unit tests covering normal,
edge (new node, no heartbeats, extra heartbeats, sub-interval window, zero
tasks) and boundary cases; plus 2 integration tests for the full DB path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- testutil/db.go: append sslmode=disable to the migration DSN so
  golang-migrate's pq driver works against the CI Postgres service
  container (no TLS configured)
- rpc/server_test.go: also accept "broken pipe" as a valid mTLS
  rejection indicator; Linux TCP stack can reset the connection before
  the TLS handshake message reaches the client
- observer/client.go: check raw-bytes length before bytes.TrimSpace in
  decodeSigningKey; binary keys whose first/last byte is ASCII whitespace
  were silently truncated, causing intermittent NewClient failures
- lightning/lnd_client.go: replace deprecated grpc.DialContext+WithBlock
  with grpc.NewClient (lazy connect, no startup timeout)
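For the testutil/db.go change, one way to add sslmode=disable to a URL-form DSN is sketched below. The helper name `withSSLModeDisable` is hypothetical, and leaving an explicit sslmode untouched is an assumption; the commit may append the parameter unconditionally.

```go
package main

import (
	"fmt"
	"net/url"
)

// withSSLModeDisable returns dsn with sslmode=disable in its query string.
// It assumes a URL-form DSN (postgres://...) and keeps any sslmode the
// caller already set.
func withSSLModeDisable(dsn string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", fmt.Errorf("parse DSN: %w", err)
	}
	q := u.Query()
	if q.Get("sslmode") == "" {
		q.Set("sslmode", "disable")
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	out, err := withSSLModeDisable("postgres://owm:secret@localhost:5432/owm_test")
	fmt.Println(out, err)
}
```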

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- observer/client.go: gate the early raw-bytes check with !hexEncoded so
  64-char hex strings (also 64 bytes) are correctly hex-decoded rather
  than treated as a raw key; fixes TestDecodeSigningKey_64CharSeedHex
- registry/reliability_test.go: backdate registered_at by 2 hours after
  registration so the seeded historical data (heartbeats/tasks) falls
  inside the clamped window; fixes TestUpdateReliabilityIntegration
- registry/registry.go: use COALESCE(onion_address,'') in ListActive and
  GetByPublicKey so a nullable TEXT column never causes "cannot scan NULL
  into *string"; fixes TestScheduleIntegration_TaskTypeMismatch
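The decodeSigningKey fix in the first bullet, together with the raw-length check it gates, can be sketched as below. This assumes Ed25519 keys (32-byte seed or 64-byte private key) and a simple "all hex digits" test for `hexEncoded`; the helper names `isHex` and `keyFromBytes` are hypothetical and the real function may differ.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"encoding/hex"
	"errors"
	"fmt"
)

// isHex reports whether b is a non-empty, even-length run of hex digits.
func isHex(b []byte) bool {
	if len(b) == 0 || len(b)%2 != 0 {
		return false
	}
	for _, c := range b {
		switch {
		case c >= '0' && c <= '9', c >= 'a' && c <= 'f', c >= 'A' && c <= 'F':
		default:
			return false
		}
	}
	return true
}

// keyFromBytes builds a private key from a 32-byte seed or a 64-byte key.
func keyFromBytes(k []byte) (ed25519.PrivateKey, error) {
	switch len(k) {
	case ed25519.SeedSize:
		return ed25519.NewKeyFromSeed(k), nil
	case ed25519.PrivateKeySize:
		return ed25519.PrivateKey(append([]byte(nil), k...)), nil
	}
	return nil, fmt.Errorf("signing key: unexpected length %d", len(k))
}

// decodeSigningKey accepts raw key bytes or a hex string.
func decodeSigningKey(raw []byte) (ed25519.PrivateKey, error) {
	trimmed := bytes.TrimSpace(raw)
	hexEncoded := isHex(trimmed)
	// Check the untrimmed length first so a binary key that starts or ends
	// with a whitespace byte is not truncated. The !hexEncoded gate keeps a
	// 64-char hex seed (also 64 bytes long) on the hex path.
	if !hexEncoded && (len(raw) == ed25519.SeedSize || len(raw) == ed25519.PrivateKeySize) {
		return keyFromBytes(raw)
	}
	if !hexEncoded {
		return nil, errors.New("signing key: neither raw key bytes nor hex")
	}
	decoded, err := hex.DecodeString(string(trimmed))
	if err != nil {
		return nil, err
	}
	return keyFromBytes(decoded)
}

func main() {
	key, err := decodeSigningKey([]byte("000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f\n"))
	fmt.Println(len(key), err)
}
```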

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pin registered_at to base-1s (1 second before the earliest seeded
heartbeat) so the rolling window is ~61 minutes. The previous fix used
a 2-hour interval, giving expectedHB=120 against only 60 seeded
heartbeats → uptime=0.5 → reliability=0.375, failing the ≈0.75 target.
With a ~61-minute window: expectedHB≈61, uptime≈60/61≈0.98,
reliability≈0.75×60/61≈0.738, safely within ±0.05 of 0.75.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CI installs protoc via apt-get (v3.21.12) and protoc-gen-go-grpc
(v1.6.2), then runs git diff --exit-code to guard against stale stubs.
The committed files were generated locally with v7.34.0/v1.6.1, causing
a spurious diff on every run. Update the version strings in the header
comments to match what CI regenerates so the diff check is clean.
No functional code changes in the generated files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
