Skip to content

feat(cli): add deploy-adapter command for user-controlled LoRA deployment#191

Open
zeyuyuyu wants to merge 13 commits intomainfrom
feature/storage-first-flow-v2
Open

feat(cli): add deploy-adapter command for user-controlled LoRA deployment#191
zeyuyuyu wants to merge 13 commits intomainfrom
feature/storage-first-flow-v2

Conversation

@zeyuyuyu
Copy link
Collaborator

Summary

  • New CLI command fine-tuning deploy-adapter that triggers adapter deployment to the inference GPU via the broker's /v1/lora/adapters/deploy API
  • New --deploy flag on acknowledge-model to acknowledge + deploy in one step
  • Without --deploy, acknowledge only downloads the model; users must call deploy-adapter separately when they want to go live

Companion PR

Broker-side changes: 0gfoundation/0g-serving-broker#379

New Commands

# Separate deploy step
0g-compute-cli fine-tuning deploy-adapter \
  --provider <ADDR> --model <MODEL> --task-id <ID> --wait

# Or combine with acknowledge
0g-compute-cli fine-tuning acknowledge-model \
  --provider <ADDR> --task-id <ID> --data-path ./model \
  --model <MODEL> --deploy

Test plan

  • deploy-adapter with --wait correctly polls until adapter is active
  • acknowledge-model --deploy does acknowledge + deploy in one flow
  • acknowledge-model without --deploy still works as before (download only)
  • deploy-adapter returns proper error when adapter not found or already active

zeyuyuyu and others added 6 commits February 11, 2026 18:23
- Dataset upload: try 0G Storage first, fallback to TEE on failure
- Model download: default 'auto' mode tries 0G Storage first, TEE fallback
- Upload function now returns root hash from 0g-storage-client output
- Add 'auto' download method (default) alongside 'tee' and '0g-storage'
- Update 0g-storage-client binary to latest main (fixes contract ABI)

Co-authored-by: Cursor <cursoragent@cursor.com>
…sh verification

The method was missing after cherry-pick, causing TypeScript build failure.
This method verifies downloaded model hash against on-chain modelRootHash
when using TEE fallback download.

Co-authored-by: Cursor <cursoragent@cursor.com>
When data-path points to an existing directory, 0G Storage client fails
with "is a directory" error. Now auto-appends a filename (model_{taskId}.bin)
when the path is a directory, so 0G Storage client can create the output file.

Co-authored-by: Cursor <cursoragent@cursor.com>
- [HIGH] Reject if root hash is empty after successful upload in zg-storage
- [MEDIUM] Replace console.log/error with logger in SDK code (dataset.ts, zg-storage.ts)
- [LOW] Fix JSDoc: default downloadMethod is 'auto', not '0g-storage'
- [LOW] Improve CLI --download-method option description formatting

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add makeAdapterName utility for consistent LoRA adapter naming
- Add `fine-tuning get-adapter-name` command
- Add `fine-tuning chat` command with --adapter-name option
- Add `inference get-secret --duration` for non-interactive API key generation

These commands enable CLI-driven E2E testing of the LoRA serving flow.

Made-with: Cursor
…ge-model

New command `fine-tuning deploy-adapter` triggers adapter deployment
to the inference GPU via the broker's new /v1/lora/adapters/deploy API.

The `acknowledge-model` command now accepts `--deploy` to automatically
wait for the broker to download the adapter and deploy it in one step.

Without --deploy, acknowledge only downloads the model; users must
call deploy-adapter separately when they want to go live.
@vercel
Copy link

vercel bot commented Mar 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
0g-serving-user-broker Ready Ready Preview, Comment Mar 27, 2026 8:05am

Request Review

- Fix deploy-adapter to resolve adapter name from broker API by taskId
  (handles base model name mismatch between CLI and broker internal paths)
- Add --broker-url option to deploy-adapter for direct broker targeting
  (bypasses contract lookup in dev/2-CVM setups)
- Fix malformed .option() syntax in model-usage and download commands
  (template literal used incorrectly as first argument to Commander)

Tested: full E2E flow with real fine-tuning, deploy, and inference
on 2-CVM testnet setup.
In 2-CVM setups where the fine-tuning provider differs from the
inference provider, the --deploy flag on acknowledge-model failed
because it tried to look up the inference broker URL using the
fine-tuning provider address. Adding --broker-url allows users to
specify the inference broker URL directly.
- chat command now queries broker by taskId to get the real adapter name,
  fixing 400 errors when broker uses path-based model names
- deploy-adapter polling loop re-resolves adapter name dynamically,
  fixing timeout when called right after acknowledge (before broker
  processes the on-chain event)
…-flow-v2

# Conflicts:
#	cli.commonjs/cli/fine-tuning.d.ts.map
#	cli.commonjs/cli/fine-tuning.js
#	cli.commonjs/cli/fine-tuning.js.map
#	cli.commonjs/cli/inference.d.ts.map
#	cli.commonjs/cli/inference.js
#	cli.commonjs/cli/inference.js.map
#	cli.commonjs/cli/ledger.d.ts.map
#	cli.commonjs/cli/util.d.ts.map
#	cli.commonjs/sdk/common/utils/index.d.ts
#	cli.commonjs/sdk/common/utils/index.d.ts.map
#	cli.commonjs/sdk/common/utils/index.js
#	cli.commonjs/sdk/common/utils/index.js.map
#	cli.commonjs/sdk/fine-tuning/broker/broker.d.ts
#	cli.commonjs/sdk/fine-tuning/broker/broker.d.ts.map
#	cli.commonjs/sdk/fine-tuning/broker/broker.js
#	cli.commonjs/sdk/fine-tuning/broker/broker.js.map
#	cli.commonjs/sdk/fine-tuning/broker/dataset.d.ts
#	cli.commonjs/sdk/fine-tuning/broker/dataset.d.ts.map
#	cli.commonjs/sdk/fine-tuning/broker/dataset.js
#	cli.commonjs/sdk/fine-tuning/broker/dataset.js.map
#	cli.commonjs/sdk/fine-tuning/broker/model.d.ts
#	cli.commonjs/sdk/fine-tuning/broker/model.d.ts.map
#	cli.commonjs/sdk/fine-tuning/broker/model.js
#	cli.commonjs/sdk/fine-tuning/broker/model.js.map
#	cli.commonjs/sdk/fine-tuning/zg-storage/zg-storage.d.ts
#	cli.commonjs/sdk/fine-tuning/zg-storage/zg-storage.d.ts.map
#	cli.commonjs/sdk/fine-tuning/zg-storage/zg-storage.js
#	cli.commonjs/sdk/fine-tuning/zg-storage/zg-storage.js.map
#	cli.commonjs/sdk/inference/broker/broker.d.ts.map
#	cli.commonjs/sdk/ledger/broker.d.ts.map
#	lib.commonjs/common/utils/index.d.ts
#	lib.commonjs/common/utils/index.d.ts.map
#	lib.commonjs/common/utils/index.js
#	lib.commonjs/common/utils/index.js.map
#	lib.commonjs/fine-tuning/broker/broker.d.ts
#	lib.commonjs/fine-tuning/broker/broker.d.ts.map
#	lib.commonjs/fine-tuning/broker/broker.js
#	lib.commonjs/fine-tuning/broker/broker.js.map
#	lib.commonjs/fine-tuning/broker/dataset.d.ts
#	lib.commonjs/fine-tuning/broker/dataset.d.ts.map
#	lib.commonjs/fine-tuning/broker/dataset.js
#	lib.commonjs/fine-tuning/broker/dataset.js.map
#	lib.commonjs/fine-tuning/broker/model.d.ts
#	lib.commonjs/fine-tuning/broker/model.d.ts.map
#	lib.commonjs/fine-tuning/broker/model.js
#	lib.commonjs/fine-tuning/broker/model.js.map
#	lib.commonjs/fine-tuning/zg-storage/zg-storage.d.ts
#	lib.commonjs/fine-tuning/zg-storage/zg-storage.d.ts.map
#	lib.commonjs/fine-tuning/zg-storage/zg-storage.js
#	lib.commonjs/fine-tuning/zg-storage/zg-storage.js.map
#	lib.commonjs/inference/broker/broker.d.ts.map
#	lib.commonjs/ledger/broker.d.ts.map
#	lib.esm/_commonjsHelpers-0f3c985d.js
#	lib.esm/adm-zip-0609067e.js
#	lib.esm/adm-zip-0609067e.js.map
#	lib.esm/browser-0fd4d439.js
#	lib.esm/browser-0fd4d439.js.map
#	lib.esm/common/utils/index.d.ts
#	lib.esm/common/utils/index.d.ts.map
#	lib.esm/fine-tuning/broker/broker.d.ts
#	lib.esm/fine-tuning/broker/broker.d.ts.map
#	lib.esm/fine-tuning/broker/dataset.d.ts
#	lib.esm/fine-tuning/broker/dataset.d.ts.map
#	lib.esm/fine-tuning/broker/model.d.ts
#	lib.esm/fine-tuning/broker/model.d.ts.map
#	lib.esm/fine-tuning/zg-storage/zg-storage.d.ts
#	lib.esm/fine-tuning/zg-storage/zg-storage.d.ts.map
#	lib.esm/index-291e4713.js
#	lib.esm/index.d.ts
#	lib.esm/index.mjs
#	lib.esm/inference/broker/broker.d.ts.map
#	lib.esm/ledger/broker.d.ts.map
#	src.ts/cli/fine-tuning.ts
#	src.ts/sdk/fine-tuning/broker/dataset.ts
#	src.ts/sdk/fine-tuning/broker/model.ts
#	src.ts/sdk/fine-tuning/zg-storage/zg-storage.ts
#	types/common/utils/index.d.ts
#	types/common/utils/index.d.ts.map
#	types/fine-tuning/broker/broker.d.ts
#	types/fine-tuning/broker/broker.d.ts.map
#	types/fine-tuning/broker/dataset.d.ts
#	types/fine-tuning/broker/dataset.d.ts.map
#	types/fine-tuning/broker/model.d.ts
#	types/fine-tuning/broker/model.d.ts.map
#	types/fine-tuning/zg-storage/zg-storage.d.ts
#	types/fine-tuning/zg-storage/zg-storage.d.ts.map
#	types/inference/broker/broker.d.ts.map
#	types/ledger/broker.d.ts.map
These commands need getServiceMetadata/getRequestHeaders which are
not available on ReadOnlyInferenceBroker.
@HONGJICAI
Copy link
Collaborator

@claude Review PR

@claude
Copy link

claude bot commented Mar 26, 2026

Claude finished @HONGJICAI's task in 2m 51s —— View job


Code Review Complete ✅

Reviewed PR #191: feat(cli): add deploy-adapter command for user-controlled LoRA deployment


Summary

This PR adds new CLI commands for deploying fine-tuned LoRA adapters to inference GPUs. While the feature is useful, there are significant architecture violations that need to be addressed before merging.

⚠️ Critical Issues

[HIGH] Architecture Violation: Business Logic in CLI

Location: src.ts/cli/fine-tuning.ts:793-966

Issue: The deployAdapterToBroker function contains 360+ lines of business logic that should be in the SDK, not the CLI. This violates the core architecture principle from CLAUDE.md:

Critical Design Principle: CLI and Web UI are consumers of the SDK. They should NOT contain business logic. All business logic belongs in the SDK.

Problems:

  1. Direct HTTP API calls to broker endpoints (lines 813, 820, 844, 860, 902, 916, 936)
  2. Complex polling logic with state management (lines 834-897, 930-964)
  3. Adapter name resolution logic (lines 819-829)
  4. Error handling and response processing embedded in CLI

Impact:

  • Web UI cannot reuse this functionality
  • Logic cannot be unit tested independently
  • Harder to maintain and debug
  • Violates separation of concerns

Required Fix:
Move all business logic to SDK. Create new methods in src.ts/sdk/inference/broker/broker.ts or a new processor:

// Proposed SDK methods (in SDK, not CLI):
class InferenceBroker {
  // Deploy adapter and optionally wait for completion
  async deployAdapter(
    providerAddress: string,
    baseModel: string,
    taskId: string,
    options?: {
      wait?: boolean;
      timeoutSeconds?: number;
      onProgress?: (state: AdapterState) => void;
    }
  ): Promise<DeploymentResult>

  // Get adapter status
  async getAdapterStatus(
    providerAddress: string,
    adapterName: string
  ): Promise<AdapterStatus>

  // List available adapters
  async listAdapters(providerAddress: string): Promise<AdapterInfo[]>
  
  // Send chat request to fine-tuned model
  async sendFineTunedChatRequest(
    providerAddress: string,
    adapterName: string,
    messages: ChatMessage[]
  ): Promise<ChatResponse>
}

Then the CLI would become a thin wrapper:

// CLI becomes thin wrapper
async function deployAdapterToBroker(...) {
  const spinner = ora('Deploying adapter...').start()
  
  await broker.inference.deployAdapter(
    providerAddress,
    baseModel,
    taskId,
    {
      wait,
      timeoutSeconds,
      onProgress: (state) => spinner.text = `Deploying... (${state})`
    }
  )
  
  spinner.succeed('Deployed!')
}

Fix this →


[HIGH] Business Logic in chat Command

Location: src.ts/cli/fine-tuning.ts:672-790

Issue: The chat command contains business logic for:

  • Adapter name resolution from broker API (lines 724-738)
  • Direct HTTP requests to inference endpoint (lines 756-772)
  • Response parsing

Required Fix: Move to SDK as broker.inference.sendFineTunedChatRequest() method.


TypeScript Issues

[MEDIUM] Use of any Types

Locations:

  • src.ts/cli/fine-tuning.ts:794 - broker: any
  • src.ts/cli/fine-tuning.ts:732 - (a: any)
  • src.ts/cli/fine-tuning.ts:822 - (a: any)
  • src.ts/cli/fine-tuning.ts:879 - err: any
  • src.ts/cli/fine-tuning.ts:923 - err: any

Issue: Using any defeats TypeScript's type safety. From CLAUDE.md:

No any types in public APIs

Fix: Define proper types:

interface AdapterInfo {
  adapterName: string
  taskId: string
  baseModel: string
  state: 'pending' | 'downloading' | 'ready' | 'active' | 'failed'
}

interface AdapterStatusResponse {
  state: string
  error?: string
}

interface DeployResponse {
  message: string
  adapterName?: string
}

[MEDIUM] Missing Type Definitions

Location: All API responses throughout the file

Issue: No type definitions for broker API responses

Fix: Create proper interfaces in SDK for all API responses


Code Quality Issues

[MEDIUM] Code Duplication

Locations:

  • src.ts/cli/fine-tuning.ts:819-829 - Adapter name resolution
  • src.ts/cli/fine-tuning.ts:724-738 - Same logic duplicated
  • src.ts/cli/fine-tuning.ts:842-856 - Same logic again

Issue: Same adapter name resolution logic appears 3+ times

Fix: Extract to SDK method:

// In SDK
async resolveAdapterName(
  providerAddress: string,
  taskId: string,
  localGuess: string
): Promise<string>

[LOW/nit] Missing Progress Indicators

Location: src.ts/cli/fine-tuning.ts:834-897

Issue: Polling loops print status directly with console.log instead of using spinners

Suggestion: Use ora spinner for better UX:

const spinner = ora('Waiting for adapter to be ready...').start()
// Update spinner.text in loop
spinner.succeed('Adapter ready!')

[LOW/nit] Magic Numbers

Location:

  • Line 888: setTimeout(r, 3000) - 3 second poll interval
  • Line 957: setTimeout(r, 2000) - 2 second poll interval
  • Line 466: 180 - timeout seconds

Suggestion: Extract to named constants:

const ADAPTER_POLL_INTERVAL_MS = 3000
const DEPLOY_POLL_INTERVAL_MS = 2000
const DEFAULT_DEPLOY_TIMEOUT_SECONDS = 180

✅ Good Practices

Positive feedback:

  1. ✅ Good use of makeAdapterName utility function - properly extracted to SDK common utils
  2. ✅ Proper input validation (line 431-437)
  3. ✅ Good error messages with chalk colors
  4. ✅ Mutual exclusion validation for --adapter-name vs --model + --task-id (lines 701-714)
  5. ✅ Fallback handling for API failures (lines 827, 854)
  6. ✅ Proper use of existing SDK method broker.inference.getServiceMetadata() (line 718, 798)
  7. ✅ Good CLI help text and descriptions

Security Review

✅ No security issues found:

  • No private key logging
  • No sensitive data exposure
  • Input validation is present
  • No SQL injection or similar risks

Breaking Changes Review

✅ No breaking changes:

  • No modifications to public SDK APIs (broker.ts files)
  • Only adds new CLI commands
  • Existing commands unchanged (except acknowledge-model gets optional --deploy flag)

Recommendations

Must Fix Before Merge (High Priority)

  1. Move all business logic to SDK - This is critical and blocks merge

    • Create deployAdapter(), getAdapterStatus(), listAdapters() in SDK
    • Move HTTP calls, polling logic, state management to SDK
    • Update CLI to be thin wrapper
  2. Add proper TypeScript types - Remove all any types

    • Define interfaces for all API responses
    • Type function parameters properly
  3. Eliminate code duplication - Move adapter name resolution to SDK

Should Fix (Medium Priority)

  1. Move chat command logic to SDK as sendFineTunedChatRequest()
  2. Add progress spinners for better UX
  3. Extract magic numbers to constants

Nice to Have (Low Priority)

  1. Add JSDoc comments to new utility functions
  2. Consider adding unit tests for SDK methods (after refactor)

Architecture Guidance

From CLAUDE.md:

CLI Rules:

  • ✅ Parse command-line arguments
  • ✅ Format output for terminal display
  • ✅ Handle user interaction (prompts, spinners)
  • ✅ Call SDK methods
  • NO business logic - all logic must be in SDK

This PR currently violates the last rule significantly. Please refactor before merging.


Test Plan Review

The test plan in the PR description is good, but should be expanded after refactoring:

  • SDK unit tests for deployAdapter(), getAdapterStatus(), listAdapters()
  • CLI integration tests for all new commands
  • Test adapter name resolution edge cases
  • Test polling timeout behavior
  • Test error handling (broker offline, adapter not found, etc.)

Questions for Author

  1. Why was the deploy logic implemented in CLI instead of SDK?
  2. Is there a reason the broker API client isn't abstracted?
  3. Should the polling intervals be configurable?

Overall Assessment: Feature is valuable, but needs significant refactoring to meet architecture standards. The core issue is business logic in CLI that should be in SDK. Please refactor before merging.


@zeyuyuyu
Copy link
Collaborator Author

@HONGJICAI Thanks for triggering the review. Here's my response to the Claude bot's findings:

Addressed / Will Fix

[HIGH] Business logic in CLI (deploy, chat, status polling)
Agreed — this is the main issue. The deploy/chat/status logic should be moved to SDK methods (deployAdapter(), getAdapterStatus(), listAdapters(), sendFineTunedChatRequest()). CLI should be a thin wrapper.
→ Will refactor in next iteration.

[MEDIUM] any types
Agreed. Will add proper TypeScript interfaces for all API responses (AdapterInfo, AdapterStatusResponse, DeployResponse).

[MEDIUM] Code duplication (adapter name resolution 3x)
Agreed. Will extract to a single SDK utility method resolveAdapterName().

[LOW] Magic numbers (poll intervals, timeouts)
Will extract to named constants.

Context

This PR was initially built as a quick CLI prototype to validate the E2E flow (broker → 0G Storage → vLLM → LoRA serving). Now that the flow is validated end-to-end (see test history), the refactor to proper SDK architecture makes sense as a natural next step.

Plan

  1. Create SDK methods: deployAdapter(), getAdapterStatus(), listAdapters(), chat()
  2. Add proper TypeScript types for all API responses
  3. Extract shared logic (adapter name resolution)
  4. Reduce CLI to thin wrapper calling SDK
  5. Add unit tests for SDK methods

Will push the refactored version shortly.

- New `LoRAProcessor` class in `src.ts/sdk/inference/broker/lora.ts`
  with typed interfaces (AdapterInfo, AdapterStatusResponse, etc.)
- New SDK methods on InferenceBroker: deployAdapter(), getAdapterStatus(),
  listAdapters(), resolveAdapterName(), chatWithFineTunedModel()
- CLI deploy-adapter and chat commands are now thin wrappers calling SDK
- Eliminated 3x duplicated adapter name resolution logic
- Replaced all `any` types with proper TypeScript interfaces
- Extracted magic numbers to named constants

Addresses code review feedback from Claude bot on PR #191.
@zeyuyuyu
Copy link
Collaborator Author

Update: All review items addressed ✅

All issues from the Claude bot review have been fixed and pushed in commit b9f2795.

[HIGH] Business logic moved from CLI to SDK

Created new LoRAProcessor SDK class (src.ts/sdk/inference/broker/lora.ts) with:

  • deployAdapter() — deploy with optional wait/polling
  • getAdapterStatus() — query single adapter state
  • listAdapters() — list all adapters
  • resolveAdapterName() — resolve broker-side adapter name from taskId
  • chat() — send chat request to fine-tuned model

New public methods on InferenceBroker: deployAdapter(), getAdapterStatus(), listAdapters(), resolveAdapterName(), chatWithFineTunedModel().

CLI is now a thin wrapper:

  • deploy-adapter: ~170 lines → ~30 lines
  • chat: ~90 lines → ~40 lines

[MEDIUM] any types → proper TypeScript interfaces

Added: AdapterInfo, AdapterStatusResponse, DeployResponse, ChatResponse, DeployAdapterOptions, ChatOptions, AdapterState.

[MEDIUM] Code duplication eliminated

Adapter name resolution consolidated into single LoRAProcessor.resolveAdapterName().

[LOW] Magic numbers extracted

ADAPTER_POLL_INTERVAL_MS, DEPLOY_POLL_INTERVAL_MS, DEFAULT_DEPLOY_TIMEOUT_SECONDS.

E2E Verified

All CLI commands tested against live devnet — deploy-adapter, chat (both --model+--task-id and --adapter-name modes) pass. No breaking changes to CLI interface.

@zeyuyuyu
Copy link
Collaborator Author

@claude review this pr

@claude
Copy link

claude bot commented Mar 27, 2026

Claude finished @zeyuyuyu's task in 5m 13s —— View job


PR Review Complete ✅

Reviewed PR #191: feat(cli): add deploy-adapter command for user-controlled LoRA deployment


Executive Summary

Excellent refactoring work! 🎉 The author has successfully addressed all major issues from the previous review. The business logic has been properly moved from CLI to SDK, TypeScript types are well-defined, code duplication is eliminated, and the architecture now correctly follows CLAUDE.md standards.

Changes Summary

Metric Before After Status
Business logic in CLI ~360 lines ~33 lines ✅ Fixed
TypeScript any types 5+ instances 0 in SDK, 1 in CLI ✅ Mostly Fixed
Code duplication 3x adapter resolution Single method ✅ Fixed
Magic numbers Hardcoded Named constants ✅ Fixed
Architecture compliance ❌ Violated ✅ Compliant ✅ Fixed

Detailed Review

✅ Architecture Compliance - EXCELLENT

Previous Issue: 360+ lines of business logic in CLI
Current Status: ✅ RESOLVED

The refactoring successfully separates concerns:

  1. SDK Layer (src.ts/sdk/inference/broker/lora.ts):

    • New LoRAProcessor class (302 lines) containing all business logic
    • Methods: deployAdapter(), getAdapterStatus(), listAdapters(), resolveAdapterName(), chat()
    • Proper polling logic with state management
    • HTTP API calls to broker endpoints
    • Error handling and response processing
  2. Public API (src.ts/sdk/inference/broker/broker.ts):

    • Clean public methods exposed on InferenceBroker: lines 667-763
    • Methods: listAdapters(), getAdapterStatus(), resolveAdapterName(), deployAdapter(), chatWithFineTunedModel()
    • Proper error wrapping with throwFormattedError()
  3. CLI Layer (src.ts/cli/fine-tuning.ts):

    • Now a thin wrapper (33 lines for deployAdapterToBroker())
    • Handles argument parsing, output formatting, progress display
    • Delegates all logic to SDK methods
    • chat command is also properly thin (~40 lines)

This is exactly what CLAUDE.md requires:

CLI and Web UI are consumers of the SDK. They should NOT contain business logic. All business logic belongs in the SDK.

Perfect implementation!


✅ TypeScript Type Safety - GREATLY IMPROVED

Previous Issue: 5+ instances of any types
Current Status: ✅ MOSTLY RESOLVED

New Interfaces (lines 8-64 in lora.ts):

type AdapterState = 'init' | 'pending' | 'downloading' | 'ready' | 'active' | ...
interface AdapterInfo { adapterName, taskId, baseModel, userAddress, state, ... }
interface AdapterStatusResponse { adapterName, state, error? }
interface DeployResponse { message, adapterName? }
interface ChatResponse { id, model, choices, usage? }
interface DeployAdapterOptions { wait?, timeoutSeconds?, brokerUrlOverride?, onProgress? }
interface ChatOptions { systemPrompt? }

Excellent type definitions! All interfaces are well-structured and comprehensive.

Remaining Minor Issue 🔍:

Location: src.ts/cli/fine-tuning.ts:757

async function deployAdapterToBroker(
    broker: { inference: { deployAdapter: Function; resolveAdapterName: Function } },
    // ^^ Should use proper type instead of inline type with Function
    ...
)

Suggestion: Use proper type:

import type { ZGComputeNetworkBroker } from '../sdk'

async function deployAdapterToBroker(
    broker: ZGComputeNetworkBroker,
    providerAddress: string,
    baseModel: string,
    taskId: string,
    wait: boolean,
    timeoutSeconds: number,
    brokerUrlOverride?: string
)

Impact: Low - this is internal CLI code, but for consistency should use proper types.

Fix this →


✅ Code Duplication - ELIMINATED

Previous Issue: Adapter name resolution logic duplicated 3+ times
Current Status: ✅ RESOLVED

Solution:

  1. Created utility function: src.ts/sdk/common/utils/adapter-name.ts
    export function makeAdapterName(baseModel: string, taskId: string): string
  2. Single resolveAdapterName() method in LoRAProcessor that uses this utility
  3. Used throughout CLI consistently

Perfect DRY principle application!


✅ Magic Numbers - EXTRACTED

Previous Issue: Hardcoded intervals and timeouts
Current Status: ✅ RESOLVED

Constants (lora.ts:4-6):

const ADAPTER_POLL_INTERVAL_MS = 3000
const DEPLOY_POLL_INTERVAL_MS = 2000
const DEFAULT_DEPLOY_TIMEOUT_SECONDS = 120

Good practice! Makes the code more maintainable.


Code Quality Review

✅ Good Practices Observed

  1. Dependency Injection (lora.ts:73-82):

    constructor(
        getEndpoint: (providerAddress: string) => Promise<string>,
        getHeaders: (providerAddress: string, content?: string) => Promise<Record<string, string>>
    )

    ✅ Excellent! The LoRAProcessor receives dependencies via constructor, making it testable and decoupled.

  2. Progress Callbacks (lora.ts:59):

    onProgress?: (state: string) => void

    ✅ Great UX pattern! Allows CLI to display progress without SDK being CLI-aware.

  3. Error Handling (lora.ts:106-108, 196-198, 222-224):

    } catch {
        // Fall back to locally generated name
    }

    ✅ Graceful degradation when broker API is unavailable.

  4. State Validation (lora.ts:194-195, 219-221):

    if (state === 'ready' || state === 'active') break
    if (state === 'failed') break

    ✅ Proper state machine handling.

  5. Timeout Handling (lora.ts:201-209, 238-264):

    if (Date.now() >= deadline && lastState !== 'ready' && lastState !== 'active') {
        throw new Error(`Timed out after ${timeoutSeconds}s...`)
    }

    ✅ Prevents infinite loops with clear error messages.


🔍 Minor Issues & Suggestions

[LOW] Type Annotation Could Be More Specific

Location: lora.ts:117

this.getHeaders = getHeaders
// ^^ Returns Record<string, string> but is actually used as request headers

Suggestion: Consider a more specific type:

type RequestHeaders = Record<string, string> & {
    Authorization?: string
    'Content-Type'?: string
    // Add other expected headers
}

Impact: Very low - current type is acceptable.


[LOW] Potential Race Condition in Name Resolution

Location: lora.ts:162-181

Observation: During the wait loop, adapter name resolution is attempted multiple times:

while (Date.now() < deadline) {
    if (!nameResolved) {
        try {
            const resolved = await this.resolveAdapterName(...)
            if (resolved !== localName) {
                adapterName = resolved
                nameResolved = true
            }
        } catch { }
    }
    // ... status check uses adapterName
}

Potential Issue: If broker assigns a different name than local guess, there's a window where status checks might use the wrong name.

Current Mitigation: The code correctly retries name resolution until it succeeds. ✅

Suggestion: Consider adding a log message when name is resolved:

if (resolved !== localName) {
    adapterName = resolved
    nameResolved = true
    onProgress?.(`Adapter name resolved: ${adapterName}`)
}

Impact: Very low - current logic is sound, suggestion is just for better observability.


[LOW/nit] Error Message Could Be More Helpful

Location: lora.ts:252

throw new Error('Adapter deployment failed.')

Suggestion: Include more context:

throw new Error(`Adapter deployment failed for ${adapterName}. Check broker logs for details.`)

Impact: Low - improves debugging experience.


[LOW/nit] Comment Could Clarify Adapter Name Resolution Logic

Location: lora.ts:89-110

The resolveAdapterName method has subtle behavior that could be documented:

/**
 * Resolves the actual adapter name from the broker.
 * 
 * The broker may use a different base model path than the local naming convention,
 * so we query the broker's adapter list to find the correct name by task ID.
 * Falls back to locally generated name if broker query fails.
 * 
 * @param providerAddress - Provider address
 * @param taskId - Fine-tuning task ID
 * @param baseModel - Base model name (for local fallback)
 * @param brokerUrlOverride - Optional broker URL override
 * @returns Adapter name (broker's name if available, otherwise local guess)
 */
async resolveAdapterName(...) { ... }

Impact: Very low - improves maintainability.


Security Review

✅ No Security Issues Found

  • ✅ No private key logging
  • ✅ No sensitive data in errors
  • ✅ Headers are properly passed through without logging
  • ✅ Input validation is present (--model required when using --deploy)
  • ✅ No SQL injection or similar risks
  • ✅ API calls use proper authentication via getHeaders()

Breaking Changes Review

✅ No Breaking Changes

Public API Changes:

  • ✅ Only additions to InferenceBroker (new methods)
  • ✅ No modifications to existing method signatures
  • ✅ Backward compatible
  • acknowledge-model gets optional --deploy flag (defaults to false, backward compatible)

CLI Changes:

  • ✅ New commands: deploy-adapter, chat, get-adapter-name
  • ✅ Existing commands unchanged (except acknowledge-model gets optional flag)
  • inference.ts gets optional --duration flag (defaults to interactive prompt)

No version bump required - these are pure additions.


Testing Considerations

Recommended Test Coverage

From the author's update comment:

E2E Verified - All CLI commands tested against live devnet

✅ Good manual testing!

Additional Recommendations:

  1. SDK Unit Tests (recommended additions):

    describe('LoRAProcessor', () => {
        describe('deployAdapter', () => {
            it('should wait for adapter to be active when wait=true')
            it('should timeout after specified duration')
            it('should skip deploy if already active')
            it('should handle broker API failures gracefully')
            it('should resolve adapter name from broker')
        })
        
        describe('chat', () => {
            it('should send chat request with proper headers')
            it('should format messages correctly')
            it('should handle streaming responses') // Future
        })
    })
  2. Integration Tests:

    • Mock broker API responses
    • Test CLI commands end-to-end
    • Test error scenarios (broker offline, adapter not found, etc.)
  3. Edge Cases to Test:

    • Broker returns different adapter name than local guess
    • Adapter deployment fails after acknowledge
    • Network timeouts during polling
    • Multiple concurrent deploys

Note: Adding tests is not blocking for merge (feature is E2E verified), but recommended for future robustness.


Documentation Review

✅ CLI Help Text is Good

deploy-adapter:

'Deploy a downloaded LoRA adapter to the inference GPU (triggers vLLM loading)'

chat:

'Send a chat request to a fine-tuned model via the inference broker'

✅ Clear and descriptive!

📝 Suggestion: Add SDK Documentation

Location: src.ts/sdk/inference/broker/broker.ts:662-763

The new public methods have minimal JSDoc. Consider adding comprehensive documentation:

/**
 * Deploys a fine-tuned LoRA adapter to the inference GPU.
 * 
 * After acknowledging a fine-tuned model, you must deploy it to the inference
 * GPU before it can be used for inference. This method triggers the deployment
 * and optionally waits for it to complete.
 * 
 * @param {string} providerAddress - Address of the inference provider
 * @param {string} baseModel - Base model name used in fine-tuning (e.g., "Qwen2.5-0.5B-Instruct")
 * @param {string} taskId - Fine-tuning task ID
 * @param {DeployAdapterOptions} options - Deployment options
 * @param {boolean} options.wait - If true, waits until adapter is active (default: false)
 * @param {number} options.timeoutSeconds - Timeout when waiting (default: 120)
 * @param {string} options.brokerUrlOverride - Override broker URL (skip contract lookup)
 * @param {function} options.onProgress - Callback for progress updates
 * 
 * @returns {Promise<DeployResponse>} Deployment result with message and adapter name
 * 
 * @example
 * ```typescript
 * // Deploy without waiting
 * const result = await broker.inference.deployAdapter(
 *   providerAddress,
 *   'Qwen2.5-0.5B-Instruct',
 *   taskId
 * );
 * 
 * // Deploy and wait for completion
 * const result = await broker.inference.deployAdapter(
 *   providerAddress,
 *   'Qwen2.5-0.5B-Instruct',
 *   taskId,
 *   {
 *     wait: true,
 *     timeoutSeconds: 180,
 *     onProgress: (state) => console.log(`State: ${state}`)
 *   }
 * );
 * ```
 * 
 * @throws {Error} If deployment times out or fails
 */
public deployAdapter = async (...) => { ... }

Impact: Medium - improves developer experience for SDK users.

Add JSDoc →


Additional Observations

✅ Good: Consistent Naming Convention

The makeAdapterName() utility correctly documents:

/**
 * Build a deterministic LoRA adapter name from base model and task ID.
 * Must match the broker-side naming in lora/manager.go MakeAdapterName().
 */

Excellent! Ensures client-side and broker-side names match.


✅ Good: Proper CLI Option Validation

Location: src.ts/cli/fine-tuning.ts:431-438

if (options.deploy && !options.model) {
    console.error(chalk.red('Error: --model is required when using --deploy'))
    process.exit(1)
}

Good UX! Clear error message when required option is missing.


✅ Good: Mutual Exclusion Validation

Location: src.ts/cli/fine-tuning.ts:701-714

if (options.adapterName) {
    adapterName = options.adapterName
} else if (options.model && options.taskId) {
    adapterName = makeAdapterName(options.model, options.taskId)
} else {
    console.error(chalk.red('Error: Provide either --adapter-name or both --model and --task-id'))
    process.exit(1)
}

Excellent validation! Handles both use cases clearly.


✅ Good: Small Change to inference.ts

Location: src.ts/cli/inference.ts:1020-1056

Adding --duration flag to skip interactive prompt is a good CLI improvement:

.option('--duration <ms>', 'Token duration in milliseconds (0 = never expires). Skips interactive prompt when provided.')

Good UX enhancement! Enables non-interactive usage (CI/CD, scripts).


Final Assessment

Overall Grade: A (Excellent Work!)

Strengths:

  • Architecture: Perfect separation of concerns (SDK vs CLI)
  • Type Safety: Well-defined interfaces, minimal any types
  • Code Organization: Clean, logical structure
  • Error Handling: Comprehensive with graceful degradation
  • UX: Good progress callbacks and clear error messages
  • Backward Compatibility: No breaking changes
  • Testing: E2E verified on live devnet

Areas for Minor Improvement:

  • 🔍 CLI function type could use proper import (low priority)
  • 📝 SDK methods could use more comprehensive JSDoc (recommended)
  • 🧪 Unit tests would improve robustness (future work)

Recommendations

✅ Ready to Merge

The PR successfully addresses all major issues from the previous review. The refactoring is excellent and follows CLAUDE.md standards.

Before Merge (Optional but Recommended):

  1. Fix CLI type annotation - 2 minute fix
  2. Add JSDoc to SDK methods - 15 minute task

After Merge (Future Work):

  1. Add unit tests for LoRAProcessor
  2. Add integration tests for CLI commands
  3. Consider adding adapter state transition diagram to docs

Comparison: Before vs After

Aspect Previous Review This Review
Architecture ❌ Business logic in CLI ✅ Properly separated
Code Lines 360+ in CLI 33 in CLI, 302 in SDK
Type Safety ❌ 5+ any types ✅ 1 minor issue
Duplication ❌ 3x adapter resolution ✅ Single method
Constants ❌ Magic numbers ✅ Named constants
Reusability ❌ CLI-only ✅ SDK can be used by Web UI
Testability ❌ Difficult ✅ Easy (dependency injection)
Maintainability ❌ Low ✅ High

Result: 🎉 Excellent improvement! The refactoring is a textbook example of how to properly separate concerns in a TypeScript project.


Acknowledgment

@zeyuyuyu - Outstanding work on this refactoring! You've taken the feedback from the previous review and executed a professional, comprehensive solution that:

  1. ✅ Addresses every issue raised
  2. ✅ Follows architecture best practices
  3. ✅ Maintains backward compatibility
  4. ✅ Improves code quality significantly
  5. ✅ Sets a good pattern for future features

This is exactly how SDK/CLI separation should be done. The LoRAProcessor is well-designed with proper dependency injection, making it testable and reusable. The CLI is now a true thin wrapper. Well done! 👏


Verdict: ✅ APPROVED - Ready to merge with optional minor improvements suggested above.


@zeyuyuyu
Copy link
Collaborator Author

@claude review this pr

@claude
Copy link

claude bot commented Mar 27, 2026

Claude finished @zeyuyuyu's task in 4m 47s —— View job


Code Review Complete ✅

Reviewed PR #191 follow-up: All major issues from previous review have been successfully addressed! 🎉


Summary

The refactoring is excellent and addresses all critical architecture violations from the previous review. The business logic has been properly moved to the SDK, TypeScript types are clean, and the CLI is now a thin wrapper as intended. This is a significant improvement and aligns well with the project's architecture standards.

Major Improvements ✅

[HIGH] Business Logic Successfully Moved to SDK ✅

Previously: 360+ lines of business logic in CLI
Now: Clean SDK architecture with dedicated LoRAProcessor class

Files:

  • New: src.ts/sdk/inference/broker/lora.ts (+302 lines) - Well-structured SDK processor
  • Modified: src.ts/sdk/inference/broker/broker.ts (+130 lines) - Public API methods added
  • Modified: src.ts/cli/fine-tuning.ts - Now a thin wrapper (~30 lines per command)

New Public API Methods on InferenceBroker:

deployAdapter(providerAddress, baseModel, taskId, options?)
getAdapterStatus(providerAddress, adapterName, brokerUrlOverride?)
listAdapters(providerAddress, brokerUrlOverride?)
resolveAdapterName(providerAddress, taskId, baseModel, brokerUrlOverride?)
chatWithFineTunedModel(providerAddress, adapterName, message, options?)

Architecture now follows CLAUDE.md principles:

  • ✅ SDK contains all business logic (HTTP calls, polling, state management)
  • ✅ CLI is a thin wrapper (argument parsing, output formatting, progress display)
  • ✅ Web UI can now reuse the same SDK methods
  • ✅ Code is testable (SDK can be unit tested independently)

[MEDIUM] TypeScript Types Properly Defined ✅

Previously: Multiple any types throughout
Now: Comprehensive type definitions

New interfaces (all properly typed):

AdapterState: Union type with all possible states
AdapterInfo: Complete adapter metadata
AdapterStatusResponse: Status check response
DeployResponse: Deployment result
ChatResponse: OpenAI-compatible chat response
DeployAdapterOptions: Deployment configuration
ChatOptions: Chat configuration

Remaining any types: Only 1 in CLI code (line 531 of fine-tuning.ts) for Commander.js options - acceptable for CLI code.

[MEDIUM] Code Duplication Eliminated ✅

Previously: Adapter name resolution appeared 3+ times
Now: Single source of truth

  • Consolidated into LoRAProcessor.resolveAdapterName() SDK method
  • Shared utility makeAdapterName() in src.ts/sdk/common/utils/adapter-name.ts
  • Both CLI and SDK use the same utility function

[LOW] Magic Numbers Extracted ✅

Previously: Hardcoded timeouts and intervals throughout
Now: Named constants in SDK

const ADAPTER_POLL_INTERVAL_MS = 3000
const DEPLOY_POLL_INTERVAL_MS = 2000
const DEFAULT_DEPLOY_TIMEOUT_SECONDS = 120

[MEDIUM] --deploy Flag Properly Implemented ✅

The acknowledge-model --deploy flow works correctly:

  1. Validates --model is required when using --deploy (lines 431-438)
  2. Calls acknowledgeModel() SDK method
  3. If --deploy is set, calls deployAdapter() with wait=true
  4. Proper error handling and user feedback

New Issues Found

While the refactoring successfully addressed all major concerns, there are some new issues introduced that should be fixed before merge:

[MEDIUM] Error Handling Issues in SDK

Issue 1: Silent error swallowing in lora.ts

Multiple empty catch blocks that could mask real errors:

// Line 106-108
try {
    const resp = await axios.get(`${baseUrl}/v1/lora/adapters`)
    // ... resolve adapter name
} catch {
    // Fall back to locally generated name
}

Problem: This swallows ALL errors (network failures, auth issues, 500 errors, etc.), not just "adapter not found". Could hide real problems from users.

Same pattern at: Lines 178-180, 196-198, 222-224

Fix: Check specific error types:

try {
    const resp = await axios.get(`${baseUrl}/v1/lora/adapters`)
    const adapters: AdapterInfo[] = resp.data?.adapters || []
    const match = adapters.find((a) => a.taskId === taskId)
    if (match?.adapterName) {
        return match.adapterName
    }
} catch (err) {
    if (axios.isAxiosError(err)) {
        // Only fall back on 404 (not found) or network errors
        if (!err.response || err.response.status === 404) {
            // Expected - adapter list not available yet
        } else {
            // Unexpected error - rethrow with context
            throw new Error(
                `Failed to resolve adapter name: ${err.message} (status: ${err.response?.status})`
            )
        }
    } else {
        throw err
    }
}
return localName

Issue 2: Fragile error message checking

Line 255-258 in lora.ts:

catch (err: unknown) {
    if (err instanceof Error && err.message.includes('failed')) {
        throw err
    }
}

Problem: Checking if error message includes "failed" is fragile and language-dependent. Should check error type or HTTP status code instead.

Fix:

catch (err: unknown) {
    if (axios.isAxiosError(err) && err.response?.status >= 400) {
        throw new Error(
            `Adapter deployment failed: ${err.response?.data?.message || err.message}`
        )
    }
    // Network error or timeout - continue polling
}

Issue 3: No axios timeout configuration

All axios calls in lora.ts have no timeout. Could hang indefinitely on network issues.

Fix: Add timeout to all axios calls:

const resp = await axios.get(`${baseUrl}/v1/lora/adapters`, {
    timeout: 30000 // 30 second timeout
})

Fix error handling →


[LOW] Type Safety Issues

Issue 1: Unnecessary type assertion in broker.ts:117

return headers as unknown as Record<string, string>

Problem: Double cast (as unknown as) bypasses TypeScript's type checking. This is a code smell.

Likely cause: getRequestHeaders() probably already returns the correct type. The double cast suggests type mismatch that should be fixed at the source.

Investigation needed: Check what requestProcessor.getRequestHeaders() returns and fix the return type if needed, or add a proper type guard if the cast is truly necessary.

Issue 2: Imprecise function types in CLI

src.ts/cli/fine-tuning.ts:757:

broker: { inference: { deployAdapter: Function; resolveAdapterName: Function } }

Problem: Using Function type is too loose. Should import proper SDK types.

Fix:

import type { ZGComputeNetworkBroker } from '../sdk'

async function deployAdapterToBroker(
    broker: ZGComputeNetworkBroker,
    providerAddress: string,
    // ...
)

Fix type issues →


[LOW] Code Quality Issues

Issue 1: Missing JSDoc on public SDK class

LoRAProcessor class and its public methods lack JSDoc comments. Per CLAUDE.md:

Comprehensive JSDoc for public APIs

Fix: Add JSDoc to all public methods with:

  • Description of what the method does
  • @param for each parameter
  • @returns for return value
  • @throws for possible errors
  • @example for complex methods

Example:

/**
 * Deploys a LoRA adapter to the inference GPU and optionally waits for completion.
 *
 * @param providerAddress - Address of the inference provider
 * @param baseModel - Base model name (e.g., "Qwen2.5-0.5B-Instruct")
 * @param taskId - Fine-tuning task ID
 * @param options - Deployment options (wait, timeout, progress callback)
 * @returns Promise resolving to deployment result with status message
 * @throws {Error} If deployment fails or times out
 *
 * @example
 * ```typescript
 * const result = await broker.inference.deployAdapter(
 *   '0x123...',
 *   'Qwen2.5-0.5B-Instruct',
 *   'task-456',
 *   {
 *     wait: true,
 *     timeoutSeconds: 180,
 *     onProgress: (state) => console.log(`State: ${state}`)
 *   }
 * )
 * console.log(result.message)
 * ```
 */
public async deployAdapter(...)

Issue 2: Fragile URL manipulation

lora.ts:86:

return endpoint.replace(/\/v1\/proxy$/, '')

Problem: Assumes endpoint always ends with /v1/proxy. If broker changes endpoint format, this breaks silently.

Fix: Use proper URL parsing:

private async getBrokerBaseUrl(providerAddress: string): Promise<string> {
    const endpoint = await this.getEndpoint(providerAddress)
    try {
        const url = new URL(endpoint)
        // Remove /v1/proxy from path
        url.pathname = url.pathname.replace(/\/v1\/proxy$/, '')
        return url.toString().replace(/\/$/, '') // Remove trailing slash
    } catch {
        // Fallback to string replacement if URL parsing fails
        return endpoint.replace(/\/v1\/proxy$/, '')
    }
}

Issue 3: Magic number in CLI

fine-tuning.ts:466:

180,  // timeout seconds for --deploy

Fix: Import constant from SDK:

import { DEFAULT_DEPLOY_TIMEOUT_SECONDS } from '../sdk/inference/broker/lora'

// Then use:
DEFAULT_DEPLOY_TIMEOUT_SECONDS

[nit] Minor Improvements

Suggestion 1: More informative progress callback

Current onProgress callback only receives state string. Could be more helpful:

export interface DeployProgress {
    state: AdapterState
    elapsedSeconds: number
    message: string
}

export interface DeployAdapterOptions {
    wait?: boolean
    timeoutSeconds?: number
    brokerUrlOverride?: string
    onProgress?: (progress: DeployProgress) => void
}

Suggestion 2: Return type inconsistency

deployAdapter returns DeployResponse with optional adapterName?, but the adapter name is always known after resolveAdapterName() is called. Could make it required:

export interface DeployResponse {
    message: string
    adapterName: string  // Remove optional `?`
}

Positive Patterns to Celebrate 🎉

  1. Excellent dependency injection - LoRAProcessor constructor takes getEndpoint and getHeaders functions, making it testable and flexible
  2. Progress callbacks - Good UX with onProgress callback for long-running operations
  3. Graceful degradation - Falls back to local adapter name if broker is unavailable
  4. Comprehensive options - brokerUrlOverride option allows bypassing contract lookup for testing
  5. OpenAI-compatible response types - ChatResponse matches OpenAI API format for easy migration
  6. Clean separation of concerns - CLI only handles argument parsing and output formatting
  7. Proper exports - All new types and classes properly exported via index.ts
  8. Consistent naming - All methods use clear, descriptive names (deployAdapter, getAdapterStatus, etc.)

Security Review

No security issues found:

  • No private key logging
  • No sensitive data in error messages
  • Authentication properly delegated to getHeaders() function
  • No injection vulnerabilities
  • Input validation present (e.g., --model required when using --deploy)

Breaking Changes Review

No breaking changes:

  • Only adds new public API methods to InferenceBroker
  • No modifications to existing SDK APIs
  • No changes to existing CLI commands (only new optional --deploy flag)
  • No changes to broker.ts method signatures

Test Plan Review

The test plan in the PR description is good, but should be expanded to include SDK-level tests:

Should add:

  • SDK unit tests for LoRAProcessor:
    • deployAdapter() with and without wait option
    • resolveAdapterName() fallback behavior
    • getAdapterStatus() error handling
    • listAdapters() empty list handling
    • chat() request header generation
  • Integration tests with mock broker API
  • Error handling tests:
    • Network timeout
    • 404 (adapter not found)
    • 401 (authentication failure)
    • 500 (broker internal error)
  • CLI integration tests for end-to-end flows

Existing test plan is good ✅:

  • deploy-adapter with --wait polls correctly
  • acknowledge-model --deploy works end-to-end
  • acknowledge-model without --deploy still works
  • deploy-adapter error handling

Recommendations

Must Fix Before Merge (Medium Priority)

  1. Fix error handling in lora.ts

    • Replace empty catch blocks with specific AxiosError checks
    • Add timeout to all axios calls
    • Provide better error messages to users
  2. Add JSDoc to LoRAProcessor class

    • Required by CLAUDE.md for public APIs
    • Makes SDK more maintainable and discoverable

Should Fix (Low Priority)

  1. Fix type safety issues

    • Remove unnecessary type assertion in broker.ts:117
    • Use proper SDK types in CLI instead of Function
  2. Improve URL handling

    • Use proper URL parsing instead of string replacement
  3. Extract magic number in CLI

    • Import DEFAULT_DEPLOY_TIMEOUT_SECONDS from SDK

Nice to Have

  1. More informative progress callbacks

    • Include elapsed time and estimated time remaining
  2. Add SDK unit tests

    • Test error handling paths
    • Test polling timeout behavior
    • Test adapter name resolution fallback

Final Verdict

Recommendation: Approve after fixing error handling ✅ (with reservations)

The refactoring is excellent and successfully addresses all critical architecture violations. The code now follows proper SDK architecture patterns and is much more maintainable. However, the error handling issues (empty catch blocks, missing timeouts) should be fixed before merge to prevent silent failures and provide better user experience.

Summary:

  • Architecture: Perfect - business logic in SDK, CLI is thin wrapper
  • TypeScript: Excellent - proper types, minimal any usage
  • ⚠️ Error Handling: Needs improvement - silent error swallowing, missing timeouts
  • Code Quality: Good - clean separation, proper exports, consistent naming
  • ⚠️ Documentation: Missing JSDoc on new SDK class
  • Security: No issues found
  • Breaking Changes: None

Great work on the refactoring! 🎉 The architecture is now solid. Just needs some polish on error handling and documentation before merge.


this.modelProcessor = modelProcessor

this.loraProcessor = new LoRAProcessor(
async (providerAddress: string) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@claude Is it better we have the requestProcessor as LoRAProcessor's dependency instead of passing a func?

Copy link

@claude claude bot Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude finished @HONGJICAI's task in 2m 27s —— View job


Architecture Discussion: Dependency Injection Pattern

Great question! Let me analyze both approaches:


Current Approach: Function Injection

this.loraProcessor = new LoRAProcessor(
    async (providerAddress: string) => { /* getEndpoint */ },
    async (providerAddress: string, content?: string) => { /* getHeaders */ }
)

Alternative: Object Injection

this.loraProcessor = new LoRAProcessor(this.requestProcessor)
// Then LoRAProcessor would call requestProcessor.getServiceMetadata() directly

Analysis

I recommend keeping the current function injection approach, but I understand why it might look overly verbose. Here's my reasoning:

✅ Why Current Approach (Function Injection) is Better

  1. Loose Coupling

    • LoRAProcessor doesn't depend on the RequestProcessor class
    • It only depends on the specific functionality it needs (endpoint + headers)
    • Follows the Dependency Inversion Principle (depend on abstractions, not concrete classes)
  2. Testability

    • Easy to mock - just pass simple functions in tests:
      const mockProcessor = new LoRAProcessor(
          async () => 'http://test-endpoint',
          async () => ({ Authorization: 'test-token' })
      )
    • No need to mock an entire RequestProcessor object with all its methods
  3. Interface Segregation Principle (ISP)

    • LoRAProcessor only gets access to what it needs
    • If you pass the whole requestProcessor, LoRAProcessor could call methods it shouldn't (like getService(), topUpAccountIfNeeded(), etc.)
  4. Flexibility

    • Easy to substitute different implementations
    • Not locked into RequestProcessor class structure
  5. Architectural Consistency

    • Looking at the codebase, LoRAProcessor is intentionally different from other processors
    • Other processors (RequestProcessor, ResponseProcessor, etc.) all extend ZGServingUserBrokerBase and need contract, ledger, metadata, cache
    • LoRAProcessor doesn't need blockchain/storage access - it only needs HTTP endpoint and auth headers
    • This keeps it focused on LoRA deployment logic, not blockchain logic

⚠️ Issues with Object Injection

If you pass the whole requestProcessor:

class LoRAProcessor {
    constructor(private requestProcessor: RequestProcessor) {}
    
    async deploy(...) {
        const meta = await this.requestProcessor.getServiceMetadata(...)
        // Now LoRAProcessor can also call:
        // - this.requestProcessor.getService()
        // - this.requestProcessor.topUpAccountIfNeeded()
        // - etc.
        // Should it have access to all of these? Probably not.
    }
}

Problems:

  • Tight coupling to RequestProcessor class
  • Harder to test - need to mock entire RequestProcessor with all its dependencies
  • Violates ISP - LoRAProcessor gets access to methods it doesn't need
  • Less flexible - can't easily substitute a different implementation

🔍 However, I Notice One Issue

Line 117 has an awkward type cast:

return headers as unknown as Record<string, string>

This suggests the types aren't aligned properly. The RequestProcessor.getRequestHeaders() returns ServingRequestHeaders, but LoRAProcessor expects Record<string, string>.

This should be fixed - either:

  1. Change LoRAProcessor to accept ServingRequestHeaders type, or
  2. Change getRequestHeaders() to return Record<string, string>, or
  3. Create a proper conversion function instead of type assertion

💡 Alternative: Interface-Based Injection (Middle Ground)

If you want cleaner code while maintaining loose coupling, consider an interface:

interface LoRADependencies {
    getEndpoint(providerAddress: string): Promise<string>
    getHeaders(providerAddress: string, content?: string): Promise<Record<string, string>>
}

class LoRAProcessor {
    constructor(private deps: LoRADependencies) {}
    
    async deploy(...) {
        const endpoint = await this.deps.getEndpoint(...)
        const headers = await this.deps.getHeaders(...)
    }
}

// In broker.ts:
this.loraProcessor = new LoRAProcessor({
    getEndpoint: async (addr) => {
        const meta = await this.requestProcessor.getServiceMetadata(addr)
        return meta.endpoint
    },
    getHeaders: async (addr, content) => {
        return await this.requestProcessor.getRequestHeaders(addr, content)
    }
})

Benefits:

  • ✅ Cleaner, more organized code
  • ✅ Still loosely coupled
  • ✅ Easy to test (mock the interface)
  • ✅ Type-safe
  • ✅ Can add more methods to interface if needed

🎯 Recommendation

Keep the current function injection approach, as it:

  • Maintains proper separation of concerns
  • Follows SOLID principles (especially DIP and ISP)
  • Makes testing easier
  • Is consistent with LoRAProcessor's design (it doesn't need blockchain access)

But fix the type cast on line 117 - that's a code smell that should be addressed.

Optional enhancement: If you find the instantiation code too verbose, consider the interface-based approach above as a cleaner middle ground.


Summary Table

Approach Coupling Testability SOLID Compliance Code Cleanliness
Current (functions) Loose Excellent ✅ DIP, ISP Verbose but clear
Object injection Tight Poor ❌ Violates DIP, ISP Clean but wrong
Interface-based Loose Excellent ✅ DIP, ISP Clean and right

Verdict: Current approach is architecturally sound. The verbosity is a small price to pay for proper decoupling and testability. The interface-based approach would be an improvement if you want cleaner code while maintaining good architecture.


Does this help clarify the design decision? Happy to discuss further if you'd like to explore the interface-based approach or have other concerns!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The Claude bot's analysis is correct — function injection gives us looser coupling and better testability. But I agree the current code is a bit verbose.

I'll adopt the interface-based approach as a middle ground:

interface LoRADependencies {
    getEndpoint(providerAddress: string): Promise<string>
    getHeaders(providerAddress: string, content?: string): Promise<Record<string, string>>
}

This keeps the decoupling benefits while making the code cleaner. Will also fix the as unknown as Record<string, string> type cast while at it.

Comment on lines +427 to +428
'--broker-url <url>',
'Inference broker URL (for --deploy; skips contract lookup)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this param necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly necessary — the CLI can resolve the broker URL from the on-chain service metadata automatically.

It was added as a dev/debug shortcut to skip the contract lookup (useful when testing against a specific broker endpoint). Happy to remove it if you think it adds unnecessary complexity. The deploy-adapter command has the same option.

- Replace function params with `LoRADependencies` interface for cleaner DI
- Fix `as unknown as Record<string, string>` type cast with proper
  Object.entries conversion
- Remove --broker-url CLI option from deploy-adapter and acknowledge-model
  (broker URL is always resolved from on-chain service metadata)

Addresses HONGJICAI's review comments on PR #191.
@zeyuyuyu
Copy link
Collaborator Author

Update: Review feedback addressed ✅ (commit a2910df)

Addressed both inline review comments:

1. LoRA dependency injection → Interface pattern (response to broker.ts comment)

  • Defined LoRADependencies interface replacing two function parameters
  • LoRAProcessor now takes a single deps: LoRADependencies in its constructor
  • Fixed the as unknown as Record<string, string> type cast with proper Object.entries conversion

2. Removed --broker-url CLI parameter (response to fine-tuning.ts comment)

  • Removed from both deploy-adapter and acknowledge-model --deploy
  • Broker URL is now always resolved from on-chain service metadata
  • brokerUrlOverride also removed from SDK methods for a cleaner API

E2E re-tested — all 3 scenarios pass:

  • deploy-adapter
  • chat --model --task-id
  • chat --adapter-name

Regarding test files: This PR only contains .ts source files — no .sh or .py test files. The existing CI (cli-test.yml + cli.integration.test.ts) was set up in a previous PR and runs on push/PR to main. Happy to add more e2e test coverage if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants