feat(optimizer): [2/N] Optimizer REST Service and Controller#531
Conversation
Introduces the optimizer service module with: - MySQL/H2 schema for table_operations, table_stats, table_stats_history, and table_operations_history - JPA entities with JSON column support (vladmihalcea hibernate-types) - All model/DTO/enum types: OperationType, OperationStatus, TableStats, CompleteOperationRequest, JobResult, OperationMetrics, etc. - JPA AttributeConverters for JobResult and OperationMetrics JSON columns - MapStruct mapper (OptimizerMapper) for entity→DTO conversion - Spring Boot application shell and build wiring (settings.gradle, build.gradle dockerPrereqs) No repositories, controllers, or service layer yet — those follow in subsequent PRs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove OperationMetrics class and converter; stats are read directly from table_stats instead of duplicating into operations - Remove orphanFilesDeleted/orphanBytesDeleted from history entity, DTO, and schema; operation-specific data belongs in the result JSON - Add addedSizeBytes to CommitDelta for tracking write volume - Fix OperationType javadoc to describe current state, not roadmap - Fix TableOperationsHistoryRow javadoc: written on operation complete, not by Spark app directly - Add field comments to all DTOs and request objects Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spring Data JPA repositories for all four optimizer tables with filtered query support. Includes tests exercising save/find, filtered queries, upsert semantics, and append-only history. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address PR review comments: rename findFiltered → find across all repos, remove redundant findByTableUuid/findByTableUuidSince from history repos, add explicit assertion to context test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Service interface and implementation for all optimizer CRUD operations including complete-operation lifecycle, stats upsert with history double-write, and filtered queries. Three REST controllers expose the endpoints. The apps/optimizer shared module provides lightweight entity/repo copies for the analyzer and scheduler apps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align OptimizerDataServiceImpl with renamed repository methods from optimizer-1 review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
e628426 to
ef3260f
Compare
Shared JPA entities and repositories for optimizer apps (analyzer, scheduler). All repos expose a single find method with optional filters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve repo conflicts by taking optimizer-1's clean find-only versions. Scheduler-specific methods and streamAll removed per review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mkuchenbecker
left a comment
There was a problem hiding this comment.
this needs tests
These fields never belonged in the data model — remove them at the source rather than adding then deleting in a later PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H2 integration tests for OptimizerDataServiceImpl covering completeOperation (write history, not-found) and upsertTableStats (create, update, history append). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strengthen upsertTableStats test to verify history rows contain the raw delta stats from each call, not just the row count. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OptimizerDataServiceImpl pipes Optional<T> filters straight through to the repo (no .orElse(null) at the boundary). Adds the optimizer.repo.default-limit config property and threads it into the list-shaped calls. Service-impl test updates the one direct statsHistoryRepository.find(...) call to pass Optional.empty(). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Instant.now() on Linux CI carries nanoseconds; MySQL TIMESTAMP(6) and H2 in MySQL mode store microseconds. The scheduledAt = :scheduledAt predicate in find(...) compared nano-resolution param against micro-resolution stored value and missed. Local (macOS, micro-only) hid the bug. Truncate to ChronoUnit.MICROS at write time in the one repo test that exercises the watermark round-trip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
## Optimizer Stack | PR | Content | |---|---| | #527 | Data Model | | #530 **(this)** | Database Repos | | #531 | REST service | | #533 | Analyzer app | | #534 | Scheduler app | | #tbd | Spark BatchedOFD app | | #tbd | Infra, docker-compose, smoke test | ## Summary PR 1 of N in the optimizer stack. [Overall Project](https://docs.google.com/document/d/1oGQWkmlVw0HG-D4Nx37q0oUEQd53Ni4q5Q7h1voaZIQ/edit?tab=t.0#heading=h.vtl818e4m9f7) [Service Design doc](https://docs.google.com/document/d/1xYOD7iuPjZO05UWfT1Dkmf4lYNw4iOjqY10UT2X1FAg/edit?tab=t.0). Spring Data JPA repositories for all four optimizer tables with filtered query support, plus tests exercising save/find, filtered queries, upsert semantics, and append-only history. ## Changes - [ ] Client-facing API Changes - [ ] Internal API Changes - [ ] Bug Fixes - [x] New Features - [ ] Performance Improvements - [ ] Code Style - [ ] Refactoring - [ ] Documentation - [x] Tests **Repositories**: `TableOperationsRepository`, `TableOperationsHistoryRepository`, `TableStatsRepository`, `TableStatsHistoryRepository` — each with JPQL filtered query methods. **Tests**: Repository tests for all four tables plus `OptimizerServiceContextTest` verifying the Spring context loads. ## Testing Done - [ ] Manually Tested on local docker setup. Please include commands ran, and their output. - [x] Added new tests for the changes made. - [ ] Updated existing tests to reflect the changes made. - [ ] No tests added or updated. Please explain why. If unsure, please feel free to ask for help. - [ ] Some other form of testing like staging or soak time in production. Please explain. `./gradlew :services:optimizer:test` — all tests pass (H2 in MySQL mode). # Additional Information - [ ] Breaking Changes - [ ] Deprecations - [x] Large PR broken into smaller PRs, and PR plan linked in the description. --------- Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
All four list-style endpoints now require a caller-supplied limit. No server-side default, no max — getting the API contract right comes first; bounds can land separately. - TableOperationsController.listTableOperations: @RequestParam int limit - TableStatsController.listTableStats: @RequestParam int limit - TableStatsController.getStatsHistory: drop defaultValue="100" - TableOperationsHistoryController.getHistory: drop defaultValue="100" - OptimizerDataService: listTableOperations / listTableStats gain int limit - OptimizerDataServiceImpl: drop @value("${optimizer.repo.default-limit}") and the defaultLimit field; thread caller-supplied limit straight to PageRequest.of(0, limit), which cascades to MySQL LIMIT n. - application.properties: remove now-unused optimizer.repo.default-limit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The PR description looks good with all the design docs including BDP ticket and make the PR comprehensive. But people outside of LinkedIn would not be able to open the links. We are following the convention of not linking internal tickets and doc in the OSS repo. I thought about this in the previous two PRs which are already merged, but did not mention about this. Now I am thinking should we put the internal references, as I am also in the process of raising PRs. Let me know your thoughts on this. |
agreed and removed up the stack. |
Every optimizer endpoint now returns a uniform ApiError body
({code, message, path}) on non-2xx. Reshape updateOperation so the
operationId lives in the URL (and is repeated in the body for
self-describing payloads).
- api/error/ApiError.java — DTO shape.
- api/error/GlobalExceptionHandler.java — @RestControllerAdvice mapping
framework exceptions to {VALIDATION_ERROR, INVALID_PARAMETER,
MISSING_PARAMETER, MALFORMED_REQUEST, INTERNAL_ERROR}. Reason of
ResponseStatusException is parsed as "CODE: message" for endpoint-
specific 404s.
- Controllers: orElseThrow(ResponseStatusException) replaces bare 404.
TableOperationsController moves updateOperation to
POST /v1/optimizer/operations/{id}/update; rejects with 400
PATH_BODY_MISMATCH when body.operationId != path.id.
- UpdateOperationRequest: @notblank operationId, @NotNull status.
- UpsertTableStatsRequest: @notblank databaseName, tableName.
- spring-boot-starter-validation dep added.
- New ControllerErrorHandlingTest: 13 MockMvc cases covering 404 / 400
validation / 400 type-mismatch / 400 missing-param / 400 malformed-
body / 400 path-body-mismatch + happy-path sanity.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address review comments on PR #596 — minimal diff, no bean validation, no per-framework-exception handlers, scoped to the optimizer controllers. - @RestControllerAdvice scoped to the three optimizer controllers via assignableTypes; no longer global. - Two handlers only: ResponseStatusException → ApiError with code = status.name() + reason as message; Exception → 500 INTERNAL_ERROR. Drop MethodArgumentNotValidException, MethodArgumentTypeMismatchException, MissingServletRequestParameterException, HttpMessageNotReadableException — Spring's defaults handle those. - Drop the "CODE: message" reason-parsing convention. - Drop @notblank / @NotNull on UpdateOperationRequest and UpsertTableStatsRequest; drop @Valid on controllers; drop spring-boot-starter-validation dep. Validate operationId / status server- side in TableOperationsController.updateOperation — loose-coupling so relaxing required fields later doesn't break wire callers. - String.format throughout; no message concatenation. - ControllerErrorHandlingTest trimmed from 13 cases to 7: only what the controllers actually own (404s, server-side validation on updateOperation, happy-path sanity). Framework-level 4xx left to Spring. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ng defaults
The custom advice was producing a body shape (ApiError {code, message,
path}) that duplicated Spring Boot's default error JSON ({timestamp,
status, error, message, path}). The only substantive difference was
that Spring Boot 2.7 omits the `message` field by default.
Replace the custom advice with a one-line config:
server.error.include-message=always
Now ResponseStatusException reasons (e.g. "no operation with id X")
reach the caller via Spring's default error body, no custom code.
- Delete api/error/GlobalExceptionHandler.java
- Delete api/error/ApiError.java
- application.properties: server.error.include-message=always
- ControllerErrorHandlingTest assertions trimmed to status-code-only
(MockMvc does not trigger Spring's error-dispatch to
BasicErrorController, so body assertions cannot be made in tests
even though the body is populated on real HTTP requests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per review — no change needed on this file. The path/body validation lives in the controller; the DTO carries the same fields as before with the existing javadoc. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…chenb/optimizer-2
Address review on PR #531 — each endpoint lists the HTTP response codes it actually returns, in the same "Resource ACTION: STATUS" style used by services/tables. Codes per endpoint: - POST /operations/{id}/update — 201, 400, 404 - GET /operations/{id} — 200, 404 - GET /operations — 200, 400 - POST /operations-history — 201 - GET /operations-history/{u} — 200, 400 - PUT /stats/{u} — 200 - GET /stats/{u} — 200, 404 - GET /stats — 200, 400 - GET /stats/{u}/history — 200, 400 Annotations only — no runtime behavior change, no new tests required. swagger-annotations 2.1.11 is already on the optimizer classpath via openhouse.springboot-conventions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
abhisheknath2011
left a comment
There was a problem hiding this comment.
Thanks @mkuchenbecker. The initial set of APIs look good. We can iterate and update as needed.
## Optimizer Stack | PR | Content | |---|---| | #527 | Data Model | | #530 | Database Repos | | #531 | REST service | | #533 **(this)** | Analyzer app | | #534 | Scheduler app | | #599 | Spark BatchedOFD app | | #tbd | Infra, docker-compose, smoke test | ## Summary PR 3 of N in the optimizer stack. Introduces `apps/optimizer-analyzer`, a Spring Boot CommandLineRunner that evaluates every table in `table_stats` against pluggable `OperationAnalyzer` strategies. The first strategy, `OrphanFilesDeletionAnalyzer`, schedules OFD operations with 24h success / 1h failure retry cadence, a 6h SCHEDULED timeout, and a 5-strike circuit breaker. Key design choices: - Bulk-loads operations and history into maps (one query per type), then iterates the stats list — O(types) queries, not O(tables). - Uses the existing generic `find()` repository methods with null params. - Pure unit tests with Mockito — no Spring context needed. ## Changes - [ ] Client-facing API Changes - [ ] Internal API Changes - [ ] Bug Fixes - [x] New Features - [ ] Performance Improvements - [ ] Code Style - [ ] Refactoring - [ ] Documentation - [x] Tests **Core**: `AnalyzerRunner` — loads table_stats, pre-loads operations and history into maps, evaluates each table against all analyzers, circuit breaker logic. **Strategy interface**: `OperationAnalyzer` — `isEnabled(table)`, `shouldSchedule(table, currentOp, latestHistory)`, `getCircuitBreakerThreshold()`. **Cadence policy**: `CadencePolicy` — encapsulates time-based retry logic shared across operation types. **OFD analyzer**: `OrphanFilesDeletionAnalyzer` — enabled via `maintenance.optimizer.ofd.enabled` table property. ## Testing Done - [ ] Manually Tested on local docker setup. Please include commands ran, and their output. - [x] Added new tests for the changes made. - [ ] Updated existing tests to reflect the changes made. - [ ] No tests added or updated. Please explain why. If unsure, please feel free to ask for help. - [ ] Some other form of testing like staging or soak time in production. Please explain. 25 unit tests: - `AnalyzerRunnerTest` (7 tests) — eligible table insertion, cadence skip, disabled table, shouldSchedule=false, null UUID, circuit breaker trip, below-threshold pass - `OrphanFilesDeletionAnalyzerTest` (18 tests) — isEnabled variants, shouldSchedule for no-op/PENDING/SCHEDULING/SCHEDULED with history combinations ``` ./gradlew :apps:optimizer-analyzer:test # BUILD SUCCESSFUL — 25 tests pass ``` # Additional Information - [ ] Breaking Changes - [ ] Deprecations - [x] Large PR broken into smaller PRs, and PR plan linked in the description. --------- Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Abhishek Nath <anath1@linkedin.com>
Reverts the 17 commits that landed on main after v0.5.417, bringing the tree back to exactly the v0.5.417 state. Squashed into a single revert commit for reviewability and to allow reinstating everything as one unit (revert this commit to bring all 17 changes back). Reverted commits (v0.5.417..main, newest first): - Revert #579 (HTS fields in table list api) (#610) - feat(optimizer): [3/N] Analyzer (#533) - [DataLoader] Handle Cast(Literal, TIMESTAMP/DATE/TIME) in scan optimizer (#569 follow-up) (#583) - Skip metadata.json parse in drop path (#589) - feat(optimizer): [2/N] Optimizer REST Service and Controller (#531) - [BDP-102028] feat(optimizer): [1/N] Optimizer Database (#530) - [RTAS]: Fix bug - remove fs scheme from tableLocation in commit (cont) (#594) - Trigger ELR process (#593) - [BDP-102028] feat(optimizer): [0/N] Optimizer API and internal model (#527) - Fail retention app when the columnPattern mismatch partition spec (#552) - [DataLoader] Drop OpenTelemetry minimum version to 1.38.0 (#590) - [DataLoader] Emit OpenTelemetry metrics for read operations (#582) - Cache iceberg metadata to reduce redundant requests to storage (#509) - bump iceberg 1.2 version to 1.2.0.17 (#587) - Support returning HTS fields in table list api (#579) - [DataLoader] Add unique id property to OpenHouseDataLoader (#580) - [DataLoader] Add OpenTelemetry metrics support (#575) ## Summary <!--- HINT: Replace #nnn with corresponding Issue number, if you are fixing an existing issue --> [Issue](https://github.com/linkedin/openhouse/issues/#nnn)] Briefly discuss the summary of the changes made in this pull request in 2-3 lines. ## Changes - [ ] Client-facing API Changes - [ ] Internal API Changes - [ ] Bug Fixes - [ ] New Features - [ ] Performance Improvements - [ ] Code Style - [ ] Refactoring - [ ] Documentation - [ ] Tests For all the boxes checked, please include additional details of the changes made in this pull request. ## Testing Done <!--- Check any relevant boxes with "x" --> - [ ] Manually Tested on local docker setup. Please include commands ran, and their output. - [ ] Added new tests for the changes made. - [ ] Updated existing tests to reflect the changes made. - [ ] No tests added or updated. Please explain why. If unsure, please feel free to ask for help. - [ ] Some other form of testing like staging or soak time in production. Please explain. For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request. # Additional Information - [ ] Breaking Changes - [ ] Deprecations - [ ] Large PR broken into smaller PRs, and PR plan linked in the description. For all the boxes checked, include additional details of the changes made in this pull request. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Optimizer Stack
Summary
PR 2 of N in the optimizer stack.
Service layer and REST controllers for the optimizer service, plus the
apps/optimizershared module providing lightweight entity/repo copies for the analyzer and scheduler apps.Changes
Service layer:
OptimizerDataServiceinterface andOptimizerDataServiceImpl— CRUD operations, complete-operation lifecycle, stats upsert with history double-write, filtered queries.Controllers:
TableOperationsController,TableOperationsHistoryController,TableStatsController— REST endpoints per the design doc API spec.Shared module (
apps/optimizer): Lightweight entity and repository copies used by the analyzer and scheduler apps to read optimizer state directly from MySQL.Testing Done
H2 integration tests in
OptimizerDataServiceImplTest(5 tests):completeOperation_writesHistoryFromOperationRow— saves SCHEDULED row, completes it, asserts history DTO fieldscompleteOperation_notFound_returnsEmpty— completes nonexistent ID, asserts emptyupsertTableStats_createsNewRow— upserts new table, asserts DTO and repo rowupsertTableStats_updatesExistingRow— upserts twice, asserts overwrite with single rowupsertTableStats_appendsHistoryOnEveryCall— upserts twice, asserts 2 history rowsAdditional Information