Commit 3ca2b9e

edandylytics, claude, and codex authored
Sideloading support: API for Output Sets (#46)
* add external API endpoints for output sets

  GET /v1/output-sets lists output file sets from completed runs with filters for partner, tenant, schoolYear, sentToOds, createdAfter, and bundle. POST /v1/output-sets/:setUid/download-links returns presigned S3 download URLs for all files in a set.

  New scopes: read:jobs (list metadata) and read:jobs:output-files (download content). Partner isolation enforced via token scopes.

  Query params validated: sentToOds accepts only true/false, createdAfter must be a parseable date, schoolYear must be a 4-digit end year.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* align output-set API response and JSON typing

  return the output-set list as the flat array shape exercised by the PR 6 tests and serialize it through the shared DTO helper. annotate run_output_file_set.files with a PrismaJson type so the generated client exposes string[] directly and the controller no longer needs an ad hoc cast.

  Co-authored-by: Codex <noreply@openai.com>

* document external API output-set endpoints

  add the read-only output-set workflow to the external API README, including the new scopes, endpoint table entries, polling step, and request/response examples for listing output sets and fetching download links.

  Co-authored-by: Codex <noreply@openai.com>

* update external API docs and local scopes

  Document the output retrieval flow and keep the local Keycloak client scopes aligned with the external API README.

  Co-authored-by: Codex <noreply@openai.com>

* update external API docs: say "successful runs" not "completed runs"

  The output-set list endpoint filters to run.status = 'success', so "completed runs" was misleading: it could imply failed runs are included. Updated four occurrences in the README to say "successful runs" for accuracy.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* improve output-set tests and docs after review

  Tests: restructure GET /output-sets tests into "with invalid token" and "with valid token" groups so auth tests are separate from param validation. Add listOutputSets helper to make filter tests read as the single query param they're varying. Remove redundant per-test token creation.

  Docs: add 404 to error table, clarify presigned URL expiry (AWS session rotation vs TTL), remove PR-scoped language from token/verify limitation.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* clean up output-set integration tests

  - Add seedOutputSet factory to eliminate repeated job+output-set creation boilerplate. Each filter test now reads as just the dimension it varies.
  - Add listOutputSets helper for the request side (same idea).
  - Restructure GET /output-sets into "with invalid token" / "with valid token" / "filters" groups. Valid-token tests share a single token via beforeAll.
  - Apply same "with valid token" structure to download-links tests.
  - Remove unnecessary client_id from output-set test tokens.
  - Rename "with seeded output file sets" to "filters".
  - Tighten response shape test: exact length, no ordering claim.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* tighten output-set filter test assertions

  Each filter test now: seeds a counterexample, makes an unfiltered request proving both sets are visible (resAll, length 2), then makes a filtered request asserting exactly one result with the correct field value.

  Also: remove unnecessary non-null assertions on seedJob().runs, reorder helpers before beforeEach, use exact toHaveLength over toBeGreaterThanOrEqual.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* address review feedback on output-set tests

  - Lift seedOutputSet to Output Sets V1 describe, removing duplicates from GET and POST blocks.
  - Narrow token scopes to minimum required: GET uses read:jobs only, POST uses read:jobs:output-files only.
  - Tighten successful-runs and partner-isolation tests to assert exact length and UID instead of toContain/every.
  - Fix ordering test: use explicit timestamps and assert exact UID order instead of relying on incidental insertion order.
  - Restore missing 200 status assertions on two tests.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* validate createdAfter with class-validator isISO8601

  Replace permissive new Date() parsing with isISO8601 from class-validator using strict and strictSeparator options. This accepts any valid ISO 8601 precision (year, date, full timestamp) but rejects locale formats like "03/15/2024" and space-separated datetimes like "2024-03-15 00:00:00Z". Tests cover garbage strings, locale formats, and space separator.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* return 400 for unknown school year instead of empty array

  A nonexistent school year is more likely a caller typo than "no data yet." Return a clear error so callers can fix their input. Tenant and bundle don't get this validation yet; they return [] for unknown values, which is a reasonable v1 default.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <noreply@openai.com>
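
The query-parameter rules described in the commit message could be sketched roughly as follows. This is a stand-alone illustration, not the repository's code: the actual change validates `createdAfter` with `isISO8601` from class-validator, whereas `looksLikeIso8601` below is a simplified, hypothetical regex stand-in, and `validateOutputSetQuery`, `OutputSetQuery`, and the error strings are names invented for this example.

```typescript
// Hypothetical sketch of the query rules; not the repository's DTO or validator.
type OutputSetQuery = {
  partner?: string;
  sentToOds?: string;
  createdAfter?: string;
  schoolYear?: string;
};

// Simplified ISO 8601 shape check: year, year-month, date, or date + 'T' + time.
// It requires the 'T' separator, so space-separated datetimes are rejected.
// It does not range-check fields (for example, month 13 would pass).
const ISO_8601_SHAPE =
  /^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?)?)?$/;

function looksLikeIso8601(value: string): boolean {
  return ISO_8601_SHAPE.test(value);
}

// Returns a list of human-readable problems; an empty list means the query is acceptable.
function validateOutputSetQuery(query: OutputSetQuery): string[] {
  const errors: string[] = [];
  if (!query.partner) {
    errors.push('partner is required');
  }
  if (
    query.sentToOds !== undefined &&
    query.sentToOds !== 'true' &&
    query.sentToOds !== 'false'
  ) {
    errors.push('sentToOds must be true or false');
  }
  if (query.createdAfter !== undefined && !looksLikeIso8601(query.createdAfter)) {
    errors.push('createdAfter must be an ISO 8601 timestamp');
  }
  if (query.schoolYear !== undefined && !/^\d{4}$/.test(query.schoolYear)) {
    errors.push('schoolYear must be a 4-digit end year');
  }
  return errors;
}
```

A caller would presumably map a non-empty result to a 400 response. Checking whether a well-formed school year actually exists (the "unknown school year" case) needs a database lookup and is outside this sketch.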
1 parent 283d27e commit 3ca2b9e

9 files changed

Lines changed: 671 additions & 23 deletions

app/api/integration/tests/external-api.v1.spec.ts

Lines changed: 341 additions & 4 deletions
@@ -3,11 +3,12 @@ import { signExternalApiToken, TEST_ISSUER } from '../helpers/external-api/token
 import * as jose from 'jose';
 import { EXTERNAL_API_SCOPE_KEY } from '../../src/external-api/auth/external-api-scope.decorator';
 import { ExternalApiV1TokenController } from '../../src/external-api/v1/token.v1.controller';
-import { partnerA } from '../fixtures/context-fixtures/partner-fixtures';
-import { tenantA, tenantX } from '../fixtures/context-fixtures/tenant-fixtures';
-import { allBundles, bundleA, bundleX } from '../fixtures/em-bundle-fixtures';
+import { partnerA, partnerX } from '../fixtures/context-fixtures/partner-fixtures';
+import { tenantA, tenantB, tenantX } from '../fixtures/context-fixtures/tenant-fixtures';
+import { allBundles, bundleA, bundleB, bundleX } from '../fixtures/em-bundle-fixtures';
 import { EarthbeamBundlesService } from 'api/src/earthbeam/earthbeam-bundles.service';
-import { odsConfigA2425 } from '../fixtures/context-fixtures/ods-fixture';
+import { odsConfigA2425, odsConfigA2526, odsConfigB2526, odsConfigX2425 } from '../fixtures/context-fixtures/ods-fixture';
+import { schoolYear2425, schoolYear2526 } from '../fixtures/context-fixtures/school-year-fixtures';

 import { FileService } from 'api/src/files/file.service';
 import { ExternalApiAuthService } from '../../src/external-api/auth/external-api.auth.service';
@@ -17,6 +18,7 @@ import { userA } from '../fixtures/user-fixtures';
 import { GetJobDto } from '@edanalytics/models';
 import { plainToInstance } from 'class-transformer';
 import { ExecutorAwsService } from 'api/src/earthbeam/executor/executor.aws.service';
+import { seedJob } from '../factories/job-factory';

 describe('ExternalApiV1', () => {
   describe('Token Auth', () => {
@@ -777,4 +779,339 @@ describe('ExternalApiV1', () => {
       });
     });
   });
+
+  describe('Output Sets V1', () => {
+    const seedOutputSet = async (overrides: {
+      odsConfig?: typeof odsConfigA2425;
+      bundle?: typeof bundleA;
+      tenant?: typeof tenantA;
+      runStatus?: 'success' | 'error' | 'new' | 'running';
+      files?: string[];
+      sentToOds?: boolean;
+    } = {}) => {
+      const job = await seedJob({
+        odsConfig: overrides.odsConfig ?? odsConfigA2425,
+        bundle: overrides.bundle ?? bundleA,
+        tenant: overrides.tenant ?? tenantA,
+        runStatus: overrides.runStatus ?? 'success',
+      });
+      const set = await prisma.runOutputFileSet.create({
+        data: {
+          runId: job.runs[0].id,
+          files: overrides.files ?? ['output.jsonl'],
+          sentToOds: overrides.sentToOds ?? true,
+          path: `${job.fileBasePath}/output`,
+        },
+      });
+      return { job, set };
+    };
+
+    describe('GET /output-sets', () => {
+      const endpoint = '/v1/output-sets';
+
+      describe('with invalid token', () => {
+        it('should return 401 without a token', async () => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id });
+          expect(res.status).toBe(401);
+        });
+
+        it('should return 403 with token missing read:jobs scope', async () => {
+          const token = await signExternalApiToken({
+            scope: 'create:jobs partner:partner-a',
+          });
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id })
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(403);
+        });
+      });
+
+      describe('with valid token', () => {
+        let token: string;
+
+        beforeAll(async () => {
+          token = await signExternalApiToken({
+            scope: 'read:jobs partner:partner-a',
+          });
+        });
+
+        it('should return 400 without required partner parameter', async () => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(400);
+        });
+
+        it('should return 400 for invalid sentToOds value', async () => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id, sentToOds: 'yes' })
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(400);
+          expect(res.body.message).toContain('sentToOds');
+        });
+
+        it.each([
+          'not-a-date',
+          '03/15/2024',
+          'March 15, 2024',
+          '2024-03-15 00:00:00Z',
+        ])('should return 400 for non-ISO createdAfter value: %s', async (value) => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id, createdAfter: value })
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(400);
+          expect(res.body.message).toContain('ISO 8601 timestamp');
+        });
+
+        it('should return 400 for invalid schoolYear value', async () => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id, schoolYear: '2024abc' })
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(400);
+          expect(res.body.message).toContain('schoolYear');
+        });
+
+        it('should return 400 for unknown schoolYear', async () => {
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id, schoolYear: '1999' })
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(400);
+          expect(res.body.message).toContain('Unknown school year');
+        });
+
+        it('should return 404 when partner does not match token scopes', async () => {
+          const tokenWrongPartner = await signExternalApiToken({
+            scope: 'read:jobs partner:partner-b',
+          });
+          const res = await request(app.getHttpServer())
+            .get(endpoint)
+            .query({ partner: partnerA.id })
+            .set('Authorization', `Bearer ${tokenWrongPartner}`);
+          expect(res.status).toBe(404);
+        });
+
+        describe('filters', () => {
+          const listOutputSets = (query: Record<string, string> = {}) =>
+            request(app.getHttpServer())
+              .get(endpoint)
+              .query({ partner: partnerA.id, ...query })
+              .set('Authorization', `Bearer ${token}`);
+
+          let setA: Awaited<ReturnType<typeof seedOutputSet>>;
+
+          beforeEach(async () => {
+            setA = await seedOutputSet({
+              files: ['output1.jsonl', 'output2.jsonl'],
+            });
+          });
+
+          it('should return output sets with correct shape', async () => {
+            const res = await listOutputSets();
+
+            expect(res.status).toBe(200);
+            expect(res.body).toBeInstanceOf(Array);
+            expect(res.body).toHaveLength(1);
+
+            const set = res.body[0];
+            expect(set.uid).toBe(setA.set.uid);
+            expect(set.files).toEqual(['output1.jsonl', 'output2.jsonl']);
+            expect(set.sentToOds).toBe(true);
+            expect(set.createdAt).toBeDefined();
+            expect(set.jobUid).toBe(setA.job.uid);
+            expect(set.partner).toBe(partnerA.id);
+            expect(set.tenant).toBe(tenantA.code);
+            expect(set.schoolYear).toBe(String(schoolYear2425.endYear));
+            expect(set.bundle).toBe(bundleA.path);
+          });
+
+          it('should only include sets from successful runs', async () => {
+            await seedOutputSet({ runStatus: 'error' });
+
+            const res = await listOutputSets();
+
+            expect(res.status).toBe(200);
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].uid).toBe(setA.set.uid);
+          });
+
+          it('should exclude sets from other partners', async () => {
+            await seedOutputSet({
+              odsConfig: odsConfigX2425,
+              bundle: bundleX,
+              tenant: tenantX,
+            });
+
+            const res = await listOutputSets();
+
+            expect(res.status).toBe(200);
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].uid).toBe(setA.set.uid);
+          });
+
+          it('should filter by tenant', async () => {
+            await seedOutputSet({
+              odsConfig: odsConfigB2526,
+              tenant: tenantB,
+            });
+
+            const resAll = await listOutputSets();
+            expect(resAll.body).toHaveLength(2);
+
+            const res = await listOutputSets({ tenant: tenantA.code });
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].tenant).toBe(tenantA.code);
+          });
+
+          it('should filter by schoolYear (end year)', async () => {
+            await seedOutputSet({
+              odsConfig: odsConfigA2526,
+            });
+
+            const resAll = await listOutputSets();
+            expect(resAll.body).toHaveLength(2);
+
+            const res = await listOutputSets({ schoolYear: String(schoolYear2425.endYear) });
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].schoolYear).toBe(String(schoolYear2425.endYear));
+          });
+
+          it('should filter by sentToOds', async () => {
+            await seedOutputSet({ sentToOds: false });
+
+            const resAll = await listOutputSets();
+            expect(resAll.body).toHaveLength(2);
+
+            const res = await listOutputSets({ sentToOds: 'false' });
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].sentToOds).toBe(false);
+          });
+
+          it('should filter by createdAfter', async () => {
+            // Backdate setA to a known old timestamp
+            await prisma.runOutputFileSet.update({
+              where: { uid: setA.set.uid },
+              data: { createdOn: new Date('2020-01-01T00:00:00Z') },
+            });
+
+            const { set: newerSet } = await seedOutputSet();
+
+            const resAll = await listOutputSets();
+            expect(resAll.body).toHaveLength(2);
+
+            const res = await listOutputSets({ createdAfter: '2024-01-01T00:00:00Z' });
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].uid).toBe(newerSet.uid);
+          });
+
+          it('should filter by bundle', async () => {
+            await seedOutputSet({
+              odsConfig: odsConfigA2526,
+              bundle: bundleB,
+            });
+
+            const resAll = await listOutputSets();
+            expect(resAll.body).toHaveLength(2);
+
+            const res = await listOutputSets({ bundle: bundleA.path });
+            expect(res.body).toHaveLength(1);
+            expect(res.body[0].bundle).toBe(bundleA.path);
+          });
+
+          it('should return results ordered by createdAt ascending', async () => {
+            // Give setA an explicit later timestamp
+            await prisma.runOutputFileSet.update({
+              where: { uid: setA.set.uid },
+              data: { createdOn: new Date('2025-06-01T00:00:00Z') },
+            });
+
+            const { set: olderSet } = await seedOutputSet();
+            await prisma.runOutputFileSet.update({
+              where: { uid: olderSet.uid },
+              data: { createdOn: new Date('2025-01-01T00:00:00Z') },
+            });
+
+            const res = await listOutputSets();
+
+            expect(res.body).toHaveLength(2);
+            expect(res.body[0].uid).toBe(olderSet.uid);
+            expect(res.body[1].uid).toBe(setA.set.uid);
+          });
+        });
+      });
+    });
+
+    describe('POST /output-sets/:setUid/download-links', () => {
+      it('should return 401 without a token', async () => {
+        const res = await request(app.getHttpServer())
+          .post('/v1/output-sets/00000000-0000-0000-0000-000000000000/download-links');
+        expect(res.status).toBe(401);
+      });
+
+      it('should return 403 with token missing read:jobs:output-files scope', async () => {
+        const token = await signExternalApiToken({
+          scope: 'read:jobs partner:partner-a',
+        });
+        const res = await request(app.getHttpServer())
+          .post('/v1/output-sets/00000000-0000-0000-0000-000000000000/download-links')
+          .set('Authorization', `Bearer ${token}`);
+        expect(res.status).toBe(403);
+      });
+
+      describe('with valid token', () => {
+        let token: string;
+
+        beforeAll(async () => {
+          token = await signExternalApiToken({
+            scope: 'read:jobs:output-files partner:partner-a',
+          });
+        });
+
+        it('should return 404 when set UID does not exist', async () => {
+          const res = await request(app.getHttpServer())
+            .post('/v1/output-sets/00000000-0000-0000-0000-000000000000/download-links')
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(404);
+        });
+
+        it('should return 404 when set belongs to a different partner', async () => {
+          const { set: setX } = await seedOutputSet({
+            odsConfig: odsConfigX2425,
+            bundle: bundleX,
+            tenant: tenantX,
+          });
+
+          const res = await request(app.getHttpServer())
+            .post(`/v1/output-sets/${setX.uid}/download-links`)
+            .set('Authorization', `Bearer ${token}`);
+          expect(res.status).toBe(404);
+        });
+
+        it('should return presigned download links for all files in the set', async () => {
+          const { set: setA } = await seedOutputSet({
+            files: ['output1.jsonl', 'output2.jsonl'],
+          });
+
+          const res = await request(app.getHttpServer())
+            .post(`/v1/output-sets/${setA.uid}/download-links`)
+            .set('Authorization', `Bearer ${token}`);
+
+          expect(res.status).toBe(200);
+          expect(res.body.downloadLinks).toBeDefined();
+          expect(Object.keys(res.body.downloadLinks)).toEqual(['output1.jsonl', 'output2.jsonl']);
+
+          // FileService.getPresignedDownloadUrl is mocked to return `s3-test-download-url://{fullPath}`
+          expect(res.body.downloadLinks['output1.jsonl']).toContain('s3-test-download-url://');
+          expect(res.body.downloadLinks['output1.jsonl']).toContain(`${setA.path}/output1.jsonl`);
+          expect(res.body.downloadLinks['output2.jsonl']).toContain(`${setA.path}/output2.jsonl`);
+        });
+      });
+    });
+  });
 });
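
For readers consuming the endpoints exercised by these tests, a client might assemble the two requests along the following lines. This is only a sketch under stated assumptions: `RequestSpec`, `ListFilters`, `buildListRequest`, `buildDownloadLinksRequest`, and the example base URL are hypothetical names introduced here; only the paths, the query parameter names, and the Bearer `Authorization` header come from the tests above.

```typescript
// Hypothetical request descriptors; a real client would hand these to fetch or similar.
type RequestSpec = {
  method: 'GET' | 'POST';
  url: string;
  headers: Record<string, string>;
};

// Optional filters accepted by the list endpoint, per the tests above.
type ListFilters = {
  tenant?: string;
  schoolYear?: string;
  sentToOds?: 'true' | 'false';
  createdAfter?: string;
  bundle?: string;
};

// Builds GET /v1/output-sets with the required partner plus any filters that are set.
function buildListRequest(
  baseUrl: string,
  token: string,
  partner: string,
  filters: ListFilters = {},
): RequestSpec {
  const params: Record<string, string | undefined> = { partner, ...filters };
  const query = Object.keys(params)
    .filter((key) => params[key] !== undefined)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key] as string)}`)
    .join('&');
  return {
    method: 'GET',
    url: `${baseUrl}/v1/output-sets?${query}`,
    headers: { Authorization: `Bearer ${token}` },
  };
}

// Builds POST /v1/output-sets/:setUid/download-links for one set returned by the list call.
function buildDownloadLinksRequest(baseUrl: string, token: string, setUid: string): RequestSpec {
  return {
    method: 'POST',
    url: `${baseUrl}/v1/output-sets/${encodeURIComponent(setUid)}/download-links`,
    headers: { Authorization: `Bearer ${token}` },
  };
}
```

Per the scopes used in the tests, the list request needs a token carrying `read:jobs` and the download-links request needs one carrying `read:jobs:output-files`, so a client that does both presumably requests both scopes. A deployed API may also sit behind a path prefix that the integration tests do not show.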

app/api/keycloak/config.yaml

Lines changed: 10 additions & 0 deletions
@@ -15,6 +15,14 @@ clientScopes:
     protocol: openid-connect
     attributes:
       "include.in.token.scope": "true"
+  - name: "read:jobs"
+    protocol: openid-connect
+    attributes:
+      "include.in.token.scope": "true"
+  - name: "read:jobs:output-files"
+    protocol: openid-connect
+    attributes:
+      "include.in.token.scope": "true"
   - name: "partner:ea"
     protocol: openid-connect
     attributes:
@@ -90,6 +98,8 @@ clients:
     directAccessGrantsEnabled: false
     defaultClientScopes:
       - "create:jobs"
+      - "read:jobs"
+      - "read:jobs:output-files"
       - "partner:ea"
     protocolMappers:
       # The app validates the aud claim, so we need to inject it.
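
The status codes asserted in the integration tests suggest how these scopes gate a request: no token gives 401, a missing action scope gives 403, and a partner mismatch gives 404. The sketch below illustrates that ordering only; `authorizeRequest` and `AuthOutcome` are hypothetical names, and the `partner:<partnerId>` scope format is an assumption inferred from the `partner:partner-a` tokens used in the tests, not taken from the application code.

```typescript
type AuthOutcome = 200 | 401 | 403 | 404;

// scopeClaim is the space-delimited scope string from the token, or null when no token was sent.
function authorizeRequest(
  scopeClaim: string | null,
  requiredScope: string,
  partnerId: string,
): AuthOutcome {
  if (scopeClaim === null) {
    return 401; // no token presented
  }
  // Split into whole scope names so 'read:jobs' is not satisfied by 'read:jobs:output-files'.
  const scopes = scopeClaim.split(' ').filter((scope) => scope.length > 0);
  if (scopes.indexOf(requiredScope) === -1) {
    return 403; // token lacks the action scope for this endpoint
  }
  if (scopes.indexOf(`partner:${partnerId}`) === -1) {
    return 404; // assumed: resources of other partners are reported as not found
  }
  return 200;
}
```

Matching whole scope names matters here because `read:jobs` is a prefix of `read:jobs:output-files`; the tests expect a token holding only `read:jobs` to be refused by the download-links endpoint.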
