feat(utils): port isGitUrl/resolveChunk/formatResults from semble#2
Port src/semble/utils.py to TypeScript:
- isGitUrl: detects remote git URLs by scheme prefix
(https/http/ssh/git/git+ssh/file) or scp-style user@host:repo
(excludes user@host:/abs/path via negative lookahead).
- resolveChunk: returns the chunk containing line in filePath, with
a strict inner match (line < endLine) winning over a boundary
match (line === endLine) which is kept only as a fallback for
end-of-file lines.
- formatResults: wraps SearchResult.toDict outputs as
{ query, results }.
Stopgap structural Chunk/SearchResult types are defined inline
until src/types.ts lands from Unit 1.
Ref: src/semble/utils.py
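
As a rough illustration of the `isGitUrl` rule described above, here is a minimal sketch built from the scheme list and the scp-style pattern quoted in this PR. The constant names `GIT_URL_SCHEMES` and `SCP_STYLE` are hypothetical, and the actual `src/utils.ts` may differ in details such as case handling.

```typescript
// Scheme prefixes named in the commit message (constant name is hypothetical).
const GIT_URL_SCHEMES: readonly string[] = [
  "https://",
  "http://",
  "ssh://",
  "git://",
  "git+ssh://",
  "file://",
];

// scp-style `user@host:repo`. The negative lookahead `(?!\/)` rejects
// `user@host:/abs/path`, where the colon is followed by a slash.
const SCP_STYLE = /^[\w.-]+@[\w.-]+:(?!\/)/;

function isGitUrl(path: string): boolean {
  return (
    GIT_URL_SCHEMES.some((scheme) => path.startsWith(scheme)) ||
    SCP_STYLE.test(path)
  );
}

console.log(isGitUrl("https://github.com/org/repo.git")); // true
console.log(isGitUrl("git@github.com:org/repo.git")); // true
console.log(isGitUrl("user@host:/abs/path")); // false
console.log(isGitUrl("./local/checkout")); // false
```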
Code Review
This pull request ports utility functions and their corresponding tests from Python to TypeScript, introducing helper functions for identifying Git URLs, resolving code chunks by line number, and formatting search results. The review feedback suggests improving cross-platform compatibility in resolveChunk by normalizing path separators to handle Windows backslashes and adding a unit test to verify this behavior.
No issues found across 2 files
Architecture diagram
sequenceDiagram
participant Client as Caller (e.g. Search API)
participant Utils as utils.ts
participant Types as types.ts (future)
participant Python as Original semble Python
Note over Client,Python: Port of `src/semble/utils.py` to TypeScript
Client->>Utils: isGitUrl(path)
alt Scheme match (https://, http://, ssh://, git://, git+ssh://, file://)
Utils->>Utils: Check path.startsWith(scheme)
Utils-->>Client: true
else SCP-style match (user@host:repo)
Utils->>Utils: Test /^[\w.-]+@[\w.-]+:(?!\/)/
Utils-->>Client: true / false
else No match
Utils-->>Client: false
end
Client->>Utils: resolveChunk(chunks, filePath, line)
Utils->>Utils: Iterate chunks matching filePath
alt line < endLine (strict inner match)
Utils->>Utils: Return chunk immediately
else line === endLine (boundary match)
Utils->>Utils: Store as fallback (first only)
end
Note over Utils: After loop, return fallback or null
Client->>Utils: formatResults(query, results)
Utils->>Utils: Map results via toDict()
Utils-->>Client: { query, results }
Note over Types: Stopgap inline types until Unit 1 merges
Note over Python: Behavior identical to Python original
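
The `resolveChunk` branch of the diagram can be sketched roughly as follows. The `Chunk` shape uses the camelCase fields named in this PR. The lower-bound check against `startLine` is an assumption, since the PR text only spells out the `endLine` comparisons.

```typescript
// Minimal structural type with only the fields this sketch needs.
interface Chunk {
  filePath: string;
  startLine: number;
  endLine: number;
}

function resolveChunk(chunks: Chunk[], filePath: string, line: number): Chunk | null {
  let fallback: Chunk | null = null;
  for (const chunk of chunks) {
    // Strict equality on the path, with no normalization.
    if (chunk.filePath !== filePath) continue;
    // Assumption: a chunk cannot contain a line before its startLine.
    if (line < chunk.startLine) continue;
    // Strict inner match wins immediately.
    if (line < chunk.endLine) return chunk;
    // Boundary match: remember only the first one as a fallback.
    if (line === chunk.endLine && fallback === null) fallback = chunk;
  }
  return fallback;
}

const head: Chunk = { filePath: "src/a.ts", startLine: 1, endLine: 10 };
const tail: Chunk = { filePath: "src/a.ts", startLine: 10, endLine: 20 };
const chunks: Chunk[] = [head, tail];

console.log(resolveChunk(chunks, "src/a.ts", 10) === tail); // true: inner match beats head's boundary
console.log(resolveChunk(chunks, "src/a.ts", 20) === tail); // true: end-of-file fallback
console.log(resolveChunk(chunks, "src/a.ts", 21)); // null
```

Line 10 sits on the boundary of `head` and strictly inside `tail`, so the inner match is returned even though `head` is visited first.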
amondnet left a comment
Applied 0, deferred 2.
Both gemini-code-assist comments asked to add backslash→slash path normalization inside resolveChunk plus a matching test. Deferring both:
- Upstream `semble.utils.resolve_chunk` uses strict `chunk.file_path == file_path` equality, with no normalization. Parity with semble is load-bearing for this port (per CLAUDE.md).
- `resolveChunk` is a pure key lookup over chunks produced by the indexer. Cross-platform path canonicalization is a file-walker / indexing concern (Unit 8), where both stored and queried paths can share one canonical form. Normalizing only at the lookup site would mask, not fix, inconsistent emission.
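
To make the second point concrete, here is a small sketch of canonicalizing at emission time instead of at lookup time. `toCanonicalPath` is a hypothetical helper, not part of this PR or of semble, and the real Unit 8 design may look quite different.

```typescript
// Hypothetical helper: convert backslashes to forward slashes once,
// where paths are produced, so stored and queried paths share one form.
function toCanonicalPath(p: string): string {
  return p.replace(/\\/g, "/");
}

const raw: string = "src\\utils.ts"; // Windows-style path as a walker might emit it
const stored = toCanonicalPath(raw);
const queried = toCanonicalPath("src/utils.ts");

console.log(raw === queried); // false: strict equality fails on the raw path
console.log(stored === queried); // true: both sides were canonicalized upstream
```

With this arrangement the strict equality in `resolveChunk` can stay as it is, matching upstream.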
cubic-dev-ai found no issues.
Port `src/semble/utils.py` to TypeScript as part of the parallel @pleaseai/csp port effort (Unit 3).
What changed
Stopgap types
`src/types.ts` doesn't exist yet (Unit 1's territory). Per the task brief, structural `Chunk` / `SearchResult` types are defined inline using camelCase fields (`filePath`, `startLine`, `endLine`) per the @pleaseai/csp public-API convention. These should be replaced with re-exports from `./types.ts` once Unit 1 merges.
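
For orientation, the stopgap types and `formatResults` might look roughly like the sketch below. Only `filePath`, `startLine`, `endLine`, `toDict`, and the `{ query, results }` wrapper come from this PR. The `chunk` and `score` fields, the `makeResult` helper, and the shape returned by `toDict` are assumptions made for illustration.

```typescript
interface Chunk {
  filePath: string;
  startLine: number;
  endLine: number;
}

interface SearchResult {
  chunk: Chunk; // assumed field
  score: number; // assumed field
  toDict(): Record<string, unknown>;
}

function formatResults(
  query: string,
  results: SearchResult[],
): { query: string; results: Record<string, unknown>[] } {
  return { query, results: results.map((r) => r.toDict()) };
}

// Hypothetical factory used only to build a result for this example.
function makeResult(chunk: Chunk, score: number): SearchResult {
  return { chunk, score, toDict: () => ({ ...chunk, score }) };
}

const payload = formatResults("isGitUrl", [
  makeResult({ filePath: "src/utils.ts", startLine: 1, endLine: 12 }, 0.9),
]);
console.log(JSON.stringify(payload));
```

Because the types are structural, swapping the inline interfaces for re-exports from `./types.ts` later should not require changes at call sites, provided the field names stay the same.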
Verification
```
bun test src/utils.test.ts
25 pass
0 fail
25 expect() calls
```
Source
`src/semble/utils.py`
Summary by cubic
Ports `isGitUrl`, `resolveChunk`, and `formatResults` from Python to TypeScript with full test coverage. This delivers Unit 3 utils for the `@pleaseai/csp` port and keeps behavior 1:1 with the original.
- `src/utils.ts` exports:
  - `isGitUrl(path)`: detects git URLs by scheme or scp-style `user@host:repo` (excludes `user@host:/abs/path`).
  - `resolveChunk(chunks, filePath, line)`: strict inner match wins; `endLine` boundary is a fallback.
  - `formatResults(query, results)`: returns `{ query, results }` via `toDict()`.
- `Chunk`/`SearchResult` interfaces inline; replace with `./types.ts` when available.
- `src/utils.test.ts`: 25 tests for URL detection, chunk resolution, and result formatting.

Written for commit fb5cb4e. Summary will update on new commits.