Feature/upgrade tree sitter #1
- Upgrade tree-sitter from 0.23 to 0.26.5
- Upgrade tree-sitter-rust to 0.24.0
- Upgrade tree-sitter-python to 0.25.0
- Upgrade tree-sitter-javascript to 0.25.0
- Upgrade tree-sitter-typescript to 0.23.2
- Upgrade tree-sitter-c to 0.24.1
- Upgrade tree-sitter-cpp to 0.23.4
- Upgrade tree-sitter-c-sharp to 0.23.1
- Upgrade tree-sitter-go to 0.25.0
- Upgrade tree-sitter-java to 0.23.5
Breaking change: tree-sitter 0.26 changed the named_child() parameter from usize to u32. Fixed all type mismatches in the extract_docstring() methods of the Rust, JavaScript, C#, Go, and Java extractors and in the extract_c_style_doc() helper function.
All 195 tests pass.
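The index-type change described above can be handled at the call site with a checked conversion. The sketch below is only an illustration: `Node` is a hypothetical stand-in (not the real `tree_sitter::Node`) that mirrors the `u32` index reported in the commit message, and `child_at` is an assumed helper name, not a function from this PR.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a syntax-tree node (NOT tree_sitter::Node).
/// It only mirrors the `u32` index type the commit message describes.
struct Node {
    named_children: Vec<&'static str>,
}

impl Node {
    /// Mimics an accessor that takes a `u32` index.
    fn named_child(&self, index: u32) -> Option<&'static str> {
        self.named_children.get(index as usize).copied()
    }
}

/// Looks up a child from a `usize` loop index (assumed helper name).
/// `u32::try_from` fails on overflow instead of silently truncating,
/// which a plain `index as u32` cast would do.
fn child_at(node: &Node, index: usize) -> Option<&'static str> {
    let index = u32::try_from(index).ok()?;
    node.named_child(index)
}
```

A plain `as u32` cast also compiles and is probably fine for realistic child counts; the checked form simply makes the out-of-range case explicit.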
flupkede left a comment
🔍 AI Code Review — PR #1
This PR upgrades tree-sitter from version 0.23 to 0.26 along with all language grammars (Rust 0.23→0.24, Python 0.23→0.25, etc.). The main code changes in src/chunker/extractor.rs adapt to tree-sitter API changes where named_child() now requires u32 instead of usize. The GitHub workflow has also been refactored to make macOS builds optional and manual-only due to cost concerns.
Summary
- ✅ Tree-sitter upgrade is well-structured and consistent
- ✅ Code changes correctly adapt to breaking API changes (type casts)
- ⚠️ Potential issue: Release job dependency on macOS build
- ⚠️ Inconsistency: Cargo.toml version vs Cargo.lock
Findings
- 🔴 Critical: 0
- 🟡 Warning: 2
- 🟢 Suggestion: 0
  release:
-   needs: build
+   needs: [build]
🟡 Warning: The release job only depends on needs: [build], not on build-macos. If a user manually triggers the workflow with include_macos: true, the release will proceed even if the macOS build fails.
release:
  needs: [build, build-macos]
  if: |
    always() &&
    needs.build.result == 'success' &&
    (github.event_name != 'workflow_dispatch' || !inputs.include_macos || needs.build-macos.result == 'success')
  [package]
  name = "codesearch"
- version = "0.1.47"
+ version = "0.1.48"
🟡 Warning: Version mismatch - Cargo.toml shows version 0.1.49, but Cargo.lock was updated to 0.1.48. These should be consistent. The lockfile should be regenerated after the version bump to 0.1.49.
cargo update
cargo build
This will update the lockfile to match the declared version.
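A mismatch like the one flagged above could be caught with a small check before release. The following is a minimal sketch using plain line-based parsing rather than a real TOML parser; the function names `manifest_version` and `locked_version` are assumptions for illustration and are not part of this PR.

```rust
/// Returns the quoted value of the first `key = "value"` line in `text`.
fn quoted_value<'a>(text: &'a str, key: &str) -> Option<&'a str> {
    text.lines().find_map(|line| {
        let (k, v) = line.split_once('=')?;
        if k.trim() != key {
            return None;
        }
        Some(v.trim().trim_matches('"'))
    })
}

/// Version declared in the `[package]` table of Cargo.toml-style text.
fn manifest_version(manifest: &str) -> Option<&str> {
    let (_, after) = manifest.split_once("[package]")?;
    // Only look at lines before the next table header.
    let section = after.split("\n[").next()?;
    quoted_value(section, "version")
}

/// Version recorded for the package `name` in Cargo.lock-style text.
fn locked_version<'a>(lockfile: &'a str, name: &str) -> Option<&'a str> {
    lockfile.split("[[package]]").skip(1).find_map(|entry| {
        if quoted_value(entry, "name")? == name {
            quoted_value(entry, "version")
        } else {
            None
        }
    })
}
```

Comparing `manifest_version(..)` with `locked_version(.., "codesearch")` and failing when they differ would surface the inconsistency; a production check would more likely rely on a proper TOML parser.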
Modified search_exact() to use a MUST constraint when both an identifier and a target_kind are detected. This prevents boosting ALL items of a kind (e.g., all enums when searching for 'ChunkKind enum') and only boosts items matching both the identifier AND the kind.
Changes:
- Added contains_identifier() helper for PascalCase/snake_case/camelCase detection
- Modified detect_structural_intent() to require both keyword AND identifier
- Fixed kind mappings: 'struct' now correctly maps to ChunkKind::Struct
- Updated search_exact() to use intersection (MUST) instead of union
Results:
- Q15 (Chunk struct): #4 → #1, P@10: 0.70 → 1.00 ✅
- Q16 (Chunker trait): #2 → #1, P@10: 1.00 → 1.00 ✅
- Q17 (ChunkKind enum): #1 → #5 but noise reduced 6→1 enums, P@10: 0.67 → 0.73 ✅
- Average P@10: 0.85 → 0.86 (+0.01)
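The intersection behaviour described in this commit message can be sketched roughly as follows. This is not the project's implementation: the `Chunk` struct, the reduced `ChunkKind` enum, and the helpers `looks_like_identifier`, `kind_for_keyword`, and `must_match` are hypothetical, and the identifier heuristic (an underscore or an uppercase letter) is an assumption about what such a check might look like.

```rust
/// Reduced, hypothetical version of the `ChunkKind` enum named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChunkKind {
    Struct,
    Enum,
    Trait,
}

/// Hypothetical chunk record used only for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Chunk {
    name: &'static str,
    kind: ChunkKind,
}

/// Maps a structural keyword to a kind ('struct' maps to Struct).
fn kind_for_keyword(word: &str) -> Option<ChunkKind> {
    match word.to_ascii_lowercase().as_str() {
        "struct" => Some(ChunkKind::Struct),
        "enum" => Some(ChunkKind::Enum),
        "trait" => Some(ChunkKind::Trait),
        _ => None,
    }
}

/// Assumed heuristic: a word looks like an identifier if it has an
/// underscore (snake_case) or an uppercase letter (PascalCase, camelCase).
fn looks_like_identifier(word: &str) -> bool {
    let valid = !word.is_empty()
        && word.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    valid && (word.contains('_') || word.chars().any(|c| c.is_ascii_uppercase()))
}

/// Returns an intent only when BOTH a keyword and an identifier are present.
fn detect_structural_intent(query: &str) -> Option<(&str, ChunkKind)> {
    let mut identifier = None;
    let mut kind = None;
    for word in query.split_whitespace() {
        if let Some(k) = kind_for_keyword(word) {
            kind = Some(k);
        } else if looks_like_identifier(word) {
            identifier = Some(word);
        }
    }
    Some((identifier?, kind?))
}

/// Intersection (MUST): keep chunks matching the identifier AND the kind.
fn must_match<'a>(chunks: &'a [Chunk], query: &str) -> Vec<&'a Chunk> {
    match detect_structural_intent(query) {
        Some((ident, kind)) => chunks
            .iter()
            .filter(|c| c.kind == kind && c.name == ident)
            .collect(),
        None => Vec::new(),
    }
}
```

With a union, a query such as 'ChunkKind enum' would also match every other enum; requiring both conditions narrows the boosted set to the one chunk that matches the name and the kind.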