⚡ Bolt: Optimize JSON-LD parsing in MovieScraper#81
⚡ Bolt: Optimize JSON-LD parsing in MovieScraper#81bartholomej wants to merge 1 commit intomasterfrom
Conversation
- Parse JSON-LD once in `MovieScraper.movie` and pass the object to helpers. - Refactor `getMovieYear` and `getMovieDuration` to accept parsed object. - Avoids redundant `JSON.parse` calls. - Preserves fallback logic for duration scraping. - Updates tests to reflect signature changes. Co-authored-by: bartholomej <5861310+bartholomej@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
📝 WalkthroughWalkthroughThe pull request refactors JSON-LD handling by shifting the parsing responsibility from individual helper functions to the service layer. Helper functions now accept pre-parsed JSON-LD objects instead of raw strings, with improved null-safety and error handling throughout. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #81 +/- ##
==========================================
- Coverage 99.52% 99.20% -0.32%
==========================================
Files 28 28
Lines 629 630 +1
Branches 143 145 +2
==========================================
- Hits 626 625 -1
- Misses 3 5 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/services/movie.service.ts`:
- Around line 44-50: The code calls
movieHtml.querySelector('script[type="application/ld+json"]').innerText without
checking for a null result, which can throw a TypeError before the JSON.parse
try/catch; update the block in movie.service.ts to first get the element into a
variable (e.g., const ldScript = movieHtml.querySelector(...)) and use optional
chaining or a null check when reading innerText (e.g., const jsonLd =
ldScript?.innerText), then only attempt JSON.parse when jsonLd is not null/empty
and handle the missing-script case by leaving movieJsonLd null or logging a
clear warning; reference the existing symbols movieHtml, jsonLd, movieJsonLd and
the querySelector call to locate where to change.
🧹 Nitpick comments (1)
src/services/movie.service.ts (1)
59-59: Consider a narrower type thananyforjsonLd.Using
anyloses all type safety. Even a minimal interface (e.g.,{ dateCreated?: string; duration?: string }) orRecord<string, unknown>would catch misuse at compile time.
| const jsonLd = movieHtml.querySelector('script[type="application/ld+json"]').innerText; | ||
| return this.buildMovie(+movieId, movieNode, asideNode, pageClasses, jsonLd, options); | ||
| let movieJsonLd = null; | ||
| try { | ||
| movieJsonLd = JSON.parse(jsonLd); | ||
| } catch (e) { | ||
| console.error('node-csfd-api: Error parsing JSON-LD', e); | ||
| } |
There was a problem hiding this comment.
Potential TypeError if the JSON-LD <script> tag is missing.
Line 44 calls .innerText on the result of querySelector(...) without a null check. If the page lacks a script[type="application/ld+json"] element, this throws before the try/catch on line 46 can catch it.
🐛 Proposed fix — use optional chaining
- const jsonLd = movieHtml.querySelector('script[type="application/ld+json"]').innerText;
- let movieJsonLd = null;
- try {
- movieJsonLd = JSON.parse(jsonLd);
- } catch (e) {
- console.error('node-csfd-api: Error parsing JSON-LD', e);
- }
+ const jsonLdRaw = movieHtml.querySelector('script[type="application/ld+json"]')?.innerText;
+ let movieJsonLd = null;
+ if (jsonLdRaw) {
+ try {
+ movieJsonLd = JSON.parse(jsonLdRaw);
+ } catch (e) {
+ console.error('node-csfd-api: Error parsing JSON-LD', e);
+ }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const jsonLd = movieHtml.querySelector('script[type="application/ld+json"]').innerText; | |
| return this.buildMovie(+movieId, movieNode, asideNode, pageClasses, jsonLd, options); | |
| let movieJsonLd = null; | |
| try { | |
| movieJsonLd = JSON.parse(jsonLd); | |
| } catch (e) { | |
| console.error('node-csfd-api: Error parsing JSON-LD', e); | |
| } | |
| const jsonLdRaw = movieHtml.querySelector('script[type="application/ld+json"]')?.innerText; | |
| let movieJsonLd = null; | |
| if (jsonLdRaw) { | |
| try { | |
| movieJsonLd = JSON.parse(jsonLdRaw); | |
| } catch (e) { | |
| console.error('node-csfd-api: Error parsing JSON-LD', e); | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@src/services/movie.service.ts` around lines 44 - 50, The code calls
movieHtml.querySelector('script[type="application/ld+json"]').innerText without
checking for a null result, which can throw a TypeError before the JSON.parse
try/catch; update the block in movie.service.ts to first get the element into a
variable (e.g., const ldScript = movieHtml.querySelector(...)) and use optional
chaining or a null check when reading innerText (e.g., const jsonLd =
ldScript?.innerText), then only attempt JSON.parse when jsonLd is not null/empty
and handle the missing-script case by leaving movieJsonLd null or logging a
clear warning; reference the existing symbols movieHtml, jsonLd, movieJsonLd and
the querySelector call to locate where to change.
💡 What: Optimized
MovieScraperto parse JSON-LD data only once instead of multiple times in helper functions.🎯 Why: Parsing JSON is synchronous and can be expensive. Since multiple fields (year, duration) are extracted from the same JSON-LD block, it is more efficient to parse it once and pass the object.
📊 Impact: Reduces redundant work. For each movie scraped,
JSON.parseis called 1 time instead of 3 times (once for year, once for duration, and potentially more if future fields are added).🔬 Measurement: Verified with
yarn test. All tests passed. The logic ensures that if JSON parsing fails, existing fallback mechanisms (DOM scraping) are used, maintaining robustness.PR created automatically by Jules for task 5305642222362313870 started by @bartholomej
Summary by CodeRabbit