⚡ Bolt: Optimize JSON-LD parsing in MovieScraper#86
⚡ Bolt: Optimize JSON-LD parsing in MovieScraper#86bartholomej wants to merge 1 commit intomasterfrom
Conversation
Parse JSON-LD string once in `MovieScraper.movie()` and pass the object to helpers, avoiding redundant `JSON.parse` calls in `getMovieYear` and `getMovieDuration`. This reduces the overhead of parsing large JSON strings multiple times per movie request. - Define `MovieJsonLd` interface. - Update `getMovieYear` and `getMovieDuration` to accept `MovieJsonLd` object. - Update `MovieScraper.movie` to parse JSON-LD once. - Update tests to reflect signature changes. Co-authored-by: bartholomej <5861310+bartholomej@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
📝 WalkthroughWalkthroughThis PR introduces type-safe JSON-LD handling by creating a Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/dto/movie.ts (1)
186-188:dateCreatedanddurationshould be optional; preferunknownoveranyin the index signature.Two separate points on the interface shape:
Both fields are declared as required
string, but a live CSFD page that lacksdateCreatedin its JSON-LD will causegetMovieYearto silently returnNaN(because+undefined === NaN) rather thannull, even thoughjsonLdis non-null. Making the known fields optional matches the real data contract and lets TypeScript surface the missing-field check.
[key: string]: anydisables type-checking for all index-accessed properties.unknownis a drop-in replacement that preserves the open shape while forcing callers to narrow before use.✏️ Proposed fix
export interface MovieJsonLd { - dateCreated: string; - duration: string; - [key: string]: any; + dateCreated?: string; + duration?: string; + [key: string]: unknown; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/dto/movie.ts` around lines 186 - 188, Update the movie DTO so known optional fields are optional and the index signature uses unknown: change dateCreated and duration from required strings to optional (dateCreated?: string; duration?: string;) and replace [key: string]: any with [key: string]: unknown; then update any callers (e.g., getMovieYear) to properly narrow/check dateCreated before converting so it can return null instead of producing NaN.src/helpers/movie.helper.ts (1)
127-132: GuardjsonLd.dateCreatedbefore converting — links to optional-field suggestion inMovieJsonLd.If
dateCreatedis made optional inMovieJsonLd(per the dto comment above), the existingif (jsonLd)guard is no longer sufficient and+undefined=NaNwill be silently returned. Adding the field guard now is low-cost and defensive:✏️ Proposed fix
export const getMovieYear = (jsonLd: MovieJsonLd | null): number => { - if (jsonLd) { + if (jsonLd?.dateCreated) { return +jsonLd.dateCreated; } return null; };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/helpers/movie.helper.ts` around lines 127 - 132, The getMovieYear function currently only guards for jsonLd but not jsonLd.dateCreated, which can lead to NaN if dateCreated is undefined; update getMovieYear to check both jsonLd and jsonLd.dateCreated (or use a nullish check) before converting, and return null when dateCreated is missing; reference the getMovieYear function and the MovieJsonLd type so you add a guard like "if (!jsonLd || jsonLd.dateCreated == null) return null" and only then return Number(jsonLd.dateCreated).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/movie.service.ts`:
- Around line 44-50: The code reads
movieHtml.querySelector('script[type="application/ld+json"]').innerText without
guarding for a missing element, which can throw before the JSON.parse try/catch;
update the extraction in the movie() flow to first retrieve the element (use
optional chaining or a null check on the result of movieHtml.querySelector),
only read innerText when the element exists, and if absent leave jsonLd as null
(and optionally log a controlled message) so JSON.parse is never invoked on
undefined; refer to movieHtml, jsonLdString, and MovieJsonLd to locate and fix
the logic.
In `@tests/movie.test.ts`:
- Around line 52-59: getJsonLd currently falls back to parsing the string '{}'
which produces a truthy empty object and breaks consumers like getMovieYear;
change getJsonLd to detect absence of the script element and return null
immediately instead of using '?? "{}"', i.e., capture the result of
node.querySelector('script[type="application/ld+json"]') into a variable, if
it's null return null, otherwise read its innerText and JSON.parse inside the
try/catch to return MovieJsonLd or null; update references to getJsonLd and
ensure getMovieYear receives null when no JSON‑LD is present.
---
Nitpick comments:
In `@src/dto/movie.ts`:
- Around line 186-188: Update the movie DTO so known optional fields are
optional and the index signature uses unknown: change dateCreated and duration
from required strings to optional (dateCreated?: string; duration?: string;) and
replace [key: string]: any with [key: string]: unknown; then update any callers
(e.g., getMovieYear) to properly narrow/check dateCreated before converting so
it can return null instead of producing NaN.
In `@src/helpers/movie.helper.ts`:
- Around line 127-132: The getMovieYear function currently only guards for
jsonLd but not jsonLd.dateCreated, which can lead to NaN if dateCreated is
undefined; update getMovieYear to check both jsonLd and jsonLd.dateCreated (or
use a nullish check) before converting, and return null when dateCreated is
missing; reference the getMovieYear function and the MovieJsonLd type so you add
a guard like "if (!jsonLd || jsonLd.dateCreated == null) return null" and only
then return Number(jsonLd.dateCreated).
| const jsonLdString = movieHtml.querySelector('script[type="application/ld+json"]').innerText; | ||
| let jsonLd: MovieJsonLd | null = null; | ||
| try { | ||
| jsonLd = JSON.parse(jsonLdString); | ||
| } catch (e) { | ||
| console.error('node-csfd-api: Error parsing JSON-LD', e); | ||
| } |
There was a problem hiding this comment.
NPE if the <script type="application/ld+json"> tag is absent.
querySelector returns null if no matching element is found. On line 44, .innerText is accessed directly on the return value without optional chaining. When the tag is missing, this throws TypeError: Cannot read properties of null (reading 'innerText') before reaching the try/catch on line 46, crashing the public movie() method entirely. The test helper at tests/movie.test.ts line 53 correctly uses ?.innerText, making the discrepancy between production and test defensive behavior visible.
🐛 Proposed fix
- const jsonLdString = movieHtml.querySelector('script[type="application/ld+json"]').innerText;
- let jsonLd: MovieJsonLd | null = null;
- try {
- jsonLd = JSON.parse(jsonLdString);
- } catch (e) {
- console.error('node-csfd-api: Error parsing JSON-LD', e);
- }
+ const jsonLdString = movieHtml.querySelector('script[type="application/ld+json"]')?.innerText;
+ let jsonLd: MovieJsonLd | null = null;
+ if (jsonLdString) {
+ try {
+ jsonLd = JSON.parse(jsonLdString) as MovieJsonLd;
+ } catch (e) {
+ console.error('node-csfd-api: Error parsing JSON-LD', e);
+ }
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/services/movie.service.ts` around lines 44 - 50, The code reads
movieHtml.querySelector('script[type="application/ld+json"]').innerText without
guarding for a missing element, which can throw before the JSON.parse try/catch;
update the extraction in the movie() flow to first retrieve the element (use
optional chaining or a null check on the result of movieHtml.querySelector),
only read innerText when the element exists, and if absent leave jsonLd as null
(and optionally log a controlled message) so JSON.parse is never invoked on
undefined; refer to movieHtml, jsonLdString, and MovieJsonLd to locate and fix
the logic.
| const getJsonLd = (node: HTMLElement): MovieJsonLd | null => { | ||
| const text = node.querySelector('script[type="application/ld+json"]')?.innerText ?? '{}'; | ||
| try { | ||
| return JSON.parse(text); | ||
| } catch (e) { | ||
| return null; | ||
| } | ||
| }; |
There was a problem hiding this comment.
?? '{}' fallback returns a truthy empty object, breaking getMovieYear for pages without a JSON-LD tag.
When script[type="application/ld+json"] is absent, innerText is undefined, text becomes '{}', and JSON.parse('{}') returns {} — a truthy object with no dateCreated or duration. Any subsequent call to getMovieYear({}) will execute the if (jsonLd) branch and return +undefined (i.e., NaN) instead of null.
Return null directly when the tag is missing to stay consistent with the MovieJsonLd | null contract:
✏️ Proposed fix
const getJsonLd = (node: HTMLElement): MovieJsonLd | null => {
- const text = node.querySelector('script[type="application/ld+json"]')?.innerText ?? '{}';
- try {
- return JSON.parse(text);
- } catch (e) {
- return null;
- }
+ const text = node.querySelector('script[type="application/ld+json"]')?.innerText;
+ if (!text) return null;
+ try {
+ return JSON.parse(text) as MovieJsonLd;
+ } catch (e) {
+ return null;
+ }
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const getJsonLd = (node: HTMLElement): MovieJsonLd | null => { | |
| const text = node.querySelector('script[type="application/ld+json"]')?.innerText ?? '{}'; | |
| try { | |
| return JSON.parse(text); | |
| } catch (e) { | |
| return null; | |
| } | |
| }; | |
| const getJsonLd = (node: HTMLElement): MovieJsonLd | null => { | |
| const text = node.querySelector('script[type="application/ld+json"]')?.innerText; | |
| if (!text) return null; | |
| try { | |
| return JSON.parse(text) as MovieJsonLd; | |
| } catch (e) { | |
| return null; | |
| } | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/movie.test.ts` around lines 52 - 59, getJsonLd currently falls back to
parsing the string '{}' which produces a truthy empty object and breaks
consumers like getMovieYear; change getJsonLd to detect absence of the script
element and return null immediately instead of using '?? "{}"', i.e., capture
the result of node.querySelector('script[type="application/ld+json"]') into a
variable, if it's null return null, otherwise read its innerText and JSON.parse
inside the try/catch to return MovieJsonLd or null; update references to
getJsonLd and ensure getMovieYear receives null when no JSON‑LD is present.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #86 +/- ##
==========================================
- Coverage 99.52% 99.20% -0.32%
==========================================
Files 28 28
Lines 629 631 +2
Branches 143 145 +2
==========================================
Hits 626 626
- Misses 3 5 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
💡 What: Optimized JSON-LD parsing in
MovieScraper.🎯 Why:
JSON.parsewas called multiple times for the same large JSON-LD string in helper functions (getMovieYear,getMovieDuration).📊 Impact: Reduces synchronous parsing overhead by parsing once per movie request.
🔬 Measurement: Verified with unit tests in
tests/movie.test.tsand integration testcsfd.movie(10135)intests/fetchers.test.ts.PR created automatically by Jules for task 7258766754463568208 started by @bartholomej
Summary by CodeRabbit
Release Notes
Refactor
Tests