feat: block scraping and rewrite canonical on preview builds #2566

Open
vladfrangu wants to merge 4 commits into master from preview-robots-canonical

Conversation

@vladfrangu
Member

Adds a postBuild plugin that, on PR/preview deployments (detected via APIFY_DOCS_ABSOLUTE_URL hostname, same heuristic as the existing preview banner), overwrites robots.txt with a global Disallow: / and rewrites every page's canonical link to point at https://docs.apify.com{path}. Keeps previews out of search indexes and prevents them from competing with production URLs.
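A minimal sketch of the detection and robots.txt pieces described above, not the plugin's actual code. The helper name `isPreviewDeployment`, the constant names, and the exact hostname rule (a `.preview.docs.apify.com` suffix, inferred from the preview URL format) are assumptions; the real heuristic is whatever the existing preview banner uses.

```typescript
// Assumption: preview deployments are served from "<something>.preview.docs.apify.com".
const PREVIEW_HOST_SUFFIX = '.preview.docs.apify.com';

// Hypothetical helper: decides whether a build targets a preview deployment,
// based on the value that APIFY_DOCS_ABSOLUTE_URL would hold.
function isPreviewDeployment(absoluteUrl: string | undefined): boolean {
    if (!absoluteUrl) return false;
    let hostname: string;
    try {
        hostname = new URL(absoluteUrl).hostname;
    } catch {
        // Not a parseable absolute URL, so treat it as a non-preview build.
        return false;
    }
    return hostname.endsWith(PREVIEW_HOST_SUFFIX);
}

// A global disallow for every crawler, as written to robots.txt on previews.
const PREVIEW_ROBOTS_TXT = 'User-agent: *\nDisallow: /\n';
```

In a postBuild hook, the result of `isPreviewDeployment(process.env.APIFY_DOCS_ABSOLUTE_URL)` would gate both the robots.txt overwrite and the canonical rewrite, so production builds are left untouched.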

The second commit handles SWC's HTML minifier stripping attribute quotes and omitting </head>. Without it, the canonical rewrite would have shipped as dead code (caught by staff-review against the actual build output).

Adds a postBuild plugin that, for PR/preview deployments, overwrites
robots.txt with a global Disallow and rewrites every page's canonical
link to point at docs.apify.com, so search engines don't index previews
and previews don't compete with production URLs.

The original regex required quotes around `canonical` and the fallback
inserted before `</head>`. Both forms are stripped by the SWC HTML
minimizer, so neither branch fired against real build output and the
canonical rewrite shipped as dead code. Make the regex quote-optional,
add a `<body` fallback for the closing-tag-less case, and special-case
`404.html` to match the production canonical form.
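The quote-optional match and the `<body` fallback can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `rewriteCanonical`, `CANONICAL_LINK`, and `PRODUCTION_ORIGIN` are hypothetical names, and the `404.html` special case is left out because the production canonical form is not stated here.

```typescript
const PRODUCTION_ORIGIN = 'https://docs.apify.com';

// Matches a <link> tag whose rel is canonical, whether the minifier kept the
// attribute quotes (rel="canonical") or stripped them (rel=canonical).
const CANONICAL_LINK = /<link\b[^>]*\srel=(?:"canonical"|'canonical'|canonical)(?=[\s>\/])[^>]*>/i;

// Hypothetical helper: returns the HTML with its canonical link pointing at production.
function rewriteCanonical(html: string, pagePath: string): string {
    const tag = `<link rel="canonical" href="${PRODUCTION_ORIGIN}${pagePath}">`;

    // Preferred path: replace the existing canonical link in place.
    if (CANONICAL_LINK.test(html)) {
        return html.replace(CANONICAL_LINK, () => tag);
    }

    // Fallbacks: insert before </head>, or before <body when the minifier
    // has omitted the closing head tag.
    for (const anchor of [/<\/head>/i, /<body\b/i]) {
        const index = html.search(anchor);
        if (index !== -1) {
            return html.slice(0, index) + tag + html.slice(index);
        }
    }

    // No usable anchor: leave the page unchanged.
    return html;
}
```

Testing a helper like this against minified output (unquoted attributes, no `</head>`) is what would catch the dead-code case described above.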
github-actions bot added this to the 141st sprint - Tooling team milestone on May 25, 2026
github-actions bot added the t-tooling label ("Issues with this label are in the ownership of the tooling team.") on May 25, 2026
@apify-service-account
Contributor

apify-service-account commented May 25, 2026

✅ Preview for this PR (commit c3175b50) is ready at https://pr-2566.preview.docs.apify.com (see action run).

vladfrangu requested a review from TC-MO on May 25, 2026, 10:33
vladfrangu added the adhoc label ("Ad-hoc unplanned task added during the sprint.") on May 25, 2026
vladfrangu requested a review from B4nan on May 25, 2026, 15:19
@vladfrangu
Member Author

(staff-review has already been run)
