-
Notifications
You must be signed in to change notification settings - Fork 35
Description
Bug Report: Emoji Variation Selectors Not Removed from Slugs
Description
The slug() function does not fully remove emojis that contain Unicode variation selectors (U+FE00 to U+FE0F), leaving invisible characters in the generated slug. This creates broken anchor links.
Expected Behavior
Based on the test fixtures, emojis should be completely removed from slugs:
"😄 unicode emoji"→"-unicode-emoji"✓
Actual Behavior
Emojis composed with variation selectors leave the invisible variation selector character in the slug:
"🗃️ File Cabinet"→"️-file-cabinet"✗ (contains invisible U+FE0F)
Reproduction
const { slug } = require('github-slugger');
// Simple emoji - works correctly
console.log(slug('✅ Checkmark'));
// Output: "-checkmark" ✓
// Emoji with variation selector - leaves invisible character
console.log(slug('🗃️ File Cabinet'));
// Output: "️-file-cabinet" ✗ (starts with invisible U+FE0F)
// Inspect the bytes to see the problem
const result = slug('🗃️ File Cabinet');
console.log('Bytes:', Buffer.from(result).toString('hex'));
// Output: "efb88f2d66696c652d636162696e6574"
// ^^^^^^ = U+FE0F (variation selector - should not be here!)Root Cause
Some emojis are composed of multiple Unicode characters:
- Base emoji character (e.g.,
U+1F5C3= 🗃) - Variation Selector-16 (
U+FE0F) - makes the emoji display in color
Examples:
🗃️=U+1F5C3+U+FE0F(2 characters)1️⃣=U+0031+U+FE0F+U+20E3(3 characters)✅=U+2705(1 character) ✓
The current implementation removes the visible emoji character but leaves the invisible variation selector behind.
Impact
This breaks markdown anchor links in real-world usage:
## 🗃️ File Management
Link to section: [See File Management](#-file-management)The link won't work because the slug contains an invisible character that doesn't match.
Affected Emojis
Common emojis with variation selectors include:
- 🗃️ (File Cabinet)
- ☑️ (Check Box)
- ✔️ (Check Mark)
- ⭐️ (Star)
- ❤️ (Red Heart)
- 1️⃣ 2️⃣ 3️⃣ (Keycap Numbers)
- Many others with U+FE0F modifier
Proposed Fix
Strip variation selectors (U+FE00 through U+FE0F) after processing emojis:
function slug(value) {
// ... existing slug generation code ...
// Remove variation selectors that may remain after emoji removal
result = result.replace(/[\uFE00-\uFE0F]/g, '');
return result;
}Workaround
Users can post-process the slug output:
const { slug } = require('github-slugger');
const result = slug('🗃️ File Cabinet').replace(/[\uFE00-\uFE0F]/g, '');
// Output: "-file-cabinet" ✓Environment
github-sluggerversion: 2.0.0 (latest)- Node.js version: v20+
- Platform: All platforms
Additional Context
Similar issues have been addressed in related libraries:
- emoji-regex #45 - variation selector handling
- emojibase #1 - variation selector capture
This is becoming more common as modern emoji keyboards automatically add variation selectors for better rendering.
Would you like me to submit a pull request with the fix?