
Add challenge 106: Token Embedding Layer (Medium) #283

Open
claude[bot] wants to merge 1 commit into main from add-challenge-106-token-embedding-layer

claude[bot] commented Jun 19, 2026

Summary

  • Adds challenge 106: Token Embedding Layer — the BERT-style input embedding kernel that gathers token and positional embeddings, sums them, and applies LayerNorm with learnable affine parameters.
  • Exercises indirect-addressed gathers from two large lookup tables, coalesced reads across the embedding dimension, and per-token reductions in a single fused kernel — solidly Medium, distinct from the existing attention/sampling/quantization challenges.
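
For orientation, the computation described above can be sketched as a plain-Python reference. This is only an illustration, not the reference implementation or the CUDA solution from this PR. The function name `embed_layernorm`, the argument names, the use of positions `0..T-1` as positional indices, and `eps=1e-5` are all assumptions made for the sketch.

```python
import math

def embed_layernorm(token_ids, tok_table, pos_table, gamma, beta, eps=1e-5):
    """Hypothetical reference: gather, sum, then LayerNorm per token.

    token_ids: B x T nested list of ints
    tok_table: V x D nested list, pos_table: P x D nested list
    gamma, beta: length-D affine parameters
    """
    out = []
    for seq in token_ids:
        rows = []
        for t, tok in enumerate(seq):
            # Indirect gather from both tables, then elementwise sum.
            # Assumption: the position index is the token's offset t.
            x = [te + pe for te, pe in zip(tok_table[tok], pos_table[t])]
            d = len(x)
            # Per-token reductions over the embedding dimension.
            mean = sum(x) / d
            var = sum((v - mean) ** 2 for v in x) / d  # biased variance
            inv_std = 1.0 / math.sqrt(var + eps)
            # Normalize, then apply the learnable scale and shift.
            rows.append([(v - mean) * inv_std * g + s
                         for v, g, s in zip(x, gamma, beta)])
        out.append(rows)
    return out
```

In a fused GPU kernel, the two reductions (mean and variance) are the part that needs cooperation between threads. The gather and the affine step are independent per element.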

Validation

  • Verified end-to-end on the live platform via scripts/run_challenge.py (--action submit: all tests passed on Tesla T4).
  • pre-commit run --all-files clean.

Test plan

  • Linting passes
  • Reference impl matches the CUDA solution across all 9 functional cases and the perf case (B=32, T=512, V=30,000, P=2,048, D=768)
  • First HTML example matches generate_example_test()
  • Performance test uses ~150 MB of device memory (5× that footprint fits in the 16 GB of Tesla T4 VRAM)
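
The ~150 MB figure can be checked with a quick back-of-the-envelope calculation from the perf-case dimensions. The sketch below assumes 4-byte elements throughout (float32 tables, parameters and output, int32 token ids). The PR does not state the dtypes, so treat the buffer list as an assumption.

```python
# Perf-case dimensions from the test plan.
B, T, V, P, D = 32, 512, 30_000, 2_048, 768
BYTES = 4  # assumption: float32 / int32 everywhere

sizes = {
    "token_table": V * D * BYTES,   # token embedding lookup table
    "pos_table": P * D * BYTES,     # positional embedding lookup table
    "output": B * T * D * BYTES,    # normalized embeddings
    "token_ids": B * T * BYTES,     # input indices
    "gamma_beta": 2 * D * BYTES,    # LayerNorm affine parameters
}
total = sum(sizes.values())

for name, nbytes in sizes.items():
    print(f"{name:12s} {nbytes / 1e6:8.2f} MB")
print(f"{'total':12s} {total / 1e6:8.2f} MB")
```

Under these assumptions the total is roughly 149 MB. The token table (about 92 MB) dominates, followed by the output (about 50 MB).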

🤖 Generated with Claude Code

Implements the BERT-style input embedding kernel: gather token and
positional embeddings, sum them, and apply Layer Normalization with
learnable affine parameters. Exercises indirect-addressed gathers,
coalesced reads across the embedding dimension, and per-token reductions
in a single fused kernel.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>