diff --git a/README.md b/README.md
index 1b53cb7..d0cbd04 100644
--- a/README.md
+++ b/README.md
@@ -44,7 +44,7 @@ graph TB
 6. **Sample upload** lets users upload WAV files (songs, snippets) as reference tracks. The system analyzes metadata, generates CLAP embeddings, and the agent finds similar library samples via audio-to-audio cosine similarity
 7. **Pair feedback** lets users evaluate sample pairs — the agent presents plausible pairs with side-by-side playback and a "Play Together" mixed preview (aligned to song context key/BPM), collects thumbs up/down verdicts, and computes relational audio features in the background. Rapid pairing mode with random anchors and a "Next Pair" button enables fast verdict collection.
 8. **Preference learning** trains a logistic regression on 10-dimensional feature vectors (4 pair scores + 6 relational audio features) from accumulated verdicts. Auto-retrains every 5th verdict after 15 verdicts. Learned preferences are injected into the agent's system prompt and surfaced via the `show_preferences` tool as natural-language explanations.
-9. **Kit builder** assembles a complete multi-sample kit (e.g., kick + snare + hihat + bass + pad) using a greedy algorithm that maximizes pairwise compatibility. Per-type CLAP search retrieves candidates, fast inline scoring selects samples, and CNN diversity penalties prevent spectral redundancy — all rendered as an interactive kit card with per-slot playback
+9. **Kit builder** assembles a complete multi-sample kit (e.g., kick + snare + hihat + bass + pad) using a greedy algorithm that maximizes pairwise compatibility. Per-type CLAP search retrieves candidates, fast inline scoring selects samples, and CNN diversity penalties prevent spectral redundancy. Key compatibility scoring is skipped for unpitched/percussive types (drums, percussion, fx, etc.). The finished kit is rendered as an interactive kit card with per-slot playback.
 10. **Agent streams response** back as SSE in Vercel AI SDK format, with transparent tool-call display and a song context badge in the chat header
 
 ### Why CLAP + CNN + Agent?
diff --git a/backend/src/samplespace/agents/sample_agent.py b/backend/src/samplespace/agents/sample_agent.py
index 351ad80..a73fd5b 100644
--- a/backend/src/samplespace/agents/sample_agent.py
+++ b/backend/src/samplespace/agents/sample_agent.py
@@ -41,7 +41,7 @@
 - **Pair feedback**: present_pair → user verdict → record_verdict. The system learns from verdicts over time — after enough feedback, use show_preferences to explain what it has learned.
 - **Rapid pairing**: When the user asks to "start a pairing session" or "evaluate pairs," use present_pair with anchor_type and candidate_type (omit sample_id for random anchors). When you receive a `[NEXT_PAIR]` message, call record_verdict for the previous pair, then immediately call present_pair again with the same types — keep it fast, minimal commentary.
 - **Upload flow**: User uploads a WAV → analyze_sample → find_similar_to_upload to find library matches.
-- If the user references a sample by name rather than ID, search for it first.
+- **Resolving sample references**: Users will refer to samples by ordinal position ("the 3rd one", "the first result"), by filename ("warm-pad.wav"), or by description ("that bass loop"). When they use an ordinal, resolve it from the most recent search or tool results in the conversation — each result includes a 1-based `index` field. When they use a filename or partial name, search for it first.
 
 ## Output Rules
 
diff --git a/backend/src/samplespace/agents/tools/formatting.py b/backend/src/samplespace/agents/tools/formatting.py
index 5561e3d..e352737 100644
--- a/backend/src/samplespace/agents/tools/formatting.py
+++ b/backend/src/samplespace/agents/tools/formatting.py
@@ -11,8 +11,8 @@ def format_sample_results(
 ) -> str:
     """Format a list of sample results as a playable sample-results code fence."""
     samples: list[dict[str, object]] = []
-    for s in results:
-        payload = sample_to_payload(s)
+    for i, s in enumerate(results, start=1):
+        payload = sample_to_payload(s, index=i)
         if annotations and s.id in annotations:
             payload["annotation"] = annotations[s.id]
         samples.append(payload)
@@ -21,13 +21,19 @@ def format_sample_results(
     return f"{header}\n\n```sample-results\n{json_str}\n```"
 
 
-def sample_to_payload(sample: SampleSchema, audio_url: str | None = None) -> dict[str, object]:
+def sample_to_payload(
+    sample: SampleSchema,
+    audio_url: str | None = None,
+    index: int | None = None,
+) -> dict[str, object]:
     """Build a JSON-serializable payload dict for a sample."""
     payload: dict[str, object] = {
         "id": sample.id,
         "filename": sample.filename,
         "audio_url": audio_url or f"/api/samples/{sample.id}/audio",
     }
+    if index is not None:
+        payload["index"] = index
     if sample.sample_type:
         payload["type"] = sample.sample_type
     if sample.is_loop:
diff --git a/backend/src/samplespace/agents/tools/transform_tools.py b/backend/src/samplespace/agents/tools/transform_tools.py
index 87b95ec..29d6ea5 100644
--- a/backend/src/samplespace/agents/tools/transform_tools.py
+++ b/backend/src/samplespace/agents/tools/transform_tools.py
@@ -8,6 +8,7 @@
 
 from samplespace.agents.deps import AgentDeps
 from samplespace.schemas.sample import SampleSchema
+from samplespace.schemas.sample_type import UNPITCHED_TYPES
 from samplespace.services import audio_transform as audio_transform_service
 from samplespace.services import music_theory as music_theory_service
 from samplespace.services import sample as sample_service
@@ -60,6 +61,12 @@ async def transform_single_sample(
     will_stretch = target_bpm is not None and sample.bpm is not None
     skipped: list[str] = []
 
+    # Percussive/noise types: skip pitch-shifting (degrades quality), keep BPM stretching
+    if sample.sample_type and sample.sample_type.lower() in UNPITCHED_TYPES:
+        if will_pitch:
+            skipped.append("Pitch-shift skipped — percussive sample type.")
+        will_pitch = False
+
     if target_key and not sample.key:
         skipped.append("Key transformation skipped — sample has no detected key.")
     if target_bpm and not sample.bpm:
diff --git a/backend/src/samplespace/schemas/sample_type.py b/backend/src/samplespace/schemas/sample_type.py
index 62d0fb3..045e5d6 100644
--- a/backend/src/samplespace/schemas/sample_type.py
+++ b/backend/src/samplespace/schemas/sample_type.py
@@ -28,6 +28,20 @@ class SampleType(StrEnum):
 
 SAMPLE_TYPES: list[str] = sorted(t.value for t in SampleType)
 
+# Sample types where pitch-shifting is harmful or meaningless.
+# Percussive/noise-based — even as loops, shifting their pitch
+# degrades quality without musical benefit. BPM time-stretching still applies.
+UNPITCHED_TYPES: set[str] = {
+    SampleType.KICK,
+    SampleType.SNARE,
+    SampleType.HIHAT,
+    SampleType.CLAP,
+    SampleType.CYMBAL,
+    SampleType.PERCUSSION,
+    SampleType.DRUM,
+    SampleType.FX,
+}
+
 # Keyword-to-type mapping for inferring sample type from file paths.
 # Keys are SampleType enum members; values are directory/segment keywords
 # that map to that type (checked against lowercased path segments).
diff --git a/backend/src/samplespace/services/candidate_search.py b/backend/src/samplespace/services/candidate_search.py
index 8e50a85..15970ac 100644
--- a/backend/src/samplespace/services/candidate_search.py
+++ b/backend/src/samplespace/services/candidate_search.py
@@ -6,20 +6,10 @@
 """
 
 from samplespace.schemas.sample import SampleSchema
-from samplespace.schemas.sample_type import SampleType
+from samplespace.schemas.sample_type import UNPITCHED_TYPES
 from samplespace.schemas.thread import SongContext
 from samplespace.services.music_theory import normalize_bpm, semitone_key_score
 
-# Types that are typically one-shots (no meaningful key)
-ONE_SHOT_TYPES: set[str] = {
-    SampleType.KICK,
-    SampleType.SNARE,
-    SampleType.HIHAT,
-    SampleType.CLAP,
-    SampleType.PERCUSSION,
-    SampleType.FX,
-}
-
 # Re-ranking weight profiles: (clap, bpm, key)
 _TONAL_WEIGHTS = (0.4, 0.25, 0.35)
 _PERCUSSIVE_WEIGHTS = (0.5, 0.5, 0.0)
@@ -43,7 +33,7 @@ def build_clap_query(
     parts.append(f"{sample_type} sample")
 
     if song_context:
-        if sample_type.lower() not in ONE_SHOT_TYPES and song_context.key:
+        if sample_type.lower() not in UNPITCHED_TYPES and song_context.key:
             parts.append(song_context.key)
         if song_context.bpm:
             parts.append(f"{song_context.bpm} BPM")
@@ -67,7 +57,7 @@ def rerank_candidates(
 
     has_bpm = song_context.bpm is not None
     has_key = song_context.key is not None
-    is_tonal = sample_type.lower() not in ONE_SHOT_TYPES
+    is_tonal = sample_type.lower() not in UNPITCHED_TYPES
 
     w_clap, w_bpm, w_key = _TONAL_WEIGHTS if is_tonal else _PERCUSSIVE_WEIGHTS
 
diff --git a/backend/src/samplespace/services/kit_builder.py b/backend/src/samplespace/services/kit_builder.py
index 35427e4..88225d8 100644
--- a/backend/src/samplespace/services/kit_builder.py
+++ b/backend/src/samplespace/services/kit_builder.py
@@ -14,7 +14,7 @@
 from samplespace.models.sample import Sample
 from samplespace.schemas.kit import KitResult, KitSlot, PairwiseEntry
 from samplespace.schemas.sample import SampleSchema
-from samplespace.schemas.sample_type import SAMPLE_TYPES, SampleType
+from samplespace.schemas.sample_type import SAMPLE_TYPES, UNPITCHED_TYPES, SampleType
 from samplespace.schemas.thread import SongContext
 from samplespace.services import embedding as embedding_service
 from samplespace.services import music_theory as music_theory_service
@@ -267,8 +267,10 @@ def _fast_compatibility(sample_a: SampleSchema, sample_b: SampleSchema) -> float
         pair_key = frozenset({sample_a.sample_type.lower(), sample_b.sample_type.lower()})
         scores.append(TYPE_COMPLEMENTARITY.get(pair_key, DEFAULT_TYPE_SCORE))
 
-    # Key compatibility (only for loops with keys)
-    if sample_a.is_loop and sample_b.is_loop and sample_a.key and sample_b.key:
+    # Key compatibility (only for pitched loops with keys)
+    a_pitched = sample_a.sample_type and sample_a.sample_type.lower() not in UNPITCHED_TYPES
+    b_pitched = sample_b.sample_type and sample_b.sample_type.lower() not in UNPITCHED_TYPES
+    if sample_a.is_loop and sample_b.is_loop and sample_a.key and sample_b.key and a_pitched and b_pitched:
         value, _ = music_theory_service.key_compatibility_score(sample_a.key, sample_b.key)
         scores.append(value)
 
diff --git a/backend/src/samplespace/services/pair_scoring.py b/backend/src/samplespace/services/pair_scoring.py
index 6af4a03..ca18c90 100644
--- a/backend/src/samplespace/services/pair_scoring.py
+++ b/backend/src/samplespace/services/pair_scoring.py
@@ -5,7 +5,7 @@
 
 from samplespace.models.sample import Sample
 from samplespace.schemas.pair import DimensionScore, PairScore
-from samplespace.schemas.sample_type import SampleType
+from samplespace.schemas.sample_type import UNPITCHED_TYPES, SampleType
 from samplespace.services import music_theory as music_theory_service
 from samplespace.services.music_theory import normalize_bpm
 
@@ -67,7 +67,9 @@ async def score_pair(db: AsyncSession, sample_a_id: str, sample_b_id: str) -> Pa
 
     dimensions: dict[str, DimensionScore] = {}
 
-    if sample_a.is_loop and sample_b.is_loop and sample_a.key and sample_b.key:
+    a_pitched = sample_a.sample_type and sample_a.sample_type.lower() not in UNPITCHED_TYPES
+    b_pitched = sample_b.sample_type and sample_b.sample_type.lower() not in UNPITCHED_TYPES
+    if sample_a.is_loop and sample_b.is_loop and sample_a.key and sample_b.key and a_pitched and b_pitched:
         dimensions["key"] = _compute_key_score(sample_a.key, sample_b.key)
 
     if sample_a.is_loop and sample_b.is_loop and sample_a.bpm and sample_b.bpm:
diff --git a/docs/DEMO_WORKFLOWS.md b/docs/DEMO_WORKFLOWS.md
index de9c428..4e95bf3 100644
--- a/docs/DEMO_WORKFLOWS.md
+++ b/docs/DEMO_WORKFLOWS.md
@@ -33,8 +33,9 @@ The agent encodes this text description into a 512-dim CLAP embedding and finds
 **What to watch for:**
 
 - "Searching samples..." spinner → checkmark
-- Results list with sample filenames, types, keys, BPMs, and IDs
+- Results list with numbered indices (#1, #2, #3...), sample filenames, types, keys, and BPMs
 - If song context is set, the query is automatically enriched with the vibe (expand the tool call to see the enriched query)
+- Users can reference results naturally: "find more like #3" or "the second one sounds great"
 
 **Variations:**
 
@@ -44,9 +45,9 @@ The agent encodes this text description into a 512-dim CLAP embedding and finds
 
 ### Audio-to-Audio Similarity
 
-> "Find samples that sound like `[sample_id]`"
+> "Find samples that sound like #2"
 
-Uses a custom-trained CNN on mel spectrograms to find spectrally similar samples. This is true audio-to-audio similarity — the CNN learns library-specific spectral features that CLAP's text-audio space can't capture.
+Uses a custom-trained CNN on mel spectrograms to find spectrally similar samples. This is true audio-to-audio similarity — the CNN learns library-specific spectral features that CLAP's text-audio space can't capture. Reference any sample from a previous search result by its number.
 
 **What to watch for:**
 
@@ -73,9 +74,9 @@ The system generates a CLAP embedding for the upload and searches the library in
 
 ### Interactive Pair Evaluation
 
-> "Show me a pair to evaluate starting from `[sample_id]` — try matching it with a snare"
+> "Show me a pair to evaluate — match a kick with a snare"
 
-The agent finds candidates via CNN similarity (top 15), filters by the requested type, scores each candidate across key/BPM/type/spectral dimensions, and selects the candidate closest to a 0.6 score — plausible but not obvious, to make the evaluation interesting.
+The agent picks a kick from the library, finds snare candidates via CNN similarity (top 15), scores each candidate across key/BPM/type/spectral dimensions, and selects the candidate whose score is closest to 0.6: plausible but not obvious, which keeps the evaluation interesting.
 
 **What to watch for:**
 
@@ -90,9 +91,11 @@ The agent finds candidates via CNN similarity (top 15), filters by the requested
 
 ### Pitch and Tempo Transformation
 
-> "That pad sounds great but it's in the wrong key. Match `[sample_id]` to my song context."
+After searching for samples:
 
-The agent resolves the target key/BPM from the persisted song context, computes the semitone delta via circle-of-fifths logic, handles cross-mode transformations (major↔minor via relative keys), and runs pitch-shift/time-stretch.
+> "That pad sounds great but it's in the wrong key. Transform #4 to match my song context."
+
+The agent resolves the target key/BPM from the persisted song context, computes the semitone delta via circle-of-fifths logic, handles cross-mode transformations (major↔minor via relative keys), and runs pitch-shift/time-stretch. Percussive types (kick, snare, hihat, clap, cymbal, percussion, drum, fx) skip pitch-shifting and receive only BPM time-stretching, since pitch-shifting degrades transient-heavy content.
 
 **What to watch for:**
 
@@ -123,12 +126,13 @@ A 6-step workflow demonstrating conversational memory and context-awareness acro
 
 - Agent calls `search_by_description` — expand the tool call to see the query enriched with "warm and dusty" vibe automatically
 - Results are influenced by the persisted vibe context
+- Each result is numbered (#1, #2, #3...) for easy reference
 
 **Step 3 — Analyze and check compatibility:**
 
-> "What key is `[bass_id]` in? Will it work with a pad in C major?"
+> "What key is #1 in? Will it work with a pad in C major?"
 
-- Agent calls `analyze_sample` then `check_key_compatibility`
+- Agent resolves #1 from the search results, then calls `analyze_sample` and `check_key_compatibility`
 - Two sequential tool calls, each with spinner → checkmark
 - Key compatibility explains the circle-of-fifths distance and whether the keys are relative major/minor pairs
 
@@ -145,9 +149,9 @@ A 6-step workflow demonstrating conversational memory and context-awareness acro
 > "Transform the kit to match my song context"
 
 - Agent calls `transform_kit` with the slots from step 4, resolving targets from song context (A minor, 85 BPM)
-- Kit block re-renders with transformed audio URLs — each loop is pitch-shifted and/or time-stretched
+- Kit block re-renders with transformed audio URLs — tonal loops are pitch-shifted and/or time-stretched; percussive loops (drums, hihats, etc.) are only time-stretched (pitch-shifting is skipped to preserve transient quality)
 - Response lists per-slot transforms (e.g., "bass: D minor → A minor (-5 semitones), 90 → 85 BPM")
-- One-shots are included as-is (no transform needed)
+- One-shots are included as-is (no transform needed); percussive loops note "Pitch-shift skipped — percussive sample type."
 
 **Step 6 — Preview the full kit:**
 
@@ -170,20 +174,27 @@ A feedback loop: find samples, explore neighbors, evaluate pairs, build system k
 > "Find me aggressive, distorted kicks"
 
 - Agent calls `search_by_description` — CLAP semantic search
-- Results render as playable sample cards with waveforms
+- Results render as numbered, playable sample cards with waveforms
+
+**Step 2 — Explore neighbors:**
+
+> "Find more samples that sound like #1"
 
-**Step 2 — Inspect in the detail view:**
+- Agent resolves #1 from the previous results, then calls `find_similar_samples` — CNN audio-to-audio similarity
+- New results are also numbered for continued referencing
 
-Navigate to the Sample Library page and click the magnifying glass on the kick you found.
+**Step 3 — Inspect in the detail view:**
+
+Navigate to the Sample Library page and click the magnifying glass on one of the kicks.
 
 - Detail panel opens alongside the list with full metadata, waveform, and mel spectrogram
 - Toggle to "CNN View" to see the exact 2-second, 128-mel-bin input the CNN processes
 - Scroll down to "Similar Samples" — these are the CNN's nearest spectral neighbors with similarity percentages
 - Play similar samples inline to audition them without leaving the panel
 
-**Step 3 — Evaluate a pairing:**
+**Step 4 — Evaluate a pairing:**
 
-> "Show me a pair to evaluate with `[kick_id]` — try matching it with a snare"
+> "Show me a pair to evaluate — match that kick with a snare"
 
 - Agent calls `present_pair` with candidate_type=snare
 - Candidates are found via CLAP search enriched with song context (vibe, genre, key, BPM)
@@ -191,7 +202,7 @@ Navigate to the Sample Library page and click the magnifying glass on the kick y
 - **Play Together** button layers both samples for audition as a mix
 - Click "Works" or "Doesn't work"
 
-**Step 4 — Rapid pairing mode:**
+**Step 5 — Rapid pairing mode:**
 
 > "Start a pairing session with kicks and basses"
 
@@ -201,7 +212,7 @@ Navigate to the Sample Library page and click the magnifying glass on the kick y
 - Each pair uses a new random anchor for diverse training data
 - Repeat rapidly to build up verdicts — the preference model auto-trains after 15+
 
-**Step 5 — Check what the system learned:**
+**Step 6 — Check what the system learned:**
 
 After 15+ verdicts (mix of approvals and rejections):
 
@@ -248,9 +259,9 @@ Attach a WAV via the paperclip button, then:
 
 **Key compatibility:** *"Are D minor and F major compatible?"* — circle-of-fifths check. Response explains they're relative major/minor pairs (highly compatible, score 0.95).
 
-**Complement suggestion:** *"Suggest a bass that complements `[pad_id]`"* — CLAP search + key/BPM filtering. Results show key compatibility annotations (checkmarks for same/relative keys).
+**Complement by reference:** *"Suggest a bass that complements #3"* — after a search, reference any result by number. CLAP search + key/BPM filtering. Results show key compatibility annotations.
 
-**Rate a pair:** *"How compatible are `[sample_a_id]` and `[sample_b_id]`?"* — multi-dimensional breakdown showing key, BPM, type complementarity, and spectral scores with a natural-language summary.
+**Rate a pair:** *"How compatible are #1 and #5?"* — multi-dimensional breakdown showing key, BPM, type complementarity, and spectral scores with a natural-language summary.
 
 **Sample detail view:** Navigate to the Sample Library page, click the magnifying glass on any sample card. The list splits to reveal a detail panel with full metadata, interactive waveform, mel spectrogram (toggle between Full and CNN View to see what the model sees during inference), and CNN-similar samples ranked by similarity percentage. Play similar samples inline to audition them.
 
@@ -258,7 +269,9 @@ Attach a WAV via the paperclip button, then:
 
 **Context-aware search:** *"I'm in G major at 140 BPM. Find me an uplifting lead"* — sets song context then searches with vibe enrichment, all in one turn.
 
-**Transform a kit:** *"Transform the kit to match my song context"* — pitch-shifts and time-stretches all loops in the kit to the target key/BPM. One-shots pass through unchanged.
+**Transform by reference:** *"Transform #2 to match my song context"* — pitch-shifts and/or time-stretches the sample to the target key/BPM. Percussive types skip pitch-shifting (only BPM-stretched). Listen to the result inline.
+
+**Transform a kit:** *"Transform the kit to match my song context"* — pitch-shifts and time-stretches tonal loops in the kit to the target key/BPM. One-shots pass through unchanged; percussive loops are only time-stretched (pitch-shift skipped).
 
 **Preview a kit:** *"Let me hear the full kit together"* — layers all kit samples into a single mixed audio preview for auditioning the full arrangement.
 
@@ -267,7 +280,8 @@ Attach a WAV via the paperclip button, then:
 ## Tips for Presenters
 
 - **Start fresh:** Each workflow assumes a new chat thread (no prior song context) unless noted. Click the new chat button in the sidebar.
-- **Sample IDs:** Replace `[sample_id]` placeholders with actual IDs from your library — every tool result includes sample IDs you can reference.
+- **Reference by number:** Search results are numbered (#1, #2, #3...). Use these in follow-up prompts: "find more like #2", "transform #4 to match my context", "how compatible are #1 and #3?"
+- **Reference by name:** You can also use filenames: "transform warm-pad.wav to match my context". The agent will look it up.
 - **Expand tool calls:** Click the collapsible tool call indicators to show input parameters and raw output. This demonstrates the agent's reasoning and the multi-modal retrieval pipeline.
 - **Audio playback:** Click waveforms to play samples; scrub by clicking along the waveform. Multiple samples can be played in sequence.
 - **Pair verdicts:** The thumbs up/down buttons auto-send a message to the agent — you don't need to type anything after clicking.
diff --git a/docs/preference-learning-flow.md b/docs/preference-learning-flow.md
index e71747e..f6e338f 100644
--- a/docs/preference-learning-flow.md
+++ b/docs/preference-learning-flow.md
@@ -49,4 +49,4 @@ flowchart TD
 | 9 | spectral_centroid_gap | pair_features | [0, 1] | Normalized centroid difference |
 | 10 | rms_energy_ratio | pair_features | [0, 1] | Normalized log energy ratio |
 
-Missing pair scores (e.g., key/BPM for one-shots) are imputed as 0.5 (neutral midpoint).
+Missing pair scores (e.g., key for one-shots or unpitched types, BPM for one-shots) are imputed as 0.5 (neutral midpoint).
diff --git a/frontend/components/elements/sample-card.tsx b/frontend/components/elements/sample-card.tsx
index d8e76fb..879c7fa 100644
--- a/frontend/components/elements/sample-card.tsx
+++ b/frontend/components/elements/sample-card.tsx
@@ -12,6 +12,7 @@ export interface SamplePayload {
   type?: string;
   key?: string;
   bpm?: number;
+  index?: number;
 }
 
 interface SampleCardProps {
@@ -47,6 +48,11 @@ export function SampleCard({
       }`}
     >
       <div className="flex items-center gap-2">
+        {sample.index != null && (
+          <span className="flex size-5 shrink-0 items-center justify-center rounded-full bg-muted font-medium text-muted-foreground text-xs">
+            {sample.index}
+          </span>
+        )}
         {onTogglePlay && (
           <Button
             className="size-7 shrink-0 rounded-full"