Add difficulty-controlled Sample-N preference generation pipeline by EMZEDI · Pull Request #209 · ComplexData-MILA/AIF-Gen

EMZEDI · 2026-05-14T22:41:15Z

Summary

This PR implements an opt-in difficulty-controlled preference data generation pipeline for AIF-Gen using:

- RLCD-style persona-conditioned candidate prompting
- HelpSteer2-style rubric scoring
- West-of-N-style margin-based pair selection

The pipeline runs in three stages, Sample-N -> Score -> Select-by-Margin, and introduces two primary knobs:

- n_candidates: trades cost against coverage
- target_margin: 0 is hardest, 1 is easiest

Legacy behavior is preserved when no generation config is provided.
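
To make the Select-by-Margin step and the target_margin knob concrete, here is a minimal sketch. It is not the code in pair_selector.py: Candidate, pick_pair_by_margin, the [0, 1] score scale, and the default filter values are all assumptions made for illustration, and the persona-aware tie-break is left out.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for the PR's ScoredCandidate."""

    text: str
    score: float  # assumed: rubric score normalised to [0, 1]


def pick_pair_by_margin(
    candidates: List[Candidate],
    target_margin: float,
    min_margin: float = 0.0,
    max_margin: float = 1.0,
    max_length_ratio: float = 3.0,  # assumed default, not taken from the PR
) -> Optional[Tuple[Candidate, Candidate]]:
    """Return (chosen, rejected) whose score gap is closest to target_margin."""
    best: Optional[Tuple[Candidate, Candidate]] = None
    best_gap = float("inf")
    for a, b in combinations(candidates, 2):
        # The higher-scored candidate becomes the chosen response.
        chosen, rejected = (a, b) if a.score >= b.score else (b, a)
        margin = chosen.score - rejected.score
        # Skip ties and pairs outside the allowed margin band.
        if margin <= 0 or not (min_margin <= margin <= max_margin):
            continue
        # Skip pairs whose lengths differ too much (length-ratio filter).
        longer = max(len(chosen.text), len(rejected.text))
        shorter = max(1, min(len(chosen.text), len(rejected.text)))
        if longer / shorter > max_length_ratio:
            continue
        gap = abs(margin - target_margin)
        if gap < best_gap:
            best, best_gap = (chosen, rejected), gap
    return best
```

In this sketch, candidates scored 0.9, 0.8 and 0.2 yield the 0.9/0.8 pair at target_margin=0.0 (hardest) and the 0.9/0.2 pair at target_margin=1.0 (easiest). None is returned when no pair survives the filters.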

What was added

New pair selector module:
- aif_gen/generate/mappers/pair_selector.py
- ScoredCandidate dataclass
- select_pair(...) with:
  - target margin control
  - min/max margin filters
  - length-ratio filter
  - persona-aware tie-break
Response mapper enhancements:
- generate_candidate_prompt(...) with personas:
  - aligned
  - anti_aligned
  - neutral
- default_persona_schedule(...)
- exported persona constants in aif_gen/generate/mappers/__init__.py
Rubric judge additions in aif_gen/validation/llm_judge.py:
- _get_rubric_judge_prompt(...)
- _RubricResponse schema
- _get_rubric_score(...)
- weighted aggregation defaults:
  - preference_adherence: 0.6
  - objective_fidelity: 0.25
  - coherence: 0.15
Generation engine refactor (aif_gen/generate/engine.py):
- generate_continual_dataset(..., generation_config=None)
- new sampled path _generate_sample_sampled(...)
- parallel candidate sampling + rubric scoring + margin selection
- dedicated rubric cache index when sampled pipeline is enabled
- legacy path unchanged when generation_config is absent
CLI wiring (aif_gen/cli/commands/generate.py):
- added flags:
  - --n-candidates
  - --target-margin
- reads optional YAML generation: block
- CLI values override YAML values
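
For the rubric judge's weighted aggregation defaults listed above (0.6 / 0.25 / 0.15), a rough sketch of how three per-axis scores could collapse into one scalar follows. The function name aggregate_rubric, the weighted-mean formula, and the assumption that all axes share one scale are illustrative; the actual _get_rubric_score(...) may differ.

```python
from typing import Dict, Mapping

# Default weights as listed in this PR description.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "preference_adherence": 0.6,
    "objective_fidelity": 0.25,
    "coherence": 0.15,
}


def aggregate_rubric(
    scores: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Hypothetical helper: weighted mean of per-axis rubric scores.

    Assumes every axis is scored on the same scale, so the result
    stays on that scale as well.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing rubric axes: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Dividing by the weight sum keeps the result on the input scale
    # even if custom weights do not add up to 1.
    return sum(weights[axis] * scores[axis] for axis in weights) / total_weight
```

A scalar on a known scale is what lets the selector compare the score gap between two candidates against target_margin.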

Tests

Added:
- test/test_generate/test_pair_selector.py
Extended:
- test/test_generate/test_response_mapper.py

Validation run:

pytest test/ -x -q -> 197 passed

Notes

Existing generation configs remain backward compatible.
Sampled pipeline is opt-in via CLI flags or YAML generation: block.
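
The opt-in and precedence rules (CLI over YAML, legacy path when neither is given) can be sketched as below. The helper resolve_generation_config is hypothetical and takes an already-parsed YAML generation: block as a dict; treating an empty block as absent is an assumption, not confirmed behavior of generate.py.

```python
from typing import Any, Dict, Optional


def resolve_generation_config(
    yaml_block: Optional[Dict[str, Any]],
    n_candidates: Optional[int] = None,
    target_margin: Optional[float] = None,
) -> Optional[Dict[str, Any]]:
    """Hypothetical merge of the YAML generation: block with CLI flags."""
    cli = {"n_candidates": n_candidates, "target_margin": target_margin}
    # Compare against None rather than truthiness: target_margin=0.0
    # is a meaningful value (hardest pairs) and must not be dropped.
    cli_set = {key: value for key, value in cli.items() if value is not None}
    if not yaml_block and not cli_set:
        return None  # nothing configured: stay on the legacy path
    merged = dict(yaml_block or {})
    merged.update(cli_set)  # CLI values override YAML values
    return merged
```

For example, a YAML block with n_candidates: 4 and target_margin: 0.5 combined with --target-margin 0.0 resolves to n_candidates=4, target_margin=0.0.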

Implement RLCD-style persona-conditioned candidate sampling, rubric-based scoring, and West-of-N margin pair selection.

- Add pair selector with margin/length constraints
- Add persona prompt generation and default schedule
- Add rubric judge prompt/schema/scoring helper
- Add opt-in generation_config path in engine
- Add CLI knobs (--n-candidates, --target-margin) with YAML generation block support
- Add/extend tests for selector and response mapper
- Preserve legacy generation behavior when generation config is absent

Francodbalso self-requested a review May 15, 2026 17:59

EMZEDI wants to merge 1 commit into main from feat/sample-n-difficulty-pipeline
