feat: NeMo RL: don't do on-policy fixes during validation #534

@bxyu-nvidia

Description

Use cases, pain points, and background
Why should we do this? Why is this needed or wanted?

Description:
What should we do?

Design:
What files should be touched? What logic should be written?

Out of scope:
What are some items that this issue could be mistaken to cover that this issue should explicitly NOT cover?

Acceptance Criteria:

  • Individual items that need to be finished in order for this issue to be considered completed

Metadata

Assignees

Labels

No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
