Improvements to discrete action samplers #403

Closed
alexnikulkov wants to merge 1 commit into facebookresearch:master from alexnikulkov:export-D26676495

@alexnikulkov
Contributor

Summary:

  1. Add support for decaying temperature to `SoftmaxActionSampler`
  2. Make sure we don't sample invalid actions in `EpsilonGreedyActionSampler` (indicated by hugely negative scores)
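
To illustrate item 1, here is a minimal plain-Python sketch of softmax sampling with a decaying temperature. It is not the code from this PR: the class name `DecayingSoftmaxSampler`, the multiplicative decay schedule, and the parameters `decay` and `min_temperature` are all assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence


class DecayingSoftmaxSampler:
    """Hypothetical sampler: softmax over scores, temperature decays per sample."""

    def __init__(
        self,
        temperature: float = 1.0,
        decay: float = 0.99,            # assumed multiplicative schedule
        min_temperature: float = 0.05,  # assumed floor to avoid division by ~0
        seed: int = 0,
    ) -> None:
        if temperature <= 0.0 or min_temperature <= 0.0:
            raise ValueError("temperatures must be positive")
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.temperature = temperature
        self.decay = decay
        self.min_temperature = min_temperature
        self._rng = random.Random(seed)

    def probabilities(self, scores: Sequence[float]) -> List[float]:
        # Scale by the current temperature, then subtract the max before
        # exponentiating so that large scores do not overflow.
        scaled = [s / self.temperature for s in scores]
        top = max(scaled)
        weights = [math.exp(s - top) for s in scaled]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self, scores: Sequence[float]) -> int:
        probs = self.probabilities(scores)
        action = self._rng.choices(range(len(scores)), weights=probs, k=1)[0]
        # Decay after every sampling step, never dropping below the floor.
        self.temperature = max(self.temperature * self.decay, self.min_temperature)
        return action
```

As the temperature shrinks, the distribution concentrates on the highest-scoring action, so the sampler drifts from exploration toward greedy behavior over time.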

Differential Revision: D26676495

Summary:
1. Add support for decaying temperature to `SoftmaxActionSampler`
2. Make sure we don't sample invalid actions in `EpsilonGreedyActionSampler` (indicated by hugely negative scores)
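
Item 2 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the PR's implementation: the cutoff `INVALID_SCORE_THRESHOLD`, the helper `valid_actions`, and the function `epsilon_greedy_sample` are hypothetical names, and the exact value used to mark invalid actions is not given in the summary.

```python
import random
from typing import List, Sequence

# Assumed cutoff: scores at or below this value mark an action as invalid.
INVALID_SCORE_THRESHOLD = -1e9


def valid_actions(scores: Sequence[float]) -> List[int]:
    """Return the indices of actions whose scores are not masked out."""
    return [i for i, s in enumerate(scores) if s > INVALID_SCORE_THRESHOLD]


def epsilon_greedy_sample(
    scores: Sequence[float], epsilon: float, rng: random.Random
) -> int:
    """Epsilon-greedy choice that never returns a masked-out action."""
    candidates = valid_actions(scores)
    if not candidates:
        raise ValueError("all actions are masked out")
    if rng.random() < epsilon:
        # Explore: uniform over valid actions only, not over all actions.
        return rng.choice(candidates)
    # Exploit: highest-scoring valid action.
    return max(candidates, key=lambda i: scores[i])


rng = random.Random(0)
scores = [0.5, -1e10, 2.0, -1e10]
samples = [epsilon_greedy_sample(scores, 0.5, rng) for _ in range(1000)]
```

Restricting the exploration branch to valid indices is the key point: sampling uniformly over every action would pick an invalid one with probability proportional to epsilon.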

Differential Revision: D26676495

fbshipit-source-id: 03b71a1dd8eaf6b3f0e6d5a3310966f9874dfbef

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D26676495

@facebook-github-bot

This pull request has been merged in 5fd7243.

xuruiyang pushed a commit that referenced this pull request Sep 20, 2025
Summary:
Pull Request resolved: #403

1. Add support for decaying temperature to `SoftmaxActionSampler`
2. Make sure we don't sample invalid actions in `EpsilonGreedyActionSampler` (indicated by hugely negative scores)

Reviewed By: czxttkl

Differential Revision: D26676495

fbshipit-source-id: 4248fc0b979be484252a2baa73690242e66e78e1
2 participants