hud-evals/hud-blank

Blank Environment

A minimal HUD environment template to use as a starting point for building your own environments.

Setup

uv sync
hud set HUD_API_KEY=your-key-here   # CLI auth, get one at hud.ai/project/api-keys

Deploy & Run

hud deploy .                              # deploy the environment (once)
hud sync tasks <taskset-name>             # push tasks to a taskset (fast, re-run on every task change)
hud eval <taskset-name> --remote --full

Iteration loop: hud deploy is the slow step, so run it once. After that, edit tasks.py and re-run hud sync tasks, which takes seconds. Redeploy only when env.py or the Dockerfile changes.
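The rule above can be sketched as a small helper that maps changed files to the commands worth re-running. This is only an illustration under stated assumptions: `commands_to_run` is a hypothetical function that is not part of the hud CLI, matching on bare file names is an assumption, and `my-taskset` is a placeholder taskset name.

```python
from pathlib import PurePath

# Files whose changes require the slow redeploy, per the iteration loop above.
REDEPLOY_TRIGGERS = {"env.py", "Dockerfile"}
# Files whose changes only require the fast task sync.
SYNC_TRIGGERS = {"tasks.py"}


def commands_to_run(changed_files: list[str], taskset: str) -> list[str]:
    """Return the hud commands to re-run for a set of changed files.

    Hypothetical helper: it matches on file names only and ignores directories.
    """
    names = {PurePath(path).name for path in changed_files}
    commands = []
    if names & REDEPLOY_TRIGGERS:
        commands.append("hud deploy .")
    if names & SYNC_TRIGGERS:
        commands.append(f"hud sync tasks {taskset}")
    return commands


# Editing only tasks.py needs just the fast sync step.
print(commands_to_run(["tasks.py"], "my-taskset"))
```

Changing env.py or the Dockerfile adds `hud deploy .` to the list, and files outside both sets produce no commands.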

See Deploy & Go Remote for deploy flags, secrets, and auto-deploy options.

Scenarios

count-letters

Count occurrences of a letter in a word. No tools — pure text reasoning.

env("count-letters", word="strawberry", letter="r")
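As a reference point, the expected answer for this example can be computed directly in Python. This is a minimal sketch, not the template's grader: `count_letter` is a hypothetical helper, and case-sensitive matching is an assumption.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-sensitive occurrences of a single character in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.count(letter)


# Same inputs as the scenario call above.
print(count_letter("strawberry", "r"))  # prints 3
```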

evaluate-expression

Compute a math expression using calculator tools (add, subtract, multiply). The value starts at 0 and the agent must use the tools to arrive at the answer.

env("evaluate-expression", expression="3 + 2 * 3", expected=9)
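One possible tool sequence for this example is sketched below. The `Calculator` class is a hypothetical stand-in for the scenario's tools, assumed to keep a single running value that each call updates; the real tool signatures in env.py may differ.

```python
class Calculator:
    """Hypothetical stand-in for the add, subtract, and multiply tools."""

    def __init__(self) -> None:
        self.value = 0  # the scenario description says the value starts at 0

    def add(self, n: int) -> int:
        self.value += n
        return self.value

    def subtract(self, n: int) -> int:
        self.value -= n
        return self.value

    def multiply(self, n: int) -> int:
        self.value *= n
        return self.value


# 3 + 2 * 3: multiplication binds tighter, so build 2 * 3 first, then add 3.
calc = Calculator()
calc.add(2)       # 0 + 2 = 2
calc.multiply(3)  # 2 * 3 = 6
calc.add(3)       # 6 + 3 = 9
print(calc.value)  # prints 9
```

Applying the operations left to right without regard to precedence, (3 + 2) * 3, would give 15 instead of the expected 9.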

Documentation

To learn more about tasks, evaluations, and running at scale, see the full docs.
