Add RFC 000 #44
| This project aims at standardizing environments for both training and evaluation. In the training space, this means also standardizing reward pipelines, while in the eval space this means helping with reproducibility where a model can be shipped with a complete set of agentic evals that can be easily run by others. | ||
| ### The problem with abstraction boundaries | ||
| Ideally, we would draw a boundary between environments and everything else (orchestration, resource allocation, RPCs, etc.). We will try to do this as much as possible, but we will have to create additional interfaces so that folks who want to cross this boundary can. This will likely be necessary for things like reward pipelines that call reward models (which will very likely need to make RPCs to GPU machines), as well as for agentic evals like Tau, where the eval itself involves two agents interacting with one another (and sending many RPCs). |
Let's add: interfaces for container providers, which we will also need to support.
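To make the suggestion concrete, here is a rough sketch of what a container-provider interface could look like. Everything in it is hypothetical: `ContainerProvider`, `FakeProvider`, `run_in_env`, and the `start`/`exec`/`stop` method names are illustrative assumptions and do not come from the RFC.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ContainerProvider(Protocol):
    """Hypothetical interface; the method names are assumptions, not RFC text."""

    def start(self, image: str) -> str: ...
    def exec(self, container_id: str, argv: list[str]) -> str: ...
    def stop(self, container_id: str) -> None: ...


@dataclass
class FakeProvider:
    """In-memory stand-in for tests; it records calls and runs nothing."""

    running: dict = field(default_factory=dict)
    _next_id: int = 0

    def start(self, image: str) -> str:
        container_id = f"fake-{self._next_id}"
        self._next_id += 1
        self.running[container_id] = image
        return container_id

    def exec(self, container_id: str, argv: list[str]) -> str:
        image = self.running[container_id]  # KeyError if the container is not running
        return f"[{image}] {' '.join(argv)}"

    def stop(self, container_id: str) -> None:
        self.running.pop(container_id, None)


def run_in_env(provider: ContainerProvider, image: str, argv: list[str]) -> str:
    """Environment-side code depends only on the interface, not on a backend."""
    container_id = provider.start(image)
    try:
        return provider.exec(container_id, argv)
    finally:
        provider.stop(container_id)
```

With a shape like this, a Docker-backed or remote provider could be swapped in without touching environment code, as long as it offers the same three methods.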
| 2. Nailing our tools support | ||
| 3. Landing the basics of _sandboxing_, _versioning_, _binary distribution_, _dependency management_. | ||
| We will conclude this phase with version 0.3. |
Is it a convention to bump by 0.3 for every phase? Just curious.
Eh, I just figured there are 3 phases before 1.0, so let's do 0.3 per phase :D At that point, we have 0.9 --> 1.0 for the final changes.
| In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on: | ||
| 1. Establishing a convention on what is an environment and where we draw the "environment" box. | ||
| 2. Nailing our tools support |
Suggestion: we can be precise here, with "tools" meaning MCP as well as local tools.
| We will group development from now until version 1.0 into three phases. | ||
| In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on: | ||
| 1. Establishing a convention on what is an environment and where we draw the "environment" box. |
One more thing for us to cover in phase 1 is the RPC method. In the current iteration we only have HTTP, but it's possible that we will need a long-running session instead of request/response. This pattern applies to any interpreted language: bash, python, ruby, etc. We have taken an opinionated approach with pythonExec, but I don't think we can skip bash or other languages.
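To illustrate the difference, here is a minimal sketch contrasting a stateless request/response call with a long-running session that keeps one interpreter process alive. It assumes `bash` is on the PATH; `run_once` and `ShellSession` are hypothetical names for illustration and are not part of the project or of pythonExec.

```python
import subprocess
import uuid


def run_once(command: str) -> str:
    """Request/response style: a fresh bash per call, so no state survives."""
    result = subprocess.run(["bash", "-c", command], capture_output=True, text=True)
    return result.stdout


class ShellSession:
    """Hypothetical long-running session: one bash process reused across calls."""

    def __init__(self) -> None:
        self.proc = subprocess.Popen(
            ["bash"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )

    def run(self, command: str) -> str:
        # A unique marker tells us where this command's output ends.
        marker = f"__done_{uuid.uuid4().hex}__"
        self.proc.stdin.write(f"{command}\necho {marker}\n")
        self.proc.stdin.flush()
        chunks = []
        while True:
            line = self.proc.stdout.readline()
            if not line:  # the shell exited before printing the marker
                break
            stripped = line.rstrip("\n")
            if stripped.endswith(marker):
                # Keep any output that did not end in a newline before the marker.
                chunks.append(stripped[: -len(marker)])
                break
            chunks.append(line)
        return "".join(chunks)

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()


if __name__ == "__main__":
    session = ShellSession()
    session.run("X=41")                    # state set in one call...
    print(session.run("echo $((X + 1))"))  # ...is visible in the next
    session.close()
    run_once("X=41")
    print(run_once("echo ${X:-unset}"))    # a fresh process has no X
```

The same shape would carry over to a Python or Ruby REPL. The marker-based framing is only one way to delimit output; a real protocol would also need timeouts and separate stderr handling.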
…e.md Add RFC 000