Tinker Integration by ultmaster · Pull Request #245 · microsoft/agent-lightning

ultmaster · 2025-10-30T04:24:54Z

This pull request adds a new example integration for using Tinker as a backend training service with Agent-lightning. It introduces a comprehensive bridge package (agl_tinker) that adapts Agent-lightning rollouts and datasets to Tinker’s reinforcement learning workflow, allowing seamless fine-tuning and evaluation workflows. The changes include documentation, environment setup, and core adapter modules that enable interoperability between the two frameworks.

New Example Integration and Documentation:

Added a new tinker example to the catalog, with a clear note on its unmaintained status and compatibility with Agent-lightning v0.2.1.
Provided a detailed README.md in examples/tinker/ explaining the integration, setup instructions, workflow differences, troubleshooting, and included files.
Added an .env.example template for environment variables required to run the Tinker integration, including keys for Tinker, OpenAI, WANDB, and CrewAI telemetry settings.

Core Bridge Package (agl_tinker/) Implementation:

Introduced agl_tinker/algo.py, implementing an Algorithm wrapper that plugs Tinker’s training loop into Agent-lightning’s resource management and dataset system.
Added agl_tinker/env.py, which provides adapters for Tinker’s RL environment and dataset builders, allowing Agent-lightning tasks to be used in Tinker workflows without modifying rollout logic.
Included license headers in new package files for compliance.

+        An available port number.
+    """
+    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+        s.bind(("", 0))


To fix the issue, the code should avoid binding the socket to all interfaces ("" or "0.0.0.0"). In the context of port discovery, it is safer and functionally equivalent to bind the socket to the loopback address (127.0.0.1), which restricts the socket to local traffic only and avoids any potential security risk. In file examples/tinker/hello.py, replace s.bind(("", 0)) with s.bind(("127.0.0.1", 0)) in the _find_available_port() function. No other changes, methods, or imports are needed.

+        An available port number.
+    """
+    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+        s.bind(("", 0))


To fix this problem, change the s.bind(("", 0)) line in _find_available_port() so it binds to the loopback interface only, by specifying '127.0.0.1' instead of the empty string. This restricts the socket to accept connections only from the local machine during the port discovery. No additional imports or significant code changes are needed; just one argument needs updating in the identified function in examples/tinker/q20_train.py at line 61.
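Both findings above ask for the same one-line change. A minimal sketch of a loopback-only port probe follows. The function name `find_available_port` and the `getsockname()` return are assumptions, since the diff excerpt ends at the `bind` call.

```python
import socket


def find_available_port() -> int:
    """Ask the OS for a free TCP port, binding to loopback only."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 lets the OS choose any free port; "127.0.0.1" keeps the
        # temporary socket unreachable from other hosts.
        s.bind(("127.0.0.1", 0))
        # Assumed return path: read back the port the OS assigned.
        return s.getsockname()[1]


if __name__ == "__main__":
    print(find_available_port())
```

The port is released as soon as the `with` block exits, so another process could claim it before the caller binds to it. That race exists with either bind address.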

Copilot

Pull Request Overview

This PR adds a comprehensive Tinker integration example to Agent-lightning, enabling reinforcement learning fine-tuning using Tinker as the backend training service. The integration bridges Agent-lightning's rollout architecture with Tinker's training infrastructure.

Key changes:

Added new Tinker integration example with bridge code (agl_tinker/) to connect Agent-lightning with Tinker's RL training
Implemented two example agents: a minimal "Hello" agent and a complex 20 Questions game using CrewAI
Added wandb dependency to the tinker extra in project configuration

Reviewed Changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.

File	Description
`uv.lock`	Added wandb dependency for tinker extra
`pyproject.toml`	Added wandb to tinker dependencies
`examples/tinker/README.md`	Comprehensive documentation for Tinker integration
`examples/tinker/hello.py`	Minimal training example demonstrating identity repetition
`examples/tinker/q20_*.py`	20 Questions game implementation with training and evaluation scripts
`examples/tinker/agl_tinker/*.py`	Bridge package connecting Agent-lightning with Tinker
`examples/tinker/.env.example`	Environment variable template
`examples/README.md`	Updated examples catalog with Tinker entry

Copilot · 2025-10-30T04:27:07Z

+        rew = 1.0
+    elif ("not " + task) in content_lower:
+        rew = -1.0
+    elif ("you're" + task) in content_lower or ("you are" + task) in content_lower:


Missing space in the string concatenation for the "you're" check. It should be ("you're " + task) and ("you are " + task) so that it matches phrases like "you're 42" rather than "you're42".

Suggested change

-    elif ("you're" + task) in content_lower or ("you are" + task) in content_lower:
+    elif ("you're " + task) in content_lower or ("you are " + task) in content_lower:
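The effect of the missing space can be seen by isolating the substring check. This is only a sketch: the helper names `matches_without_space` and `matches_with_space` are hypothetical and not part of the PR, and the surrounding reward values are left out because the excerpt does not show them all.

```python
def matches_without_space(content: str, task: str) -> bool:
    """Original check: builds "you're42", which natural text rarely contains."""
    content_lower = content.lower()
    return ("you're" + task) in content_lower or ("you are" + task) in content_lower


def matches_with_space(content: str, task: str) -> bool:
    """Suggested check: builds "you're 42", matching normally spaced text."""
    content_lower = content.lower()
    return ("you're " + task) in content_lower or ("you are " + task) in content_lower


if __name__ == "__main__":
    reply = "I think you're 42."
    print(matches_without_space(reply, "42"))
    print(matches_with_space(reply, "42"))
```

Because the content is lowercased before matching, `task` is assumed to be lowercase as well.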

Copilot · 2025-10-30T04:27:08Z

+                len(val_indices),
+            )
+            train_dataset = [train_dataset[i] for i in train_indices]
+            val_dataset = [train_dataset[i] for i in val_indices]


The validation dataset is being created from the wrong source. It should use the original train_dataset variable from line 233, but after line 251, train_dataset has been reassigned to a list. This line should reference the original dataset before reassignment. Store the original dataset in a separate variable to avoid this issue.
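One way to apply this suggestion is to keep a reference to the unsplit dataset and index both subsets from it. The sketch below assumes a plain indexable dataset; the names `split_by_indices` and `full_dataset` are hypothetical and do not come from the PR.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_indices(
    dataset: Sequence[T],
    train_indices: Sequence[int],
    val_indices: Sequence[int],
) -> Tuple[List[T], List[T]]:
    """Build train and validation subsets from the same original dataset."""
    # Keep the original under its own name so that rebinding a
    # `train_dataset` variable later cannot change what val indexes into.
    full_dataset = dataset
    train_dataset = [full_dataset[i] for i in train_indices]
    val_dataset = [full_dataset[i] for i in val_indices]
    return train_dataset, val_dataset


if __name__ == "__main__":
    train, val = split_by_indices(list("abcdef"), [0, 1, 2, 3], [4, 5])
    print(train, val)
```

In the original ordering, the validation indices are applied to the already-filtered training list, so they either select the wrong items or raise `IndexError` when an index exceeds the shortened list.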

ultmaster added 30 commits October 20, 2025 14:43

add tinker dependency

87798cc

.

20b13f1

.

05748c5

.

3ac0e1f

fix litellm proxy

978d9f4

.

9007fb2

copy rl_train

e7ef326

minimize

b878de1

.

8a002f8

.

38aa5fd

.

2bdf7ff

.

a2c9b0a

test with hello 1234567

6bf00d8

Merge branch 'main' of github.com:microsoft/agent-lightning into tinker

580bc13

move files

9b6ca72

update test llm

5d80825

.

faa46db

update test llm

46bdd2b

.

99f1d7a

.

6d912b8

.

a9271bf

.

c88e0cf

hello example running

0a5f8e0

Merge branch 'main' of github.com:microsoft/agent-lightning into tinker

0e3244b

update twenty question nouns dataset

8c75644

update uv lock

f00b1f3

update uv lock from main

991aa2b

Merge branch 'main' of github.com:microsoft/agent-lightning into tinker

8ef2c82

CrewAI baseline

53b2d70

.

0a21535

ultmaster added 22 commits October 28, 2025 14:17

.

2d7e665

update lr5e-4 16x8

5bd9b61

.

3d94def

.

635788d

update 32x4

064e535

.

6b86d65

fix error handling

c633634

.

92a5a3f

fix rollout status

2f29d04

.

a0e5987

.

3633849

.

9ae7ae4

update search

efa2af1

.

af4c8d0

update q20 train

003b8f7

Preparing code for merge

5241410

checkpoint

5f01765

update docstrings

2440ef0

minor fix

d1ed396

.

14ecf0d

update readme

94e5d5d

update Readme

4776134

Copilot AI review requested due to automatic review settings October 30, 2025 04:24

github-advanced-security AI found potential problems Oct 30, 2025

Copilot AI reviewed Oct 30, 2025

ultmaster added 2 commits October 30, 2025 12:33

resolve comments

917f37d

resolve comments

0d1cd44

ultmaster merged commit 496e793 into main Oct 30, 2025
12 checks passed

totoluo pushed a commit to totoluo/agent-lightning that referenced this pull request Nov 14, 2025

Tinker Integration (microsoft#245)

408fe27

ultmaster deleted the feature/tinker-example branch December 11, 2025 16:26

ultmaster merged 101 commits into main from feature/tinker-example

3 participants
