GitHub - NousEU/Coding-Tutor: Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors

Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents:
The Curious Case of LLMs as Your Coding Tutors

This work explores the potential of LLMs as coding tutors. We propose Trace-and-Verify (Traver), an effective agent workflow that incorporates knowledge tracing and turn-by-turn verification, to tackle key challenges in coding tutoring. While this work focuses on coding tutoring as an example, the proposed method extends beyond coding to other task-tutoring scenarios, where the tutor must adapt content to users' varying levels of background knowledge. We further introduce Dialogue for Coding Tutoring (DICT), a novel evaluation protocol combining student simulation and coding tests to assess tutor performance. Such automated evaluation is critical for developing task-tutoring agents as it supports a systematic development and evaluation cycle.
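As a rough illustration of the evaluation idea described above (student simulation followed by coding tests), the following minimal sketch runs a short tutor-student dialogue and then scores the student's code against unit tests. It is not the repository's implementation: `run_session`, `pass_rate`, and the toy tutor and student functions are hypothetical names introduced here, and in practice both roles would be played by LLM calls rather than hard-coded functions.

```python
from typing import Callable, Dict, List, Tuple

Dialogue = List[Tuple[str, str]]  # (speaker, utterance) pairs


def run_session(tutor: Callable[[Dialogue], str],
                student: Callable[[Dialogue], str],
                max_turns: int = 4) -> Dialogue:
    """Alternate tutor and student utterances for a fixed number of turns."""
    dialogue: Dialogue = []
    for _ in range(max_turns):
        dialogue.append(("tutor", tutor(dialogue)))
        dialogue.append(("student", student(dialogue)))
    return dialogue


def pass_rate(solution_src: str,
              tests: List[Callable[[Dict], bool]]) -> float:
    """Execute the student's code and return the fraction of tests passed."""
    namespace: Dict = {}
    try:
        exec(solution_src, namespace)  # assumption: trusted toy code only
    except Exception:
        return 0.0
    passed = 0
    for test in tests:
        try:
            passed += bool(test(namespace))
        except Exception:
            pass  # a crashing test counts as a failure
    return passed / len(tests) if tests else 0.0


# Hypothetical stand-ins for the tutor agent and the simulated student.
def toy_tutor(dialogue: Dialogue) -> str:
    return "Remember to return a + b from add."


def toy_student(dialogue: Dialogue) -> str:
    return "Okay, I will try that."


def toy_student_code(dialogue: Dialogue) -> str:
    # The toy student only solves the task if the tutor gave the key hint.
    hinted = any(s == "tutor" and "return" in u for s, u in dialogue)
    body = "return a + b" if hinted else "pass"
    return f"def add(a, b):\n    {body}\n"


tests = [lambda ns: ns["add"](1, 2) == 3,
         lambda ns: ns["add"](-1, 1) == 0]

dialogue = run_session(toy_tutor, toy_student, max_turns=2)
score = pass_rate(toy_student_code(dialogue), tests)
print(f"post-tutoring pass rate: {score:.2f}")
```

Because the score comes from executing tests rather than from human judgment, the same loop can be repeated across many tasks and tutor variants, which is the kind of automated development cycle the paragraph above refers to.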

Coding Tutoring Evaluation

Analysis of Simulated Students

Under a controlled setup, simulated students at different levels demonstrate distinct abilities in completing target coding tasks. Our DICT protocol serves as a feasible proxy for human evaluation, offering scalability and cost-effectiveness for evaluating tutor agents.

Inference-Time Scaling with Verifiers

Our proposed Traver agent workflow, combined with the trained verifier, exhibits inference-time scaling for coding tutoring.
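One common way to obtain inference-time scaling from a verifier is best-of-N selection: sample several candidate tutor utterances per turn, score each with the verifier, and keep the highest-scoring one. The sketch below illustrates that general pattern only. Whether Traver applies the verifier in exactly this way is not stated in this README, and `tutor_turn`, `toy_verifier`, the keyword heuristic, and the candidate pool are all hypothetical stand-ins for an LLM sampler and a trained verifier.

```python
import random
from typing import Callable, List, Sequence

History = List[str]


def select_utterance(candidates: Sequence[str],
                     verifier: Callable[[History, str], float],
                     history: History) -> str:
    """Return the candidate with the highest verifier score."""
    return max(candidates, key=lambda c: verifier(history, c))


def tutor_turn(history: History,
               sample: Callable[[History], str],
               verifier: Callable[[History, str], float],
               n: int) -> str:
    """Sample n candidate utterances and keep the verifier's favourite."""
    candidates = [sample(history) for _ in range(n)]
    return select_utterance(candidates, verifier, history)


# Hypothetical candidate pool standing in for sampled LLM outputs.
POOL = [
    "Try again.",
    "Which input makes your loop stop early?",
    "Here is the full answer.",
    "Recall how the helper parses its argument, then trace your loop.",
]

# Hypothetical target concepts the student still needs to cover.
KEYWORDS = ("loop", "helper", "parses")


def toy_verifier(history: History, candidate: str) -> float:
    # Toy heuristic: fraction of target concepts the utterance touches.
    # A trained verifier would replace this with a learned score.
    text = candidate.lower()
    return sum(k in text for k in KEYWORDS) / len(KEYWORDS)


def best_score(n: int, seed: int = 0) -> float:
    """Verifier score of the selected utterance when sampling n candidates."""
    rng = random.Random(seed)
    chosen = tutor_turn([], lambda h: rng.choice(POOL), toy_verifier, n)
    return toy_verifier([], chosen)


for n in (1, 2, 4, 8):
    print(n, best_score(n))
```

With a fixed seed, the candidates drawn for a smaller `n` are a prefix of those drawn for a larger `n`, so the selected score can only stay the same or improve as `n` grows. That monotonic behaviour is the simplest picture of why spending more samples at inference time can help when the verifier's scores are informative.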

Todo

Add detailed instructions for quick start
Add shell scripts for training and evaluation
Release checkpoints for the verifiers

Released Data and Results

Please refer to the output/ directory for the released data and evaluation results.

Evaluation

Please refer to scripts/eval/ for the evaluation scripts.

Citation

If you find the resources in this repository useful for your work, please kindly cite our work as:

@article{wang2025training,
  title={Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors},
  author={Wang, Jian and Dai, Yinpei and Zhang, Yichi and Ma, Ziqiao and Li, Wenjie and Chai, Joyce},
  journal={arXiv preprint arXiv:2502.13311},
  url={https://arxiv.org/abs/2502.13311},
  year={2025}
}
