Actual models tested?

Practical question: for people running this with a cheaper model as the task agent (for cost reasons), does the teacher need to be from the same model family as the task agent to get the diagnostic benefit, or does any strong frontier model work equally well as the teacher? The MarkTechPost writeup hints that same-model pairing matters, but the two runs used different harnesses, so it's hard to tell. Would love to know the actual teacher models used and whether you've tested cross-family pairings.


Actual models tested? #6
