Commit b0672ba

fix: fix a bug in model tuning feedback (microsoft#316)
* fix a bug
* fix two bugs
1 parent 6716704 commit b0672ba

2 files changed: 3 additions and 4 deletions.

rdagent/scenarios/kaggle/developer/feedback.py

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ def generate_feedback(self, exp: Experiment, hypothesis: Hypothesis, trace: Trac
         # Prepare render dictionary
         render_dict = {
             "context": self.scen.get_scenario_all_desc(),
-            "last_hypothesis": trace.hist[-1][0].hypothesis if trace.hist else None,
+            "last_hypothesis": trace.hist[-1][0] if trace.hist else None,
             "last_task_and_code": last_task_and_code,
             "last_result": trace.hist[-1][1].result if trace.hist else None,
             "hypothesis": hypothesis,

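The one-line change in feedback.py matters because the prompt template dereferences `last_hypothesis.hypothesis` itself, so the render dictionary has to carry the whole hypothesis object rather than its inner string. A minimal sketch of that shape mismatch follows. The `Hypothesis`, `Experiment`, and `Trace` classes here are hypothetical stand-ins written for illustration, not the project's real classes, and the history layout (a list of hypothesis/experiment pairs) is an assumption inferred from the indexing in the diff.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Hypothesis:
    # Hypothetical stand-in: only the `.hypothesis` text attribute matters here.
    hypothesis: str


@dataclass
class Experiment:
    # Hypothetical stand-in for an experiment that carries a result.
    result: Any = None


@dataclass
class Trace:
    # Assumption: each history entry is a (hypothesis, experiment) pair.
    hist: List[Tuple[Hypothesis, Experiment]] = field(default_factory=list)


def last_hypothesis_before(trace: Trace) -> Optional[str]:
    # Old expression: passes the inner string, so `.hypothesis` is no longer available.
    return trace.hist[-1][0].hypothesis if trace.hist else None


def last_hypothesis_after(trace: Trace) -> Optional[Hypothesis]:
    # New expression: passes the whole object; the template dereferences `.hypothesis`.
    return trace.hist[-1][0] if trace.hist else None


trace = Trace(hist=[(Hypothesis("Use a deeper model"), Experiment(result=0.42))])

old_value = last_hypothesis_before(trace)
new_value = last_hypothesis_after(trace)

print(hasattr(old_value, "hypothesis"))  # False: a plain str has no such attribute
print(new_value.hypothesis)              # Use a deeper model
```

Both expressions still fall back to `None` on an empty history, so the template's `{% if last_hypothesis %}` guard behaves the same on the first round.
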
rdagent/scenarios/kaggle/prompts.yaml

Lines changed: 2 additions & 3 deletions
@@ -110,7 +110,7 @@ model_experiment_output_format: |-
   }
   Usually, a larger model works better than a smaller one. Hence, the parameters should be larger.

-model_feedback_generation:
+model_tuning_feedback_generation:
   system: |-
     You are a professional result analysis assistant. You will receive a result and a hypothesis.
     Your task is to provide feedback on how well the result supports or refutes the hypothesis by judging from the observation of performance increase or decrease.
@@ -149,8 +149,7 @@ model_feedback_generation:
     {% if last_hypothesis %}
     Last Round Information:
     Hypothesis: {{last_hypothesis.hypothesis}}
-    Task: {{last_task}}
-    Code Implemented: {{last_code}}
+    Last Task and Code: {{last_task_and_code}}
     Result: {{last_result}}
     {% else %}
     This is the first round. No previous information available. As long as the performance is not too negative (eg.ICIR is greater than 0), treat it as successful. Do not set the threshold too high.
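
The template change replaces `last_task` and `last_code`, which do not appear among the render dictionary keys visible in the feedback.py hunk, with `last_task_and_code`, which does. One way to catch this kind of mismatch early is to compare the names a template references against the keys being passed in. The sketch below does that with a deliberately naive regular expression; `referenced_names` and `missing_names` are hypothetical helpers, the pattern only recognises `{{ name ... }}` and `{% if name ... %}`, and it is not a substitute for the template engine's own parser.

```python
import re
from typing import Iterable, Set

# Captures the leading identifier of a `{{ ... }}` expression or an `{% if ... %}` test.
_NAME_PATTERN = re.compile(r"\{\{\s*([A-Za-z_]\w*)|\{%\s*if\s+([A-Za-z_]\w*)")


def referenced_names(template: str) -> Set[str]:
    """Return the top-level variable names the template appears to use."""
    return {m.group(1) or m.group(2) for m in _NAME_PATTERN.finditer(template)}


def missing_names(template: str, render_keys: Iterable[str]) -> Set[str]:
    """Return names used by the template that are absent from the render keys."""
    return referenced_names(template) - set(render_keys)


# Keys visible in the feedback.py hunk of this commit.
RENDER_KEYS = ["context", "last_hypothesis", "last_task_and_code", "last_result", "hypothesis"]

OLD_TEMPLATE = """{% if last_hypothesis %}
Last Round Information:
Hypothesis: {{last_hypothesis.hypothesis}}
Task: {{last_task}}
Code Implemented: {{last_code}}
Result: {{last_result}}
{% else %}
This is the first round.
{% endif %}"""

NEW_TEMPLATE = """{% if last_hypothesis %}
Last Round Information:
Hypothesis: {{last_hypothesis.hypothesis}}
Last Task and Code: {{last_task_and_code}}
Result: {{last_result}}
{% else %}
This is the first round.
{% endif %}"""

print(sorted(missing_names(OLD_TEMPLATE, RENDER_KEYS)))  # ['last_code', 'last_task']
print(sorted(missing_names(NEW_TEMPLATE, RENDER_KEYS)))  # []
```

The template excerpts above are copied from the hunk, with a closing `{% endif %}` and a shortened else-branch added so the snippet is self-contained.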
