Remove target_for_fewshot_sorting and duplicate target calls #241
Hi! Thanks for your PR. It appears the tests are not passing. I already took a look and it looks good, but I will review it more thoroughly once the tests pass :)
df4d57b to dc81343
choices=[" " + i for i in LETTER_INDICES[: len(line["endings"])]] + ([""] if line["__few_shot"] else []),
gold_index=gold_ix,  # -1 for test,
instruction="The following are multiple choice questions (with answers) about common sense.\n\n",
target_for_fewshot_sorting=line["endings"][gold_ix] if gold_ix > -1 else "",
specific={
Are you sure this conversion is correct?
I changed it to a safer alternative.
gold_index=gold_ix,
instruction=f"The following are multiple choice questions (with answers) about {subject.replace('_', ' ')}.\n\n",
target_for_fewshot_sorting=line["choices"][gold_ix],  # specific to HELM evals
The conversion is incorrect here.
HELM evals actually do something very odd: the actual choice is used for the few-shot examples, and the key for the evaluation. (I think we should change it anyway since it makes no sense, so you'll just need to remove the comment about HELM.)
Is it correct now? Choice contents for few-shot and labels (A, B, C, D) for eval.
Now I set target_for_fewshot_sorting to the label (A, B, C, D). This way, few-shot sampling becomes balanced on labels. May I keep this, or should I revert it to the choice content?
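To illustrate what "balanced on labels" means in this comment, here is a minimal sketch. It is not the project's sampler: `sample_balanced`, its arguments, and the dict-shaped docs are hypothetical names used only for illustration. The idea is to group candidate documents by a sorting key and draw from the groups in turn.

```python
import random
from collections import defaultdict


def sample_balanced(docs, num_fewshot, key, seed=0):
    """Pick up to num_fewshot docs, cycling over the groups induced by key(doc).

    Hypothetical helper for illustration only; not the project's implementation.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for doc in docs:
        groups[key(doc)].append(doc)

    # Shuffle inside each group so the choice is random but reproducible.
    for members in groups.values():
        rng.shuffle(members)

    labels = sorted(groups)
    picked = []
    i = 0
    # Round-robin over labels until enough docs are picked or all groups are empty.
    while len(picked) < num_fewshot and any(groups[label] for label in labels):
        members = groups[labels[i % len(labels)]]
        if members:
            picked.append(members.pop())
        i += 1
    return picked


# Two candidate docs per label; asking for four examples yields one per label.
docs = [{"question": n, "label": label} for n, label in enumerate("AABBCCDD")]
fewshot = sample_balanced(docs, 4, key=lambda d: d["label"])
```

With the label as the key, the few-shot examples cover A, B, C and D evenly. With the choice content as the key, almost every document would form its own group, so the balancing would have little effect.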
return Doc(
task_name=task_name,
query=query,
choices=["A", "B"],
gold_index=gold_ix,
instruction="The following are multiple choice questions (with answers) about common sense.\n",
target_for_fewshot_sorting=[line["sol1"], line["sol2"]][gold_ix],
It's the same here, you changed the logic
I changed this accordingly.
Hi Sadra,
Title changed from "… target_for_fewshot_sorting for another purpose" to "… target_for_fewshot_sorting and duplicate target calls"
Hello @clefourrier, but this helps few-shot sampling become balanced. For example, when we have a function-calling task, our golds are JSON strings, e.g. …
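The example from this comment is not preserved, so the following is only a hypothetical sketch of the use case it describes. The gold strings and the `sorting_target` helper are invented for illustration. It assumes that the useful balancing key for a function-calling task is the called function's name, not the full JSON gold.

```python
import json
from collections import Counter

# Hypothetical golds for an illustrative function-calling task.
golds = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',
    '{"name": "get_weather", "arguments": {"city": "Tokyo"}}',
    '{"name": "send_email", "arguments": {"to": "a@example.com"}}',
    '{"name": "send_email", "arguments": {"to": "b@example.com"}}',
]


def sorting_target(gold: str) -> str:
    """Return a coarse key (the function name) to balance few-shot sampling on."""
    return json.loads(gold)["name"]


# Grouping on the raw gold puts every example in its own class.
by_gold = Counter(golds)
# Grouping on the function name gives classes that can actually be balanced.
by_target = Counter(sorting_target(g) for g in golds)
```

Under that assumption, a separate target_for_fewshot_sorting value lets the sampler balance over function names while the full JSON string is still used as the gold.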
I see what you need it for! Hm, let me think about it - I still think we should remove it for the HELM and keep only one version of a number of the tasks we have, but I understand your use case!
@sadra-barikbin I added the way I would code it in #393 - (I can't push to your PR branch unless I'm editing from the web interface). Can you bring the modifications (prompt_manager and requests) over to your PR?
Edit: will close this one and finish the rest in the other PR - can you check that you got everything you need over there?
- Removed task arg from FewShotSampler.init_fewshot_sampling_balanced() because it no longer needs it.
- Removed fewshot arg from task.doc_to_target() and doc.get_golds() as it's no longer used.
- Added get_target_for_fewshot_sorting() to Doc, which gives target_for_fewshot_sorting if it's set, otherwise gives the gold.

Closes #240
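The fallback behaviour described for get_target_for_fewshot_sorting() can be sketched as follows. This `Doc` is a simplified stand-in, not the project's class: the field set is reduced, and returning the first gold when several exist is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Doc:
    # Simplified stand-in; the real class carries more fields.
    query: str
    choices: List[str]
    gold_index: Union[int, List[int]]
    target_for_fewshot_sorting: Optional[str] = None

    def get_golds(self) -> List[str]:
        """Return the gold choice(s) as a list."""
        if isinstance(self.gold_index, list):
            indices = self.gold_index
        else:
            indices = [self.gold_index]
        return [self.choices[i] for i in indices]

    def get_target_for_fewshot_sorting(self) -> str:
        """Use the explicit sorting target if set, otherwise fall back to the gold."""
        if self.target_for_fewshot_sorting is not None:
            return self.target_for_fewshot_sorting
        # Assumption for this sketch: with several golds, use the first one.
        return self.get_golds()[0]


plain = Doc(query="Which option?", choices=["A", "B"], gold_index=1)
custom = Doc(
    query="Call a tool",
    choices=['{"name": "get_weather"}'],
    gold_index=0,
    target_for_fewshot_sorting="get_weather",
)
```

Here `plain` sorts on its gold ("B"), while `custom` sorts on the explicit target, so tasks that never set the field behave as before.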