Skip to content

Remove target_for_fewshot_sorting and duplicate target calls #241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

sadra-barikbin
Copy link
Contributor

@sadra-barikbin sadra-barikbin commented Jul 28, 2024

  • Removed task arg from FewShotSampler.init_fewshot_sampling_balanced() because it no longer needs it.
  • Removed fewshot arg from task.doc_to_target() and doc.get_golds() as it's no longer used.
  • Added get_target_for_fewshot_sorting() to Doc which gives target_for_fewshot_sorting if it's set, otherwise gives the gold.
  • Adopted some of the tasks in the code to the change to check its viability.

Closes #240

@NathanHB
Copy link
Member

Hi ! Thanks for your PR, it appears the tests are not passing, I already took a look and it looks good but I will be reviewing more thourouly once the tests pass :)

@sadra-barikbin sadra-barikbin force-pushed the deciding-target-for-fewshot-sorting branch from df4d57b to dc81343 Compare October 9, 2024 19:45
Comment on lines 793 to 796
choices=[" " + i for i in LETTER_INDICES[: len(line["endings"])]] + ([""] if line["__few_shot"] else []),
gold_index=gold_ix, # -1 for test,
instruction="The following are multiple choice questions (with answers) about common sense.\n\n",
target_for_fewshot_sorting=line["endings"][gold_ix] if gold_ix > -1 else "",
specific={
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this conversion is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to a safer alternative.

gold_index=gold_ix,
instruction=f"The following are multiple choice questions (with answers) about {subject.replace('_', ' ')}.\n\n",
target_for_fewshot_sorting=line["choices"][gold_ix], # specific to HELM evals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion is incorrect here.
Helm evals are actually doing a super weird thing where the actual choice is used for the few shot, and the key for the evaluation. (I think we should change it anyway since it makes no sense, so you'll just need to remove the comment about helm)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct now? choice contents for fewshot and labels(A, B, C, D) for eval.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I set target_for_fewshot_sorting to the label (A,B,C,D). This way fewshot sampling becomes balanced on labels. May I keep this or revert it to be the choice content?

return Doc(
task_name=task_name,
query=query,
choices=["A", "B"],
gold_index=gold_ix,
instruction="The following are multiple choice questions (with answers) about common sense.\n",
target_for_fewshot_sorting=[line["sol1"], line["sol2"]][gold_ix],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same here, you changed the logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this accordingly.

@clefourrier
Copy link
Member

Hi Sadra,
I'm going to edit heavily your PR to actually remove target for few shot sorting, as it makes little sense imo to keep this param

@clefourrier clefourrier changed the title Keep target_for_fewshot_sorting for another purpose Remove target_for_fewshot_sorting and duplicate target calls Nov 18, 2024
@sadra-barikbin
Copy link
Contributor Author

sadra-barikbin commented Nov 18, 2024

Hello @clefourrier , but this helps in fewshot sampling to become balanced. When choices are the actual choices and not the labels(A, B,C, D) or when the gold is a custom string.

For example, we have a function calling task. Our golds are json strings e.g. {"function":"get_weather", "params":["city":"Paris"]} and we want to have a balanced sample of fewshot docs in the prompt, some of them be get_weather , some be get_time and so on. We can make sure that all function types are present in the prompt fewshot part. Without this argument, we don't have such guarantee.

@clefourrier
Copy link
Member

I see what you need it for! Hm let me think about it - I still think we should remove it for the HELM and keep only one version of a number of the tasks we have, but I understand your use case!

@clefourrier
Copy link
Member

clefourrier commented Nov 18, 2024

@sadra-barikbin I added the way I would code it in #393 - (I can't push to your PR branch unless I'm editing from the web interface). Can you bring the modifications (prompt_manager and requests) over to your PR?
Or I can wrap up what you need in the other PR and merge this one, and ofc add you as co-author

@clefourrier
Copy link
Member

Edit: will close this one and finish the rest in the other PR - can you check you got all you're needing over there?

@clefourrier clefourrier mentioned this pull request Nov 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FT] To keep target_for_fewshot_sorting for another purpose
3 participants