Remove `target_for_fewshot_sorting` and duplicate target calls #241

sadra-barikbin · 2024-07-28T17:07:21Z

~~Removed task arg from FewShotSampler.init_fewshot_sampling_balanced() because it no longer needs it.~~
Removed fewshot arg from task.doc_to_target() and doc.get_golds() as it's no longer used.
Added get_target_for_fewshot_sorting() to Doc which gives target_for_fewshot_sorting if it's set, otherwise gives the gold.
Adapted some of the tasks in the code to the change to check its viability.
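
As a rough illustration of the fallback described above, here is a minimal sketch. The `Doc` class below is a simplified stand-in, not lighteval's actual class, and returning the first gold when several exist is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Doc:
    """Simplified stand-in for a task document (hypothetical, for illustration)."""

    query: str
    choices: list[str]
    gold_index: Union[int, list[int]]
    target_for_fewshot_sorting: Optional[str] = None

    def get_golds(self) -> list[str]:
        # Normalise gold_index to a list, then look up the gold choices.
        indices = self.gold_index if isinstance(self.gold_index, list) else [self.gold_index]
        return [self.choices[i] for i in indices]

    def get_target_for_fewshot_sorting(self) -> str:
        # Prefer the explicit sorting target when it is set.
        if self.target_for_fewshot_sorting is not None:
            return self.target_for_fewshot_sorting
        # Assumption: fall back to the first gold when none is set.
        return self.get_golds()[0]
```

With this shape, prompt functions only need to set `target_for_fewshot_sorting` when the sorting key should differ from the gold.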

Closes #240

NathanHB · 2024-08-14T11:59:13Z

Hi! Thanks for your PR. It appears the tests are not passing. I already took a look and it looks good, but I will review it more thoroughly once the tests pass :)

clefourrier · 2024-11-14T14:14:50Z

src/lighteval/tasks/default_prompts.py

+        choices=[" " + i for i in LETTER_INDICES[: len(line["endings"])]] + ([""] if line["__few_shot"] else []),
        gold_index=gold_ix,  # -1 for test,
        instruction="The following are multiple choice questions (with answers) about common sense.\n\n",
-        target_for_fewshot_sorting=line["endings"][gold_ix] if gold_ix > -1 else "",
        specific={


Are you sure this conversion is correct?

I changed it to a safer alternative.

clefourrier · 2024-11-14T14:17:46Z

src/lighteval/tasks/default_prompts.py

        gold_index=gold_ix,
        instruction=f"The following are multiple choice questions (with answers) about {subject.replace('_', ' ')}.\n\n",
-        target_for_fewshot_sorting=line["choices"][gold_ix],  # specific to HELM evals


The conversion is incorrect here.
HELM evals are actually doing a super weird thing where the actual choice is used for the few-shot examples, and the key for the evaluation. (I think we should change it anyway since it makes no sense, so you'll just need to remove the comment about HELM.)

Is it correct now? Choice contents for few-shot and labels (A, B, C, D) for eval.

Now I set target_for_fewshot_sorting to the label (A,B,C,D). This way fewshot sampling becomes balanced on labels. May I keep this or revert it to be the choice content?

clefourrier · 2024-11-14T14:18:20Z

src/lighteval/tasks/default_prompts.py

    return Doc(
        task_name=task_name,
        query=query,
        choices=["A", "B"],
        gold_index=gold_ix,
        instruction="The following are multiple choice questions (with answers) about common sense.\n",
-        target_for_fewshot_sorting=[line["sol1"], line["sol2"]][gold_ix],


It's the same here, you changed the logic

I changed this accordingly.

clefourrier · 2024-11-18T09:57:18Z

Hi Sadra,
I'm going to heavily edit your PR to actually remove target_for_fewshot_sorting, as it makes little sense imo to keep this param.

sadra-barikbin · 2024-11-18T10:53:22Z

Hello @clefourrier, but this helps few-shot sampling become balanced when the choices are the actual choices and not the labels (A, B, C, D), or when the gold is a custom string.

For example, we have a function-calling task. Our golds are JSON strings, e.g. {"function": "get_weather", "params": {"city": "Paris"}}, and we want a balanced sample of few-shot docs in the prompt: some of them get_weather, some get_time, and so on. This lets us make sure that all function types are present in the few-shot part of the prompt. Without this argument, we have no such guarantee.
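
To make the use case concrete, here is a minimal sketch of balancing few-shot examples on a custom sorting key. It is not lighteval's sampler: `balanced_fewshot_sample` is a hypothetical helper, and it assumes each gold is a valid JSON string with a `"function"` field.

```python
import json
import random
from collections import defaultdict
from itertools import zip_longest


def balanced_fewshot_sample(golds, num_fewshot, key=None, seed=0):
    """Pick few-shot examples round-robin across groups (hypothetical helper)."""
    if key is None:
        # Assumption: the sorting target is the function name inside the gold JSON.
        key = lambda gold: json.loads(gold)["function"]

    rng = random.Random(seed)

    # Group the examples by their sorting target.
    groups = defaultdict(list)
    for gold in golds:
        groups[key(gold)].append(gold)

    # Shuffle inside each group so the picks are not always the first ones.
    for members in groups.values():
        rng.shuffle(members)

    # Take one example per group in turn until every group is exhausted.
    interleaved = [
        gold
        for batch in zip_longest(*groups.values())
        for gold in batch
        if gold is not None
    ]
    return interleaved[:num_fewshot]
```

Because the first round takes one example from every group, any `num_fewshot` at least as large as the number of distinct function names yields a prompt that covers all of them.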

clefourrier · 2024-11-18T11:03:32Z

I see what you need it for! Hm let me think about it - I still think we should remove it for the HELM and keep only one version of a number of the tasks we have, but I understand your use case!

clefourrier · 2024-11-18T11:28:35Z

@sadra-barikbin I added the way I would code it in #393 - (I can't push to your PR branch unless I'm editing from the web interface). Can you bring the modifications (prompt_manager and requests) over to your PR?
Or I can wrap up what you need in the other PR and merge this one, and ofc add you as co-author

clefourrier · 2024-11-18T12:25:28Z

Edit: I will close this one and finish the rest in the other PR. Can you check that you have everything you need over there?

sadra-barikbin added 4 commits October 9, 2024 23:15

Working on fewshot

61e11ef

Adapt prompts to removing target_for_fewshot_sorting

f7ed5ff

Fix a bug related to target_for_fewshot_sorting

6146844

Fix a tiny bug and apply ruff

dc81343

sadra-barikbin force-pushed the deciding-target-for-fewshot-sorting branch from df4d57b to dc81343 Compare October 9, 2024 19:45

clefourrier reviewed Nov 14, 2024

clefourrier reviewed Nov 14, 2024

sadra-barikbin added 2 commits November 16, 2024 20:08

Apply review comments

aa36d2d

Update piqa_helm

656a03f

clefourrier added 4 commits November 18, 2024 10:58

Update requests.py

c5c9936

Update prompt_manager.py

c545863

removing doc_to_target from task as it's now in the prompt manager

3d96734

Merge branch 'main' into deciding-target-for-fewshot-sorting

8b51d61

clefourrier changed the title ~~Keep target_for_fewshot_sorting for another purpose~~ Remove target_for_fewshot_sorting and duplicate target calls Nov 18, 2024

clefourrier closed this Nov 18, 2024

clefourrier mentioned this pull request Nov 18, 2024

Pr sadra #393

Merged
