When converting a preference dataset to an unpaired preference dataset with unpair_preference_dataset(), we are converting a relative ranking into an absolute one. In a preference dataset, "chosen" and "rejected" only say which completion is preferred: both can be good or both can be bad, with one just slightly better or worse. See the example below.
So one should not convert a preference dataset to an unpaired preference dataset without checking absolute ratings, e.g. from a reward model.
Suggestion: At least add a warning to the documentation and the conversion code, or even remove the conversion.
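To make the point about absolute ratings concrete, here is a minimal plain-Python sketch of an unpairing step that takes its labels from an absolute score instead of from the chosen/rejected position. The function name `unpair_with_absolute_labels`, the `score_fn` callable, the threshold, and the toy scores are all hypothetical and only illustrate the idea; they are not part of TRL, and a real setup would plug in an actual reward model.

```python
def unpair_with_absolute_labels(rows, score_fn, threshold=0.5):
    """Flatten (prompt, chosen, rejected) rows into (prompt, completion, label) rows.

    The label is derived from an absolute score, not from whether the
    completion was in the "chosen" or the "rejected" column.
    """
    unpaired = []
    for row in rows:
        for completion in (row["chosen"], row["rejected"]):
            score = score_fn(row["prompt"], completion)
            unpaired.append(
                {
                    "prompt": row["prompt"],
                    "completion": completion,
                    "label": score >= threshold,
                }
            )
    return unpaired


# Hypothetical hand-written scores standing in for a reward model.
TOY_SCORES = {
    ("The sky is", " blue."): 0.9,
    ("The sky is", " above."): 0.8,
    ("The sun is", " in our solar system"): 0.9,
    ("The sun is", " in the sky."): 0.85,
}


def toy_score(prompt, completion):
    return TOY_SCORES[(prompt, completion)]


rows = [
    {"prompt": "The sky is", "chosen": " blue.", "rejected": " above."},
    {"prompt": "The sun is", "chosen": " in our solar system", "rejected": " in the sky."},
]
unpaired = unpair_with_absolute_labels(rows, toy_score, threshold=0.5)
```

With these toy scores every completion clears the threshold, so the "rejected" completions are also labeled True, which matches the expected behavior described in this issue.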
When converting from unpaired preference or stepwise supervision to an unlabeled type such as language modeling or prompt-completion, only the good (label=True) examples should be used, just as the conversion from a preference dataset uses only the chosen completions.
Suggestion: This can easily be fixed in the example conversion code.
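A minimal sketch of the proposed fix, written on plain Python rows so it runs without extra dependencies. The helper name `unpaired_to_prompt_completion` and the sample rows are hypothetical. On a `datasets.Dataset`, the same idea should be expressible as `dataset.filter(lambda x: x["label"]).remove_columns("label")`.

```python
def unpaired_to_prompt_completion(rows):
    """Keep only the rows labeled as good and drop the label column."""
    return [
        {"prompt": row["prompt"], "completion": row["completion"]}
        for row in rows
        if row["label"]
    ]


# Hypothetical unpaired preference rows for illustration.
rows = [
    {"prompt": "The sky is", "completion": " blue.", "label": True},
    {"prompt": "The sky is", "completion": " green.", "label": False},
    {"prompt": "The sun is", "completion": " in our solar system", "label": True},
]
converted = unpaired_to_prompt_completion(rows)
```

Filtering before dropping the label mirrors what the preference conversion already does when it keeps only the chosen completions.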
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder
My own task or dataset (give details below)
Reproduction
from datasets import Dataset
from trl import unpair_preference_dataset

dataset_dict = {
    "prompt": ["The sky is", "The sun is"],
    "chosen": [" blue.", " in our solar system"],
    "rejected": [" above.", " in the sky."],
}
dataset = Dataset.from_dict(dataset_dict)
dataset = unpair_preference_dataset(dataset)
dataset[1]
Outputs, e.g.:
{'prompt': 'The sky is', 'completion': ' above.', 'label': False}
Expected behavior
{'prompt': 'The sky is', 'completion': ' blue.', 'label': True}
{'prompt': 'The sky is', 'completion': ' above.', 'label': True}
Checklist
I have checked that my issue isn't already filed (see open issues)
I have included my system information
Any code provided is minimal, complete, and reproducible (more on MREs)
Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
Any traceback provided is complete
System Info
Some of the dataset type conversions described in https://huggingface.co/docs/trl/main/en/dataset_formats#utilities-for-converting-dataset-types are not really correct; the two problems are described above.