Insights: huggingface/trl
Overview
2 Pull requests merged by 2 people
- 🚜 Use field in dataclasses (#2494), merged Jan 6, 2025 (see the sketch after this list)
- Remove graph breaks for torch.compile() in padding free branch in DataCollatorForCompletionOnlyLM (#2158), merged Jan 6, 2025
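The "Use field in dataclasses" PR (#2494) refers to the standard-library `dataclasses.field` helper, which dataclass-based configs need for mutable and annotated defaults. A minimal sketch of the pattern, using a hypothetical config class rather than TRL's real ones:

```python
from dataclasses import dataclass, field

@dataclass
class TrainingConfig:
    """Hypothetical config dataclass, for illustration only (not TRL's)."""
    learning_rate: float = 1e-5
    # A bare mutable default like `[]` raises ValueError at class definition;
    # `default_factory` builds a fresh list for each instance instead.
    report_to: list = field(default_factory=list)
    # `metadata` attaches side information (e.g. help text) to a field
    # without changing runtime behavior.
    num_epochs: int = field(default=3, metadata={"help": "Number of training epochs."})

cfg = TrainingConfig()
cfg.report_to.append("wandb")  # mutates only this instance's list
```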
4 Pull requests opened by 3 people
- Add "_prepare_fsdp" for DPOTrainer (#2539), opened Jan 3, 2025
- Custom reward function support for PPO trainer (#2540), opened Jan 3, 2025 (see the sketch after this list)
- Issues Auto-Labeller (#2542), opened Jan 4, 2025
- MPO (#2544), opened Jan 6, 2025
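PR #2540 proposes letting the PPO trainer accept a plain callable as its reward signal instead of a trained reward model. Since the PR is still open, the interface below is an assumption; it only illustrates what a rule-based reward callable typically looks like:

```python
import torch

def rule_based_reward(prompts: list[str], completions: list[str]) -> torch.Tensor:
    """Score completions with hand-written rules.

    The (prompts, completions) -> tensor signature is hypothetical, chosen
    for illustration; it is not necessarily what #2540 lands on.
    """
    rewards = []
    for completion in completions:
        score = 0.0
        if len(completion.split()) <= 64:                 # reward brevity
            score += 0.5
        if completion.strip().endswith((".", "!", "?")):  # reward complete sentences
            score += 0.5
        rewards.append(score)
    return torch.tensor(rewards)

print(rule_based_reward(["Q?"], ["A short, complete answer."]))  # tensor([1.])
```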
1 Issue closed by 1 person
- SFTTrainer not loading dataset correctly, expected format? (#2541), closed Jan 4, 2025
7 Issues opened by 7 people
- Finetuning on the last turn of multi-turn conversations (#2545), opened Jan 6, 2025
- Dataset type conversion utilities (#2543), opened Jan 6, 2025
- Is `truncation_mode` used in `DPOTrainer`? (#2538), opened Jan 2, 2025
- Different finetune speed in the DPO task between peft and ms-swift (600/s iter vs 30/s iter) (#2536), opened Jan 2, 2025
- (Willing to PR) Would PRs that speed up algorithms like PPO, plus code refactoring/cleanup, be welcome? (#2535), opened Dec 31, 2024
- Using the "beam search" strategy while generating responses (#2534), opened Dec 31, 2024 (see the sketch after this list)
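On #2534: outside of any TRL trainer, beam search is exposed in Hugging Face `transformers` through `model.generate`, which switches from greedy/sampling decoding to beam search whenever `num_beams > 1`. A minimal standalone sketch (the model name is just an example; whether a given trainer passes these kwargs during rollouts is the issue's open question):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # example model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The capital of France is", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    num_beams=4,          # > 1 enables beam search
    max_new_tokens=20,
    early_stopping=True,  # stop once enough complete beam candidates exist
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```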
13 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Add xpu support for DPO (#2533), commented on Jan 3, 2025 • 2 new comments
- PPO example script Accelerator error: initialize your accelerator via `accelerator = Accelerator()` (#2377), commented on Dec 31, 2024 • 0 new comments
- UserWarning when training DPO with LoRA: None of the inputs have requires_grad=True. Gradients will be None (#2486), commented on Jan 2, 2025 • 0 new comments
- AttributeError: 'DistributedDataParallel' object has no attribute 'policy' when saving model using PPOTrainer (#2375), commented on Jan 3, 2025 • 0 new comments (see the sketch after this list)
- `PPOv2Trainer` `reward_model` throws `AttributeError: '<My Custom Class>' object has no attribute 'base_model_prefix'` (#1977), commented on Jan 4, 2025 • 0 new comments
- [question] Best way to have my own reward model backed by rules (#2518), commented on Jan 4, 2025 • 0 new comments
- [GRPO] Initial GRPO trainer (#1954), commented on Jan 5, 2025 • 0 new comments
- Asynchronous RLHF: Faster and More Efficient Online DPO (#2278), commented on Dec 31, 2024 • 0 new comments
- Padding-free DPO (#2437), commented on Jan 2, 2025 • 0 new comments
- [Liger] Add native liger-kernel ORPO loss (#2482), commented on Jan 3, 2025 • 0 new comments
- [Liger] Integrate Liger CPO & SimPO (#2506), commented on Jan 3, 2025 • 0 new comments
- 🕊️ DPO padding free (#2520), commented on Jan 6, 2025 • 0 new comments
- [ORPO] Revert ORPO changes (#2527), commented on Jan 6, 2025 • 0 new comments
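The `AttributeError: 'DistributedDataParallel' object has no attribute 'policy'` in #2375 is the usual symptom of reading a custom attribute through a DDP wrapper: DDP forwards `forward()` to the wrapped model but not arbitrary attributes, which live on `.module`. A generic unwrapping sketch, not necessarily the fix TRL applies:

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel

def unwrap_model(model: nn.Module) -> nn.Module:
    """Return the underlying module if `model` is DDP-wrapped.

    DDP proxies `forward()` but not custom attributes, so e.g.
    `wrapped.policy` raises AttributeError until you reach `.module`.
    """
    if isinstance(model, DistributedDataParallel):
        return model.module
    return model

# Usage sketch (names hypothetical): unwrap before saving, or before
# touching attributes that were defined on the raw model.
# policy = unwrap_model(trainer.model).policy
```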