`PPOv2Trainer` `reward_model` throws AttributeError: '<My Custom Class>' object has no attribute 'base_model_prefix' #1977
Labels
- 📚 documentation: Improvements or additions to documentation
- 🧒 good second issue: Good for contributors with basic project familiarity
- 🙋 help from community wanted: Open invitation for community members to contribute
- 🏋 PPO: Related to PPO
System Info
- `transformers` version: 4.44.0
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- debug: True
- num_processes: 2
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- enable_cpu_affinity: False
- fsdp_config: {'fsdp_activation_checkpointing': True, 'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch': 'BACKWARD_PRE', 'fsdp_cpu_ram_efficient_loading': True, 'fsdp_forward_prefetch': True, 'fsdp_offload_params': True, 'fsdp_sharding_strategy': 'FULL_SHARD', 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': True, 'fsdp_use_orig_params': True}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
- dynamo_config: {'dynamo_backend': 'EAGER'}
Reproduction
Note that in `PPOv2Trainer`, the type annotation for `reward_model` is `nn.Module`: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L77

However, when I pass in an `nn.Module` object (a class `StrInputRewardModelEnsemble` I created myself, which inherits from `nn.Module`), I receive the error:

AttributeError: 'StrInputRewardModelEnsemble' object has no attribute 'base_model_prefix'

The error occurs here: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L58
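
The failure can be shown in isolation, without TRL. The sketch below rests on an assumption inferred only from the error message: that the trainer reads `reward_model.base_model_prefix` and then looks up the attribute with that name. `transformers.PreTrainedModel` subclasses define `base_model_prefix`; a bare module does not. `PlainRewardModel` and `get_backbone` are hypothetical names, and a plain Python class stands in for `nn.Module` so the snippet runs without torch.

```python
class PlainRewardModel:
    """Hypothetical stand-in for a custom nn.Module reward model."""

    def forward(self, texts):
        # Return one dummy score per input string.
        return [0.0 for _ in texts]


def get_backbone(model):
    # Assumed access pattern, inferred from the error message;
    # this is not copied from the TRL source.
    return getattr(model, model.base_model_prefix)


def reproduce():
    """Return the AttributeError message, or None if no error is raised."""
    try:
        get_backbone(PlainRewardModel())
    except AttributeError as err:
        return str(err)
    return None


message = reproduce()
print(message)
```

If the assumption holds, any object annotated as `nn.Module` but lacking `base_model_prefix` would fail the same way, regardless of what its `forward` does.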
Expected behavior
I think `PPOv2Trainer` either needs:
- to specify what exactly is expected for the `reward_model`
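
As an illustration of what an explicit requirement could look like, here is a minimal sketch of an up-front check that fails with a clearer message than the bare `AttributeError`. It is not part of TRL: `missing_attributes` and `check_reward_model` are hypothetical helpers, and the `required` tuple contains only `base_model_prefix` because that is the one attribute named in the error above. The real set of requirements may be larger.

```python
def missing_attributes(model, required=("base_model_prefix",)):
    """List the names in `required` that `model` does not expose."""
    return [name for name in required if not hasattr(model, name)]


def check_reward_model(model, required=("base_model_prefix",)):
    """Raise TypeError early if `model` lacks an assumed-required attribute.

    `required` is an assumption based on the reported error, not a
    documented TRL contract.
    """
    missing = missing_attributes(model, required)
    if missing:
        raise TypeError(
            f"{type(model).__name__} cannot be used as reward_model: "
            f"missing attribute(s) {', '.join(missing)}"
        )
    return model
```

Calling `check_reward_model(my_reward_model)` before constructing the trainer would surface the mismatch at the call site instead of deep inside the trainer.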