
PPOv2Trainer reward_model throws AttributeError: '<My Custom Class>' object has no attribute 'base_model_prefix' #1977

Open
RylanSchaeffer opened this issue Aug 26, 2024 · 1 comment
Labels: 📚 documentation (Improvements or additions to documentation) · 🧒 good second issue (Good for contributors with basic project familiarity) · 🙋 help from community wanted (Open invitation for community members to contribute) · 🏋 PPO (Related to PPO)

Comments

RylanSchaeffer (Contributor) commented Aug 26, 2024

System Info

  • transformers version: 4.44.0
  • Platform: Linux-5.4.0-162-generic-x86_64-with-glibc2.31
  • Python version: 3.11.9
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.32.1
  • Accelerate config: - compute_environment: LOCAL_MACHINE
    - distributed_type: FSDP
    - mixed_precision: bf16
    - use_cpu: False
    - debug: True
    - num_processes: 2
    - machine_rank: 0
    - num_machines: 1
    - rdzv_backend: static
    - same_network: True
    - main_training_function: main
    - enable_cpu_affinity: False
    - fsdp_config: {'fsdp_activation_checkpointing': True, 'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch': 'BACKWARD_PRE', 'fsdp_cpu_ram_efficient_loading': True, 'fsdp_forward_prefetch': True, 'fsdp_offload_params': True, 'fsdp_sharding_strategy': 'FULL_SHARD', 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': True, 'fsdp_use_orig_params': True}
    - downcast_bf16: no
    - tpu_use_cluster: False
    - tpu_use_sudo: False
    - tpu_env: []
    - dynamo_config: {'dynamo_backend': 'EAGER'}
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA A100-SXM4-80GB

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

Note that in PPOv2Trainer, the type annotation for reward_model is nn.Module: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L77

However, when I pass in an nn.Module object (a class StrInputRewardModelEnsemble that I created myself, which inherits from nn.Module), I receive the error:

AttributeError: 'StrInputRewardModelEnsemble' object has no attribute 'base_model_prefix'

The error occurs here: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L58

Expected behavior

I think PPOv2Trainer needs either:

  1. better documentation, and/or
  2. better type annotations

to specify exactly what is expected of the reward_model.
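For context, here is a minimal sketch of the duck-typed interface that trl's get_reward (in trl/trainer/utils.py) appears to assume of reward_model: an attribute named by base_model_prefix holding the LM backbone, plus a score head applied to its hidden states. MinimalRewardModel, its attribute names, and its sizes are illustrative stand-ins for this issue, not the real trl or transformers API:

```python
import torch
import torch.nn as nn

class MinimalRewardModel(nn.Module):
    # `base_model_prefix` names the attribute that holds the LM backbone,
    # mirroring the convention of transformers PreTrainedModel subclasses.
    base_model_prefix = "backbone"

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
        super().__init__()
        # Stand-in for a transformer backbone that returns hidden states.
        self.backbone = nn.Embedding(vocab_size, hidden_size)
        # Scalar reward head applied to the backbone's hidden states.
        self.score = nn.Linear(hidden_size, 1, bias=False)

rm = MinimalRewardModel()
# This is the attribute access that raises AttributeError for a plain
# nn.Module that does not define `base_model_prefix`:
lm_backbone = getattr(rm, rm.base_model_prefix)
hidden = lm_backbone(torch.randint(0, 100, (2, 5)))  # (batch, seq, hidden)
rewards = rm.score(hidden)                           # (batch, seq, 1)
```

A plain nn.Module like StrInputRewardModelEnsemble satisfies the type annotation but not this implicit contract, which is why the error only surfaces at runtime.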

@RylanSchaeffer RylanSchaeffer added the 🐛 bug Something isn't working label Aug 26, 2024
@qgallouedec qgallouedec added 📚 documentation Improvements or additions to documentation 🧒 good second issue Good for contributors with basic project familiarity 🏋 PPO Related to PPO and removed 🐛 bug Something isn't working labels Oct 20, 2024
@qgallouedec qgallouedec added the 🙋 help from community wanted Open invitation for community members to contribute label Dec 14, 2024
@haimianxing

I also encountered this error when I tried to use RLOOTrainer with AutoModelForCausalLMWithValueHead.

[rank3]: Traceback (most recent call last):
[rank3]: File "/mnt/data2/zcz/infer/utils/./accelerate_torch.py", line 262, in <module>
[rank3]: trainer.train()
[rank3]: File "/mnt/data2/zcz/.miniconda3_14/envs/_torch_env/lib/python3.9/site-packages/trl/trainer/rloo_trainer.py", line 352, in train
[rank3]: _, score, _ = get_reward(
[rank3]: File "/mnt/data2/zcz/.miniconda3_14/envs/_torch_env/lib/python3.9/site-packages/trl/trainer/utils.py", line 1128, in get_reward
[rank3]: lm_backbone = getattr(model, model.base_model_prefix)
[rank3]: File "/mnt/data2/zcz/.miniconda3_14/envs/_torch_env/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 517, in __getattr__
[rank3]: raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
[rank3]: AttributeError: 'DeepSpeedEngine' object has no attribute 'base_model_prefix'
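This looks like a wrapping problem rather than a missing attribute on the reward model itself: DeepSpeedEngine stores the user model as its .module attribute and (per the traceback) does not forward class attributes like base_model_prefix. The stand-in classes below are hypothetical, torch-only stand-ins that reproduce the lookup failure without needing deepspeed installed:

```python
import torch.nn as nn

class Backbone(nn.Module):
    """Stand-in for a transformer backbone."""
    pass

class RewardModel(nn.Module):
    """Stand-in for a transformers-style reward model."""
    base_model_prefix = "model"

    def __init__(self):
        super().__init__()
        self.model = Backbone()

class EngineLike(nn.Module):
    """Stand-in for deepspeed's DeepSpeedEngine, which stores the user
    model as `self.module` and does not expose its class attributes."""
    def __init__(self, module: nn.Module):
        super().__init__()
        self.module = module

engine = EngineLike(RewardModel())
# Attribute lookup on the wrapper fails, as in the traceback above...
assert not hasattr(engine, "base_model_prefix")
# ...but the unwrapped model still carries it:
assert hasattr(engine.module, "base_model_prefix")
```

If this is the mechanism, get_reward would need to receive (or unwrap to) the underlying module before reading base_model_prefix.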
