
[BUG] target_value_network_params initialization bug in convert_to_functional() #2523

Open
@matinmoezzi

Description

Bug Description

  1. During the initialization of LossModule and the execution of convert_to_functional(), the first layer's parameters are still UninitializedParameter objects (their shape has not yet been materialized, since the MLP is built without in_features). Consequently, when target_value_network_params is cloned from them, each weight becomes a Parameter of torch.Size([0]), leading to the error: RuntimeError: mat2 must be a matrix, got 1-D tensor.
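The point above can be reproduced with plain PyTorch, independent of torchrl: a lazy module holds an UninitializedParameter until the first forward pass infers its input size (a minimal sketch; the module names here are illustrative, not torchrl code).

```python
import torch
import torch.nn as nn

# An nn.LazyLinear (what MLP uses when in_features is omitted) holds an
# UninitializedParameter until its first forward pass.
lazy = nn.LazyLinear(out_features=8)
assert isinstance(lazy.weight, nn.parameter.UninitializedParameter)

# Cloning parameters at this stage is what trips up convert_to_functional():
# the uninitialized weight has no meaningful shape to copy.

# A forward pass materializes the parameter with the inferred in_features.
lazy(torch.randn(2, 4))
print(lazy.weight.shape)  # materialized as (out_features, in_features)
```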

  2. In the DiscreteCQLLoss class, when calling value_estimate, the params argument should reference target_params instead, as indicated in this line of code.

To Reproduce

Steps to reproduce the behavior.

import torch
from tensordict import TensorDict
from torchrl.data import OneHotDiscreteTensorSpec
from torchrl.modules import DistributionalQValueActor, MLP
from torchrl.objectives import DistributionalDQNLoss

nbins = 3
batch_size = 5
action_dim = 2
module = MLP(out_features=(nbins, action_dim), depth=2)
action_spec = OneHotDiscreteTensorSpec(action_dim)
qvalue_actor = DistributionalQValueActor(
    module=module,
    spec=action_spec,
    support=torch.arange(nbins),
)

loss_module = DistributionalDQNLoss(
    qvalue_actor,
    gamma=0.99,
    delay_value=True,
)
td = TensorDict(
    {
        "observation": torch.randn(batch_size, 4),
        "action": torch.nn.functional.one_hot(
            torch.randint(0, action_dim, (batch_size,)), action_dim
        ).float(),
        "next": {
            "observation": torch.randn(batch_size, 4),
            "reward": torch.randn(batch_size, 1),
            "done": torch.zeros(batch_size, 1, dtype=torch.bool),
        },
    },
    batch_size=[batch_size],
)
loss = loss_module(td)
print("Computed loss:", loss)
Running the script fails in the first linear layer:

File "../../python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat2 must be a matrix, got 1-D tensor

System info

python=3.10.15
torchrl=0.5.0
torch=2.4.1

import torchrl, numpy, sys
print(torchrl.__version__, numpy.__version__, sys.version, sys.platform)

0.5.0 2.1.2 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:20) [Clang 17.0.6 ] darwin

Possible fixes

Specify the input features in the MLP module:

module = MLP(in_features=4, out_features=(nbins, action_dim), depth=2)

This prevents the module parameters from being created as UninitializedParameter.
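An alternative workaround, sketched below with plain torch modules for illustration, is to run one dummy forward pass through the network before constructing the loss module, so that all lazy layers are materialized by the time their parameters are cloned (this assumes a known observation size, here 4).

```python
import torch
import torch.nn as nn

# Stand-in for an MLP built without in_features: the first layer is lazy.
net = nn.Sequential(nn.LazyLinear(32), nn.ReLU(), nn.Linear(32, 6))

# One dummy forward pass with a correctly shaped input materializes the
# lazy layer (in_features is inferred as 4 here).
net(torch.randn(5, 4))

# All parameters now have concrete shapes, so cloning them
# (as convert_to_functional() does) is safe.
print(net[0].weight.shape)
```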

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)

Thanks to @BY571 and @vmoens
