[BUG] `target_value_network_params` initialization bug in `convert_to_functional()` #2523
Bug Description
- During the initialization of `LossModule` and execution of `convert_to_functional()`, the first layer's parameters are set as `UninitializedParameter(shape=torch.Size(-1))`. Consequently, when `target_value_network_params` is cloned from them, it becomes `Parameter(torch.Size(0))`, leading to the error: `RuntimeError: mat2 must be a matrix, got 1-D tensor`.
- In the `DiscreteCQLLoss` class, when calling `value_estimate`, the `params` argument should reference `target_params` instead, as indicated in this line of code.
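For context, the same cloning hazard can be reproduced with plain PyTorch lazy modules (a minimal sketch, independent of torchrl): before the first forward pass, a `LazyLinear`'s weight is an `UninitializedParameter` with no usable shape, so anything cloned from it at that point is malformed.

```python
import torch
import torch.nn as nn

# A lazy layer defers shape inference until the first forward pass.
lazy = nn.LazyLinear(8)
print(isinstance(lazy.weight, nn.parameter.UninitializedParameter))  # True

# The first forward pass materializes the parameters with concrete shapes.
lazy(torch.randn(2, 4))
print(lazy.weight.shape)  # torch.Size([8, 4])
```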
To Reproduce
Steps to reproduce the behavior.
```python
import torch
from tensordict import TensorDict
from torchrl.data import OneHotDiscreteTensorSpec
from torchrl.modules import DistributionalQValueActor, MLP
from torchrl.objectives import DistributionalDQNLoss

nbins = 3
batch_size = 5
action_dim = 2

module = MLP(out_features=(nbins, action_dim), depth=2)
action_spec = OneHotDiscreteTensorSpec(action_dim)
qvalue_actor = DistributionalQValueActor(
    module=module,
    spec=action_spec,
    support=torch.arange(nbins),
)
loss_module = DistributionalDQNLoss(
    qvalue_actor,
    gamma=0.99,
    delay_value=True,
)

td = TensorDict(
    {
        "observation": torch.randn(batch_size, 4),
        "action": torch.nn.functional.one_hot(
            torch.randint(0, action_dim, (batch_size,)), action_dim
        ).float(),
        "next": {
            "observation": torch.randn(batch_size, 4),
            "reward": torch.randn(batch_size, 1),
            "done": torch.zeros(batch_size, 1, dtype=torch.bool),
        },
    },
    batch_size=[batch_size],
)

loss = loss_module(td)
print("Computed loss:", loss)
```
```
File "../../python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat2 must be a matrix, got 1-D tensor
```
System info

```
python=3.10.15
torchrl=0.5.0
torch=2.4.1
```

```python
import torchrl, numpy, sys
print(torchrl.__version__, numpy.__version__, sys.version, sys.platform)
```

```
0.5.0 2.1.2 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:20) [Clang 17.0.6 ] darwin
```
Possible fixes

Specify `in_features` in the `MLP` module:

```python
module = MLP(in_features=4, out_features=(nbins, action_dim), depth=2)
```

This prevents the module parameters from being `UninitializedParameter`.
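An alternative workaround (a sketch, using a plain `LazyLinear` stack as a stand-in for the torchrl `MLP`) is to run one dummy forward pass before constructing the loss module, so every lazy parameter is materialized before `convert_to_functional()` clones it:

```python
import torch
import torch.nn as nn

# Stand-in for the lazily-built MLP: LazyLinear infers in_features on first call.
net = nn.Sequential(nn.LazyLinear(16), nn.ReLU(), nn.LazyLinear(6))

# Dummy forward pass with a correctly-shaped input materializes every lazy layer.
net(torch.randn(1, 4))

# No UninitializedParameter remains, so cloning the parameters is now safe.
print(all(not isinstance(p, nn.parameter.UninitializedParameter)
          for p in net.parameters()))  # True
```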
Checklist
- I have checked that there is no similar issue in the repo (required)
- I have read the documentation (required)
- I have provided a minimal working example to reproduce the bug (required)