[Minor] Some final adjustments for scheduling models #195

Merged: merged 4 commits on Jun 14, 2024
Changes from 1 commit
[Config] updated configs to match latest experiments
LTluttmann committed Jun 13, 2024
commit f9843c1cd235490b9b43270bfccee39b6320cd06
1 change: 1 addition & 0 deletions configs/experiment/scheduling/am-pomo.yaml
@@ -14,6 +14,7 @@ model:
_target_: rl4co.models.L2DAttnPolicy
env_name: ${env.name}
scaling_factor: ${scaling_factor}
normalization: "batch"
batch_size: 64
num_starts: 10
num_augment: 0
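Note on the added `normalization: "batch"` key: in these attention policies the string selects which normalization layer is used inside the encoder ("batch" here, "instance" in the GNN configs below). A minimal sketch of the difference, assuming standard PyTorch layers; the actual layer construction inside L2DAttnPolicy may differ:

```python
import torch
import torch.nn as nn

def make_norm(kind: str, embed_dim: int) -> nn.Module:
    # Hypothetical helper mirroring the "normalization" config key;
    # the real rl4co implementation may wire this differently.
    if kind == "batch":
        # statistics shared across the whole batch
        return nn.BatchNorm1d(embed_dim, affine=True)
    if kind == "instance":
        # statistics computed per problem instance in the batch
        return nn.InstanceNorm1d(embed_dim, affine=True)
    raise ValueError(f"unknown normalization: {kind}")

x = torch.randn(64, 20, 128)                   # (batch, num_nodes, embed_dim)
norm = make_norm("batch", 128)
out = norm(x.transpose(1, 2)).transpose(1, 2)  # norm layers expect (B, C, L)
print(out.shape)                               # torch.Size([64, 20, 128])
```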
8 changes: 1 addition & 7 deletions configs/experiment/scheduling/am-ppo.yaml
@@ -43,14 +43,8 @@ model:
batch_size: 128
val_batch_size: 512
test_batch_size: 64
# Song et al use 1000 iterations over batches of 20 = 20_000
# We train 10 epochs on a set of 2000 instance = 20_000
train_data_size: 2000
mini_batch_size: 512
reward_scale: scale
optimizer_kwargs:
lr: 1e-4

env:
stepwise_reward: True
_torchrl_mode: True
stepwise_reward: True
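The comment removed from am-ppo.yaml documented the training-budget equivalence with Song et al. (1000 iterations over batches of 20 versus 10 epochs over 2000 instances, both 20,000 samples). A quick, purely illustrative arithmetic check of that equivalence:

```python
# Training budget comparison from the removed comment (illustrative only).
song_et_al = 1000 * 20   # 1000 iterations x batch size 20
ours = 10 * 2000         # 10 epochs x train_data_size 2000
assert song_et_al == ours == 20_000
```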
8 changes: 5 additions & 3 deletions configs/experiment/scheduling/base.yaml
@@ -22,17 +22,19 @@ trainer:

seed: 12345678

scaling_factor: 20
scaling_factor: ${env.generator_params.max_processing_time}

model:
_target_: ???
batch_size: ???
train_data_size: 2_000
val_data_size: 1_000
test_data_size: 1_000
test_data_size: 100
optimizer_kwargs:
lr: 1e-4
lr: 2e-4
weight_decay: 1e-6
lr_scheduler: "ExponentialLR"
lr_scheduler_kwargs:
gamma: 0.95
reward_scale: scale
max_grad_norm: 1
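Two of the base.yaml changes are worth unpacking: `scaling_factor` now resolves via OmegaConf interpolation to the environment generator's maximum processing time instead of a hard-coded 20, and the optimizer gains weight decay plus an exponential LR schedule. A small sketch of both mechanisms, assuming a standard OmegaConf/PyTorch setup rather than the exact rl4co wiring:

```python
import torch
from omegaconf import OmegaConf

# Interpolation: scaling_factor follows the generator's max_processing_time.
cfg = OmegaConf.create(
    {
        "env": {"generator_params": {"max_processing_time": 99}},
        "scaling_factor": "${env.generator_params.max_processing_time}",
    }
)
print(cfg.scaling_factor)  # 99 -- resolved on access

# ExponentialLR with gamma=0.95: lr is multiplied by 0.95 after every epoch.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, lr=2e-4, weight_decay=1e-6)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
for epoch in range(10):
    # ... one training epoch would run here ...
    scheduler.step()
print(optimizer.param_groups[0]["lr"])  # ~1.2e-4 after 10 epochs
```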
14 changes: 6 additions & 8 deletions configs/experiment/scheduling/gnn-ppo.yaml
@@ -12,24 +12,22 @@ logger:
model:
_target_: rl4co.models.L2DPPOModel
policy_kwargs:
embed_dim: 128
embed_dim: 256
num_encoder_layers: 3
scaling_factor: ${scaling_factor}
max_grad_norm: 1
ppo_epochs: 3
ppo_epochs: 2
het_emb: False
normalization: instance
test_decode_type: greedy
batch_size: 128
val_batch_size: 512
test_batch_size: 64
mini_batch_size: 512
reward_scale: scale
optimizer_kwargs:
lr: 1e-4


trainer:
max_epochs: 10


env:
stepwise_reward: True
_torchrl_mode: True
stepwise_reward: True
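For context on the `ppo_epochs` and `mini_batch_size` values above: after each rollout, PPO re-iterates over the collected transitions `ppo_epochs` times in mini-batches of `mini_batch_size`. A generic sketch of that inner loop, not the actual L2DPPOModel code:

```python
import torch

def ppo_inner_loop(rollout_size=8192, ppo_epochs=2, mini_batch_size=512):
    # Dummy stand-ins for collected transitions (states omitted for brevity).
    advantages = torch.randn(rollout_size)
    num_updates = 0
    for _ in range(ppo_epochs):
        # Reshuffle and sweep the whole rollout once per PPO epoch.
        for idx in torch.randperm(rollout_size).split(mini_batch_size):
            batch_adv = advantages[idx]
            # ... compute clipped surrogate loss and step the optimizer here ...
            num_updates += 1
    return num_updates

print(ppo_inner_loop())  # 2 epochs * (8192 / 512) = 32 gradient updates
```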
1 change: 1 addition & 0 deletions configs/experiment/scheduling/hgnn-pomo.yaml
@@ -18,6 +18,7 @@ model:
stepwise_encoding: False
scaling_factor: ${scaling_factor}
het_emb: True
normalization: instance
num_starts: 10
batch_size: 64
num_augment: 0
16 changes: 4 additions & 12 deletions configs/experiment/scheduling/hgnn-ppo.yaml
@@ -12,24 +12,16 @@ model:
model:
_target_: rl4co.models.L2DPPOModel
policy_kwargs:
embed_dim: 128
embed_dim: 256
num_encoder_layers: 3
scaling_factor: ${scaling_factor}
max_grad_norm: 1
ppo_epochs: 3
ppo_epochs: 2
het_emb: True
normalization: instance
batch_size: 128
val_batch_size: 512
test_batch_size: 64
mini_batch_size: 512
reward_scale: scale
optimizer_kwargs:
lr: 1e-4

trainer:
max_epochs: 10


env:
stepwise_reward: True
_torchrl_mode: True
stepwise_reward: True
8 changes: 1 addition & 7 deletions configs/experiment/scheduling/matnet-ppo.yaml
@@ -36,13 +36,7 @@ model:
batch_size: 128
val_batch_size: 512
test_batch_size: 64
# Song et al use 1000 iterations over batches of 20 = 20_000
# We train 10 epochs on a set of 2000 instance = 20_000
mini_batch_size: 512
reward_scale: scale
optimizer_kwargs:
lr: 1e-4

env:
stepwise_reward: True
_torchrl_mode: True
stepwise_reward: True
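Since all of these experiment files are Hydra configs, the values adjusted in this commit can also be overridden at launch time instead of editing the YAML. A sketch using Hydra's compose API; the config root, top-level config name, and `experiment=` group follow the repository layout shown above but are otherwise assumptions:

```python
from hydra import compose, initialize

# Compose one of the scheduling experiments and tweak two of the values
# touched in this commit, without editing the YAML files (assumed layout).
with initialize(config_path="configs", version_base="1.3"):
    cfg = compose(
        config_name="main",  # assumed top-level config name
        overrides=[
            "experiment=scheduling/am-ppo",
            "model.optimizer_kwargs.lr=1e-4",  # revert to the previous base lr
            "scaling_factor=20",               # pin the old constant factor
        ],
    )
    print(cfg.model.optimizer_kwargs.lr, cfg.scaling_factor)
```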