[Connect4] Default settings result in NaN loss, "ValueError 'a' cannot be empty unless no samples are taken" #60

Closed
@Elijas

Description

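Training Connect4 with the default settings produces a NaN loss after a few minutes, followed by a ValueError in the self-play worker (see the copied terminal errors below).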

Reproduction

  1. python muzero.py
  2. Select ConnectX (2)
  3. Select Train (0)
  4. Wait several minutes

Branch master, commit "c046c03 Fix backpropagate"
Python 3.7, Ubuntu 20.04 LTS, RTX6000 GPU (24GB), 8 CPU, 32GB RAM

Documentation

Recording of terminal session

https://asciinema.org/a/QR0bM7MH4TR2ulgGddZ4luJ1n

Terminal screenshot

image

Tensorboard

image

Terminal copied errors

Loss: nan
Warning : Extreme values (nan) in game priorities. Could be underfitting or overfitting.
2020-07-14 20:24:39,425.ERROR worker.py:987 -- Possible unhandled error from worker: ray::SelfPlay.continuous_self_play() (pid=10473, ip=45.79.123.77)
  File "python/ray/_raylet.pyx", line 446, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 400, in ray._raylet.execute_task.function_executor
  File "/root/muzero-general/self_play.py", line 48, in continuous_self_play
    0,
  File "/root/muzero-general/self_play.py", line 142, in play_game
    False if temperature == 0 else True,
  File "/root/muzero-general/self_play.py", line 312, in run
    action, node = self.select_child(node, min_max_stats)
  File "/root/muzero-general/self_play.py", line 359, in select_child
    for action, child in node.children.items()
  File "mtrand.pyx", line 907, in numpy.random.mtrand.RandomState.choice
ValueError: 'a' cannot be empty unless no samples are taken
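
One plausible link between the NaN loss and the ValueError, sketched under an assumption: that select_child picks at random among the children whose score equals the maximum score. That selection logic is not shown in the traceback, and select_best_action below is a hypothetical stand-in, not the repository's code. If the network outputs become NaN, every score is NaN, no score compares equal to the maximum, and the candidate list passed to numpy.random.choice is empty.

```python
import numpy as np


def select_best_action(scores):
    """Hypothetical stand-in for select_child: choose uniformly among
    the actions whose score equals the maximum score."""
    max_score = max(scores.values())
    # NaN never compares equal to anything, including itself, so this
    # list is empty when all scores are NaN.
    candidates = [action for action, score in scores.items() if score == max_score]
    return np.random.choice(candidates)


# Healthy scores: exactly one action has the maximum score.
healthy = {0: 0.1, 1: 0.7, 2: 0.3}
print(select_best_action(healthy))

# All-NaN scores, as might follow from a NaN loss corrupting the weights.
nan_scores = {0: float("nan"), 1: float("nan"), 2: float("nan")}
try:
    select_best_action(nan_scores)
except ValueError as err:
    # Same message as in the traceback above.
    print(err)
```

If this is what happens, the ValueError is a symptom and the NaN loss is the issue to track down first.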
