CUDA backend does not work with rust nightly #1867

Closed as not planned
@jggc

Description

Describe the bug
It is not clear whether this is a bug or a feature/documentation request, but here it goes:

Running Rust nightly 2024-05-30, no matter how I set up libtorch, I end up with:

2024-06-08T16:45:38.148370Z ERROR burn_train::learner::application_logger: PANIC => panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tch-0.15.0/src/wrappers/tensor_generated.rs:7988:40:                  
called `Result::unwrap()` on an `Err` value: Torch("Could not run 'aten::empty.memory_format' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective....

I am reporting this because it is at least the third time I have hit this same error, each time for a different reason, such as:

  • Running Rust stable (a while ago, burn required nightly)
  • Running Rust nightly
  • Running the wrong CUDA version
  • Running the wrong libtorch version
  • Setting up the environment variables incorrectly

What is my point
I think this error is unhelpful: the same message shows up for all of the causes above, so it does not say which one applies. There is a lot of room for improvement in the tch-gpu setup.

What do you think?

Should we:

  1. Implement pre-flight checks
  2. Improve and consolidate documentation
  3. Improve the error message. Reading "operator does not exist" gives little hint about where the problem actually is, IMHO.
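
To make options 1 and 3 a bit more concrete, here is a rough sketch of what I have in mind. It is not an existing burn or tch API: `preflight_warnings` and `explain_torch_error` are hypothetical names, the check only looks at the `LIBTORCH` and `LD_LIBRARY_PATH` environment variables (so it assumes a Linux-style setup with a manually installed libtorch), and the hint text is just my guess at useful wording. A real check would probably also need to compare libtorch and CUDA versions.

```rust
use std::env;
use std::path::Path;

/// Hypothetical pre-flight check (not part of burn or tch).
/// `lookup` abstracts environment access so the logic can be exercised
/// without touching the real process environment.
pub fn preflight_warnings<F>(lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut warnings = Vec::new();

    let root = match lookup("LIBTORCH") {
        Some(root) => root,
        None => {
            // Not necessarily fatal: the build may obtain libtorch another way.
            warnings.push(
                "LIBTORCH is not set; cannot verify which libtorch build is in use".to_string(),
            );
            return warnings;
        }
    };

    // Assumption: shared libraries live in `$LIBTORCH/lib`.
    let lib_dir = Path::new(&root).join("lib");
    if !lib_dir.is_dir() {
        warnings.push(format!(
            "LIBTORCH points to {root}, but {} is not a directory",
            lib_dir.display()
        ));
    }

    // Assumption: Linux dynamic loader, which consults LD_LIBRARY_PATH.
    let on_loader_path = lookup("LD_LIBRARY_PATH")
        .map(|paths| env::split_paths(&paths).any(|p| p == lib_dir))
        .unwrap_or(false);
    if !on_loader_path {
        warnings.push(format!(
            "{} is not listed in LD_LIBRARY_PATH",
            lib_dir.display()
        ));
    }

    warnings
}

/// Hypothetical helper for option 3: turn the raw Torch message into a hint.
/// It only pattern-matches on the text quoted in this issue.
pub fn explain_torch_error(message: &str) -> Option<&'static str> {
    if message.contains("Could not run") && message.contains("'CUDA' backend") {
        Some(
            "The linked libtorch may not provide CUDA kernels. Check the Rust toolchain, \
             the CUDA and libtorch versions, and the environment variables.",
        )
    } else {
        None
    }
}
```

At startup this could be called as `preflight_warnings(|key| std::env::var(key).ok())`, logging each returned warning before any tensor is created. Taking a lookup closure instead of reading the environment directly is only there to keep the check easy to test.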
