GH Artifacts (Horizontal lines) when running on L4 GPU #124

Open
leanderloew opened this issue Sep 13, 2024 · 1 comment
Comments

@leanderloew

L4 GPU:
Screenshot 2024-09-13 at 16 25 27
T4 GPU:
Screenshot 2024-09-13 at 16 26 10

Other variables look fine. I ran this with the neural_gcm_dynamic_forcing_stochastic_1_4_deg model.

@leanderloew leanderloew changed the title GH Artifacts (Vertical lines) when running on L4 GPU GH Artifacts (Horizontal lines) when running on L4 GPU Sep 13, 2024
@shoyer
Collaborator

shoyer commented Sep 19, 2024

Thanks for the report!

Since this artifact only shows up in the latitudinal direction, my guess is that it is somehow related to errors in the spherical harmonic transform. Potentially the L4 and T4 GPUs have different Tensor Cores, with slightly different numerical precision?
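To make the hypothesis above concrete, here is a minimal pure-Python sketch of how reduced-mantissa inputs perturb a sum of products like the one inside a Legendre transform. It assumes (this is not confirmed anywhere in the thread) that the L4 code path uses a TF32-like format that keeps 10 of float32's 23 mantissa bits. `truncate_mantissa` and `dot` are hypothetical helpers, the input values are made up, and simple truncation stands in for whatever rounding the hardware actually applies.

```python
import math
import struct

def truncate_mantissa(x, kept_bits=10):
    """Round x to float32, then zero all but the top kept_bits mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 has 23 explicit mantissa bits; clear the low (23 - kept_bits).
    mask = 0xFFFFFFFF & ~((1 << (23 - kept_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]

def dot(a, b, kept_bits=None):
    """Sum of products; optionally truncate the inputs first (assumed TF32-like)."""
    if kept_bits is not None:
        a = [truncate_mantissa(v, kept_bits) for v in a]
        b = [truncate_mantissa(v, kept_bits) for v in b]
    return math.fsum(x * y for x, y in zip(a, b))

# Made-up stand-ins for one row of a Legendre basis and one column of a field.
basis = [math.cos(0.07 * k) for k in range(64)]
field = [math.sin(0.10 * k) for k in range(64)]

full = dot(basis, field)
reduced = dot(basis, field, kept_bits=10)
print(f"full={full:.8f} reduced={reduced:.8f} diff={abs(full - reduced):.2e}")
```

If the real transform behaves similarly, small per-term errors of this kind would accumulate along the latitude sum, which would be consistent with artifacts that line up in one direction only.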

Can you try setting the precision for all spherical harmonic transforms to full float32? See #56 (comment).
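One way to try this without touching the model code is to raise JAX's default matmul precision, either globally or inside a scope. This is a sketch under assumptions: it may not be the exact change described in #56, and it affects every float32 matmul, not only the spherical harmonic transforms. `toy_transform` and `run_with_full_precision` are hypothetical stand-ins, not NeuralGCM functions.

```python
import jax
import jax.numpy as jnp

# Option 1: set the default for the whole process, before running the model.
jax.config.update("jax_default_matmul_precision", "float32")

# Option 2: limit the setting to a block of code.
def run_with_full_precision(fn, *args):
    with jax.default_matmul_precision("float32"):
        return fn(*args)

def toy_transform(basis, field):
    # Hypothetical stand-in for a Legendre transform:
    # contract the latitude axis j against a basis of shape (l, j).
    return jnp.einsum("lj,ij->il", basis, field)

basis = jnp.ones((4, 8), dtype=jnp.float32)
field = jnp.arange(16, dtype=jnp.float32).reshape(2, 8)
coeffs = run_with_full_precision(toy_transform, basis, field)
print(coeffs.shape)
```

If the horizontal lines disappear on the L4 with this setting, that would point at reduced-precision matmuls in the transform. Expect some slowdown, since full float32 matmuls cannot use the faster reduced-precision paths.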
