Skip to content

ww3_tp2.6 regression test hanging #1350

Open
@sbanihash

Description

Describe the bug
ww3_tp2.6 regression test stalls and reaches time limit. Log file shows that the model hangs at this stage:

        output dates out of run dates : Track point output deactivated
        output dates out of run dates : Nesting data deactivated
        output dates out of run dates : Partitioned wave field data deactivated
        output dates out of run dates : Restart files second request deactivated
   Wave model ...

slurmstepd: error: *** JOB 5132044 ON h11c53 CANCELLED AT 2025-01-13T23:27:44 DUE TO TIME LIMIT ***
slurmstepd: error: *** STEP 5132044.36 ON h11c53 CANCELLED AT 2025-01-13T23:27:44 DUE TO TIME LIMIT ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
#############################

The error seems to have entered the code after PR#1333

To Reproduce
Run ww3_tp2.6: (I had run matrix05 and encountered the issue)
run_cmake_test -b slurm -o all -S -T -s MPI -s PDLIB -w work_pdlib -g pdlib -f -p srun -n 24 ../model ww3_tp2.6

Expected behavior
The run will stall and reach time limit.

Log file:
matrix05_out.txt

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions