
Enable and retest RT cases on Derecho #2038

Closed
@zach1221

Description

Several cases were disabled for Derecho in rt.conf to match the Cheyenne settings. They should be retested and debugged to see whether they can be enabled on Derecho.

Enable and retest regional_atmaq_debug_intel, cpld_control_p8_faster_intel, cpld_bmark_p8, cpld_restart_bmark_p8 and conus13km_debug_qr on Derecho.

To Reproduce:

  1. Clone the develop branch of the ufs-weather-model repo
  2. Enable these tests in rt.conf
  3. Re-run the regression tests
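
Step 2 can be scripted. The following is a minimal sketch, assuming each test appears in rt.conf on a line of the form "RUN | test name | - machine.compiler ... | other fields", where the third field lists platforms on which the test is skipped. That layout, the helper name enable_on_derecho, and the sample file contents are assumptions for illustration; check them against the rt.conf in your checkout before using anything like this.

```shell
#!/usr/bin/env bash
# Sketch only: assumes an rt.conf line layout of
#   RUN | <test name> | - <machine>.<compiler> ... | <other fields>
# where "derecho.intel" in the third field means "skip on Derecho".

# enable_on_derecho FILE TEST...
# Removes the "derecho.intel" token from the RUN lines of the named tests.
# Limitation: if derecho.intel is the only token, a bare "-" is left behind.
enable_on_derecho() {
  local conf=$1; shift
  local t
  for t in "$@"; do
    # Only touch RUN lines whose second field is exactly the test name.
    sed -i -E "/^RUN[[:space:]]*\|[[:space:]]*${t}[[:space:]]*\|/ s/[[:space:]]derecho\.intel//" "$conf"
  done
}

# Demo on a throwaway sample file (contents are made up).
sample=$(mktemp)
cat > "$sample" <<'EOF'
RUN | cpld_bmark_p8 | - cheyenne.intel derecho.intel | baseline |
RUN | conus13km_debug_qr | - derecho.intel hera.intel | baseline |
EOF
enable_on_derecho "$sample" cpld_bmark_p8
cat "$sample"
```

Running the edit on a copy of rt.conf first and diffing the result against the original is a cheap way to confirm that only the intended lines changed.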

Additional context

Output

Activity

self-assigned this on Dec 14, 2023

DeniseWorthen commented on Dec 14, 2023
Collaborator

Cheyenne-disabled tests also include cpld_bmark_p8, cpld_restart_bmark_p8 and conus13km_debug_qr

zach1221 commented on Dec 14, 2023
Collaborator, Author

Quoting @DeniseWorthen: "Cheyenne-disabled tests also include cpld_bmark_p8, cpld_restart_bmark_p8 and conus13km_debug_qr"

Yes, true. Ok, noted in description as well.

natalie-perlin commented on Feb 20, 2024
Collaborator

Here are the modules that need to be loaded on Derecho to use ecflow and rocoto:

module use /glade/work/epicufsrt/contrib/spack-stack/derecho/modulefiles
module load ecflow/5.8.4
module use /glade/work/epicufsrt/contrib/derecho/rocoto/modulefiles
module load rocoto
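
After loading the modules above, it can help to confirm that the workflow tools actually resolve on PATH. The following is a minimal sketch; the helper name missing_tools is hypothetical, and it is an assumption that these modules expose the ecflow_client and rocotorun commands directly.

```shell
#!/usr/bin/env bash
# Sketch: report which workflow commands are missing from PATH.
# The command names passed in below are assumptions; adjust as needed.

# missing_tools CMD...
# Prints each command that cannot be found; returns 1 if any is missing.
missing_tools() {
  local rc=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      rc=1
    fi
  done
  return "$rc"
}

# Run after the module load lines above.
missing_tools ecflow_client rocotorun || echo "load the modules first"
```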

zach1221 commented on Feb 20, 2024
Collaborator, Author

@natalie-perlin OK, should the variable ECFLOW_START=/glade/p/ral/jntp/tools/miniconda3/4.8.3/envs/ufs-weather-model/bin/ecflow_start.sh be changed?

BrianCurtis-NOAA commented on Feb 20, 2024
Collaborator

Derecho has its own ecflow install, available through module load ecflow, if that is easier to use.

Once you run module load ecflow, it sets up the paths to their ecflow_start.sh. I am not sure, though, how much that complicates using the ecflow Python package.

natalie-perlin commented on Feb 20, 2024
Collaborator

There is an ecflow_start.sh script already:
ECFLOW_START=/glade/work/epicufsrt/contrib/spack-stack/derecho/ecflow-5.8.4/bin/ecflow_start.sh

natalie-perlin commented on Feb 20, 2024
Collaborator

Quoting @BrianCurtis-NOAA: "Derecho has its own ecflow install through module load ecflow, if it's easier to use that. Once you module load ecflow it creates the paths to use their ecflow_start.sh. I am not sure though, how much that complicates ecflow package on python."

@BrianCurtis-NOAA - it may be more convenient to use the same ecflow version that was used during the spack-stack build.

zach1221 commented on Feb 20, 2024
Collaborator, Author

I have this set up for Derecho in rt.sh:
elif [[ $MACHINE_ID = derecho ]]; then

export PATH=/glade/work/epicufsrt/contrib/derecho/rocoto/bin:$PATH
module use /glade/work/epicufsrt/contrib/spack-stack/derecho/modulefiles
module load ecflow/5.8.4
ECF_PORT=$(( $(id -u) + 1500 ))
ECFLOW_START=/glade/work/epicufsrt/contrib/spack-stack/derecho/ecflow-5.8.4/bin/ecflow_start.sh

With this I get: ImportError: /glade/u/home/zshrader/miniconda3/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /glade/work/epicufsrt/contrib/spack-stack/derecho/ecflow-5.8.4/lib/python3.10/site-packages/ecflow/ecflow.so)
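
The error indicates that the Python in use resolves libstdc++.so.6 from a miniconda installation, and that this copy lacks the GLIBCXX_3.4.29 symbol version required by the ecflow Python extension. The following is a minimal sketch for checking whether a given libstdc++ file advertises a symbol version. The helper name has_glibcxx is hypothetical, the $HOME-based path is an assumption derived from the error message, and the check is a plain byte search (comparable to piping strings into grep), not a real ELF inspection.

```shell
#!/usr/bin/env bash
# has_glibcxx LIB VERSION
# Succeeds if the file LIB contains the string GLIBCXX_<VERSION>.
# Returns 2 if LIB is not readable, 1 if the string is absent.
# Note: this is a prefix match, so asking for 3.4.2 also matches 3.4.29.
has_glibcxx() {
  local lib=$1 version=$2
  [[ -r $lib ]] || return 2
  grep -a -q -F "GLIBCXX_${version}" "$lib"
}

# Assumed location, based on the path in the error message.
lib="$HOME/miniconda3/lib/libstdc++.so.6"
if has_glibcxx "$lib" 3.4.29; then
  echo "$lib provides GLIBCXX_3.4.29"
else
  echo "$lib is unreadable or lacks GLIBCXX_3.4.29"
fi
```

If the miniconda copy lacks the version while the compiler stack's copy has it, the likely culprit is the order in which the two Python environments appear on PATH.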

natalie-perlin commented on Feb 21, 2024
Collaborator

These changes to rt.sh allowed a job to enter the queue (it is still queued):

elif [[ $MACHINE_ID = derecho ]]; then

 module use /glade/work/epicufsrt/contrib/derecho/rocoto/modulefiles
 module load rocoto
 module use /glade/work/epicufsrt/contrib/spack-stack/derecho/modulefiles
 module load ecflow/5.8.4
 module unload ncarcompilers
 module use /glade/work/epicufsrt/contrib/spack-stack/derecho/spack-stack-1.5.1/envs/unified-env/install/modulefiles/Core
 module load stack-intel/2021.10.0
 module load stack-python/3.10.8
 ECFLOW_START=/glade/work/epicufsrt/contrib/spack-stack/derecho/ecflow-5.8.4/bin/ecflow_start.sh

I would also suggest the following change in ./modulefiles/ufs_derecho.intel.lua:

Change the line:
prepend_path("MODULEPATH", "/lustre/desc1/scratch/epicufsrt/contrib/modulefiles")
to
prepend_path("MODULEPATH", "/glade/work/epicufsrt/contrib/spack-stack/derecho/modulefiles")
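
The one-line change above could be applied with sed. The following is a minimal sketch that works on a temporary file containing only the quoted line, so nothing in a real checkout is touched. The helper name swap_modulepath is hypothetical, and sed -i is used in its GNU form.

```shell
#!/usr/bin/env bash
# Sketch: swap the MODULEPATH entry in a modulefile.
old='/lustre/desc1/scratch/epicufsrt/contrib/modulefiles'
new='/glade/work/epicufsrt/contrib/spack-stack/derecho/modulefiles'

# swap_modulepath FILE: replace the old quoted path with the new one in place.
swap_modulepath() {
  # '#' is the sed delimiter because both paths contain '/'.
  sed -i "s#\"${old}\"#\"${new}\"#" "$1"
}

# Demo on a temporary file holding only the line quoted above.
tmp=$(mktemp)
echo 'prepend_path("MODULEPATH", "/lustre/desc1/scratch/epicufsrt/contrib/modulefiles")' > "$tmp"
swap_modulepath "$tmp"
cat "$tmp"
```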

natalie-perlin commented on Feb 21, 2024
Collaborator

@zach1221 - with these changes, the cpld_control_p8_mixedmode_intel test passed (it was the only test set in rt.conf).
Regression test log: /glade/derecho/scratch/nperlin/UFS-WM/ufs-weather-model/tests/logs/RegressionTests_derecho.log

zach1221 commented on Feb 21, 2024
Collaborator, Author

/glade/derecho/scratch/nperlin/UFS-WM/ufs-weather-model/tests

Seems to be working, @natalie-perlin. Thank you.



Metadata

Labels: bug (Something isn't working)
Status: Done
