Test MAPL v2.46.3 in UFS weather model #2346
Comments
@lipan-NOAA Can you confirm this version of MAPL works in GOCART for GEFSv13? Thanks |
@Hang-Lei-NOAA is MAPL 2.46.2 installed on Acorn/WCOSS2? |
I will add these today.
|
@Hang-Lei-NOAA Can you tell me where you installed it? |
@lipan-NOAA Please check here:
module use /lfs/h1/emc/nceplibs/noscrub/hpc-stack/libs/hpc-stack/modulefiles/mpi/intel/19.1.3.304/cray-mpich/8.1.9
module load esmf/8.6.1
module load mapl/4.6.2-esmf-8.6.1
|
do you want this installed in spack-stack/1.6.0? and on which machine? |
@DusanJovic-NOAA @junwang-noaa New MAPL and ESMF versions are available on Hercules and Orion for test and debug activities. @RatkoVasic-NOAA thanks for the installation!
|
As an FYI, the MAPL 2.46.3 fix was when using externally initialized MPI. (Which is something you all do but we don't do internally). We also had this notice to users:
My guess is you do not need to call |
@weiyuan-jiang may I ask what compiler/mpi versions you want to test? Thanks |
@junwang-noaa Well for that version of MAPL, we would have been testing with:
We never had Intel ifort 2021.9 on any machines we have. Are any of these possible? |
@AlexanderRichert-NOAA do we have these Intel and GCC versions on Hera, Hercules, or Gaea? |
I can't get onto Hera at the moment but for the others:
|
As we have access to Hercules/Orion, I guess ifort 2021.12 is our best bet at the moment (though I'd have to imagine there must be a recent-ish gcc on there). I'm fairly certain we can run with 2021.12. There were some bugfixes needed in places of GEOS (I don't think MAPL, but I can find out), so by the time we got them all in, 2021.13 was out, but I think 2021.12 works. Plus, we have 2021.12 on discover, so we can do some matching if need be. |
@weiyuan-jiang @mathomp4 Which MAPL version would you like us to test in UFS? I assume the ESMF version is 8.6.1 |
@junwang-noaa It doesn't matter. I can always replace the MAPL as long as it can be built and run under new compiler on Hercules |
I am not sure how it was upgraded. It is still 2021.9 for me:
git pull
module load ufs_hercules.intel
ifort --version |
@weiyuan-jiang I repeated the 'cpld_control_p8 intel' test and confirmed that version 2021.12.0 is used (see: /work2/noaa/stmp/djovic/stmp/djovic/FV3_RT/rt_1593997/cpld_control_p8_intel/out). The ifort command still shows 2021.9.0, but in the module file we set I_MPI_F90: so when the mpiifort wrapper is used, the version of the compiler used is:
Alex was not able to use both the C and Fortran compilers from 2021.12.0 to build the libraries, which is why this hack was necessary (see the comments above). Also, take a look at the err file from my last run: /work2/noaa/stmp/djovic/stmp/djovic/FV3_RT/rt_1593997/cpld_control_p8_intel/err |
@DusanJovic-NOAA Thanks. Is the ESMF library also built with the same compiler? I think ESMF should be built with 2021.12 because it is the ESMF call that crashes. |
I think it is. @AlexanderRichert-NOAA can you confirm (check?) that ESMF is compiled with Fortran version 2021.12.0? Thanks. |
It seems to me that icc and icpc are still 2021.9. ESMF is written mostly in C and C++. |
According to this comment #2346 (comment), 2021.12.0 has only icx/icpx and ifort, and there were some issues using icx, if I understand correctly. |
If it is mixed, can it go back to an older icc and icpc, such as 2021.6? We don't have 2021.9 on our system, so we cannot reproduce the issue. |
FWIW the only NOAA system I see with 2021.6 is Gaea (C5 and C6) via the intel/2022.1.0 module. |
I see this esmf error in PET* log files when I run
Could this be the reason for the error we see in mapl where the grid is not recognized as cubed-sphere grid? |
This looks to be the same message I saw in #1888 |
The message "json exception" sounds a bit familiar... (#2371 (comment)) What helped to resolve this issue on MacOS (clang for C, CXX, and gfortran for Fortran) was using CXX linker instead of Fortran. The issue could arise when mixing C/CXX compilers and Fortran compilers from different vendors.
This may not be the solution for this particular case, but it may be worth testing as an option. |
I tried setting LINKER_LANGUAGE to CXX but compilation fails:
|
Thank you very much for testing this option, good to know that this potential quick fix had other implications... I will look more into this issue with the "json.exception error". It has something to do with nlohmann/json library in C++, which may need help to be located or linked. |
When the error says "undefined reference to `main'", it usually means the link language should be Fortran. |
I think I found the reason why model reports the error message "" and ultimately fails.
The above error message is first printed in "MAPL_GetGlobalHorzIJIndex" subroutine called from "MAPL_GetHorzIJIndex" which is called from "Run1" routine in SU2G_GridCompMod. The reason the error is printed is that
while in the "Run1" phase the grid is retrieved from the mapl object:
The reason grids in "initialize" and "run1" are different is because the grid from the mapl object in 'run1' is actually a subgrid computed from the original 'cubed-sphere' grid passed from fv3atm in case gocart/mapl are running using omp threading. The subgrids created for each of the omp threads are not 'cubed-sphere' grids (with tileCount == 6), instead they are just regular single tile rectangular grids. This is done in 'make_subgrids_from_bounds' in generic/OpenMP_Support.F90 here. I think the code that creates the subgrids should be updated to correctly create multi-tiled grids if the original (primary_grid) is multi-tiled. Or maybe the logic in MAPL_GridGet should be updated to not rely on the value of tileCount to compute globalCellCountPerDim. As a quick fix, if I run the test without omp threads in gocart by setting "use_threads: .FALSE." it does not print those "It only works for cubed-sphere grid" FAIL messages. Unfortunately it still fails with floating point exception later in the forecast. |
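The failure mode described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not MAPL or ESMF code: the `Grid` class, `split_for_threads`, and `is_cubed_sphere` are hypothetical stand-ins, and the rule "tileCount == 6 means cubed-sphere" is taken only from the description in this comment.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """Hypothetical stand-in for a grid (not an ESMF_Grid)."""
    nx: int          # cells in the i direction per tile
    ny: int          # cells in the j direction per tile
    tile_count: int  # 6 for a cubed-sphere grid, 1 for a single-tile grid


def split_for_threads(primary, n_threads):
    """Mimic the reported behaviour: each thread gets a single-tile slab."""
    rows = primary.ny // n_threads
    return [Grid(primary.nx, rows, tile_count=1) for _ in range(n_threads)]


def is_cubed_sphere(grid):
    """The check as described in the thread: rely only on the tile count."""
    return grid.tile_count == 6


primary = Grid(nx=96, ny=96, tile_count=6)
subgrids = split_for_threads(primary, n_threads=4)

# The grid handed over by the host model passes the check ...
print(is_cubed_sphere(primary))
# ... but the per-thread subgrids seen in the run phase do not.
print([is_cubed_sphere(g) for g in subgrids])
```

In this toy model the primary grid passes the check while every per-thread subgrid fails it, which matches the observation that the messages disappear when threading is turned off.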
@DusanJovic-NOAA Thanks for finding this. At this point I can confirm that multithreading in MAPL (except in older versions like 2.40...) does not work with GOCART. |
Thanks for confirming. |
The model is failing with the floating point exception because of grid mask inconsistencies between fv3atm/gocart and what MAPL expects. MAPL is computing remapping route handles in "create_route_handle" in base/MAPL_EsmfRegridder.F90 and it sets 'dstMaskValues' to MAPL_MASK_OUT (which is 0):
where "has_mask" is true if the grid has an ESMF_GRIDITEM_MASK item. The fv3grid has ESMF_GRIDITEM_MASK set, and the mask values are either 0=ocean or 1=land, which means the grid points with mask = 0 are valid destination points and should not be masked out. Currently, the hard-coded MAPL_MASK_OUT constant assumes that all points with mask 0 are masked out, which is not the case in fv3atm, or in gocart when used with fv3atm. Maybe the value can be passed via grid attributes or something like that. Or maybe in fv3atm we can change the mask values to something other than 0, for example 1=land, 2=ocean, but I do not know if and how that would impact other coupled components and CMEPS. Anyway, as a test I simply commented out the above line that sets the value of dstMaskValues, and the model finally finished the test without any error. |
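To make the mask problem concrete, here is a small sketch in plain Python (not ESMF or MAPL code). It assumes the semantics described above: a destination point is skipped when its mask value appears in the list of destination mask values. The function name `mapped_points` and the sample mask are illustrative only.

```python
MAPL_MASK_OUT = 0  # value quoted in the comment above


def mapped_points(grid_mask, dst_mask_values):
    """Return indices of destination points that would receive regridded data.

    Assumption: a point is excluded when its mask value is listed in
    dst_mask_values.
    """
    return [i for i, m in enumerate(grid_mask) if m not in dst_mask_values]


# fv3atm-style land/sea mask: 0 = ocean, 1 = land (sample values)
fv3_mask = [0, 1, 1, 0, 0, 1]

with_mask_out = mapped_points(fv3_mask, [MAPL_MASK_OUT])
without_masking = mapped_points(fv3_mask, [])

print(with_mask_out)    # only the land points are mapped
print(without_masking)  # every point is mapped
```

With the hard-coded value, the ocean points never receive data, which is consistent with the floating point exception appearing later in the forecast.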
@DusanJovic-NOAA I do not think we want to change the current (0,1) mask values for ATM because that will have downstream impacts in CMEPS. I can't even find the MAPL code you reference, but if you don't want to mask the destination anywhere, then why not provide a special value? CMEPS uses |
MAPL code I reference is here: I do not understand where that special value should be provided and how is MAPL going to use that value? |
I didn't mean that MAPL should somehow reach into CMEPS for the value. I was just pointing out that by setting a special value (one that is never encountered), you can map to all destination points. |
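The special-value idea can be sketched the same way, again in plain Python with hypothetical names: if the value used to mask out destination points never occurs in the grid mask, no point is excluded. The sentinel `-999` is an arbitrary placeholder chosen for this example, not the value any component actually uses.

```python
def mapped_points(grid_mask, dst_mask_values):
    """Return indices of destination points whose mask is not masked out."""
    return [i for i, m in enumerate(grid_mask) if m not in dst_mask_values]


NEVER_USED = -999  # hypothetical sentinel; must not appear in the grid mask

# fv3atm-style land/sea mask: 0 = ocean, 1 = land (sample values)
fv3_mask = [0, 1, 1, 0, 0, 1]

all_mapped = mapped_points(fv3_mask, [NEVER_USED])

# Every destination point survives because nothing matches the sentinel.
print(len(all_mapped) == len(fv3_mask))
```

The grid mask itself stays untouched in this approach, so other coupled components that read the 0/1 values would see no change.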
Sorry, I still do not understand what should be set to a special value, a grid mask? The problem is that some of the fv3atm grid points have grid mask set to 0 and MAPL considers them as destination grid points that should be masked out. The fv3atm grid mask is defined in this routine https://github.com/NOAA-EMC/fv3atm/blob/a7d46eee01a78f0915373ebc58c9b20ba14a6c36/atmos_model.F90#L3654 |
The I'm probably misunderstanding, since I've never looked at MAPL. But the issue is that when the destination is ATM, you want all destination points mapped, right? Or is the destination in this case the aerosol "grid/mesh" ? |
Yes, it can be. But it is currently set to 0 (MAPL_MASK_OUT is 0). Here. That means any fv3atm destination point with grid mask 0 (all ocean points) will be masked out.
Yes. All fv3atm destination points should be mapped, i.e. none of them should have the grid mask set to 0. If we cannot redefine the fv3atm grid mask to avoid 0 for every grid point, which it seems we cannot do, then either we pass the information to MAPL so that it does not use 0 to mask out destination points, or we redefine, somewhere in fv3atm or maybe in gocart, the grid mask that is going to be passed to MAPL. |
But why does the |
@weiyuan-jiang @tclune UFS is using a land sea mask on FV3 cubed sphere grid (0 for ocean and 1 for land) for coupling purposes. May I ask if the following line (MAPL_MASK_OUT) can be updated so that mask value 0 in UFS won't be considered undefined?
|
I think the constant value of MAPL_MASK_OUT should be updated |
With this change in GOCART I was able to make a successful (no floating point exception) run:
diff --git a/ESMF/UFS/Aerosol_Cap.F90 b/ESMF/UFS/Aerosol_Cap.F90
index b753afa..e21ed11 100644
--- a/ESMF/UFS/Aerosol_Cap.F90
+++ b/ESMF/UFS/Aerosol_Cap.F90
@@ -243,6 +243,7 @@ contains
type(MAPL_CapOptions) :: maplCapOptions
type(Aerosol_InternalState_T) :: is
type(Aerosol_Tracer_T), pointer :: trp
+ integer(kind=ESMF_KIND_I4), pointer :: maskPtr(:,:)
! begin
rc = ESMF_SUCCESS
@@ -338,6 +339,10 @@ contains
file=__FILE__)) &
return ! bail out
+ call ESMF_GridGetItem(grid, itemflag=ESMF_GRIDITEM_MASK, &
+ staggerloc=ESMF_STAGGERLOC_CENTER, farrayPtr=maskPtr, _RC)
+ maskPtr = 1
+
! provide model grid to MAPL
call cap % cap_gc % set_grid(grid, lm=nlev, _RC)
This change explicitly sets all grid mask points to 1 before passing the grid to MAPL. |
Description
MAPL 2.46.2 has fixes for issue #2162. UFS weather model needs to be tested and updated with this version.
20240830:
MAPL 2.46.2 has a bug. MAPL 2.46.3 should be installed and tested in UFS weather model. The issue title is updated.
Solution
Alternatives
Related to