Bugfix/cam rev7 #405
Conversation
I have cloned your fork and checked out the branch.
Beginning w/ item 34 in Wei's document ... I ran the global_ens plotting scripts for VDATE=20231230:
The only WARNINGs that persist are for ECME data not being available. COMOUT=
The tar files for these three plot jobs were successfully produced, and the expected plot png quantities for each tar file were confirmed.
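For reference, the png-count check described above can be scripted; here is a minimal sketch (not part of the PR), with a hypothetical tarball name:

```python
# Count the .png members in each plot tarball (tarball name below is hypothetical)
import tarfile

tarballs = [
    "evs.plots.global_ens.atmos.grid2grid.past31days.v20231230.tar",  # hypothetical
]

for path in tarballs:
    with tarfile.open(path) as tar:
        n_png = sum(1 for member in tar.getmembers() if member.name.endswith(".png"))
        print(f"{path}: {n_png} .png files")
```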
Item 36 in Wei's document ... I ran the job. The WARNING for OMP_NUM_THREADS is gone.
I confirmed there are many of these warning statements in the output for the emc.vpppg parallel runs. The PR corrects item 36. ✔️
Item 40 in Wei's document ... I ran the job. The WARNING persists, but is reduced, as per the proposed improvement described by Marcel. Now the warning messages are for 3-hr intervals rather than an hourly cycle frequency. The WARNING messages are only for cycles 09 and 15.
I also checked the logfile for emc.vpppg (same VDATE and vhr). The PR addresses item 40 appropriately. ✔️
Item 41 in Wei's document ... I ran the job.
I confirmed the expected outputs exist in Wei's parallel for the same VDATE and vhr.
This PR corrects item 41. ✔️
Items 42 and 43 in Wei's document ... I am running
For item 42 we are looking for no more WARNING messages related to Empty dataframes, as well as SNOD graphics being generated only for hrrr and namnest. So far there are no occurrences of the WARNING message. The same run in Wei's parallel produced the warning message 75,420 times! 🤞
For item 43 we are looking for reduced runtime. UPDATE: the snowfall plots job completed. I still see quite a few WARNING messages.
COMOUT=
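For reference, a count like the 75,420 figure above can be reproduced by tallying matching WARNING lines in the job log; a minimal sketch, with a hypothetical log path and message substring:

```python
# Count WARNING lines about empty dataframes in a job log
# (log path and message substring are hypothetical placeholders)
log_path = "jevs_cam_grid2obs_plots.o123456789"
needle = "empty dataframe"

with open(log_path) as log:
    count = sum(1 for line in log if "WARNING" in line and needle in line.lower())

print(f"{count} matching WARNING lines in {log_path}")
```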
It looks like COMOUTplots aged off already. I will re-run the cam plots processing once the dev queues are re-established on dogwood later today.
Thanks very much Shelley and Steve for reviewing this PR so far.
I missed this second concern about the missing ECME data; it looks to be a separate issue. I'll check that and update. That leaves 34, 42, and 43. An update for 34 and 42 will be added tomorrow, and new items 39, 49, and 50 will be added shortly after. None of the new items should require retesting the above already-tested jobs.
I forgot to add: I'm also adding item 55 today. For simplicity, this may be tested using one of the jobs we'll test for the other new items.
Thanks Marcel! Re: item 43, I checked the log file. But I also looked at the log file outputs from the NCO parallel, which appears to have nproc=1280 in the ecf script.
Yes, the only change is increased processors, and 4:43 is an excellent result. I wonder, though, why there is such a large difference compared to the parallel jobs. Is it possible to copy those logs to Hera today? I can also check them tomorrow.
Marcel, I put the cactus log files in /dfs. Once dogwood returns, you can find them there.
I re-ran the global_ens grid2obs plots processes for 31 days following commit 79566c4.
The WARNING messages for ECME and CMCE are gone for the non-separate plots.
Thanks @ShelleyMelchior-NOAA and @StevenSimon-NOAA! I'll update soon, once I have a fix for the lingering issue in the "separate" job. A snowfall fix is committed here, but I am re-testing because I expect a few more WARNINGs will exist. I should also have an update about that soon.
@MarcelCaron-NOAA Thank you for working on this. The "separate" scripts from Binbin separate different initialization times so that we have separate 00Z and 12Z plots for near-surface variables like 2-m temperature. Here is an example webpage: It looks like these plots broke several days ago (other plots have lines that extend out past this). I'm unsure if the timing of the break is associated with a PR that was merged or what, but it would be good to get this working again. CCing @StevenSimon-NOAA and @ShelleyMelchior-NOAA
… "group" or "set" in model name
It looks like the "missing ECME data will not be plotted" WARNINGs have been successfully addressed for the g2o separate init plots. Please see the following logfile to double-check if necessary: /lfs/h2/emc/ptmp/shelley.melchior/EVS_out/jevs_global_ens_atmos_gefs_grid2obs_past31days_separate_plots.o112718556
Updates
Re: global-ens (34) - Thanks @StevenSimon-NOAA, I agree. I think removing ECME from CAPEsfc plots fixed the ECME-only warnings. The CMCE warnings are also for CAPEsfc and trigger when the CAPE metrics are NaN. Part of those warning messages: "This may occur if no forecast or observed events were counted at any threshold for any model, so that all performance statistics are undefined." The lowest threshold is 250 J/kg. I think such warnings are expected during the cool season.
Re: snowfall (42) - Removed models from plots when they have no data (06Z, 18Z inits for hiresws; or longer leads for hrrr and some hiresws) and muted some unhelpful warnings. I think I got everything but I'm running one more test to check.
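A minimal sketch of the kind of filtering described for item 42 (dropping models with no data before plotting); this is not the actual EVS plotting code, and the column names and values are hypothetical:

```python
# Drop models whose statistic values are all NaN so they are neither plotted
# nor counted toward empty-data warnings (column names are hypothetical)
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "MODEL": ["hrrr", "hrrr", "namnest", "hireswarw"],
    "STAT_VALUE": [0.42, 0.38, 0.51, np.nan],
})

# Keep only models with at least one non-NaN statistic
has_data = df.groupby("MODEL")["STAT_VALUE"].apply(lambda s: s.notna().any())
df_plot = df[df["MODEL"].map(has_data)]
print(sorted(df_plot["MODEL"].unique()))  # ['hrrr', 'namnest']
```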
I concur that global_ens g2o plotting, Wei's item 34, is addressed. ✔️
I tested the changes to jevs_cam_snowfall_plots and a few warnings remain, all seemingly for graphics in CONUS_South, when the lowest snowfall thresholds are too high to find any events (similar to the CAPEsfc warnings described above). I think this job is ready to be reviewed here again.
Marcel -- I am getting started on running tests. Can you do a sync fork on your branch? Two PRs were merged yesterday. Thanks!
Wei's item 43 -- I am re-running
logfile:
Wei's item 55 -- Changes to
Wei's item 57 -- Changes to
The re-run for
@ShelleyMelchior-NOAA Yes, these warning messages should reduce or disappear when we start to see larger snowfall events in the CONUS_South domain. I think the remaining messages are appropriate. ✔️ Otherwise the output file looks good, and I found no other concerning messages in the log file.
Items 39 and 49 are addressed and the fixes are ready for review. Fixes for items 31 and 50 will be deferred to another PR.
Both jobs have relatively short runtimes (<30 min). Changes made and instructions for running the above jobs have been added to the PR description.
Hi Marcel! I just submitted the jobs. Both processes are running at the moment.
Logfiles: I'll leave these tests here for our review when we both return to work Friday.
@ShelleyMelchior-NOAA @MarcelCaron-NOAA Thanks for your continued work on this PR. Wei would like EVS v1.0.2 by COB today, so I hope that this PR is nearly there and that it can be included before next week's extended dev outage! Please keep up the good work. Fingers crossed.
✔️ jevs_cam_headline_plots outputted no warnings and the output file looks good. I need to fix the remaining warning message mentioned in item 49 and will update soon. If the fix tests successfully, I think this PR will be ready for final review and merge.
Thanks, @MarcelCaron-NOAA! We'll look for your fixes!
Update re. jevs_cam_href_spcoutlook_stats:
Deeper dive into the warning message (part of item 49):
This warning occurs consistently in every METplus process verifying MLCAPE, and impacts cam, mesoscale, and global-det jobs. The relevant MET code producing this warning is
The boolean statements at lines 816 and 817 check whether the lowest fcst level is strictly lower than (rather than lower than or equal to) the lowest obs level, and likewise for the highest levels.
Therefore, I think the warning triggers because the obs levels range is equal to the forecast levels range, not contained within it. I don't know if the levels configuration can be changed to address this, or if it's something to resolve on the METplus side. However, to progress this PR, since this warning affects several EVS components, and since changing the MLCAPE configuration to resolve a warning message may not be ideal, I suggest leaving this issue unresolved in this PR and addressing it separately if necessary.
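To illustrate the reasoning above (this is not the MET source, just a minimal sketch of the described comparison): with a strict check, an obs level range that exactly equals the fcst level range is not treated as contained, so the warning fires; an inclusive check would accept it.

```python
# Illustration only: strict vs. inclusive containment of the obs level range
# within the fcst level range (values are hypothetical MLCAPE layer bounds)
def strictly_contains(fcst_lo, fcst_hi, obs_lo, obs_hi):
    return fcst_lo < obs_lo and fcst_hi > obs_hi

def contains(fcst_lo, fcst_hi, obs_lo, obs_hi):
    return fcst_lo <= obs_lo and fcst_hi >= obs_hi

print(strictly_contains(0, 90, 0, 90))  # False -> warning triggers
print(contains(0, 90, 0, 90))           # True  -> no warning
```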
@MarcelCaron-NOAA Thank you for digging into this. We will leave this out of the current PR and either tell Wei we can't remove it because it's METplus or try to resolve it in a future PR. With so many dev outages, it's likely option 1. @ShelleyMelchior-NOAA will work to merge this in its current form. Thanks!
Thanks for this thorough digging, Marcel. This makes sense. We can delay addressing what you found and put it into a future PR. I will conclude the code review and merge this PR after one final sync fork.
All tests yield expected results, except that item 49 is not fully complete. The remaining WARNINGs for item 49 will be addressed in a future PR.
Pull Request Testing
This PR addresses issues 34, 36, 39-43, 49, 55, and 57 noted in the Google Doc titled "EVS v1.0 issues". Information about the cause and fix of each:
This PR also addresses the following issues:
For the most part, the affected jobs were not tested.
(1) Set up jobs
• Symlink the EVS_fix directory locally as "fix"
• In all driver scripts, point all exports of "HOMEevs" to your EVS test directory (a quick sanity check is sketched after this list)
• In all stats driver scripts, change COMIN to the parallel directory (/lfs/h2/emc/vpppg/noscrub/emc.vpppg/$NET/$evs_ver_2d)
• Optional: In the stats driver scripts, change "MAILTO" to point to your email address
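A minimal sketch of the HOMEevs sanity check mentioned above; the test directory and driver-script location below are hypothetical placeholders and should be adjusted:

```python
# Flag driver scripts whose HOMEevs export does not point to the test directory
# (test_dir and driver_dir are hypothetical placeholders)
import re
from pathlib import Path

test_dir = "/lfs/h2/emc/vpppg/noscrub/First.Last/EVS"
driver_dir = Path("dev/drivers/scripts")

for script in sorted(driver_dir.rglob("*.sh")):
    for line in script.read_text().splitlines():
        match = re.match(r'\s*export\s+HOMEevs=["\']?([^"\']+)', line)
        if match and match.group(1).rstrip("/") != test_dir:
            print(f"{script}: HOMEevs set to {match.group(1)}")
```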
(2) Test Jobs
Use the following to submit ...
qsub -v vhr=03
qsub -v vhr=00
From your EVS testing directory, the following scripts should be run using the setup in step (1):
[Total: 1 prep job, 2 stats jobs, 4 plots jobs]
(3) Review
Pull Request Checklist
Review the source issue metadata (required labels, projects, and milestone).
Complete the PR description above.
Ensure the PR title matches the feature branch name.
Check the following:
Instructions provided on how to run
Developer's name is replaced by ${user} where necessary throughout the code
Check that the ecf file has all the proper definitions of variables
Check that the jobs file has all the proper settings of COMIN and COMOUT and other input variables
Check to see that the output directory structure is followed
Be sure that you are not using MET utilities outside the METplus wrapper structure
After submitting the PR, select the Development issue with the original issue number.
After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
Close the linked issue.