Test configuration for the "merge" of NESP2pt9 and CABLE-POP_TRENDY #526
Comments
First step, keep the inputs for TRENDY and change the science configs to mirror the BIOS config. |
Some more detail/thoughts around this: the three (four) test cases should be differentiated simply by the specification of a landmask:
So there are two elements to the problematic cases. 1) Can the partitioning step work on top of an externally specified landmask file? That is, the landmasks created and used by the serial jobs need to be the overlay of the external landmask and the act of splitting it up. 2) Can the recombination step provide output on an externally specified landmask, or does it default to the input meteorology?
Another aspect to think through: the TRENDY partitioning process used some kind of randomisation of grid cells to ensure a reasonable mix of fast and slow (to compute) grid cells in each job. This allows for efficient kSU usage and throughput during the multi-stage process. Do we need to do the same randomisation of cells for BIOS, or can we use a geographically defined lat-lon box as the means to split things up? How would this impact the recombination script? |
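To make the two questions concrete, here is a minimal sketch of what "overlay" and "recombine onto an external landmask" could mean. It is not the TRENDY or BIOS code: the function names (`overlay_masks`, `is_exact_partition`, `recombine`), the nested-list grids and the two-band split are all hypothetical and only illustrate the logic.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[int]]   # 1 = land, 0 = not land (hypothetical convention)
Cell = Tuple[int, int]   # (row, col) index into the grid


def overlay_masks(external_mask: Grid, job_id: Grid, n_jobs: int) -> List[Grid]:
    """Per-job landmask = external landmask AND (cell is assigned to this job)."""
    return [
        [
            [1 if land == 1 and jid == k else 0 for land, jid in zip(mask_row, id_row)]
            for mask_row, id_row in zip(external_mask, job_id)
        ]
        for k in range(n_jobs)
    ]


def is_exact_partition(external_mask: Grid, masks: List[Grid]) -> bool:
    """True if every land cell is in exactly one job mask and no other cell is in any."""
    for i, row in enumerate(external_mask):
        for j, land in enumerate(row):
            count = sum(m[i][j] for m in masks)
            if count != (1 if land == 1 else 0):
                return False
    return True


def recombine(job_outputs: List[Dict[Cell, float]],
              external_mask: Grid) -> List[List[Optional[float]]]:
    """Place per-job values on the external landmask grid; other cells stay None."""
    grid: List[List[Optional[float]]] = [[None] * len(row) for row in external_mask]
    for output in job_outputs:
        for (i, j), value in output.items():
            if external_mask[i][j] != 1:
                raise ValueError(f"cell {(i, j)} is not on the external landmask")
            grid[i][j] = value
    return grid


# Hypothetical 2 x 3 example: external mask split into two latitude bands.
external = [[1, 1, 0],
            [0, 1, 1]]
bands = [[0, 0, 0],
         [1, 1, 1]]
masks = overlay_masks(external, bands, n_jobs=2)
combined = recombine([{(0, 0): 1.5, (0, 1): 2.5}, {(1, 1): 3.5, (1, 2): 4.5}], external)
```

Whatever splitting rule is used (randomised, cyclic or lat-lon box), a check like `is_exact_partition` is what the recombination step implicitly relies on: each land point of the external mask is computed by exactly one job.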
I have got the ACT9 test case running with the TRENDY pseudo-parallel configuration. Some minor adjustments of the original BIOS landmask were required:
The TRENDY partitioning process uses a "tiling" approach: it effectively walks through the points in the order they appear in memory and assigns them cyclically to each process. Say we wanted to run the ACT9 test case with 4 parallel jobs; it would assign:
I've also managed to at least begin the Australia-wide 0.05 degree case without them all crashing on start-up, but I figured I would do some more testing with the cheaper cases before burning compute resources on this. |
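As a rough illustration of the cyclic ("tiling") assignment described in this comment, the sketch below deals land points out to jobs in round-robin order. The function name `assign_cyclic`, the 0-based point indices and the point count of 10 are purely illustrative assumptions, not taken from the TRENDY scripts or the ACT9 case.

```python
from typing import List


def assign_cyclic(n_points: int, n_jobs: int) -> List[List[int]]:
    """Assign land-point indices 0..n_points-1 to jobs in round-robin order."""
    jobs: List[List[int]] = [[] for _ in range(n_jobs)]
    for point in range(n_points):
        # Point p goes to job p mod n_jobs, following memory order.
        jobs[point % n_jobs].append(point)
    return jobs


# Hypothetical example: 10 land points shared between 4 jobs.
assignment = assign_cyclic(10, 4)
for job, points in enumerate(assignment):
    print(f"job {job}: points {points}")
```

Because neighbouring points in memory end up in different jobs, each job gets a spread of locations rather than one contiguous block, which is one way of mixing fast and slow cells without explicit randomisation.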
Good news - I think the same process should work for the 0.25 and 1000pts cases as well. What's perhaps less clear is whether the recombination script will work without modification.
An aside - we haven't run a 0.05 degree case in living memory. We certainly don't have a test case that we could compare against. Indeed, I'm not sure that the MPI code could actually run this, given the need to bring everything back onto one processor (for output). Last I heard on this, we were running a reduced science configuration and still running out of memory and walltime (on raijin). |
We don't currently have netCDF meteorology at 0.25 degree resolution, only at 0.05 degrees, which I think is the only barrier there. Is the intention to eventually run that 0.05 degree case? I'm surprised that it's so expensive. Getting to real performance and memory improvements for CABLE is a fair way down the track, I think. |
We shouldn't need the 0.25 degree meteorology in netCDF. We define a 0.25 degree landmask and partition it across the TRENDY poor man's processors, which should be able to pick up the met information from the 0.05 degree netCDF files. The question that's left is the recombination: I suspect that TRENDY would try to recombine back to 0.05 degrees, and we really only want the recombined version at 0.25 degrees. |
Unfortunately not: the TRENDY configuration doesn't do any interpolation. It practically requires the landmask to be at the same resolution as the meteorology, since it uses the IDs of the land points, rather than their actual (lon, lat) coordinates, to extract the meteorology. |
True, but the 0.25 degree run is effectively a subsample of the 0.05 degree run. So if we create a landmask (at 0.05 degrees) that selects only the 0.25 degree land points, we should get the same answer. We don't have that landmask at hand, but it should be relatively easy to create. The key tasks would be
|
Ah yep, you could select points at 0.25 degree resolution from the 0.05 landmask. |
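A minimal sketch of that subsampling idea, assuming the 0.25 degree points coincide with every fifth cell of the 0.05 degree grid (0.25 / 0.05 = 5). The function name `subsample_landmask` and the `offset` of 2 (the centre cell of each 5 x 5 block) are assumptions; the real alignment between the two grids would need checking against the actual files.

```python
from typing import List

Grid = List[List[int]]  # 1 = land, 0 = not land (hypothetical convention)


def subsample_landmask(mask: Grid, factor: int = 5, offset: int = 2) -> Grid:
    """Keep only every `factor`-th cell (starting at `offset`) of a fine landmask.

    The result stays on the fine (0.05 degree) grid, so land-point IDs still
    match the fine-resolution meteorology; all other cells are set to 0.
    """
    out = [[0] * len(row) for row in mask]
    for i in range(offset, len(mask), factor):
        for j in range(offset, len(mask[i]), factor):
            out[i][j] = mask[i][j]
    return out


# Hypothetical 10 x 10 all-land fine mask: four cells survive the subsampling.
fine = [[1] * 10 for _ in range(10)]
coarse_on_fine = subsample_landmask(fine)
selected = [(i, j) for i, row in enumerate(coarse_on_fine)
            for j, v in enumerate(row) if v == 1]
```

Keeping the mask on the 0.05 degree grid is the point of the exercise: the land-point IDs then line up with the existing netCDF meteorology, so no interpolation is needed.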
Test simulations done with the CABLE-POP_TRENDY branch using a file mask to select the points:
Note:
To do: Need to convert full meteorology to netCDF.
Evaluation of the simulations:
Spin up:
Next steps:
|
At the end of the "BIOS merge", we will need to test the resulting code with a BIOS configuration. We will need to set up that configuration based on the pseudo-parallel TRENDY config.
BIOS config:
Success criteria: bitwise-comparable output would be ideal, but that is very likely not achievable.
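Since bitwise agreement is unlikely, it may help to report both a bitwise check and a tolerance-based measure. The sketch below is only an illustration on plain lists of floats; the function names and the example values are hypothetical, and an acceptable tolerance would still need to be agreed on.

```python
import struct
from typing import Sequence


def bitwise_equal(a: Sequence[float], b: Sequence[float]) -> bool:
    """True only if both sequences have identical 64-bit float representations."""
    return len(a) == len(b) and all(
        struct.pack("<d", x) == struct.pack("<d", y) for x, y in zip(a, b)
    )


def max_relative_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest |x - y| scaled by the larger magnitude of the pair (0.0 if both are zero)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    worst = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0.0:
            worst = max(worst, abs(x - y) / scale)
    return worst


# Hypothetical values standing in for one output variable from two runs.
reference = [1.0, 2.0, 0.3]
candidate = [1.0, 2.0, 0.1 + 0.2]  # differs from 0.3 in the last bits
same_bits = bitwise_equal(reference, candidate)
rel_diff = max_relative_difference(reference, candidate)
```

A run that fails the bitwise check but has a relative difference near machine precision points to a change in operation order rather than a change in the science.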