[Parameter Testing] KPP vs ePBL vertical mixing #243

Open
minghangli-uni opened this issue Nov 21, 2024 · 19 comments

@minghangli-uni
Contributor

minghangli-uni commented Nov 21, 2024

Parameter Tests Description

The test is set up as follows:

  • uses 0.25deg ryf configuration,
  • adopts the latest bathymetry by @ezhilsabareesh8, where input files can be found in the YAML input file.
  • baroclinic timestep = coupling timestep = cice thermodynamic timestep = 1080s, ocean thermodynamic timestep = 10800s,
  • The diagnostic table for both the control and experiment runs is available here. The agreed scheme can be accessed here.

Expts_manager Version

dd38b80

YAML input file

# =====================================================================================
# YAML Configuration for Expts_manager.py
# =====================================================================================
# This configuration file defines the parameters and settings required for cloning,
# setting up, and running control and perturbation experiments using `Expts_manager.py`.
# Detailed explanations are provided to ensure the configuration is straightforward.
# =====================================================================================


# ============ Model Selection ========================================================

model: access-om3  # Specify the model to be used. Options: 'access-om2', 'access-om3'

# ============ Utility Tool Configuration (only for access-om3) =======================
# The following configuration provides the necessary tools to:
# 1. Parse parameters and comments from `MOM_input` in MOM6.
# 2. Read and write the `nuopc.runconfig` file.

force_overwrite_tools: False
utils_url: git@github.com:minghangli-uni/om3-utils.git  # Git URL for the utility tool repository
utils_branch_name: main  # The branch for which the utility tool will be checked out
utils_dir_name: om3-utils  # Directory name for the utility tool (user-defined)

# ============ Diagnostic Table (optional) ============================================
# Configuration for customising the diagnostic table.

diag_url: git@github.com:minghangli-uni/make_diag_table.git  # Git URL for the `make_diag_table`
diag_branch_name: general_scheme3  # Branch for the `make_diag_table`
diag_dir_name: make_diag_table  # Directory name for the `make_diag_table` (user-defined)
diag_ctrl: True  # Set to 'True' to modify the diagnostic table for the control experiment
diag_pert: True  # Set to 'True' to modify the diagnostic table for perturbation experiments

# ============ Control Experiment Setup ===============================================

base_url: git@github.com:ACCESS-NRI/access-om3-configs.git  # Git URL for the control experiment repository
base_commit: "f37396e"  # Commit hash to use; Please ensure it is a string!
base_dir_name: Ctrl-1deg_jra55do_ryf  # Directory name for cloning (user-defined)
base_branch_name: ctrl  # Branch name for the experiment (user-defined)
test_path: product1_0.25deg_new_topo # Relative path for all test (control and perturbation) runs (user-defined)

# ============ Control Experiment Variables ===========================================
# Allows modification of various control experiment settings.
# 1. config.yaml  (access-om2 or access-om3)
# 2. all namelists, e.g. files ending with "_in" or ".nml"  (access-om2 or access-om3)
# 3. cpl_dt (coupling timestep)  (access-om3)
# 4. nuopc.runconfig  (access-om3)
# 5. MOM_input  (access-om3)
# Below are some examples for illustration purposes; please modify them for your own settings.

config.yaml:
    ncpus: 1440
    mem: 5760GB
    walltime: 10:00:00

    metadata:
        enable: True
    runlog: True
    restart_freq: 1

    input:
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/access-om2-025deg-ESMFmesh.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/access-om2-025deg-nomask-ESMFmesh.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/topog.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/ocean_hgrid.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/ocean_vgrid.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/ocean_temp_salt.res.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/salt_sfc_restore.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/grid.nc
    - /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/kmt.nc
    - /g/data/vk83/configurations/inputs/access-om3/share/meshes/share/2024.09.16/JRA55do-datm-ESMFmesh.nc
    - /g/data/vk83/configurations/inputs/access-om3/share/meshes/share/2024.09.16/JRA55do-drof-ESMFmesh.nc
    - /g/data/vk83/experiments/inputs/JRA-55/RYF/v1-4/data

nuopc.runseq: 1080.0  # Coupling timestep in the `nuopc_runseq`
nuopc.runconfig:
    CLOCK_attributes:
        stop_option: nyears
        stop_n: 1
        restart_option: nyears
        restart_n: 1

MOM_input:
    NJGLOBAL: 1142
    NK: 75
    DT_THERM: 10800.0
    DT: 1080.0
    DIABATIC_FIRST: False
    THERMO_SPANS_COUPLING: True
    DTBT_RESET_PERIOD: 10800.0
    MAXTRUNC: 500

input.nml:
    diag_manager_nml:
        max_axes: 400
        max_files: 200
        max_num_axis_sets: 200

# ============ Namelist Tuning ================================
# Tune parameters across different model components.

namelists:
    cross_block1:
    # epbl_1: only removes KPP and CVMix related parameters
        cross_block1_dirs: [epbl_1]

        MOM_input:
            MOM_list1_combo:
                FATAL_UNUSED_PARAMS: [False] # TODO: remove parameters not used
                USE_KPP: [False]
                USE_CVMix_CONVECTION: [False]
                # shear-driven turbulence associated with epbl
                USE_JACKSON_PARAM: [True]
                VERTEX_SHEAR: [True]
                MAX_RINO_IT: [25]
                USE_RESTRICTIVE_TOLERANCE_CHECK: [True]
                USE_LMD94: [False]
                USE_CVMIX_DDIFF: [False]
                # epbl
                ENERGETICS_SFC_PBL: [True]
                EPBL_IS_ADDITIVE: [False]
                # module MOM_energetic_PBL
                ML_OMEGA_FRAC: [0.001]
                TKE_DECAY: [0.01]
                EPBL_MSTAR_SCHEME: ["OM4"]
                MSTAR_CAP: [1.25]
                MSTAR2_COEF1: [0.29]
                MSTAR2_COEF2: [0.152]
                NSTAR: [0.06]
                MSTAR_CONV_ADJ: [0.667]
                EPBL_TRANSITION_SCALE: [0.01]
                MIX_LEN_EXPONENT: [1.0]
                # USE_LA_LI2016: [True] # Langmuir related
                #!EPBL_USTAR_MIN: [1.45842E-18] # TODO: enable variables such as this, with "!" before the variable name.
                
# ============ Perturbation Experiment Setup (Optional - access-om3) =======================
# Configure settings for perturbation experiments. Currently, only `nuopc.runconfig` is supported.
# If conducting parameter tuning for `nuopc.runconfig`, any pre-existing settings in this section 
# will be purged by the above namelist tuning.



# ============ Control experiment and perturbation Runs ===================================
# This section configures the settings for running control experiments and their corresponding perturbation tests.
    
ctrl_nruns: 5
# Number of control experiment runs.            
# Default: 0.            
# Adjust this value to set the number of iterations for the control experiment, which serves as the baseline for perturbations.

run_namelists: True  
# Determines whether to run using the specified namelists.            
# Default: False.            
# Set to 'True' to apply configurations from the namelist section; otherwise, 'False' to skip this step.

check_duplicate_jobs: True  
# Checks if there are duplicate PBS jobs within the same parent directory (`test_path`) based on their names.            
# This check is useful when you have existing running jobs and want to add additional tests, which helps avoid conflicts by ensuring new jobs don't duplicate existing ones in the same `test_path`.            
# The check will not be triggered if the jobs are located in different `test_path`. It only applies to jobs within the same `test_path` directory.            
# Default: True.            
# If duplicates are found, a message will be printed indicating the presence of duplicate runs and those runs will be skipped.

check_skipping: False  
# Checks if certain runs should be skipped based on pre-existing conditions or results. Currently only valid for nml type.            
# Default: False.            
# Set to 'True' if you want the system to skip runs under specific criteria; otherwise, keep it 'False'. Currently only valid for nml type.

force_restart: False  
# Controls whether to force a restart of the experiments regardless of existing initial conditions.            
# Default: False.            
# Set to 'True' to enforce a restart of the control and perturbation runs.

startfrom: 'rest'  
# Defines the starting point for perturbation tests.            
# Options: a specific restart number of the control experiment, or 'rest' to start from the initial state.            
# This parameter determines the initial condition for each perturbation test.

nruns: 5
# Total number of output directories to generate for each Expts_manager member.            
# Default: 0.            
# Specifies how many runs of each perturbation experiment will be started; this number correlates with the different parameter combinations defined in the configuration.

@minghangli-uni
Contributor Author

minghangli-uni commented Nov 22, 2024

Velocity truncations, likely caused by initial conditions, happen only on the second day of a 5 model year run. The affected locations can be visualised in this notebook.


Update on 9 Dec 2024:
The initial conclusion regarding truncations was incorrect due to an issue described in #246. For the updated truncation issues, please refer to the following #243 (comment).

@ezhilsabareesh8
Contributor

Minghang pointed out that the truncation errors are occurring only with the ePBL mixing for the 0.25° configuration using the new grids and topography. These errors are happening consecutively for 5 years.

[Images: Screenshot 2024-12-02 at 1 44 03 pm, Screenshot 2024-12-02 at 1 42 03 pm]

Anton mentioned that the ice shelf may have retreated compared to the previous GEBCO data, which might explain the presence of a new channel. @aekiss Would a topography edit be required in this case, given that the issue arises only with the ePBL mixing?

@aekiss
Contributor

aekiss commented Dec 5, 2024

I'm not sure what you mean by a new channel.

Here's the OM2 topog /g/data/ik11/inputs/access-om2/input_20230515_025deg_topog/mom_025deg/topog.nc (top)
vs /g/data/tm70/ml0072/COMMON/git_repos/COSIMA_om3-scripts/expts_manager/New_grid_input_files_025deg_75zlevels/topog.nc (bottom), as viewed in ncview.

The coast looks quite similar, apart from stretching caused by the finer grid spacing dy in the new topog, where the Mercator region extends further south.

Maybe we're hitting CFL instability due to finer dy in the new grid?
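
The CFL concern can be illustrated with a rough calculation. This is a minimal sketch, assuming an isotropic Mercator cell (dy = dx, so dy shrinks with the cosine of latitude) at 0.25° resolution and a hypothetical flow speed of 1 m/s. The function names, the latitudes, and the speed are illustrative assumptions, not values taken from the model grid or output.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius in metres


def mercator_dy(lat_deg, dlon_deg=0.25):
    """Meridional spacing (m) of an isotropic Mercator cell, where dy = dx."""
    return EARTH_RADIUS_M * math.radians(dlon_deg) * math.cos(math.radians(lat_deg))


def courant(speed_m_s, dt_s, dy_m):
    """Advective Courant number for a given speed, timestep and grid spacing."""
    return speed_m_s * dt_s / dy_m


HYPOTHETICAL_SPEED = 1.0  # m/s, an assumed value for illustration only

for lat in (-65.0, -75.0, -80.0):
    dy = mercator_dy(lat)
    for dt in (1080.0, 900.0):
        c = courant(HYPOTHETICAL_SPEED, dt, dy)
        print(f"lat={lat:6.1f}  dy={dy / 1e3:6.2f} km  dt={dt:6.0f} s  Courant={c:.3f}")
```

Under these assumptions the Courant number for a fixed speed grows as the grid extends poleward, which is the mechanism suggested above. Whether it is large enough to trigger truncations depends on the actual velocities and grid.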

[Image: Screenshot 2024-12-05 at 4 14 05 pm]

@minghangli-uni
Contributor Author

minghangli-uni commented Dec 9, 2024

The truncations occur when dt=1080s for the ePBL mixing.
[Image: epbl-dt_1080_consecutive_5years_truncations]

The baroclinic timestep has been reduced from 1080s to 900s to check if any further truncation occurs.

It is worth noting that there are no issues with KPP mixing.

@minghangli-uni
Contributor Author

Truncations are still present when dt and dt_cpl are reduced to 900s while keeping dt_therm=10800s.
[Image: epbl-dt_900_consecutive_5years_truncations]

A comparison between the OM3 and OM2 topos is provided below:
[Image: topo_om3_om2]

As noted by @aekiss, the two topos are quite similar to each other.

@minghangli-uni
Contributor Author

Using a 900s timestep results in a ~27% reduction in runtime performance compared with the 1080s run. Given this large cost, it seems unlikely that we will reduce the timestep further.
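
For context, the change in the number of baroclinic steps can be worked out directly. This is a minimal sketch that only counts steps per model day; it assumes cost scales with step count, which ignores I/O, coupling and other overheads, so it need not match the measured ~27% figure. The helper name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def steps_per_day(dt_s):
    """Number of baroclinic steps in one model day; dt must divide the day evenly."""
    steps, remainder = divmod(SECONDS_PER_DAY, dt_s)
    if remainder:
        raise ValueError(f"dt={dt_s}s does not divide a day evenly")
    return steps


base_steps = steps_per_day(1080)   # original timestep
short_steps = steps_per_day(900)   # reduced timestep
extra_fraction = short_steps / base_steps - 1.0

print(f"steps/day at 1080s: {base_steps}")
print(f"steps/day at  900s: {short_steps}")
print(f"additional steps:   {extra_fraction:.0%}")
```

The step count alone rises by a fifth when going from 1080s to 900s, so any slowdown beyond that would have to come from something other than the raw number of baroclinic steps.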

The current ePBL parameters are configured according to the OM5 setup, excluding the Langmuir parameterisation.

@minghangli-uni
Contributor Author

To test whether the truncations were caused by the frequency of ice-ocean coupling, dt was kept at 900s while dt_cpl was set to 1080s; no truncation was observed.
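
One thing worth checking for this combination is that 900s does not divide 1080s evenly. The sketch below assumes the ocean model splits each coupling interval into a whole number of equal dynamic steps, rounding the step count up. That is an assumption about MOM6 behaviour (including the small tolerance used here) and should be verified against the model logs; the function name is hypothetical.

```python
import math


def effective_dt(dt_s, dt_cpl_s):
    """Return (steps per coupling interval, effective dynamic timestep).

    ASSUMPTION: the coupling interval is divided into an integer number of
    equal steps, none longer than the requested dt. Not confirmed by the source.
    """
    n_steps = max(1, math.ceil(dt_cpl_s / dt_s - 0.001))
    return n_steps, dt_cpl_s / n_steps


# The three dt / dt_cpl combinations tested in this thread.
for dt, dt_cpl in ((1080.0, 1080.0), (900.0, 900.0), (900.0, 1080.0)):
    n, dt_eff = effective_dt(dt, dt_cpl)
    print(f"dt={dt:6.0f}s dt_cpl={dt_cpl:6.0f}s -> {n} step(s) of {dt_eff:.0f}s")
```

If that assumption holds, the dt=900s, dt_cpl=1080s run would effectively have used a shorter dynamic step than requested, which would make it hard to attribute the absence of truncations to the coupling frequency alone.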

minghangli-uni added the mom6 (Related to MOM6) label Dec 11, 2024
@access-hive-bot

This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/cosima-twg-meeting-minutes-2024/1734/22

@minghangli-uni
Contributor Author

minghangli-uni commented Dec 17, 2024

This comment is related to salinity restoring (related #167).

Option1: Black solid lines - KPP (Control)
Option2: Red dashed-dotted lines - ePBL (following OM5 without Langmuir parameterisation)
Option3: Green dotted lines - ePBL with rivermix parameter
Option4: Cyan solid lines - ePBL+rivermix+max_delta_srestore_0.5 (Failed with extreme values)

  • WARNING from PE 1170: Extreme surface sfc_state detected: i= 175 j=1019 lon=-242.396 lat= 73.609 x=-236.375 y= 77.135 D= 5.0620E+00 SSH= 9.3846E+00 SST=-3.0029E+00 SSS= 5.5455E+01 U-= 0.0000E+00 U+= 2.5480E-03 V-= 0.0000E+00 V+=-1.5151E-03

Option5: Gray dashed lines - OM2-0.25deg ryf

Comments:

  1. Option4 failed due to extreme values in the second year (1901-02-27), similar to "Excessively low T and high S in 0.25 degree" #139. Despite the failure, the first-year fluxes across all marginal seas for Option4 appear to perform better than those for all other options. I think we need to figure out why it failed. (The topography is not finalised yet.) The parameter max_delta_srestore should then be adjusted to a sensible value (e.g., 0.5) instead of the default 999; see "Unset MAX_DELTA_SRESTORE" #167.
  2. Option2 and Option3 overlap for both salinity flux and salinity profiles, implying the rivermix has no impact on these results.
  3. Option2/3 align more closely with OM2 (Option5) than Option1 does for the Baltic Sea, but overall their behaviour is similar to the other options.

[Image: marginal_seas_salt_fluxes2]
[Image: marginal_seas_salt]

Plots are available at https://github.com/ACCESS-NRI/access-eval-recipes/blob/main/ocean/Salinity_restoring.ipynb

@ezhilsabareesh8
Contributor

ezhilsabareesh8 commented Dec 17, 2024

Option4: Cyan solid lines - ePBL+rivermix+max_delta_srestore_0.5 (Failed with extreme values)

  • WARNING from PE 1170: Extreme surface sfc_state detected: i= 175 j=1019 lon=-242.396 lat= 73.609 x=-236.375 y= 77.135 D= 5.0620E+00 SSH= 9.3846E+00 SST=-3.0029E+00 SSS= 5.5455E+01 U-= 0.0000E+00 U+= 2.5480E-03 V-= 0.0000E+00 V+=-1.5151E-03

The salinity at the crash location is high. max_delta_srestore needs to be set back from 0.5 to the default value, as in the other cases, to prevent the high-salinity issues discussed here
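
When comparing crash reports like the one quoted above, it can help to pull the values out of the warning line programmatically. This is a minimal sketch using a regular expression written for the `key= value` layout of this particular message; the pattern and the function name are assumptions and may need adjusting for other warning formats.

```python
import re

# The warning line quoted in this thread.
WARNING = (
    "WARNING from PE 1170: Extreme surface sfc_state detected: "
    "i= 175 j=1019 lon=-242.396 lat= 73.609 x=-236.375 y= 77.135 "
    "D= 5.0620E+00 SSH= 9.3846E+00 SST=-3.0029E+00 SSS= 5.5455E+01 "
    "U-= 0.0000E+00 U+= 2.5480E-03 V-= 0.0000E+00 V+=-1.5151E-03"
)

# A key is letters with an optional trailing +/- (e.g. "U-"), followed by
# "=", optional spaces, and a number in plain or exponent notation.
FIELD = re.compile(r"([A-Za-z]+[+-]?)=\s*([-+]?\d+(?:\.\d+)?(?:[Ee][-+]?\d+)?)")


def parse_sfc_state(line):
    """Return the key/value pairs of an 'Extreme surface sfc_state' line as floats."""
    return {key: float(value) for key, value in FIELD.findall(line)}


fields = parse_sfc_state(WARNING)
for name in ("lon", "lat", "D", "SSH", "SST", "SSS"):
    print(f"{name:>4} = {fields[name]}")
```

The parsed values make the problem easy to see at a glance: a very shallow column with a large sea surface height, a sub-freezing temperature, and a surface salinity above 55.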

@minghangli-uni
Contributor Author

Following this comment, #167 (comment)

Yes, I agree. It would be good to set MAX_DELTA_SRESTORE to a default value until we switch to a new topography.

The updated topography is noted, but are we still using the default value for MAX_DELTA_SRESTORE? If so, salinity restoring will likely remain an issue.

@ezhilsabareesh8
Contributor

Yes, we haven't changed max_delta_srestore and are still using the default value. The new topography is not finalised yet. I think we have to monitor the salinity restoring over a long run and adjust the topography based on the test runs before we set max_delta_srestore to a lower value, similar to OM2. Also, all the cases tested here should use a consistent max_delta_srestore value so they can be compared with one another.
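
To make the effect of this parameter concrete, the sketch below caps the salinity difference that feeds the restoring term. It assumes max_delta_srestore acts as a symmetric limit on the difference between the restoring target and the model surface salinity; the function name and the target salinity of 35.0 are hypothetical, and the actual flux calculation in the model involves further factors not shown here.

```python
def capped_salinity_difference(s_restore, sss, max_delta_srestore=999.0):
    """Difference (restoring target minus model SSS), limited to +/- max_delta_srestore.

    ASSUMPTION: the cap is applied symmetrically to the difference itself.
    """
    delta = s_restore - sss
    return max(-max_delta_srestore, min(delta, max_delta_srestore))


HYPOTHETICAL_TARGET = 35.0  # illustrative restoring salinity, not from the source
MODEL_SSS = 55.455          # SSS reported in the warning quoted in this thread

print(capped_salinity_difference(HYPOTHETICAL_TARGET, MODEL_SSS))        # default cap
print(capped_salinity_difference(HYPOTHETICAL_TARGET, MODEL_SSS, 0.5))   # cap of 0.5
```

With the default of 999 the cap is effectively inactive, so a large salinity error drives a correspondingly large restoring term. With 0.5 the term saturates, which is why runs with different caps are hard to compare directly.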

@anton-seaice
Contributor

It looks like we forgot to set the RIVERMIX_DEPTH in the above test:

DO_RIVERMIX = True              !   [Boolean] default = False
                                ! If true, apply additional mixing wherever there is runoff, so that it is mixed
                                ! down to RIVERMIX_DEPTH if the ocean is that deep.
RIVERMIX_DEPTH = 0.0            !   [m] default = 0.0
                                ! The depth to which rivers are mixed if DO_RIVERMIX is defined.

@aekiss noted OM2 uses river_insertion_thickness=40m, OFAM3 used 15m. I'm not sure how either value was chosen.

And we think trying 40m for a first test could be appropriate.
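
A small pre-run check could catch this kind of omission. This is a minimal sketch that inspects a dictionary of MOM_input overrides, using the defaults shown in the parameter listing above (DO_RIVERMIX = False, RIVERMIX_DEPTH = 0.0). The function name and the dictionary layout are hypothetical and are not part of Expts_manager.

```python
def check_rivermix(mom_input):
    """Return a list of warnings about inconsistent river-mixing settings."""
    problems = []
    do_rivermix = mom_input.get("DO_RIVERMIX", False)      # default per MOM_input doc
    rivermix_depth = mom_input.get("RIVERMIX_DEPTH", 0.0)  # default per MOM_input doc
    if do_rivermix and rivermix_depth <= 0.0:
        problems.append(
            "DO_RIVERMIX is True but RIVERMIX_DEPTH is not positive, "
            "so no additional river mixing depth is specified."
        )
    return problems


# The earlier test: rivermix enabled, depth left at its default.
print(check_rivermix({"DO_RIVERMIX": True}))
# The proposed first test with a 40 m mixing depth.
print(check_rivermix({"DO_RIVERMIX": True, "RIVERMIX_DEPTH": 40.0}))
```

The first call reports the mismatch and the second returns an empty list, matching the suggestion to try 40m.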

@aekiss
Contributor

aekiss commented Jan 28, 2025

FYI Reichl et al 2024 provide some improvements to ePBL, which will be adopted in GFDL's OM5. They find this corrects "a significant bias in the diurnal cycle of mixing in OM4" but has "little impact on the time-mean biases".

@aekiss
Contributor

aekiss commented Feb 6, 2025

At today's COSIMA meeting there seemed to be a strong view that we should move to ePBL (or at least the latest improvements to KPP) if we can, e.g.:

@rmholmes, @PaulSpence, @dhruvbhagtani, @adele-morrison please weigh in with your thoughts/suggestions.

@dhruvbhagtani
Member

Having gone through Reichl et al. 2024 now, I think the improvements are great, and in line with the already-published Sane et al. 2023, which uses neural networks to improve vertical diffusivity in ePBL. Some things to note about Sane et al. 2023 that might be relevant to this discussion:

  1. They find the largest biases near the equator (especially in summer), which Reichl et al. 2024 examine in more detail.
  2. Their neural network reformulates the shape function, so their method can also be applied to improve the KPP (in case we want to stick with KPP but want to improve mixing using data-driven methods).
  3. Since neural networks are difficult to interpret physically, they are also performing an equation discovery that might be more physically-inclined.
  4. This neural network is probably not the best at high resolution because "the longer implicit time step used in the numerics of ePBL (see Reichl & Hallberg, 2018) can lead to a smoothing effect which can complicate resolving small-scale structure".

From what I understand, it'll be nice to use the improvement by Reichl et al 2024, as Ryan also suggested, and when the equation discovery preprint comes out, we can check whether it's physically motivated, whether it applies only to realistic simulations or also to perturbation simulations, etc.

@navidcy

navidcy commented Feb 9, 2025

Hi, I would also vote for ePBL. Switching out a component of MOM6 that seems to have been tested, and adding a more archaic one, seems a bit backwards? I don't wanna impede the OM3 development tho -- feel free to ignore me! But KPP has been shown to not do the best job, and I always felt happier and more excited about the ePBL addition in OM3.

I wouldn't advocate towards the equation-discovery and neural net avenues that @dhruvbhagtani mentions though. Those seem more experimental.

I'm puzzled (and I admit I may not have read all the details here). But since GFDL has included it in their default OM4p25 configuration (and elsewhere?), are the problems we are facing related to something else? I know Brandon Reichl very well and can connect you with them. I could tag them here if you like -- just didn't wanna enforce this.

@rmholmes

Hi,

I have not followed any of the work above identifying the numerical stability issues and so I cannot comment on that directly. However, I would echo @navidcy and @dhruvbhagtani with the comments on sticking with ePBL if possible, given the age of KPP and where GFDL is going.

I would also add:

  • It's important to distinguish between the surface boundary layer and the interior. ePBL is a boundary layer scheme (and most people mean "the KPP boundary layer scheme" when they mention KPP). In the interior there are other schemes. At the equator, the treatment of interior shear-driven mixing is critical, and this is mainly what is addressed by Reichl et al. 2024. Their modification of m* in ePBL effectively just lets ePBL be replaced by the shear-driven scheme (JHL) in the right circumstances. The other suggestions they have (lower vertical viscosity and a modification to JHL) are also focused largely on interior mixing. I feel that you want to take advantage of these changes if you can, especially given that the interior-shear scheme in the original KPP has some problems (e.g. it doesn't enforce the observed marginal stability). I would guess that these modifications are already in the latest configuration from GFDL?
  • Vertical resolution is another important consideration, also looked at by Reichl et al. 2024.
  • It would be worth thinking forward to when you want to ultimately couple this with the atmosphere (e.g. as part of ACCESS-CM3). The choice of mixing scheme will impact the air-sea coupling, and could have a big impact on things like ENSO etc.
  • I also agree with Navid that the NN stuff is probably a bit experimental in the short-term. But it is exciting, and I do think it's important to start thinking about these kinds of approaches sooner rather than later.

@aekiss
Contributor

aekiss commented Feb 18, 2025

Last I heard, CESM3 will use KPP for CMIP7 #83 (comment)


8 participants