-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support monthly average and variable name convention for EAMxx #712
Conversation
Commit #0c74dd9 to enable 12 months means as default in core sets. I considered to add logic in season selector to support monthly means other than the original 5 (ANN, and 4 seasons), but it appears to adding these 12 month to each cfg files is less intrusive. With this PR the default option for |
@kaizhangpnl Just for your information, this PR to support EAMxx regridded data in E3SM Diags is ready for review. I think you or someone from you team can be great early testers. I also included the result viewer in this thread for review. We can work together to refine this feature. On the other hand, @golaz has another PR in zppy to support processing native EAMxx output. We are planning to merge shortly. I will tag you again when the PR is ready to merge after some testing... |
@tomvothecoder Tagging you for review. Changes from this PR might need to be carried to cdat migration branch at a later time.. |
Great. I'd be happy to test it. Thanks for the useful information! |
"SOLIN": OrderedDict( | ||
[ | ||
(("rsdt",), rename), | ||
(("SW_flux_dn_at_model_top",), rename), # EAMxx | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"SOLIN": OrderedDict( | |
[ | |
(("rsdt",), rename), | |
(("SW_flux_dn_at_model_top",), rename), # EAMxx | |
] | |
"SOLIN": { | |
("rsdt",): rename, | |
("SW_flux_dn_at_model_top",): rename, # EAMxx | |
} |
I don't think you need need to use an OrderedDict
in the nested derived variables dictionary. I just use a regular dictionary in my refactored version (derivations.py
). The old Dataset class will probably be able to parse this the same way as the new Dataset class in the CDAT migration branch because order of the keys shouldn't matter (as far as I'm aware).
e3sm_diags/e3sm_diags/derivations/derivations.py
Lines 592 to 604 in 6b296b1
# ISCCP | |
"CLDTOT_TAU1.3_ISCCP": { | |
("FISCCP1_COSP",): cosp_bin_sum, | |
("CLISCCP",): cosp_bin_sum, | |
}, | |
"CLDTOT_TAU1.3_9.4_ISCCP": { | |
("FISCCP1_COSP",): cosp_bin_sum, | |
("CLISCCP",): cosp_bin_sum, | |
}, | |
"CLDTOT_TAU9.4_ISCCP": { | |
("FISCCP1_COSP",): cosp_bin_sum, | |
("CLISCCP",): cosp_bin_sum, | |
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recognize you're updating existing elements that are already OrderedDict. Up to you if you want to use a regular dict for new or existing elements that you've modified. Just a suggestion since it is easier to read and cleaner, but I'm aware this file is going to be replaced anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tomvothecoder Thank you for reviewing. It's been a while since the ordered dictionary was designed. I think we made it ordered dictionary to support edge cases when some datasets include multiple variables in the key, and we wanted to loop over keys in order. I'm not sure if this is still needed given that lots of datasets are being updated over years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe e3sm_diags development started around Python 3.5/3.6? I learned that the regular dict will maintain order with Python >=3.6, so we probably don't need to use OrderedDict
anymore.
Starting with CPython 3.6, dictionaries return items in the order you inserted them.
Source - https://stackoverflow.com/a/47849121
Since Python 3.7, all dictionaries are guaranteed to be ordered. The Python contributors determined that switching to making dict ordered would not have a negative performance impact. I don't know how the performance of OrderedDict compares to dict in Python >= 3.7, but I imagine they would be comparable since they are both ordered.
Note that there are still differences between the behaviour of OrderedDict and dict. See also: Will OrderedDict become redundant in Python 3.7?
Source - https://stackoverflow.com/a/53535866
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i think the development started around 3.5 and 3.6. In this case we should go with regular dictionary. Good call for the cdat migration!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem!
"FLUT": OrderedDict( | ||
[ | ||
(("rlut",), rename), | ||
(("LW_flux_up_at_model_top",), rename), # EAMxx | ||
] | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about not needing to use an OrderedDict
.
"FLUTC": OrderedDict( | ||
[ | ||
(("rlutcs",), rename), | ||
(("LW_clrsky_flux_up_at_model_top",), rename), # EAMxx | ||
] | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about not needing to use an OrderedDict
.
"SOLIN": OrderedDict( | ||
[ | ||
(("rsdt",), rename), | ||
(("SW_flux_dn_at_model_top",), rename), # EAMxx | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recognize you're updating existing elements that are already OrderedDict. Up to you if you want to use a regular dict for new or existing elements that you've modified. Just a suggestion since it is easier to read and cleaner, but I'm aware this file is going to be replaced anyways.
"SHFLX": OrderedDict( | ||
[ | ||
(("hfss",), rename), | ||
(("surf_sens_flux",), rename), # EAMxx | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about not needing to use an OrderedDict
.
"TS": OrderedDict( | ||
[ | ||
(("ts",), rename), | ||
(("surf_radiative_T",), rename), # EAMxx | ||
] | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about not needing to use an OrderedDict
.
"TAUX": OrderedDict( | ||
[ | ||
(("tauu",), lambda tauu: -tauu), | ||
(("surf_mom_flux_U",), lambda tauu: -tauu), # EAMxx | ||
] | ||
), | ||
"TAUY": OrderedDict( | ||
[ | ||
(("tauv",), lambda tauv: -tauv), | ||
(("surf_mom_flux_V",), lambda tauv: -tauv), # EAMxx | ||
] | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about not needing to use an OrderedDict
.
@@ -705,6 +725,10 @@ def cosp_histogram_standardize(cld: "FileVariable"): | |||
lambda fsntoa, fsntoac: swcf(fsntoa, fsntoac), | |||
), | |||
(("rsut", "rsutcs"), lambda rsutcs, rsut: swcf(rsut, rsutcs)), | |||
( | |||
("SW_flux_up_at_model_top", "SW_clrsky_flux_up_at_model_top"), | |||
lambda rsutcs, rsut: swcf(rsut, rsutcs), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a beginner in using Python. Does the order of variable names and fields matter here (should "SW_flux_up_at_model_top" match rsut)?
Is it equivalent to the following?
("SW_flux_up_at_model_top", "SW_clrsky_flux_up_at_model_top"),
lambda rsut, rsutcs: swcf(rsut, rsutcs),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final "SWCF" plot looks correct, so I guess it's not an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kaizhangpnl thanks for looking into it! I was also supersized that the final plot looks reasonable, still investigating...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through the code for derived variable and documentation for lamda expression. It does seem like the order of the parameters after lambda
does not matter.
In the derived variable dictionary, it seems like we can further simplify without using lamda expression, we can just to use the regular functions for most if not all cases. Tagging @tomvothecoder for confirmation..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through the code for derived variable and documentation for lamda expression. It does seem like the order of the parameters after
lambda
does not matter.In the derived variable dictionary, it seems like we can further simplify without using lamda expression, we can just to use the regular functions for most if not all cases. Tagging @tomvothecoder for confirmation..
That's correct, the lambda isn't needed because the required args will be passed directly to the function when it is called. The lambda is doing the same thing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @tomvothecoder !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested it on Perlmutter and reproduced the plots.
Awesome! Thank you, Kai! |
* support monthly average in viewer * add 12 monthly mean to core sets as options * Initial modifications for EAMxx variable names * add monthly mean option for aerosol budget tables * update SCREAM variable convention * more update for supporting monthly data --------- Co-authored-by: Chris Golaz <golaz1@llnl.gov>
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Two use cases:
For longer than 1 year simulation:
Support both monthly average ['01', '02', ..., '12'] in viewer in addition to seasonal and annual mean.
For shorter than 1 year simulation:
Support visualization the averaged snapshot.
This PR is to support the first use case, it closes #348 and partially #652.
This PR also add support for variable names from EAMxx to support its output.
An example run with regridded EAMxx output is shown here.
The run script for this run is linked here.