diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index fd53b9a32..671db476e 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -12,13 +12,13 @@ on: jobs: call-version-info-workflow: - uses: ASFHyP3/actions/.github/workflows/reusable-version-info.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-version-info.yml@v0.8.2 with: python_version: '3.10' call-docker-ghcr-workflow: needs: call-version-info-workflow - uses: ASFHyP3/actions/.github/workflows/reusable-docker-ghcr.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-docker-ghcr.yml@v0.8.2 with: version_tag: ${{ needs.call-version-info-workflow.outputs.version_tag }} release_branch: main diff --git a/.github/workflows/changelog.yml b/.github/workflows/changelog.yml index 472cf2bf1..0a50ce126 100644 --- a/.github/workflows/changelog.yml +++ b/.github/workflows/changelog.yml @@ -13,6 +13,6 @@ on: jobs: call-changelog-check-workflow: - uses: ASFHyP3/actions/.github/workflows/reusable-changelog-check.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-changelog-check.yml@v0.8.2 secrets: USER_TOKEN: ${{ secrets.GITHUB_TOKEN }} diff --git a/.github/workflows/deploy-docs.yml b/.github/workflows/deploy-docs.yml index ad21d76cb..a47b0e5db 100644 --- a/.github/workflows/deploy-docs.yml +++ b/.github/workflows/deploy-docs.yml @@ -10,7 +10,7 @@ jobs: name: Build site and deploy runs-on: "ubuntu-latest" steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 with: fetch-depth: 0 diff --git a/.github/workflows/labeled-pr.yml b/.github/workflows/labeled-pr.yml index c00df6335..ee8a5f5ea 100644 --- a/.github/workflows/labeled-pr.yml +++ b/.github/workflows/labeled-pr.yml @@ -12,4 +12,4 @@ on: jobs: call-labeled-pr-check-workflow: - uses: ASFHyP3/actions/.github/workflows/reusable-labeled-pr-check.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-labeled-pr-check.yml@v0.8.2 diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 2053d9028..bb7e27ce1 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -7,7 +7,7 @@ on: jobs: call-release-workflow: - uses: ASFHyP3/actions/.github/workflows/reusable-release.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-release.yml@v0.8.2 with: release_prefix: RAiDER develop_branch: dev diff --git a/.github/workflows/tag.yml b/.github/workflows/tag.yml index a256fe4c2..ad0761eb0 100644 --- a/.github/workflows/tag.yml +++ b/.github/workflows/tag.yml @@ -7,7 +7,7 @@ on: jobs: call-bump-version-workflow: - uses: ASFHyP3/actions/.github/workflows/reusable-bump-version.yml@v0.8.1 + uses: ASFHyP3/actions/.github/workflows/reusable-bump-version.yml@v0.8.2 with: user: dbekaert email: bekaertdavid@gmail.com diff --git a/CHANGELOG.md b/CHANGELOG.md index a11a97484..e70ad1834 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,39 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [PEP 440](https://www.python.org/dev/peps/pep-0440/) and uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.4.5] + +## Fixes +* [#583](https://github.com/dbekaert/RAiDER/issues/583): it appears that since the issues with geo2rdr cropped up during our processing campaign, there has been a new release of ISCE3 that resolves these failures with `geo2rdr` and the time interpolation that uses this ISCE3 routine. 
+* [#584](https://github.com/dbekaert/RAiDER/issues/584): failed RAiDER step function in HyP3 job submission when HRRR model times are not available (even within the valid model range) - to resolve, we check the availability of files when the delay workflow is called with a) azimuth_grid_interpolation and b) a GUNW as the workflow input. If weather model files are unavailable and the GUNW is on S3, do nothing to the GUNW (i.e. do not add tropo delay) and exit successfully. If weather model files are unavailable and the GUNW is on local disk, raise `ValueError`
+* [#587](https://github.com/dbekaert/RAiDER/issues/587): similar to #584, except that here the control flow in RAiDER.py also passes over numerous exceptions in the workflow. This is fixed identically to the above.
+* [#596](https://github.com/dbekaert/RAiDER/issues/596): the "prefix" for AWS does not include the final netcdf file name, just the sub-directories in the bucket, and therefore extra logic must be added to determine the GUNW netcdf file name (and the associated reference/secondary dates). We proceed by downloading the data, which is needed regardless. Tests are updated.
+
+## Removed
+* Removes the `update` option (either `True` or `False`) from the calcGUNW workflow, which asked whether the GUNW should be updated or not. In the existing code it was not being used/applied, i.e. the previous workflow always updated the GUNW. Removed the related input arguments from the respective functions so that this can be revisited later.
+
+## Added
+* Allow the HyP3 GUNW workflow with HRRR (i.e. specifying a GUNW path in S3) to successfully exit if any of the HRRR model times required for `azimuth-time-grid` interpolation (the default interpolation method) are not available when using bucket inputs (i.e. only on the cloud)
+* For the GUNW workflow, when the model is HRRR, azimuth_time_grid interpolation is used, and the GUNW is on local disk, raise a ValueError (before processing) if the requisite weather model files are not available
+* Raise a ValueError if non-unique dates are given in the function `get_inverse_weights_for_dates` in `s1_azimuth_timing.py`
+* Added metadata provenance for each delay layer that is included in the GUNW and, more generally, in the cube workflow in `calcDelays`, including:
+  * `model_times_used` - the weather models used and interpolated for the delay calculation
+  * `interpolation_method` - whether `none`, `center_time`, or `azimuth_time_grid` methods were used - see the description in `calcDelayGUNW`
+  * `scene_center_time` - the center time at which the associated SAR image was acquired
+* Stages GMAO data for GUNW testing of the correct dataset update, i.e. in the test `test_GUNW_dataset_update`.
+* Stages HRRR data for the `test_HRRR_ztd` test.
+* Ensures ISCE3 is `>=0.15.0`
+* Uses correct HyP3 S3 prefix conventions and filename suffixes within test patches to improve the readability of what the tests are mocking (see comments in #597).
+
+## Changed
+* Get only the 2 or 3 model times required for azimuth-time interpolation (previously all 3 were obtained, as that was easier to implement) - this ensures slightly fewer failures associated with HRRR availability. Importantly, if an acquisition time coincides with a model time, then we order by distance to the reference time and by how early it occurs (so earlier times come first if two times are equidistant to the acquisition time).
+* Made test names in `test_GUNW.py` more descriptive
+* Numpy docstrings and general linting to modified functions, including removing variables that were not being accessed
+* hrrr_download to ensure that the `hybrid` coordinate is obtained regardless of how Herbie returns datacubes, ensuring test_HRRR_ztd passes consistently
+  * Remove the command line call in `test_HRRR_ztd.py` and call the Python mock-up of the CLI instead, for better error handling and data mocking.
+* Return xarray.Dataset types for RAiDER.calcGUNW.tropo_gunw_slc and RAiDER.raider.calcDelayGUNW for easier inspection and testing
+* Fixes tests for checking availability of HRRR due to Issue #596 (above).
+
 ## [0.4.4]
 
 ## Fixes
diff --git a/environment.yml b/environment.yml
index dcbaf453c..464a9ee91 100644
--- a/environment.yml
+++ b/environment.yml
@@ -25,7 +25,7 @@ dependencies:
   - herbie-data
   - h5netcdf
   - hyp3lib
-  - isce3>=0.9.0
+  - isce3>=0.15.0
   - jsonschema==3.2.0 # this is for ASF DAAC ingest schema validation
   - lxml
   - matplotlib
diff --git a/test/conftest.py b/test/conftest.py
index 67fb7f227..004a2e67f 100644
--- a/test/conftest.py
+++ b/test/conftest.py
@@ -81,10 +81,8 @@ def weather_model_dict_for_azimuth_time_test():
     test_data = TEST_DIR / 'gunw_azimuth_test_data' / 'weather_files'
     return {'HRRR': [test_data / 'HRRR_2021_07_23_T02_00_00_33N_36N_120W_115W.nc',
                      test_data / 'HRRR_2021_07_23_T01_00_00_33N_36N_120W_115W.nc',
-                     test_data / 'HRRR_2021_07_23_T03_00_00_33N_36N_120W_115W.nc',
                      test_data / 'HRRR_2021_07_11_T02_00_00_33N_36N_120W_115W.nc',
                      test_data / 'HRRR_2021_07_11_T01_00_00_33N_36N_120W_115W.nc',
-                     test_data / 'HRRR_2021_07_11_T03_00_00_33N_36N_120W_115W.nc'
                      ]}
 
 
@@ -107,4 +105,48 @@ def orbit_paths_for_duplicate_orbit_xml_test():
                         'S1A_OPER_AUX_POEORB_OPOD_20230413T080643_V20230323T225942_20230325T005942.EOF',
                         'S1A_OPER_AUX_POEORB_OPOD_20230413T080643_V20230323T225942_20230325T005942.EOF',
                         'S1A_OPER_AUX_POEORB_OPOD_20230412T080821_V20230322T225942_20230324T005942.EOF']
-    return [test_data / fn for fn in orbit_file_names]
\ No newline at end of file
+    return [test_data / fn for fn in orbit_file_names]
+
+
+@pytest.fixture(scope='session')
+def weather_model_dict_for_gunw_integration_test():
+    """Order is important here; will be in chronological order with respect to the closest datetimes.
+ + Generate via: + ``` + from RAiDER.processWM import prepareWeatherModel + from RAiDER.models import GMAO + import datetime + + model = GMAO() + datetimes = [datetime.datetime(2020, 1, 30, 12, 0), + datetime.datetime(2020, 1, 30, 15, 0), + datetime.datetime(2020, 1, 24, 12, 0), + datetime.datetime(2020, 1, 24, 15, 0)] + bounds = [32.5, 35.5, -119.8, -115.7] + wmfiles = [prepareWeatherModel(model, dt, bounds) for dt in datetimes] + ``` + """ + test_data = TEST_DIR / 'gunw_test_data' / 'weather_files' + return {'GMAO': [test_data / 'GMAO_2020_01_30_T12_00_00_32N_36N_121W_114W.nc', + test_data / 'GMAO_2020_01_30_T15_00_00_32N_36N_121W_114W.nc', + test_data / 'GMAO_2020_01_24_T12_00_00_32N_36N_121W_114W.nc', + test_data / 'GMAO_2020_01_24_T15_00_00_32N_36N_121W_114W.nc'] + } + +@pytest.fixture(scope='session') +def data_for_hrrr_ztd(): + '''Obtained via: + ``` + from RAiDER.processWM import prepareWeatherModel + from RAiDER.models import HRRR + import datetime + + model = HRRR() + datetimes = [datetime.datetime(2020, 1, 1, 12)] + bounds = [36, 37, -92, -91] + wmfiles = [prepareWeatherModel(model, dt, bounds) for dt in datetimes] + ``` + ''' + test_data_dir = TEST_DIR / 'scenario_1' / 'HRRR_ztd_test' + return test_data_dir / 'HRRR_2020_01_01_T12_00_00_35N_38N_93W_90W.nc' \ No newline at end of file diff --git a/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_11_T03_00_00_33N_36N_120W_115W.nc b/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_11_T03_00_00_33N_36N_120W_115W.nc deleted file mode 100644 index 366fb1cd6..000000000 Binary files a/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_11_T03_00_00_33N_36N_120W_115W.nc and /dev/null differ diff --git a/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_23_T03_00_00_33N_36N_120W_115W.nc b/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_23_T03_00_00_33N_36N_120W_115W.nc deleted file mode 100644 index 65c249900..000000000 Binary files a/test/gunw_azimuth_test_data/weather_files/HRRR_2021_07_23_T03_00_00_33N_36N_120W_115W.nc and /dev/null differ diff --git a/test/gunw_test_data/weather_files/GMAO_2020_01_24T13_52_44_timeInterp_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_24T13_52_44_timeInterp_32N_36N_121W_114W.nc new file mode 100644 index 000000000..8d03c4c0b Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_24T13_52_44_timeInterp_32N_36N_121W_114W.nc differ diff --git a/test/gunw_test_data/weather_files/GMAO_2020_01_24_T12_00_00_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_24_T12_00_00_32N_36N_121W_114W.nc new file mode 100644 index 000000000..e43bdfdd9 Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_24_T12_00_00_32N_36N_121W_114W.nc differ diff --git a/test/gunw_test_data/weather_files/GMAO_2020_01_24_T15_00_00_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_24_T15_00_00_32N_36N_121W_114W.nc new file mode 100644 index 000000000..848d37a52 Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_24_T15_00_00_32N_36N_121W_114W.nc differ diff --git a/test/gunw_test_data/weather_files/GMAO_2020_01_30T13_52_44_timeInterp_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_30T13_52_44_timeInterp_32N_36N_121W_114W.nc new file mode 100644 index 000000000..49d3b1e83 Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_30T13_52_44_timeInterp_32N_36N_121W_114W.nc differ diff --git 
a/test/gunw_test_data/weather_files/GMAO_2020_01_30_T12_00_00_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_30_T12_00_00_32N_36N_121W_114W.nc
new file mode 100644
index 000000000..775729de2
Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_30_T12_00_00_32N_36N_121W_114W.nc differ
diff --git a/test/gunw_test_data/weather_files/GMAO_2020_01_30_T15_00_00_32N_36N_121W_114W.nc b/test/gunw_test_data/weather_files/GMAO_2020_01_30_T15_00_00_32N_36N_121W_114W.nc
new file mode 100644
index 000000000..4c54e93ef
Binary files /dev/null and b/test/gunw_test_data/weather_files/GMAO_2020_01_30_T15_00_00_32N_36N_121W_114W.nc differ
diff --git a/test/scenario_1/HRRR_ztd_test/HRRR_2020_01_01_T12_00_00_35N_38N_93W_90W.nc b/test/scenario_1/HRRR_ztd_test/HRRR_2020_01_01_T12_00_00_35N_38N_93W_90W.nc
new file mode 100644
index 000000000..37e1fa320
Binary files /dev/null and b/test/scenario_1/HRRR_ztd_test/HRRR_2020_01_01_T12_00_00_35N_38N_93W_90W.nc differ
diff --git a/test/test_GUNW.py b/test/test_GUNW.py
index 0836d988f..71c18db90 100644
--- a/test/test_GUNW.py
+++ b/test/test_GUNW.py
@@ -18,7 +18,10 @@
 import RAiDER
 import RAiDER.cli.raider as raider
 from RAiDER import aws
-from RAiDER.aria.prepFromGUNW import check_weather_model_availability
+from RAiDER.aria.prepFromGUNW import (
+    check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation,
+    check_weather_model_availability
+)
 from RAiDER.cli.raider import calcDelaysGUNW
 
 
@@ -35,16 +38,31 @@ def compute_transform(lats, lons):
 
 @pytest.mark.isce3
 @pytest.mark.parametrize('weather_model_name', ['GMAO'])
-def test_GUNW_update(test_dir_path, test_gunw_path_factory, weather_model_name):
+def test_GUNW_dataset_update(test_dir_path, test_gunw_path_factory, weather_model_name,
+                             weather_model_dict_for_gunw_integration_test, mocker):
+    """The GUNW from the test GUNW path factory is:
+
+    S1-GUNW-D-R-071-tops-20200130_20200124-135156-34956N_32979N-PP-913f-v2_0_4.nc
+
+    Therefore the relevant GMAO datetimes are
+    12 pm and 3 pm (in that order)
+    """
     scenario_dir = test_dir_path / 'GUNW'
     scenario_dir.mkdir(exist_ok=True, parents=True)
     orig_GUNW = test_gunw_path_factory()
     updated_GUNW = scenario_dir / orig_GUNW.name
     shutil.copy(orig_GUNW, updated_GUNW)
 
-    cmd = f'raider.py ++process calcDelaysGUNW -f {updated_GUNW} -m {weather_model_name} -o {scenario_dir} -interp center_time'
-    proc = subprocess.run(cmd.split(), stdout=subprocess.PIPE, universal_newlines=True)
-    assert proc.returncode == 0
+    iargs = ['--weather-model', weather_model_name,
+             '--file', str(updated_GUNW),
+             '-interp', 'center_time']
+
+    side_effect = weather_model_dict_for_gunw_integration_test[weather_model_name]
+    # RAiDER needs strings for paths
+    side_effect = list(map(str, side_effect))
+    mocker.patch('RAiDER.processWM.prepareWeatherModel',
+                 side_effect=side_effect)
+    calcDelaysGUNW(iargs)
 
     # check the CRS and affine are written correctly
     epsg = 4326
@@ -73,7 +91,7 @@ def test_GUNW_update(test_dir_path, test_gunw_path_factory, weather_model_name):
     [os.remove(f) for f in glob.glob(f'{weather_model_name}*')]
 
 
-def test_GUNW_metadata_update(test_gunw_json_path, test_gunw_json_schema_path, tmp_path, mocker):
+def test_GUNW_hyp3_metadata_update(test_gunw_json_path, test_gunw_json_schema_path, tmp_path, mocker):
     """This test performs the GUNW entrypoint with bucket/prefix provided and only updates the json.
     Monkey patches the upload/download to/from s3 and the actual computation.
""" @@ -85,10 +103,11 @@ def test_GUNW_metadata_update(test_gunw_json_path, test_gunw_json_schema_path, t mocker.patch("RAiDER.aws.get_s3_file", side_effect=['foo.nc', temp_json_path]) mocker.patch("RAiDER.aws.upload_file_to_s3") mocker.patch("RAiDER.aria.prepFromGUNW.main", return_value=['my_path_cfg', 'my_wavelength']) + mocker.patch('RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation', + side_effect=[True]) mocker.patch("RAiDER.aria.prepFromGUNW.check_weather_model_availability", return_value=True) mocker.patch("RAiDER.cli.raider.calcDelays", return_value=['file1', 'file2']) mocker.patch("RAiDER.aria.calcGUNW.tropo_gunw_slc") - mocker.patch("os.getcwd", return_value='myDir') iargs = ['--weather-model', 'HRES', '--bucket', 'myBucket', @@ -114,8 +133,6 @@ def test_GUNW_metadata_update(test_gunw_json_path, test_gunw_json_schema_path, t ['file1', 'file2'], 'foo.nc', 'my_wavelength', - 'myDir', - True, ) assert aws.upload_file_to_s3.mock_calls == [ @@ -125,13 +142,13 @@ def test_GUNW_metadata_update(test_gunw_json_path, test_gunw_json_schema_path, t @pytest.mark.parametrize('weather_model_name', ['HRRR']) -def test_azimuth_timing_against_interpolation(weather_model_name: str, - tmp_path: Path, - gunw_azimuth_test: Path, - orbit_dict_for_azimuth_time_test: dict[str], - weather_model_dict_for_azimuth_time_test, - weather_model_dict_for_center_time_test, - mocker): +def test_azimuth_timing_interp_against_center_time_interp(weather_model_name: str, + tmp_path: Path, + gunw_azimuth_test: Path, + orbit_dict_for_azimuth_time_test: dict[str], + weather_model_dict_for_azimuth_time_test, + weather_model_dict_for_center_time_test, + mocker): """This test shows that the azimuth timing interpolation does not deviate from the center time by more than 1 mm for the HRRR model. This is expected since the model times are 6 hours apart and a the azimuth time is changing the interpolation weights for a given pixel at the order @@ -162,10 +179,8 @@ def test_azimuth_timing_against_interpolation(weather_model_name: str, ``` wfile_0 = prepareWeatherModel(model, datetime.datetime(2021, 7, 23, 2, 0), [33.45, 35.45, -119.15, -115.95]) wfile_1 = prepareWeatherModel(model, datetime.datetime(2021, 7, 23, 1, 0), [33.45, 35.45, -119.15, -115.95]) - wfile_2 = prepareWeatherModel(model, datetime.datetime(2021, 7, 23, 3, 0), [33.45, 35.45, -119.15, -115.95]) wfile_3 = prepareWeatherModel(model, datetime.datetime(2021, 7, 11, 2, 0), [33.45, 35.45, -119.15, -115.95]) wfile_4 = prepareWeatherModel(model, datetime.datetime(2021, 7, 11, 1, 0), [33.45, 35.45, -119.15, -115.95]) - wfile_5 = prepareWeatherModel(model, datetime.datetime(2021, 7, 11, 3, 0), [33.45, 35.45, -119.15, -115.95]) ``` Note for azimuth time they are acquired in order of proximity to acq as opposed to chronological. 
@@ -242,8 +257,8 @@ def test_azimuth_timing_against_interpolation(weather_model_name: str,
     ]
     calcDelaysGUNW(iargs_1)
 
-    # Calls 6 times for azimuth time and 4 times for center time
-    assert RAiDER.processWM.prepareWeatherModel.call_count == 10
+    # Calls 4 times for azimuth time and 4 times for center time
+    assert RAiDER.processWM.prepareWeatherModel.call_count == 8
     # Only calls for azimuth timing for reference and secondary;
     # There is 1 slc for ref and 2 for secondary, totalling 3 calls
     assert hyp3lib.get_orb.downloadSentinelOrbitFile.call_count == 3
@@ -273,8 +288,8 @@ def test_check_weather_model_availability(test_gunw_path_factory, weather_model_
     assert check_weather_model_availability(test_gunw_path, weather_model_name)
 
     # Let's mock an earlier date for some models
-    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_from_slc_id", side_effect=[pd.Timestamp('2015-01-01'),
-                                                                              pd.Timestamp('2014-01-01')])
+    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_time_from_slc_id", side_effect=[pd.Timestamp('2015-01-01'),
+                                                                                   pd.Timestamp('2014-01-01')])
     cond = check_weather_model_availability(test_gunw_path, weather_model_name)
     if weather_model_name in ['HRRR', 'GMAO']:
         cond = not cond
@@ -289,8 +304,8 @@ def test_check_weather_model_availability_over_alaska(test_gunw_path_factory, we
     assert check_weather_model_availability(test_gunw_path, weather_model_name)
 
     # Let's mock an earlier date
-    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_from_slc_id", side_effect=[pd.Timestamp('2017-01-01'),
-                                                                              pd.Timestamp('2016-01-01')])
+    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_time_from_slc_id", side_effect=[pd.Timestamp('2017-01-01'),
+                                                                                   pd.Timestamp('2016-01-01')])
     cond = check_weather_model_availability(test_gunw_path, weather_model_name)
     if weather_model_name == 'HRRR':
         cond = not cond
@@ -299,7 +314,11 @@
 
 @pytest.mark.parametrize('weather_model_name', ['HRRR', 'GMAO'])
 @pytest.mark.parametrize('location', ['california-t71', 'alaska'])
-def test_weather_model_availability_integration(location, test_gunw_path_factory, tmp_path, weather_model_name, mocker):
+def test_weather_model_availability_integration_using_valid_range(location,
+                                                                  test_gunw_path_factory,
+                                                                  tmp_path,
+                                                                  weather_model_name,
+                                                                  mocker):
     temp_json_path = tmp_path / 'temp.json'
     test_gunw_path = test_gunw_path_factory(location=location)
     shutil.copy(test_gunw_path, temp_json_path)
@@ -307,9 +326,14 @@
     # We will pass the test GUNW to the workflow
     mocker.patch("RAiDER.aws.get_s3_file", side_effect=[test_gunw_path, 'foo.json'])
     mocker.patch("RAiDER.aws.upload_file_to_s3")
+
+    # There is another test for checking the actual files - here we are only checking the valid range
+    if weather_model_name == 'HRRR':
+        mocker.patch('RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation',
+                     side_effect=[True])
    # These are outside temporal availability of GMAO and HRRR
     ref_date, sec_date = pd.Timestamp('2015-01-01'), pd.Timestamp('2014-01-01')
-    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_from_slc_id", side_effect=[ref_date, sec_date])
+    mocker.patch("RAiDER.aria.prepFromGUNW.get_acq_time_from_slc_id", side_effect=[ref_date, sec_date])
     # Don't specify side-effects or return values, because never called
     mocker.patch("RAiDER.aria.prepFromGUNW.main")
     mocker.patch("RAiDER.cli.raider.calcDelays")
@@ -326,3 +350,192 @@ def test_weather_model_availability_integration(location, 
test_gunw_path_factory RAiDER.cli.raider.calcDelays.assert_not_called() RAiDER.aria.prepFromGUNW.main.assert_not_called() RAiDER.aria.calcGUNW.tropo_gunw_slc.assert_not_called() + + +@pytest.mark.parametrize('weather_model_name', ['HRRR']) +@pytest.mark.parametrize('interp_method', ['center_time', 'azimuth_time_grid']) +def test_provenance_metadata_for_tropo_group(weather_model_name: str, + tmp_path: Path, + gunw_azimuth_test: Path, + orbit_dict_for_azimuth_time_test: dict[str], + weather_model_dict_for_azimuth_time_test, + weather_model_dict_for_center_time_test, + interp_method, + mocker): + """ + Same mocks as `test_azimuth_timing_interp_against_center_time_interp` above. + """ + + out = gunw_azimuth_test.name.replace('.nc', '__ct_interp.nc') + + out_path = shutil.copy(gunw_azimuth_test, tmp_path / out) + + if interp_method == 'azimuth_time_grid': + mocker.patch('hyp3lib.get_orb.downloadSentinelOrbitFile', + # Hyp3 Lib returns 2 values + side_effect=[ + # For azimuth time + (orbit_dict_for_azimuth_time_test['reference'], ''), + (orbit_dict_for_azimuth_time_test['secondary'], ''), + (orbit_dict_for_azimuth_time_test['secondary'], ''), + ] + ) + + # For prepGUNW + side_effect = [ + # center-time + orbit_dict_for_azimuth_time_test['reference'], + # azimuth-time + orbit_dict_for_azimuth_time_test['reference'], + ] + side_effect = list(map(str, side_effect)) + mocker.patch('eof.download.download_eofs', + side_effect=side_effect) + + # These outputs are not needed since the orbits are specified above + mocker.patch('RAiDER.s1_azimuth_timing.get_slc_id_from_point_and_time', + side_effect=[ + # Azimuth time + ['reference_slc_id'], + # using two "dummy" ids to mimic GUNW sec granules + # See docstring + ['secondary_slc_id', 'secondary_slc_id'], + ]) + + weather_model_path_dict = (weather_model_dict_for_center_time_test + if interp_method == 'center_time' + else weather_model_dict_for_azimuth_time_test) + side_effect = weather_model_path_dict[weather_model_name] + # RAiDER needs strings for paths + side_effect = list(map(str, side_effect)) + mocker.patch('RAiDER.processWM.prepareWeatherModel', + side_effect=side_effect) + iargs = [ + '--file', str(out_path), + '--weather-model', weather_model_name, + '-interp', interp_method + ] + calcDelaysGUNW(iargs) + + # Check metadata + model_times_dict = {'reference': ["20210723T01:00:00", "20210723T02:00:00"], + 'secondary': ["20210711T01:00:00", "20210711T02:00:00"]} + time_dict = {'reference': "20210723T01:50:24", + 'secondary': "20210711T01:50:24"} + for insar_date in ['reference', 'secondary']: + group = f'science/grids/corrections/external/troposphere/HRRR/{insar_date}' + with xr.open_dataset(out_path, group=group) as ds: + center_time = time_dict[insar_date] + model_times_used = model_times_dict[insar_date] + for var in ['troposphereWet', 'troposphereHydrostatic']: + assert ds[var].attrs['time_interpolation_method'] == interp_method + assert ds[var].attrs['scene_center_time'] == center_time + assert ds[var].attrs['model_times_used'] == model_times_used + + +def test_hrrr_availability_check_using_gunw_ids(mocker): + """Hits the HRRR servers and makes sure that for certain dates they are indeed flagged as false + """ + + # All dates in 2023 are available + gunw_id = 'S1-GUNW-A-R-106-tops-20230108_20230101-225947-00078W_00041N-PP-4be8-v3_0_0' + assert check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation(gunw_id) + + # 2016-08-09 16:00:00 is a missing date + gunw_id = 
'S1-GUNW-A-R-106-tops-20160809_20140101-160001-00078W_00041N-PP-4be8-v3_0_0'
+    assert not check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation(gunw_id)
+
+
+def test_hyp3_exits_succesfully_when_hrrr_not_available(mocker):
+    """This test runs the GUNW entrypoint with bucket/prefix provided when the required HRRR data is unavailable.
+    Monkey patches the upload/download to/from s3 and the actual computation.
+    """
+    # 2016-08-09 16:00:00 is a missing date
+    mocker.patch('RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation',
+                 side_effect=[False])
+    # The gunw id should not have a hyp3 file associated with it
+    # This call will still hit the HRRR s3 API as done in the previous test
+    mocker.patch("RAiDER.aws.get_s3_file", side_effect=['hyp3-job-uuid-3ad24/S1-GUNW-A-R-106-tops-20160809_20140101-160001-00078W_00041N-PP-4be8-v3_0_0.nc'])
+    mocker.patch('RAiDER.aria.prepFromGUNW.check_weather_model_availability')
+    iargs = [
+        '--bucket', 's3://foo',
+        '--bucket-prefix', 'hyp3-job-uuid-3ad24',
+        '--weather-model', 'HRRR',
+        '-interp', 'azimuth_time_grid'
+    ]
+    out = calcDelaysGUNW(iargs)
+    assert out is None
+    # Ensure calcDelaysGUNW in raider.py ended after it saw HRRR was not available
+    RAiDER.aria.prepFromGUNW.check_weather_model_availability.assert_not_called()
+
+
+def test_GUNW_workflow_fails_if_a_download_fails(gunw_azimuth_test, orbit_dict_for_azimuth_time_test, mocker):
+    """Makes sure that, for azimuth-time-grid interpolation, an error is raised if one of the files fails to
+    download, and that no additional processing is done"""
+    # The first part is the same mock up as done in test_azimuth_timing_interp_against_center_time_interp
+    # Maybe better mocks could be done - but this is sufficient; alternatively, a factory could be made for
+    # this setup given it is reused so many times.
+    mocker.patch('hyp3lib.get_orb.downloadSentinelOrbitFile',
+                 # Hyp3 Lib returns 2 values
+                 side_effect=[
+                     # For azimuth time
+                     (orbit_dict_for_azimuth_time_test['reference'], ''),
+                     (orbit_dict_for_azimuth_time_test['secondary'], ''),
+                     (orbit_dict_for_azimuth_time_test['secondary'], ''),
+                 ]
+                 )
+
+    # For prepGUNW
+    side_effect = [
+        # center-time
+        orbit_dict_for_azimuth_time_test['reference'],
+        # azimuth-time
+        orbit_dict_for_azimuth_time_test['reference'],
+    ]
+    side_effect = list(map(str, side_effect))
+    mocker.patch('eof.download.download_eofs',
+                 side_effect=side_effect)
+
+    # These outputs are not needed since the orbits are specified above
+    mocker.patch('RAiDER.s1_azimuth_timing.get_slc_id_from_point_and_time',
+                 side_effect=[
+                     # Azimuth time
+                     ['reference_slc_id'],
+                     # using two "dummy" ids to mimic GUNW sec granules
+                     # See docstring
+                     ['secondary_slc_id', 'secondary_slc_id'],
+                 ])
+
+    # These are the important parts of this test
+    # Makes sure that a ValueError is raised if a download fails via a RuntimeError
+    # There are two weather model files required for this particular mock up. The first one fails.
+    mocker.patch('RAiDER.processWM.prepareWeatherModel', side_effect=[RuntimeError, 'weather_model.nc'])
+    mocker.patch('RAiDER.s1_azimuth_timing.get_s1_azimuth_time_grid')
+    iargs_1 = [
+        '--file', str(gunw_azimuth_test),
+        '--weather-model', 'HRRR',
+        '-interp', 'azimuth_time_grid'
+    ]
+
+    with pytest.raises(ValueError):
+        calcDelaysGUNW(iargs_1)
+    RAiDER.s1_azimuth_timing.get_s1_azimuth_time_grid.assert_not_called()
+
+
+def test_value_error_for_file_inputs_when_no_data_available(mocker):
+    """See test_hyp3_exits_succesfully_when_hrrr_not_available above
+
+    In this case, if a bucket is specified rather than a file, the program exits successfully!
+    """
+    mocker.patch('RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation',
+                 side_effect=[False])
+    mocker.patch('RAiDER.aria.prepFromGUNW.main')
+    iargs = [
+        '--file', 'foo.nc',
+        '--weather-model', 'HRRR',
+        '-interp', 'azimuth_time_grid'
+    ]
+
+    with pytest.raises(ValueError):
+        calcDelaysGUNW(iargs)
+    RAiDER.aria.prepFromGUNW.main.assert_not_called()
diff --git a/test/test_HRRR_ztd.py b/test/test_HRRR_ztd.py
index 36c040f29..babf21727 100644
--- a/test/test_HRRR_ztd.py
+++ b/test/test_HRRR_ztd.py
@@ -7,13 +7,14 @@
 import numpy as np
 import xarray as xr
 
+from RAiDER.cli.raider import calcDelays
 
-
-def test_scenario_1():
+def test_scenario_1(data_for_hrrr_ztd, mocker):
     SCENARIO_DIR = os.path.join(TEST_DIR, "scenario_1")
     test_path = os.path.join(SCENARIO_DIR, 'raider_example_1.yaml')
-    process = subprocess.run(['raider.py', test_path],stdout=subprocess.PIPE, universal_newlines=True,)
-    assert process.returncode == 0
+    mocker.patch('RAiDER.processWM.prepareWeatherModel',
+                 side_effect=[str(data_for_hrrr_ztd)])
+    calcDelays([test_path])
 
     new_data = xr.load_dataset(os.path.join(SCENARIO_DIR, 'HRRR_tropo_20200101T120000_ztd.nc'))
     new_data1 = new_data.sel(x=-91.84, y=36.84, z=0, method='nearest')
@@ -21,8 +22,3 @@
 
     np.testing.assert_almost_equal(golden_data[0], new_data1['hydro'].data)
     np.testing.assert_almost_equal(golden_data[1], new_data1['wet'].data)
-
-    # Clean up files
-    for f in glob.glob(os.path.join(SCENARIO_DIR, 'HRRR*')):
-        os.remove(f)
-    shutil.rmtree(os.path.join(SCENARIO_DIR, 'weather_files'))
diff --git a/test/test_s1_time_grid.py b/test/test_s1_time_grid.py
index 1247166be..0b4102b75 100644
--- a/test/test_s1_time_grid.py
+++ b/test/test_s1_time_grid.py
@@ -10,7 +10,8 @@
 import RAiDER.s1_azimuth_timing
 from RAiDER.s1_azimuth_timing import (
     get_inverse_weights_for_dates, get_n_closest_datetimes,
-    get_s1_azimuth_time_grid, get_slc_id_from_point_and_time
+    get_s1_azimuth_time_grid, get_slc_id_from_point_and_time,
+    get_times_for_azimuth_interpolation
 )
 
 
@@ -180,6 +181,19 @@ def test_n_closest_dts():
                 datetime.datetime(2023, 1, 2, 0, 0, 0)]
     assert out == expected
 
+    # Make sure that if ref_time occurs at a model time then we get the correct times
+    n_target_datetimes = 3
+    time_step = 1
+    dt = datetime.datetime(2023, 1, 2, 0, 0, 0)
+    out = get_n_closest_datetimes(dt, n_target_datetimes, time_step)
+    expected = [datetime.datetime(2023, 1, 2, 0, 0, 0),
+                # Note we order times equidistant from the ref time by how early they are
+                datetime.datetime(2023, 1, 1, 23, 0, 0),
+                datetime.datetime(2023, 1, 2, 1, 0, 0),
+                ]
+
+    assert out == expected
+
     n_target_datetimes = 2
     # Does not divide 24 hours so has period > 1 day.
    time_step = 5
@@ -335,3 +349,40 @@ def test_duplicate_orbits(mocker, orbit_paths_for_duplicate_orbit_xml_test):
 
     assert RAiDER.s1_azimuth_timing.get_slc_id_from_point_and_time.call_count == 1
     assert hyp3lib.get_orb.downloadSentinelOrbitFile.call_count == 4
+
+
+def test_get_times_for_az():
+
+    # Within 5 minutes of a time step (aka model time) so returns 3 times
+    dt = datetime.datetime(2023, 1, 1, 11, 1, 0)
+    out = get_times_for_azimuth_interpolation(dt, 1)
+
+    out_expected = [datetime.datetime(2023, 1, 1, 11, 0, 0),
+                    datetime.datetime(2023, 1, 1, 12, 0, 0),
+                    datetime.datetime(2023, 1, 1, 10, 0, 0)]
+
+    assert out == out_expected
+
+    # Since the model time step is now 3 hours, we are beyond the buffer, so we get 2 times
+    out = get_times_for_azimuth_interpolation(dt, 3)
+    out_expected = [datetime.datetime(2023, 1, 1, 12, 0, 0),
+                    datetime.datetime(2023, 1, 1, 9, 0, 0)]
+    assert out == out_expected
+
+    # Similarly, return 2 times if we nudge the reference time away from the buffer
+    # when the model time step is 1 hour
+    # Note that if we chose 11:30 we would get 2 dates which would both be admissible
+    dt = datetime.datetime(2023, 1, 1, 11, 29, 0)
+    out = get_times_for_azimuth_interpolation(dt, 1)
+
+    out_expected = [datetime.datetime(2023, 1, 1, 11, 0, 0),
+                    datetime.datetime(2023, 1, 1, 12, 0, 0)]
+
+    assert out == out_expected
+
+
+def test_error_for_weighting_when_dates_not_unique():
+    dates = [datetime.datetime(2023, 1, 1)] * 2
+    with pytest.raises(ValueError):
+        get_inverse_weights_for_dates(np.zeros((3, 3)),
+                                      dates)
diff --git a/tools/RAiDER/aria/calcGUNW.py b/tools/RAiDER/aria/calcGUNW.py
index 54cac6400..12c18cf91 100644
--- a/tools/RAiDER/aria/calcGUNW.py
+++ b/tools/RAiDER/aria/calcGUNW.py
@@ -20,8 +20,22 @@
 DIM_NAMES = ['heightsMeta', 'latitudeMeta', 'longitudeMeta']
 
 
-def compute_delays_slc(cube_filenames:list, wavelength):
-    """ Get delays and convert to radians. Return xr dataset. """
+def compute_delays_slc(cube_filenames: list, wavelength: float) -> xr.Dataset:
+    """Get delays from the standard RAiDER output format, including conversion to
+    radians and metadata.
+
+    Parameters
+    ----------
+    cube_filenames : list
+        List of xarray datasets/cubes (for each date) output from raider.py calcDelays
+    wavelength : float
+        Depends on sensor, e.g. for Sentinel-1 it is ~0.05
+
+    Returns
+    -------
+    xr.Dataset
+        Formatted dataset for GUNW
+    """
     # parse date from filename
     dct_delays = {}
     for f in cube_filenames:
@@ -30,26 +44,25 @@ def compute_delays_slc(cube_filenames:list, wavelength):
 
     sec, ref = sorted(dct_delays.keys())
 
-    wet_delays = []
-    hyd_delays = []
+    wet_delays = []
+    hyd_delays = []
+    attrs_lst = []
     phase2range = (-4 * np.pi) / float(wavelength)
     for dt in [ref, sec]:
         path = dct_delays[dt]
         with xr.open_dataset(path) as ds:
-            da_wet = ds['wet'] * phase2range
+            da_wet = ds['wet'] * phase2range
             da_hydro = ds['hydro'] * phase2range
             wet_delays.append(da_wet)
             hyd_delays.append(da_hydro)
-
-    crs = da_wet.rio.crs
-    gt = da_wet.rio.transform()
+            attrs_lst.append(ds.attrs)
 
     chunk_sizes = da_wet.shape[0], da_wet.shape[1]/3, da_wet.shape[2]/3
 
     # open one to copy and store new data
-    ds_slc = xr.open_dataset(path).copy()
-    encoding = ds_slc['wet'].encoding # chunksizes and fill value
+    ds_slc = xr.open_dataset(path).copy()
+    encoding = ds_slc['wet'].encoding  # chunksizes and fill value
     encoding['contiguous'] = False
     encoding['_FillValue'] = 0.
     encoding['chunksizes'] = tuple([np.floor(cs) for cs in chunk_sizes])
@@ -68,19 +81,25 @@
     ## no data (fill value?) 
chunk size? for name in TROPO_NAMES: - for key in 'reference secondary'.split(): - descrip = f"Delay due to {name.lstrip('troposphere')} component of troposphere" - da_attrs = {**attrs, 'description':descrip, - 'long_name':name, 'standard_name':name, + for k, key in enumerate(['reference', 'secondary']): + descrip = f"Delay due to {name.lstrip('troposphere')} component of troposphere" + da_attrs = {**attrs, + 'description': descrip, + 'long_name': name, + 'standard_name': name, 'RAiDER version': RAiDER.__version__, + 'model_times_used': attrs_lst[k]['model_times_used'], + 'scene_center_time': attrs_lst[k]['reference_time'], + 'time_interpolation_method': attrs_lst[k]['interpolation_method'] } ds_slc[f'{key}_{name}'] = ds_slc[f'{key}_{name}'].assign_attrs(da_attrs) ds_slc[f'{key}_{name}'].encoding = encoding + ds_slc = ds_slc.assign_attrs(model=model, + method='ray tracing' + ) - ds_slc = ds_slc.assign_attrs(model=model, method='ray tracing') - - ## force these to float32 to prevent stitching errors - coords = {coord:ds_slc[coord].astype(np.float32) for coord in ds_slc.coords} + # force these to float32 to prevent stitching errors + coords = {coord: ds_slc[coord].astype(np.float32) for coord in ds_slc.coords} ds_slc = ds_slc.assign_coords(coords) return ds_slc.rename(z=DIM_NAMES[0], y=DIM_NAMES[1], x=DIM_NAMES[2]) @@ -163,35 +182,31 @@ def update_gunw_version(path_gunw): return -### ------------------------------------------------------------- main function -def tropo_gunw_slc(cube_filenames:list, path_gunw:str, wavelength, out_dir:str, update_gunw:bool): - """ Calculate ref/sec phase delay +def tropo_gunw_slc(cube_filenames: list, + path_gunw: str, + wavelength: float) -> xr.Dataset: + """ + Computes and formats the troposphere phase delay for GUNW from RAiDER outputs. 
-    Requires:
+    Parameters
+    ----------
+    cube_filenames : list
         list with filename of delay cube for ref and sec date (netcdf)
-        path to the gunw file
-        wavelength (units: m)
-        output directory (where to store the delays)
+    path_gunw : str
+        GUNW netcdf path
+    wavelength : float
+        Wavelength of SAR
+
+    Returns
+    -------
+    xr.Dataset
+        Output cube that will be included in GUNW
     """
-    os.makedirs(out_dir, exist_ok=True)
-
     ds_slc = compute_delays_slc(cube_filenames, wavelength)
 
-    da = ds_slc[f'reference_{TROPO_NAMES[0]}']  # for metadata
-    model = ds_slc.model
 
     # write the interferometric delay to disk
-    ref, sec = os.path.basename(path_gunw).split('-')[6].split('_')
-    mid_time = os.path.basename(path_gunw).split('-')[7]
-    dst = os.path.join(out_dir, f'{model}_interferometric_{ref}-{sec}_{mid_time}.nc')
-    ds_slc.to_netcdf(dst)
-    logger.info ('Wrote slc delays to: %s', dst)
-
-    ## optionally update netcdf with the slc delay
     update_gunw_slc(path_gunw, ds_slc)
-
-    ## temp
     update_gunw_version(path_gunw)
+    logger.info('Wrote slc delays to: %s', path_gunw)
 
-
-if __name__ == '__main__':
-    main()
+    return ds_slc
diff --git a/tools/RAiDER/aria/prepFromGUNW.py b/tools/RAiDER/aria/prepFromGUNW.py
index 26cefca46..c2b4c984e 100644
--- a/tools/RAiDER/aria/prepFromGUNW.py
+++ b/tools/RAiDER/aria/prepFromGUNW.py
@@ -24,12 +24,55 @@
 from RAiDER.utilFcns import rio_open, writeArrayToRaster
 from RAiDER.logger import logger
 from RAiDER.models import credentials
-from RAiDER.models.hrrr import HRRR_CONUS_COVERAGE_POLYGON, AK_GEO
+from RAiDER.models.hrrr import HRRR_CONUS_COVERAGE_POLYGON, AK_GEO, check_hrrr_dataset_availability
+from RAiDER.s1_azimuth_timing import get_times_for_azimuth_interpolation
 
 
 ## cube spacing in degrees for each model
 DCT_POSTING = {'HRRR': 0.05, 'HRES': 0.10, 'GMAO': 0.10, 'ERA5': 0.10, 'ERA5T': 0.10}
 
 
+def _get_acq_time_from_gunw_id(gunw_id: str, reference_or_secondary: str) -> datetime:
+    # Ex: S1-GUNW-A-R-106-tops-20220115_20211222-225947-00078W_00041N-PP-4be8-v3_0_0
+    if reference_or_secondary not in ['reference', 'secondary']:
+        raise ValueError('reference_or_secondary must be "reference" or "secondary"')
+    tokens = gunw_id.split('-')
+    date_tokens = tokens[6].split('_')
+    date_token = date_tokens[0] if reference_or_secondary == 'reference' else date_tokens[1]
+    center_time_token = tokens[7]
+    cen_acq_time = datetime(int(date_token[:4]),
+                            int(date_token[4:6]),
+                            int(date_token[6:]),
+                            int(center_time_token[:2]),
+                            int(center_time_token[2:4]),
+                            int(center_time_token[4:]))
+    return cen_acq_time
+
+
+def check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation(gunw_id: str) -> bool:
+    """Determines if all the times for azimuth interpolation are available using Herbie; note that not all 1-hour
+    times are available within HRRR's stated valid date range.
+ + Parameters + ---------- + gunw_id : str + + Returns + ------- + bool + """ + ref_acq_time = _get_acq_time_from_gunw_id(gunw_id, 'reference') + sec_acq_time = _get_acq_time_from_gunw_id(gunw_id, 'secondary') + + model_step_hours = 1 + ref_times_for_interp = get_times_for_azimuth_interpolation(ref_acq_time, model_step_hours) + sec_times_for_interp = get_times_for_azimuth_interpolation(sec_acq_time, model_step_hours) + ref_dataset_availability = list(map(check_hrrr_dataset_availability, ref_times_for_interp)) + sec_dataset_availability = list(map(check_hrrr_dataset_availability, sec_times_for_interp)) + + return all(ref_dataset_availability) and all(sec_dataset_availability) + + def get_slc_ids_from_gunw(gunw_path: str, reference_or_secondary: str = 'reference') -> list[str]: if reference_or_secondary not in ['reference', 'secondary']: @@ -40,7 +83,7 @@ def get_slc_ids_from_gunw(gunw_path: str, return slc_ids -def get_acq_from_slc_id(slc_id: str) -> pd.Timestamp: +def get_acq_time_from_slc_id(slc_id: str) -> pd.Timestamp: ts_str = slc_id.split('_')[5] return pd.Timestamp(ts_str) @@ -70,8 +113,8 @@ def check_weather_model_availability(gunw_path: str, ref_slc_ids = get_slc_ids_from_gunw(gunw_path, reference_or_secondary='reference') sec_slc_ids = get_slc_ids_from_gunw(gunw_path, reference_or_secondary='secondary') - ref_ts = get_acq_from_slc_id(ref_slc_ids[0]) - sec_ts = get_acq_from_slc_id(sec_slc_ids[0]) + ref_ts = get_acq_time_from_slc_id(ref_slc_ids[0]) + sec_ts = get_acq_time_from_slc_id(sec_slc_ids[0]) if weather_model_name == 'HRRR': group = '/science/grids/data/' diff --git a/tools/RAiDER/cli/raider.py b/tools/RAiDER/cli/raider.py index c454c2f45..ea90c3363 100644 --- a/tools/RAiDER/cli/raider.py +++ b/tools/RAiDER/cli/raider.py @@ -21,7 +21,8 @@ from RAiDER.cli.validators import DateListAction, date_type from RAiDER.models.allowed import ALLOWED_MODELS from RAiDER.utilFcns import get_dt -from RAiDER.s1_azimuth_timing import get_s1_azimuth_time_grid, get_inverse_weights_for_dates, get_n_closest_datetimes +from RAiDER.s1_azimuth_timing import get_s1_azimuth_time_grid, get_inverse_weights_for_dates, get_times_for_azimuth_interpolation + import traceback @@ -76,7 +77,7 @@ def read_template_file(fname): params[key] = {} # Parse the user-provided arguments - template = DEFAULT_DICT + template = DEFAULT_DICT.copy() for key, value in params.items(): if key == 'runtime_group': for k, v in value.items(): @@ -259,13 +260,10 @@ def calcDelays(iargs=None): times = get_nearest_wmtimes(t, [model.dtime() if \ model.dtime() is not None else 6][0]) if interp_method == 'center_time' else [t] elif interp_method == 'azimuth_time_grid': - n_target_dates = 3 step = model.dtime() time_step_hours = step if step is not None else 6 - # Will yield 3 dates - times = get_n_closest_datetimes(t, - n_target_dates, - time_step_hours) + # Will yield 2 or 3 dates depending if t is within 5 minutes of time step + times = get_times_for_azimuth_interpolation(t, time_step_hours) else: raise NotImplementedError('Only none, center_time, and azimuth_time_grid are accepted values for ' 'interp_method.') @@ -290,12 +288,13 @@ def calcDelays(iargs=None): msg = f'Downloading and/or preparation of {model._Name} failed.' 
                logger.error(e)
                 logger.error(msg)
-
+                if interp_method == 'azimuth_time_grid':
+                    break
 
         # don't process the delays for download only
         if dl_only:
             continue
 
-        if len(wfiles) == 0:
+        if (len(wfiles) == 0) and (interp_method != 'azimuth_time_grid'):
             logger.error('No weather model data was successfully processed.')
             if len(params['date_list']) == 1:
                 raise RuntimeError
@@ -310,7 +309,7 @@
 
         # only one time in temporal interpolation worked
         # TODO: this seems problematic - unexpected behavior possibly for 'center_time'
-        elif len(wfiles)==1 and len(times)==2:
+        elif (len(wfiles) == 1) and (len(times) == 2) and (interp_method != 'azimuth_time_grid'):
             logger.warning('Time interpolation did not succeed, defaulting to nearest available date')
             weather_model_file = wfiles[0]
 
@@ -341,7 +340,12 @@
                 os.path.basename(wfiles[0]).split('_')[0] + '_' + t.strftime('%Y_%m_%dT%H_%M_%S') + '_timeInterp_' + '_'.join(wfiles[0].split('_')[-4:]),
             )
             ds.to_netcdf(weather_model_file)
-        elif (interp_method == 'azimuth_time_grid') and (len(wfiles) == 3):
+        elif (interp_method == 'azimuth_time_grid'):
+            n_files = len(wfiles)
+            n_times = len(times)
+            if n_files != n_times:
+                raise ValueError('The model files for the datetimes required for azimuth interpolation were not '
+                                 'successfully downloaded or processed')
 
             datasets = [xr.open_dataset(f) for f in wfiles]
 
             # Each model will require some inspection here
@@ -361,15 +365,17 @@
             else:
                 raise NotImplementedError('Azimuth Time is currently only implemented for HRRR')
 
-            time_grid = get_s1_azimuth_time_grid(lon,
-                                                 lat,
-                                                 hgt,
-                                                 # This is the acq time from loop
-                                                 t)
+            time_grid = RAiDER.s1_azimuth_timing.get_s1_azimuth_time_grid(lon,
+                                                                          lat,
+                                                                          hgt,
+                                                                          # This is the acq time from loop
+                                                                          t)
             if np.any(np.isnan(time_grid)):
-                raise ValueError('The Time Grid return nans meaning no orbit was downloaded.')
+                raise ValueError('The azimuth time grid returned NaNs, meaning no orbit was downloaded.')
 
-            wgts = get_inverse_weights_for_dates(time_grid, times)
+            wgts = get_inverse_weights_for_dates(time_grid,
+                                                 times,
+                                                 temporal_window_hours=model._time_res)
 
             # combine datasets
             ds_out = datasets[0]
             for var in ['wet', 'hydro', 'wet_total', 'hydro_total']:
@@ -386,7 +392,7 @@
         else:
             n = len(wfiles)
             raise NotImplementedError(f'The {interp_method} with {n} retrieved weather model files was not well posed '
-                                      'for the dela current workflow.')
+                                      'for the delay workflow.')
 
         # Now process the delays
         try:
@@ -410,12 +416,18 @@
         else:
             out_filename = w
 
+        # A dataset was returned by the above
+        # Dataset returned: Cube, e.g. 
GUNW workflow if hydro_delay is None: - # means that a dataset was returned (cubes and station files) ds = wet_delay ext = os.path.splitext(out_filename)[1] out_filename = out_filename.replace('wet', 'tropo') + # data provenance: include metadata for model and times used + times_str = [t.strftime("%Y%m%dT%H:%M:%S") for t in sorted(times)] + ds = ds.assign_attrs(model_name=model._Name, + model_times_used=times_str, + interpolation_method=interp_method) if ext not in ['.nc', '.h5']: out_filename = f'{os.path.splitext(out_filename)[0]}.nc' @@ -425,7 +437,7 @@ def calcDelays(iargs=None): ds.to_netcdf(out_filename, engine="h5netcdf", invalid_netcdf=True) logger.info('\nSuccessfully wrote delay cube to: %s\n', out_filename) - + # Dataset returned: station files, radar_raster, geocoded_file else: if aoi.type() == 'station_file': out_filename = f'{os.path.splitext(out_filename)[0]}.csv' @@ -526,7 +538,7 @@ def downloadGNSS(): ## ------------------------------------------------------------ prepFromGUNW.py -def calcDelaysGUNW(iargs: list[str] = None): +def calcDelaysGUNW(iargs: list[str] = None) -> xr.Dataset: p = argparse.ArgumentParser( description='Calculate a cube of interferometic delays for GUNW files', @@ -577,17 +589,12 @@ def calcDelaysGUNW(iargs: list[str] = None): help='Directory to store results.' ) - p.add_argument( - '-u', '--update-GUNW', default=True, - help='Optionally update the GUNW by writing the delays into the troposphere group.' - ) - - args = p.parse_args(iargs) + iargs = p.parse_args(iargs) - if args.interpolate_time not in ['none', 'center_time', 'azimuth_time_grid']: + if iargs.interpolate_time not in ['none', 'center_time', 'azimuth_time_grid']: raise ValueError('interpolate_time arg must be in [\'none\', \'center_time\', \'azimuth_time_grid\']') - if args.weather_model == 'None': + if iargs.weather_model == 'None': # NOTE: HyP3's current step function implementation does not have a good way of conditionally # running processing steps. This allows HyP3 to always run this step but exit immediately # and do nothing if tropospheric correction via RAiDER is not selected. 
This should not cause @@ -595,25 +602,40 @@ def calcDelaysGUNW(iargs: list[str] = None): print('Nothing to do!') return - # args.files = glob.glob(args.files) # eventually support multiple files - if not args.file and args.bucket: - args.file = aws.get_s3_file(args.bucket, args.bucket_prefix, '.nc') - if not RAiDER.aria.prepFromGUNW.check_weather_model_availability(args.file, args.weather_model): + if iargs.file and (iargs.weather_model == 'HRRR') and (iargs.interpolate_time == 'azimuth_time_grid'): + file_name = iargs.file.split('/')[-1] + gunw_id = file_name.replace('.nc', '') + if not RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation(gunw_id): + raise ValueError('The required HRRR data for time-grid interpolation is not available') + + if not iargs.file and iargs.bucket: + # only use GUNW ID for checking if HRRR available + iargs.file = aws.get_s3_file(iargs.bucket, iargs.bucket_prefix, '.nc') + if iargs.weather_model == 'HRRR' and (iargs.interpolate_time == 'azimuth_time_grid'): + file_name_str = str(iargs.file) + gunw_nc_name = file_name_str.split('/')[-1] + gunw_id = gunw_nc_name.replace('.nc', '') + if not RAiDER.aria.prepFromGUNW.check_hrrr_dataset_availablity_for_s1_azimuth_time_interpolation(gunw_id): + print('The required HRRR data for time-grid interpolation is not available; returning None and not modifying GUNW dataset') + return + + # Download file to obtain metadata + if not RAiDER.aria.prepFromGUNW.check_weather_model_availability(iargs.file, iargs.weather_model): # NOTE: We want to submit jobs that are outside of acceptable weather model range # and still deliver these products to the DAAC without this layer. Therefore # we include this within this portion of the control flow. print('Nothing to do because outside of weather model range') return - json_file_path = aws.get_s3_file(args.bucket, args.bucket_prefix, '.json') + json_file_path = aws.get_s3_file(iargs.bucket, iargs.bucket_prefix, '.json') json_data = json.load(open(json_file_path)) - json_data['metadata'].setdefault('weather_model', []).append(args.weather_model) + json_data['metadata'].setdefault('weather_model', []).append(iargs.weather_model) json.dump(json_data, open(json_file_path, 'w')) - elif not args.file: + elif not iargs.file: raise ValueError('Either argument --file or --bucket must be provided') # prep the config needed for delay calcs - path_cfg, wavelength = RAiDER.aria.prepFromGUNW.main(args) + path_cfg, wavelength = RAiDER.aria.prepFromGUNW.main(iargs) # write delay cube (nc) to disk using config # return a list with the path to cube for each date @@ -622,16 +644,16 @@ def calcDelaysGUNW(iargs: list[str] = None): assert len(cube_filenames) == 2, 'Incorrect number of delay files written.' 
    # calculate the interferometric phase and write it out
-    RAiDER.aria.calcGUNW.tropo_gunw_slc(cube_filenames,
-                                        args.file,
-                                        wavelength,
-                                        args.output_directory,
-                                        args.update_GUNW)
+    ds = RAiDER.aria.calcGUNW.tropo_gunw_slc(cube_filenames,
+                                             iargs.file,
+                                             wavelength,
+                                             )
 
     # upload to s3
-    if args.bucket:
-        aws.upload_file_to_s3(args.file, args.bucket, args.bucket_prefix)
-        aws.upload_file_to_s3(json_file_path, args.bucket, args.bucket_prefix)
+    if iargs.bucket:
+        aws.upload_file_to_s3(iargs.file, iargs.bucket, iargs.bucket_prefix)
+        aws.upload_file_to_s3(json_file_path, iargs.bucket, iargs.bucket_prefix)
+    return ds
 
 
 ## ------------------------------------------------------------ processDelays.py
diff --git a/tools/RAiDER/delay.py b/tools/RAiDER/delay.py
index f3c4e3166..f02bd4592 100755
--- a/tools/RAiDER/delay.py
+++ b/tools/RAiDER/delay.py
@@ -356,7 +356,7 @@ def writeResultsToXarray(dt, xpts, ypts, zpts, crs, wetDelay, hydroDelay, weathe
             source=os.path.basename(weather_model_file),
             history=str(datetime.utcnow()) + " RAiDER",
             description=f"RAiDER geo cube - {out_type}",
-            reference_time=str(dt),
+            reference_time=dt.strftime("%Y%m%dT%H:%M:%S"),
         ),
     )
 
diff --git a/tools/RAiDER/models/hrrr.py b/tools/RAiDER/models/hrrr.py
index 3f9c9c6d6..e6f8fe685 100644
--- a/tools/RAiDER/models/hrrr.py
+++ b/tools/RAiDER/models/hrrr.py
@@ -24,6 +24,15 @@
 AK_GEO = gpd.read_file(Path(__file__).parent / 'data' / 'alaska.geojson.zip').geometry.unary_union
 
 
+def check_hrrr_dataset_availability(dt: datetime) -> bool:
+    """Note: a file could still be missing within the model's valid range"""
+    H = Herbie(dt,
+               model='hrrr',
+               product='nat',
+               fxx=0)
+    avail = (H.grib_source is not None)
+    return avail
+
 
 def download_hrrr_file(ll_bounds, DATE, out, model='hrrr', product='nat', fxx=0, verbose=False):
     '''
     Download a HRRR weather model using Herbie
@@ -59,16 +68,21 @@
         raise ValueError
 
     ds_out = None
-
-    for ds in ds_list:
-        if 'isobaricInhPa' in ds._coord_names:
-            ds_out = ds
-            coord = 'isobaricInhPa'
-            break
-        elif 'hybrid' in ds._coord_names:
-            ds_out = ds
-            coord = 'hybrid'
-            break
+    # Note: the order of the coord names matters for `test_HRRR_ztd`; when both coord names are
+    # retrieved by Herbie, ds_list can come back in different orders on different machines.
+    # `hybrid` is what is expected for the test.
+    ds_list_filt_0 = [ds for ds in ds_list if 'hybrid' in ds._coord_names]
+    ds_list_filt_1 = [ds for ds in ds_list if 'isobaricInhPa' in ds._coord_names]
+    if ds_list_filt_0:
+        ds_out = ds_list_filt_0[0]
+        coord = 'hybrid'
+    # This coord name likely will not result in successful processing nominally, as the variables are
+    # gh, gribfile_projection for test_HRRR_ztd
+    elif ds_list_filt_1:
+        ds_out = ds_list_filt_1[0]
+        coord = 'isobaricInhPa'
+    else:
+        raise RuntimeError('Herbie did not obtain an HRRR dataset with the expected layers and coordinates')
 
     # subset the full file by AOI
     x_min, x_max, y_min, y_max = get_bounds_indices(
@@ -79,11 +93,10 @@
 
     # bookkeeping
     ds_out = ds_out.rename({'gh': 'z', coord: 'levels'})
-    ny, nx = ds_out['longitude'].shape
 
     # projection information
     ds_out["proj"] = int()
-    for k, v in CRS.from_user_input(ds.herbie.crs).to_cf().items():
+    for k, v in CRS.from_user_input(ds_out.herbie.crs).to_cf().items():
         ds_out.proj.attrs[k] = v
     for var in ds_out.data_vars:
         ds_out[var].attrs['grid_mapping'] = 'proj'
@@ -102,7 +115,6 @@
     ds_out['x'] = xs
     ds_out['y'] = ys
 
-
     ds_sub = ds_out.isel(x=slice(x_min, x_max), y=slice(y_min, y_max))
     ds_sub.to_netcdf(out, engine='netcdf4')
 
diff --git a/tools/RAiDER/s1_azimuth_timing.py b/tools/RAiDER/s1_azimuth_timing.py
index 2dce19b11..70a44ee24 100644
--- a/tools/RAiDER/s1_azimuth_timing.py
+++ b/tools/RAiDER/s1_azimuth_timing.py
@@ -214,7 +214,8 @@
     Returns
     -------
     list[datetime.datetime]
-        List of closest dates ordered by absolute proximity
+        List of closest dates ordered by absolute proximity. If two dates have the same distance to ref_time,
+        the earlier one is chosen (it is more likely to be available)
     """
     iterations = int(np.ceil(n_target_times / 2))
     closest_times = []
@@ -226,19 +227,74 @@
     ts = pd.Timestamp(ref_time)
 
     for k in range(iterations):
-        ts_0 = pd.Timestamp(ref_time) - pd.Timedelta(hours=(time_step_hours * k))
-        ts_1 = pd.Timestamp(ref_time) + pd.Timedelta(hours=(time_step_hours * k))
+        ts_0 = ts - pd.Timedelta(hours=(time_step_hours * k))
+        ts_1 = ts + pd.Timedelta(hours=(time_step_hours * k))
 
         t_ceil = ts_0.floor(f'{time_step_hours}H')
         t_floor = ts_1.ceil(f'{time_step_hours}H')
-
-        closest_times.extend([t_ceil, t_floor])
-    closest_times = sorted(closest_times, key=lambda ts_rounded: abs(ts - ts_rounded))
+        # In the event that t_floor == t_ceil for k = 0
+        out_times = list(set([t_ceil, t_floor]))
+        closest_times.extend(out_times)
+    # if 2 times have the same distance to ref_time, order times by occurrence (earlier comes first)
+    closest_times = sorted(closest_times, key=lambda ts_rounded: (abs(ts - ts_rounded), ts_rounded))
     closest_times = [t.to_pydatetime() for t in closest_times]
     closest_times = closest_times[:n_target_times]
     return closest_times
 
 
+def get_times_for_azimuth_interpolation(ref_time: datetime.datetime,
+                                        time_step_hours: int,
+                                        buffer_in_seconds: int = 300) -> list[datetime.datetime]:
+    """Obtains the times needed for azimuth interpolation. Filters the 3 closest dates from ref_time
+    so that all returned dates are within `time_step_hours` + `buffer_in_seconds`.
+
+    This ensures we request only the dates that are really needed.
+    ```
+    dt = datetime.datetime(2023, 1, 1, 11, 1, 0)
+    get_times_for_azimuth_interpolation(dt, 1)
+    ```
+    yields
+    ```
+    [datetime.datetime(2023, 1, 1, 11, 0, 0),
+     datetime.datetime(2023, 1, 1, 12, 0, 0),
+     datetime.datetime(2023, 1, 1, 10, 0, 0)]
+    ```
+    whereas
+    ```
+    dt = datetime.datetime(2023, 1, 1, 11, 30, 0)
+    get_times_for_azimuth_interpolation(dt, 1)
+    ```
+    yields
+    ```
+    [datetime.datetime(2023, 1, 1, 11, 0, 0),
+     datetime.datetime(2023, 1, 1, 12, 0, 0)]
+    ```
+
+    Parameters
+    ----------
+    ref_time : datetime.datetime
+        A time of acquisition
+    time_step_hours : int
+        Weather model time step; should evenly divide 24 hours
+    buffer_in_seconds : int, optional
+        Buffer for filtering absolute times, by default 300 (or 5 minutes)
+
+    Returns
+    -------
+    list[datetime.datetime]
+        The 2 or 3 closest times within 1 time step (plus the buffer) of the reference time
+    """
+    # Get 3 closest times
+    closest_times = get_n_closest_datetimes(ref_time, 3, time_step_hours)
+
+    def filter_time(time: datetime.datetime):
+        absolute_time_difference_sec = abs((ref_time - time).total_seconds())
+        upper_bound_seconds = time_step_hours * 60 * 60 + buffer_in_seconds
+        return absolute_time_difference_sec < upper_bound_seconds
+    out_times = list(filter(filter_time, closest_times))
+    return out_times
+
+
 def get_inverse_weights_for_dates(azimuth_time_array: np.ndarray,
                                   dates: list[datetime.datetime],
                                   inverse_regularizer: float = 1e-9,
@@ -247,7 +303,8 @@
     azimuth timing array and dates. The output will be a list with length equal to that of dates and whose entries
     are arrays each whose shape matches the azimuth_timing_array.
 
-    Note: we do not do any checking of the dates provided so the inferred `temporal_window_hours` may be incorrect.
+    Note: we do not do any checking of the provided dates beyond ensuring that they are unique, so the inferred
+    `temporal_window_hours` may be incorrect.
 
     Parameters
     ----------
@@ -269,6 +326,11 @@
     list[np.ndarray]
         Weighting per pixel with respect to each date
     """
+    n_unique_dates = len(set(dates))
+    n_dates = len(dates)
+    if n_unique_dates != n_dates:
+        raise ValueError('Dates provided must be unique')
+
     if not all([isinstance(date, datetime.datetime) for date in dates]):
         raise TypeError('dates must be all datetimes')
     if temporal_window_hours is None:
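
For reviewers skimming the diff, below is a minimal, self-contained sketch of the inverse-distance temporal weighting idea that `get_inverse_weights_for_dates` is built around. It is illustrative only: the helper name `inverse_time_weights`, the toy azimuth-time grid, and the omission of the `temporal_window_hours` cutoff are assumptions for the example, not RAiDER's actual implementation.

```python
import datetime

import numpy as np


def inverse_time_weights(azimuth_time: np.ndarray,
                         dates: list[datetime.datetime],
                         regularizer: float = 1e-9) -> list[np.ndarray]:
    """Per-pixel inverse-distance weights for each model date; weights sum to 1.

    Hypothetical sketch (not RAiDER's implementation): no temporal window cutoff is applied.
    """
    if len(set(dates)) != len(dates):
        # mirrors the uniqueness check added in this PR
        raise ValueError('Dates provided must be unique')
    # seconds between each pixel's azimuth time and each model date
    dists = [np.abs((azimuth_time - np.datetime64(d)) / np.timedelta64(1, 's'))
             for d in dates]
    inv = [1.0 / (d + regularizer) for d in dists]
    total = np.sum(inv, axis=0)
    # normalize so the per-pixel weights across dates sum to 1
    return [w / total for w in inv]


# Toy 2x2 grid of azimuth times straddling two hourly model times
az = np.array([['2023-01-01T11:15', '2023-01-01T11:30'],
               ['2023-01-01T11:45', '2023-01-01T11:59']], dtype='datetime64[s]')
model_dates = [datetime.datetime(2023, 1, 1, 11), datetime.datetime(2023, 1, 1, 12)]
w0, w1 = inverse_time_weights(az, model_dates)
assert np.allclose(w0 + w1, 1.0)
# A blended delay cube would then be w0 * delay_at_11h + w1 * delay_at_12h, per pixel.
```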