Check input file workflow and refactor replace #28

Open
atteggiani opened this issue Aug 5, 2024 · 0 comments
atteggiani (Collaborator) commented Aug 5, 2024

Overview

Currently (version 1.0.1), the hres_eccb.py and hres_ic.py scripts expect input files ($ECCBFILE and $ICFILE, respectively) whose names end with the suffix .tmp.

However, this suffix is trimmed in the internal scripts replace_landsurface_with_ERA5land_IC.py and replace_landsurface_with_BARRA2R_IC.py.

Specifically, the path used for the input file (ff_in) has the suffix trimmed, while the path used for the output file (ff_out) keeps it.
Example from the replace_landsurface_with_ERA5land_IC.py script:

ff_in = ic_file_fullpath.as_posix().replace('.tmp', '')
# Path to output file
ff_out = ic_file_fullpath.as_posix()

This means the filepath actually used for the input file is the one without the .tmp suffix.
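To make the effect concrete, here is a minimal sketch of how the two paths resolve. The path /scratch/run/astart.tmp is a hypothetical value chosen for illustration, not one taken from the scripts.

```python
from pathlib import PurePosixPath

# Hypothetical value of the path passed to the script (illustration only).
ic_file_fullpath = PurePosixPath("/scratch/run/astart.tmp")

# Same two statements as in the script excerpt above.
ff_in = ic_file_fullpath.as_posix().replace('.tmp', '')
ff_out = ic_file_fullpath.as_posix()

print(ff_in)   # /scratch/run/astart
print(ff_out)  # /scratch/run/astart.tmp
```

So the file that is actually read is the one without the suffix, even though the script was handed the .tmp path.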

Additionally, the script doesn't preserve the "original" file. Instead, it overwrites it in 2 steps:

  1. First, it writes a file at the same path as the input, but with .tmp appended.
  2. Then, it moves that file to the path without the appended .tmp (effectively replacing the original input file).
    These 2 steps run back to back: the write is the last statement of one function, and the move is the first statement of the function called immediately afterwards. From the point of view of the whole program, one happens right after the other.
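The current write-then-move behaviour can be sketched roughly as follows. This is a simplified illustration, not the scripts' actual code: the function name write_then_move and the file contents are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def write_then_move(original: Path, new_content: str) -> None:
    """Hypothetical stand-in for the two consecutive steps described above."""
    # Step 1: write the new data next to the input, with '.tmp' appended.
    tmp_path = Path(original.as_posix() + ".tmp")
    tmp_path.write_text(new_content)
    # Step 2: move it onto the path without '.tmp', replacing the original.
    shutil.move(str(tmp_path), str(original))


with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "astart"
    original.write_text("original data")

    write_then_move(original, "replaced data")

    final_content = original.read_text()
    tmp_exists = Path(original.as_posix() + ".tmp").exists()
    print(final_content)  # replaced data
    print(tmp_exists)     # False
```

After the call, the original content is gone and no .tmp file is left behind, which is why the .tmp path only ever exists for a moment.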

Solution

I suggest setting the input file for the hres_eccb.py and hres_ic.py scripts as the original initial/external condition filepaths (without .tmp).
If the original file needs to be preserved (which does not happen at the moment), I suggest renaming the original file before the new output is written.

This simplifies the code overall.
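A minimal sketch of what this could look like, assuming the original does need to be kept. The function name replace_preserving_original and the .orig backup suffix are hypothetical choices for illustration; the real scripts would read the replacement data from the renamed file rather than receive it as a string.

```python
import tempfile
from pathlib import Path


def replace_preserving_original(
    ic_file: Path, new_content: str, backup_suffix: str = ".orig"
) -> Path:
    """Rename the original file, then write the new output at its path.

    Both the function name and the '.orig' suffix are assumptions.
    """
    backup = ic_file.with_name(ic_file.name + backup_suffix)
    # Rename first, so the original is never overwritten.
    ic_file.rename(backup)
    # Write the new output directly at the original path (no '.tmp' involved).
    ic_file.write_text(new_content)
    return backup


with tempfile.TemporaryDirectory() as workdir:
    ic_file = Path(workdir) / "astart"
    ic_file.write_text("original data")

    backup = replace_preserving_original(ic_file, "replaced data")

    new_text = ic_file.read_text()
    backup_text = backup.read_text()
    backup_name = backup.name
    print(new_text)     # replaced data
    print(backup_text)  # original data
    print(backup_name)  # astart.orig
```

With this approach the scripts only ever receive the real filepath, and no suffix has to be trimmed afterwards.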

Other note

On top of that, I suggest replacing the statement filepath.replace('.tmp', '') (or similar) with different logic.
This is because the str.replace(oldvalue, newvalue) method in Python replaces all occurrences of oldvalue with newvalue.
This would cause problems if an input has the string .tmp in the middle of its name (for example, a file called input.tmp_astart).
What we want, instead, is to trim only the suffix we create (in this case .tmp) from the end of the path.
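A quick illustration of the problem, using the example filename above with the .tmp suffix appended:

```python
# The example name from above, with the '.tmp' suffix the scripts expect.
filepath = "input.tmp_astart.tmp"

# str.replace removes every occurrence, not just the trailing one.
trimmed = filepath.replace('.tmp', '')

print(trimmed)  # input_astart
```

The intended result is input.tmp_astart, but the .tmp in the middle of the name is removed as well, so the script would look for a file that does not exist.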

I suggest the new logic could be something like:

new_filepath = filepath[:-4] if filepath.endswith('.tmp') else filepath
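The same idea could be wrapped in a small helper so the suffix length is not hard-coded. The name trim_suffix is hypothetical and only meant as a sketch. On Python 3.9 and later, the built-in str.removesuffix does the same thing.

```python
def trim_suffix(filepath: str, suffix: str = ".tmp") -> str:
    """Remove `suffix` only if it is at the very end of `filepath`.

    Hypothetical helper; equivalent to filepath.removesuffix(suffix)
    on Python 3.9+.
    """
    if suffix and filepath.endswith(suffix):
        return filepath[: -len(suffix)]
    return filepath


print(trim_suffix("input.tmp_astart.tmp"))  # input.tmp_astart
print(trim_suffix("input.tmp_astart"))      # input.tmp_astart
print(trim_suffix("astart"))                # astart
```

Unlike str.replace, this leaves any .tmp in the middle of the name untouched and returns the path unchanged when the suffix is absent.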