-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set integration tests #53
base: main
Are you sure you want to change the base?
Conversation
Added integration tests for hres_ic Added development environment
8f2ecfb
to
fb5ffa8
Compare
CI / Verify Conda Build (pull_request) Failing after 3m I am not familiar with this CI test. Can you please explain what it is trying to do, why it fails, and the significance of the failure? |
I already opened a related issue (#54). More information on the (likely) reason there. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not suggesting you change what you've done for this PR, but I think the testing in the shell script could have been accomplished with a decorated pytest
function. This would be compact, easier to maintain and pytest
could then take care of all the accounting for what tests failed.
A downside is that parallel tests are not currently supported by pytest
, so perhaps it is not a viable solution. Parallelisation could be done at the CI workflow level, but that would run counter to some of the simiplicity argument above.
# Basic binary compatibility test script for replace_landsurface. | ||
# See INTEGRATION_README.md for details on test usage, data and options. | ||
|
||
TEST_DATA_DIR=/g/data/vk83/testing/data/replace_landsurface/integration_tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple /
in a path is treated like a single /
, i.e. //
is the same as /
.
So it is always a good idea to put a trailing slash in directory paths, then you are guaranteed of the correct behaviour when they concatenated with subdirectories or files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
I am going to add it.
INPUT_DIR=$TEST_DATA_DIR/input_data | ||
OUTPUT_DIR=$TEST_DATA_DIR/expected_outputs | ||
DRIVING_DATA_DIR=$TEST_DATA_DIR/driving_data | ||
CLEAN_OUTPUT=true | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INPUT_DIR=$TEST_DATA_DIR/input_data | |
OUTPUT_DIR=$TEST_DATA_DIR/expected_outputs | |
DRIVING_DATA_DIR=$TEST_DATA_DIR/driving_data | |
CLEAN_OUTPUT=true | |
INPUT_DIR=${TEST_DATA_DIR}/input_data | |
OUTPUT_DIR=${TEST_DATA_DIR}/expected_outputs | |
DRIVING_DATA_DIR=${TEST_DATA_DIR}/driving_data | |
CLEAN_OUTPUT=true |
It is good practice to enclose shell variables in curly braces, {}
, especially when concatenating strings. It won't have a material in effect here, but it's good practice to avoid strange errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually don't use curly braces when it's not necessary (like in this case).
But it's true, some nasty unexpected errors could always happen and putting them won't hurt, so I'm gonna add them.
|
||
# Create a temporary work directory | ||
WORK_DIR=$(mktemp -d) | ||
echo -e "Work directory: $WORK_DIR\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. I'd advise using curly braces, but it's a style thing and I'm not going to insist, just suggest.
rm -rf "$WORK_DIR" | ||
echo "Work directory cleaned up." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rm -rf
gives me the heebie-jeebies. If ${WORK_DIR}
points to somewhere important ... like ${HOME}
ugh!
Will there be folders and files under ${WORK_DIR}
? If just files a 2-step rm -f ${WORK_DIR}/* && rmdir ${WORK_DIR}
would be safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I know rm -rf
is scary.
Unfortunately, ${WORK_DIR}
contains files and folders, so your option would not work.
${WORK_DIR}
is hardcoded in this case, which makes rm -rf ${WORK_DIR}
safe enough in my opinion.
But I am happy to find alternative solutions to avoid having to use rm -rf
:D
rm -rf "$WORK_DIR" | ||
echo "Script was terminated preventively." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
The tests are designed to be run on Gadi within a Python environment where `replace_landsurface` is installed as a development package (for instructions refer to the Installation paragraph in the README). | ||
|
||
Usage: | ||
regression_tests.sh [-h] [--keep] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regression_tests.sh [-h] [--keep] | |
integration_tests.sh [-h] [--keep] |
?
|
||
# create name of file to be replaced | ||
ic_file = ic_file_fullpath.parts[-1].replace('.tmp', '') | ||
|
||
# create date/time useful information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a code change here that doesn't appear to be covered by the intent of this PR. Is it worth mentioning why this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops I missed this change.
Actually, this is not needed for the tests to succeed.
That line is just redundant because the ic_file
variable is never used later in the file.
However, it shouldn't have been changed in this PR.
I will revert the changes.
I agree, using |
This PR sets up the integration tests.
Fixes #46 (after all tests are added)
Overview
The
integration_tests.sh
script executes various tests in parallel, for all the entry points of the package, using different options to test multiple cases.All multiple tests are executed (in parallel) to completion, even if any of them fails.
When all the tests are completed, the script fails if any of the tested returned a non-zero exit code, and a message is printed with information about the failed tests.
The tests are designed to be run on
Gadi
within aPython
environment wherereplace_landsurface
is installed as a development package. Instruction for the installation of the development environment have been added in the README.Data
All test data is located in `/g/data/vk83/testing/data/replace_landsurface/integration_tests`. At the start of the script, this data gets copied to a temporary working directory where the tests are run, to avoid modifying the original input data.Important
This is especially important because the
replace_lansurface
functions replace the input data with the new modified version, therefore copying the input data before running the script is essential to avoid changing the original testing data.This could only be avoided by adding an
--output
option to the functions, that could default to the input file name (replacing the file and therefore keeping the behaviour as it is currently).Tests performed
hres_ic
with--type era5land
hres_ic
with--type barra
hres_ic
with--type astart
hres_eccb
with--type era5land
(CURRENTLY NOT AVAILABLE)hres_eccb
with--type barra
(CURRENTLY NOT AVAILABLE)Important
Tests 4 and 5 are still not present in the testing script, as data for these tests needs to be provided (more information in #46).
I suggest waiting for these tests to be added before merging this PR.
Profiling
The test currently takes around 2:50m to complete, of which ~1m is spent to copy the files (there is currently a portion in the test script that prints roughly prints out the time spent for copying the data. This was only added for testin/profiling purposes and could be removed before merging).
Adding more tests would not increase much the time spent to run the actual tests (because they are run in parallel), but would increase the time spent to copy the data, especially because the input data is pretty big (~ 45GB at the moment, with the potential to be ~ 75GB when the 2 remaining tests are added).
I highly suggest adding the
--output
option mentioned hereFurther information and usage instructions can be found in the INTEGRATION_README.