pgc_ortho fails with "ERROR in stats calculation" when using non-standard filenames #9

Open
7yl4r opened this issue May 30, 2018 · 1 comment

7yl4r commented May 30, 2018

I was quite surprised to find that the error I was seeing is caused by a file-naming issue.

utils.get_sensor returns vendor=None if the input filename does not match any of the regexes there. This later causes a function that checks vendor to return 1.

Ideally vendor and sensor could be passed in as CLI args, but I can see that would be a bigger refactor than I want to take on right now. At a minimum, I would expect get_sensor to throw an exception or show a more informative error message when the filename doesn't match.
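A minimal sketch of the "fail loudly" behavior, assuming a simplified lookup. The function `get_vendor_and_sensor`, the exception class, the `PATTERNS` table, and the `"dg"` vendor label are all hypothetical and are not the actual `utils.get_sensor` implementation; only the `RENAMED_DG2` pattern comes from the repo.

```python
import re

# Pattern taken from lib/utils.py; written as a raw string here.
RENAMED_DG2 = r"(?P<snsr>\w\w\d\d)_(?P<ts>\d{14})_(?P<catid>[a-z0-9]{16})"

# Hypothetical lookup table: (compiled pattern, vendor label).
PATTERNS = [(re.compile(RENAMED_DG2, re.IGNORECASE), "dg")]


class UnrecognizedFilenameError(ValueError):
    """Raised when a filename matches none of the known signatures."""


def get_vendor_and_sensor(filename):
    """Return (vendor, sensor) or raise instead of silently returning None."""
    for pattern, vendor in PATTERNS:
        match = pattern.search(filename)
        if match:
            return vendor, match.group("snsr").upper()
    raise UnrecognizedFilenameError(
        f"Cannot determine vendor/sensor from filename {filename!r}; "
        "it does not match any known naming pattern"
    )
```

With something like this, a non-standard filename would stop the run with a message naming the file, rather than failing later in the stats calculation.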

7yl4r commented Dec 6, 2022

This is still an issue, and it can be easily resolved by modifying the hardcoded regex signatures:

imagery_utils/lib/utils.py

Lines 115 to 128 in 4081ea6

### Regex signatures to identify file vendor, mode, kind, and create the name_dict
RAW_DG = "(?P<ts>\d\d[a-z]{3}\d{8})-(?P<prod>\w{4})?(?P<tile>\w+)?-(?P<oid>\d{12}_\d\d)_(?P<pnum>p\d{3})"
RENAMED_DG = "(?P<snsr>\w\w\d\d)_(?P<ts>\d\d[a-z]{3}\d{9})-(?P<prod>\w{4})?(?P<tile>\w+)?-(?P<catid>[a-z0-9]+)"
RENAMED_DG2 = "(?P<snsr>\w\w\d\d)_(?P<ts>\d{14})_(?P<catid>[a-z0-9]{16})"
RAW_GE = "(?P<snsr>\d[a-z])(?P<ts>\d{6})(?P<band>[a-z])(?P<said>\d{9})(?P<prod>\d[a-z])(?P<pid>\d{3})(?P<siid>\d{8})(?P<ver>\d)(?P<mono>[a-z0-9])_(?P<pnum>\d{8,9})"
RENAMED_GE = "(?P<snsr>\w\w\d\d)_(?P<ts>\d{6})(?P<band>\w)(?P<said>\d{9})(?P<prod>\d\w)(?P<pid>\d{3})(?P<siid>\d{8})(?P<ver>\d)(?P<mono>\w)_(?P<pnum>\d{8,9})"
RAW_IK = "po_(?P<po>\d{5,7})_(?P<band>[a-z]+)_(?P<cmp>\d+)"
RENAMED_IK = "(?P<snsr>[a-z]{2}\d\d)_(?P<ts>\d{12})(?P<siid>\d+)_(?P<band>[a-z]+)_(?P<lat>\d{4}[ns])"

In my testing I found:

  • the RAW_DG string will always fail because snsr is not in that regex and thus cannot be captured. sat then remains None, and sat.upper() raises an exception (an AttributeError, since None has no upper method).
  • the RENAMED_DG string contains a very strange time string that is 9 digits long. DDHHMMSS is only 8 digits, so I don't know what the last digit is supposed to be. We only got things working by adding a 0.
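Both findings can be reproduced with the patterns alone. This is a sketch: the filenames are made-up examples, and compiling with `re.IGNORECASE` is an assumption about how the repo applies these patterns.

```python
import re

# Patterns copied from lib/utils.py, written as raw strings.
RAW_DG = r"(?P<ts>\d\d[a-z]{3}\d{8})-(?P<prod>\w{4})?(?P<tile>\w+)?-(?P<oid>\d{12}_\d\d)_(?P<pnum>p\d{3})"
RENAMED_DG = r"(?P<snsr>\w\w\d\d)_(?P<ts>\d\d[a-z]{3}\d{9})-(?P<prod>\w{4})?(?P<tile>\w+)?-(?P<catid>[a-z0-9]+)"

raw_dg = re.compile(RAW_DG, re.IGNORECASE)
renamed_dg = re.compile(RENAMED_DG, re.IGNORECASE)

# 1. RAW_DG matches, but it has no "snsr" group, so the sensor
#    cannot come from this match.
raw_match = raw_dg.search("12JAN05123456-M1BS-052813099010_01_P001.ntf")
print("snsr" in raw_match.groupdict())

# 2. RENAMED_DG wants 9 digits after the month. An 8-digit DDHHMMSS
#    timestamp does not match; adding one more digit does.
renamed_8 = renamed_dg.search("WV02_12JAN05123456-M1BS-103001001234ab00.tif")
renamed_9 = renamed_dg.search("WV02_12JAN051234560-M1BS-103001001234ab00.tif")
print(renamed_8 is None, renamed_9 is not None)
```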

The extraction of metadata from filenames should be reworked here. As it stands, it is non-standard, undocumented, and non-configurable.

Additionally, the sat information should probably just be read from the .xml file so we don't even need to worry about the filename.
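A rough sketch of what reading the satellite from the metadata could look like. It assumes the .xml file carries the satellite ID in a `SATID` element; the helper `read_sat_id` and the sample document are hypothetical, and the real metadata layout should be checked against actual files.

```python
import xml.etree.ElementTree as ET


def read_sat_id(xml_text, tag="SATID"):
    """Return the satellite ID from metadata XML text (tag name is an assumption)."""
    root = ET.fromstring(xml_text)
    node = root.find(f".//{tag}")  # search descendants at any depth
    if node is None or not (node.text or "").strip():
        raise ValueError(f"No <{tag}> element with a value found in metadata")
    return node.text.strip().upper()


# Hypothetical, heavily trimmed metadata document for illustration only.
sample = "<isd><IMD><IMAGE><SATID>WV02</SATID></IMAGE></IMD></isd>"
print(read_sat_id(sample))
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`.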
