Make some dependencies optional #252

Closed
sandorkertesz opened this issue Nov 7, 2023 · 3 comments · Fixed by #269
@sandorkertesz
Collaborator

What maintenance does this project need?

The requirements file for earthkit-data needs to be revised because it installs too many packages. We could solve this with optional dependencies in the toml config, for instance by offering an option to install all dependencies, only a subset, or only the basic ones.

Organisation

ECMWF

@sandorkertesz sandorkertesz self-assigned this Nov 7, 2023
@sandorkertesz
Collaborator Author

sandorkertesz commented Nov 16, 2023

At the moment, there are these sources requiring additional packages/libraries:

  • "cds", "ads": cdsapi
  • "mars": ecmwf-api-client
  • "ecmwf-opendata": ecmwf-opendata
  • "polytope": polytope-client
  • "wekeo", "wekeo-cds": hda
  • "fdb": pyfdb

We have two options to make any of them optional:

  1. Turn the source into a plugin. E.g. the "polytope" source can be implemented as the "earthkit-data-polytope" plugin, so it will only be available when the user installs it (and installing it will pull in all its dependencies).
  2. Make the dependency optional in the config and print a proper error message when someone tries to use a source whose dependency is not installed.
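
To illustrate option 2, here is a minimal sketch of deferring the import until a source is actually used and raising a readable error when its package is missing. It is not taken from the earthkit-data code base: `SOURCE_DEPENDENCIES`, `MissingDependencyError` and `require_for_source` are hypothetical names, and the import names in the mapping are assumptions (only the package names come from the list above).

```python
import importlib

# Source name -> (import name, pip package name).
# Package names are from the list in this issue; import names are assumptions.
SOURCE_DEPENDENCIES = {
    "cds": ("cdsapi", "cdsapi"),
    "ads": ("cdsapi", "cdsapi"),
    "mars": ("ecmwfapi", "ecmwf-api-client"),
    "ecmwf-opendata": ("ecmwf.opendata", "ecmwf-opendata"),
    "polytope": ("polytope", "polytope-client"),
    "wekeo": ("hda", "hda"),
    "wekeo-cds": ("hda", "hda"),
    "fdb": ("pyfdb", "pyfdb"),
}


class MissingDependencyError(ImportError):
    """Raised when a source is used without its optional dependency (hypothetical)."""


def require_for_source(source, dependencies=SOURCE_DEPENDENCIES):
    """Import and return the module backing `source`, or fail with a clear message."""
    module_name, package = dependencies[source]
    try:
        # The import happens only when the source is requested,
        # so users who never touch this source do not need the package.
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise MissingDependencyError(
            f"The '{source}' source requires the '{package}' package, "
            f"which is not installed. Try: pip install {package}"
        ) from exc
```

A source implementation could call `require_for_source("polytope")` at the point where it first needs the client, so that importing the main package itself never fails because of a missing optional dependency.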

@sandorkertesz
Collaborator Author

sandorkertesz commented Nov 16, 2023

There is also the question of the supported data file formats. The following list contains the current formats and their (Python) dependencies:

  • "grib": eccodes, (cfgrib, xarray for to_array)
  • "bufr": pdbufr
  • "odb": pyodc, (pandas for to_pandas)
  • "netcdf": netcdf4, xarray
  • "csv": (pandas for to_pandas)
  • "geojson": (geopandas for to_pandas)

Please note that, at the moment, dask is also required whenever xarray is used.
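
As a rough sketch of how the format dependencies listed above could be checked at runtime without importing them, the snippet below uses `importlib.util.find_spec` to report which modules are missing. `FORMAT_DEPENDENCIES`, `missing_modules` and `format_status` are hypothetical names, the import names (e.g. `netCDF4` for the netcdf4 package) are assumptions, and the split into "required" and "optional" simply mirrors the parenthesised entries in the list.

```python
from importlib.util import find_spec

# Format -> module names, following the list in this comment.
# "optional" holds the parenthesised entries (needed only for conversions).
# Import names are assumptions and may differ from the package names.
FORMAT_DEPENDENCIES = {
    "grib": {"required": ["eccodes"], "optional": ["cfgrib", "xarray"]},
    "bufr": {"required": ["pdbufr"], "optional": []},
    "odb": {"required": ["pyodc"], "optional": ["pandas"]},
    "netcdf": {"required": ["netCDF4", "xarray"], "optional": []},
    "csv": {"required": [], "optional": ["pandas"]},
    "geojson": {"required": [], "optional": ["geopandas"]},
}


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    # find_spec locates a module without importing it.
    return [name for name in names if find_spec(name) is None]


def format_status(fmt, dependencies=FORMAT_DEPENDENCIES):
    """Report which required and optional modules are missing for a format."""
    deps = dependencies[fmt]
    return {
        "required": missing_modules(deps["required"]),
        "optional": missing_modules(deps["optional"]),
    }
```

Calling `format_status("grib")` would then return the missing module names for GRIB, which could feed the error message shown to the user.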

@tlmquintino
Member

If we start by just making the dependency optional, I think it will be simpler/quicker.
Then later we can consider plugins, and separate repos etc.
