Decouple pandera and pandas when using it with polars #1847
Comments
One thing that might be interesting along these lines is the narwhals package. I have not taken a deep dive into it, but from what I've seen, using their API could help build a unified back-end that doesn't require any one framework.
I think @baldwinj30 is on to something. Narwhals is well funded, the existing API is great, and they are working on adding support for DuckDB as a backend. In theory, Ibis does this too, but Ibis also adds Pandas as a dependency. Narwhals is designed to be lean and mean.
Oh, there's already an existing issue tracking this:
Folks watching this issue might be happy to know this PR will address it. It does introduce a breaking change in the way some folks install pandera: if you relied on pandera to install pandas, then you'll need to update your requirements to explicitly install pandas. I'm counting on this being a rare case, but the good news is it's an easy fix on the user side.
Is your feature request related to a problem? Please describe.
I am using pandera for a project with polars. I noticed that installing pandera also installs pandas, since it is a requirement of pandera. The impact is an increased build size of 80+ MB (the size of the pandas, numpy and pyarrow packages) that neither polars nor pandera needs when checking polars DataFrames. I tried to uninstall pandas manually with pip and, even though all the imports in my package are either `pandera.polars` or `pandera.typing.polars`, the runtime fails because pandas is not installed. This is because the pandera `__init__` file imports `pandera.backends`, which eventually imports `pandas`.
Describe the solution you'd like
I would like to use pandera with polars without installing pandas.
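A quick way to confirm whether a given install still has this import chain is to import the package in a fresh interpreter and look at `sys.modules` afterwards. The helper below is only a sketch: `pulls_in` is a hypothetical name, not part of pandera, and it assumes the package under test is importable in the current environment.

```python
import subprocess
import sys


def pulls_in(module: str, dependency: str) -> bool:
    """Return True if importing `module` in a fresh interpreter also loads `dependency`.

    Hypothetical helper for illustration; not part of pandera.
    """
    # A child process is used so modules already imported here do not skew the result.
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        # Exit code 0: dependency was loaded; exit code 3: it was not.
        f"sys.exit(0 if {dependency!r} in sys.modules else 3)\n"
    )
    result = subprocess.run([sys.executable, "-c", probe], capture_output=True)
    if result.returncode not in (0, 3):
        # Any other exit code means the import itself failed (for example, the
        # ImportError described above when pandas has been uninstalled).
        raise RuntimeError(result.stderr.decode(errors="replace"))
    return result.returncode == 0


# Example call (requires pandera and polars to be installed):
# pulls_in("pandera.polars", "pandas")
```

If the call returns `True`, importing `pandera.polars` still loads pandas; if it raises `RuntimeError`, the import itself failed and the captured traceback shows where.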