Failure to find dependency that was installed from extra index url #599

Open
cornelius-braun opened this issue Apr 25, 2023 · 11 comments
Labels
bug Something isn't working component:dep-sources Dependency sources

Comments

@cornelius-braun

Bug description

I created a requirements file for my project using pip-compile. To get the correct version, I added an extra url for the torch installation, resulting in the following command:

pip-compile --extra-index-url https://download.pytorch.org/whl/cpu

This gives me the following requirements.txt

filelock==3.12.0
jinja2==3.1.2
markupsafe==2.1.2
mpmath==1.3.0
networkx==3.1
sympy==1.11.1
torch==2.0.0
typing-extensions==4.5.0

When I run pip-audit on this, I get the issue that torch is skipped from the auditing:

No known vulnerabilities found
Name  Skip Reason
----- ------------------------------------------------------------------------
torch Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)

Is this a bug or am I misusing pip-audit?

Reproduction steps

I generated my requirements using

pip-compile --extra-index-url https://download.pytorch.org/whl/cpu
pip-sync

Then I ran

pip-audit -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

Expected behavior

Auditing of all packages including torch.

Screenshots and logs

The logs are as follows:

DEBUG:pip_audit._cli:Auditing filelock (3.12.0)
DEBUG:pip_audit._cli:Auditing Jinja2 (3.1.2)
DEBUG:pip_audit._cli:Auditing MarkupSafe (2.1.2)
DEBUG:pip_audit._cli:Auditing mpmath (1.3.0)
DEBUG:pip_audit._cli:Auditing networkx (3.1)
DEBUG:pip_audit._cli:Auditing sympy (1.11.1)
DEBUG:pip_audit._service.pypi:Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)
DEBUG:pip_audit._cli:Auditing typing_extensions (4.5.0)
No known vulnerabilities found
Name  Skip Reason
----- ------------------------------------------------------------------------
torch Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)

Platform information

  • OS name and version: Ubuntu 20.04.4 LTS (Focal Fossa)
  • pip-audit version (pip-audit -V): 2.5.4
  • Python version (python -V or python3 -V): 3.11
  • pip version (pip -V or pip3 -V): 23.1
@cornelius-braun cornelius-braun added the bug-candidate Might be a bug. label Apr 25, 2023
@woodruffw
Member

Thanks for the report @cornelius-braun, and for filling out each section! We greatly appreciate it.

> Is this a bug or am I misusing pip-audit?

This is a somewhat tricky case:

  1. pip-audit fundamentally relies on PyPI for vulnerability information, which means that it can only supply vulnerability reports for packages that appear on PyPI. 2.0.0+cpu is a distinct version from 2.0.0, and the former is only on your extra index, so pip-audit is arguably correct in reporting that it couldn't find an auditable dependency with that name and version on PyPI.
  2. At the same time, 2.0.0+cpu is really just 2.0.0 with a PEP 440 "local version identifier" of cpu. These are applied most often by Linux and similar distributions, e.g. foopkg-1.0.0+ubuntu.0. Local versions are sometimes considered equivalent to their non-local counterparts, but not always: they sometimes carry extra patches, or imply different build processes, dependencies, etc. I don't know a ton about Torch, but I suspect that the cpu local tag implies that something is different about the build here; given that, I'm not sure if it would be sound of us to return non-local audit results for it.

To summarize: this boils down to a question of whether pip-audit should consider "bare" and "local tagged" versions with the same basic version the same for auditing purposes, i.e. whether we should normalize 2.0.0+cpu to 2.0.0.

Argument for: Even when different, vulnerabilities reported in X.Y.Z may be of interest to people running X.Y.Z+foo. We should err on the side of caution and report vulnerabilities for the same "base" version, since it's a stronger signal than not.

Argument against: When a package reports its version as X.Y.Z+foo, it's telling us that it differs from X.Y.Z in some important way. We arguably shouldn't override that intent.

CC @tetsuo-cpp and @di for thoughts. I'm personally inclined to say that we should support "normalizing" local versions into their "base" version, although perhaps behind an option or flag that isn't enabled by default.
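
To make the "normalizing" idea concrete, here is a minimal sketch of stripping a PEP 440 local version label behind an opt-in switch. The function names and the strip_local parameter are hypothetical and are not part of pip-audit; the sketch uses plain string handling, whereas a real implementation would more likely parse versions with the packaging library.

```python
from typing import Optional, Tuple


def split_local_version(version: str) -> Tuple[str, Optional[str]]:
    """Split a PEP 440 version string into (public version, local label).

    The local label is whatever follows the first "+", or None if absent.
    """
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


def version_for_audit(version: str, strip_local: bool = False) -> str:
    """Return the version to look up in the vulnerability source.

    strip_local is a hypothetical opt-in: by default the version is left
    untouched, so "2.0.0+cpu" is still treated as distinct from "2.0.0".
    """
    public, _local = split_local_version(version)
    return public if strip_local else version


print(split_local_version("2.0.0+cpu"))
print(version_for_audit("2.0.0+cpu"))
print(version_for_audit("2.0.0+cpu", strip_local=True))
```

Keeping the default at "no normalization" matches the argument against above: the local label is only discarded when the user explicitly asks for base-version results.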

@di
Member

di commented Apr 25, 2023

Agreed. PyPI can't distribute vulnerability data for releases that aren't on PyPI (no matter how similar the version numbers look).

@cornelius-braun, I'm curious, when you saw the "Dependency not found on PyPI and could not be audited", was this clear enough? Is there more we could do here to say "you installed something we've never seen before, we have no way to tell you if there are known vulnerabilities for it?"

(As an aside, if we standardized the vulnerability API, the pytorch index could offer vulnerability details here, but that is a much bigger effort)

@cornelius-braun
Author

Thank you both for your elaborate replies!

> @cornelius-braun, I'm curious, when you saw the "Dependency not found on PyPI and could not be audited", was this clear enough? Is there more we could do here to say "you installed something we've never seen before, we have no way to tell you if there are known vulnerabilities for it?"

To me, it was clear that you could not find information about the torch installation because it was not found on PyPI.

Since an --extra-index-url flag is supported, however, I was not sure whether you were also checking other vulnerability sources, given that this flag allows installing packages from outside of PyPI.

Based on your explanations, your procedure now makes complete sense to me.

@di
Member

di commented Apr 25, 2023

I think we do want to support this eventually, but we could make it more clear that it's not currently supported.

@woodruffw
Member

Agreed! I think we can improve the user experience here with the following:

If the user passes --index-url or --extra-index-url, we should emit a warning telling them that the PyPI vulnerability source won't necessarily report vulnerabilities for dependencies resolved from their sources.
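
As a rough illustration of that proposal (not pip-audit's actual code), the check could look like the sketch below. The parser, the function name, and the warning text are all hypothetical; only the two flag names come from this thread.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in parser that accepts only the two index flags."""
    parser = argparse.ArgumentParser(prog="audit-sketch")
    parser.add_argument("--index-url", default=None)
    parser.add_argument("--extra-index-url", action="append", default=[])
    return parser


def custom_index_warning(args: argparse.Namespace) -> Optional[str]:
    """Return a warning message if any non-default index was requested."""
    indexes = ([args.index_url] if args.index_url else []) + list(args.extra_index_url)
    if not indexes:
        return None
    return (
        "Dependencies resolved from custom indexes may not be covered by the "
        "PyPI vulnerability source: " + ", ".join(indexes)
    )


args = build_parser().parse_args(
    ["--extra-index-url", "https://download.pytorch.org/whl/cpu"]
)
message = custom_index_warning(args)
if message is not None:
    print("WARNING:", message)
```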

@woodruffw woodruffw added bug Something isn't working component:dep-sources Dependency sources and removed bug-candidate Might be a bug. labels Apr 25, 2023
@woodruffw
Member

Assigned to both myself and @tnytown, we'll triage it based on availability during the sprints.

@tufanalbayrak

Hi. I also have the same problem. pip-audit fails to find cpu versions of torch and torchvision on PyPI. Is there any progress here? Thanks.

@woodruffw
Member

> Hi. I also have the same problem. pip-audit fails to find cpu versions of torch and torchvision on PyPI. Is there any progress here? Thanks.

That sounds like a different issue, since this issue is about third-party index URL handling. Could you please file a separate issue and include an example for us to reproduce your problem with?

@a-recknagel

Similar issue here, but in my case it's a private package that doesn't appear on the public PyPI at all. And if it did, I'd have to treat it as a potential MITM attack. Like the other poster, I assumed that setting --index-url to the index I downloaded the package from would solve the issue, but it didn't. Standardizing the vulnerability API would solve my problem.

@woodruffw
Member

> Standardizing the vulnerability API would solve my problem.

FWIW, this is not something pip-audit can do unilaterally: the PEP 503 and PEP 691 index APIs currently have no way to report vulnerability information, so someone would need to write a new PEP that adds vulnerability metadata to one (or both) of them.

At the moment, we rely on a PyPI-specific API (documented here) to retrieve vulnerability information.
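
For readers who want to see what such a lookup involves, here is a minimal sketch. It assumes the API in question is PyPI's per-release JSON endpoint, whose response carries a "vulnerabilities" list; the helper names are hypothetical, and this is not a copy of pip-audit's implementation.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

# Assumed endpoint shape for PyPI's per-release JSON document.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/{version}/json"


def pypi_release_url(name: str, version: str) -> str:
    """Build the per-release JSON URL for a package name and version."""
    return PYPI_JSON_URL.format(name=name, version=version)


def vulnerability_ids(release_json: dict) -> List[str]:
    """Extract vulnerability IDs from an already-decoded release document."""
    return [vuln["id"] for vuln in release_json.get("vulnerabilities", [])]


def fetch_vulnerability_ids(name: str, version: str) -> Optional[List[str]]:
    """Return vulnerability IDs, or None when the release is not on PyPI."""
    try:
        with urllib.request.urlopen(pypi_release_url(name, version), timeout=10) as resp:
            return vulnerability_ids(json.load(resp))
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            # The release does not exist on PyPI, so there is nothing to audit.
            return None
        raise
```

Under this assumption, a version that exists only on another index (such as torch 2.0.0+cpu) has no release document on PyPI, which corresponds to the "not found on PyPI" skip reason reported above.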

Another route forward here would be to allow people to provide their own OSV-compatible API; #805 and #810 would enable that. This would be easier from a standards perspective since it isn't tied to the index APIs. I'm curious if that would be suitable for your use case?
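
As a sketch of what querying an OSV-compatible API involves, the snippet below builds a query in the shape used by the public OSV service's /v1/query endpoint. The function names are hypothetical, and the assumption is that a self-hosted service would accept the same request body at whatever URL is passed in.

```python
import json
import urllib.request
from typing import List

# Public OSV endpoint; a self-hosted OSV-compatible service would be
# passed to query_osv() through the url parameter instead.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Build the JSON body for an OSV query about one package version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name: str, version: str, url: str = OSV_QUERY_URL) -> List[dict]:
    """POST the query and return the list of matching vulnerability records."""
    body = json.dumps(build_osv_query(name, version)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        # OSV omits the "vulns" key when there are no matches.
        return json.load(resp).get("vulns", [])
```

Because the endpoint is just a parameter, the vulnerability data can live separately from the package index, which is the property that makes this route independent of the index APIs.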

@a-recknagel

Thanks for the links. Yeah, I was reading the discuss thread to gauge the amount of work; it looks like that's a good while off, if ever. Our index is hosted by GitLab, and they're usually not the fastest at implementing features, so a solution where the vulnerability info is hosted somewhere other than the index itself would be the most useful. To know which kind of API would serve best, I'd need to do some reading first, but I imagine that either works.
