Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix duplicate packages with multiple conflicting extras declared #11513

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

BurntSushi
Copy link
Member

This implements a somewhat of a cop-out fix for #11479, where the lock
file produced was missing some conflict markers. This in turn could lead
to multiple versions of the same package being installed into the same
environment.

(What follows is one of the commit messages that gets into the weeds
about the specific problem here.)

The particular example I honed in on here was the e3nn -> sympy 1.13.1
and e3nn -> sympy 1.13.3 dependency edges. In particular, while the
former correctly has a conflict marker, the latter's conflict marker was
getting simplified to true. This makes the edges trivially
overlapping, and results in both of them getting installed
simultaneously. (A similar problem happens for the e3nn -> torch
dependency edges.)

Why does this happen? Well, conflict marker simplification works by
detecting which extras are known to be enabled (and disabled) for each
node in the graph. This ends up being expressed as a set of sets, where
each inner set contains items corresponding to "extras is included" or
"extra is excluded."

The logic then is if all of these sets are satisfied by the conflict
marker on the dependency edge, then this conflict marker can be
simplified by assuming all of the inclusions/exclusions to be true.

In this particular case, we run into an issue where the set of
assumptions discovered for e3nn is:

{test[sevennet]}, {}, {~test[m3gnet], ~test[alignn], test[all]}

And the corresponding conflict marker for e3nn -> sympy 1.13.1 is:

extra == 'extra-4-test-all'
or extra == 'extra-4-test-chgnet'
or (extra != 'extra-4-test-alignn' and extra != 'extra-4-test-m3gnet')

And the conflict marker for e3nn -> sympy 1.13.3 is:

extra == 'extra-4-test-alignn' or extra == 'extra-4-test-m3gnet'

Evaluating each of the sets above for sympy 1.13.1's conflict
marker results in them all being true. Simplifying in turn results in
the marker being true. For sympy 1.13.3, not all of the sets are
satisfied, so this marker is not simplified.

I think the fundamental problem here is that our inferences aren't quite
rich enough to make these logical leaps. In particular, the conflict
marker for e3nn -> sympy 1.13.3 is not satisfied by any of our sets.
One might therefore conclude that this dependency edge is impossible.
But! The test[sevennet] set doesn't actually rule out test[m3gnet]
from being included, for example, because there is no conflict. So it is
actually possible for this marker to evaluate to true.

And I think this reveals the problem: for the e3nn -> sympy 1.13.1
conflict marker, the inferences don't capture the fact that
test[sevennet] might have test[m3gnet] enabled, and that would in
turn result in the conflict marker evaluating to false. This directly
implies that our simplification here is inappropriate.

It would be nice to revisit how we build our inferences here so that
they are richer and enable us to make correct logical leaps. For now, we
fix this particular bug with a bit of a cop-out: we skip conflict marker
simplification when there are ambiguous dependency edges.

Fixes #11479

@BurntSushi BurntSushi added bug Something isn't working lock Related to universal resolution and locking labels Feb 14, 2025
The place to look in this snapshot is the `name = "e3nn"` dependency.
Its dependencies on `sympy` and `torch` consist of multiple versions
with overlapping conflict markers. They are getting incorrectly
simplified to `true`.
The particular example I honed in on here was the `e3nn -> sympy 1.13.1`
and `e3nn -> sympy 1.13.3` dependency edges. In particular, while the
former correctly has a conflict marker, the latter's conflict marker was
getting simplified to `true`. This makes the edges trivially
overlapping, and results in both of them getting installed
simultaneously. (A similar problem happens for the `e3nn -> torch`
dependency edges.)

Why does this happen? Well, conflict marker simplification works by
detecting which extras are known to be enabled (and disabled) for each
node in the graph. This ends up being expressed as a set of sets, where
each inner set contains items corresponding to "extras is included" or
"extra is excluded."

The logic then is if _all_ of these sets are satisfied by the conflict
marker on the dependency edge, then this conflict marker can be
simplified by assuming all of the inclusions/exclusions to be true.

In this particular case, we run into an issue where the set of
assumptions discovered for `e3nn` is:

    {test[sevennet]}, {}, {~test[m3gnet], ~test[alignn], test[all]}

And the corresponding conflict marker for `e3nn -> sympy 1.13.1` is:

    extra == 'extra-4-test-all'
    or extra == 'extra-4-test-chgnet'
    or (extra != 'extra-4-test-alignn' and extra != 'extra-4-test-m3gnet')

And the conflict marker for `e3nn -> sympy 1.13.3` is:

    extra == 'extra-4-test-alignn' or extra == 'extra-4-test-m3gnet'

Evaluating each of the sets above for `sympy 1.13.1`'s conflict
marker results in them all being true. Simplifying in turn results in
the marker being true. For `sympy 1.13.3`, not all of the sets are
satisfied, so this marker is not simplified.

I think the fundamental problem here is that our inferences aren't quite
rich enough to make these logical leaps. In particular, the conflict
marker for `e3nn -> sympy 1.13.3` is not satisfied by _any_ of our sets.
One might therefore conclude that this dependency edge is impossible.
But! The `test[sevennet]` set doesn't actually rule out `test[m3gnet]`
from being included, for example, because there is no conflict. So it is
actually possible for this marker to evaluate to true.

And I think this reveals the problem: for the `e3nn -> sympy 1.13.1`
conflict marker, the inferences don't capture the fact that
`test[sevennet]` _might_ have `test[m3gnet]` enabled, and that would in
turn result in the conflict marker evaluating to `false`. This directly
implies that our simplification here is inappropriate.

It would be nice to revisit how we build our inferences here so that
they are richer and enable us to make correct logical leaps. For now, we
fix this particular bug with a bit of a cop-out: we skip conflict marker
simplification when there are ambiguous dependency edges.

Fixes #11479
…marker simplification

This is fallout from skipping simplification when two or more edges with
the same package name exist.
@@ -93,6 +93,14 @@ impl ResolutionGraphNode {
}
}
}

#[allow(dead_code)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary?

@charliermarsh
Copy link
Member

I'm sort of just confirming my understanding, but if we simplified the conflict marker for each set independently, and then checked if each simplification resulted in the same outcome, could we use that?

Or, what if for each set A and B, we did:

A and (extra == 'extra-4-test-all'
or extra == 'extra-4-test-chgnet'
or (extra != 'extra-4-test-alignn' and extra != 'extra-4-test-m3gnet'))
or B and (extra == 'extra-4-test-all'
or extra == 'extra-4-test-chgnet'
or (extra != 'extra-4-test-alignn' and extra != 'extra-4-test-m3gnet'))

Would that be sound?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working lock Related to universal resolution and locking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate packages with multiple conflicts
2 participants