Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aeon 1.0 #94

Merged
merged 30 commits into from
Jan 31, 2024
Merged

Aeon 1.0 #94

merged 30 commits into from
Jan 31, 2024

Conversation

daemontus
Copy link
Collaborator

@daemontus daemontus commented Jan 31, 2024

This PR migrates balm to the version 1.0 of biodivine_aeon. As part of the migration, several major changes are introduced:

  • The dependency on PyEDA is completely removed, because we can now use symbolic representations in AEON. This mainly affects the Petri net translation and motif avoidant detection, where PyEDA BDDs were used extensively. To some extent, this also affects percolation, but not as dramatically.

    • In a few places, this change is a bit annoying because it means that, aside from the BDD, you also have to provide a proper "context" object through which the BDD is interpreted.
    • To help with this, each SuccessionDiagram now also maintains an AsynchronousGraph of the underlying BooleanNetwork. To avoid any unnecessary overhead, the SuccessionDiagram first removes any static constraints imposed by the influence graph, so this should really be only building the symbolic context and the function BDDs and nothing else.
    • With that said... For the Petri net translation, Aeon computes a ~10-20% smaller Petri net for the real world models in roughly the same time... or the same Petri net but 10x faster for very large N-K models (see petri_net_translation table below).
    • The size of the Petri net very positively impacts some of the real-world models, but this is relatively rare (see e.g. 146 and 159 in the bfs_full_expansion table). For other models, the impact is negligible, mostly because we are not bottlenecked by the solver very much (or the PN did not improve measurably).
    • For motif avoidant detection, this also helped quite a bit in cases where attractor detection was a bottleneck. For BBM models, we are up to 10x faster on many models that were previously "slow".
  • For percolation, I observed that it's not particularly critical for performance on the real-world models, but it can be a bottleneck on the large random models. Here, the new implementation should be up to 2x faster than the old one, but we seem to be hitting other bottlenecks for those models as well, so the performance impact is not as drastic.

    • I separated the basic and strict percolations into separate methods to make them easier to optimize. For now, I am focusing on the basic percolation. Strict percolation does not seem to be used in performance critical code at the moment, so it should be fine.
    • After accounting for the new translation and percolation changes, we are 3% (BBM) and 15% (random) faster than the old implementation. But this weights every model equally, the difference tends to be much larger on the non-trivial models (see bfs_full_expansion table).
  • I also did some non-trivial updates to the SCC expansion, mainly using new features in BN manipulation to avoid the bnet translations. In the end, the changes should not impact performance much, but the expansion is still faster compared to the previous result, probably for the same reasons full expansion is faster (See attractors_scc_expansion table)

Other miscellaneous changes:

  • The type stubs are no longer necessary, since PyEDA is not used, and AEON now distributes its own type annotations.
  • The fact that we no longer use PyEDA means we don't need to increase the recursion limit anymore.
  • Added a Python script that can sample networks with random input valuations. Some of the AEON bindings
    to do this in Python were missing previously.
  • Added a simple benchmark script to test Petri net translation in isolation to rule out any bottlenecks in that area and to be able to compare the PN sizes across implementations.
  • Now that everything is generally faster, we can remove some of the limits we imposed on attractor detection in the original tests.
  • BooleanNetwork now inherits from a RegulatoryGraph, hence type signatures that accepted both can be simplified to just RegulatoryGraph (for the most part).
  • The caching of expressions/symbolic values is removed because most of it is now handled by the SymbolicContext and AsynchronousGraph instances that we pass around.

petri_net_translation.xlsx
bfs_full_expansion.xlsx
attractors_scc_expansion.xlsx

Copy link

github-actions bot commented Jan 31, 2024

Coverage

Coverage Report
FileStmtsMissCoverMissing
balm
   SuccessionDiagram.py2114280%6, 126–133, 139–146, 159, 166, 173, 181, 184, 190–221, 308, 448–450, 456, 582
   control.py1231489%48, 57, 61, 67, 81, 90–106, 329, 332, 345
   interaction_graph_utils.py54591%6–8, 47, 165–166
   motif_avoidant.py152299%24, 120
   petri_net_translation.py1171091%17–19, 49, 85, 133–134, 158–159, 168, 272
   space_utils.py129497%24–26, 251, 277
   symbolic_utils.py26388%10–12, 44
   trappist_core.py1872686%10–12, 40, 42, 82, 128, 193, 195, 197, 232–234, 254–260, 318, 320, 350, 390, 392, 423, 452
balm/_sd_algorithms
   compute_attractor_seeds.py30197%8
   expand_attractor_seeds.py51590%6, 42, 95–100
   expand_bfs.py28196%6
   expand_dfs.py30197%6
   expand_minimal_spaces.py37295%6, 31
   expand_source_SCCs.py163597%20, 90, 100, 142, 285
   expand_to_target.py31390%6, 38, 43
TOTAL139812491% 

Tests Skipped Failures Errors Time
363 0 💤 0 ❌ 0 🔥 48.723s ⏱️

@jcrozum
Copy link
Owner

jcrozum commented Jan 31, 2024

This is great! While you're overhauling the SCC expansion, watch out for the import cycle. Maybe with the new approach, there will be a nicer way to resolve it. Basically, the SuccessionDiagram.py needs to import the methods in expand_source_SCC.py file to get the expansion method for the SuccessionDiagram class, but those methods construct a new SuccessionDiagram. This is causing problems for the automatic documentation. My solution on that branch was to import the SuccessionDiagram class from within the expansion methods at runtime, but hopefully there is a better way. On slightly a related note, we should rename the files to conform to PEP8 standars (all files should be written_like_this.py).

@daemontus daemontus requested a review from jcrozum January 31, 2024 15:10
@jcrozum
Copy link
Owner

jcrozum commented Jan 31, 2024

I just pushed some formatting changes to this PR. We should probably add black to our CI pipeline. I also added a # type: ignore to a type error that I got locally but which our CI seemed to miss for some reason.

@@ -279,7 +285,7 @@ def find_node(self, node_space: BooleanSpace) -> int | None:
return i
else:
return None
except RuntimeError:
except IndexError:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we actually expect this to be an AssertionError?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... what we are actually intending to catch here is an error that is raised by space_unique_key if you feed it something that cannot be converted to a valid key. The assertion in this function is actually a "true" assertion: if it fails, something is wrong and we should terminate. But I agree that this is a bit confusing :/

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense. Maybe just add a comment here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's there, underneath :) It's just not visible in the diff, because it was already there before. I changed the error to something that seemed more appropriate (after all, RuntimeError could be literally anything).

But I guess it is kind of vague, I'll make it more explicit.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I saw that comment; I'll paste it here so we can see it:

# If the user gives us a space that uses variables not used by this
# network, we should get an error that we can catch and report that
# no such space exists here.

I meant we should add a comment to key = space_unique_key(node_space, self.network) if that's where the exception can be thrown. An even better solution would be to have space_unique_key handle bad inputs more gracefully, e.g., returning None in those cases. Then we can just write

if not key:
    return None

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, I think we can merge this and worry about it later since it's correct.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a bit more thought and I think the assertion is actually unsatisfiable (at least at this point, maybe it made sense in the past). For every key in node_indices, we should have that if key = space_unique_key(space), then node_space(node_indices(k)) == space, because the keys are unique hashes: every space has its own key, so there can be no collision. I guess my initial confusion was that if the key were to be computed for a different network with compatible variable names, this sort of collision could happen. But that is not what is happening in this method.

function_expression = network.get_update_function(varID)
function_bdd = expr2bdd(aeon_to_pyeda(function_expression))
update_functions[var_name] = function_bdd

# A random generator initialized with a fixed seed. Ensures simulation
# is randomized but deterministic.
generator = random.Random(1234567890)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be passed in (with a default seed of 0 probably), not hard-coded

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm referring to generator = random.Random(1234567890), in case that wasn't clear)

if bdd.is_false():
return
if bdd.is_true():
yield BddPartialValuation(bdd.__ctx__(), {}) # type: ignore
Copy link
Owner

@jcrozum jcrozum Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was the type error that I ignored

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... maybe something mypy added recently and you need to upgrade? It works locally on my computer as well.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that's the reason. This works without the ignore in the latest mypy. Feel free to remove the tag.

result: BooleanSpace = {}
for var, value in percolated.items():
var_name = stg.get_network_variable_name(var)
result[var_name] = cast(Literal[0, 1], int(value))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to have a type alias like type Bool = Literal[0,1] in the types file. Not sure how often this comes up, but the overlap in terminology between a Python Literal and a Boolean literal has tripped me up more than once.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe BooleanValue would be a better name, since it would match with BooleanState.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! But I guess we can add that as a separate fix :)

Copy link
Owner

@jcrozum jcrozum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with merging this as-is. The comments I left are very minor. They should be probably addressed eventually, but let's not hold everything up for them. The only one I'm actually kind of worried about is catching the IndexError vs the AssertionError. We might want to add something to the unit tests for this.

@daemontus
Copy link
Collaborator Author

Ok, I don't know what happened exactly, maybe I screwed up something during testing (I was experimenting with the percolation when I did the preliminary tests for SCC expansion), but the latest commit is actually faster on SCC expansion than the original version, roughly in line with the improvements we got for the "full" expansion. So I guess that issue is closed.

Furthermore, I tested control: Anything that was able to finish under 10 minutes on the new version was slower on the older version. So I think we should be fine in terms of performance.

I'll be merging this, and we can solve the rest as we go.

@daemontus daemontus merged commit 7b6d1ee into main Jan 31, 2024
6 checks passed
@daemontus daemontus deleted the aeon-1.0 branch January 31, 2024 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants