-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding vector feature cutoff to ET #260
Conversation
produce a continuous energy at the cutoff.
cc @Honza-R |
After careful consideration we decided not to deprecate ET for the time being. |
@guillemsimeon please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Now both scalar and vector features will be weighted with the cutoff.
Thanks for fixing the issue. We have very nice results obtained with ET, and discovered this at the very last moment. We're going to try TensorNet on the same problem once this works. You mentioned that this invalidates the previous checkpoints - does this mean that they can't be used at all, or would it only change the results? I'd like to restart the training from what we already have, it took us some work to get the best models. |
You should have no problems with old checkpoints. I left the old behavior as the default as to not invalidate checkpoints. |
The user in #254 discovered a discontinuity in the energy at the cutoff when using ET.
We found that this is caused by a quirk in the current architecture that cannot be avoided without invalidating previous checkpoints.
The message passing step in the current architecture, which is consistent with eq. 9 in [1], weights only scalar features with the cutoff function, not vector ones. Given that both distance projections (eq. 5) have a bias, this allows the arch to learn a non-vanishing offset on the energy.
We can solve this by modifying the message passing so that the cutoff also affects vector features. This amounts to taking away the$\phi(d_{ij})$ from eq. 8 and multiplying it instead at eq.9a, so that $s^1_{ij}, s^2_{ij}, s^3_{ij} = \text{split}(V_j\odot D^V_{ij}) \phi(d_{ij})$ .
Since this invalidates checkpoints I am gating this change behind a new option,
vector_cutoff
, that defaults to false and if turned on implements this new behavior.Enabling this option seems to slightly improve benchmarks, for instance, this is ET-MD17.yaml with and without (reference) vector_cutoff. Note that I did not do any kind of hparam optimization.
data:image/s3,"s3://crabby-images/b3ef6/b3ef6246984c3671c3f3dd2a182b0086cd0aee21" alt="image"
Moreover, since we are moving away from ET in general in favor of TensorNet, I am using this PR to mark ET as deprecated.
[1] https://arxiv.org/pdf/2202.02541.pdf
Closes #254