Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat 1-point datasets equally in sequential and parallel fits #2276

Merged
merged 11 commits into from
Feb 20, 2025

Conversation

scarlehoff
Copy link
Member

@scarlehoff scarlehoff commented Feb 12, 2025

Due to the shape-changing nature of boolean masks we decided to just put single-point datasets in training when running in GPU. This PR removes that limitation by just accepting the point in both training and validation and setting to 0 the row and column in the inverse covmat (so the masking happens at the level of the loss).

If this works ok, #2138 comes for free.

@RoyStegeman @achiefa I want to run a few tests first before considering this good:

  • That Etr and Evl are ok for multireplica fits (they are ok for single replicas given that the regression test pass)
  • Running a fit with only single-point datasets (this is actually broken in master)
  • Experiments with one single dataset which is one-single-point.
  • Check whether the written down replicas are really the same in sequential and parallel
    • Add a test for exactly this situation
    • Separate the .csv files per replica

(any help running the checks would be appreciated ofc, the more eyes the better)

This is needed for tree-saving reasons.

@scarlehoff scarlehoff added the n3fit Issues and PRs related to n3fit label Feb 12, 2025
@scarlehoff scarlehoff marked this pull request as draft February 12, 2025 12:05
@scarlehoff scarlehoff added the run-fit-bot Starts fit bot from a PR. label Feb 12, 2025
Copy link

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Feb 19, 2025
@scarlehoff scarlehoff marked this pull request as ready for review February 19, 2025 08:18
@scarlehoff
Copy link
Member Author

This is ready for review.

@achiefa please have a look if you have time to see whether this works for your use case. In principle it should store the replica pseudodata in each folder in GPU just like it does when you run sequentially.

@comane to achieve this I've created a decorator which modifies reportengine's decorator. Not sure whether what I've done is legal or whether there's an easier more official way to achieve this?

Copy link

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff force-pushed the mask_single_point_datasets_parallel branch from fec4235 to 2b92347 Compare February 19, 2025 16:59
Copy link
Contributor

@achiefa achiefa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that. Some things were not clear to me. But I think that is because I don't have as strong a knowledge of the codebase as you have.

@scarlehoff
Copy link
Member Author

I'm okay with that. Some things were not clear to me. But I think that is because I don't have as strong a knowledge of the codebase as you have.

This is why it is important to have less experienced people doing the review, so they can become experienced! Thanks. I'll add the requested comments / docs.

@scarlehoff scarlehoff added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Feb 20, 2025
Copy link

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff merged commit 17f23eb into master Feb 20, 2025
9 checks passed
@scarlehoff scarlehoff deleted the mask_single_point_datasets_parallel branch February 20, 2025 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
n3fit Issues and PRs related to n3fit run-fit-bot Starts fit bot from a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants