I anticipate that a lot of user errors will come from passing normalized count data to CAS. Unfortunately, we cannot rely on dtype to catch that: I have seen too many AnnData files with float32 that contain integer counts. Heck, I do that myself all the time :D
A quick and dirty validation is to sample x percent (x ~ 5 to 10) of the non-zero counts (easy if sparse, a bit more expensive if dense) and ensure that their fractional part is < 1e-3. Otherwise, raise an exception with an informative error message. We can also have a flag to disable the input data integralness validation (set to False by default, so validation runs unless it is turned off) for those who know what they're doing.
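Something along these lines, as a rough sketch only. The function name `validate_integer_counts` and the parameters `sample_fraction`, `tol`, `skip_validation` and `seed` are all made up for illustration and are not part of any existing API. It also measures the distance to the nearest integer rather than the raw decimal part, so that float32 values like 2.9999 still pass.

```python
import numpy as np
import scipy.sparse as sp


def validate_integer_counts(X, sample_fraction=0.1, tol=1e-3,
                            skip_validation=False, seed=0):
    """Raise ValueError if sampled non-zero entries of X are not integer-valued.

    Hypothetical helper: the name, parameters and defaults are assumptions.
    """
    if skip_validation:
        return

    # Sparse: the stored values are (almost) exactly the non-zero entries.
    # Dense: we have to scan the whole array, which is more expensive.
    if sp.issparse(X):
        values = X.tocsr().data
    else:
        values = np.asarray(X).ravel()
    values = values[values != 0]  # also drops explicitly stored zeros
    if values.size == 0:
        return

    # Sample a fraction of the non-zero entries without replacement.
    rng = np.random.default_rng(seed)
    n_sample = max(1, int(sample_fraction * values.size))
    sample = rng.choice(values, size=n_sample, replace=False)

    # Distance to the nearest integer, so 2.9999 and 3.0001 both pass.
    deviation = np.abs(sample - np.rint(sample))
    n_bad = int(np.count_nonzero(deviation >= tol))
    if n_bad > 0:
        raise ValueError(
            f"{n_bad} of {n_sample} sampled non-zero values are not integers "
            f"(tolerance {tol}). The input looks normalized; please provide "
            "raw integer counts, or set skip_validation=True if this is "
            "intentional."
        )
```

With this, a float32 matrix holding integer counts passes, while something like `np.log1p(counts)` raises. Because it only samples, it can miss a matrix where just a few entries are non-integer, which seems acceptable for catching wholesale normalized input.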