DockOpt treats protomers/tautomers as separate unique molecules when calculating enrichment #30
Comments
@jir322 Please comment on this if you have any remarks.
It would be best to unify all molecules with the same ZINC ID and count
them as a single molecule, retaining only the best score for one
representative of each ZINC ID.
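A minimal sketch of that unification step, in Python. It is not DockOpt's code: the function names are hypothetical, the base ID is assumed to be everything before the first period in names such as ZINC00000000aBcD.0.0, and a lower (more negative) score is assumed to be better.

```python
from typing import Dict, Iterable, Tuple


def base_id(name: str) -> str:
    """Return the part of the name before the first '.' (assumed convention)."""
    return name.split(".", 1)[0]


def best_score_per_molecule(rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Keep one entry per base ID, retaining the lowest (assumed best) score."""
    best: Dict[str, float] = {}
    for name, score in rows:
        key = base_id(name)
        if key not in best or score < best[key]:
            best[key] = score
    return best


# Hypothetical (name, score) pairs, e.g. as read from an OUTDOCK file.
rows = [
    ("ZINC00000000aBcD.0.0", -12.5),
    ("ZINC00000000aBcD.1.0", -41.0),
    ("ZINC00000000eFgH.0.0", -30.2),
]
collapsed = best_score_per_molecule(rows)
print(collapsed)
```

Names that contain no period are left unchanged, so molecules from sources other than ZINC would still be counted once each under this assumption.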
On Tue, Apr 4, 2023 at 6:44 PM Ian Scott Knight wrote:
@jir322 Please comment on this if you have any remarks.
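@jir322 From DockOpt's perspective, there is no such thing as "ZINC ID". There is only the id_num column in the OUTDOCK file, which corresponds to the zincname field encoded in the .db2 file (https://wiki.docking.org/index.php?title=DB2_File_Format) of the molecule.
(Note that zincname and id_num are both misnomers for their data types, and are partly responsible for the confusion here. E.g., it is possible for built molecules to come from somewhere other than ZINC, such as the actives in the DUDE-Z dataset, which come from RCSB PDB.)
The real problem here is that there is no general ID for molecules in the DB2 file format. One possible solution is to just use the zincname field in the .db2 file as an actual molecule ID, since DockOpt currently treats the id_num column of OUTDOCK as a molecule ID, but doing so would almost certainly only create confusion in the long run.
Another solution is to update the .db2 file format to account for this ambiguity by adding a molecule_id field (and rectifying the zincname misnomer).
Yet another solution is to adopt a naming convention for zincname field entries of the same molecule which would allow DockOpt to figure out what to treat as the same molecule. I would suggest a regex. E.g., "^.*\.\d$" would match any string that is a ZINC code followed by a period and a number, where the number would identify the protomer / tautomer.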
Let's discuss this. There is no practical way to retrospectively update all
of the ZINC-22 files. We can fix it in ZINC-25, but that's not for a while
longer.
John Irwin
UCSF Pharmaceutical Chemistry
http://irwinlab.compbio.ucsf.edu
On Wed, Apr 5, 2023 at 7:42 PM Ian Scott Knight wrote:
@jir322 From DockOpt's perspective, there is no such thing as "ZINC ID".
There is only the id_num column in the OUTDOCK file, which corresponds to
the zincname field encoded in the .db2 file
<https://wiki.docking.org/index.php?title=DB2_File_Format> of the molecule.
(Note that zincname and id_num are both misnomers for their data types,
and are partly responsible for the confusion here. E.g., it is possible for
built molecules to come from somewhere other than ZINC, such as the actives
in the DUDE-Z dataset, which come from RCSB PDB.)
The real problem here is that there is no general ID for molecules in the
DB2 file format. One possible solution is to just use the zincname field
in the .db2 file as an actual molecule ID, since DockOpt currently treats
the id_num column of OUTDOCK as a molecule ID, but doing so would almost
certainly only create confusion in the long run.
Another solution is to update the .db2 file format to account for this
ambiguity by adding a molecule_id field (and rectifying the zincname
misnomer).
Yet another solution is to adopt a naming convention for zincname field
entries of the same molecule which would allow DockOpt to figure out what
to treat as the same molecule. I would suggest a regex. E.g., "^.*\.\d$"
would match any string that is a ZINC code followed by a period and a
number, where the number would identify the protomer / tautomer.
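A rough Python sketch of the naming-convention idea follows. Note that the period has to be escaped (\.) for the pattern to require a literal period; an unescaped "." matches any character, so a name with no suffix at all would also match. The pattern and the split_variant helper below are hypothetical, not part of DockOpt, and they assume the suffix is one or more groups of a period followed by digits, as in ZINC00000000aBcD.1.0.

```python
import re
from typing import Optional, Tuple

# Hypothetical convention: base name without periods, then one or more
# ".<digits>" groups identifying the protomer / tautomer.
VARIANT_NAME = re.compile(r"^(?P<base>[^.]+)(?P<variant>(?:\.\d+)+)$")


def split_variant(name: str) -> Tuple[str, Optional[str]]:
    """Split a name into (base, variant); variant is None if no suffix is found."""
    m = VARIANT_NAME.match(name)
    if m is None:
        return name, None
    return m.group("base"), m.group("variant").lstrip(".")


# An unescaped period also accepts names that have no suffix at all.
unescaped_matches = re.match(r"^.*.\d$", "ZINC12345") is not None
escaped_matches = re.match(r"^.*\.\d$", "ZINC12345") is not None

print(split_variant("ZINC00000000aBcD.1.0"))
print(split_variant("1abc_ligand"))
print(unescaped_matches, escaped_matches)
```

Names that do not follow the convention fall through unchanged, which would let molecules from other sources keep their own names as IDs.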
The different protomers and tautomers of the same molecule get built as separate db2 files with the same ZINC ID but different numbers after the decimal points (e.g., ZINC00000000aBcD.0.0 vs. ZINC00000000aBcD.1.0). Each of these gets docked and scored on its own; however, only the best-scoring protomer/tautomer should be considered when calculating enrichment. DockOpt currently treats every separate db2 file as a unique 'active' molecule, which alters the calculated enrichment and allows for situations where a poorly scoring protomer brings down the enrichment score even though an alternative protomer scores well compared to decoy compounds.
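To illustrate the effect with made-up numbers, the sketch below compares an enrichment-style metric computed per db2 file against the same metric computed per molecule. It uses a plain ROC AUC (the fraction of active/decoy pairs in which the active scores better) as a stand-in; the metric DockOpt actually reports may differ. The scores are invented, lower scores are assumed to be better, and the base ID is assumed to be the text before the first period.

```python
from typing import Dict, List, Tuple


def roc_auc(actives: List[float], decoys: List[float]) -> float:
    """Fraction of (active, decoy) pairs where the active has the lower score."""
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # ties count as half a win
    return wins / (len(actives) * len(decoys))


def collapse(rows: List[Tuple[str, float]]) -> List[float]:
    """Keep only the lowest score per base ID (text before the first '.')."""
    best: Dict[str, float] = {}
    for name, score in rows:
        key = name.split(".", 1)[0]
        best[key] = min(score, best.get(key, score))
    return list(best.values())


# Invented scores: one active has a poor and a good protomer.
active_rows = [
    ("ZINC00000000aBcD.0.0", -10.0),
    ("ZINC00000000aBcD.1.0", -45.0),
    ("ZINC00000000eFgH.0.0", -40.0),
]
decoy_scores = [-35.0, -30.0, -25.0, -20.0]

auc_per_file = roc_auc([s for _, s in active_rows], decoy_scores)
auc_per_molecule = roc_auc(collapse(active_rows), decoy_scores)
print(auc_per_file, auc_per_molecule)
```

In this toy case the poorly scoring protomer is counted as a third active that loses to every decoy, so the per-file value comes out lower than the per-molecule value even though both molecules have a protomer that outranks all decoys.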