Avoid repetitive syncmers #475

Closed

marcelm
Collaborator

@marcelm marcelm commented Dec 20, 2024

This is only to document how results would change when avoiding repetitive syncmers. These two commits were originally part of #464, but I removed them because the effect is so small.

Changes are relative to #464.

Comparing accuracy (paf)

92c6105 Produce a syncmer if any minimum is at the required position
48b8257 Sample fewer syncmers in repetitive regions

| library | 92c6105 | 48b8257 | difference |
| --- | ---: | ---: | ---: |
| sim5-drosophila-50-se | 76.9434 | 76.9428 | -0.0006 |
| sim5-drosophila-75-se | 83.4574 | 83.4662 | +0.0088 |
| sim5-drosophila-100-se | 84.9698 | 85.0006 | +0.0308 |
| sim5-drosophila-150-se | 87.5199 | 87.5247 | +0.0048 |
| sim5-drosophila-200-se | 89.2588 | 89.2764 | +0.0176 |
| sim5-drosophila-300-se | 91.2808 | 91.2698 | -0.0110 |
| sim5-maize-50-se | 39.1029 | 39.0660 | -0.0369 |
| sim5-maize-75-se | 51.1972 | 51.1828 | -0.0144 |
| sim5-maize-100-se | 58.7602 | 58.7517 | -0.0085 |
| sim5-maize-150-se | 70.7971 | 70.7977 | +0.0006 |
| sim5-maize-200-se | 77.6729 | 77.6700 | -0.0029 |
| sim5-maize-300-se | 84.6080 | 84.6204 | +0.0124 |
| sim5-CHM13-50-se | 74.2958 | 74.3139 | +0.0181 |
| sim5-CHM13-75-se | 82.7271 | 82.7352 | +0.0081 |
| sim5-CHM13-100-se | 85.3325 | 85.3486 | +0.0161 |
| sim5-CHM13-150-se | 88.6368 | 88.6345 | -0.0023 |
| sim5-CHM13-200-se | 90.4160 | 90.4165 | +0.0005 |
| sim5-CHM13-300-se | 92.1565 | 92.1592 | +0.0027 |
| sim5-rye-50-se | 36.5431 | 36.5321 | -0.0110 |
| sim5-rye-75-se | 49.1549 | 49.1423 | -0.0126 |
| sim5-rye-100-se | 57.3552 | 57.3367 | -0.0185 |
| sim5-rye-150-se | 69.9844 | 70.0066 | +0.0222 |
| sim5-rye-200-se | 76.9929 | 77.0022 | +0.0093 |
| sim5-rye-300-se | 83.8594 | 83.8577 | -0.0017 |
| sim5-ecoli50-50-se | 10.7365 | 10.7364 | -0.0001 |
| sim5-ecoli50-75-se | 13.6556 | 13.6550 | -0.0006 |
| sim5-ecoli50-100-se | 15.8174 | 15.8174 | +0.0000 |
| sim5-ecoli50-150-se | 20.0893 | 20.0893 | +0.0000 |
| sim5-ecoli50-200-se | 23.3933 | 23.3933 | +0.0000 |
| sim5-ecoli50-300-se | 28.3073 | 28.3060 | -0.0013 |

Average difference se: +0.0010
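
For reference, the average above is consistent with an unweighted mean of the per-library differences. The following is a minimal sketch that recomputes it from the 92c6105 and 48b8257 accuracy columns; the helper names `parse_rows` and `average_difference` are hypothetical and are not taken from the project's benchmark scripts.

```python
from statistics import mean

# Columns: library, accuracy at 92c6105, accuracy at 48b8257.
# Values are copied from the first comparison in this pull request.
ROWS = """\
sim5-drosophila-50-se 76.9434 76.9428
sim5-drosophila-75-se 83.4574 83.4662
sim5-drosophila-100-se 84.9698 85.0006
sim5-drosophila-150-se 87.5199 87.5247
sim5-drosophila-200-se 89.2588 89.2764
sim5-drosophila-300-se 91.2808 91.2698
sim5-maize-50-se 39.1029 39.0660
sim5-maize-75-se 51.1972 51.1828
sim5-maize-100-se 58.7602 58.7517
sim5-maize-150-se 70.7971 70.7977
sim5-maize-200-se 77.6729 77.6700
sim5-maize-300-se 84.6080 84.6204
sim5-CHM13-50-se 74.2958 74.3139
sim5-CHM13-75-se 82.7271 82.7352
sim5-CHM13-100-se 85.3325 85.3486
sim5-CHM13-150-se 88.6368 88.6345
sim5-CHM13-200-se 90.4160 90.4165
sim5-CHM13-300-se 92.1565 92.1592
sim5-rye-50-se 36.5431 36.5321
sim5-rye-75-se 49.1549 49.1423
sim5-rye-100-se 57.3552 57.3367
sim5-rye-150-se 69.9844 70.0066
sim5-rye-200-se 76.9929 77.0022
sim5-rye-300-se 83.8594 83.8577
sim5-ecoli50-50-se 10.7365 10.7364
sim5-ecoli50-75-se 13.6556 13.6550
sim5-ecoli50-100-se 15.8174 15.8174
sim5-ecoli50-150-se 20.0893 20.0893
sim5-ecoli50-200-se 23.3933 23.3933
sim5-ecoli50-300-se 28.3073 28.3060
"""


def parse_rows(text):
    """Parse whitespace-separated 'library base changed' lines (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        library, base, changed = line.split()
        rows.append((library, float(base), float(changed)))
    return rows


def average_difference(rows):
    """Unweighted mean of (changed - base) over all libraries."""
    return mean(changed - base for _, base, changed in rows)


rows = parse_rows(ROWS)
print(f"Average difference se: {average_difference(rows):+.4f}")
```

Every library counts equally in this mean, regardless of genome or read length, so the gains and losses in individual rows largely cancel out.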

Comparing accuracy (paf)

92c6105 Produce a syncmer if any minimum is at the required position
bb384f8 Ignore repetitive k-mers

| library | 92c6105 | bb384f8 | difference |
| --- | ---: | ---: | ---: |
| sim5-drosophila-50-se | 76.9434 | 76.9176 | -0.0258 |
| sim5-drosophila-75-se | 83.4574 | 83.4443 | -0.0131 |
| sim5-drosophila-100-se | 84.9698 | 84.9998 | +0.0300 |
| sim5-drosophila-150-se | 87.5199 | 87.5099 | -0.0100 |
| sim5-drosophila-200-se | 89.2588 | 89.2666 | +0.0078 |
| sim5-drosophila-300-se | 91.2808 | 91.2886 | +0.0078 |
| sim5-maize-50-se | 39.1029 | 39.0993 | -0.0036 |
| sim5-maize-75-se | 51.1972 | 51.1967 | -0.0005 |
| sim5-maize-100-se | 58.7602 | 58.7662 | +0.0060 |
| sim5-maize-150-se | 70.7971 | 70.8051 | +0.0080 |
| sim5-maize-200-se | 77.6729 | 77.6560 | -0.0169 |
| sim5-maize-300-se | 84.6080 | 84.5989 | -0.0091 |
| sim5-CHM13-50-se | 74.2958 | 74.2933 | -0.0025 |
| sim5-CHM13-75-se | 82.7271 | 82.7352 | +0.0081 |
| sim5-CHM13-100-se | 85.3325 | 85.3485 | +0.0160 |
| sim5-CHM13-150-se | 88.6368 | 88.6354 | -0.0014 |
| sim5-CHM13-200-se | 90.4160 | 90.4067 | -0.0093 |
| sim5-CHM13-300-se | 92.1565 | 92.1479 | -0.0086 |
| sim5-rye-50-se | 36.5431 | 36.5736 | +0.0305 |
| sim5-rye-75-se | 49.1549 | 49.1517 | -0.0032 |
| sim5-rye-100-se | 57.3552 | 57.3378 | -0.0174 |
| sim5-rye-150-se | 69.9844 | 69.9829 | -0.0015 |
| sim5-rye-200-se | 76.9929 | 76.9919 | -0.0010 |
| sim5-rye-300-se | 83.8594 | 83.8475 | -0.0119 |
| sim5-ecoli50-50-se | 10.7365 | 10.7351 | -0.0014 |
| sim5-ecoli50-75-se | 13.6556 | 13.6575 | +0.0019 |
| sim5-ecoli50-100-se | 15.8174 | 15.8174 | +0.0000 |
| sim5-ecoli50-150-se | 20.0893 | 20.0910 | +0.0017 |
| sim5-ecoli50-200-se | 23.3933 | 23.3906 | -0.0027 |
| sim5-ecoli50-300-se | 28.3073 | 28.3059 | -0.0014 |

Average difference se: -0.0008

@marcelm
Collaborator Author

marcelm commented Dec 20, 2024

This was only for documentation, closing.

@marcelm marcelm closed this Dec 20, 2024