../input_prep/reprocess_rnac.pl id_mapping.tsv.gz rfam_annotations.tsv.gz # ~8 minutes #92

Open
rse-lbl opened this issue Mar 10, 2024 · 3 comments


rse-lbl commented Mar 10, 2024

I'm trying to install on my laptop to get some experience before installing and running on supercomputer GPU nodes.

The laptop has an i7-3520M with 16 GB RAM and >100 GB of swap space on an HDD (per #43). The install directory is on a separate external HDD, and the downloaded files that the script is processing are on that same external HDD.

So far I'm at 12 hours running this script. Does anyone know how much longer I should expect to wait? Or will this never resolve?

Additionally, what files should I expect to see created, and what are their sizes?

Can I unpack the tar.gz files and run the script on the re-gz'd components one at a time to reduce memory use and speed things up? Could I go further and split the files, then concatenate the split output files?


rse-lbl commented Mar 10, 2024

It's now been a little over 14 hours. I'm tracking the difference in file size between the still-updating rfam_annotations.tsv.gz and rfam_annotations.tsv.gz.bak. Based on the average rate at which that difference is shrinking, it looks like another 6 hours before the sizes match.
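The estimate above is a plain linear extrapolation, which could be sketched in Python roughly as follows. The function name and the sample readings are made up for illustration, and the sketch assumes the output file will end up about the same size as the .bak copy, which is a guess rather than something the script guarantees.

```python
def estimate_remaining_hours(samples, target_bytes):
    """Linearly extrapolate the time left until a growing file reaches target_bytes.

    samples: list of (elapsed_hours, size_bytes) readings, oldest first.
    Only the first and last readings are used, giving an average growth rate.
    """
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0 or s1 <= s0:
        raise ValueError("need two readings with increasing time and size")
    rate = (s1 - s0) / (t1 - t0)  # average bytes per hour
    # If the file is already past the target, report zero hours remaining.
    return max(target_bytes - s1, 0) / rate


if __name__ == "__main__":
    # Hypothetical readings, not the real numbers from this run.
    readings = [(12.0, 200_000_000), (14.0, 250_000_000)]
    print(round(estimate_remaining_hours(readings, 350_000_000), 1))
```

An estimate like this is only as good as the assumed target size: if the rewritten file compresses differently from the original, the true finish time will drift from the prediction.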

So about 20 hours total.

free -h is showing about 20 Gi of swap used in addition to 14 Gi of RAM (a little of which is running other processes).

I don't know Perl, else I'd rewrite the script to split, batch and concatenate the outputs if free mem is too low to do it all in RAM.
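For anyone wanting to try the split, batch and concatenate idea without touching the Perl, here is a minimal Python sketch of just the splitting and re-joining steps. It is not taken from the project: the function names, chunk size and file naming are hypothetical, and it assumes that reprocess_rnac.pl gives the same result on independent chunks of lines, which I have not verified. If the script joins or deduplicates across the whole file, or if the TSV has a header row, chunking like this would change the output.

```python
import gzip
import itertools
import shutil


def split_gz_lines(src, lines_per_chunk, prefix):
    """Split a gzipped text file into gzipped chunks of at most lines_per_chunk lines.

    Returns the chunk paths in order. Only one chunk is held in memory at a time.
    """
    paths = []
    with gzip.open(src, "rt") as fh:
        for i in itertools.count():
            chunk = list(itertools.islice(fh, lines_per_chunk))
            if not chunk:
                break
            path = f"{prefix}.{i:05d}.gz"  # hypothetical naming scheme
            with gzip.open(path, "wt") as out:
                out.writelines(chunk)
            paths.append(path)
    return paths


def concat_gz(paths, dest):
    """Concatenate gzip files byte for byte.

    A file made of several gzip members back to back is itself a valid
    gzip stream, so no decompression is needed here.
    """
    with open(dest, "wb") as out:
        for p in paths:
            with open(p, "rb") as part:
                shutil.copyfileobj(part, out)
```

Each chunk would be run through the processing step separately, and the per-chunk outputs passed to `concat_gz` in the original order.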


rse-lbl commented Mar 11, 2024

After a total of about 19 and a half hours, rfam_annotations.tsv.gz has stopped updating at a size of 384,827,392 bytes, compared to the 347,475,915 bytes of the backed-up original file. The memory still hasn't been released, and my display manager (and the Xterm within it that I launched the original script from) is dragging. top in tty3 shows that reprocess_rnac.pl is still running and using 89.8% of memory (VIRT says 26.3g, RES 13.9g).

35 minutes later still no further update, and the drives are silent, so I'm killing the script and continuing with installation.

@SuhasSrinivasan
Contributor

I encountered an out-of-memory issue similar to #43 and found a workaround.
It is documented here: #96
I hope this is helpful.
