PC-AIR issue when over 50k samples. #86
Comments
On this run, we got a segfault (memory error) while working with 52,877 samples and 87,468 selected variants.
CPU capabilities: Double-Precision SSE2
*** caught segfault ***
Traceback:
Did you check the memory usage? Did the job run out of memory?
We are running with 196 GB of memory and 28 cores. What is the recommended amount of memory per core?
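As a rough guide for the memory question, here is a back-of-the-envelope sketch in Python. It assumes (this is not stated in the thread) that the exact PCA holds at least one dense n x n matrix of 8-byte doubles for n samples. The function name is hypothetical, and real peak usage is likely higher because of working copies.

```python
def dense_matrix_gib(n_samples: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of one dense n x n matrix (assumed 8-byte doubles)."""
    return n_samples * n_samples * bytes_per_value / 2**30


# Sample counts mentioned in this thread
for n in (45_000, 52_877, 76_000):
    print(f"{n} samples: {dense_matrix_gib(n):.1f} GiB per matrix copy")
```

Under that assumption, a single copy for 52,877 samples is roughly 21 GiB, so a handful of simultaneous copies would still fit in 196 GB, but not if each of 28 cores held its own copy.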
We were able to compute the PCs on 45k unrelated samples and then project them onto the remaining 8k samples successfully.
I encountered similar issues to those jjfarrell mentioned but haven't found a solution yet. I was running 76K samples with 33,765 SNPs, using PLINK bed/bim/fam files. I was previously able to get the PCs for 45K samples using R 3.6.3/GENESIS_2.16.1. When the sample size went up to 76K, I got a "segfault" error. After reading this post, I upgraded both R and GENESIS to the latest versions. I did two runs:
Session info:
R version 4.1.0 (2021-05-18)
Matrix products: default
locale:
attached base packages:
other attached packages:
loaded via a namespace (and not attached):
I guess you should use the fastPCA option in
The fastPCA option is already available in GENESIS, as any additional arguments to the
Xiuwen and Stephanie, thank you so much for your suggestion! I added algorithm="randomized" to the pcair call, and my 76K sample dataset ran through pcair without problems.
Hello Xiuwen and Stephanie, when I took the PCs estimated from PCair and tried to run my 76K sample set through pcrelate, it failed with the following error message:
It appears to hit the maximum row count (2^31-1) for an R data.table. Does this mean we cannot run pcrelate for more than ~65K samples? Any advice or insights? Thank you!
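The ~65K figure can be checked with a short calculation. This Python sketch assumes (not confirmed in this thread) that the pairwise table has one row per unordered pair of distinct samples, so n samples give n(n-1)/2 rows; the function names are hypothetical.

```python
import math

R_MAX_ROWS = 2**31 - 1  # row limit quoted above


def pair_count(n_samples: int) -> int:
    """Number of unordered pairs of distinct samples: n(n-1)/2."""
    return n_samples * (n_samples - 1) // 2


def max_samples_within_limit(limit: int = R_MAX_ROWS) -> int:
    """Largest n such that n(n-1)/2 <= limit."""
    # Start from the root of n^2 - n - 2*limit = 0, then adjust for rounding.
    n = (1 + math.isqrt(1 + 8 * limit)) // 2
    while pair_count(n) > limit:
        n -= 1
    while pair_count(n + 1) <= limit:
        n += 1
    return n


print(max_samples_within_limit())       # 65536
print(pair_count(76_000))               # 2887962000
print(pair_count(76_000) > R_MAX_ROWS)  # True
```

Under that assumption, the limit is reached just above 65,536 samples, and 76K samples would need about 2.9 billion rows.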
@GraceSheng the pcrelate issue belongs on the GENESIS page. (Unlike pcair, pcrelate does not use SNPRelate functions.)
Thank you Stephanie! I opened a new issue on the GENESIS page.
We are getting an error when our sample size goes above 50,000. The PC-AIR step results in just NAs for all the PCs. At 45k samples, it runs fine. I originally reported this issue to GENESIS (UW-GAC/GENESIS#64). Any suggestions on this issue?
Here is the code...
Here is the session info:
Here is the log. This time the script ran through, but the resulting PCs are NaNs:
/restricted/projectnb/adgc/zhucc/ADGC_data/pheno/PCs/script_53k_twoStepsPCS/adgc.pc-air.pcs.txt