From 6c207ed1732fa9a6fd2e9c7f9381db0f5c5688a1 Mon Sep 17 00:00:00 2001 From: adeschen Date: Tue, 22 Oct 2024 03:36:52 +0000 Subject: [PATCH] Built site for RAIDS@1.3.2: 8f2ea94 --- articles/Create_Reference_GDS_File.html | 6 +- articles/RAIDS.html | 25 ++- articles/Wrappers.html | 2 +- pkgdown.yml | 2 +- reference/add1KG2SampleGDS.html | 2 +- reference/addBlockFromDetFile.html | 2 +- reference/addGDS1KGLDBlock.html | 2 +- reference/addGeneBlockGDSRefAnnot.html | 2 +- reference/addGeneBlockRefAnnot.html | 2 +- reference/addStudy1Kg.html | 2 +- reference/createStudy2GDS1KG.html | 2 +- .../generateGDS1KGgenotypeFromSNPPileup.html | 2 +- reference/generateGDSSNPinfo.html | 12 +- reference/inferAncestry.html | 13 +- reference/inferAncestryGeneAware.html | 208 +++++++++++++++++- reference/profileAncestry.html | 5 - reference/pruningSample.html | 2 +- reference/wrapperAncestry.html | 5 - 18 files changed, 240 insertions(+), 56 deletions(-) diff --git a/articles/Create_Reference_GDS_File.html b/articles/Create_Reference_GDS_File.html index 356cf261..c8d8add0 100644 --- a/articles/Create_Reference_GDS_File.html +++ b/articles/Create_Reference_GDS_File.html @@ -95,7 +95,7 @@

Pascal Belleau,


Package: RAIDS
Authors: Pascal Belleau [cre, aut] (https://orcid.org/0000-0002-0802-1071), Astrid Deschênes [aut] (https://orcid.org/0000-0001-7846-6749), David A. Tuveson [aut] (https://orcid.org/0000-0002-8017-2712), Alexander -Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-21
License: Apache License (>= 2)

+Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-22
License: Apache License (>= 2)



This vignette explains, in further details, the format of the population reference files that are required to run the ancestry @@ -393,7 +393,7 @@

Population Reference GDS File## Show the file format print(gdsRefNew) -## File: /tmp/RtmpZEQp7l/reference_DEMO.gds (1.6K) +## File: /tmp/Rtmp4S5RLw/reference_DEMO.gds (1.6K) ## + [ ] ## |--+ sample.id { Str8 10, 80B } ## |--+ sample.annot [ data.frame ] * @@ -578,7 +578,7 @@

Population Reference Annotatio ## Show the file format print(gdsRefAnnotNew) -## File: /tmp/RtmpZEQp7l/reference_SNV_Annotation_DEMO.gds (427B) +## File: /tmp/Rtmp4S5RLw/reference_SNV_Annotation_DEMO.gds (427B) ## + [ ] ## |--+ phase { Bit2 10x10, 25B } ## |--+ block.annot [ data.frame ] * diff --git a/articles/RAIDS.html b/articles/RAIDS.html index 540dd776..844567b2 100644 --- a/articles/RAIDS.html +++ b/articles/RAIDS.html @@ -96,7 +96,7 @@

Pascal Belleau,


Package: RAIDS
Authors: Pascal Belleau [cre, aut] (https://orcid.org/0000-0002-0802-1071), Astrid Deschênes [aut] (https://orcid.org/0000-0001-7846-6749), David A. Tuveson [aut] (https://orcid.org/0000-0002-8017-2712), Alexander -Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-21
License: Apache License (>= 2)

+Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-22
License: Apache License (>= 2)

Licensing

@@ -172,9 +172,8 @@

Main StepsStep 2 Ancestry inference with RAI

2.1 Set-up required directories

+

All required directories need to be created. In addition, the paths to +the reference files are kept in variables that will be used later.

 
 #############################################################################
@@ -325,22 +326,20 @@ 

2.2 Sample refe ############################################################################# dataRef <- select1KGPopForSynthetic(fileReferenceGDS=refGenotype, nbProfiles=2L)

-

The output object is going to be used later at the ancestry inference -step.

+

The output object is going to be used later.


2.3 Perform the ancestry inference

-

Within a single function call, data synthesis is performed, the -synthetic data are used to optimize the inference parameters and, with -these, the ancestry of the input profile donor is inferred.

+

Ancestry inference can be done in one function call. Within a single +function call, data synthesis is performed, the synthetic data are used +to optimize the inference parameters and, with these, the ancestry of +the input profile donor is inferred.

According to the type of input data (RNA or DNA), a specific function -is available. The inferAncestry() function is used for DNA +should be called. The inferAncestry() function is used for DNA profiles while the inferAncestryGeneAware() function is RNA specific.

-

In this example, the profile is from DNA source and requires the use -of the inferAncestry() function.

The inferAncestry() function requires a specific profile input format. The format is set by the genoSource parameter.

@@ -372,6 +371,8 @@

2.3 Perform the ancestry inferenceBeware that the starting position in the population reference GDS file is zero (like BED files). The generic SNP file should also start at position zero.

+

In this example, the profile is from DNA source and requires the use +of the inferAncestry() function.

 
 ###########################################################################
diff --git a/articles/Wrappers.html b/articles/Wrappers.html
index 7b549125..f0b45700 100644
--- a/articles/Wrappers.html
+++ b/articles/Wrappers.html
@@ -95,7 +95,7 @@ 

Pascal Belleau,


Package: RAIDS
Authors: Pascal Belleau [cre, aut] (
https://orcid.org/0000-0002-0802-1071), Astrid Deschênes [aut] (https://orcid.org/0000-0001-7846-6749), David A. Tuveson [aut] (https://orcid.org/0000-0002-8017-2712), Alexander -Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-21
License: Apache License (>= 2)

+Krasnitz [aut]
Version: 1.3.2
Compiled date: 2024-10-22
License: Apache License (>= 2)



This vignette explains, in further details, the use of the wrapper functions that were developed for a previous release of RAIDS.

diff --git a/pkgdown.yml b/pkgdown.yml index da542f31..83942197 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -5,4 +5,4 @@ articles: Create_Reference_GDS_File: Create_Reference_GDS_File.html RAIDS: RAIDS.html Wrappers: Wrappers.html -last_built: 2024-10-21T23:19Z +last_built: 2024-10-22T03:33Z diff --git a/reference/add1KG2SampleGDS.html b/reference/add1KG2SampleGDS.html index 8cb6d737..e74075d2 100644 --- a/reference/add1KG2SampleGDS.html +++ b/reference/add1KG2SampleGDS.html @@ -154,7 +154,7 @@

Examples

## The 'pruned.study' entry should be present content <- openfn.gds(fileProfile) content -#> File: /tmp/RtmpZ7wD2p/ex2.gds (10.4K) +#> File: /tmp/RtmpQxo2h7/ex2.gds (10.4K) #> + [ ] #> |--+ Ref.count { SparseInt16 11000x1, 568B } #> |--+ Alt.count { SparseInt16 11000x1, 74B } diff --git a/reference/addBlockFromDetFile.html b/reference/addBlockFromDetFile.html index 7a113e56..ba82f7fa 100644 --- a/reference/addBlockFromDetFile.html +++ b/reference/addBlockFromDetFile.html @@ -179,7 +179,7 @@

Examples

gdsAnnot1KG <- openfn.gds(fileAnnotGDS) print(gdsAnnot1KG) -#> File: /tmp/RtmpZ7wD2p/ex1_good_small_1KG_Ann_GDS.gds (349.4K) +#> File: /tmp/RtmpQxo2h7/ex1_good_small_1KG_Ann_GDS.gds (349.4K) #> + [ ] #> |--+ phase { Bit2 11000x156 LZ4_ra(21.5%), 90.2K } #> |--+ block.annot [ data.frame ] * diff --git a/reference/addGDS1KGLDBlock.html b/reference/addGDS1KGLDBlock.html index d4ab7587..8dac5a01 100644 --- a/reference/addGDS1KGLDBlock.html +++ b/reference/addGDS1KGLDBlock.html @@ -143,7 +143,7 @@

Examples

gdsAnnot1KG <- openfn.gds(fileAnnotGDS) print(gdsAnnot1KG) -#> File: /tmp/RtmpZ7wD2p/ex1_good_small_1KG_Ann_GDS.gds (349.4K) +#> File: /tmp/RtmpQxo2h7/ex1_good_small_1KG_Ann_GDS.gds (349.4K) #> + [ ] #> |--+ phase { Bit2 11000x156 LZ4_ra(21.5%), 90.2K } #> |--+ block.annot [ data.frame ] * diff --git a/reference/addGeneBlockGDSRefAnnot.html b/reference/addGeneBlockGDSRefAnnot.html index fcdbda41..146bac0f 100644 --- a/reference/addGeneBlockGDSRefAnnot.html +++ b/reference/addGeneBlockGDSRefAnnot.html @@ -168,7 +168,7 @@

Examples

unlink(fileAnnotGDS, force=TRUE) } -#> File: /tmp/RtmpZ7wD2p/ex1_good_small_1KG_Ann_GDS.gds (392.5K) +#> File: /tmp/RtmpQxo2h7/ex1_good_small_1KG_Ann_GDS.gds (392.5K) #> + [ ] #> |--+ phase { Bit2 11000x156 LZ4_ra(21.5%), 90.2K } #> |--+ block.annot [ data.frame ] * diff --git a/reference/addGeneBlockRefAnnot.html b/reference/addGeneBlockRefAnnot.html index 179f97f8..5ebbcebb 100644 --- a/reference/addGeneBlockRefAnnot.html +++ b/reference/addGeneBlockRefAnnot.html @@ -162,7 +162,7 @@

Examples

unlink(fileAnnotGDS, force=TRUE) } -#> File: /tmp/RtmpZ7wD2p/ex1_good_small_1KG_Ann_GDS.gds (392.5K) +#> File: /tmp/RtmpQxo2h7/ex1_good_small_1KG_Ann_GDS.gds (392.5K) #> + [ ] #> |--+ phase { Bit2 11000x156 LZ4_ra(21.5%), 90.2K } #> |--+ block.annot [ data.frame ] * diff --git a/reference/addStudy1Kg.html b/reference/addStudy1Kg.html index 77014493..69a0cfb1 100644 --- a/reference/addStudy1Kg.html +++ b/reference/addStudy1Kg.html @@ -156,7 +156,7 @@

Examples

## The Profile GDS file will contain 'study.list' and 'study.annot' entries addStudy1Kg(gdsReference=gds1KG, fileProfileGDS=fileNameProfile, verbose=TRUE) -#> study.annot DONE 2024-10-21 23:20:36.957532 +#> study.annot DONE 2024-10-22 03:34:18.031674 #> [1] 0 closefn.gds(gds1KG) diff --git a/reference/createStudy2GDS1KG.html b/reference/createStudy2GDS1KG.html index 4c1be5df..c258f773 100644 --- a/reference/createStudy2GDS1KG.html +++ b/reference/createStudy2GDS1KG.html @@ -223,7 +223,7 @@

Examples

## The Profile GDS file 'ex1.gds' has been created in the ## specified directory list.files(tempdir()) -#> [1] "downlit" "ex1.gds" "filebc74c76775f" +#> [1] "downlit" "ex1.gds" "filebb47a9c44cf" ## Remove Profile GDS file (created for demo purpose) unlink(file.path(tempdir(), "ex1.gds"), force=TRUE) diff --git a/reference/generateGDS1KGgenotypeFromSNPPileup.html b/reference/generateGDS1KGgenotypeFromSNPPileup.html index 6bf1a1d7..84798508 100644 --- a/reference/generateGDS1KGgenotypeFromSNPPileup.html +++ b/reference/generateGDS1KGgenotypeFromSNPPileup.html @@ -239,7 +239,7 @@

Examples

## The Profile GDS file 'ex1.gds' has been created in the ## specified directory list.files(dataDir) -#> [1] "downlit" "ex1.gds" "ex1.txt.gz" "filebc74c76775f" +#> [1] "downlit" "ex1.gds" "ex1.txt.gz" "filebb47a9c44cf" ## Unlink Profile GDS file (created for demo purpose) unlink(file.path(dataDir, "ex1.gds")) diff --git a/reference/generateGDSSNPinfo.html b/reference/generateGDSSNPinfo.html index 78b37bea..ac97087e 100644 --- a/reference/generateGDSSNPinfo.html +++ b/reference/generateGDSSNPinfo.html @@ -116,12 +116,12 @@

Examples

## Add SNV information to Reference GDS RAIDS:::generateGDSSNPinfo(gdsReference=filenewGDS, fileFreq=fileFilerterSNVs, verbose=TRUE) -#> Read mapSNVSel DONE 2024-10-21 23:21:00.83511 -#> SNP part snp.id DONE 2024-10-21 23:21:00.835595 -#> SNP part snp.chromosome DONE 2024-10-21 23:21:00.83602 -#> SNP part snp.position DONE 2024-10-21 23:21:00.836411 -#> SNP part 1 DONE 2024-10-21 23:21:00.836827 -#> SNP part AF DONE 2024-10-21 23:21:00.837214 +#> Read mapSNVSel DONE 2024-10-22 03:34:42.706733 +#> SNP part snp.id DONE 2024-10-22 03:34:42.707238 +#> SNP part snp.chromosome DONE 2024-10-22 03:34:42.707656 +#> SNP part snp.position DONE 2024-10-22 03:34:42.708075 +#> SNP part 1 DONE 2024-10-22 03:34:42.70846 +#> SNP part AF DONE 2024-10-22 03:34:42.708846 #> [1] 0 ## Close GDS file (important) diff --git a/reference/inferAncestry.html b/reference/inferAncestry.html index 96e23097..ab4a4807 100644 --- a/reference/inferAncestry.html +++ b/reference/inferAncestry.html @@ -137,8 +137,8 @@

Arguments

genoSource
-

a character string with two possible values: -'snp-pileup', 'generic' or 'VCF'. It specifies if the genotype files +

a character string with four possible values: +'snp-pileup', 'generic', 'VCF' or 'bam'. It specifies if the genotype files are generated by snp-pileup (Facets) or are a generic format CSV file with at least those columns: 'Chromosome', 'Position', 'Ref', 'Alt', 'Count', 'File1R' and 'File1A'. @@ -161,7 +161,7 @@

Arguments

Value

-

TODO a list containing 4 entries:

pcaSample
+

a list containing 4 entries:

pcaSample

a list containing the information related to the eigenvectors. The list contains those 3 entries:

sample.id

a character string representing the unique @@ -362,11 +362,6 @@

Value

-
-

Details

-

The profileAncestry() generates list list -TODO update the description

-

References

Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ, @@ -388,7 +383,6 @@

Examples

## Path to the demo 1KG GDS file is located in this package dataDir <- system.file("extdata", package="RAIDS") - ################################################################# ## The 1KG GDS file and the 1KG SNV Annotation GDS file ## need to be located in the same directory @@ -412,7 +406,6 @@

Examples

################################################################# pathProfileGDS <- file.path(tempdir(), "out.tmp") - #################################################################### ## Fix seed to ensure reproducible results #################################################################### diff --git a/reference/inferAncestryGeneAware.html b/reference/inferAncestryGeneAware.html index 8be780b2..8ea13ded 100644 --- a/reference/inferAncestryGeneAware.html +++ b/reference/inferAncestryGeneAware.html @@ -138,8 +138,8 @@

Arguments

genoSource
-

a character string with two possible values: -'snp-pileup', 'generic' or 'VCF'. It specifies if the genotype files +

a character string with four possible values: +'snp-pileup', 'generic', 'VCF' or 'bam'. It specifies if the genotype files are generated by snp-pileup (Facets) or are a generic format CSV file with at least those columns: 'Chromosome', 'Position', 'Ref', 'Alt', 'Count', 'File1R' and 'File1A'. @@ -168,8 +168,208 @@

Arguments

Value

-

The integer list TODO see inferAncestry.

-
+

a list containing 4 entries:

pcaSample
+

a list containing the information related +to the eigenvectors. The list contains those 3 entries:

sample.id
+

a character string representing the unique +identifier of the current profile.

+ +
eigenvector.ref
+

a matrix of numeric containing +the eigenvectors for the reference profiles.

+ +
eigenvector
+

a matrix of numeric containing the +eigenvectors for the current profile projected on the PCA from the +reference profiles.

+ + +
+ +
paraSample
+

a list containing the results with +different D and K values that lead to optimal parameter +selection. The list contains those entries:

dfPCA
+

a data.frame containing statistical results +on all combined synthetic results done with a fixed value of D (the +number of dimensions). The data.frame contains those columns:

D
+

a numeric representing the value of D (the +number of dimensions).

+ +
median
+

a numeric representing the median of the +minimum AUROC obtained (within super populations) for all combination of +the fixed D value and all tested K values.

+ +
mad
+

a numeric representing the MAD of the minimum +AUROC obtained (within super populations) for all combination of the fixed +D value and all tested K values.

+ +
upQuartile
+

a numeric representing the upper quartile +of the minimum AUROC obtained (within super populations) for all +combination of the fixed D value and all tested K values.

+ +
k
+

a numeric representing the optimal K value +(the number of neighbors) for a fixed D value.

+ + +
+ +
dfPop
+

a data.frame containing statistical results on +all combined synthetic results done with different values of D (the +number of dimensions) and K (the number of neighbors). +The data.frame contains those columns:

D
+

a numeric representing the value of D (the +number of dimensions).

+ +
K
+

a numeric representing the value of K (the +number of neighbors).

+ +
AUROC.min
+

a numeric representing the minimum accuracy +obtained by grouping all the synthetic results by super-populations, for +the specified values of D and K.

+ +
AUROC
+

a numeric representing the accuracy obtained +by grouping all the synthetic results for the specified values of D +and K.

+ +
Accu.CM
+

a numeric representing the value of accuracy +of the confusion matrix obtained by grouping all the synthetic results for +the specified values of D and K.

+ + +
+ +
dfAUROC
+

a data.frame containing the summary of the results by +super-population. The data.frame contains +those columns:

D
+

a numeric representing the value of D (the +number of dimensions).

+ +
K
+

a numeric representing the value of K (the +number of neighbors).

+ +
Call
+

a character string representing the +super-population.

+ +
L
+

a numeric representing the lower value of the 95% +confidence interval for the AUROC obtained for the fixed values of +super-population, D and K.

+ +
AUROC
+

a numeric representing the AUROC obtained for +the fixed values of super-population, D and K.

+ +
H
+

a numeric representing the higher value of the 95% +confidence interval for the AUROC obtained for the fixed values of +super-population, D and K.

+ + +
+ +
D
+

a numeric representing the optimal D value +(the number of dimensions) for the specific profile.

+ +
K
+

a numeric representing the optimal K value +(the number of neighbors) for the specific profile.

+ +
listD
+

a numeric representing the optimal D +values (the number of dimensions) for the specific profile. More than one +D is possible.

+ + +
+ +
KNNSample
+

a data.frame containing the inferred +ancestry for different values of K and D. The +data.frame contains those columns:

sample.id
+

a character string representing the unique +identifier of the current profile.

+ +
D
+

a numeric representing the value of D (the +number of dimensions) used to infer the ancestry.

+ +
K
+

a numeric representing the value of K (the +number of neighbors) used to infer the ancestry.

+ +
SuperPop
+

a character string representing the inferred +ancestry for the specified D and K values.

+ + +
+ +
KNNSynthetic
+

a data.frame containing the inferred +ancestry for each synthetic data for different values of K and +D. +The data.frame +contains those columns:

sample.id
+

a character string representing the unique +identifier of the current synthetic data.

+ +
D
+

a numeric representing the value of D (the +number of dimensions) used to infer the ancestry.

+ +
K
+

a numeric representing the value of K (the +number of neighbors) used to infer the ancestry.

+ +
infer.superPop
+

a character string representing the +inferred ancestry for the specified D and K values.

+ +
ref.superPop
+

a character string representing the known +ancestry from the reference

+ + +
+ +
Ancestry
+

a data.frame containing the inferred +ancestry for the current profile. The data.frame contains those +columns:

sample.id
+

a character string representing the unique +identifier of the current profile.

+ +
D
+

a numeric representing the value of D (the +number of dimensions) used to infer the ancestry.

+ +
K
+

a numeric representing the value of K (the +number of neighbors) used to infer the ancestry.

+ +
SuperPop
+

a character string representing the inferred +ancestry.

+ + +
+ + +

Details

The runExomeAncestry() function generates 3 types of files diff --git a/reference/profileAncestry.html b/reference/profileAncestry.html index 3114775b..c59c6b72 100644 --- a/reference/profileAncestry.html +++ b/reference/profileAncestry.html @@ -385,11 +385,6 @@

Value

-
-

Details

-

The profileAncestry() generates list list -TODO update the description

-

References

Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ, diff --git a/reference/pruningSample.html b/reference/pruningSample.html index 607a8234..9ce0fc07 100644 --- a/reference/pruningSample.html +++ b/reference/pruningSample.html @@ -260,7 +260,7 @@

Examples

## The 'pruned.study' entry should be present content <- openfn.gds(profileFile) content -#> File: /tmp/RtmpZ7wD2p/ex1.gds (4.3K) +#> File: /tmp/RtmpQxo2h7/ex1.gds (4.3K) #> + [ ] #> |--+ Ref.count { SparseInt16 11000x1, 568B } #> |--+ Alt.count { SparseInt16 11000x1, 74B } diff --git a/reference/wrapperAncestry.html b/reference/wrapperAncestry.html index ce655902..eaea62dd 100644 --- a/reference/wrapperAncestry.html +++ b/reference/wrapperAncestry.html @@ -398,11 +398,6 @@

Value

-
-

Details

-

The runWrapperAncestry() generates list list -TODO update the description

-

References

Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ,