diff --git a/DESCRIPTION b/DESCRIPTION index 965e9c8..c6e678d 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: margheRita Title: Mass spectrometry-based metabolomics data analysis -Version: 0.2.6 -Date: 2024-06-19 +Version: 0.2.7 +Date: 2024-06-28 Authors@R: c(person(given = "Ettore", family = "Mosca", role = c("aut", "cre"), email = "ettore.mosca@itb.cnr.it", comment = c(ORCID = "0000-0002-3102-5150")), person(given = "Zahrasadat", family = "Aalavikakhki", role = "aut", email = "zahrasadat.alavikakhki@studenti.unimi.it"), diff --git a/NEWS b/NEWS index 73a6098..c5068f3 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,10 @@ -03 May 2025, v0.2.5 +28 June 2024, v0.2.7 +-- addition of citation and link to zenodo dataset + +19 June 2024, v0.2.6 +-- various minor improvements in code, function documentation and vignettes + +03 May 2024, v0.2.5 -- various minor improvements in code, function documentation and vignettes 29 March 2024, v0.2.3 diff --git a/README.md b/README.md index 3dcdfd7..8d0cb05 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,7 @@ Documentation: https://emosca-cnr.github.io/margheRita Source code: https://github.com/emosca-cnr/margheRita -Citation: ... +Citation: Ettore Mosca, Marynka Ulaszewska, Zahrasadat Alavikakhki, Edoardo Niccolò Bellini, Valeria Mannella, Gianfranco Frigerio, Denise Drago, Annapaola Andolfo. MargheRita: an R package for LC-MS/MS SWATH metabolomics data analysis and confident metabolite identification based on a spectral library of reference standards. bioRxiv 2024.06.20.599545; doi: https://doi.org/10.1101/2024.06.20.599545 Contacts: @@ -27,17 +27,5 @@ Contacts: # Installation -The package requires a series of other R packages, which are availble in [CRAN](https://cran.r-project.org/), [Bioconductor](https://www.bioconductor.org/) and [github](https://github.com/). 
In most of the cases, the following instructions guarantee that all such dependencies are installed: - -```{r, eval=FALSE} -install.packages("devtools") -devtools::install_github(c("pcastellanoescuder/POMA", "antonvsdata/notame")) - -if (!require("BiocManager", quietly = TRUE)){ - install.packages("BiocManager") -} -BiocManager::install(c("clusterProfiler", "pcaMethods")) - -devtools::install_github("emosca-cnr/margheRita", dependencies = T) -``` +See documentation at https://emosca-cnr.github.io/margheRita diff --git a/docs/404.html b/docs/404.html index 69fed3c..d415d0c 100644 --- a/docs/404.html +++ b/docs/404.html @@ -32,7 +32,7 @@
diff --git a/docs/articles/index.html b/docs/articles/index.html index 3003e26..5d71db3 100644 --- a/docs/articles/index.html +++ b/docs/articles/index.html @@ -17,7 +17,7 @@ diff --git a/docs/articles/margheRita.html b/docs/articles/margheRita.html index d28daa4..c78c4aa 100644 --- a/docs/articles/margheRita.html +++ b/docs/articles/margheRita.html @@ -33,7 +33,7 @@ @@ -105,6 +105,13 @@Citation: Ettore Mosca, Marynka Ulaszewska, Zahrasadat Alavikakhki, +Edoardo Niccolò Bellini, Valeria Mannella, Gianfranco Frigerio, Denise +Drago, Annapaola Andolfo. MargheRita: an R package for LC-MS/MS SWATH +metabolomics data analysis and confident metabolite identification based +on a spectral library of reference standards. bioRxiv 2024.06.20.599545; +doi: https://doi.org/10.1101/2024.06.20.599545
Contacts:
The package requires a series of other R packages, which are availble -in CRAN, Bioconductor or github, namely:
+The package requires a series of other R packages, which are +available in CRAN, Bioconductor or github, namely:
## graphics, grDevices, stats, utils, clusterProfiler, pcaMethods, ComplexHeatmap, LSD, plotrix, pals, Hmisc, notame, Biobase, openxlsx, devtools
In most of the cases, the following instructions guarantee that all such dependencies are installed:
@@ -251,6 +258,14 @@The full version of the “Urine” dataset, which was used for +margheRita assessment and to generate this documentation, is available +at https://doi.org/10.5281/zenodo.11243781, files +“Urine_RP_NEG_norm.txt” and “Urine_RP_POS_norm.txt”. The corresponding +sample information files can be accessed as follows:
+
+sample_file_NEG <- system.file("extdata", "Urine_RP_NEG_norm_metadata.txt", package = "margheRita")
+sample_file_POS <- system.file("extdata", "Urine_RP_POS_norm_metadata.txt", package = "margheRita")
+ms <- as.metaboset(mRList)
## MetaboSet object with 303 features and 253 samples. ## 10 QC samples included @@ -271,7 +286,7 @@
Inter-operability## The object has the following parts (splits): ## FALSE: features
or as “PomaSummarizedExperiment” object, used by package “POMA” (Castellano-Escuder et al. 2021):
-diff --git a/docs/reference/calc_ppm_err.html b/docs/reference/calc_ppm_err.html index aced8cc..7772897 100644 --- a/docs/reference/calc_ppm_err.html +++ b/docs/reference/calc_ppm_err.html @@ -17,7 +17,7 @@+se <- as.PomaSummarizedExperiment(mRList)
## class: SummarizedExperiment ## dim: 303 253 @@ -293,12 +308,12 @@
Filtering, imputation and normal
The function
-filtering()
runs filters to exclude features/sample with many missing values, features with wrong m/z values and, lastly, performs imputation of missing values:+<- filtering(mL) - mL # Samples with >= 100 metabolites 243 / 243 -# Features occurring in >= 3 samples 604 / 604 -# Features with appropriate m/z values: 548 -# Features without appropriate m/z values: 56 -: imputation not performed. No NAs
<- filtering(mL) + mL # Samples with >= 100 metabolites 243 / 243 +# Features occurring in >= 3 samples 604 / 604 +# Features with appropriate m/z values: 548 +# Features without appropriate m/z values: 56 +: imputation not performed. No NAs
These three steps can be called independently through the function
filter_NA()
,m_z_filtering()
andimputation()
, respectively. In particular, @@ -309,7 +324,7 @@Filtering, imputation and normal
The function
-heatscatter_chromatography()
creates a graphic overview of the mz and rt values in the dataset:diff --git a/docs/reference/calc_RI.html b/docs/reference/calc_RI.html index d2b0455..20c0a4e 100644 --- a/docs/reference/calc_RI.html +++ b/docs/reference/calc_RI.html @@ -17,7 +17,7 @@+margheRita provides three ways for normalizing metabolite @@ -324,22 +339,22 @@
Filtering, imputation and normal function
calc_reference()
sets up such column using average metabolite values and medians of QC samples. For example, here’s a call tonormalize_profiles()
using pqn:
-mL_norm <- normalize_profiles(mL, method = "pqn")
-PQN normalization
-No reference profile found, using calc_reference() function...
-Using QC...
+mL_norm <- normalize_profiles(mL, method = "pqn")
+PQN normalization
+No reference profile found, using calc_reference() function...
+Using QC...
The comparison of the coefficient of variation of a metabolite in relation to QC samples provides a means to exclude low-quality features. In particular, only features that have a CV ratio between non-QC samples and QC samples higher than a given threshold (by default 1) are kept:
-mL_norm <- CV_ratio(mRList = mL_norm)
-Summary of CV ratio (samples / QC):
-   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
- 0.3593  0.8025  1.1032  1.4645  1.6516 13.4896
-# Metabolites with appropriate CV 303 / 539
+mL_norm <- CV_ratio(mRList = mL_norm)
+Summary of CV ratio (samples / QC):
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
+ 0.3593  0.8025  1.1032  1.4645  1.6516 13.4896
+# Metabolites with appropriate CV 303 / 539
The distributions of metabolite relative log-abundances can be calculated and visualized by means of:
-diff --git a/docs/reference/as.metaboset.html b/docs/reference/as.metaboset.html index 47dfc31..058c330 100644 --- a/docs/reference/as.metaboset.html +++ b/docs/reference/as.metaboset.html @@ -17,7 +17,7 @@+mL <- RLA(mRList = mL)
Typically, after normalization, the various samples should have similar distributions of relative log-abundances.
@@ -353,7 +368,7 @@Principal Component Analysis
+mL_norm <- mR_pca(mRList = mL_norm, nPcs=5, scaling="uv", include_QC=FALSE)
The results are added to the mRList in the element
pca
. It also provides some plots, like the visualization of distribution of @@ -363,7 +378,7 @@Principal Component AnalysisPlot2DPCA() function. The argument
col_by
enables the choice of themRList$sample_ann
column to be used to color samples: -@@ -376,7 +391,7 @@+Plot2DPCA(mRList = mL_norm, pcx=1, pcy=2, col_by="class", include_QC=TRUE)
Removing samplesmRList. Here, for example we remove all “Blank” samples: -
diff --git a/docs/reference/as.PomaSummarizedExperiment.html b/docs/reference/as.PomaSummarizedExperiment.html index 720ec69..629faf3 100644 --- a/docs/reference/as.PomaSummarizedExperiment.html +++ b/docs/reference/as.PomaSummarizedExperiment.html @@ -17,7 +17,7 @@+mL <- remove_samples(mRList = mL, ids = "Blank", column = "class")
In this case, the function removes all samples with value “Blank” in the column “class” of sample annotation.
@@ -387,7 +402,7 @@
Collapsing technical replicates
The mean metabolite abundance of every biological replicate is calculated by means of the
collapse_tech_rep()
function: -++mL_norm_bio <- collapse_tech_rep(mRList = mL_norm)
## AA_mealA_t00 AA_mealA_t01 AA_mealA_t02 AA_mealA_t03 AA_mealA_t04 ## F506 372.5212 314.9164 641.2731 328.3019 183.8177 @@ -413,16 +428,16 @@
Statistical analysis
mean_median_stdev_samples(mL_norm_bio) - - According to dataset size, this might take a few minutes. - Calculating means... - Calculating medians... Calculating standard deviations...
mean_median_stdev_samples(mL_norm_bio) + + According to dataset size, this might take a few minutes. + Calculating means... + Calculating medians... Calculating standard deviations...
The function
-univariate()
performs dataset-wide statistical tests (Student t-tests, Wilcoxon test, Anova and Kruskal-Wallis test) between levels of a particular factor defined in the sample annotation:@@ -485,7 +500,7 @@+mL_norm_bio <- univariate(mL_norm_bio, test_method="anova", exp.levels = c("AA", "DD", "MM"), exp.factor = "class")
## F p q DD-AA MM-AA ## F506 56.603887 6.500895e-16 9.379863e-15 0.000000e+00 0.000000e+00 @@ -442,7 +457,7 @@
Statistical analysis
+significant_features <- select_sign_features(mL_norm_bio, test_method="anova", test_value = "q", cutoff_value = 0.05)
## [1] "F3957" "F18426" "F19199" "F10248" "F9507" "F958"
Metabolite identification
In this example, we load the margheRita library in positive mode with retention times of RPShort columns and we discard all peaks with relative intensity less than 10: -
diff --git a/docs/reference/annotate_univariate_results.html b/docs/reference/annotate_univariate_results.html index c7ea659..b644c4e 100644 --- a/docs/reference/annotate_univariate_results.html +++ b/docs/reference/annotate_univariate_results.html @@ -17,7 +17,7 @@+mR_library <- select_library(column = "RPShort", mode = "POS", accept_RI=10)
The resulting
@@ -537,7 +552,7 @@mR_library
is a list that contains information about precursorsMetabolite identificationfeatures specifies the features to be considered (all features if it is left
features=NULL
, as in the following example): -diff --git a/docs/reference/RLA.html b/docs/reference/RLA.html index 2e198e9..b649c11 100644 --- a/docs/reference/RLA.html +++ b/docs/reference/RLA.html @@ -17,7 +17,7 @@+mL_norm_bio <- metabolite_identification(mL_norm_bio, library_list = mR_library)
The function
metabolite_identification()
has a series of parameters that can be adjusted to optimize the identification process @@ -627,13 +642,13 @@Metabolite identification
The spectra from all the features that match a metabolite can be inspected creating the following plot through:
-diff --git a/docs/reference/Plot2DPCA.html b/docs/reference/Plot2DPCA.html index 87950a0..e8b3882 100644 --- a/docs/reference/Plot2DPCA.html +++ b/docs/reference/Plot2DPCA.html @@ -17,7 +17,7 @@+visualize_associated_spectra(mRList = mL_norm_bio, mR_library = mR_library, metabolite_id = "L1660")
The function
-h_map_MSMS_comparison()
draws heatmaps to visually compare ppm errors and RI differences between feature and metabolite spectra:@@ -646,7 +661,7 @@+h_map_MSMS_comparison(mL_norm_bio, metab_id = "L1660", feature_id = "F10165")
Retrieving data
feature_stats
should be the name of any statistical test saved in themRList
or a custom data frame with Feature_ID as row names: -@@ -655,14 +670,14 @@+metab_stat <- annotate_univariate_results(mRList = mL_norm_bio, feature_stats = "anova")
The resulting data.frame is saved to file “data_stats_ann.csv”.
Metabolite abundance visualization
The function
-metab_boxplot()
draws boxplots of feature abundances grouped by the levels of a given factor:diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml index 294e61e..32e4165 100644 --- a/docs/pkgdown.yml +++ b/docs/pkgdown.yml @@ -3,5 +3,5 @@ pkgdown: 2.0.7 pkgdown_sha: ~ articles: margheRita: margheRita.html -last_built: 2024-06-19T14:05Z +last_built: 2024-06-28T16:15Z diff --git a/docs/reference/CV_ratio.html b/docs/reference/CV_ratio.html index 1232bf7..83b709e 100644 --- a/docs/reference/CV_ratio.html +++ b/docs/reference/CV_ratio.html @@ -21,7 +21,7 @@+metab_boxplot(mRList = mL_norm_bio, col_by="class", group="class", features = "F3957")
The function
-h_map()
provides heatmaps based on package ComplexHeatmap (Gu, Eils, and Schlesner 2016). Here we show the abundance of the most significant metabolites according to anova test:@@ -75,7 +75,7 @@+significant_features <- select_sign_features(mL_norm_bio, test_method="anova", test_value = "q", cutoff_value = 10e-10, feature_id = "Name") h_map(mL_norm_bio, scale_features=TRUE, features = significant_features, show_column_names=F, data.use = "data_ann")
Note that we extracted metabolite “Name” as feature_id and used @@ -682,13 +697,13 @@
Pathway analysis -
diff --git a/docs/index.html b/docs/index.html index 8ea8a41..c0467ee 100644 --- a/docs/index.html +++ b/docs/index.html @@ -33,7 +33,7 @@+significant_features <- select_sign_features(mRList = mL_norm_bio, test_method="anova", test_value = "q", cutoff_value = 10e-10, feature_id = "PubChemCID") all_PubChemCID <- unique(mL_norm_bio$metab_ann$PubChemCID[!is.na(mL_norm_bio$metab_ann$PubChemCID)]) pa_res <- pathway_analysis(in_list = significant_features, type = "ora", universe = all_PubChemCID)
In the case of MSEA, the input is a named vector of scores for all PubChemCIDs in the dataset, ranked in decreasing order of importance:
-@@ -73,13 +73,16 @@+diff --git a/docs/authors.html b/docs/authors.html index c0c1751..ea1d802 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -17,7 +17,7 @@ranked_vector <- select_sign_features(mRList = mL_norm_bio, test_method="anova", test_value = "q", cutoff_value = Inf, feature_id = "PubChemCID", values = TRUE) ranked_vector <- sort(-log10(ranked_vector), decreasing = T) msea_res <- pathway_analysis(in_list = ranked_vector, type = "msea")
Citation
Mosca E, Ulaszewska M, Bellini EN, Alavikakhki Z, Frigerio G, Drago D, Mannella V, Andolfo A (2024). “MargheRita: an R package for LC-MS/MS SWATH metabolomics data analysis and confident metabolite identification based on a spectral library of reference standards.” -R package version 0.2.3. +bioRxiv. +doi: 10.1101/2024.06.20.599545.
-@Misc{, +@Article{, title = {MargheRita: an R package for LC-MS/MS SWATH metabolomics data analysis and confident metabolite identification based on a spectral library of reference standards}, author = {Ettore Mosca and Marynka Ulaszewska and Edoardo Niccolò Bellini and Zahrasadat Alavikakhki and Gianfranco Frigerio and Denise Drago and Valeria Mannella and Annapaola Andolfo}, + journal = {bioRxiv}, year = {2024}, - note = {R package version 0.2.3}, + publisher = {Cold Spring Harbor Laboratory}, + doi = {10.1101/2024.06.20.599545}, }Documentation: https://emosca-cnr.github.io/margheRita
Source code: https://github.com/emosca-cnr/margheRita
-Citation: …
+Citation: Ettore Mosca, Marynka Ulaszewska, Zahrasadat Alavikakhki, Edoardo Niccolò Bellini, Valeria Mannella, Gianfranco Frigerio, Denise Drago, Annapaola Andolfo. MargheRita: an R package for LC-MS/MS SWATH metabolomics data analysis and confident metabolite identification based on a spectral library of reference standards. bioRxiv 2024.06.20.599545; doi: https://doi.org/10.1101/2024.06.20.599545
Contacts:
- @@ -87,10 +87,7 @@
Installation
-The package requires a series of other R packages, which are availble in CRAN, Bioconductor and github. In most of the cases, the following instructions guarantee that all such dependencies are installed:
-```{r, eval=FALSE} install.packages(“devtools”) devtools::install_github(c(“pcastellanoescuder/POMA”, “antonvsdata/notame”))
-if (!require(“BiocManager”, quietly = TRUE)){ install.packages(“BiocManager”) } BiocManager::install(c(“clusterProfiler”, “pcaMethods”))
-devtools::install_github(“emosca-cnr/margheRita”, dependencies = T) ```
+See documentation at https://emosca-cnr.github.io/margheRita