Reworking PEP to prevent cross-contamination #2394

Merged on Aug 28, 2024 (118 commits)
Changes from 112 commits
Commits
3edb3c6
new style computation of pep q-value
trishorts Jun 18, 2024
236045a
fixed unit tests
trishorts Jun 18, 2024
ca02738
separate PsmFdrInfo and PeptideFdrInfo calculations in FdrAnalysisEngine
trishorts Jun 19, 2024
e370ffd
d
trishorts Jun 19, 2024
9d90658
fdh
trishorts Jun 19, 2024
2bbf23b
not working yet
trishorts Jun 25, 2024
4562eb1
maybe better maybe not
trishorts Jun 26, 2024
e3566fa
huh
trishorts Jun 27, 2024
20cba5c
tr
trishorts Jul 1, 2024
98affe0
j
trishorts Jul 3, 2024
daa4afa
53
trishorts Jul 3, 2024
977d5e3
Fixed filtering kinda
Alexander-Sol Jul 17, 2024
271bb12
commit before i start breaking things
Alexander-Sol Jul 17, 2024
8d72e2e
still sorta broken
Alexander-Sol Jul 17, 2024
60f1767
idk
Alexander-Sol Jul 17, 2024
650a613
Fixed most issues, moved filtering to MetaMorpheus Task
Alexander-Sol Jul 17, 2024
62a9da6
fix multiprotease unit test
trishorts Jul 18, 2024
956a456
fix MakeSureFdrDoesntSkip
trishorts Jul 18, 2024
f87b6fa
fix TestPeptideCount
trishorts Jul 18, 2024
7d774aa
new postsearchanalysistask results generator
trishorts Jul 19, 2024
ae5141b
fixed results output in postsearchanalysistask
trishorts Jul 19, 2024
0369aad
yert
trishorts Jul 19, 2024
c2ca1b7
fix pep q-value calc
trishorts Jul 22, 2024
792d154
fix peptideFdrTest
trishorts Jul 22, 2024
dce2647
fix spectral recovery
trishorts Jul 22, 2024
d78f405
ity
trishorts Jul 23, 2024
a5becb6
fix semi specific test
trishorts Jul 23, 2024
cc38ed8
fix metadraw test
trishorts Jul 23, 2024
d62eeea
lkah
trishorts Jul 23, 2024
e809b7a
poiu
trishorts Jul 23, 2024
7162eda
slice test fixed
trishorts Jul 23, 2024
d97aa94
merge upstream
trishorts Jul 23, 2024
d34d284
some tests
trishorts Jul 24, 2024
4cd0e87
some testst
trishorts Jul 24, 2024
c8507e4
fixed most of silac unit tests
trishorts Jul 24, 2024
bfc3078
hmm
trishorts Jul 24, 2024
59db49a
dsg
trishorts Jul 24, 2024
87c44fd
uio
trishorts Jul 24, 2024
e9512c5
kjg
trishorts Jul 24, 2024
7bc1734
ghk
trishorts Jul 24, 2024
adb8fdb
Fixed the few remaining tests that were breaking
Alexander-Sol Jul 24, 2024
36f80c5
Five tests breaking, mostly numbers
Alexander-Sol Jul 25, 2024
4e17df4
Fixed results.txt writer for PEP-Q-values.
Alexander-Sol Jul 25, 2024
6d78810
Fixed output bug
Alexander-Sol Jul 25, 2024
229bef3
idk
Alexander-Sol Jul 25, 2024
eb33c9f
broken
Alexander-Sol Jul 26, 2024
61762bc
Finally fixed!!!
Alexander-Sol Jul 26, 2024
037362c
No longer duplicate Peptides when creating training data
Alexander-Sol Jul 26, 2024
b240e2f
PEP Dictionaries are now constructed inside a function
Alexander-Sol Jul 26, 2024
4bd13a0
Peptide groups implemented succesfully
Alexander-Sol Jul 26, 2024
ce17fe1
update csproj
trishorts Jul 29, 2024
3ce6c53
nuget update
trishorts Jul 29, 2024
eb69623
Merge branch 'master' into ShortreedPep3
Alexander-Sol Jul 29, 2024
7964dc8
lets start here
trishorts Jul 29, 2024
7ef8443
idk
Alexander-Sol Jul 29, 2024
961164b
it ran bro
trishorts Jul 29, 2024
95b3135
Added QValueThresholdForPEP to common params
Alexander-Sol Jul 29, 2024
38f6779
fsad
trishorts Jul 29, 2024
b89012d
remove ostensibly unused dlls
trishorts Jul 29, 2024
82449d5
bouncy castle linked to itext for writing pdf
trishorts Jul 29, 2024
12177b7
Added filterType enum
Alexander-Sol Jul 29, 2024
e062793
fixed issues
Alexander-Sol Jul 29, 2024
5594d03
Reduplicated PWSMs
Alexander-Sol Jul 29, 2024
d6c47b3
xyz
Alexander-Sol Jul 29, 2024
0df49cd
Don't train on ambiguous
Alexander-Sol Jul 29, 2024
50dfb35
It's finally working
Alexander-Sol Jul 30, 2024
a97f853
Fixing tests through reflection
Alexander-Sol Jul 30, 2024
035821b
Fixed the last of the tests
Alexander-Sol Jul 30, 2024
992d0d0
PostSearchAnalysisTaskTest fix
Alexander-Sol Jul 30, 2024
be7295d
unused using
trishorts Jul 30, 2024
e2bee66
more unused usings
trishorts Jul 30, 2024
e1ec392
All tests passing
Alexander-Sol Jul 30, 2024
068d541
close spectrum library connection
trishorts Jul 30, 2024
f20ba24
Fixed tests, fixed bug where decoys weren't partitioned correctly
Alexander-Sol Jul 30, 2024
d61af12
idk
Alexander-Sol Jul 31, 2024
70e375b
Addressed Nic's comments
Alexander-Sol Jul 31, 2024
f8871e1
Fixed conflicts
Alexander-Sol Jul 31, 2024
719e557
no longer delete decoys identical to targets
Alexander-Sol Jul 31, 2024
1f0b1b4
Fixed tests that broke when addressing Nic's comments
Alexander-Sol Jul 31, 2024
a965b04
Made fields in FilteredPsms more explicit
Alexander-Sol Aug 1, 2024
2c04b01
Fixed merge conflicts
Alexander-Sol Aug 1, 2024
a929f71
Reverted change where decoys matching targets were no longer removed
Alexander-Sol Aug 1, 2024
9c4a31a
actually fixed merge conflicts
Alexander-Sol Aug 1, 2024
16855d6
Increased QValue cutoff for calibrating PSMs to 0.005
Alexander-Sol Aug 1, 2024
63fc94b
Bumped q-value requirement
Alexander-Sol Aug 1, 2024
e8dfe2f
Merge branch 'CalibrationTweak' into Pep2
Alexander-Sol Aug 1, 2024
d4e433f
commented out decoy removal
Alexander-Sol Aug 1, 2024
493f50b
added decoy sanitizing to MetaMorpheus task. Not sure why tests are b…
Alexander-Sol Aug 2, 2024
8a4492a
idk
Alexander-Sol Aug 2, 2024
4d33ab3
Merge branch 'DecoyHomology3' into Pep3
Alexander-Sol Aug 2, 2024
fe14927
idk
Alexander-Sol Aug 2, 2024
b59752f
minpr
Alexander-Sol Aug 3, 2024
6b4e2eb
Fixed merge conflict
Alexander-Sol Aug 3, 2024
40430e3
Fixed merge conflicts
Alexander-Sol Aug 6, 2024
62d5b4a
Fixed merge conflicts
Alexander-Sol Aug 6, 2024
2ab51ad
Updated nuget package, fixed one test
Alexander-Sol Aug 6, 2024
5427698
Fixed merge conflicts
Alexander-Sol Aug 6, 2024
9290e38
Squashed bugs
Alexander-Sol Aug 6, 2024
6e4f428
All tests are passing
Alexander-Sol Aug 6, 2024
0fce82f
Added tests for XL PEP, made XL PEP actually work
Alexander-Sol Aug 6, 2024
b79a01c
Merge branch 'master' into Pep3
Alexander-Sol Aug 6, 2024
309dc8b
Fixed XL PEP issues
Alexander-Sol Aug 7, 2024
9edd533
Merge branch 'Pep3' of https://github.com/Alexander-Sol/MetaMorpheus …
Alexander-Sol Aug 7, 2024
7a09c51
addressed PR comments
Alexander-Sol Aug 19, 2024
af01a2e
Fixed merge conflicts
Alexander-Sol Aug 19, 2024
1a4e3b3
Fixed broken tests
Alexander-Sol Aug 19, 2024
7cde914
Adjusted number for XL test
Alexander-Sol Aug 19, 2024
5232af6
Merge branch 'master' into Pep3
trishorts Aug 23, 2024
90c5283
Merge branch 'master' into Pep3
trishorts Aug 23, 2024
ed9f1c0
Deleted B
Alexander-Sol Aug 27, 2024
41f1011
Merge branch 'Pep3' of https://github.com/Alexander-Sol/MetaMorpheus …
Alexander-Sol Aug 27, 2024
6359c3d
Merge branch 'master' into Pep3
trishorts Aug 27, 2024
4ad908c
TODO comments
Alexander-Sol Aug 27, 2024
3bf40b7
Merge branch 'Pep3' of https://github.com/Alexander-Sol/MetaMorpheus …
Alexander-Sol Aug 27, 2024
dc110ba
Merge branch 'master' into Pep3
Alexander-Sol Aug 27, 2024
0abd4e6
fiddling with spectral lib test
Alexander-Sol Aug 27, 2024
10c9e78
Even more changes to SpectralWriterTest
Alexander-Sol Aug 27, 2024
78c469f
added one second delay to SpectralWriterTest
Alexander-Sol Aug 27, 2024
2 changes: 1 addition & 1 deletion MetaMorpheus/EngineLayer/CommonParameters.cs
@@ -163,7 +163,7 @@ public int DeconvolutionMaxAssumedChargeState
         /// This parameter determines which PSMs/Peptides will be used as positive training examples
         /// when training the GBDT model for PEP.
         /// </summary>
-        public double QValueCutoffForPepCalculation { get; private set; }
+        public double QValueCutoffForPepCalculation { get; set; }
         public DigestionParams DigestionParams { get; private set; }
         public bool ReportAllAmbiguity { get; private set; }
         public int? NumberOfPeaksToKeepPerWindow { get; private set; }
26 changes: 17 additions & 9 deletions MetaMorpheus/EngineLayer/FdrAnalysis/FdrAnalysisEngine.cs
@@ -1,4 +1,5 @@
-using System;
+using EngineLayer.CrosslinkSearch;
+using System;
 using System.Collections.Generic;
 using System.IO;
 using System.Linq;
@@ -275,18 +276,25 @@ public static void PepQValueInverted(List<SpectralMatch> psms, bool peptideLevel

         public void Compute_PEPValue(FdrAnalysisResults myAnalysisResults, List<SpectralMatch> psms)
         {
-            if (psms[0].DigestionParams.Protease.Name == "top-down")
+            string searchType;
+            // Currently, searches of mixed data (bottom-up + top-down) are not supported
+            // PEP will be calculated based on the search type of the first file/PSM in the list, which isn't ideal
+            // This will be addressed in a future release
+            switch(psms[0].DigestionParams.Protease.Name)
             {
-                myAnalysisResults.BinarySearchTreeMetrics = PEP_Analysis_Cross_Validation.ComputePEPValuesForAllPSMsGeneric(psms, "top-down", this.FileSpecificParameters, this.OutputFolder);
+                case "top-down":
+                    searchType = "top-down";
+                    break;
+                default:
+                    searchType = "standard";
+                    break;
             }
-            else if (psms[0].DigestionParams.Protease.Name == "crosslink")
+            if (psms[0] is CrosslinkSpectralMatch)
             {
-                myAnalysisResults.BinarySearchTreeMetrics = PEP_Analysis_Cross_Validation.ComputePEPValuesForAllPSMsGeneric(psms, "crosslink", this.FileSpecificParameters, this.OutputFolder);
-            }
-            else
-            {
-                myAnalysisResults.BinarySearchTreeMetrics = PEP_Analysis_Cross_Validation.ComputePEPValuesForAllPSMsGeneric(psms, "standard", this.FileSpecificParameters, this.OutputFolder);
+                searchType = "crosslink";
             }
+            myAnalysisResults.BinarySearchTreeMetrics = new PepAnalysisEngine(psms, searchType, FileSpecificParameters, OutputFolder).ComputePEPValuesForAllPSMs();
+
         }

         /// <summary>

Large diffs are not rendered by default.

68 changes: 68 additions & 0 deletions MetaMorpheus/EngineLayer/FdrAnalysis/PeptideMatchGroup.cs
@@ -0,0 +1,68 @@
+using Omics;
+using Proteomics.ProteolyticDigestion;
+using System;
+using System.Collections;
+using System.Collections.Generic;
+using System.Linq;
+using System.Text;
+using System.Threading.Tasks;
+
+namespace EngineLayer
+{
+    public class PeptideMatchGroup : IEnumerable<SpectralMatch>
+    {
+        public string PeptideFullSequence { get; }
+        public List<SpectralMatch> SpectralMatches { get; }
+
+        /// <summary>
+        /// This class groups all spectral matches associated with a given peptide together,
+        /// to facilitate the calculation of PEP values.
+        /// </summary>
+        /// <param name="fullPeptideSeq"> The full sequence to be used for grouping</param>
+        /// <param name="spectralMatches"> Every spectral match that matches the full sequence</param>
+        public PeptideMatchGroup(string fullPeptideSeq, List<SpectralMatch> spectralMatches)
+        {
+            PeptideFullSequence = fullPeptideSeq;
+            SpectralMatches = spectralMatches;
+        }
+
+        public static List<PeptideMatchGroup> GroupByBaseSequence(List<SpectralMatch> spectralMatches)
+        {
+            // This groups psms by base sequence, ensuring that PSMs with the same base sequence but different modifications are grouped together when training.
+            return spectralMatches.GroupBy(p => p.BaseSequence)
+                .Select(group => new PeptideMatchGroup(group.Key, group.ToList()))
+                .OrderByDescending(matchGroup => matchGroup.Count())
+                .ThenByDescending(matchGroup => matchGroup.BestMatch.Score)
+                .ToList();
+        }
+
+        public IEnumerable<SpectralMatch> GetBestMatchByMod()
+        {
+            return SpectralMatches.GroupBy(p => p.FullSequence).Select(g => g.MaxBy(p => p));
+        }
+
+        /// <summary>
+        /// This function is called if there aren't enough peptides to train at the peptide level
+        /// </summary>
+        /// <param name="spectralMatches"></param>
+        /// <returns></returns>
+        public static List<PeptideMatchGroup> GroupByIndividualPsm(List<SpectralMatch> spectralMatches)
+        {
+            return spectralMatches.Select(psm => new PeptideMatchGroup(psm.FullSequence, new List<SpectralMatch> { psm }))
+                .ToList();
+        }
+
+        public SpectralMatch BestMatch => SpectralMatches.MaxBy(match => match);
+
+        public IEnumerator<SpectralMatch> GetEnumerator()
+        {
+            return SpectralMatches.GetEnumerator();
+        }
+
+        IEnumerator IEnumerable.GetEnumerator()
+        {
+            return GetEnumerator();
+        }
+
+    }
+}
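The comment in `GroupByBaseSequence` says that PSMs sharing a base sequence are kept together when training. The following is a minimal sketch of why that matters, not the PR's implementation: `PsmStub` is a hypothetical stand-in for `SpectralMatch`, and the round-robin fold assignment in `AssignFolds` is an assumption (the `PepAnalysisEngine` diff is not rendered on this page). It only illustrates that dealing out whole groups keeps every modified form of a peptide in a single fold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SpectralMatch: only the fields this sketch needs.
public record PsmStub(string BaseSequence, string FullSequence, double Score);

public static class GroupingSketch
{
    // Group by base sequence; largest groups first, ties broken by the best score in the group.
    public static List<List<PsmStub>> GroupByBaseSequence(IEnumerable<PsmStub> matches)
    {
        return matches.GroupBy(m => m.BaseSequence)
            .Select(g => g.ToList())
            .OrderByDescending(g => g.Count)
            .ThenByDescending(g => g.Max(m => m.Score))
            .ToList();
    }

    // Assumed fold assignment (not from the PR): deal whole groups out round-robin,
    // so no base sequence is ever split across two folds.
    public static List<PsmStub>[] AssignFolds(List<List<PsmStub>> groups, int foldCount)
    {
        var folds = new List<PsmStub>[foldCount];
        for (int f = 0; f < foldCount; f++) folds[f] = new List<PsmStub>();
        for (int i = 0; i < groups.Count; i++) folds[i % foldCount].AddRange(groups[i]);
        return folds;
    }

    public static void Main()
    {
        var matches = new List<PsmStub>
        {
            new PsmStub("PEPTIDE", "PEPTIDE", 10),
            new PsmStub("PEPTIDE", "PEPT[Phospho]IDE", 12),
            new PsmStub("SEQ", "SEQ", 20),
            new PsmStub("OTHER", "OTHER", 5),
        };
        var folds = AssignFolds(GroupByBaseSequence(matches), 2);
        for (int f = 0; f < folds.Length; f++)
        {
            Console.WriteLine($"fold {f}: {string.Join(", ", folds[f].Select(m => m.FullSequence))}");
        }
    }
}
```

Grouping by `FullSequence` instead would let the modified and unmodified forms of the same peptide land in different folds, which is the kind of leakage the PR title refers to as cross-contamination.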
2 changes: 1 addition & 1 deletion MetaMorpheus/EngineLayer/SpectralMatch.cs
@@ -197,7 +197,7 @@ public void ResolveAllAmbiguities()
             ModsChemicalFormula = PsmTsvWriter.Resolve(_BestMatchingBioPolymersWithSetMods.Select(b => b.Pwsm.AllModsOneIsNterminus.Select(c => (c.Value)))).ResolvedValue;
             Notch = PsmTsvWriter.Resolve(_BestMatchingBioPolymersWithSetMods.Select(b => b.Notch)).ResolvedValue;

-            // if the PSM matches a target and a decoy and they are the SAME SEQUENCE, remove the decoy
+            //if the PSM matches a target and a decoy and they are the SAME SEQUENCE, remove the decoy
             if (IsDecoy)
             {
                 bool removedPeptides = false;
23 changes: 17 additions & 6 deletions MetaMorpheus/TaskLayer/FilteredPsms.cs
@@ -8,6 +8,12 @@

 namespace TaskLayer
 {
+    public enum FilterType
+    {
+        QValue,
+        PepQValue
+    }
+
     /// <summary>
     /// Contains a filtered list of PSMs.
     /// All properties within this class are read-only, and should only be set on object construction
@@ -18,11 +24,11 @@ public class FilteredPsms : IEnumerable<SpectralMatch>
         /// <summary>
         /// Filter type can have only two values: "q-value" or "pep q-value"
         /// </summary>
-        public string FilterType { get; init; }
+        public FilterType FilterType { get; init; }
         public double FilterThreshold { get; init; }
         public bool FilteringNotPerformed { get; init; }
         public bool PeptideLevelFiltering { get; init; }
-        public FilteredPsms(List<SpectralMatch> filteredPsms, string filterType, double filterThreshold, bool filteringNotPerformed, bool peptideLevelFiltering)
+        public FilteredPsms(List<SpectralMatch> filteredPsms, FilterType filterType, double filterThreshold, bool filteringNotPerformed, bool peptideLevelFiltering)
         {
             FilteredPsmsList = filteredPsms;
             FilterType = filterType;
@@ -37,13 +43,18 @@ private bool AboveThreshold(SpectralMatch psm)

             switch (FilterType)
             {
-                case "pep q-value":
+                case FilterType.PepQValue:
                     return psm.GetFdrInfo(PeptideLevelFiltering).PEP_QValue <= FilterThreshold;
                 default:
                     return psm.GetFdrInfo(PeptideLevelFiltering).QValue <= FilterThreshold && psm.GetFdrInfo(PeptideLevelFiltering).QValueNotch <= FilterThreshold;
             }
         }

+        public string GetFilterTypeString()
+        {
+            return FilterType == FilterType.PepQValue ? "pep q-value" : "q-value";
+        }
+
         /// <summary>
         /// This method should only be called when filtered PSMs are modified for the purpose of SILAC analysis
         /// </summary>
@@ -87,7 +98,7 @@ public static FilteredPsms Filter(IEnumerable<SpectralMatch> psms,
             List<SpectralMatch> filteredPsms = new List<SpectralMatch>();

             // set the filter type
-            string filterType = "q-value";
+            FilterType filterType = FilterType.QValue;
             if (pepQValueThreshold < qValueThreshold)
             {
                 if (psms.Count() < 100)
@@ -97,13 +108,13 @@ public static FilteredPsms Filter(IEnumerable<SpectralMatch> psms,
                 }
                 else
                 {
-                    filterType = "pep q-value";
+                    filterType = FilterType.PepQValue;
                 }
             }

             if (!includeHighQValuePsms)
             {
-                filteredPsms = filterType.Equals("q-value")
+                filteredPsms = filterType.Equals(FilterType.QValue)
                     ? psms.Where(p => p.GetFdrInfo(filterAtPeptideLevel) != null
                         && p.GetFdrInfo(filterAtPeptideLevel).QValue <= filterThreshold
                         && p.GetFdrInfo(filterAtPeptideLevel).QValueNotch <= filterThreshold).ToList()
67 changes: 3 additions & 64 deletions MetaMorpheus/TaskLayer/MbrAnalysis/SpectralRecoveryRunner.cs
@@ -119,9 +119,8 @@ public static SpectralRecoveryResults RunSpectralRecoveryAlgorithm(
                 List<SpectralMatch> allPsms = parameters.AllPsms.
                     OrderByDescending(p => p).ToList();

-                AssignEstimatedPsmQvalue(bestMbrMatches, allPsms);
                 FDRAnalysisOfMbrPsms(bestMbrMatches, allPsms, parameters, fileSpecificParameters);
-                AssignEstimatedPsmPepQValue(bestMbrMatches, allPsms);
+
                 foreach (SpectralRecoveryPSM match in bestMbrMatches.Values) match.FindOriginalPsm(allPsms);
             }

@@ -208,70 +207,10 @@ private static void FDRAnalysisOfMbrPsms(ConcurrentDictionary<ChromatographicPea
                 Select(p => p.Value.spectralLibraryMatch).
                 Where(v => v != null).
                 ToList();
-            List<int>[] psmGroupIndices = PEP_Analysis_Cross_Validation.Get_PSM_Group_Indices(psms, 1);
-            MLContext mlContext = new MLContext();
-            IEnumerable<PsmData>[] PSMDataGroups = new IEnumerable<PsmData>[1];
-
-            string searchType = "standard";
-            if (psms[0].DigestionParams.Protease.Name == "top-down")
-            {
-                searchType = "top-down";
-            }
-
-            int chargeStateMode = PEP_Analysis_Cross_Validation.GetChargeStateMode(allPsms);
-
-            Dictionary<string, Dictionary<int, Tuple<double, double>>> fileSpecificTimeDependantHydrophobicityAverageAndDeviation_unmodified = PEP_Analysis_Cross_Validation.ComputeHydrophobicityValues(allPsms, fileSpecificParameters, false);
-            Dictionary<string, Dictionary<int, Tuple<double, double>>> fileSpecificTimeDependantHydrophobicityAverageAndDeviation_modified = PEP_Analysis_Cross_Validation.ComputeHydrophobicityValues(allPsms, fileSpecificParameters, true);
-            PEP_Analysis_Cross_Validation.ComputeMobilityValues(allPsms, fileSpecificParameters);
-
-            Dictionary<string, float> fileSpecificMedianFragmentMassErrors = PEP_Analysis_Cross_Validation.GetFileSpecificMedianFragmentMassError(allPsms);
-
-            PSMDataGroups[0] = PEP_Analysis_Cross_Validation.CreatePsmData(searchType, fileSpecificParameters, psms, psmGroupIndices[0], fileSpecificTimeDependantHydrophobicityAverageAndDeviation_unmodified, fileSpecificTimeDependantHydrophobicityAverageAndDeviation_modified, fileSpecificMedianFragmentMassErrors, chargeStateMode);
-
-            string[] trainingVariables = PsmData.trainingInfos[searchType];
-
-            TransformerChain<BinaryPredictionTransformer<Microsoft.ML.Calibrators.CalibratedModelParametersBase<Microsoft.ML.Trainers.FastTree.FastTreeBinaryModelParameters, Microsoft.ML.Calibrators.PlattCalibrator>>>[] trainedModels = new TransformerChain<BinaryPredictionTransformer<Microsoft.ML.Calibrators.CalibratedModelParametersBase<Microsoft.ML.Trainers.FastTree.FastTreeBinaryModelParameters, Microsoft.ML.Calibrators.PlattCalibrator>>>[1];
-
-            var trainer = mlContext.BinaryClassification.Trainers.FastTree(labelColumnName: "Label", featureColumnName: "Features", numberOfTrees: 400);
-            var pipeline = mlContext.Transforms.Concatenate("Features", trainingVariables).Append(trainer);
-
-            IDataView dataView = mlContext.Data.LoadFromEnumerable(PSMDataGroups[0]);
-
-            string outputFolder = parameters.OutputFolder;
-
-            trainedModels[0] = pipeline.Fit(dataView);
-
-            PEP_Analysis_Cross_Validation.Compute_PSM_PEP(psms, psmGroupIndices[0], mlContext, trainedModels[0], searchType, fileSpecificParameters, fileSpecificMedianFragmentMassErrors, chargeStateMode, outputFolder);
-        }
+            new FdrAnalysisEngine(psms, parameters.NumNotches, fileSpecificParameters.First().Item2, fileSpecificParameters,
+                new List<string> { parameters.SearchTaskId }, analysisType: "PSM", doPEP: true, outputFolder: parameters.OutputFolder).Run();

-        private static void AssignEstimatedPsmPepQValue(ConcurrentDictionary<ChromatographicPeak, SpectralRecoveryPSM> bestMbrMatches, List<SpectralMatch> allPsms)
-        {
-            List<double> pepValues = bestMbrMatches.
-                Select(p => p.Value.spectralLibraryMatch).
-                Where(p => p != null).
-                OrderBy(p => p.FdrInfo.PEP).
-                Select(p => p.FdrInfo.PEP).
-                ToList();
-
-            foreach (SpectralRecoveryPSM match in bestMbrMatches.Values)
-            {
-                if (match.spectralLibraryMatch == null) continue;
-
-                int myIndex = 0;
-                while (myIndex < (pepValues.Count - 1) && pepValues[myIndex] <= match.spectralLibraryMatch.FdrInfo.PEP)
-                {
-                    myIndex++;
-                }
-                if (myIndex == pepValues.Count - 1)
-                {
-                    match.spectralLibraryMatch.FdrInfo.PEP_QValue = pepValues.Last();
-                }
-                else
-                {
-                    double estimatedQ = (pepValues[myIndex - 1] + pepValues[myIndex]) / 2;
-                    match.spectralLibraryMatch.FdrInfo.PEP_QValue = estimatedQ;
-                }
-            }
         }

         private static void WriteSpectralRecoveryPsmResults(ConcurrentDictionary<ChromatographicPeak, SpectralRecoveryPSM> bestMbrMatches, PostSearchAnalysisParameters parameters)
38 changes: 38 additions & 0 deletions MetaMorpheus/TaskLayer/MetaMorpheusTask.cs
@@ -622,6 +622,44 @@ protected List<Protein> LoadProteins(string taskId, List<DbForTask> dbFilenameLi
             {
                 Warn("Warning: " + emptyProteinEntries + " empty protein entries ignored");
             }
+
+
+
+            if (!proteinList.Any(p => p.IsDecoy))
+            {
+                Status("Done loading proteins", new List<string> { taskId });
+                return proteinList;
+            }
+
+            // Sanitize the decoys
+
+            HashSet<string> targetPeptideSequences = new();
+            foreach(var protein in proteinList.Where(p => !p.IsDecoy))
+            {
+                // When thinking about decoy collisions, we can ignore modifications
+                foreach(var peptide in protein.Digest(commonParameters.DigestionParams, new List<Modification>(), new List<Modification>()))
+                {
+                    targetPeptideSequences.Add(peptide.BaseSequence);
+                }
+            }
+            // Now, we iterate through the decoys and scramble the sequences that correspond to target peptides
+            for(int i = 0; i < proteinList.Count; i++)
+            {
+                if(proteinList[i].IsDecoy)
+                {
+                    var peptidesToReplace = proteinList[i]
+                        .Digest(commonParameters.DigestionParams, new List<Modification>(), new List<Modification>())
+                        .Select(p => p.BaseSequence)
+                        .Where(targetPeptideSequences.Contains)
+                        .ToList();
+                    if(peptidesToReplace.Any())
+                    {
+                        proteinList[i] = Protein.ScrambleDecoyProteinSequence(proteinList[i], commonParameters.DigestionParams, forbiddenSequences: targetPeptideSequences, peptidesToReplace);
+                    }
+                }
+            }
+
             Status("Done loading proteins", new List<string> { taskId });
             return proteinList;
         }

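The decoy sanitizing step in `LoadProteins` first collects every target peptide base sequence, then looks for decoy peptides that collide with that set. The collision check can be sketched on its own as below. This is an illustration under stated assumptions: `Digest` here is a hypothetical, heavily simplified tryptic digest (cleave after every K or R, no missed cleavages, no proline rule) rather than mzLib's `Protein.Digest`, and the scrambling step performed by `Protein.ScrambleDecoyProteinSequence` is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DecoyCollisionSketch
{
    // Simplified digest (assumption): cleave after every K or R; the final
    // residue always closes the last peptide.
    public static List<string> Digest(string protein)
    {
        var peptides = new List<string>();
        int start = 0;
        for (int i = 0; i < protein.Length; i++)
        {
            if (protein[i] == 'K' || protein[i] == 'R' || i == protein.Length - 1)
            {
                peptides.Add(protein.Substring(start, i - start + 1));
                start = i + 1;
            }
        }
        return peptides;
    }

    // Returns the decoy peptides whose base sequence also occurs among the target peptides.
    public static List<string> FindCollisions(IEnumerable<string> targetProteins, string decoyProtein)
    {
        var targetPeptides = new HashSet<string>(targetProteins.SelectMany(Digest));
        return Digest(decoyProtein)
            .Where(p => targetPeptides.Contains(p))
            .Distinct()
            .ToList();
    }

    public static void Main()
    {
        var targets = new[] { "MAGKPEPTIDERSEQ" };
        // Hand-made decoy that shares two peptides with the target.
        var collisions = FindCollisions(targets, "QESRPEPTIDERMAGK");
        Console.WriteLine(string.Join(", ", collisions));
    }
}
```

A `HashSet<string>` keeps each membership test constant-time on average, which matters because every decoy peptide is checked against the full set of target peptides.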
9 changes: 4 additions & 5 deletions MetaMorpheus/TaskLayer/SearchTask/PostSearchAnalysisTask.cs
@@ -620,7 +620,7 @@ private void WritePsmResults()
                         "PEP could not be calculated due to an insufficient number of PSMs. Results were filtered by q-value." +
                         Environment.NewLine);
                 }
-                string psmResultsText = "All target PSMs with " + psmsForPsmResults.FilterType + " <= " + Math.Round(psmsForPsmResults.FilterThreshold, 2) + ": " +
+                string psmResultsText = "All target PSMs with " + psmsForPsmResults.GetFilterTypeString() + " <= " + Math.Round(psmsForPsmResults.FilterThreshold, 2) + ": " +
                     psmsForPsmResults.TargetPsmsAboveThreshold;
                 ResultsDictionary[("All", "PSMs")] = psmResultsText;
             }
@@ -647,7 +647,7 @@ private void WritePeptideResults()
                     Parameters.SearchTaskResults.AddPsmPeptideProteinSummaryText(
                         "PEP could not be calculated due to an insufficient number of PSMs. Results were filtered by q-value." + Environment.NewLine);
                 }
-                string peptideResultsText = $"All target {GlobalVariables.AnalyteType.ToLower()}s with " + peptidesForPeptideResults.FilterType + " <= " + Math.Round(peptidesForPeptideResults.FilterThreshold, 2) + ": " +
+                string peptideResultsText = $"All target {GlobalVariables.AnalyteType.ToLower()}s with " + peptidesForPeptideResults.GetFilterTypeString() + " <= " + Math.Round(peptidesForPeptideResults.FilterThreshold, 2) + ": " +
                     peptidesForPeptideResults.TargetPsmsAboveThreshold;
                 ResultsDictionary[("All", GlobalVariables.AnalyteType)] = peptideResultsText;
             }
@@ -684,7 +684,7 @@ private void WriteIndividualPsmResults()
                 FinishedWritingFile(writtenFile, new List<string> { Parameters.SearchTaskId, "Individual Spectra Files", psmFileGroup.Key });

                 // write summary text
-                string psmResultsText = strippedFileName + " - Target PSMs with " + psmsToWrite.FilterType + " <= " + Math.Round(psmsToWrite.FilterThreshold, 2) + ": " +
+                string psmResultsText = strippedFileName + " - Target PSMs with " + psmsToWrite.GetFilterTypeString() + " <= " + Math.Round(psmsToWrite.FilterThreshold, 2) + ": " +
                     psmsToWrite.TargetPsmsAboveThreshold;
                 ResultsDictionary[(strippedFileName, "PSMs")] = psmResultsText;
             }
@@ -720,7 +720,7 @@ private void WriteIndividualPeptideResults()
                 FinishedWritingFile(writtenFile, new List<string> { Parameters.SearchTaskId, "Individual Spectra Files", psmFileGroup.Key });

                 // write summary text
-                string peptideResultsText = strippedFileName + $" - Target {GlobalVariables.AnalyteType.ToLower()}s with " + peptidesToWrite.FilterType + " <= " + Math.Round(peptidesToWrite.FilterThreshold, 2) + ": " +
+                string peptideResultsText = strippedFileName + $" - Target {GlobalVariables.AnalyteType.ToLower()}s with " + peptidesToWrite.GetFilterTypeString() + " <= " + Math.Round(peptidesToWrite.FilterThreshold, 2) + ": " +
                     peptidesToWrite.TargetPsmsAboveThreshold;
                 ResultsDictionary[(strippedFileName, GlobalVariables.AnalyteType)] = peptideResultsText;
             }
@@ -746,7 +746,6 @@ private void UpdateSpectralLibrary()
                 // Value is the highest scoring psm in the group
                 elementSelector: g => g.MaxBy(p => p.Score));

-
             //load the original library
             var originalLibrarySpectra = Parameters.SpectralLibrary.GetAllLibrarySpectra();
             List<LibrarySpectrum> updatedLibrarySpectra = new();