Config files handling #22
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
%% Simple example that runs SCINGE for two replicates of two hyperparameter settings | ||
clear all; | ||
close all; | ||
clc; | ||
if ~isdeployed | ||
addpath(genpath('.')); | ||
end | ||
|
||
%% Import list of parameter combinations | ||
fid = fopen('SINGE_params.cfg'); | ||
Review comment: We should support specifying the config file names as command line arguments. This could be the default name, or we could require both types of config files to be provided before running. That way we can compile this file once and let users run on different datasets without needing to recompile. |
||
temp = fgetl(fid); | ||
while any(temp~=-1)||isempty(temp) | ||
Review comment: How much error handling do we need here? I believe that |
||
if ~isempty(temp) | ||
temp = strsplit(temp); | ||
pid = str2num(temp{1}); | ||
pname = temp{2}; | ||
pval = temp{3}; | ||
if isnumeric(str2num(pval))&&~isempty(str2num(pval)) | ||
pval = str2num(pval); | ||
else | ||
ind = find(pval==''''); | ||
pval(ind) = []; | ||
end | ||
param_list{pid}.(pname) = pval; | ||
end | ||
temp = fgetl(fid); | ||
end | ||
%% Specify Path to Input data and path to Output folder, gene_list and number of subsampled replicates | ||
fid = fopen('SINGE_IO.cfg'); | ||
temp = fgetl(fid); | ||
while any(temp~=-1)||isempty(temp) | ||
Review comment: Same question as above regarding error handling and required config file values. |
||
if ~isempty(temp) | ||
temp = strsplit(temp); | ||
pname = temp{1}; | ||
pval = temp{2}; | ||
if isnumeric(str2num(pval))&&~isempty(str2num(pval)) | ||
pval = str2num(pval); | ||
else | ||
ind = find(pval==''''); | ||
pval(ind) = []; | ||
end | ||
IO.(pname) = pval; | ||
end | ||
temp = fgetl(fid); | ||
end | ||
%% Run SINGE | ||
[ranked_edges, gene_influence] = SINGE(IO.gene_list,IO.Data,IO.outdir,IO.num_replicates,param_list); |
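The review comments on this file ask for config file names to be passed in as arguments and for some error handling around the parsing loop. A minimal sketch of one way to do that follows. The helper name `read_cfg_sketch` and its error identifiers are hypothetical and not part of this PR. It assumes the two-column `name value` layout of `SINGE_IO.cfg`, relies on `fopen` returning -1 when a file cannot be opened, and uses `ischar` to detect the numeric -1 that `fgetl` returns at end of file.

```matlab
function params = read_cfg_sketch(cfg_file)
% READ_CFG_SKETCH Hypothetical helper (not part of this PR): parse a
% whitespace-separated "name value" config file into a struct.
fid = fopen(cfg_file, 'r');
if fid == -1
    error('read_cfg_sketch:fileNotFound', 'Cannot open config file: %s', cfg_file);
end
cleaner = onCleanup(@() fclose(fid));  % close the file even if parsing errors

params = struct();
line = fgetl(fid);
while ischar(line)                     % fgetl returns numeric -1 at end of file
    tokens = strsplit(strtrim(line));
    if numel(tokens) >= 2 && ~isempty(tokens{1})
        name = tokens{1};
        raw = tokens{2};
        num = str2double(raw);         % NaN when raw cannot be parsed as a number
        if ~isnan(num)
            params.(name) = num;
        else
            params.(name) = strrep(raw, '''', '');  % strip single quotes
        end
    elseif ~isempty(tokens{1})
        % A non-blank line with only one token is malformed
        error('read_cfg_sketch:badLine', 'Expected "name value", got: %s', line);
    end
    line = fgetl(fid);
end
end
```

With a helper like this, the example script could become a function that takes both config file names, for instance `IO = read_cfg_sketch(io_cfg_file);`, so a compiled build would not need the names hard-coded. The three-column `SINGE_params.cfg` format would need its own variant that reads the leading parameter-set index.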
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Data data1/X_SCODE_data.mat | ||
outdir Output | ||
num_replicates 2 | ||
Review comment: Should we call this file Before merging, we'll need to update the readme to describe this file and how to run SINGE with it. Is this space-separated or whitespace-separated? Tab-separated may be safest so we can support Windows file paths. |
||
gene_list data1/tf.mat |
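On the delimiter question in the comment above: the parsing code in this PR calls `strsplit(temp)` with no delimiter argument, which splits on any whitespace. The short sketch below compares that with splitting on tabs only. The Windows path is a made-up example, not a file from this repository.

```matlab
% Hypothetical config line: name and value separated by a tab,
% with a space inside the Windows path.
line = sprintf('Data\tC:\\My Data\\X_SCODE_data.mat');

ws_tokens  = strsplit(line);        % default: split on any whitespace
tab_tokens = strsplit(line, '\t');  % split on tab characters only

fprintf('whitespace split: %d tokens\n', numel(ws_tokens));
fprintf('tab split:        %d tokens\n', numel(tab_tokens));
```

The whitespace split breaks the path at the space, so `temp{2}` would hold only part of it, while the tab split keeps the path as a single value.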
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
1 ID 541 | ||
atuldeshpande marked this conversation as resolved.
|
||
1 lambda 0.01 | ||
1 dT 10 | ||
1 num_lags 5 | ||
1 kernel_width 2 | ||
1 prob_zero_removal 0 | ||
1 prob_remove_samples 0.2 | ||
1 family 'gaussian' | ||
1 date '01/31/2019' | ||
|
||
2 ID 542 | ||
2 lambda 0.01 | ||
2 dT 5 | ||
2 num_lags 9 | ||
2 kernel_width 4 | ||
2 prob_zero_removal 0.2 | ||
2 prob_remove_samples 0.1 | ||
2 family 'gaussian' | ||
2 date '31-Jan-2019' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
1 ID 541 | ||
Review comment: Should we make this file tab-separated as well? I see multiple spaces instead of tabs. We will also document this file in the readme. |
||
1 lambda 0.01 | ||
1 dT 10 | ||
1 num_lags 5 | ||
1 kernel_width 2 | ||
1 prob_zero_removal 0 | ||
1 prob_remove_samples 0.2 | ||
1 family 'gaussian' | ||
1 date '01/31/2019' | ||
|
||
2 ID 542 | ||
2 lambda 0.01 | ||
2 dT 5 | ||
2 num_lags 9 | ||
2 kernel_width 4 | ||
2 prob_zero_removal 0.2 | ||
2 prob_remove_samples 0.1 | ||
2 family 'gaussian' | ||
2 date '31-Jan-2019' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
function [ranked_edges, gene_influence] = SCINGE(gene_list,Data,outdir,num_replicates,param_list) | ||
Review comment: Let's remove the old |
||
% [ranked_edges, gene_influence] = SCINGE(gene_list,Data,outdir,num_replicates,param_list) | ||
% Standalone SCINGE implementation. | ||
Review comment: Rename |
||
% Inputs: | ||
% gene_list = N x 1 cell array with list of relevant genes in the data set | ||
% Data = string representing the path of mat file containing the expression | ||
% data corresponding to above gene_list in the form of cell array X. | ||
% outdir = directory path to store individual GLG test results before Borda | ||
% aggregation | ||
% num_replicates = number of subsampled replicates (global SCINGE parameter) | ||
% param_list = list of hyperparameter combinations for individual GLG tests | ||
% Outputs: | ||
% ranked_edges = ranked list of gene interactions with corresponding SCINGE scores | ||
% gene_influence = ranked lists of regulators (genes) with corresponding SCINGE influence | ||
SINGE_version = '0.1.0'; | ||
display(SINGE_version); | ||
for rep = 1:num_replicates | ||
for ii = 1:length(param_list) | ||
GLG_Instance(Data,'lambda',param_list{ii}.lambda,'dT',param_list{ii}.dT,'num_lags',param_list{ii}.num_lags,'kernel_width',param_list{ii}.kernel_width,'prob_zero_removal',param_list{ii}.prob_zero_removal,'replicate',rep,'ID',param_list{ii}.ID,'outdir',outdir,'family',param_list{ii}.family,'prob_remove_samples',param_list{ii}.prob_remove_samples,'date',param_list{ii}.date); | ||
end | ||
end | ||
Str = Data; | ||
Str(Str=='.') = 'p'; | ||
|
||
lind = max(max(strfind(Str,filesep)),0); | ||
mind = length(Str); | ||
if isempty(mind)||(mind<lind) | ||
mind = length(Str); | ||
end | ||
Str = Str(lind+1:mind); | ||
Agg = Modified_Borda_Aggregation(Str,outdir); | ||
load(gene_list); | ||
ranked_edges = adjmatrix2edgelist(Agg,gene_list); | ||
[influence,ind] = sort(sum(Agg,2),'descend'); | ||
gene_influence = [cell2table(gene_list(ind)) array2table(influence)]; | ||
gene_influence.Properties.VariableNames{1} = 'Gene_Name'; | ||
ranked_edgesw = ranked_edges; | ||
ranked_edgesw.SCINGE_Score = floor(ranked_edgesw.SCINGE_Score*10^5)/10^5; | ||
gene_influencew = gene_influence; | ||
gene_influencew.influence = floor(gene_influencew.influence*10^5)/10^5; | ||
writetable(ranked_edgesw,fullfile(outdir,'SCINGE_Ranked_Edge_List.txt'),'WriteVariableNames',true,'WriteRowNames',false,'Delimiter','\t'); | ||
writetable(gene_influencew,fullfile(outdir,'SCINGE_Gene_Influence.txt'),'WriteVariableNames',true,'WriteRowNames',false,'Delimiter','\t'); |
Review comment: How much of this file changed? If you delete the old SCINGE_Example.m in the same commit where you add this file, git may recognize that you renamed the file. That would make comparing the new and old versions easier.
Review comment: Please convert the name here as well.