Previously we had a script that used the descriptive_statistics gem to produce the detailed metrics that Bryan asked for here:
Here are the stats that we'd like.
Breakdown of the number of species represented in each Kingdom, Phylum, and Class.
For each of the above, the mean, median, and mode of the number of sequences per entry.
Find species with the largest number of sequences in each of these categories and report that number.
Here, "entry" is a single .fa file. So imagine that Phylum “Foo” has only 2 species, each with two genes (thus two .fa files per species). Species A's files have 10 and 12 sequences; species B's files have 14 and 16 sequences. The results would be:
mean: 13
mode: 10
median: 13
species with largest number of sequences in Foo: B
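As a rough illustration of the Foo example, here is a minimal sketch in plain Ruby (2.7+ for `tally`). It does not use the descriptive_statistics gem and is not the original script. The hash `counts_by_species` and the helpers `mean`, `median`, and `mode` are hypothetical names. Breaking mode ties by taking the first value encountered is an assumption, chosen because it matches the `mode: 10` shown above. "Largest number of sequences" is read here as the per-species total across its .fa files, which is also an assumption.

```ruby
# Hypothetical input: sequences per .fa file, keyed by species (the Foo example).
counts_by_species = {
  'A' => [10, 12],
  'B' => [14, 16]
}

# Arithmetic mean of the per-entry sequence counts.
def mean(values)
  values.sum.to_f / values.size
end

# Middle value, or the average of the two middle values for an even count.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Most frequent value. Assumption: on a tie, the first value encountered wins.
def mode(values)
  values.tally.max_by { |_value, count| count }.first
end

# One entry per .fa file, across every species in the category.
entries = counts_by_species.values.flatten

# Assumption: "largest" means the highest total across a species' files.
largest_species, largest_counts = counts_by_species.max_by { |_species, counts| counts.sum }

puts "mean: #{mean(entries)}"
puts "mode: #{mode(entries)}"
puts "median: #{median(entries)}"
puts "species with largest number of sequences in Foo: #{largest_species} (#{largest_counts.sum})"
```

Every value in the example occurs exactly once, so the reported mode depends entirely on the tie-breaking rule.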
This is the script that generated those stats (as written below, it needs access to the un-tarred gene file data).
require 'csv'