Home
Welcome to the clusterDbAnalysis (ITEP) wiki!
This wiki shows how to perform some useful comparative analyses using the ITEP tools. For detailed documentation of any Python script, run [nameofscript] -h or refer to the docs/ folder of ITEP.
- Installing ITEP on your machine including dependencies
- Using the ITEP virtual machine
- Managing multiple ITEP instances on a machine
- Adding and removing genomes from existing ITEP databases
Follow all of these directions to build a complete ITEP database. We have already done these steps in the virtual machine's "tutorial" copy of ITEP, so you can follow the rest of the tutorial without performing them yourself. However, you must perform them to build a new ITEP database with your own set of organisms, so it is still worth reading these directions to understand how the example database was built.
- How to import genomes and format them for use with ITEP
- Specifying lists of organisms to cluster
- Building your database 1 - BLASTP and BLASTN
- Building your database 2 - MCL Clustering
- Building your database 3 - Contig import
- Building your database 4 - RPSBLAST
Note: If you follow the tutorials below with your own install of ITEP (rather than the VM), the exact cluster IDs could vary slightly due to E-value differences between BLAST versions, possible changes in the ordering of outputs, and so on. However, the gene IDs and their corresponding information should always be exactly the same for the same input.
- Searching for genes by gene properties
- Searching for genes by homology with other genes
- Obtaining a list of bidirectional-best BLAST hits
- Obtaining information about genes
- Turning ITEP IDs into human-readable formats
- Building alignments and trees
- Analyzing gene neighborhoods
- Searching for gene families by presence and absence patterns
- Visualizing homology patterns
- Building a concatenated gene tree
- Generating draft metabolic reconstructions from a reference
- Extracting DNA and amino acid sequences from a region of a genome, gene or protein
- Searching for missing genes and identifying causes for absence with tBLASTn
- Identifying the upstream regions of homologous proteins
- Searching for functions using conserved domains