This repository has been archived by the owner on Feb 16, 2019. It is now read-only.
mattb112885 edited this page Jul 7, 2013 · 62 revisions

Welcome to the clusterDbAnalysis (ITEP) wiki!

This wiki shows how to perform some useful comparative analyses using the ITEP tools. For detailed documentation of each Python script, run [nameofscript] -h or refer to the docs/ folder of ITEP.

  1. Introduction to ITEP
  2. How to get help
  3. ITEP architecture
  4. ITEP data limitations
  5. Data format standards
  6. Known issues

Installation and administration

  1. Installing ITEP on your machine including dependencies
  2. Using the ITEP virtual machine
  3. Managing multiple ITEP instances on a machine
  4. Adding and removing genomes from existing ITEP databases

Building an ITEP database (with examples)

Follow all of these directions to build a complete ITEP database. We have already performed these steps for the "tutorial" copy of ITEP in the virtual machine, so you can follow the rest of the tutorial without repeating them. You will need to perform them yourself to build a new ITEP database from your own set of organisms, so it is still worth reading these directions to understand how the example database was built.

  1. How to import genomes and format them for use with ITEP
  2. Specifying lists of organisms to cluster
  3. Building your database 1 - BLASTP and BLASTN
  4. Building your database 2 - MCL Clustering
  5. Building your database 3 - Contig import
  6. Building your database 4 - RPSBLAST

Comparative genomics with ITEP

Note: If you follow these tutorials with your own installation of ITEP rather than the VM, the exact cluster IDs may differ slightly because of E-value differences between BLAST versions, possible changes in the ordering of outputs, and similar factors. However, the gene IDs and their corresponding information should always be exactly the same for the same input.

  1. Searching for genes by gene properties
  2. Searching for genes by homology with other genes
  3. Obtaining a list of bidirectional-best BLAST hits
  4. Obtaining information about genes
  5. Turning ITEP IDs into human-readable formats
  6. Building alignments and trees
  7. Analyzing gene neighborhoods
  8. Searching for gene families by presence and absence patterns
  9. Visualizing homology patterns
  10. Building a concatenated gene tree
  11. Generating draft metabolic reconstructions from a reference
  12. Extracting DNA and amino acid sequences from a region of a genome, gene or protein
  13. Obtaining the complete sequences of contigs, genes or proteins
  14. Searching for missing genes and identifying causes for absence with tBLASTn
  15. Identifying the upstream regions of homologous proteins
  16. Searching for functions using conserved domains
  17. Adding user-defined gene data to ITEP

Unclassified tutorials

  1. Running and troubleshooting OrthoMCL