Bioinformatics scripts and programs implemented on oakley-server
Programs
| Usage | Description |
|---|---|
| Phylogenetic Trees and Model Evaluation | |
| paup | Dave Swofford's PAUP* |
| mb | MrBayes by Huelsenbeck and others. The current version is 3.1.2, installed 12/05. |
| modeltest | Modeltest program by Posada and others. Can be called with modelfit.pl <filename>, a script written by J. Nylander, where filename is the name of a nexus file. |
| mrmodeltest | Nylander took Posada's code and removed the models that are not implemented in MrBayes. (A case of an origin of "novelty" strictly by loss :). Can be called with mrmodelfit.pl <filename>, where filename is the name of a nexus file. |
| alrt_phyml | phyml (a fast ML program) with support measures from an approximate likelihood ratio test for each node. See the program's website. |
| phyml | A fast maximum-likelihood (ML) program. See the Home Page for usage. |
| Multiple Sequence Alignment | |
| clustalw | Clustalw program by Higgins and others. |
| tcoffee | tcoffee program by Higgins and others. Seems more accurate than clustalw, but has some bugs when used with more than 50 sequences. |
| muscle | A sequence alignment program. See the program's website. |
| mafft | A fast alignment program. See the Home Page. |
| Phylogenetic Tree Visualization | |
| tgf | "Treegraph" by Muller and Muller. Home Page. Generates graphics files for phylogenetic trees from text files. Allows graphics production to be automated with scripting, which is useful when drawing many trees (e.g. trees of all gene families in a genome). Here is the documentation. |
| tv | TreeView for visualizing and printing phylogenetic trees. |
| Divergence Time Estimation | |
| r8s | By Mike Sanderson, for estimating divergence times using molecular clocks and relaxed molecular clocks. Home Page. Usage is r8s. |
| PATHd8 | By Britton and others. Here is the Home Page. Usage is PATHd8. |
| Other | |
| seq-gen | For simulating sequence evolution. Useful for parametric bootstrapping, for example. |
Perl Scripts
| genbankstrip.pl | Mines all gene sequences from a GenBank output file according to annotations provided in each accession. As such, it is limited by the accuracy of the information given in the accession and uses a restricted library of gene synonyms. However, it can often mine more evolutionarily divergent sequences and better account for paralogs than can a BLAST-based search. Written by Olaf Bininda-Emonds. Type genbankstrip.pl -h for help. |
| seqConverter.pl | Converts files to different formats, for example fasta to nexus. Type seqConverter.pl -h for a list of arguments. Written by Olaf Bininda-Emonds. |
| clustal2fasta.pl | Converts clustal alignments to fasta files. Written by THO. Now obsolete, since the new clustalw writes fasta files. |
| subdiv.pl | Perl script that creates a consensus from a file and compares a sequence file to the consensus with blast. Writes out a fasta file. |
| align.pl | Uses clustalw to align a group of sequences in fasta format. Output is also in fasta. Usage: align.pl infile outfile |
| seqCat.pl | Concatenates two groups of sequences (which may differ in species content) and writes out a nexus format file. Works best with fasta input files (we've had problems when starting with nexus files). Written by Olaf Bininda-Emonds. |
| gridder.pl | Creates a binary grid, to be opened in Excel. Rows are species, columns are data partitions. Written by THO by altering seqCat.pl by Olaf Bininda-Emonds. |
| taxgridder.pl | Creates a binary grid, to be opened in Excel. Rows are species, columns are data partitions. Adds taxonomy from GenBank to the end of each row. Written by THO by altering seqCat.pl by Olaf Bininda-Emonds. |
| nucgridder.pl | Creates a grid, to be opened in Excel. Rows are species, columns are data partitions. Unlike gridder and taxgridder above, nucgridder.pl writes the number of nucleotides present in each partition. It then adds a column with the total percentage completeness for the given row. Written by THO by altering seqCat.pl by Olaf Bininda-Emonds. |
| extracttrace.pl | Extracts genome traces from the NCBI trace database. To use this, first blast the trace database on the NCBI website. Next, copy the text from the output results to a text file. This text file supplies the names of the sequences to extract. Usage is: extracttrace.pl infile outfile. Written by THO. |
| makedellist.pl | Makes a list of accession numbers to be deleted from a dataset. Written by THO. |
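
The grid that nucgridder.pl is described as producing (rows are species, columns are data partitions, cells are nucleotide counts, plus a completeness column) can be illustrated with a small Python sketch. This is not the Perl script itself and makes no claim about how it works internally. The function names, the example species and sequences, the choice to treat `-`, `?` and `N` as missing data, and the definition of completeness (nucleotides present divided by the summed length of the longest sequence in each partition) are all assumptions made for illustration.

```python
import csv
import io

# Assumption: gaps, question marks and N count as missing data, not nucleotides.
MISSING = set("-?Nn")


def count_nucleotides(seq):
    """Count characters in seq that are not gap or missing-data symbols."""
    return sum(1 for ch in seq if ch not in MISSING)


def build_grid(partitions):
    """Return (header, rows) for a species-by-partition grid.

    partitions maps a partition name to {species: aligned sequence}.
    Each row holds the species name, one nucleotide count per partition,
    and a final percentage completeness over all partitions.
    """
    names = sorted(partitions)
    species = sorted({sp for seqs in partitions.values() for sp in seqs})
    # Assumption: a partition's length is that of its longest sequence.
    lengths = {
        p: max((len(s) for s in partitions[p].values()), default=0)
        for p in names
    }
    total = sum(lengths.values())

    header = ["species"] + names + ["percent_complete"]
    rows = []
    for sp in species:
        # A species absent from a partition contributes a count of zero.
        counts = [count_nucleotides(partitions[p].get(sp, "")) for p in names]
        percent = round(100.0 * sum(counts) / total, 1) if total else 0.0
        rows.append([sp] + counts + [percent])
    return header, rows


def to_tsv(header, rows):
    """Format the grid as tab-separated text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Made-up example data: two partitions, one species missing from the second.
example = {
    "18S": {"Daphnia": "ACGT-ACG", "Artemia": "ACGTTACG"},
    "COI": {"Daphnia": "ATGC"},
}
header, rows = build_grid(example)
print(to_tsv(header, rows))
```

Replacing each count with 1 when it is nonzero and 0 otherwise would give a presence/absence grid of the kind described for gridder.pl.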