Manpage of code2aln

code2aln

Section: Multiple Nucleic Acid Sequence Alignment Program (1.2)
Updated: 1.2

NAME

code2aln

SYNOPSIS

code2aln [-c [CTN]] FN FN FN ... [-c [CTN]] FN FN FN ...

DESCRIPTION

Code2aln produces multiple nucleic acid alignments using information on coding and non-coding regions as part of the scoring function. This addresses the problem that, in coding regions, nucleic acid sequences diverge more strongly than the protein sequences they encode.

Code2aln takes the input nucleic acid files (FN) as arguments. Accepted input formats are Pearson's format (FASTA), GenBank format, and unformatted sequence data on one or more lines. Code2aln automatically detects the format of each input file and handles it accordingly. The sequences may be supplied as separate files or merged into a single file.
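The kind of format detection described above can be illustrated with a minimal Python sketch. It is not code2aln's own detection logic, which this page does not document; it only assumes the common conventions that a FASTA record starts with ">" and a GenBank flat file starts with a LOCUS line. The function names detect_format and read_raw are hypothetical.

```python
def detect_format(text: str) -> str:
    """Guess the format of sequence data from its first non-blank line.

    Assumption: FASTA records start with '>', GenBank flat files start
    with a 'LOCUS' line; anything else is treated as unformatted data.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("LOCUS"):
            return "genbank"
        return "raw"
    return "raw"


def read_raw(text: str) -> str:
    """Join unformatted sequence data spread over one or more lines."""
    return "".join(text.split()).upper()
```

A caller would run detect_format on the file contents first and then dispatch to a format-specific reader; only the unformatted case is shown here.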

One or more codon tables (CTN) can be specified for individual sequences or for groups of sequences. The default is the universal genetic code. Running 'code2aln' without any options or input files displays a short help text and a list of the available codon tables. The codon tables matter because they are used to search for start and stop codons and to translate the detected open reading frames.
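Why the choice of codon table matters can be seen in a small Python sketch. It is an illustration only and does not reproduce code2aln's internal tables: UNIVERSAL is the standard genetic code, and VERTEBRATE_MITO applies the well-known vertebrate mitochondrial reassignments (TGA codes for Trp, ATA for Met, AGA and AGG are stops). The names UNIVERSAL, VERTEBRATE_MITO and translate are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
UNIVERSAL_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

UNIVERSAL = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), UNIVERSAL_AA)
}

# Vertebrate mitochondrial code: four codons differ from the standard code.
VERTEBRATE_MITO = {**UNIVERSAL, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"}


def translate(dna: str, table: dict) -> str:
    """Translate complete codons of dna; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    return "".join(
        table.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


print(translate("ATGTGAAGATAA", UNIVERSAL))        # M*R*
print(translate("ATGTGAAGATAA", VERTEBRATE_MITO))  # MW**
```

The same nucleotide string yields a different protein and different stop positions under the two tables, which is why the wrong table can hide or truncate an open reading frame.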

Code2aln detects all theoretically possible open reading frames whose length is at least 300 and at least one fifteenth of the sequence length. Split coding regions and exons are joined and translated, and the resulting amino acid sequences are used alongside the nucleic acid sequences in the scoring function.
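The length filter can be sketched in Python as follows. This is a simplified illustration, not code2aln's algorithm, and it rests on several assumptions that this page does not confirm: lengths are counted in nucleotides, both thresholds must be met (so the effective minimum is the larger of the two), only the forward strand is scanned, and ATG and TAA/TAG/TGA from the universal code serve as start and stop codons. The name find_orfs and its parameters are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}  # universal code stop codons (assumption)
START = "ATG"                  # universal code start codon (assumption)


def find_orfs(seq: str, min_abs: int = 300, min_fraction: float = 1 / 15):
    """Return (start, end) pairs of forward-strand ORFs, end exclusive.

    An ORF runs from the first start codon in a frame to the next
    in-frame stop codon (inclusive) and is kept only if its length is
    at least min_abs AND at least min_fraction of the sequence length.
    """
    seq = seq.upper()
    min_len = max(min_abs, len(seq) * min_fraction)
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                end = i + 3
                if end - start >= min_len:
                    orfs.append((start, end))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)
```

With these assumptions, a 306-nucleotide ORF is reported in a 310-nucleotide sequence but rejected in a sequence of about 6300 nucleotides, where one fifteenth of the length exceeds 306.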

First, all pairwise alignments are computed using all scoring parameters. From these, a guide tree is built that defines the order of the profile alignments; a textual representation of this tree is written to the output file cluster.txt. Two further output files describe the open reading frames and exons, both those read from the input and those found by code2aln: ORF.ps, a PostScript plot, and info.ral, which contains the same information in text format.
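How a guide tree can be derived from pairwise results is sketched below in Python. This page does not say which clustering method code2aln uses, so the sketch substitutes plain average-linkage clustering (UPGMA-style) over a table of pairwise distances; the function name build_guide_tree, the nested-tuple tree representation and the distance values are all hypothetical.

```python
from itertools import combinations


def build_guide_tree(names, dist):
    """Cluster sequences into a nested-tuple guide tree.

    names: list of sequence names (at least one).
    dist:  dict mapping frozenset({a, b}) to the pairwise distance.
    At each step the two clusters with the smallest average pairwise
    distance between their members are merged.
    """
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in names]

    def average(a, b):
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i + members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


names = ["A", "B", "C", "D"]
dist = {frozenset(p): 8.0 for p in combinations(names, 2)}
dist[frozenset(("A", "B"))] = 1.0
dist[frozenset(("C", "D"))] = 2.0
print(build_guide_tree(names, dist))  # (('A', 'B'), ('C', 'D'))
```

Reading the nested tuples from the innermost pairs outward gives an alignment order: A with B, then C with D, then the two resulting profiles with each other.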

The profile alignments are then carried out in the order given by the guide tree, again using all scoring parameters. Finally, the resulting multiple nucleic acid sequence alignment is written to the output file aln.aln.

OPTIONS

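-c [CTN] Selects the codon table (CTN) to use. As shown in the synopsis, the option may be given more than once so that different groups of input files use different codon tables.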

The following codon tables (CTN) are available:

    univ:    universal genetic code (default)
    acet:    Acetabularia
    ccyl:    Candida cylindracea
    tepa:    Tetrahymena, Paramecium,
             Oxytricha, Stylonychia, Glaucoma
    eupl:    Euplotes
    mlut:    Micrococcus luteus
    mysp:    Mycoplasma, Spiroplasma
    mitocan: canonical mitochondrial code
    mitovrt: Vertebrates - mitochondrial code
    mitoart: Arthropods - mitochondrial code
    mitoech: Echinoderms - mitochondrial code
    mitomol: Molluscs - mitochondrial code
    mitoasc: Ascidians - mitochondrial code
    mitonem: Nematodes - mitochondrial code
    mitopla: Plathelminths - mitochondrial code
    mitoyea: Yeasts - mitochondrial code
    mitoeua: Euascomycetes - mitochondrial code
    mitopro: Protozoans - mitochondrial code
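As an illustration of the synopsis, the following Python sketch assembles an argument list that assigns different codon tables to different groups of input files. It assumes, as the synopsis suggests but this page does not state outright, that each -c applies to the files that follow it. The helper build_command and the file names are hypothetical.

```python
def build_command(groups, program="code2aln"):
    """Build an argument list of the form: code2aln [-c CTN] FN ... [-c CTN] FN ...

    groups: list of (codon_table, files) pairs. A codon_table of None
    means no -c option is emitted for that group, so the default
    (universal genetic code) would apply.
    """
    argv = [program]
    for table, files in groups:
        if table is not None:
            argv += ["-c", table]
        argv += list(files)
    return argv


# Hypothetical file names, for illustration only.
cmd = build_command([
    ("mitovrt", ["human_mt.fasta", "mouse_mt.fasta"]),
    ("mitoart", ["fly_mt.gb"]),
])
print(" ".join(cmd))
# code2aln -c mitovrt human_mt.fasta mouse_mt.fasta -c mitoart fly_mt.gb
```

The resulting list could be passed to subprocess.run on a system where code2aln is installed.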

VERSION

This man page documents version 1.2 of code2aln.

AUTHOR

Roman R. Stocsits

BUGS

Please send comments and bug reports to roman@bioinf.uni-leipzig.de.

This document was created by man2html, using the manual pages.
Time: 12:43:51 GMT, October 11, 2005