Belvu Homepage

Read instructions on how to install Belvu.


What is Belvu?

Belvu is an X-windows viewer for multiple sequence alignments. One of the main advantages of Belvu is that it has an extensive set of modes to color the residues. There are several ways to color them by conservation and by residue type (user-configurable). Other useful features are fetching of the Swissprot (or PIR) entries by double clicking and easy tracking of the position in the alignment.
Belvu further supports some simple editing commands for rows and columns, but is not intended as a full editor. For actually editing the alignment, use your favorite alignment editor or Emacs.

In addition, Belvu is a phylogenetic tool. It can generate distance matrices between sequences under a selection of distance metrics, and these can be saved and used in other applications. Belvu also implements certain distance-based tree reconstruction algorithms (including import of externally generated distance matrices) and bootstrap phylogenetic reconstruction. These functions are available both in the GUI (so Belvu may also be used as a tree viewer) and as command-line options, making the program a potential component in phylogenetic software pipelines.

How to run Belvu

Type 'belvu' to get brief help:
Belvu - View multiple alignments in pretty colours.

 Usage: belvu [options]  < multiple_alignment > | -  [X options]

 < multiple_alignment > | - = file or pipe in Pfam/selex/MSF format (see below).


 Options:

 -r           Alignment is in 'raw' format.

              Example of raw alignment file: 

                  seq1_name MFILKTP
                  seq1_name MYI.RTP

 -l < file >  Load color code file.

              Format: < symbol > < color >
              (Lines starting with # are ignored (comment lines))

              Example of color code file:

                  # Aroma
                  F YELLOW
                  Y YELLOW
                  W YELLOW

                  # Yuck
                  D RED 
                  N GREEN  
                  X BLUE

              Available colors:

                    WHITE 
                    BLACK 
                    LIGHTGRAY 
                    DARKGRAY
                    RED 
                    GREEN 
                    BLUE
                    YELLOW 
                    CYAN 
                    MAGENTA
                    LIGHTRED 
                    LIGHTGREEN 
                    LIGHTBLUE
                    DARKRED 
                    DARKGREEN 
                    DARKBLUE
                    PALERED 
                    PALEGREEN 
                    PALEBLUE
                    PALEYELLOW 
                    PALECYAN 
                    PALEMAGENTA
                    BROWN 
                    ORANGE 
                    PALEORANGE
                    PURPLE 
                    VIOLET 
                    PALEVIOLET
                    GRAY 
                    PALEGRAY
                    CERISE 
                    MIDBLUE


 -L < file>  Load markup and organism color code file.
             Colour the markup text by residue or colour organism in tree.

	     Example to set color of letters A and B:
	     A GREEN
	     B YELLOW

	     Example to set color of organism human:
	     #=OS BLUE human


 -R          Do not parse coordinates when reading alignment.

 -o < format> Write alignment or tree to stdout in this format and exit.
                Valid formats: MSF, Mul(Stockholm), Selex,
                               FastaAlign, Fasta, tree.

 -X <#>      Print UPGMA-based subfamilies at cutoff #.

 -n < cutoff> Make non-redundant to <cutoff> %identity at startup.

 -Q < cutoff> Remove columns more gappy than <cutoff>.

 -q < cutoff> Remove sequences more gappy than <cutoff>.

 -G          Penalize gaps in pairwise comparisons.

 -i          Ignore gaps in conservation calculation.

 -P          Remove partial sequences at startup.

 -C          Don't write coordinates to saved file.

 -z < char>   Separator char between name and coordinates in saved file.

 -a          Show alignment annotations on screen (Stockholm format only).

 -p          Output random model probabilities for HMMER.
                (Based on all residues.)

 -S < order>  Sort sequences in this order.
                a -> alphabetically
                o -> by Swissprot organism, alphabetically
                s -> by score
                n -> by Neighbor-joining tree
                u -> by UPGMA tree
                S -> by similarity to first sequence
                i -> by identity to first sequence

 -T < method> Tree options:
                i -> Start up showing tree
                I -> Start up showing only tree
                d -> Show distances in tree
                n -> Neighbor-joining
                u -> UPGMA
                c -> Don't color tree by organism
                o -> Don't display sequence coordinates in tree
                b -> Use Scoredist distance correction (default)
                j -> Use Jukes-Cantor distance correction
                k -> Use Kimura distance correction
                s -> Use Storm & Sonnhammer distance correction
                r -> Use uncorrected distances
                p -> Print distance matrix and exit
                R -> Read distance matrix instead of alignment
                     (only in combination with Tree routines)

 -b <#>      Apply bootstrap analysis with # bootstrap samples

 -B          Print out bootstrap trees and exit
                (Negative value -> display bootstrap trees on screen)

 -O < label>  Read organism info after this label (default OS)

 -u          Start up with uncoloured alignment (faster).

 -c           Print Conservation table.

 -s < file >  Read in file of scores.  A column with scores will
              automatically appear after the coordinates.

              Format: < score > < sequence_id >

              Example of score file:

                  2.78 seq_1/180-206
                  2.78 seq_2/180-206
                  3.79 seq_3/42-94
              
 -t < title > Set window title.

 -g           Draw grid line (for debugging).

 -m < file >  Read file with matching sequence segments. This is used to
              display a match of a query sequence to a family.  The format
              of the match is:

              Line 1: Name/start-end score
              Line 2: Query sequence in matching segment, no pads!
              Line 3: Sequence of matching segments (qstart1 qend1 fstart1
              fend1 qstart2 qend2 fstart2 fend2  etc...).

              Example:

              ZK673.9/238-260 21.58
              CPENWVQFTGNGTQYGVCLRGFT
              1 2 1 2  4 7 8 11  

              NOTE: A sometimes easier way of doing this is to concatenate
              the match to the end of the alignment, after a line with
              exactly this string within the quotes: "# matchFooter"


 Some X options:
 -acefont < font>   Main font.
 -font    < font>   Menu font.

 Note: X options only work after "setenv POSIXLY_CORRECT"

 setenv BELVU_FETCH to desired sequence fetching program.

In other words, to get going, just type 'belvu my_alignment' and it should work. Belvu detects whether the alignment is in MSF, selex or Pfam format (see below), but you have to tell it with -r if the alignment is in 'raw' format (Name sequence).
The file of scores should be in the format "score name" with one score per sequence name in the alignment on a separate line.
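
A score file in this format can be produced by a script. The following is a minimal sketch in Python; the function name format_score_file and the sample data are assumptions made for illustration and are not part of Belvu.

```python
def format_score_file(scores):
    """Return score-file text: one "<score> <sequence_id>" line per sequence.

    `scores` maps sequence names (as they appear in the alignment) to numbers.
    """
    lines = []
    for name, score in scores.items():
        # Score first, then the sequence name, separated by a single space.
        lines.append(f"{score:.2f} {name}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data, mirroring the example in the help text.
example = {"seq_1/180-206": 2.78, "seq_2/180-206": 2.78, "seq_3/42-94": 3.79}
score_text = format_score_file(example)
```

The resulting text can be written to a file and passed to Belvu with the -s option. The two-decimal formatting is a choice made for this sketch, not a requirement stated by Belvu.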

NOTE: Make sure that all sequence names are unique; very strange things happen if they are not.
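
Name uniqueness can be checked before loading an alignment. The sketch below handles only the 'raw' format (one "name sequence" pair per line); the function name duplicate_names is hypothetical and the check is not something Belvu itself provides.

```python
from collections import Counter


def duplicate_names(raw_text):
    """Return the sorted sequence names that occur more than once.

    Each non-empty line is assumed to be "<name> <sequence>" (raw format).
    """
    names = [line.split()[0] for line in raw_text.splitlines() if line.strip()]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical input with one repeated name.
raw = "seq1_name MFILKTP\nseq2_name MYI.RTP\nseq1_name MFI.KTP\n"
dups = duplicate_names(raw)
```

An empty result means every name in the file is unique.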

Multiple alignment file formats supported by Belvu

Belvu currently supports MSF, selex and Pfam formats. Selex is the native format used by Sean Eddy's HMM package HMMER. The Pfam format is Pfam's native format, and has one domain per line like this: "Seqname/start-end SEQ.UENCE". The residues must be aligned and gaps should be represented by dots.
Note on the MSF format: The "..... Check: .." line has to come before the first line that does not start with a space. The only legal exception is the line "PileUp of:" from GCG programs.
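
To illustrate the Pfam line layout described above, here is a minimal Python sketch that splits one line into its parts and checks that all rows have the same aligned length. The regular expression and the function names are assumptions made for this example; they are not taken from Belvu's own parser.

```python
import re

# Pattern for "Seqname/start-end SEQ.UENCE". This expression is an
# assumption made for the sketch, not Belvu's actual parsing rule.
PFAM_LINE = re.compile(r"^(\S+)/(\d+)-(\d+)\s+(\S+)$")


def parse_pfam_line(line):
    """Split one Pfam-format line into (name, start, end, aligned_sequence)."""
    match = PFAM_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a Pfam-format line: {line!r}")
    name, start, end, seq = match.groups()
    return name, int(start), int(end), seq


def check_aligned(lines):
    """Return True if all aligned sequences have the same length."""
    lengths = {len(parse_pfam_line(line)[3]) for line in lines}
    return len(lengths) <= 1
```

For example, parse_pfam_line("ZK673.9/238-260 CPENWVQFTGNGTQYGVCLRGFT") yields the name, the two coordinates as integers, and the sequence string.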

Marking up the multiple alignment

Belvu supports the "Stockholm" format for marking up features in the alignment. These mark-up annotations are preceded by a 'magic' label, of which there are four types. At the moment, Belvu only parses the #=GC and #=GS markup labels.

For compatibility with the old SELEX format, #=RF is also accepted by Belvu as if it were "#=GC #=RF".

Phylogenetic analysis using Belvu

To go from a sequence alignment to a tree shown on screen, consider the following options.


 -o < format> Write alignment or tree to stdout in this format and exit.
                Valid formats: MSF, Mul(Stockholm), Selex,
                               FastaAlign, Fasta, tree.

 ...

 -T < method> Tree options:
                i -> Start up showing tree
                I -> Start up showing only tree
                d -> Show distances in tree
                n -> Neighbor-joining
                u -> UPGMA
                c -> Don't color tree by organism
                o -> Don't display sequence coordinates in tree
                b -> Use Scoredist distance correction (default)
                j -> Use Jukes-Cantor distance correction
                k -> Use Kimura distance correction
                s -> Use Storm & Sonnhammer distance correction
                r -> Use uncorrected distances
                p -> Print distance matrix and exit
                R -> Read distance matrix instead of alignment
                     (only in combination with Tree routines)
 -b <#>      Apply bootstrap analysis with # bootstrap samples
 -B          Print out bootstrap trees and exit
                (Negative value -> display bootstrap trees on screen)
So, calling belvu -b 100 -T I -T u -T b [sequence file] would show a bootstrapped UPGMA tree of Scoredist-corrected distances taken from the input file, and nothing else. Calling belvu -b 100 -o tree [sequence file] would print a Neighbor-joining bootstrap tree to standard output.
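
For pipeline use, the second command above (belvu -b 100 -o tree [sequence file]) can be assembled and run from a script. The following Python sketch is one way to do that. The function names are hypothetical, and it assumes that a belvu executable is on the PATH and that the tree is written to standard output as described above.

```python
import subprocess


def bootstrap_tree_command(alignment_path, samples=100):
    """Build the argument list for `belvu -b <samples> -o tree <alignment>`."""
    return ["belvu", "-b", str(samples), "-o", "tree", alignment_path]


def run_bootstrap_tree(alignment_path, samples=100):
    """Run Belvu and return the text it writes to standard output.

    Assumes `belvu` is installed and on PATH. Raises FileNotFoundError if
    it is not, and CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        bootstrap_tree_command(alignment_path, samples),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from the call makes it possible to inspect or log the exact arguments before anything is executed.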

Phylogenetic distance functionality

Belvu can compute and save a distance matrix for a sequence alignment under any of its supported metrics. It can also read a distance matrix file and reconstruct a tree from its content. The distance matrix format is as follows: the first row contains tab-separated sequence identifiers, and the remaining rows form the full distance matrix as tab-separated numbers. For example, use belvu -T b -T p [sequence file] to print the Scoredist matrix for a sequence alignment, or use belvu -T R [matrix file] to build a tree from an externally calculated distance matrix. If the distances are not evolutionary but only reflect some relationship, use the UPGMA algorithm with -T u. An example of a distance matrix is:

A  B  C
0 10 20
15 0 15
25 5  0
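
A matrix file in this layout can be read and sanity-checked before it is passed to Belvu. The Python sketch below is an illustration only: the function name parse_distance_matrix is hypothetical, and it splits fields on any whitespace (so it also accepts tab-separated input), which assumes that identifiers contain no spaces.

```python
def parse_distance_matrix(text):
    """Parse a distance matrix: a header row of identifiers, then numeric rows.

    Returns (names, rows). Fields are split on any whitespace, which also
    accepts tab-separated files.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()
    rows = [[float(field) for field in line.split()] for line in lines[1:]]
    # The matrix must have one row and one column per identifier.
    if len(rows) != len(names) or any(len(row) != len(names) for row in rows):
        raise ValueError("matrix must be square and match the header row")
    return names, rows


# The example matrix from the text, written with tab separators.
names, rows = parse_distance_matrix("A\tB\tC\n0\t10\t20\n15\t0\t15\n25\t5\t0\n")
```

The check rejects files whose row or column count does not match the number of identifiers in the first row.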

Reference

If you use this program for your work, please reference:

"Scoredist: A simple and robust protein sequence distance estimator"
Erik LL Sonnhammer and Volker Hollich
BMC Bioinformatics 6:108 (2005)