5 Conclusion and Future work





The results of sequence similarity algorithms are large, discrete, and multi-dimensional, which makes analysis of the textual reports difficult and ineffective. Even though positional, compositional, and similarity information is contained in these reports, it is difficult to determine the biological significance of the alignments. For example, it is hard to tell whether two alignments lie in the same region, or whether there are regions where two sequences share similar conserved composition.

AV provides a novel data representation and visualization method for the output of BLAST, one of the most popular of these alignment algorithms. The graphical representation provides a visual index to the algorithm output. AV encodes alignment data by using features such as colors and layers for different frames, and allows interactive zooming, translation, and rotation. The fat line technique provides real-time feedback during interaction. The alignment matrix curves present an estimate of the evolutionary distance of an alignment.

AV visualizes the high-level information contained in the similarity reports. The advantages of this approach are that AV offers a concise view of the global features and eases interpretation. The potential disadvantage of omitting the details in the textual reports is mitigated by hyperlinks from the visualization to the text report. AV is used as a visualizer for data contained in these hypertext reports, in the same way external viewers are used to view images in World-Wide Web (WWW) documents. The similarity reports of plant genome sequences on our WWW site (http://lenti.med.umn.edu) contain 15,000 AV visualizations. The WWW documents make the data easier to access and distribute to the biology community. AV is in regular use by biologists in our research group.

We plan several future enhancements. The capability to choose colors interactively is needed to suit different output media. Also, the Y-axis could encode biologically meaningful similarity measures other than the similarity score. We also plan to develop techniques for viewing the output of multiple algorithms, as well as techniques for viewing several reports simultaneously.

As shown in the case studies, AV greatly reduces the time required to review and analyze BLAST output. By presenting the information from an alignment report visually, AV frees molecular biologists from the drudgery of the text reports and makes new types of analysis possible.






Ed H. Chi (echi@cs.umn.edu)
Fri Apr 28 12:51:35 CDT 1995