Published March 7, 2017 | Submitted
Report Open

Alignment Metric Accuracy

Abstract

We propose a metric for the space of multiple sequence alignments that can be used to compare two alignments to each other. In the case where one of the alignments is a reference alignment, the resulting accuracy measure improves upon previous approaches, and provides a balanced assessment of the fidelity of both matches and gaps. Furthermore, in the case where a reference alignment is not available, we provide empirical evidence that the distance from an alignment produced by one program to predicted alignments from other programs can be used as a control for multiple alignment experiments. In particular, we show that low accuracy alignments can be effectively identified and discarded. We also show that in the case of pairwise sequence alignment, it is possible to find an alignment that maximizes the expected value of our accuracy measure. Unlike previous approaches based on expected accuracy alignment that tend to maximize sensitivity at the expense of specificity, our method is able to identify unalignable sequence, thereby increasing overall accuracy. In addition, the algorithm allows for control of the sensitivity/specificity tradeoff via the adjustment of a single parameter. These results are confirmed with simulation studies that show that unalignable regions can be distinguished from homologous, conserved sequences. Finally, we propose an extension of the pairwise alignment method to multiple alignment. Our method, which we call AMAP, outperforms existing protein sequence multiple alignment programs on benchmark datasets. A webserver and software downloads are available at http://bio.math.berkeley.edu/amap/.
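As a rough illustration of comparing two alignments in a way that weighs matches and gaps equally, the sketch below scores a predicted pairwise alignment against a reference. It rests on an assumption that the abstract does not spell out: that agreement is counted per residue, where a residue agrees if it is aligned to the same partner, or to a gap, in both alignments. The function names `partners` and `agreement` are hypothetical and are not taken from the AMAP software.

```python
from typing import List, Optional, Tuple

GAP = "-"
Alignment = Tuple[str, str]  # two gapped rows of equal length


def partners(row_a: str, row_b: str) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """For every residue, record the index of the residue it is aligned to
    in the other sequence, or None if it is aligned to a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    part_a: List[Optional[int]] = []
    part_b: List[Optional[int]] = []
    i = j = 0  # current residue index in sequence a and sequence b
    for ca, cb in zip(row_a, row_b):
        a_res, b_res = ca != GAP, cb != GAP
        if a_res and b_res:      # match column
            part_a.append(j)
            part_b.append(i)
        elif a_res:              # residue of a opposite a gap
            part_a.append(None)
        elif b_res:              # residue of b opposite a gap
            part_b.append(None)
        i += a_res
        j += b_res
    return part_a, part_b


def agreement(predicted: Alignment, reference: Alignment) -> float:
    """Fraction of residues aligned identically (same partner or gap) in both
    alignments. ASSUMPTION: this per-residue count is an illustrative
    reading of the abstract, not a definition quoted from the report."""
    pred_a, pred_b = partners(*predicted)
    ref_a, ref_b = partners(*reference)
    if len(pred_a) != len(ref_a) or len(pred_b) != len(ref_b):
        raise ValueError("alignments must be over the same two sequences")
    total = len(ref_a) + len(ref_b)
    if total == 0:
        return 1.0
    same = sum(p == r for p, r in zip(pred_a, ref_a))
    same += sum(p == r for p, r in zip(pred_b, ref_b))
    return same / total


reference = ("AC-GT", "ACTGT")
predicted = ("ACG-T", "ACTGT")
score = agreement(predicted, reference)
```

Under this assumption, one minus the score is symmetric in its two arguments, so the same function can compare the outputs of two programs when no reference alignment is available.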

Additional Information

A.S. was partially supported by NSF grant EF 03-31494. G.M. was supported by the Max-Planck / Alexander von Humboldt International Research Prize. L.P. was partially supported by a Sloan Research Fellowship.

Attached Files

Submitted - 0510052.pdf

Files

0510052.pdf (227.1 kB)
md5:40c8b8f79ab359d2d3e4438108b9e669

Additional details

Created: August 19, 2023
Modified: October 24, 2023