Fast search of sequences with complex symbol correlations using profile context-sensitive HMMs and pre-screening filters
- Creators
- Yoon, Byung-Jun
- Vaidyanathan, P. P.
Abstract
Recently, profile context-sensitive HMMs (profile-csHMMs) have been proposed, which are very effective in modeling the common patterns and motifs in related symbol sequences. Profile-csHMMs are capable of representing long-range correlations between distant symbols, even when these correlations are entangled in a complicated manner. This makes profile-csHMMs a useful tool in computational biology, especially in modeling noncoding RNAs (ncRNAs) and finding new ncRNA genes. However, a profile-csHMM-based search is quite slow and hence not practical for searching a large database. In this paper, we propose a practical scheme for making the search significantly faster without any degradation in the prediction accuracy. The proposed method utilizes a pre-screening filter based on a profile-HMM, which filters out most sequences that will not be predicted as a match by the original profile-csHMM. Experimental results show that the proposed approach can make the search eighty times faster.
Additional Information
© 2007 IEEE. Reprinted with Permission. Publication Date: 15-20 April 2007. Posted online: 2007-06-04. Work supported in part by NSF grant CCF-0636799 and the Microsoft Research Graduate Fellowship.
Files
Name | Size |
---|---|
md5:acc34de47a398ae845c312afebea1ac2 | 5.2 MB |
Additional details
- Eprint ID
- 9710
- Resolver ID
- CaltechAUTHORS:YOOicassp07b
- Created
- 2008-03-10 (created from EPrint's datestamp field)
- Updated
- 2021-11-08 (created from EPrint's last_modified field)