Fast Structural Similarity Search of Noncoding RNAs Based on Matched Filtering of Stem Patterns
- Creators
- Yoon, Byung-Jun
- Vaidyanathan, P. P.
- Other:
- Matthews, Michael B.
Abstract
Many noncoding RNAs (ncRNAs) have characteristic secondary structures that give rise to complicated base correlations in their primary sequences. Therefore, when performing an RNA similarity search to find new members of an ncRNA family, we need a statistical model, such as the profile-csHMM or the covariance model (CM), that can effectively describe the correlations between distant bases. However, these models are computationally expensive, making the resulting RNA search very slow. To overcome this problem, various prescreening methods have been proposed that first use a simpler model to scan the database and filter out the dissimilar regions. Only the remaining regions that bear some similarity are passed to a more complex model for closer inspection. It has been shown that the prescreening approach can significantly speed up the search with little or no loss of prediction accuracy. In this paper, we propose a novel prescreening method based on matched filtering of stem patterns. Unlike many existing methods, the proposed method can prescreen the database solely based on structural similarity. The proposed method can handle RNAs with arbitrary secondary structures, and it can be easily incorporated into various search methods that use different statistical models. Furthermore, the proposed approach has a low computational cost, yet it is very effective for prescreening, as will be demonstrated in the paper.
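The two-stage idea in the abstract (a cheap structural filter followed by an expensive model on the surviving regions) can be illustrated with a toy sketch. This is not the matched filter described in the paper: the abstract gives no implementation details, so the scoring rule (counting complementary base pairs for a single fixed stem), the function names `stem_score`, `prescreen`, and `search`, and the parameters `arm_len`, `gap`, and `threshold` are all assumptions made here for illustration.

```python
# Hypothetical illustration only; not the filter proposed in the paper.
# Watson-Crick and G-U wobble pairs are treated as complementary.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_score(seq, start, arm_len, gap):
    """Count complementary pairs for one candidate stem.

    The 5' arm is seq[start : start + arm_len]. The 3' arm is the next
    arm_len bases after a loop of `gap` bases, paired antiparallel.
    """
    left = seq[start:start + arm_len]
    right_start = start + arm_len + gap
    right = seq[right_start:right_start + arm_len]
    return sum((a, b) in PAIRS for a, b in zip(left, reversed(right)))


def prescreen(seq, arm_len, gap, threshold):
    """Return window start positions whose stem score reaches the threshold."""
    span = 2 * arm_len + gap
    return [
        i
        for i in range(len(seq) - span + 1)
        if stem_score(seq, i, arm_len, gap) >= threshold
    ]


def search(seq, arm_len, gap, threshold, expensive_score):
    """Run the costly scorer only on windows that pass the cheap filter.

    `expensive_score` is a caller-supplied stand-in for a full model
    such as a CM or profile-csHMM scorer.
    """
    span = 2 * arm_len + gap
    return {
        i: expensive_score(seq[i:i + span])
        for i in prescreen(seq, arm_len, gap, threshold)
    }
```

For the sequence `AAAAGGGCAAAAGCCCAAAA`, calling `prescreen(seq, 4, 4, 4)` keeps only the window starting at position 4, where `GGGC` pairs with `GCCC` across a four-base loop, so the expensive scorer would run on one window instead of nine. The filter looks only at pairing potential, not at sequence identity, which is the sense in which such a prescreen is purely structural.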
Additional Information
© 2007 IEEE. Issue Date: 4-7 Nov. 2007; Date of Current Version: 11 April 2008. This work was supported in part by the NSF grant CCF-0636799.
Files
Name | Size |
---|---|
md5:42ece43b3ff790a372f4ee28c4364af6 | 2.3 MB |
Additional details
- Eprint ID
- 19661
- Resolver ID
- CaltechAUTHORS:20100825-133532010
- NSF
- CCF-0636799
- Created
- 2010-08-26 (Created from EPrint's datestamp field)
- Updated
- 2021-11-08 (Created from EPrint's last_modified field)
- Other Numbering System Name
- INSPEC Accession Number
- Other Numbering System Identifier
- 9941469