Welcome to the new version of CaltechAUTHORS. Login is currently restricted to library staff. If you notice any issues, please email coda@library.caltech.edu
Published April 2004 | Published
Journal Article Open

Accurate Identification of Novel Human Genes Through Simultaneous Gene Prediction in Human, Mouse, and Rat

Abstract

We describe a new method for simultaneously identifying novel homologous genes with identical structure in the human, mouse, and rat genomes by combining pairwise predictions made with the SLAM gene-finding program. Using this method, we found 3698 gene triples in the human, mouse, and rat genomes which are predicted with exactly the same gene structure. We show, both computationally and experimentally, that the introns of these triples are predicted accurately as compared with the introns of other ab initio gene prediction sets. Computationally, we compared the introns of these gene triples, as well as those from other ab initio gene finders, with known intron annotations. We show that a unique property of SLAM, namely that it predicts gene structures simultaneously in two organisms, is key to producing sets of predictions that are highly accurate in intron structure when combined with other programs. Experimentally, we performed reverse transcription-polymerase chain reaction (RT-PCR) in both the human and rat to test the exon pairs flanking introns from a subset of the gene triples for which the human gene had not been previously identified. By performing RT-PCR on orthologous introns in both the human and rat genomes, we additionally explore the validity of using RT-PCR as a method for confirming gene predictions.

Additional Information

© 2004 Cold Spring Harbor Laboratory Press. The Authors acknowledge that six months after the full-issue publication date, the Article will be distributed under a Creative Commons CC-BY-NC License (Attribution-NonCommercial 4.0 International License, http://creativecommons.org/licenses/by-nc/4.0/). Accepted January 26, 2004. Received November 5, 2003. L.P. and C.D. were partially supported by NIH grant R01 HG2362-2. The whole-genome SLAM runs were performed on the Affymetrix computing cluster. R.G. and J.Q.W. were partially supported by grants from the NHGRI/NHLBI (1 U54 HG02345) and NCI/SAIC (20XS182A). The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.

Attached Files

Published - 661.full.pdf

Files

661.full.pdf
Files (201.7 kB)
Name Size Download all
md5:2314914bc1f58c3b993836fe71fdc2a7
201.7 kB Preview Download

Additional details

Created:
August 19, 2023
Modified:
October 24, 2023