Published September 15, 2010 | Supplemental Material + Published
Journal Article Open

Context dependent substitution biases vary within the human genome

Abstract

Background: Models of sequence evolution typically assume that different nucleotide positions evolve independently. This assumption is widely appreciated to be an over-simplification. The best known violations involve biases due to adjacent nucleotides. There have also been suggestions that biases exist at larger scales; however, this possibility has not been systematically explored.

Results: To address this, we developed a method that identifies over- and under-represented substitution patterns and assesses their overall impact on the evolution of genome composition. Our method is designed to account for biases at smaller pattern sizes, removing their effects. We used this method to investigate context bias in the human lineage after the divergence from chimpanzee. We examined bias effects in substitution patterns between 2 and 5 bp long and found significant effects at all sizes. This included some individual three and four base pair patterns with relatively large biases. We also found that bias effects vary across the genome, differing between transposons and non-transposons, between different classes of transposons, and also near and far from genes.

Conclusions: We found that nucleotides beyond the immediately adjacent one are responsible for substantial context effects, and that these biases vary across the genome.
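To make the idea of an over- or under-represented substitution pattern concrete, the following is a minimal sketch and not the authors' method. It assumes two gap-free, equal-length aligned sequences (a hypothetical ancestral and derived pair), looks only at 3 bp contexts, and uses a simple context-free substitution rate as the "smaller pattern size" baseline. The function name `context_bias` and all variable names are illustrative.

```python
from collections import Counter


def context_bias(ancestral, derived):
    """Return {(triplet, new_base): observed / expected} for 3 bp contexts.

    Assumption: both sequences are aligned, gap-free, and of equal length.
    'Expected' uses the context-free rate of the same single-base change,
    so a ratio above 1 marks an over-represented pattern.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")

    site_counts = Counter()     # occurrences of each ancestral base
    single_subs = Counter()     # (old_base, new_base) substitution counts
    context_counts = Counter()  # occurrences of each ancestral triplet
    context_subs = Counter()    # (triplet, new_base) substitution counts

    # Only interior sites have both a left and a right neighbour.
    for i in range(1, len(ancestral) - 1):
        old, new = ancestral[i], derived[i]
        triplet = ancestral[i - 1:i + 2]
        site_counts[old] += 1
        context_counts[triplet] += 1
        if old != new:
            single_subs[(old, new)] += 1
            context_subs[(triplet, new)] += 1

    ratios = {}
    for (triplet, new), observed in context_subs.items():
        old = triplet[1]
        # Context-free rate of old -> new, scaled by how often the triplet occurs.
        rate = single_subs[(old, new)] / site_counts[old]
        expected = rate * context_counts[triplet]
        ratios[(triplet, new)] = observed / expected
    return ratios


if __name__ == "__main__":
    # Toy sequences invented for illustration; not data from the paper.
    print(context_bias("ACGACGTCAA", "ATGACGTCAA"))
```

A ratio is reported only for patterns that were observed at least once, so under-represented patterns with zero observations would need separate handling, as would significance testing and longer contexts.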

Additional Information

© 2010 Nevarez et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: 2 April 2010; Accepted: 15 September 2010; Published: 15 September 2010.

We would like to thank Ran Libeskind-Hadas, Daniel Fielder, Lynn Bush and Steve Adolph for helpful discussions. Support for this work was provided by the NSF (MCB-0918335) and by an institutional grant to Harvey Mudd College from the Howard Hughes Medical Institute.

Authors' contributions: PAN carried out the analysis and wrote the manuscript. CMD, BAF and MAQ carried out the analysis. ECB designed the project, carried out the analysis, and wrote the manuscript. All authors read and approved the final paper.

Attached Files

Published - Nevarez2010p11658BMC_Bioinformatics.pdf

Supplemental Material - 1471-2105-11-462-s1.pdf

Supplemental Material - 1471-2105-11-462-s2.pdf

Supplemental Material - 1471-2105-11-462-s3.csv

Supplemental Material - 1471-2105-11-462-s4.pdf

Supplemental Material - 1471-2105-11-462-s5.pdf

Supplemental Material - 1471-2105-11-462-s6.pdf

Files (2.5 MB)

md5:a053e8ed165f3719b614ebe9398842b0 - 13.9 kB

md5:a39b498da2862e90c17fad46937dd905 - 1.3 MB

md5:e282da69c7d00a329ccb33614db59856 - 146.0 kB

md5:4b6e8ee8b431715ac22faa6d7da1cbd6 - 322.8 kB

md5:77ddcc3fd6e5db2331bd3c5a4267964a - 47.4 kB

md5:cd7c313eae2d54e90cec2d97bcc8407a - 639.1 kB

md5:8c2133e2de1a80b4f0a4e655ba10ad1a - 33.0 kB

Additional details

Created: August 19, 2023

Modified: October 20, 2023