Published March 7, 2006
Journal Article Open

An ancient evolutionary origin of the Rag1/2 gene locus

Abstract

The diversity of antigen receptors in the adaptive immune system of jawed vertebrates is generated by a unique process of somatic gene rearrangement known as V(D)J recombination. The Rag1 and Rag2 proteins are the key mediators of this process. They are encoded by a compact gene cluster that has exclusively been identified in animal species displaying V(D)J-mediated immunity, and no homologous gene pair has been identified in other organisms. This distinctly restricted phylogenetic distribution has led to the hypothesis that one or both of the Rag genes were coopted after horizontal gene transfer and assembled into a Rag1/2 gene cluster in a common jawed vertebrate ancestor. Here, we identify and characterize a closely linked pair of genes, SpRag1L and SpRag2L, from an invertebrate, the purple sea urchin (Strongylocentrotus purpuratus) with similarity in both sequence and genomic organization to the vertebrate Rag1 and Rag2 genes. They are coexpressed during development and in adult tissues, and recombinant versions of the proteins form a stable complex with each other as well as with Rag1 and Rag2 proteins from several vertebrate species. We thus conclude that SpRag1L and SpRag2L represent homologs of vertebrate Rag1 and Rag2. In combination with the apparent absence of V(D)J recombination in echinoderms, this finding strongly suggests that linked Rag1- and Rag2-like genes were already present and functioning in a different capacity in the common ancestor of living deuterostomes, and that their specific role in the adaptive immune system was acquired much later in an early jawed vertebrate.

Additional Information

© 2006 by The National Academy of Sciences of the USA

Edited by Masatoshi Nei, Pennsylvania State University, University Park, PA, and approved January 10, 2006 (received for review November 8, 2005). This paper was submitted directly (Track II) to the PNAS office. Published online before print February 27, 2006, 10.1073/pnas.0509720103

We thank Gary W. Litman, Eric H. Davidson, Ellen V. Rothenberg, Michael J. Pazin, Michele K. Anderson, and F. Nina Papavasiliou for comments on the manuscript, and L. Courtney Smith and Susanna M. Lewis for helpful discussions. We are grateful to Darrell Norton, C. Titus Brown, and Gail Mueller for technical help. We thank Samuel Schluter (University of Arizona, Tucson) for providing the shark Rag1 and Rag2 cDNAs. In addition, we thank David G. Schatz and Eric H. Davidson for their support and stimulating discussions. This work was supported by a Canadian Foundation for Innovation grant and funds from the Sunnybrook and Women's Research Institute (to J.P.R.), and by the Intramural Research Program of the National Institute on Aging/National Institutes of Health.

Author contributions: S.D.F. and J.P.R. designed research; S.D.F., C.M., L.A.N., and J.P.R. performed research; R.A.C. contributed new reagents/analytic tools; S.D.F., C.M., R.A.C., and J.P.R. analyzed data; and S.D.F., C.M., and J.P.R. wrote the paper.

Conflict of interest statement: No conflicts declared.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. DQ082723 and DQ082724).

Fig. 6. A SpRag1L cDNA translation. Green highlighting indicates putative start of translation. Red highlighting indicates termination codon. Yellow background indicates untranslated regions. Polyadenylation signal sequence is boxed. Exons 1-4 are designated with alternating blue and black text. An upstream in-frame stop codon that is also found in the sequence of BAC clones 149P17 and 78F1 is underlined. This sequence has been submitted to GenBank under accession number DQ082723.

Fig. 7. Surrounding genomic sequence, splice sites, and untranslated 5' and 3' regions for the SpRag1L (a) and SpRag2L (b) genes as determined from BAC clone sequencing, cDNA sequencing, and from the 11/23/04 genomic sequence assembly (www.hgsc.bcm.tmc.edu/projects/seaurchin/). Yellow highlighting indicates transcribed regions. Green highlighting indicates conserved splice dinucleotides. Red highlighting indicates termination codons. Potential translation start codons are indicated with green text. Triplets are underlined at each splice site to indicate reading frame. Polyadenylation signals identified from sequencing of 3' RACE products are highlighted in blue. An in-frame upstream stop codon found in the sequence of BAC clones 149P17 and 78F1 is boxed. Sequences are found in Supertigs 17631 and 89807 in the 11/23/04 assembly of the Strongylocentrotus purpuratus genome sequence.

Fig. 8. Translation of SpRag2L gene. Green highlighting indicates potential translation start sites. The second ATG triplet is a preferred start of translation. Red highlighting indicates the termination codon. Yellow background indicates untranslated regions. The polyadenylation signal sequence is boxed. Exons 1-3 are designated with alternating blue and black text. This sequence has been submitted to GenBank under accession number DQ082724.

Fig. 9. Sequence alignments of a portion of the SpRag2L (Sp) putative kelch-repeat/β-propeller region to a partial Rag2 sequence from the osteoglossiform fish Mormyrus rume (Mr, AAF43356.1) taken from a BLASTP search using the non-PHD region of the SpRag2L gene. BLAST searches using this region as a query show low matches to a variety of kelch-repeat proteins, as might be expected if the SpRag2L gene has diverged to a point where little primary sequence affinity remains apart from the residues that are constrained to remain unchanged. For this reason, we rely on gene structure predictions, genomic position, and functional characteristics to assign homology with Rag2, rather than primary sequence identities. Nonetheless, when vertebrate proteins are analyzed, the top scoring sequences are dominated by matches with Rag2 genes (like the alignment shown above), suggesting that some signal remains.

Fig. 10. SpRag1L exon 3 sequence. The third 2,578-bp exon of SpRag1L contains a 2,070-bp region with 57 repeated variants of a 24-bp motif (consensus: ACAGCCCCTTTAACCCCAACTGCC; alternating black and white background) and 4 variants of a 126-bp region (blue background). We determined the sequence of this region by direct sequencing from two BAC clones (149P17 and 78F1). Nucleotides that vary from the consensus (shown in red) were used to maintain continuity across the sequence. The sequence of a 308-bp region that was impossible to resolve by direct BAC sequencing in the context of these repeats was unambiguously bridged using a WGS trace sequence (214582154; www.ncbi.nlm.nih.gov/Traces/trace.cgi). Our sequence differs from the genome assembly but is consistent with restriction mapping (using Eae1 and Sml1), and PCR measurements made on both BAC clone DNAs and on genomic DNA from the animal used for the genome sequence (data not shown). The exon encodes an ORF of 859 aa. Four separate cDNA sequences from this region taken from different animals track the genomic sequence at each end of the repeats but are missing internal regions. Each of these cDNAs maintains the correct reading frame throughout this region and into flanking exons. These length differences may result from polymorphism or a more complex form of noncanonical splicing. Yellow highlighting indicates nonrepetitive coding sequence. Splice junction dinucleotides are highlighted in green.

Fig. 11. Relative message prevalence of SpRag1L (blue) and SpRag2L (red) is correlated in adult coelomocytes as it is in embryos. Coelomocyte samples were taken from six individuals. Coelomocytes are primarily composed of four different cell types. Analyses of separated subpopulations were variable (and thus the respective data are preliminary), but indicate that the SpRagL genes are expressed in cells other than the phagocytic cells that make up the majority of the coelomocytes (40–60%; data not shown). SpRag1/2L expression was found in other adult tissues. Whether SpRagL expression is cell type-specific or is correlated with other cellular parameters such as state of differentiation or proliferation is unresolved. Note that mature differentiated coelomocytes may not be the primary cell types that express these genes. Expression levels are normalized to 18S rRNA measurements.

Fig. 12. DNA binding of SpRag1L. (A) Gel-shift using a 12-recombination signal sequence (RSS) ssDNA oligonucleotide substrate and increasing concentrations (40 ng, 200 ng, and 1 μg) of recombinant MBP-Rag1cd protein from mouse (mm) or sea urchin (sp). Complexes were resolved on native 4% acrylamide gels. The free probe (p) and the shifted complex (c) are indicated by arrows, and a degradation product observed at high concentrations of proteins is marked by an asterisk. (B) Supershift using a polyclonal anti-MBP antiserum. Polyclonal anti-HMG2 antiserum was used as a negative control. Note that the supershift (s) is only partial due to the low Ig concentration in the available anti-MBP antiserum. (C) Heptamer specificity of MmRag1cd and SpRag1Lcd DNA binding. Competition gel-shift experiments were performed using 200 ng of recombinant proteins. The free probe (p) and the shifted complex (c) are indicated by arrows, and a degradation product observed at high concentrations of proteins is marked by an asterisk. Decreasing amounts (160 fmol, 40 fmol, and 10 fmol) of unlabelled 12-RSS oligos, either WT, heptamer-mutated (HM) (5'-gatctggctcgtcttaGAGAAGCatatagaccttaacaaaaacctgcactcgagcggag-3'), or nonamer-mutated (NM) (5'-gatctggctcgtcttacacagtgatatagaccttaAGGCTCTGAtgcactcgagcggag-3'), were added to the binding reactions as indicated above the gels. For both MmRag1cd and SpRag1Lcd, the oligo with the mutated heptamer (HM) competed less well than those with an intact heptamer (WT and NM).
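The layout of the two competitor oligos in the Fig. 12 legend can be checked with a short script. This is only an illustrative sketch, not part of the published methods: it assumes that upper-case letters mark the mutated bases and that the spaces inside the printed sequences are line-break residue, so they are removed here. The function name mutated_runs is hypothetical.

```python
import re

# Competitor oligos as printed in the Fig. 12 legend. Assumption: the spaces
# inside the printed sequences are line-break residue and are removed here;
# upper case is taken to mark the mutated bases.
HM = "gatctggctcgtcttaGAGAAGCatatagaccttaacaaaaacctgcactcgagcggag"
NM = "gatctggctcgtcttacacagtgatatagaccttaAGGCTCTGAtgcactcgagcggag"


def mutated_runs(oligo):
    """Return (0-based start, sequence) for each run of upper-case bases."""
    return [(m.start(), m.group()) for m in re.finditer(r"[A-Z]+", oligo)]


hm_runs = mutated_runs(HM)  # mutated heptamer in the HM oligo
nm_runs = mutated_runs(NM)  # mutated nonamer in the NM oligo

# Spacer length = bases between the end of the heptamer position (from HM)
# and the start of the nonamer position (from NM).
hept_start, hept = hm_runs[0]
non_start, non = nm_runs[0]
spacer_len = non_start - (hept_start + len(hept))

print("HM mutated run:", hm_runs)
print("NM mutated run:", nm_runs)
print("spacer length:", spacer_len)
```

Under these assumptions, each oligo carries a single mutated run: 7 bases in HM and 9 bases in NM, separated by a 12-base spacer, which matches the 12-RSS substrate named in the legend.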

Attached Files

Published - FUGpnas06.pdf

Additional details

Created: August 22, 2023
Modified: October 16, 2023