The chloroplast ribosomal intron of Chlamydomonas reinhardii codes for a polypeptide related to mitochondrial maturases

The sequences of the 888bp chloroplast ribosomal intron and of the flanking 23S rRNA gene regions of Chlamydomonas reinhardii have been established. The intron can be folded into a secondary structure that is typical of group I introns of fungal mitochondrial genes. It contains a 489bp open reading frame encoding a potential polypeptide that is related to mitochondrial maturases.


INTRODUCTION
It is well documented that several mitochondrial genes from lower eukaryotes and chloroplast genes from algae and higher plants contain introns. In yeast mitochondria, split genes include those of the large rRNA (1), of apocytochrome b (2,3) and of subunit I of cytochrome oxidase (4). Chloroplast introns of higher plants occur mostly in tRNA genes (5) and more rarely in protein genes (6). In green algae, chloroplast introns have been found in numerous protein genes (7,8,9,10) and in the 23S rRNA gene (11). Recently Michel and Dujon have performed an extensive sequence comparison of introns from fungal mitochondria and from rRNA genes of lower eukaryotes (12,13). These introns, which do not follow the GT...AG rule of introns from higher eukaryotes, can be divided into two groups, I and II, based on their secondary structure and on conserved sequence elements. Among the chloroplast introns that have been sequenced, at least two, the introns of the tRNA-Ile and tRNA-Ala genes of Zea mays (14), appear to fit into group II (13). In contrast, the introns of the tRNA-Leu genes from Vicia faba (15) and maize (16) belong to group I.
The occurrence in yeast mitochondria of splicing deficient mutations within introns that can be complemented in trans, and the finding of large intron open reading frames that are in phase with the preceding exons, have led to the concept of maturase, a chimeric exon-intron encoded polypeptide that participates in the splicing reactions (3). Comparison of the sequences of these intron open reading frames has revealed two conserved elements, P1 and P2, each coding for 12 amino acids (12,31,34). While similar open reading frames have been found in mitochondrial introns from other fungi, they have not yet been detected in other organisms.
Nucleic Acids Research, Volume 13 Number 3 1985, page 975. © IRL Press Limited, Oxford, England.
Here we report the sequence of the intron of the chloroplast 23S rRNA gene of Chlamydomonas reinhardii. This 888bp intron, which can be folded into a secondary structure typical of group I introns, contains an open reading frame coding for a putative polypeptide of 163 amino acids. A striking feature of the amino acid sequence is the presence of a dodecapeptide that is highly related to the P1 element of group I introns (12). This is the first evidence for a protein coding sequence of this sort outside of fungal mitochondria.

MATERIALS AND METHODS
DNA and RNA
The plasmid containing the chloroplast ribosomal BamHI-HindIII fragment which includes the ribosomal intron was described previously (21). Plasmid DNA was prepared as described by Katz et al. (22). DNA sequencing was performed by the chemical cleavage method of Maxam and Gilbert (23). The DNA sequence analysis was performed on a Hewlett Packard computer, model 9845.
The folding model of the intron (fig. 3) was obtained as described by Michel et al. (12). C. reinhardii RNA was isolated as described (24).
Mapping of the 3' end of the 23S rRNA gene
The 166bp AluI fragment (fig. 1) was 3' end labelled with DNA polymerase (Klenow fragment), denatured and annealed with C. reinhardii RNA. The hybrids were digested with S1 nuclease (25) and the S1 resistant DNA products were sized on a 5% sequencing gel using the partial cleavage products of the sequencing reactions as size standards.
[Fig. 1 legend, partial: ... (12,31,34). The sequencing strategy is shown at the bottom. The 166bp AluI fragment used for mapping the 3' end of the 23S rRNA gene is indicated.]

RESULTS AND DISCUSSION
Previous studies have revealed the presence of an intron in the chloroplast 23S rRNA gene of C. reinhardii near its 3' end (11) and have allowed us to establish the sequence of the two junctions between the intron and the flanking rRNA coding sequences (21). Fig. 1 displays a restriction map of the intron and of the flanking 23S rRNA coding regions, which are contained in a 1623 bp BamHI-HindIII fragment. The sequencing strategy is also indicated. The BamHI site maps within the 23S rRNA gene sequence and the HindIII site is located 50bp downstream of the 3' end of the 23S rRNA gene (figs. 1,2, cf. Materials and Methods). The intron is located in a region of the 23S rRNA gene that has been highly conserved in different organisms (26,27). Comparison with the corresponding E. coli sequence (26) reveals a sequence homology of 78% for the 389bp upstream region and 72.5% for the 190bp downstream region (relative to the intron). The homology between the last 100 bases of the 23S rRNA genes of C. reinhardii and E. coli is only 59%. It is noted that the intron is located near the homologous region of the yeast mitochondrial 21S rRNA gene where mutations have been sequenced that confer resistance to chloramphenicol (28, fig. 2). A uniparental chloramphenicol resistant mutant has been isolated in C. reinhardii (29). It is therefore possible that this mutation is located in the region of the chloroplast 23S rRNA gene indicated in fig. 2. If true, this region would provide a new correlation site between the physical and genetic maps of the chloroplast genome of C. reinhardii.
[Fig. 2 legend, partial: ... (28) are marked with 0. S1 refers to the 3' end of the 23S rRNA gene. The intron ends are indicated by large dark wedges (21). The open reading frame starts at position 571 and ends at position 1059. The conserved dodecapeptide is underlined (12). Regions corresponding to box 9 and box 2 are indicated (30). Parts of the intron flanking regions determined previously (21) have been corrected.]
The ribosomal intron is significantly richer in AT (63.4%) than the surrounding rRNA gene region (49.6%). This 888bp intron contains an open reading frame which could encode a polypeptide of 163 amino acids (fig. 2). An unusual property of this basic protein is its high lysine content (11.6%).
Fig. 3 shows that the secondary structure of the intron closely resembles the structure of group I introns, with several characteristic helical regions (a to e) and loops. Typical features include the U residue preceding the 5' splice junction, which can basepair with a G within the intron in helix a, and the presence of a G at the 3' end of the intron. The position of the open reading frame is unusual: it starts in the loop of helix d and continues through helix d (fig. 3). In most of the group I introns the open reading frame is in the loop of helix e (12).
Another distinctive feature of group I introns is the presence of two elements related to the box 9 and box 2 sequences of the fourth intron of the yeast mitochondrial cytochrome b gene (30). Mutations in these elements are cis-dominant and are thought to destroy recognition sequences involved in splicing. The C. reinhardii box 2 homologue is highly related to the fungal consensus sequence (figs. 2,3). In contrast, the C. reinhardii box 9 homologue (figs. 2,3) is a rare variant which has also been found in aI5a and aI5f, two introns in the yeast mitochondrial gene of subunit I of cytochrome oxidase (31).
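The base composition and open reading frame figures quoted above are simple to recompute once a sequence is in hand. The following is a minimal sketch, not the procedure used on the original computer: the helper names `at_content` and `orf_span` are hypothetical, the coordinates are assumed to be 1-based and inclusive, and the short test sequence is a toy string rather than part of the intron.

```python
from typing import Tuple


def at_content(seq: str) -> float:
    """Percentage of A and T (or U) among the unambiguous bases of seq."""
    counted = [b for b in seq.upper() if b in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous nucleotides")
    at = sum(b in "ATU" for b in counted)
    return 100.0 * at / len(counted)


def orf_span(start: int, end: int) -> Tuple[int, int]:
    """Length in bp and codon count for a 1-based, inclusive range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("span is not a whole number of codons")
    return length, length // 3


# Coordinates quoted in the Fig. 2 legend for the open reading frame.
print(orf_span(571, 1059))                 # (489, 163)

# Toy sequence (hypothetical, not taken from the intron).
print(round(at_content("ATTAGCATTA"), 1))  # 80.0
```

Under these assumptions the quoted coordinates give 489bp and 163 codons, in line with the lengths stated in the text. Whether the termination codon is counted inside those coordinates is not specified here, so the sketch does not try to decide it.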
There are only four complementary bases between the box 2 and box 9 regions (f and f' in fig. 3), whereas in most cases the two elements have five complementary bases (13). Two additional helices are present on the 3' side of f', as has also been observed in fungal ribosomal introns (32). Davies [...] sequences located in the first loop of the intron and in the region downstream of the other intron-exon junction. Sequences of this type are also found in C. reinhardii (marked by arrows with perpendicular tails in fig. 3). However, an alternative pairing which brings the two intron ends into close proximity is also possible (fig. 4a,b).
[Fig. 3 legend, partial: ... (12). The two complementary regions of the box 9 and box 2 elements are indicated by f and f', respectively. Nucleotides that are conserved in the ribosomal intron of Kluyveromyces thermotolerans (32) are shaded and those that are maintained in the aI5a intron of S. cerevisiae (31) are marked with dots. The last six codons of the open reading frame are indicated with the corresponding amino acids. Codons used rarely in chloroplast protein genes are marked with asterisks. X designates the stop codon. The arrows with perpendicular tails near the ends of the intron (marked by short arrows) indicate two possible complementary guide sequences (33).]
An interesting feature of the open reading frame of the ribosomal intron of C. reinhardii is the presence near its N-terminal end of a dodecapeptide, YLAGFVDGDGSI, which is highly related to one of the group I consensus sequences, YLAGLVDGDGYF [...] with mitochondrial intron open reading frames suggests that the chloroplast protein has a similar function and that it may be involved in the splicing reactions. There is, however, no proof that this protein is synthesized. The fact that there is no apparent ribosome binding site near the ATG initiation codon of the open reading frame does not necessarily imply that it is not expressed, since the absence of a ribosome binding site has also been reported for the chloroplast gene of the 5 subunit of ATP synthase in spinach (35). In yeast it has recently been possible to prepare antibodies against maturases and to demonstrate their presence in splicing deficient mutants, but not in wild type cells, presumably because these proteins are highly unstable (38,39). In contrast to typical maturases, whose coding regions include both exon and intron sequences, the chloroplast ribosomal intron open reading frame is contained entirely within the intron, as is the case for its homologues in the introns of the mitochondrial 21S rRNA genes from the yeasts Saccharomyces cerevisiae (28) and Kluyveromyces thermotolerans (32). Recently, intron open reading frames that prolong the upstream exons have been found in a chloroplast gene of C. reinhardii (10). The codon usage in the chloroplast ribosomal open reading frame is rather unusual and differs considerably from the restricted codon usage found in chloroplast protein coding sequences of C. reinhardii. Only 51 different codons were found among 1505 codons examined from several chloroplast genes (37).
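The relatedness of the two dodecapeptides quoted at the start of this paragraph can be checked by a position-by-position comparison. The sketch below is illustrative only: the function name `compare_peptides` is hypothetical, and it counts exact identities without any alignment gaps or scoring of conservative substitutions.

```python
def compare_peptides(a: str, b: str):
    """Return (number of identical positions, 1-based mismatch positions)."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    mismatches = [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(a) - len(mismatches), mismatches


chloroplast = "YLAGFVDGDGSI"  # C. reinhardii intron open reading frame
consensus = "YLAGLVDGDGYF"    # group I consensus sequence quoted in the text

identical, differing = compare_peptides(chloroplast, consensus)
print(identical, "of", len(chloroplast))  # 9 of 12
print(differing)                          # [5, 11, 12]
```

With the sequences as printed here, nine of the twelve residues are identical, and the differences fall at positions 5, 11 and 12.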
It is interesting to note that, of the 10 missing codons, eight are present in the ribosomal open reading frame and they all end with C or G (Table I). It can be seen in fig. 3 that several of the codons which participate in the helical structure d are rarely used in chloroplast gene sequences of C. reinhardii (Table I), suggesting that in this case codon usage is governed by secondary structure requirements. Similar observations have been made for open reading frames of mitochondrial introns (36).
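The count of 10 missing codons follows from the 61 sense codons of the genetic code minus the 51 observed in the chloroplast genes. The sketch below shows that arithmetic and how missing codons ending in C or G could be listed from a usage table. It assumes the three standard termination codons, the function names are hypothetical, and the usage table is a toy example, not the data of Table I.

```python
from itertools import product

# Assumption: the standard termination codons apply (written as DNA).
STOP_CODONS = {"TAA", "TAG", "TGA"}

# All 64 triplets, minus the three stops, leaves the 61 sense codons.
SENSE_CODONS = sorted(
    "".join(triplet)
    for triplet in product("ACGT", repeat=3)
    if "".join(triplet) not in STOP_CODONS
)


def missing_codons(used):
    """Sense codons that never occur in the collection `used`."""
    return sorted(set(SENSE_CODONS) - {c.upper() for c in used})


def ending_in_c_or_g(codons):
    """Subset of `codons` whose third position is C or G."""
    return [c for c in codons if c[-1] in "CG"]


# 61 sense codons, 51 observed in the chloroplast genes: 10 are missing.
print(len(SENSE_CODONS) - 51)         # 10

# Hypothetical usage table lacking two codons (illustration only).
toy_used = [c for c in SENSE_CODONS if c not in {"CTC", "AGA"}]
toy_missing = missing_codons(toy_used)
print(toy_missing)                    # ['AGA', 'CTC']
print(ending_in_c_or_g(toy_missing))  # ['CTC']
```

Applied to a real codon usage table, the same two helpers would list the unused codons and show which of them have C or G in the third position.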
Little is known about the mechanism of splicing of the chloroplast rRNA precursor in C. reinhardii, except that hybridization of RNA with an intron specific probe reveals the presence of a transcript equal in size to the intron (40). Since group I introns also include the ribosomal intron of Tetrahymena pyriformis, which is capable of autocatalytic splicing (41), it will be of interest to determine whether the same holds for the chloroplast intron. In this case, there may be no strict requirement for the intron encoded polypeptide, which could merely act as a cofactor for improving the efficiency of splicing.