Issue
Ann. For. Sci.
Volume 66, Number 5, July-August 2009
Article Number 509
Number of page(s) 21
DOI http://dx.doi.org/10.1051/forest/2009037
Published online 09 July 2009

© INRA, EDP Sciences, 2009

1. INTRODUCTION

Castanopsis sieboldii (Makino) Hatusima var. sieboldii is an evergreen broadleaved canopy tree that is found on Honshu, Shikoku and Kyushu Islands in Japan (Yamazaki and Mashiba, 1987b). It is considered to be a climax species in evergreen forests and, in some places, it coexists with its congeneric species C. cuspidata (Thunb.) Schottky var. cuspidata. However, C. sieboldii var. sieboldii is mainly found growing in coastal regions, while C. cuspidata var. cuspidata predominates in drier hilly regions of the interior (Yamanaka, 1966). The two species can be morphologically distinguished by their nut size and shape and the structure of the leaf epidermis (Kobayashi and Sugawa, 1959; Yamazaki and Mashiba, 1987a). C. cuspidata var. cuspidata has small, globular nuts and a single layer of epidermis cells, while C. sieboldii var. sieboldii has large, oblong nuts and two layers of epidermis cells. Morphologically intermediate types have been frequently reported, especially at sites where the two species coexist (Kobayashi and Hiroki, 2003; Yamada and Miyaura, 2003). The differences in the distribution and morphology of these two species may reflect natural selection pressures.

The analysis of a large number of genetic markers within gene regions is an effective strategy for investigating the genetic basis of adaptation (Eveno et al., 2008; Kane and Rieseberg, 2007; Tsumura et al., 2007). The genetic information thus acquired can provide valuable insights into various aspects of phylogeography and forest ecology through the study of gene flow, and is also useful in practical applications such as the delineation of seed zones for reforestation programs.

For non-model organisms, one of the most cost-effective methods for obtaining information on genic sequences is to collect and analyze Expressed Sequence Tags (ESTs); these can be obtained by randomly sequencing cloned cDNA derived from mRNA. They represent expressed regions of the genome and can be easily analyzed in single-pass reads using high-throughput capillary sequencers. Another important feature of ESTs is that their putative functions can often be identified by similarity searches against publicly available databases; such functional information may help interpret the results of population genetic studies, particularly with respect to adaptive variations. In fact, some DNA sequences originating from ESTs show signs of adaptive evolution (Kado et al., 2003). As well as being important in fundamental genetic studies, the relationships between genetic variation and adaptation are important to consider in conservational phylogeography.

In the present study, we constructed a cDNA library and generated ESTs using tissue from the inner bark of Castanopsis sieboldii var. sieboldii to obtain transcribed sequences and to develop transcript-based markers. Putative unigenes were constructed from the ESTs we obtained and the functions of the putative unigenes were proposed on the basis of similarity searches against public databases. PCR primers were designed for microsatellites, or simple sequence repeats (SSRs), detected within our database. The ease of use and the highly polymorphic nature of microsatellite markers mean that the EST-SSR markers developed in the present study should be useful in genetic diversity studies not only of C. sieboldii var. sieboldii, but also of related species.

2. MATERIALS AND METHODS

2.1. Source of mRNA and cDNA sequencing

RNA was extracted from several twigs (about 3 cm in diameter) cut from a C. sieboldii var. sieboldii tree (diameter at breast height ≈ 40 cm) growing in the arboretum at the Forestry and Forest Products Research Institute, Tsukuba, Japan, on a fine morning in May 2005. The tree received no direct sunshine in the morning, because it is located on the northern side of the institute building. The origin of the tree was unclear. We used a cutter to peel off the outer bark and then strip the inner bark, which was immediately frozen in liquid nitrogen. The inner bark (ca. 50 g), including the cambium and phloem, was then sliced and ground to a fine powder in liquid nitrogen using a mortar and pestle. The CTAB method (Chang et al., 1993) was used to extract total RNA from the powder; this was further purified using an SV total RNA isolation system (Promega, Madison, USA). The mRNA was recovered using an Oligotex-dT30 Super mRNA purification Kit (Takara, Otsu, Japan); a cDNA Library construction Kit (Stratagene, La Jolla, USA) was used to synthesize first-strand cDNA using oligo (dT)18 primers. Synthesized cDNAs were size-selected using CHROMA-SPIN+TE-1000 (Clontech, Mountain View, USA), ligated into pBluescript II SK(+) vectors (Stratagene, La Jolla, USA) and transformed into competent DH10B Escherichia coli cells by electroporation. The primary library size was 2.2 × 10^7 recombinants. The lengths of the inserts were checked by PCR with RV-M and M13-47 primers (Takara, Otsu, Japan) using 16 clones; the average insert length was confirmed to be about 1700 bp. After overnight incubation on LB media, cloned inserts in randomly selected white colonies were subjected to rolling circle amplification and sequenced from the 5′ end with dye terminator chemistry using a MegaBACE4000 sequencer (GE Healthcare, Little Chalfont, UK) and T3 primers (Takara, Otsu, Japan). The cDNA library construction and sequencing were performed by the Dragon Genomics Center (Yokkaichi, Japan).

2.2. Analysis of EST sequences

To identify putative functions of the ESTs obtained, we applied the same analytical procedures used for the ESTs identified from Quercus mongolica var. crispula by Ueno et al. (2008). Briefly, trace2dbest, part of the PartiGene computer package (Parkinson et al., 2004), was used to clean up the raw traces, using the Phred error probability (Ewing and Green, 1998; Ewing et al., 1998), which was set at 0.05. We discarded sequences less than 150 bp long and did not subject them to any further analysis. The remaining sequences were grouped, using CLOBB (Parkinson et al., 2002), into clusters based on similarity. Sequences in the same cluster were then assembled into contigs by means of Phrap (Green, 1999) (using the default parameters in PartiGene). We assumed that all singletons (sequences that were not clustered with other sequences by CLOBB) and contigs (sequences produced by PartiGene) were potential unigenes; these were used for primary annotation via similarity searches. The Blastx (Altschul et al., 1990) software was used to interrogate the NCBI nr database with the e-value cutoff set to 1e-5. Blast was also used to interrogate the uniprot (uniprot_trembl and uniprot_sprot) databases (Apweiler et al., 2004) to determine a functional classification of the potential unigenes; in this case the e-value cutoff was set at 1e-25. Annot8r_blast2GO, a script developed by Schmid and Blaxter (http://www.nematodes.org/PartiGene/index.html), was used for the annotation. In addition, conserved protein domains were searched against pfam profiles (Finn et al., 2008) using the hmmer ver. 2.3.2 software package, which employs the Hidden Markov Model (HMM).
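The quality and length filters described above can be illustrated with a minimal sketch. It assumes the standard Phred relation Q = -10 log10(P) and treats the 150 bp cutoff as inclusive; the function names are hypothetical and are not part of trace2dbest or PartiGene.

```python
import math

def phred_from_error(p_error: float) -> float:
    """Convert a base-call error probability into a Phred quality score."""
    return -10.0 * math.log10(p_error)

def error_from_phred(q: float) -> float:
    """Convert a Phred quality score back into an error probability."""
    return 10.0 ** (-q / 10.0)

def keep_read(seq: str, min_len: int = 150) -> bool:
    """Keep a cleaned read only if it is at least min_len bases long
    (assumption: the 150 bp threshold is inclusive)."""
    return len(seq) >= min_len

# An error probability of 0.05 corresponds to a Phred score of about 13.
q_cutoff = phred_from_error(0.05)
```

Under this relation, the 0.05 threshold is a fairly permissive cutoff (about Q13), which fits single-pass EST reads.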

In order to identify transcripts related to cell wall formation, a similarity search was carried out, using the Blast algorithm against the MAIZEWALL database (Guillaumie et al., 2007), the Cell Wall Navigator database (Girke et al., 2004) and sequences listed by Raes et al. (2003); the e-value cutoff was set at 1e-25.

Characterization of the cDNA library was carried out by the method of Ewing et al. (1999) using a Populus EST resource (Sterky et al., 2004) as a reference. The reference Populus EST sequences were downloaded from the project web site (http://www.populus.db.umu.se/). Eighteen cDNA libraries (excluding the partially subtracted library Y) were used. Sequences from three additional Populus cDNA libraries (Leple et al., unpublished; Nanjo et al., 2003) were downloaded from dbEST. The two libraries of Leple et al. originated from severe and mild drought stress-treated mature leaves, while the library of Nanjo et al. (2003) was a mixture from various stress-treated leaves. C. sieboldii contigs with more than five ESTs were blasted against each Populus cDNA library. The number of hits with score > 200 was counted for each C. sieboldii contig. The Pearson correlation coefficient between the number of EST sequences and the number of blast hits for each contig was calculated. The Euclidean distance between two sets was computed as in Ewing et al. (1999).
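The two statistics named above can be sketched as follows for a pair of per-contig count vectors. This is only an illustration: the exact normalization used by Ewing et al. (1999) is not reproduced here, and converting counts to relative frequencies before taking the distance is an assumption. All function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def to_frequencies(counts):
    """Scale raw counts so that they sum to one (assumed normalization)."""
    total = sum(counts)
    return [c / total for c in counts]

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Toy data (hypothetical): ESTs per contig and blast hits in one library.
ests_per_contig = [39, 19, 13, 8, 6]
hits_in_library = [20, 11, 9, 2, 5]
r = pearson(ests_per_contig, hits_in_library)
d = euclidean(to_frequencies(ests_per_contig), to_frequencies(hits_in_library))
```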

The microsatellite search tool, SSRIT (Temnykh et al., 2001), from the USDA-ARS Center for Bioinformatics, was used to screen for microsatellite sequences, screening sequences with at least nine, six and five repeats for di-, tri- and tetra-SSRs, respectively. Mono-SSRs were present in our putative unigenes but were not screened here. ESTScan software (Iseli et al., 1999) was used to estimate the location of microsatellites within ESTs (either coding or non-coding regions). Randomization procedures were applied to assess whether the numbers of microsatellite-containing potential unigenes in specific GO categories were significantly overrepresented. Samples of 108 (the number of potential unigenes with microsatellites in all GO categories) were randomly selected from the 1263 GO-annotated potential unigenes 1000 times, and 95% confidence limits for the frequency of putative unigenes in each GO category were determined. This was achieved using Perl scripts written in-house.
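The repeat thresholds given above (at least nine, six and five repeats for di-, tri- and tetra-SSRs) can be illustrated with a short regular-expression sketch. This is not SSRIT and does not reproduce its exact behaviour; the function name and the handling of homopolymers and nested motifs are assumptions made for the example.

```python
import re

# Minimum repeat counts per motif length, as stated in the text.
MIN_REPEATS = {2: 9, 3: 6, 4: 5}

def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # Capture one motif, then require at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run (mono-SSR), not screened
            if unit_len == 4 and motif[:2] == motif[2:]:
                continue  # e.g. AGAG is a di-SSR, not a tetra-SSR
            hits.append((motif, len(m.group(0)) // unit_len, m.start()))
    return hits

example = find_ssrs("TT" + "AG" * 9 + "CC")
```

In this example the nine AG repeats starting at position 2 are reported, whereas eight repeats would fall below the di-SSR threshold.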

2.3. Development of EST-SSR markers

PCR primers were designed by Primer3 for di- and tri-SSRs (Rozen and Skaletsky, 2000); the read2Marker script (Fukuoka et al., 2005) was used to automate this process using its default settings. The utility of the EST-SSR primers designed in the present study was estimated by analyzing the polymorphisms among 10 individuals of C. sieboldii var. sieboldii, one C. sieboldii var. lutchuensis, four C. cuspidata var. cuspidata, and one C. cuspidata var. carlesii individual (Suppl. Tab. I 1). C. sieboldii var. sieboldii and C. cuspidata var. cuspidata were discriminated on the basis of the cell layers of their leaf epidermis following the method described by Kobayashi and Sugawa (1959); individuals of both of these species were sampled from different populations across their distribution ranges in Japan. C. sieboldii var. lutchuensis is found only in southern Japan (from Amami-oshima to Iriomote Island) (Yamazaki and Mashiba, 1987a) and we used a sample from Okinawa Main Island. C. cuspidata var. carlesii is found in Taiwan and Southern China and is a relative of C. cuspidata var. cuspidata (Delectis Florae Reipublicae Popularis Sinicae Agendae Academiae Sinicae Edita, 1998; Yamazaki and Mashiba, 1987b). After DNA was extracted from each leaf sample using a modified CTAB method (Murray and Thompson, 1980), PCR was carried out in 10 μL reaction mixtures containing ca. 10 ng genomic DNA, 1 × PCR buffer, 200 μM of each dNTP, 1.5 mM MgCl2, 0.2 μM of each primer designed in the present study and 0.4 U of Taq polymerase (Promega, Madison, USA), using the following PCR program: 94 °C for 3 min; then 40 cycles of 94 °C for 45 s, 55 °C for 45 s and 72 °C for 45 s, followed by a final extension at 72 °C for 10 min. PCR products were labeled with ChromaTide Rhodamine Green-5-dUTP (Molecular Probes, Eugene, USA) according to the method of Kondo et al. (2000), and analyzed using an ABI 3100 Genetic Analyzer (Applied Biosystems, Foster City, USA). The number of alleles (Na) and observed heterozygosity (Ho) (Nei and Kumar, 2000) were determined for each locus.
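For a single locus, Na and Ho can be computed from diploid genotypes as in the following minimal sketch. The data layout (a list of allele-size pairs, with None for a missing genotype) and the function name are assumptions made for illustration.

```python
def allele_count_and_ho(genotypes):
    """Return (Na, Ho) for one locus.

    genotypes: list of (allele1, allele2) tuples, or None where the
    genotype could not be scored.
    """
    scored = [g for g in genotypes if g is not None]
    alleles = {allele for g in scored for allele in g}
    n_heterozygous = sum(1 for a, b in scored if a != b)
    return len(alleles), n_heterozygous / len(scored)

# Hypothetical allele sizes (bp) for five individuals, one unscored.
locus = [(150, 152), (150, 150), (152, 156), (154, 154), None]
na, ho = allele_count_and_ho(locus)
```

Here four distinct alleles are observed and two of the four scored individuals are heterozygous, so Na = 4 and Ho = 0.5.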

In order to examine the transferability of the EST-SSR markers developed here to related species, we carried out cross-species PCR amplification for eight Fagaceae species (Fagus crenata, Lithocarpus glabra, Castanea crenata, Quercus glauca, Q. dentata, Q. serrata, Q. mongolica and Q. variabilis), which naturally occur in Japan. PCR was performed under the same reaction conditions as above. The PCR products were electrophoretically separated on 2% agarose gels and stained with ethidium bromide. The resulting banding pattern was recorded as a single main band (+), no amplification (–) or multiple-banding pattern (m).

The discriminant power of the EST-SSR markers for different species and varieties in Castanopsis was tested by using eight individuals each of C. sieboldii var. sieboldii, C. sieboldii var. lutchuensis, C. cuspidata var. cuspidata and C. cuspidata var. carlesii. These individuals were from the Ibaraki, Okinawa and Hiroshima (Suppl. Tab. I 1) and Taiwan (Aoki et al., unpublished) populations, respectively. PCR was carried out as described above except that fluorescent primers were used. PCR products were analyzed on an ABI 3100 Genetic Analyzer (Applied Biosystems, Foster City, USA). The proportion of shared alleles between individuals was computed by the MSA software (Dieringer and Schlötterer, 2003) and an NJ dendrogram was constructed with MEGA ver. 4.0.2 (Tamura et al., 2007).
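A distance based on the proportion of shared alleles between two diploid individuals can be sketched as below. The sketch assumes the common definition in which the distance is one minus the number of shared alleles summed over loci divided by twice the number of loci; whether MSA was run with this or another variant is not stated in the text, and the function name is hypothetical.

```python
from collections import Counter

def shared_allele_distance(ind_a, ind_b):
    """Distance = 1 - proportion of shared alleles over all loci.

    ind_a, ind_b: lists of diploid genotypes, one (a1, a2) tuple per
    locus, in the same locus order.
    """
    shared = 0
    for genotype_a, genotype_b in zip(ind_a, ind_b):
        # Counter intersection keeps the minimum count of each allele.
        shared += sum((Counter(genotype_a) & Counter(genotype_b)).values())
    return 1.0 - shared / (2 * len(ind_a))

# Hypothetical genotypes at two loci.
ind1 = [(100, 102), (200, 200)]
ind2 = [(100, 104), (200, 202)]
dist = shared_allele_distance(ind1, ind2)
```

A matrix of such pairwise distances is the kind of input from which a neighbor-joining dendrogram is built.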

Figure 1

Functional profile of the 1263 C. sieboldii putative unigenes and the 108 microsatellite-containing putative unigenes annotated according to GO slim terms (squares and circles, respectively). The error bars indicate 95% confidence limits for the frequencies of putative unigenes annotated with the specific GO slim term when 108 putative unigenes were randomly sampled 1000 times. Significantly highly represented GO slim terms amongst putative unigenes with microsatellites are indicated by the closed circle. The GO IDs and corresponding terms are as follows: GO:0005622, intracellular; GO:0005623, cell; GO:0005576, extracellular region; GO:0003824, catalytic activity; GO:0005488, binding; GO:0003676, nucleic acid binding; GO:0005198, structural molecule activity; GO:0005215, transporter activity; GO:0004871, signal transducer activity; GO:0030528, transcription regulator activity; GO:0030234, enzyme regulator activity; GO:0003774, motor activity; GO:0016209, antioxidant activity; GO:0008152, metabolic process; GO:0006139, nucleobase, nucleoside, nucleotide and nucleic acid metabolic process; GO:0006810, transport; GO:0050896, response to stimulus; GO:0007154, cell communication; GO:0006118, electron transport; GO:0009987, cellular process; GO:0006519, amino acid and derivative metabolic process; GO:0008219, cell death; GO:0007275, multicellular organismal development.
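The randomization behind the 95% confidence limits can be sketched as follows. This is an illustration only and not the in-house Perl script; it assumes sampling without replacement and simple percentile limits, and the count of 200 unigenes carrying the term is a made-up number.

```python
import random

def resampled_limits(has_term, sample_size, n_iter=1000, seed=0):
    """Return (lower, upper) 95% limits for the frequency of one GO term.

    has_term: list of booleans, one per GO-annotated unigene, marking
    whether the unigene carries the GO slim term of interest.
    """
    rng = random.Random(seed)
    freqs = sorted(
        sum(rng.sample(has_term, sample_size)) / sample_size
        for _ in range(n_iter)
    )
    lower = freqs[int(0.025 * n_iter)]
    upper = freqs[int(0.975 * n_iter) - 1]
    return lower, upper

# 1263 annotated unigenes, of which a hypothetical 200 carry the term.
population = [True] * 200 + [False] * 1063
low, high = resampled_limits(population, sample_size=108)
```

A term would be read as overrepresented among microsatellite-containing unigenes when its observed frequency in the 108 lies above the upper limit.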

3. RESULTS

3.1. cDNA sequencing and EST analysis

We obtained a total of 3638 sequences and identified 3354 high quality readings (DDBJ accession numbers DC600172 to DC603525) that were more than 150 bp long (average length, 552 bp). The high quality sequences were grouped into 423 clusters, which contained 1386 EST sequences. The remaining 1968 sequences were singletons. When sequences in the same cluster were assembled into contigs, 449 contigs were created. We identified a total of 2417 potential unigenes (contigs and singletons). The ratio of unique sequences in the library (= the number of unigenes/the number of high quality sequences) was 0.721. The average number of EST sequences per contig was 3.1, and the average size of the sequences in the contigs was 749 bp. The largest cluster included 39 ESTs, while the other clusters contained fewer than 30 ESTs (Suppl. Tab. II 1).
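The counts reported above are internally consistent, as the following short check shows. All numbers are taken from the text; only the variable names are introduced here.

```python
n_reads = 3354        # high quality reads
n_clustered = 1386    # ESTs placed in clusters
n_singletons = 1968   # ESTs left unclustered
n_contigs = 449       # contigs assembled from the clusters

# Potential unigenes are the contigs plus the singletons.
n_unigenes = n_contigs + n_singletons

# Ratio of unique sequences and mean number of ESTs per contig.
unique_ratio = n_unigenes / n_reads
ests_per_contig = n_clustered / n_contigs
```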

In total, 1856 potential unigenes had similarities with other proteins in the NCBI nr database at the e-value cutoff level of 1e-5. The remaining 561 potential unigenes, however, did not have any hits at this e-value cutoff. The gene ontology (GO) functional classification assigned 533, 1100 and 875 of the potential unigenes to the cellular component, molecular function and biological process categories, respectively. The GO terms represented by the largest numbers of potential unigenes within each of these categories were: intracellular (GO:0005622; 70%), catalytic activity (GO:0003824; 38%), binding (GO:0005488; 38%) and metabolic process (GO:0008152; 50%) (Fig. 1). At least one GO term was assigned to 1263 of the potential unigenes. HMM searches against the pfam database recognized 695 putative unigenes at the e-value cutoff level of 1e-10. Four hundred protein families were identified, with the ubiquitin family being the most common (Tab. I). The other common protein families included AP2 domain, RNA recognition motif, DnaJ domain, ubiquitin-conjugating enzyme and core histone H2A/H2B/H3/H4.

Table I

Abundant protein families (number of putative unigenes > 5).

Table II

Abundance of microsatellites in the C. sieboldii putative unigenes.

The Blastx search against the Cell Wall Navigator database identified 45 putative unigenes with a significant similarity at the e-value cutoff level of 1e-25; of these, 37 were singleton putative unigenes. In addition, the Tblastx search against MAIZEWALL database and the sequences listed by Raes et al. (2003) identified 170 and six putative unigenes, respectively. In total, 193 putative unigenes including 312 transcripts (9.3%) exhibited a similarity to genes for cell wall formation. Examples were callose synthase, cellulose synthase-like (Csl), trans-cinnamate 4-hydroxylase (C4H) and cinnamyl alcohol dehydrogenase 2 (CAD2).

Calculating the correlation coefficient between the number of EST sequences in a contig and the number of blast hits for that contig produced a 61 × 22 contig-by-library matrix, which was then used to calculate the Euclidean distances among cDNA libraries. The cDNA library in the present study was closest to the library of Leple et al. that originated from severe drought stress-treated mature leaves. The library from mild drought stress-treated mature leaves (Leple et al.) and the wood cell death (X) and senescing leaves (I) libraries of Sterky et al. (2004) were also closer than the other libraries (Suppl. Tab. III 1).

We detected 314 microsatellites in 2417 potential unigenes (Suppl. Tab. IV 1), with some potential unigenes containing up to three microsatellites. The numbers and relative proportions of di-, tri- and tetra-SSRs were 183 (58.3%), 125 (39.8%) and 6 (1.9%), respectively. The AG motif was the most common, while there were no CG repeats in the potential unigenes (Tab. II). Among the tri-SSRs (98% of which had fewer than 10 repeats), AAG and AGG repeats were relatively frequent; however, there were fewer tri-SSRs than di-SSRs. Each of the six tetra-SSRs had a different motif. The maximum number of repeats was 71 (of the AG motif). Estimation of the locations of the SSRs within the ESTs suggested that 176 (56%) of them reside in non-coding regions, and 91 (29%) of them are likely to be located in coding regions. The locations of the remaining 47 (15%) SSRs could not be determined because ESTScan returned no data for them; they were treated as non-coding in the present study. No di-SSRs were identified as coding, while 91 (72.8%) of the tri-SSRs appeared to be located in coding regions (Tab. II). GO annotations, which were assigned to a total of 108 microsatellite-containing putative unigenes, indicated that GO:0003676 (nucleic acid binding) was strongly overrepresented (22.5%; P < 0.01) even after the levels of significance were adjusted by Bonferroni criteria (Fig. 1).

3.2. Analysis of EST-SSR markers

In total, 39 and 24 primer pairs were designed for the di- and tri-SSRs, respectively, although some of the target sequences contained both di- and tri-SSRs. Primers were designed for all sequences with a "repeat (max)" value of 10 or more in the read2Marker software output. Twenty-nine sequences were in this category. An additional 34 sequences for primer design were selected arbitrarily from those with a "repeat (max)" value of more than 6 and less than 9. Thirty-seven of the 63 designed primer pairs amplified C. sieboldii var. sieboldii genomic DNA within the expected size range. Nine primer pairs produced larger fragments than expected, probably due to the occurrence of introns. These loci cannot be genotyped on capillary sequencers because of the large fragment size; they were excluded from further analysis. The remaining primer pairs gave no amplification or a multi-banding pattern. After fluorescent labeling and electrophoresis on the capillary sequencer, 16 primer pairs (Tab. III) showed a clear single locus amplification pattern for all 16 genomic DNA samples. Some primer pairs produced three peaks and/or peaks that were difficult to genotype; these are not listed in Table III. Four potential unigenes (CcC01513_1, CcC02069_1, CcC02291_1 and CcC02375_1) were predicted, by ESTScan, to have coding SSRs. Most of the markers listed in Tab. III were found within unigene sequences that had similarities with other proteins in the nr database. The number of alleles per locus (Na) and observed heterozygosity (Ho) for C. sieboldii var. sieboldii ranged from 3 to 9, and 0.2 to 0.9, respectively (Tab. IV). For C. cuspidata var. cuspidata, Na and Ho ranged from 2 to 6, and 0 to 1, respectively (Tab. IV).

Table III

Characteristics of the C. sieboldii EST-SSR markers.

Table IV

Polymorphisms of the EST-SSR markers, based on samples of 10 C. sieboldii var. sieboldii (CSS), four C. cuspidata var. cuspidata (CCC), one C. sieboldii var. lutchuensis (OKN) and one C. cuspidata var. carlesii (TWR). For OKN and TWR individuals, allele size is shown.

When these primers were applied to related Fagaceae species, three (19%) of them amplified all Fagaceae species (Tab. V). When applied to Castanea crenata, 14 (88%) of the primer pairs produced a single PCR product. However, for F. crenata, the transferability was low, with half of the primer pairs giving no amplification. All of the markers worked well for the two Castanopsis species and two varieties, and genotypes were clearly determined for all 32 individuals. The NJ dendrogram using the shared allele distance between individuals (Fig. 2) showed a clear split between C. sieboldii and C. cuspidata. However, discrimination of the varieties (C. sieboldii var. lutchuensis and C. cuspidata var. carlesii) from either C. sieboldii var. sieboldii or C. cuspidata var. cuspidata was not so clear.

4. DISCUSSION

4.1. Characteristics of putative unigenes

In the present study, a non-normalized cDNA library was constructed from RNA extracted from the inner bark of C. sieboldii var. sieboldii. The largest cluster obtained contained 39 transcripts (1.2% of the total), indicating the suitability of bark tissue for library construction without normalization. Bhalerao et al. (2003) reported that, with young leaf tissue as the RNA source, 14% of the total number of their clones represented the RbcS gene (a small subunit of Rubisco). Although the objectives and the methods used to construct putative unigenes differed among studies, the young leaf library of Bhalerao et al. (2003) had 1943 unique gene families from 4923 EST sequences, giving a ratio of unique sequences of 0.395. This ratio contrasts sharply with that of the present study (0.721).

Table V

Cross-species amplification of 16 EST-SSR primer pairs from C. sieboldii. Fc, Lg, Cc, Qg, Qd, Qs, Qm and Qv indicate Fagus crenata, Lithocarpus glabra, Castanea crenata, Quercus glauca, Q. dentata, Q. serrata, Q. mongolica and Q. variabilis, respectively. The plus (+) and minus (-) signs and ‘m’ indicate single locus amplification, no products and multiple banding pattern, respectively.

Figure 2

NJ dendrogram for individuals of C. sieboldii var. lutchuensis (L), C. sieboldii var. sieboldii (S), C. cuspidata var. carlesii (R) and C. cuspidata var. cuspidata (C) based on the proportion of shared alleles for 16 EST-SSR markers.

The non-normalized nature of the library was reflected in the sizes of the clusters, which indicated the relative levels of gene expression. The most abundant ESTs (Suppl. Tab. II 1) included an RD22-like protein (CcC00008_1; 39 ESTs), a metallothionein-like protein (CcC00064_1; 19 ESTs), and a translationally controlled tumor protein (CcC00353_1; 13 ESTs), whose expression is thought to be related to stress (Bommer and Thiele, 2004; Navabpour et al., 2003; Yamaguchi-Shinozaki and Shinozaki, 1993). These transcripts may reflect the environmental conditions where the tree grows and/or the active physiological state of the tree. When the library was compared with the reference and stress libraries of Populus, it was most closely related to the severe drought stress-treated library. Other cluster types were related to transcription and translation. The cluster CcC00231 had similarity to translation elongation factor 1α and had hits with ESTs in all the other libraries, probably due to the “housekeeping” role of the gene. Translationally controlled tumor proteins (corresponding to the cluster CcC00353) may also be involved in the elongation step of protein synthesis (Cans et al., 2003). The other highly represented clusters (CcC00021 and CcC00068) showed similarities with asparaginase. Rapidly growing tissues (e.g. apical meristems, expanding leaves, inflorescences and seeds) are known to display asparaginase activity (Grant and Bevan, 1994); the cambium (lateral meristem) of the inner bark is one such fast-growing, actively developing tissue. Thus, the detected ESTs that resemble asparaginase are probably related to the lateral growth of the stem, although low temperature stress may also induce asparaginase expression (Cho et al., 2007).
In EST analysis of wood-forming tissue of poplar (Sterky et al., 1998), putative genes for cyclophilin, a translationally controlled tumor protein, S-adenosyl-L-methionine synthase, the elongation factor 1-α and a 14-3-3-like protein were identified as the most highly expressed genes. Furthermore, analysis of Cryptomeria japonica inner bark ESTs (Ujino-Ihara et al., 2000) revealed that peptidyl-prolyl cis-trans isomerase, a protein translation factor SUI1 homologue, a metallothionein-like protein type 2 and the elongation factor 1-α were highly abundant transcripts. These transcripts found in poplar and C. japonica were also abundant in the material examined in the present study (Suppl. Tab. II 1).

Although the number of ESTs in the present study is limited, the HMM search identified the ubiquitin protein family as the most abundant (Tab. I). Pavy et al. (2005) reported that the most abundant protein families in various tissues of Picea glauca, based on EST analysis, were the kinase domain, cytochrome P450, protein tyrosine kinase and the ubiquitin family. The observed differences may be the result of the characteristics of the inner bark. When the result was limited to putative unigenes relating to cell wall formation, the most abundant protein family was ubiquitin, followed by ubiquitin-conjugating enzyme and core histone H2A/H2B/H3/H4, with 16, 9 and 9 potential unigenes, respectively. Thus, most of the putative unigenes in the ubiquitin protein family (16 out of 18) were related to cell wall formation. Ubiquitin is associated with protein degradation and cell death during the development of vascular tissue (Stephenson et al., 1996). The larger number of putative unigenes related to ubiquitin in the present library may relate to cell wall formation.

When our putative unigenes were compared with sequences relating to cell wall formation, 9.3% of the total number of transcripts exhibited similarities. The ratio in the present study was higher than or nearly the same as that reported for poplar (Sterky et al., 1998), C. japonica (Ujino-Ihara et al., 2000) and Pinus taeda (Allona et al., 1998), with ratios of 4%, 3% and ca. 10%, respectively. When we limited the MAIZEWALL database sequence to that listed in Table II of Sterky et al. (1998) and adopted the same threshold (score > 100), 7.9% of our transcripts were considered to be related to cell wall formation.

4.2. Microsatellite mining and EST-SSR markers

We found a total of 286 (11.8%) putative unigenes with microsatellite motifs. The percentage of microsatellite-containing ESTs is reported to vary substantially among species, ranging from 2.65% for Solanum tuberosum to 16.82% for Juglans regia, with an average of 6.0% among dicotyledonous species (Kumpatla and Mukhopadhyay, 2005), and from 1.5% for maize to 4.7% for rice among cereal species (monocotyledons) (Kantety et al., 2002). Thus, unsurprisingly the abundance of microsatellites in our EST library is within the range reported for dicotyledons (although it should be noted that the thresholds for the number of microsatellite repeats differed between the two cited studies). Moreover, the abundance of SSRs, in terms of the microsatellite motifs, was highest for AG and AAG amongst di- and tri-SSRs, respectively (Tab. II). This trend is also found in most of the dicotyledonous species (Kumpatla and Mukhopadhyay, 2005). In coding regions, tri-SSRs were more frequent, while we found no di-SSRs (Tab. II), probably because selection against frameshift mutations removes di-SSR repeats from coding regions (Metzgar et al., 2000).

When we focus on EST-SSRs in tree species, the percentage of microsatellite-containing ESTs is generally higher, and the AG motif more abundant, in angiosperm species than in gymnosperm species. For five Eucalyptus species, the average frequency of microsatellite-containing ESTs was 12.9%, with the AG motif being the most common (Yasodha et al., 2008). Quercus mongolica var. crispula had 248 microsatellite-containing potential unigenes (11.6%) among 2140 potential unigenes, with the AG motif being the most common (Ueno et al., 2008). C. sieboldii var. sieboldii in the present study (Tab. II) showed nearly the same trend as Q. mongolica. For Pinus taeda and P. pinaster, the percentages of SSRs in the putative unigenes were 1.2 and 2.1%, respectively (Chagné et al., 2004). The most common motifs were AT and AG for P. taeda and P. pinaster, respectively. In P. pinaster, the occurrence of the AT motif (47%) among di-SSRs was nearly the same as that of the AG motif (51%). Picea had only 183 contigs (0.93%) with at least one microsatellite in 20275 putative unigenes (Rungis et al., 2004), while Cryptomeria japonica had 163 (3.6%) unique di- and tri-SSRs in about 4500 cDNA clones (Moriguchi et al., 2003). For both Picea and C. japonica, the AT motif was the most frequent.

Functional analysis of the putative unigenes and EST-SSRs based on GO slim terms (Fig. 1) revealed that the EST-SSRs were significantly more frequent than would be expected by chance in nucleic acid-binding proteins (GO:0003676). Thirty-one putative unigenes with microsatellites were in this category; of these, 11 and 20 were estimated to have microsatellites located in the coding and non-coding regions, respectively. More than half of the microsatellite-containing putative unigenes in the GO:0003676 category had similarities to transcription and translation-related proteins. Complex regulation and machinery, including many protein-protein and/or protein-nucleic acid interactions, are needed in such processes, and microsatellites or single amino acid repeats may be important. Previous analyses of Arabidopsis thaliana and Oryza sativa proteins have shown that amino acid repeats are also overrepresented in their transcription factors (Zhang et al., 2006). The suggested functions of the remaining microsatellite-containing putative unigenes included histone, DNA repair, and ribosomal proteins. Analysis of inner bark ESTs of Quercus mongolica (Ueno et al., 2008) produced similar results, with the GO:0003676 category being overrepresented.

In the present study, we developed 16 EST-SSR markers for C. sieboldii var. sieboldii, 12 of which have at least one di-SSR motif. The number of alleles per locus (Na) and observed heterozygosity (Ho) for these di-SSR markers ranged from 3 to 9 and from 0.2 to 0.9, respectively, in the C. sieboldii var. sieboldii individuals we examined (Tab. IV). Ueno et al. (2000, 2003) surveyed the polymorphisms of 13 anonymous genomic di-SSR markers in 32 and 24 C. sieboldii var. sieboldii individuals, and found between 6 and 23 alleles per locus, with observed heterozygosity values ranging from 0.50 to 0.97. When allelic richness (Petit et al., 1998) is calculated for 10 diploid individuals using the genomic di-SSR markers in the cited studies, it ranges from 4.6 to 12.5. Thus, even after standardizing the sample size, the levels of variability still appear higher for anonymous genomic di-SSR markers than for EST-SSR markers with a di-SSR motif. This trend is to be expected, given that ESTs are more conserved than anonymous genomic regions. However, in the systematic comparison made by Pashley et al. (2006), there were no significant differences between the levels of variability in transferable EST-SSR and genomic SSR (gSSR) markers. Moreover, the quality of the electropherograms for the EST-SSR markers was higher than that of the anonymous genomic SSRs (Pashley et al., 2006). In the present study, all of the EST-SSR markers gave high-quality electropherograms for samples from both species and both varieties. Similarly, 20 out of 44 candidate EST-SSR markers are reported to have been amplified successfully in samples from 23 Picea species (Rungis et al., 2004). Single nucleotide polymorphisms (SNPs) are frequently found within ESTs. However, analysis of SNP markers in a large population is still costly, time-consuming and specific to a focal species, which limits its practical use. EST-SSR markers are, in contrast, attractive tools for population genetic surveys, with wide scope for application to related species.
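
The three summary statistics used in this comparison (Na, Ho and allelic richness standardized to 10 diploid individuals) can be illustrated with a short sketch. It assumes the usual rarefaction formula for allelic richness, in which the expected number of alleles in a subsample of n gene copies is the sum over alleles of 1 - C(N - Ni, n)/C(N, n), where N is the total number of gene copies and Ni the count of allele i. The function names and the genotypes below are hypothetical and are not data from this study.

```python
from collections import Counter
from math import comb


def locus_stats(genotypes):
    """Return (Na, Ho) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    na = len(set(alleles))  # number of distinct alleles
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)  # heterozygotes
    return na, ho


def allelic_richness(genotypes, n_individuals=10):
    """Expected number of alleles in a subsample of `n_individuals` diploids,
    obtained by rarefaction over gene copies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # N gene copies actually sampled
    n = 2 * n_individuals         # gene copies in the standardized subsample
    if n > total:
        raise ValueError("subsample is larger than the observed sample")
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts.values())


# Hypothetical allele sizes (bp) for four individuals at one locus.
example = [(100, 102), (100, 100), (102, 104), (100, 104)]
print(locus_stats(example))
print(allelic_richness(example, n_individuals=2))
```

Because rarefaction scales every sample down to the same number of gene copies, values computed from surveys of 32 or 24 individuals become comparable with those from the 10 individuals screened here.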

Cross-species PCR amplification across the Fagaceae family in the present study (Tab. V) showed that three markers (CcC02014, CcC02022 and CcC02069) amplified in all species tested. To identify common features associated with transferability, we focused on the presence/absence of PCR products and the primer locations on the putative unigenes. When both primers were located in the estimated coding region, they produced PCR products for all species tested. These markers were CcC01513, CcC02069, CcC02291 and CcC02375. Primer position may therefore be an important factor in increasing cross-species transferability, as in the case of Helianthus (Pashley et al., 2007). Cross-species transferability also depends on the phylogenetic distance between species (Chagné et al., 2004), and this was confirmed in the present study. Fagus crenata is the most distant species from C. sieboldii var. sieboldii, while Castanea crenata is the closest (Manos and Stanford, 2001). Transferability was lowest for F. crenata and highest for Castanea crenata.

Within the genus Castanopsis, all of the markers in Table III gave clear electropherograms for all individuals of the two species and two varieties. The discrimination of populations between C. sieboldii var. sieboldii and C. cuspidata var. cuspidata was clear (Fig. 2). Moreover, we have confirmed the usefulness of the markers in the present study through population analysis of C. sieboldii var. sieboldii, C. sieboldii var. lutchuensis, C. cuspidata var. cuspidata and C. cuspidata var. carlesii (Aoki et al., unpublished). However, caution is required when EST-SSR markers that amplify a putative single homologue in C. sieboldii var. sieboldii are applied to related species. Without sequencing, confirming that allele sizes differ by the expected multiples of the repeat unit may be an alternative method to validate homologous amplification. Ten of the 16 EST-SSRs in Table IV had alleles with the expected difference in the repeat unit. The remaining loci have probably experienced insertions or deletions in the vicinity of the microsatellites. We believe that the PCR primers designed in the present study will be of considerable value in future research into variations within C. sieboldii and related species.
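
The allele-size check mentioned above can be expressed in a few lines. The sketch below assumes allele sizes recorded as whole base pairs and a known repeat-unit length (2 for a di-SSR, 3 for a tri-SSR); the function name and the example sizes are hypothetical and do not come from Table IV.

```python
def sizes_fit_repeat_unit(allele_sizes, unit_length):
    """True if every allele differs from the smallest allele by a whole
    number of repeat units, as expected for simple repeat-number variation."""
    base = min(allele_sizes)
    return all((size - base) % unit_length == 0 for size in allele_sizes)


# Hypothetical di-SSR locus: all alleles are 2-bp steps apart.
print(sizes_fit_repeat_unit([150, 152, 156], unit_length=2))
# Hypothetical di-SSR locus with a 3-bp step, suggesting an indel near the repeat.
print(sizes_fit_repeat_unit([150, 153, 156], unit_length=2))
```

A locus that fails this check is not necessarily non-homologous; as noted above, insertions or deletions in the flanking region can shift allele sizes off the repeat-unit ladder.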

Acknowledgments

The authors are grateful to H. Yoshimaru, Y. Tsuda, R. Kusano, K. Kimura, T. Kamijo, H. Setoguchi and K. Oono for collecting samples, to Y. Taguchi and Y. Komatsu for laboratory work and to T. Ujino-Ihara for valuable discussions. This research was supported by a grant for Research on Genetic Guidelines for Restoration Programs using Genetic Diversity Information from the Ministry of Environment, Japan.


Supplementary material is available at www.afs-journal.org.

References

  1. Allona I., Quinn M., Shoop E., Swope K., St Cyr S., Carlis J., Riedl J., Retzel E., Campbell M.M., Sederoff R., and Whetten R.W., 1998. Analysis of xylem formation in pine by cDNA sequencing. Proc. Natl. Acad. Sci. USA 95: 9693–9698 [PubMed] [CrossRef].
  2. Altschul S.F., Gish W., Miller W., Myers E.W., and Lipman D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410 [PubMed] [CrossRef].
  3. Apweiler R., Bairoch A., Wu C.H., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Natale D.A., O'Donovan C., Redaschi N., and Yeh L.S., 2004. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 32: D115–D119 [PubMed] [CrossRef].
  4. Bhalerao R., Keskitalo J., Sterky F., Erlandsson R., Bjorkbacka H., Birve S.J., Karlsson J., Gardestrom P., Gustafsson P., Lundeberg J., and Jansson S., 2003. Gene expression in autumn leaves. Plant Physiol. 131: 430–442 [PubMed] [CrossRef].
  5. Bommer U.A. and Thiele B.J., 2004. The translationally controlled tumour protein (TCTP). Int. J. Biochem. Cell Biol. 36: 379–385 [PubMed] [CrossRef].
  6. Cans C., Passer B.J., Shalak V., Nancy-Portebois V., Crible V., Amzallag N., Allanic D., Tufino R., Argentini M., Moras D., Fiucci G., Goud B., Mirande M., Amson R., and Telerman A., 2003. Translationally controlled tumor protein acts as a guanine nucleotide dissociation inhibitor on the translation elongation factor eEF1A. Proc. Natl. Acad. Sci. USA 100: 13892–13897 [PubMed] [CrossRef].
  7. Chagne D., Chaumeil P., Ramboer A., Collada C., Guevara A., Cervera M.T., Vendramin G.G., Garcia V., Frigerio J.M., Echt C., Richardson T., and Plomion C., 2004. Cross-species transferability and mapping of genomic and cDNA SSRs in pines. Theor. Appl. Genet. 109: 1204–1214 [PubMed] [CrossRef].
  8. Chang S., Puryear J., and Cairney J., 1993. A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Rep. 11: 113–116 [CrossRef].
  9. Cho C., Lee H., Chung E., Kim K., Heo J., Kim J., Chung J., Ma Y., Fukui K., Lee D., Kim D., Chung Y., and Lee J., 2007. Molecular characterization of the soybean L-asparaginase gene induced by low temperature stress. Mol. Cells 23: 280–286 [PubMed].
  10. Delectis florae reipublicae popularis sinicae agendae academiae sinicae edita, 1998. Flora : reipublicae popularis sinicae (in Chinese) Science Press, Beijing, Vol. 22, 66–67.
  11. Dieringer D. and Schlotterer C., 2003. Microsatellite analyzer (MSA): a platform independent analysis tool for large microsatellite data sets. Mol. Ecol. Notes 3: 167–169 [CrossRef].
  12. Eveno E., Collada C., Guevara M.A., Leger V., Soto A., Diaz L., Leger P., Gonzalez-Martinez S.C., Cervera M.T., Plomion C., and Garnier-Gere P.H., 2008. Contrasting patterns of selection at Pinus pinaster Ait. drought stress candidate genes as revealed by genetic differentiation analyses. Mol. Biol. Evol. 25: 417–437.
  13. Ewing B. and Green P., 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8: 186–194.
  14. Ewing B., Hillier L., Wendl M.C., and Green P., 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8: 175–185.
  15. Ewing R.M., Ben Kahla A., Poirot O., Lopez F., Audic S., and Claverie J.M., 1999. Large-scale statistical analyses of rice ESTs reveal correlated patterns of gene expression. Genome Res. 9: 950–959.
  16. Finn R.D., Tate J., Mistry J., Coggill P.C., Sammut S.J., Hotz H.R., Ceric G., Forslund K., Eddy S.R., Sonnhammer E.L., and Bateman A., 2008. The Pfam protein families database. Nucleic Acids Res. 36: D281–D288 [PubMed] [CrossRef].
  17. Fukuoka H., Nunome T., Minamiyama Y., Kono I., Namiki N., and Kojima A., 2005. Read2Marker: a data processing tool for microsatellite marker development from a large data set. Biotechniques 39: 472, 474, 476 [PubMed] [CrossRef].
  18. Girke T., Lauricha J., Tran H., Keegstra K., and Raikhel N., 2004. The cell wall navigator database. A systems-based approach to organism-unrestricted mining of protein families involved in cell wall metabolism. Plant Physiol. 136: 3003–3008; discussion 3001.
  19. Grant M. and Bevan M.W., 1994. Asparaginase gene expression is regulated in a complex spatial and temporal pattern in nitrogen-sink tissues. Plant J. 5: 695–704 [CrossRef].
  20. Green P., 1999. Documentation for phrap and cross_match [online]. Available from http://bozeman.mbt.washington.edu/phrap.docs/phrap.html [accessed 7 March 2007].
  21. Guillaumie S., San-Clemente H., Deswarte C., Martinez Y., Lapierre C., Murigneux A., Barriere Y., Pichon M., and Goffner D., 2007. MAIZEWALL. Database and developmental gene expression profiling of cell wall biosynthesis and assembly in maize. Plant Physiol. 143: 339–363.
  22. Iseli C., Jongeneel C.V., and Bucher P., 1999. ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol. 138–148.
  23. Kado T., Yoshimaru H., Tsumura Y., and Tachida H., 2003. DNA variation in a conifer, Cryptomeria japonica (Cupressaceae sensu lato). Genetics 164: 1547–1559 [PubMed].
  24. Kane N.C. and Rieseberg L.H., 2007. Selective sweeps reveal candidate genes for adaptation to drought and salt tolerance in common sunflower, Helianthus annuus. Genetics 175: 1823–1834 [PubMed] [CrossRef].
  25. Kantety R.V., La Rota M., Matthews D.E., and Sorrells M.E., 2002. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol. Biol. 48: 501–510 [PubMed] [CrossRef].
  26. Kobayashi S. and Hiroki S., 2003. Patterns of occurrence of hybrids of Castanopsis cuspidata and C. sieboldii in the IBP Minamata Special Research Area, Kumamoto Prefecture, Japan. J. Phytogeogr. Taxon. 51: 63–67.
  27. Kobayashi Y. and Sugawa T., 1959. Identification of wood of some Castanopsis species in Japan (in Japanese with English abstract). Bull. Gov. For. Exp. Stn. 118: 139–178.
  28. Kondo H., Tahira T., Hayashi H., Oshima K., and Hayashi K., 2000. Microsatellite genotyping of post-PCR fluorescently labeled markers. Biotechniques 29: 868–872 [PubMed].
  29. Kumpatla S.P. and Mukhopadhyay S., 2005. Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species. Genome 48: 985–998 [PubMed] [CrossRef].
  30. Manos P.S. and Stanford A.M., 2001. The historical biogeography of Fagaceae: Tracking the tertiary history of temperate and subtropical forests of the Northern Hemisphere. Int. J. Plant Sci. 162: S77–S93 [CrossRef].
  31. Metzgar D., Bytof J., and Wills C., 2000. Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res. 10: 72–80 [PubMed].
  32. Moriguchi Y., Iwata H., Ujino-Ihara T., Yoshimura K., Taira H., and Tsumura Y., 2003. Development and characterization of microsatellite markers for Cryptomeria japonica D. Don. Theor. Appl. Genet. 106: 751–758.
  33. Murray M.G. and Thompson W.F., 1980. Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 8: 4321–4325 [PubMed] [CrossRef].
  34. Nanjo T., Futamura N., Nishiguchi M., Igasaki T., Shinozaki K., and Shinohara K., 2004. Characterization of full-length enriched expressed sequence tags of stress-treated poplar leaves. Plant Cell Physiol. 45: 1738–1748 [PubMed] [CrossRef].
  35. Navabpour S., Morris K., Allen R., Harrison E., S A.H.-M., and Buchanan-Wollaston V., 2003. Expression of senescence-enhanced genes in response to oxidative stress. J. Exp. Bot. 54: 2285–2292 [PubMed] [CrossRef].
  36. Nei M. and Kumar S., 2000. Molecular evolution and phylogenetics, Oxford University Press, New York, 333 p.
  37. Parkinson J., Anthony A., Wasmuth J., Schmid R., Hedley A., and Blaxter M., 2004. PartiGene–constructing partial genomes. Bioinformatics 20: 1398–1404 [PubMed] [CrossRef].
  38. Parkinson J., Guiliano D.B., and Blaxter M., 2002. Making sense of EST sequences by CLOBBing them. BMC Bioinformatics 3: 31 [PubMed] [CrossRef].
  39. Pashley C.H., Ellis J.R., McCauley D.E., and Burke J.M., 2006. EST databases as a source for molecular markers: lessons from Helianthus. J. Hered. 97: 381–388 [PubMed] [CrossRef].
  40. Pavy N., Paule C., Parsons L., Crow J.A., Morency M.J., Cooke J., Johnson J.E., Noumen E., Guillet-Claude C., Butterfield Y., Barber S., Yang G., Liu J., Stott J., Kirkpatrick R., Siddiqui A., Holt R., Marra M., Seguin A., Retzel E., Bousquet J., and MacKay J., 2005. Generation, annotation, analysis and database integration of 16,500 white spruce EST clusters. BMC Genomics 6: 144 [PubMed] [CrossRef].
  41. Petit R.J., El Mousadik A., and Pons O., 1998. Identifying populations for conservation on the basis of genetic markers. Conserv. Biol. 12: 844–855 [CrossRef].
  42. Raes J., Rohde A., Christensen J.H., Van de Peer Y., and Boerjan W., 2003. Genome-wide characterization of the lignification toolbox in Arabidopsis. Plant Physiol. 133: 1051–1071 [PubMed] [CrossRef].
  43. Rozen S. and Skaletsky H.J., 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S.A. and Misener S. (Eds.), Bioinformatics methods and protocols: Methods in molecular biology, Humana Press, Totowa, pp. 365–386.
  44. Rungis D., Berube Y., Zhang J., Ralph S., Ritland C.E., Ellis B.E., Douglas C., Bohlmann J., and Ritland K., 2004. Robust simple sequence repeat markers for spruce (Picea spp.) from expressed sequence tags. Theor. Appl. Genet. 109: 1283–1294 [PubMed] [CrossRef].
  45. Stephenson P., Collins B.A., Reid P.D., and Rubinstein B., 1996. Localization of ubiquitin to differentiating vascular tissues. Am. J. Bot. 83: 140–147 [CrossRef].
  46. Sterky F., Regan S., Karlsson J., Hertzberg M., Rohde A., Holmberg A., Amini B., Bhalerao R., Larsson M., Villarroel R., Van Montagu M., Sandberg G., Olsson O., Teeri T.T., Boerjan W., Gustafsson P., Uhlen M., Sundberg B., and Lundeberg J., 1998. Gene discovery in the wood-forming tissues of poplar: analysis of 5,692 expressed sequence tags. Proc. Natl. Acad. Sci. USA 95: 13330–13335 [PubMed] [CrossRef].
  47. Sterky F., Bhalerao R.R., Unneberg P., Segerman B., Nilsson P., Brunner A.M., Charbonnel-Campaa L., Lindvall J.J., Tandre K., Strauss S.H., Sundberg B., Gustafsson P., Uhlen M., Bhalerao R.P., Nilsson O., Sandberg G., Karlsson J., Lundeberg J., and Jansson S., 2004. A Populus EST resource for plant functional genomics. Proc. Natl. Acad. Sci. USA 101: 13951–13956 [PubMed] [CrossRef].
  48. Tamura K., Dudley J., Nei M., and Kumar S., 2007. MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24: 1596–1599 [PubMed] [CrossRef].
  49. Temnykh S., DeClerck G., Lukashova A., Lipovich L., Cartinhour S., and McCouch S., 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11: 1441–1452 [PubMed] [CrossRef].
  50. Tsumura Y., Kado T., Takahashi T., Tani N., Ujino-Ihara T., and Iwata H., 2007. Genome scan to detect genetic structure and adaptive genes of natural populations of Cryptomeria japonica. Genetics 176: 2393–2403 [PubMed] [CrossRef].
  51. Ueno S., Taguchi Y., and Tsumura Y., 2008. Microsatellite markers derived from Quercus mongolica var. crispula (Fagaceae) inner bark expressed sequence tags. Genes Genet. Syst. 83: 179–187 [PubMed] [CrossRef].
  52. Ueno S., Yoshimaru H., Kawahara T., and Yamamoto S., 2000. Isolation of microsatellite markers in Castanopsis cuspidata var. sieboldii Nakai from an enriched library. Mol. Ecol. 9: 1188–1190 [PubMed].
  53. Ueno S., Yoshimaru H., Kawahara T., and Yamamoto S., 2003. A further six microsatellite markers for Castanopsis cuspidata var. sieboldii Nakai. Conserv. Genet. 4: 813–815 [CrossRef].
  54. Ujino-Ihara T., Yoshimura K., Ugawa Y., Yoshimaru H., Nagasaka K., and Tsumura Y., 2000. Expression analysis of ESTs derived from the inner bark of Cryptomeria japonica. Plant Mol. Biol. 43: 451–457 [PubMed] [CrossRef].
  55. Yamada H. and Miyaura T., 2003. Geographic occurrence of intermediate type between Castanopsis sieboldii and C. cuspidata (Fagaceae) based on the structure of leaf epidermis. J. Plant Res. 116: 477–482 [PubMed] [CrossRef].
  56. Yamaguchi-Shinozaki K. and Shinozaki K., 1993. The plant hormone abscisic acid mediates the drought-induced expression but not the seed-specific expression of rd22, a gene responsive to dehydration stress in Arabidopsis thaliana. Mol. Gen. Genet. 238: 17–25 [PubMed].
  57. Yamanaka T., 1966. Problems of Castanopsis cuspidata Schottky (in Japanese with English abstract). Bull. Fac. Educ., Kochi Univ. 18: 65–73.
  58. Yamazaki T. and Mashiba S., 1987a. A taxonomical revision of Castanopsis cuspidata (Thunb.) Schottky and the allies in Japan, Korea and Taiwan (1). J. Jap. Bot. 62: 289–298.
  59. Yamazaki T. and Mashiba S., 1987b. A taxonomical revision of Castanopsis cuspidata (Thunb.) Schottky and the allies in Japan, Korea and Taiwan (2). J. Jap. Bot. 62: 332–339.
  60. Yasodha R., Sumathi R., Chezhian P., Kavitha S., and Ghosh M., 2008. Eucalyptus microsatellites mined in silico: survey and evaluation. J. Genet. 87: 21–25 [PubMed] [CrossRef].
  61. Zhang L., Yu S., Cao Y., Wang J., Zuo K., Qin J., and Tang K., 2006. Distributional gradient of amino acid repeats in plant proteins. Genome 49: 900–905 [PubMed] [CrossRef].

Online material

Supplementary Table I

Sampling sites for polymorphism screening.

Supplementary Table II

The characteristics of abundant clusters (cluster size > 5) in the C. sieboldii inner bark cDNA library.

Supplementary Table III

Distance matrix among cDNA libraries. CcC: C. sieboldii in the present study; Leple M: mild drought stress-treated Populus tremula × Populus alba leaves; Leple S: severe drought stress-treated Populus tremula × Populus alba leaves; Nanjo: mixture of several stress-treated Populus nigra leaves. Descriptions of the other libraries can be found at http://www.populus.db.umu.se/.

Supplementary Table IV

SSRs found in potential unigenes by SSRIT.pl script and estimation of SSR location.

All Tables

Table I

Abundant protein families (number of putative unigenes > 5).

Table II

Abundance of microsatellites in the C. sieboldii putative unigenes.

Table III

Characteristics of the C. sieboldii EST-SSR markers.

Table IV

Polymorphisms of the EST-SSR markers, based on samples of 10 C. sieboldii var. sieboldii (CSS), four C. cuspidata var. cuspidata (CCC), one C. sieboldii var. lutchuensis (OKN) and one C. cuspidata var. carlesii (TWR). For OKN and TWR individuals, allele size is shown.

Table V

Cross-species amplification of 16 EST-SSR primer pairs from C. sieboldii. Fc, Lg, Cc, Qg, Qd, Qs, Qm and Qv indicate Fagus crenata, Lithocarpus glabra, Castanea crenata, Quercus glauca, Q. dentata, Q. serrata, Q. mongolica and Q. variabilis, respectively. The plus (+) and minus (-) signs and ‘m’ indicate single-locus amplification, no product and a multiple banding pattern, respectively.

All Figures

Figure 1

Functional profile of the 1263 C. sieboldii putative unigenes and the 108 microsatellite-containing putative unigenes annotated according to GO slim terms (squares and circles, respectively). The error bars indicate 95% confidence limits for the frequencies of putative unigenes annotated with the specific GO slim term when 108 putative unigenes were randomly sampled 1000 times. Significantly highly represented GO slim terms amongst putative unigenes with microsatellites are indicated by the closed circle. The GO IDs and corresponding terms are as follows: GO:0005622, intracellular; GO:0005623, cell; GO:0005576, extracellular region; GO:0003824, catalytic activity; GO:0005488, binding; GO:0003676, nucleic acid binding; GO:0005198, structural molecule activity; GO:0005215, transporter activity; GO:0004871, signal transducer activity; GO:0030528, transcription regulator activity; GO:0030234, enzyme regulator activity; GO:0003774, motor activity; GO:0016209, antioxidant activity; GO:0008152, metabolic process; GO:0006139, nucleobase, nucleoside, nucleotide and nucleic acid metabolic process; GO:0006810, transport; GO:0050896, response to stimulus; GO:0007154, cell communication; GO:0006118, electron transport; GO:0009987, cellular process; GO:0006519, amino acid and derivative metabolic process; GO:0008219, cell death; GO:0007275, multicellular organismal development.

Figure 2

NJ dendrogram for individuals of C. sieboldii var. lutchuensis (L), C. sieboldii var. sieboldii (S), C. cuspidata var. carlesii (R) and C. cuspidata var. cuspidata (C) based on the proportion of shared alleles for 16 EST-SSR markers.
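
The distance underlying such a dendrogram can be sketched as follows, assuming one common definition of the proportion of shared alleles between two diploid individuals: at each locus, the number of alleles the two genotypes have in common divided by two, averaged over loci, with the distance taken as one minus that proportion. The function names and genotypes are hypothetical; the neighbour-joining step itself is not shown.

```python
from collections import Counter


def shared_allele_distance(ind_a, ind_b):
    """1 - proportion of shared alleles between two diploid individuals.

    ind_a, ind_b: lists of (allele_a, allele_b) tuples, one per locus,
    given in the same locus order.
    """
    if len(ind_a) != len(ind_b):
        raise ValueError("individuals must be typed at the same loci")
    shared = 0.0
    for genotype_a, genotype_b in zip(ind_a, ind_b):
        # Counter intersection keeps the minimum count of each allele.
        overlap = Counter(genotype_a) & Counter(genotype_b)
        shared += sum(overlap.values()) / 2
    return 1 - shared / len(ind_a)


def distance_matrix(individuals):
    """Pairwise distances for a dict mapping individual name -> genotypes."""
    names = sorted(individuals)
    return {(a, b): shared_allele_distance(individuals[a], individuals[b])
            for a in names for b in names}


# Hypothetical two-locus genotypes (allele sizes in bp).
samples = {
    "S1": [(100, 102), (200, 200)],
    "S2": [(100, 104), (200, 202)],
    "C1": [(110, 112), (210, 212)],
}
print(distance_matrix(samples)[("S1", "S2")])
```

A matrix of this kind could then be passed to any neighbour-joining implementation to obtain a tree comparable in form to Figure 2.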
