Abstract
Chamaecyparis taiwanensis is a tree endemic to Taiwan that suffers from illegal logging because of its high economic value. The lack of direct evidence linking stumps to timber remains a hurdle for law enforcement. In this report, 23 polymorphic genomic simple sequence repeat (gSSR) and 12 expressed sequence tag (EST)-SSR markers were developed and their transferability was assessed. The individual identification system built from 30 selected non-linked SSR markers has a combined probability of identity of 5.596 × 10^−12, which is equivalent to identifying an individual in a population of up to 18 million C. taiwanensis with a 99.99% confidence level. We also applied the system in an actual criminal case, selecting 19 of these markers to link illegally felled timbers to victim trees. Our data demonstrate that the molecular profiles of three timbers matched three victim trees with a confidence level above 99.99%. This is the first example of SSR markers being successfully applied in C. taiwanensis as court evidence for law enforcement. The identification system adopts advanced molecular technology and shows great potential for natural resource management of C. taiwanensis.
Introduction
Chamaecyparis taiwanensis Masam. & Suzuki [= Chamaecyparis obtusa (Sieb. & Zucc.) Endl. var. formosana (Hayata) Hayata] (Cupressaceae) is a gymnosperm endemic to Taiwan. It is the dominant species in the mixed conifer and broadleaf forest of the middle-altitude region (1700 m to 2600 m) of Taiwan island1. The lowest-latitude boundary of the natural distribution of cypress falls in Taiwan, which gives the species great biogeographic significance2. As an indispensable resource for fine buildings, furniture and handicrafts, it plays a vital role as a wood source for the timber industry. C. taiwanensis is well known for its wood quality and high price (4400 USD/m³) (woodprice.forest.gov.tw), which has led to persistent illegal felling. Developing an individual identification system for C. taiwanensis is therefore of great importance3.
Illegal felling remains a persistent problem in timber-producing countries all over the world. For decades, illegal logging has endangered precious and valuable tree species such as cypress4, ash5, mahogany6, and Brazilian rosewood7. In some cases, law enforcement authorities, such as the forestry police, arrest the suspects in time. However, the lack of direct scientific evidence correlating timbers with stumps makes conviction difficult and ineffective. Individual identification is thus critically needed by the forestry sector.
The problem of illegal logging has been paid attention since 1995. More and more national and international regulations mandate tracking systems that ensure traceability on wood market8,9,10. Wood anatomy and dendrochronology are common visual identification method. The former is based on the anatomical characteristics to identify the wood, and can usually be identified to the genus11; the latter is often used to illustrate past climates, but may also provide the age and origin of the trees12. Compounds synthesized by trees and other plants are often called phytochemicals and are often used to identify species or distinguish genera. Intraspecific variation can also be detected in some species through some chemical analysis such as mass spectrometry12,13, near infrared spectroscopy14, detector dogs15, stable isotopes16, and radiocarbon17. Genetic analysis can provide species-level identification, which is usually achieved by DNA sequence polymorphism18. Simple sequence repeats (SSRs) and Single nucleotide polymorphisms (SNPs) can be used to identify individuals and can be used in population genetics or systematic geography to determine the geographical region of origin within a species19. DNA fingerprinting is built into each organism itself and cannot be forged20. When enough markers are developed, in principle every individual has its own unique DNA fingerprint. DNA fingerprinting has the potential to track wood products independently within complex global supply network21. Theoretically, DNA fingerprinting is the only forensic wood identification technology that could be used to connect seized timber to illegally felled stumps8.
SSR is the most common marker used in individual identification for its short length, high polymorphism, easy polymerase chain reaction (PCR) amplification, high reproducibility, and high sensitivity20,22,23. SSRs are divided into two broad categories by different sources: Genomic (g)-SSR and expressed sequence tag (EST)-SSR24. gSSR markers are derived from amplified genomic libraries. EST-SSRs are markers mined from EST sequence collections. gSSR markers have been reported to be more polymorphic when compared with EST-SSR in gymnosperms4 and crops25,26 because of a more diversified nucleotide sequence. Since the development of high-throughput sequencing technology, the marker development technique has been continuously advanced. Wang et al., 2018 published the first report on gSSR developed by De novo genome sequencing27. In contrast, EST-SSR, derived from the expressed sequence, is fast-acting, cost-effective and labor-saving alternative for non-model organisms24. Because of the conservative nature in gene coding regions24, newly developed EST-SSRs usually can be transferred in closely related species for marker development. The first EST-SSR based on Illumina-based de novo transcriptome was also published by Zhou et al. in 201828. A study to develop both markers would avail of their merits and functions simultaneously.
For C. taiwanensis, evaluation of genetic variation and population structure is necessary for its preservation2,29 because this species is used extensively. After the mid-twentieth century, the number of C. taiwanensis plunged, which also significantly reduced genetic variation and affected population structure. As an important tool for genetics and subsequent breeding, SSR markers are helpful for selecting polymorphic maternal plants and increasing the diversity of progeny. The objective of this study is to establish a scientifically valid SSR-based individual identification system for C. taiwanensis in order to provide court evidence linking seized wood to victim trees, and to provide proof of traceability for the wood supply network. At the beginning of the research, we used next-generation sequencing (NGS) technology to establish DNA and RNA libraries of C. taiwanensis to accelerate the development of gSSR and EST-SSR markers. A total of 96 samples from four populations were used to evaluate the polymorphism, discriminative power, and random match rate of the selected SSRs. The linkage disequilibrium between markers was calculated to assess the applicability and credibility of the individual identification system. In this study, we successfully linked 3 stolen timbers back to 3 victim trees (case number MJIB-DNA-1080413 combined with 1080328), marking the first successful application of the C. taiwanensis individual identification system. Ultimately, our work should deter illegal felling of this precious species by making law enforcement more effective.
Result and discussion
Developing C. taiwanensis individual identification system
Choice of template and library preparation
gSSRs are characterized by high polymorphism and are suitable for developing individual identification markers. EST-SSRs are highly conserved and can be used to develop markers that categorize species and populations20,22,23. In this study, DNA and RNA libraries were constructed simultaneously to develop gSSR and EST-SSR markers, respectively (Fig. 1, Supplementary Sect. 1). Using the three DNA libraries and the one RNA library prepared for the study, sequences were compared between individual plants as well as between groups (Supplementary Sect. 1). With these two types of markers, we aimed to differentiate samples within and among species.
Nucleic acid sequencing and analysis
Next-generation sequencing technology makes it possible to obtain a large number of sequences in a short time. In this study, we used the Illumina MiSeq platform (2 × 300 bp) to sequence the DNA and RNA libraries (Fig. 1). A total of 13,651,578 and 11,763,646 raw reads were produced from the DNA and RNA libraries, respectively. The raw reads were deposited in the NCBI Sequence Read Archive (PRJNA506084). After quality trimming and merging, 4,236,284 contigs were assembled from the DNA pool and 4,392,534 from the RNA pool. Contig lengths ranged from 120 to 579 bases for DNA and from 120 to 529 bases for RNA, with an average of 420 bases. According to the work of timber researchers23,30, markers with fragment lengths of around 250 bases best meet our research goals. The contig lengths obtained from the four libraries were suitable for screening markers within 250 bp. A target size below 250 bp implies a higher PCR success rate, because the DNA in wood samples from seized timber and victim trees was mostly severely degraded.
SSR discovery and primer design
A total of 318,153 gSSR and 63,390 EST-SSR candidate sequences were identified with the Simple Sequence Repeat Identification Tool (SSRIT)31 (Fig. 1). The proportions of SSR-containing sequences in the genomic DNA and RNA libraries were 7.51% and 1.44%, respectively. Squirrell et al.32 suggest that the overall success rate of SSR marker development is about 10%. A successful marker is polymorphic and amplifies reliably by PCR, giving a good peak pattern with little stuttering and no non-amplifying (null) alleles. About 90% of the designed markers can therefore be expected to be screened out. We designed a total of 395 gSSR and 105 EST-SSR primer pairs for testing in C. taiwanensis.
Marker validation
From the PCR results, 23 gSSR and 12 EST-SSR polymorphic markers were selected (Fig. 1, Tables 1, 2), giving success rates of 5.82% (23 of 395) for gSSR and 11.43% (12 of 105) for EST-SSR markers. Our data show that it is easier to obtain SSR markers from the RNA library than from the DNA library, which is consistent with previous studies4,32,33. Other reports24,34,35 suggest that SSRs occur more frequently in EST sequences than in the genome. In addition, the markedly lower information content of ESTs compared with the genome facilitates in silico calculation and analysis of ESTs24,33.
The samples used in marker validation came from 4 populations (TP, SY, DS, FR), with 20 to 30 individuals in each (N = 25, 29, 21, 21), which meets the basic requirement of at least 15 individuals per group and 3 groups per study (Fig. 1, Supplementary Sect. 1). Among the 96 individuals sampled in this study, the number of alleles per gSSR ranged from 2 to 14, with an average of 6.5, whereas the number of alleles per EST-SSR ranged from 2 to 16, with an average of 7 (Tables 3, 4). Ho ranged from 0.000 to 0.802 for gSSR and from 0.021 to 0.604 for EST-SSR, with averages of 0.399 and 0.379, respectively. He ranged from 0.041 to 0.833 for gSSR and from 0.205 to 0.872 for EST-SSR, with averages of 0.488 and 0.528, respectively. Significant (P < 0.001) deviations from Hardy–Weinberg equilibrium (HWE) in terms of heterozygosity deficiency were detected in 9 gSSR loci: CoTW76, CoTW77, CoTW539, CoTW545, CoTW554, CoTW556, CoTW561, CoTW585 and CoTW595 (9/23 = 39.13%), and in 6 EST-SSR loci: CoTW383, CoTW502, CoTW511, CoTW513, CoTW514 and CoTW528 (6/12 = 50%). PIC ranged from 0.058 to 0.821 for gSSR and from 0.187 to 0.858 for EST-SSR, with averages of 0.459 and 0.482. PD ranged from 0.041 to 0.749 and from 0.205 to 0.885, with averages of 0.494 and 0.555. PE ranged from 0.000 to 0.479 and from 0.000 to 0.312, with averages of 0.169 and 0.180. PI ranged from 0.029 to 0.939 and from 0.114 to 0.794, with averages of 0.505 and 0.443. Two EST-SSR markers, CoTW383 and CoTW581, have putative functions found by BLAST hit (Table 2). Heterozygosity, one of the first parameters usually reported for a data set, reveals a great deal of information, including population structure and other historical clues. High heterozygosity indicates abundant genetic variation, whereas low heterozygosity indicates little genetic variation.
The heterozygosity data echo the results of PIC, PD and PI, suggesting that the SSR markers developed in this study show moderate genetic variation. In addition, most of these markers show Ho < He (except CoTW495, CoTW556, CoTW559, CoTW598 and CoTW424), which suggests that the C. taiwanensis population is inbred. A total of 15 SSR markers used in this study deviated from HWE, which suggests that the population may not be in ideal Hardy–Weinberg equilibrium. The deviation could result from artificial selection, non-panmixia or genetic drift36. Generally, EST-SSR markers are less polymorphic than gSSR markers in plants because transcribed regions are highly conserved24. Moreover, other factors33,37 such as SSR motif type, sample size, population and species may also differentiate gSSR and EST-SSR markers. In this study, however, polymorphism and cross-species transferability did not differ significantly between the gSSR and EST-SSR groups (Supplementary Sects. 2, 3), but did differ significantly among individual markers. This might be explained by polymorphism and detection limits, as markers with higher PD are often selected for individual identification. It might also be due to the giant genome size of the genus Chamaecyparis (20.03–27.40 pg/2C)38,39, which leads to a deviation from random sampling in marker selection. When the system is used for individual identification, markers with higher PD should be given priority.
Probability of identity and power of discrimination analysis
The cumulative probability of identity (CPI) and the combined power of discrimination (CPD) for non-linked markers can be calculated by successive multiplication across markers, where CPI is the probability that two individuals share the same genotype, CPD is the probability that individuals can be distinguished, and CPI + CPD = 1. The credibility of the system is calculated based on "Random match probability in population size and confidence levels" published by Budowle et al.43: Confidence level $= (1 - \mathrm{CPI})^{N}$, where $N$ is the population size.
When the system is applied in criminal cases, for the sake of objectivity and impartiality, courts in practice use a credibility of 95%, 99%, or 99.99% as the acceptance criterion40 (Wall 2002; ISO/IEC 17025). In this study, only one marker per linkage group is used for CPI analysis, and up to 30 markers can be accumulated (Table 5). The CPI of the individual identification system is as small as 5.596 × 10^−12, and the CPD is as high as 0.999999999994404 (extremely close to 1). Under the court's strictest credibility standard of 99.99%, the system with all 30 markers can identify 18 million individuals, which exceeds the whole C. taiwanensis population of 7.39 ± 0.73 million in Taiwan41. Under the lowest acceptable credibility standard of 95%, the system can identify at least 2300 plants with a minimum of 6 markers (Table 5).
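The relation between CPI, population size and confidence level can be checked numerically. The short Python sketch below evaluates (1 − CPI)^N and inverts it to find the largest population that still meets a required confidence level. The function names are illustrative and are not taken from the original analysis, which the Methods describe as carried out in Microsoft Excel; `log1p` is used only to avoid floating-point loss when CPI is very small.

```python
import math


def confidence_level(cpi: float, population: int) -> float:
    """Confidence level (1 - CPI)^N for a population of size N."""
    # exp(N * log1p(-CPI)) is numerically safer than (1 - CPI) ** N for tiny CPI.
    return math.exp(population * math.log1p(-cpi))


def max_population(cpi: float, confidence: float) -> int:
    """Largest N for which (1 - CPI)^N is still at least `confidence`."""
    return math.floor(math.log(confidence) / math.log1p(-cpi))


cpi_30_markers = 5.596e-12  # combined CPI reported for the 30-marker panel

n_max = max_population(cpi_30_markers, 0.9999)
print(f"Largest population at 99.99% confidence: {n_max:,}")

# Estimated size of the whole C. taiwanensis population cited in the text.
cl = confidence_level(cpi_30_markers, 7_390_000)
print(f"Confidence level for N = 7,390,000: {cl:.6%}")
```

Under these assumptions the largest population comes out slightly below 18 million (about 17.9 million), which is consistent with the rounded figure quoted in the text.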
Aligning seized timbers to victim trees
In this case (MJIB-DNA-1080413 combined with 1080328), we successfully matched 3 seized timbers back to 3 victim trees by using 19 pairs of non-linked SSR markers (Fig. 2, Table 6, Supplementary Sect. 4). The credibility values of the 3 matches are all above 99.99%. In our experiments, DNA was extracted twice or more from each sample in order to optimize the DNA concentration. Since 2007, forestry researchers have recognized that molecular markers can provide direct evidence linking stolen timber to victim trees42. Although many techniques have been developed for extracting DNA from fresh and dried leaves (including published literature43,44 and commercial reagents), few studies have reported on extracting DNA from dried wood, which is still considered the most challenging part of this field of research45. In forensic science, the validated DNA concentration range has been established as 0.625 to 10 ng/μL46. False negative results cannot be ruled out for over-concentrated samples, and vice versa. It is therefore necessary to extract DNA two or more times from dry timber. Several studies suggest that the error rate increases with the number of PCR cycles47,48. Base misincorporation during PCR occurs randomly throughout the sequence without hot spots48, with a probability of 1.85 × 10^−5 per base per cycle48. By comparing the results of positive and negative endpoints, we found that 36 cycles is the upper limit that yields a positive PCR product without false-positive results. The number of cycles was therefore capped at 35 in our study rather than increased without limit. In addition, the SSR types of each marker were analyzed at least twice with the ABI 3130XL.
Signals below the 150 RFU peak height threshold were considered not detectable. For illegal felling investigations, we developed a protocol comprising two instrument runs and a threshold value set from pilot test results. By comparing the profiles of positive and negative controls with those of the test samples, we can obtain objective data with minimal risk of error to support our conclusions as court evidence.
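To give a feel for why the number of PCR cycles matters, the sketch below turns the quoted misincorporation rate into a rough per-molecule estimate. It is a simplified illustration, not part of the original protocol: it assumes a 250-base amplicon (the target size discussed earlier) and a strand lineage that is copied in every one of the cycles, so it reads as an upper-bound style estimate. The function names are hypothetical.

```python
def expected_misincorporations(rate: float, amplicon_length: int, cycles: int) -> float:
    """Expected number of misincorporated bases along one strand lineage."""
    return rate * amplicon_length * cycles


def prob_at_least_one_error(rate: float, amplicon_length: int, cycles: int) -> float:
    """Probability that a lineage copied in every cycle carries at least one error."""
    return 1.0 - (1.0 - rate) ** (amplicon_length * cycles)


RATE = 1.85e-5   # misincorporation probability per base per cycle (from the text)
LENGTH = 250     # assumed amplicon length in bases
for cycles in (30, 35, 40):
    expected = expected_misincorporations(RATE, LENGTH, cycles)
    p_error = prob_at_least_one_error(RATE, LENGTH, cycles)
    print(f"{cycles} cycles: expected errors = {expected:.4f}, P(>=1 error) = {p_error:.4f}")
```

Because SSR genotyping reads fragment length rather than base sequence, point misincorporations of this kind are only one of several reasons to limit cycle number; the sketch illustrates the trend rather than a full error model.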
Thirty-seven victim trees were reported by the Luodong Forest District Office in December 2018. According to census data, the forest area of the crime scene is 281.03 hectares and the density of C. taiwanensis is 16 ± 1.6 individuals/hectare. However, in order to protect the suspects’ rights, we used a population size of 10,000 in the calculation, which exceeds the maximum possible population size. Among the 22 samples in this case, 7 were analyzed successfully, which by our definition means showing a positive result within 35 PCR cycles. The rest were denoted “Not detected” because of weak PCR results (all tests comply with the accredited laboratory standard ISO/IEC 17025) or CL < 95%. Notably, seized illegally-felled timber 6TC matched victim tree 6TB (CPI = 3.342 × 10^−13, CPD = 0.999999999999666, CL = 99.9999999%). In addition, seized illegally-felled timber 7TC matched victim trees 7TA and 7TB (CPI = 1.631 × 10^−13, CPD = 0.999999999999837, CL = 99.9999999%), and seized illegally-felled timber 8TB matched victim tree 8TA (CPI = 4.468 × 10^−10, CPD = 0.999999999553151, CL = 99.999532%). In this test case of the individual identification system (Table 6, Fig. 2, Supplementary Sect. 4), the minimum number of matching markers among the positively matched groups was 17 (CL = 99.999532%). Credibility depends on the population size and increases with the number of matching markers. When evidence is aligned to victim individuals, the sample DNA has commonly been degraded. Successful extraction is therefore a crucial step in identifying the same individual by DNA matching. The extractable DNA in desiccated timber is low in quantity and poor in quality, and can only be used for individual identification with markers developed for the specific species.
In this regard, SSR markers are a traditional tool for individual identification, widely used in humans and gradually extended to other species. All the SSR markers designed in this study are shorter than 300 bp, which makes them suitable for amplifying the degraded DNA fragments from desiccated timbers. Despite these merits, the overall success rate of DNA extraction and genotyping from timber is relatively low (31.82%, 7 of 22 samples). The low quantity and quality of DNA in timber samples probably limited our success rate, and an improved DNA extraction method would raise it. In addition, increasing the number of SSR markers capable of individual identification would decrease the overall CPI and increase CL. Overall, we provide scientific proof that can be used directly as court evidence in illegal felling cases. This is the first study to report an SSR individual identification system that could be applied to various precious species. The most valuable impact of this study is its deterrent effect on illegal felling: the DNA types of these precious trees have been filed. The illegal felling crime rate has been dropping since the cypress individual identification system was publicized. Moreover, the individual identification system could also provide certification for legal timber trading21. It would also deter dishonest traders from slipping illegal material into legal timber auctions, which would further forestall illegal logging. In addition, these markers can also be used in population genetic studies that facilitate the conservation and breeding of C. taiwanensis.
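The conservative population size used in the case work can be illustrated with a few lines of Python. The sketch compares the census-based estimate for the crime-scene forest (area × density) with the 10,000 individuals assumed in the calculation, then evaluates the confidence level (1 − CPI)^N for the three reported matches. The constant and function names are illustrative; only the numerical inputs come from the text.

```python
import math

AREA_HA = 281.03             # crime-scene forest area (hectares)
DENSITY_MEAN = 16.0          # C. taiwanensis individuals per hectare
DENSITY_SD = 1.6             # reported spread of the density estimate
ASSUMED_POPULATION = 10_000  # deliberately generous N used in the case


def census_population(area_ha: float, density_per_ha: float) -> float:
    """Expected number of trees from area and density."""
    return area_ha * density_per_ha


def confidence_level(cpi: float, population: int) -> float:
    """Confidence level (1 - CPI)^N, computed stably for tiny CPI."""
    return math.exp(population * math.log1p(-cpi))


expected = census_population(AREA_HA, DENSITY_MEAN)
upper = census_population(AREA_HA, DENSITY_MEAN + DENSITY_SD)
print(f"Census estimate: {expected:.0f} trees (upper: {upper:.0f}); assumed N = {ASSUMED_POPULATION}")

# CPI values reported for the three timber/victim-tree matches.
reported_cpi = {
    "6TC vs 6TB": 3.342e-13,
    "7TC vs 7TA/7TB": 1.631e-13,
    "8TB vs 8TA": 4.468e-10,
}
for pair, cpi in reported_cpi.items():
    cl = confidence_level(cpi, ASSUMED_POPULATION)
    print(f"{pair}: CL above 99.99%? {cl > 0.9999}")
```

Using a population larger than the census estimate lowers the computed confidence level, so any match that still clears the threshold does so under assumptions that favour the suspect.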
Conclusions
In this study, we developed an individual identification system for C. taiwanensis and provided the supporting scientific evidence. The methodology can be adopted by the courts to link seized timber to victim trees. The system includes 23 gSSR and 12 EST-SSR polymorphic markers. When the 30 non-linked markers were applied to C. taiwanensis identification, the CPI was as low as 5.596 × 10^−12 and the CPD as high as 0.999999999994404, which is sufficient to identify 18 million random samples of C. taiwanensis (CL = 99.99%). When applied to a criminal case of illegal C. taiwanensis logging, this SSR marker system successfully matched three seized illegally-felled timbers to three victim trees with a CL of at least 99.99%. To the best of our knowledge, this is the first time SSR technology has been applied to provide molecular evidence for court conviction in C. taiwanensis illegal logging. Our study not only provides scientific evidence correlating seized timber and victim trees, but could also give every single C. taiwanensis timber an inherent, unique serial number. We demonstrated the feasibility of matching seized or illegally-felled timber with victim trees by modern SSR technology, which should help prevent illegal logging by warning criminals that woodland trees can be identified at the molecular level. Additionally, these markers can also be used in population genetic studies that facilitate the conservation and breeding of C. taiwanensis.
Materials and methods
Developing C. taiwanensis individual identification system
Library preparation and SSR enrichment
In this study, we constructed both DNA and RNA libraries of C. taiwanensis (Fig. 1). Three DNA libraries were created from individuals of TP (Voucher no. Chung 2448) and 100R (Voucher no. Chung 2603, 2621) (Supplementary Sect. 1). To build the DNA libraries, genomic DNA was extracted from fresh leaves using the cetyltrimethylammonium bromide (CTAB) method49. The quality and concentration of DNA were measured with a NanoDrop 2000 (Thermo Fisher Scientific, San Diego, California, USA) and a Qubit 2.0 Fluorometer (Thermo Fisher Scientific). SSR-containing fragments were enriched from the total genomic DNA following the magnetic bead enrichment method of Glenn and Schable50. Briefly, DNA was digested using AluI/XmnI and HaeIII/XmnI (New England Biolabs, Ipswich, Massachusetts, USA). The double-stranded SuperSNX linkers (SuperSNX24 Forward: 5′-GTTTAAGGCCTAGCTAGCAGAATC-3′; SuperSNX24 + 4p: 5′-pGATTCTGCTAGCTAGGCCTTAAACAAA-3′) were ligated to the digested DNA fragments. The linker-conjugated DNA fragments were hybridized with biotin-labeled microsatellite probes: Mix 2: (AG)12, (TG)12; Mix 3: (AAC)6, (AAG)8, (AAT)12, (ATC)8, (ACT)12; Mix 4: (AAAC)6, (AAAG)6, (AATC)6, (AATG)6, (ACAG)6, (ACCT)6, (ACTG)6, (ACTC)6, (AAAT)8, (AACT)8, (ACAT)8, (AAGT)8, and (AGAT)8. The SSR-hybridized fragments were captured using Streptavidin M-280 Dynabeads (Invitrogen, Carlsbad, California, USA) and recovered by PCR using the SuperSNX24 Forward primer. The concentration and quality of the SSR-enriched libraries were measured with the NanoDrop 2000 and the Qubit 2.0 Fluorometer (Thermo Fisher Scientific).
One individual of C. taiwanensis (Voucher no.: Chung 2627) from XI was used to prepare the RNA library. RNA was extracted from fresh leaves using the CTAB method51. The quality and concentration of RNA were measured with the NanoDrop 2000 and the Qubit 2.0 Fluorometer. The RNA was reverse transcribed into complementary DNA (cDNA) using the Ovation RNA-Seq System V2 (NuGEN, San Carlos, California, USA), and the cDNA was quantified using the NanoDrop 2000 and an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, California, USA) by Tri-I Biotech, Inc. (New Taipei City, Taiwan). The cDNA was fragmented with a Covaris S220 focused-ultrasonicator (Covaris, Woburn, Massachusetts, USA), and the cDNA library was prepared according to the manual of the Ovation Ultralow DR Multiplex System 1–96 (NuGEN).
Sequencing and analysis
Three DNA libraries and one RNA library were sequenced using the Illumina MiSeq System (2 × 300 bp paired-end; Illumina, San Diego, California, USA) at Tri-I Biotech (New Taipei City, Taiwan). The raw reads were prescreened to remove adapter sequences and reads with greater than 0.1% error or an average quality below QV30. High-quality filtered DNA and cDNA reads were merged using CLC Genomics Workbench version 7.5 (QIAGEN, Aarhus, Denmark).
SSR screening and primer design
SSRIT was applied to screen the contigs for gSSR- and EST-SSR-containing sequences. To design gSSR and EST-SSR primers, sequences with at least five di-, tri-, tetra-, penta-, or hexa-nucleotide repeats were selected using BatchPrimer352, with the optimized conditions set to a primer length of 18–23 bp, a melting temperature of 45–62 ℃, and a product size of 80–300 bp.
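The repeat criterion above (at least five tandem copies of a 2 to 6 base motif) can be illustrated with a small regular-expression scan. This is a minimal sketch of the idea only; it is not the SSRIT or BatchPrimer3 implementation, the function name is hypothetical, and it ignores compound or imperfect repeats.

```python
import re

# A motif of 2-6 bases (shortest motif tried first), followed by at least
# four more copies of itself, i.e. five or more tandem repeats in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence: str) -> list[tuple[int, str, int]]:
    """Return (start position, motif, repeat count) for each perfect SSR found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA reported as (AA)n.
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


contig = "TTC" + "AG" * 6 + "CTT"  # toy contig, not real data
for start, motif, repeats in find_ssrs(contig):
    print(f"({motif}){repeats} at position {start}")
```

Trying the shortest motif first means a stretch such as AGAGAGAGAGAG is reported as (AG)6 rather than (AGAG)3, which matches the usual way SSR motifs are named.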
Marker validation
A total of 75 markers, comprising the 23 gSSR and 12 EST-SSR markers newly designed in this study and 40 published SSR markers4,53,54 (Supplementary Sect. 2), were subjected to validation on 96 samples from four C. taiwanensis populations (TP, SY, DS and FR; see Supplementary Sect. 1). In addition, we tested the cross-species transferability of the designed gSSR and EST-SSR markers (Supplementary Sect. 3). DNA from the samples used in the marker validation and cross-species transferability tests was extracted using the VIOGENE plant DNA extraction kit (VIOGENE, New Taipei City, Taiwan). PCR was conducted in a final volume of 20 μL containing 2 ng of genomic DNA, 0.25 μL of each 10 μM primer and 10 μL of Q-Amp 2 × Screening Fire Taq Master Mix (Bio-Genesis Technologies, Taipei, Taiwan). The following PCR conditions were used: an initial denaturation at 95 ℃ for 2 min; 30 cycles of 95 ℃ for 45 s, a primer-specific annealing temperature (Tables 1, 2) for 45 s, and 72 ℃ for 45 s; followed by a 15-min extension at 72 ℃. The amplified products were evaluated on the ABI 3130XL (Applied Biosystems, Waltham, Massachusetts, USA) with the GeneScan 500 ROX Size Standard (Applied Biosystems). Fragment size was determined using GeneMapper version 3.2 (Applied Biosystems).
Marker analysis
GenAlEx 6.51b255 was used to calculate the number of alleles (A), observed heterozygosity (Ho), expected heterozygosity (He), and Hardy–Weinberg equilibrium (HWE) of the newly developed gSSR and EST-SSR markers. PowerMarker V3.2556 was used to calculate the polymorphism information content (PIC)57. Power of discrimination (PD)58 was calculated as $PD = 1 - \sum_i P_i^2$, where $P_i$ is the frequency of genotype $i$. Power of exclusion, or probability of exclusion (PE)58, was calculated as $PE = h^2[1 - 2h(1 - h)^2]$, where $h$ is the frequency of heterozygotes. Probability of identity (PI)59 was calculated as $PI = 1 - PD$. The combined power of discrimination (CPD)58 of the 30 markers was calculated as $CPD = 1 - [(1 - PD_1)(1 - PD_2)\cdots(1 - PD_{30})]$, and the combined probability of identity (CPI)59 follows from CPI + CPD = 1. Microsoft Excel (Microsoft Office 2016) was used to calculate PD, PI, PE, CPD and CPI. GENEPOP 4.260 was used to test for linkage disequilibrium.
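The per-marker and combined statistics defined above can be sketched in a few lines of Python. This is an illustrative re-implementation under the stated formulas, not the Excel workbook used in the study; the function names and the toy genotypes (allele sizes in bp) are hypothetical.

```python
from collections import Counter
from math import prod


def power_of_discrimination(genotypes):
    """PD = 1 - sum(P_i^2), where P_i is the frequency of genotype i."""
    # Sort alleles so that (150, 154) and (154, 150) count as the same genotype.
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def power_of_exclusion(genotypes):
    """PE = h^2 * [1 - 2h(1 - h)^2], where h is the frequency of heterozygotes."""
    h = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return h ** 2 * (1.0 - 2.0 * h * (1.0 - h) ** 2)


def combined_pd(pd_values):
    """CPD = 1 - product of (1 - PD_k) over non-linked markers."""
    return 1.0 - prod(1.0 - pd for pd in pd_values)


# Toy genotypes for one marker in four individuals.
marker = [(150, 150), (150, 154), (154, 150), (154, 158)]
pd_value = power_of_discrimination(marker)
print(f"PD = {pd_value:.3f}, PI = {1.0 - pd_value:.3f}, PE = {power_of_exclusion(marker):.3f}")

# Combining markers: CPI is the complement of CPD.
cpd = combined_pd([pd_value, 0.5])
print(f"CPD = {cpd:.4f}, CPI = {1.0 - cpd:.4f}")
```

The product form is valid only for markers that are not in linkage disequilibrium, which is why a single marker per linkage group is used when accumulating CPI.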
Aligning seized timbers to victim trees
Samples were collected from five seized timbers held by the Taiwan Yilan District Prosecutors Office, six illegally-felled timbers found in the crime-scene woodland, and seven victim trees (Supplementary Sect. 4). Duplicate samples of one victim tree (7TA and 7TB) were taken to confirm that the SSR type of an individual tree is reproducible. Two grams of each sample were powdered in liquid nitrogen, and total genomic DNA was extracted following the protocol of the VIOGENE plant DNA extraction kit (VIOGENE, New Taipei City, Taiwan). Nineteen non-linked markers were selected for DNA typing. The samples that were typed successfully were then combined with the aforementioned database to calculate the CPI.
References
Hwang, S. Y., Lin, H. W., Kuo, Y. S. & Lin, T. P. RAPD variation in relation to population differentiation of Chamaecyparis formosensis and Chamaecyparis taiwanensis. Bot. Bull. Acad. Sin. 42, 173–179 (2001).
Wang, W. P., Hwang, C. Y., Lin, T. P. & Hwang, S. Y. Historical biogeography and phylogenetic relationships of the genus Chamaecyparis (Cupressaceae) inferred from chloroplast DNA polymorphism. Plant Syst. Evol. 241, 13–28. https://doi.org/10.1007/s00606-003-0031-0 (2003).
Chen, Y. J. & Chang, S. T. Distribution and characteristic comparisons of the endemic cypress in Taiwan. Taiwan J. For. Sci. 32, 71–86 (2017).
Huang, C. J. et al. Isolation and characterization of SSR and EST-SSR loci in Chamaecyparis formosensis (Cupressaceae). Appl. Plant Sci. 6, e01175. https://doi.org/10.1002/aps3.1175 (2018).
Tereba, A., Woodward, S., Konecka, A., Borys, M. & Nowakowska, J. A. Analysis of DNA profiles of ash (Fraxinus excelsior L.) to provide evidence of illegal logging. Wood Sci. Technol. 51, 1377–1387. https://doi.org/10.1007/s00226-017-0942-5 (2017).
Cabral, E. C. et al. Wood typification by Venturi easy ambient sonic spray ionization mass spectrometry: The case of the endangered Mahogany tree. J. Mass Spectrom. 47, 1–6. https://doi.org/10.1002/jms.2016 (2012).
Kite, G. C. et al. Dalnigrin, a neoflavonoid marker for the identification of Brazilian rosewood (Dalbergia nigra) in CITES enforcement. Phytochemistry 71, 1122–1131. https://doi.org/10.1016/j.phytochem.2010.04.011 (2010).
Dormontt, E. E. et al. Forensic timber identification: It’s time to integrate disciplines to combat illegal logging. Biol. Cons. 191, 790–798. https://doi.org/10.1016/j.biocon.2015.06.038 (2015).
Vlam, M. et al. Developing forensic tools for an African timber: Regional origin is revealed by genetic characteristics, but not by isotopic signature. Biol. Cons. 220, 262–271. https://doi.org/10.1016/j.biocon.2018.01.031 (2018).
Celani, C. P., Lancaster, C. A., Jordan, J. A., Espinoza, E. O. & Booksh, K. S. Assessing utility of handheld laser induced breakdown spectroscopy as a means of Dalbergia speciation. Analyst 144, 5117–5126. https://doi.org/10.1039/c9an00984a (2019).
Gasson, P. How precise can wood identification be? Wood anatomy’s role in support of the legal timber trade, especially CITES. IAWA J. 32, 137–154 (2011).
Speer, J. H. Fundamentals of Tree-Ring Research (University of Arizona Press, Tucson, 2010).
McClure, P. J., Chavarria, G. D. & Espinoza, E. Metabolic chemotypes of CITES protected Dalbergia timbers from Africa, Madagascar, and Asia. Rapid Commun. Mass Spectrom. 29, 783–788. https://doi.org/10.1002/rcm.7163 (2015).
Tsuchikawa, S. & Schwanninger, M. A review of recent near-infrared research for wood and paper (Part 2). Appl. Spectrosc. Rev. 48, 560–587. https://doi.org/10.1080/05704928.2011.621079 (2013).
Braun, B. Wildlife detector dogs—A guideline on the training of dogs to detect wildlife in trade. WWF Germany, Berlin, 1–16 (2013).
Rummel, S., Hoelzl, S., Horn, P., Rossmann, A. & Schlicht, C. The combination of stable isotope abundance ratios of H, C, N and S with 87Sr/86Sr for geographical origin assignment of orange juices. Food Chem. 118, 890–900. https://doi.org/10.1016/j.foodchem.2008.05.115 (2010).
Hua, Q., Barbetti, M. & Rakowski, A. Z. Atmospheric radiocarbon for the period 1950–2010. Radiocarbon 55, 2059–2072. https://doi.org/10.2458/azu_js_rc.v55i2.16177 (2013).
Hollingsworth, P. M. et al. A DNA barcode for land plants. Proc. Natl. Acad. Sci. 106, 12794–12797. https://doi.org/10.1073/pnas.0905845106 (2009).
Lowe, A. J. & Cross, H. B. The application of DNA methods to timber tracking and origin verification. IAWA J. 32, 251–262 (2011).
Jobling, M. A. & Gill, P. Encoded evidence: DNA in forensic analysis. Nat. Rev. Genet. 5, 739–751. https://doi.org/10.1038/nrg1455 (2004).
Lowe, A. J., Wong, K. N., Tiong, Y. S., Iyerh, S. & Chew, F. T. A DNA Method to verify the integrity of timber supply chains; confirming the legal sourcing of merbau timber from logging concession to sawmill. Silvae Genetica 59, 263–268. https://doi.org/10.1515/sg-2010-0037 (2010).
Dawnay, N. et al. A forensic STR profiling system for the Eurasian badger: A framework for developing profiling systems for wildlife species. Forensic Sci. Int. Genet. 2, 47–53. https://doi.org/10.1016/j.fsigen.2007.08.006 (2008).
Fregeau, C. J. & Fourney, R. M. DNA typing with fluorescently tagged short tandem repeats: A sensitive and accurate approach to human identification. Biotechniques 15, 100–119 (1993).
Varshney, R. K., Graner, A. & Sorrells, M. E. Genic microsatellite markers in plants: features and applications. Trends Biotechnol. 23, 48–55. https://doi.org/10.1016/j.tibtech.2004.11.005 (2005).
Cho, Y. G. et al. Diversity of microsatellites derived from genomic libraries and GenBank sequences in rice (Oryza sativa L.). Theor. Appl. Genet. 100, 713–722. https://doi.org/10.1007/s001220051343 (2000).
Eujayl, I., Sorrells, M., Baum, M., Wolters, P. & Powell, W. Assessment of genotypic variation among cultivated durum wheat based on EST-SSRs and genomic SSRs. Euphytica 119, 39–43. https://doi.org/10.1023/a:1017537720475 (2001).
Wang, C. et al. Genome survey sequencing of purple elephant grass (Pennisetum purpureum Schum ‘Zise’) and identification of its SSR markers. Mol. Breed. 38, 94 (2018).
Zhou, S. et al. The first Illumina-based de novo transcriptome analysis and molecular marker development in Napier grass (Pennisetum purpureum). Mol. Breed. 38, 95 (2018).
Liao, P. C., Lin, T. P. & Hwang, S. Y. Reexamination of the pattern of geographical disjunction of Chamaecyparis (Cupressaceae) in North America and East Asia. Bot. Stud. 51, 511–520 (2010).
Schroeder, H. et al. Development of molecular markers for determining continental origin of wood from White Oaks (Quercus L. sect. Quercus). PLoS ONE 11, e0158221. https://doi.org/10.1371/journal.pone.0158221 (2016).
Temnykh, S. et al. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): Frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11, 1441–1452 (2001).
Liu, G., Xie, Y., Zhang, D. & Chen, H. Analysis of SSR loci and development of SSR primers in Eucalyptus. J. For. Res. 29, 273–282. https://doi.org/10.1007/s11676-017-0434-3 (2018).
Zhang, M., Mao, W., Zhang, G. & Wu, F. Development and characterization of polymorphic EST-SSR and genomic SSR markers for Tibetan annual wild barley. PLoS ONE 9, e94881. https://doi.org/10.1371/journal.pone.0094881 (2014).
Gao, L., Tang, J., Li, H. & Jia, J. Analysis of microsatellites in major crops assessed by computational and experimental approaches. Mol. Breed. 12, 245–261 (2003).
Kantety, R. V., La Rota, M., Matthews, D. E. & Sorrells, M. E. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol. Biol. 48, 501–510 (2002).
Sharma, R. et al. Genetic diversity estimates point to immediate efforts for conserving the endangered Tibetan sheep of India. Meta Gene 8, 14–20 (2016).
Ouyang, P. et al. Development and characterization of high-throughput EST-Based SSR markers for Pogostemon cablin using transcriptome sequencing. Molecules https://doi.org/10.3390/molecules23082014 (2018).
Hizume, M., Kondo, T., Shibata, F. & Ishizuka, R. Flow cytometric determination of genome size in the Taxodiaceae, Cupressaceae sensu stricto and Sciadopityaceae. Cytologia 66, 307–311 (2001).
Ohri, D. & Khoshoo, T. N. Genome size in gymnosperms. Plant Syst. Evol. 153, 119–132 (1986).
Wall, W. Genetics & DNA Technology: Legal Aspects. (Routledge-Cavendish, 2002).
Qiu, L. W., Huang, Q. X., Wu, C. C. & Hsieh, H. T. The Summary of the Fourth Forest Resources Inventory in Taiwan. (Taipei, 2015).
Degen, B. & Fladung, M. Use of DNA-markers for tracing illegal logging. In Proceedings of the international workshop “Fingerprinting methods for the identification of timber origins” October. 8–9 (2007).
Asif, M. & Cannon, C. H. DNA extraction from processed wood: A case study for the identification of an endangered timber species (Gonystylus bancanus). Plant Mol. Biol. Rep. 23, 185–192 (2005).
Fatima, T., Srivastava, A., Hanur, V. S. & Rao, M. S. An effective wood DNA extraction protocol for three economic important timber species of India. Am. J. Plant Sci. 09, 139–149. https://doi.org/10.4236/ajps.2018.92012 (2018).
Tnah, L. H., Lee, S. L., Ng, K. K. S., Bhassu, S. & Othman, R. Y. DNA extraction from dry wood of Neobalanocarpus heimii (Dipterocarpaceae) for forensic DNA profiling and timber tracking. Wood Sci. Technol. 46, 813–825 (2012).
Dormontt, E. et al. Forensic validation of a SNP and INDEL panel for individualisation of timber from bigleaf maple (Acer macrophyllum Pursch). Forensic Sci. Int. Genet. 46, 102252 (2020).
Blais, J. et al. Risk of misdiagnosis due to allele dropout and false-positive PCR artifacts in molecular diagnostics: Analysis of 30,769 genotypes. J. Mol. Diagn. 17, 505–514 (2015).
Cummings, S. M., McMullan, M., Joyce, D. A. & van Oosterhout, C. Solutions for PCR, cloning and sequencing errors in population genetic analysis. Conserv. Genet. 11, 1095–1097 (2010).
Doyle, J. J. & Doyle, J. L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. (1987).
Glenn, T. C. & Schable, N. A. Isolating microsatellite DNA loci. Methods Enzymol. 395, 202–222 (2005).
Chang, S., Puryear, J. & Cairney, J. A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Rep. 11, 113–116 (1993).
You, F. M. et al. BatchPrimer3: A high throughput web application for PCR and sequencing primer design. BMC Bioinform. 9, 253. https://doi.org/10.1186/1471-2105-9-253 (2008).
Nakao, Y., Iwata, H., Matsumoto, A., Tsumura, Y. & Tomaru, N. Highly polymorphic microsatellite markers in Chamaecyparis obtusa. Can. J. For. Res. 31, 2248–2251. https://doi.org/10.1139/cjfr-31-12-2248 (2001).
Matsumoto, A. et al. Development and polymorphisms of microsatellite markers for hinoki (Chamaecyparis obtusa). Mol. Ecol. Notes 6, 310–312. https://doi.org/10.1111/j.1471-8286.2006.01212.x (2006).
Peakall, R. & Smouse, P. E. GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research—An update. Bioinformatics 28, 2537–2539 (2012).
Liu, K. & Muse, S. V. PowerMarker: An integrated analysis environment for genetic marker analysis. Bioinformatics 21, 2128–2129 (2005).
Botstein, D., White, R. L., Skolnick, M. & Davis, R. W. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32, 314 (1980).
Fisher, R. Standard calculations for evaluating a blood-group system. Heredity 5, 95 (1951).
Jones, D. A. Blood samples: Probability of discrimination. J. Forensic Sci. Soc. 12, 355–359. https://doi.org/10.1016/s0015-7368(72)70695-7 (1972).
Raymond, M. & Rousset, F. GENEPOP (version 1.2): Population genetics software for exact tests and ecumenicism. J. Hered. 86, 248–249 (1995).
Acknowledgements
The authors thank Dr. Kuo-Fang Chung (Biodiversity Research Center, Academia Sinica, Taiwan) for his support with research funding and facilities, and Dr. Chaolun Allen Chen and Dr. Shu-Miaw Chaw (Biodiversity Research Center, Academia Sinica, Taiwan) for their support with research facilities. The authors also thank the Forestry Bureau, Council of Agriculture, Executive Yuan, Taiwan, for its support in sample collection, and the officials of the Taiwan Yilan District Prosecutors Office; the Fourth Division of the Seventh Special Police Corps, National Police Agency, Ministry of the Interior; and the Luodong Forest District Office for generously providing samples for the test cases. This work was financially supported by the Ministry of Science and Technology, Taiwan (grant no. MOST 104-2321-B-002-056) and the Ministry of Justice, Taiwan (grant no. 109-1301-05-17-02).
Author information
Contributions
C.J.H. conceived, designed and conducted the experiments, wrote the main manuscript text, prepared the figures and tables, collected samples, applied for funding and submitted the manuscript. F.H.C. edited the manuscript. Y.S.H. edited the manuscript and assisted in preparing the figures and tables. Y.M.H. performed the experiments. Y.H.T. edited the manuscript. C.E.P. analyzed data and edited the manuscript. C.H.C. edited the manuscript. Y.S.C. analyzed data. S.C.L., Y.T.Y., S.Y.H., H.C.H., and C.T.H. collected samples and performed the experiments. M.Y.C., T.A.L., H.Y.S., and Y.C.T. performed the experiments. C.T.C. applied for funding. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, CJ., Chu, FH., Huang, YS. et al. Development and technical application of SSR-based individual identification system for Chamaecyparis taiwanensis against illegal logging convictions. Sci Rep 10, 22095 (2020). https://doi.org/10.1038/s41598-020-79061-z
DOI: https://doi.org/10.1038/s41598-020-79061-z