Introduction

Chamaecyparis taiwanensis Masam. & Suzuki [= Chamaecyparis obtusa (Sieb. & Zucc.) Endl. var. formosana (Hayata) Hayata] (Cupressaceae) is a gymnosperm endemic in Taiwan. C. taiwanensis is endemic to Taiwan and is the dominant species in the conifer and broadleaf tree mixed forest, located in middle altitude region (from 1700 m to 2600 m) of Taiwan island1. The lowest latitude boundary of cypress’ natural distribution falls into Taiwan, suggesting a great significance in biogeography2. As an indispensable resource for making elegant buildings, furniture and handicrafts, these species play a vital role in serving wood source and timber industry. C. taiwanensis is well-known for their wood quality and expensiveness (4400 USD/m3)(woodprice.forest.gov.tw), which often lead to endless illegal felling crimes. Therefore, developing individual identification system to C. taiwanensis is of more importance3.

Illegal felling remains a persistent problem in the timber producing countries all over the world. For decades, illegal logging endangered precious and valuable tree species such as cypress4, ash5, mahogany tree6, and Brazilian rosewood7 all over the world. In some cases, the law enforcement authorities, such as forestry police, arrest the suspects in time. However, lack of direct scientific evidence that correlate timbers to the stumps leads the conviction processes rather difficult and ineffective. Thus, the need of individual identification is critical to the forestry industry.

The problem of illegal logging has been paid attention since 1995. More and more national and international regulations mandate tracking systems that ensure traceability on wood market8,9,10. Wood anatomy and dendrochronology are common visual identification method. The former is based on the anatomical characteristics to identify the wood, and can usually be identified to the genus11; the latter is often used to illustrate past climates, but may also provide the age and origin of the trees12. Compounds synthesized by trees and other plants are often called phytochemicals and are often used to identify species or distinguish genera. Intraspecific variation can also be detected in some species through some chemical analysis such as mass spectrometry12,13, near infrared spectroscopy14, detector dogs15, stable isotopes16, and radiocarbon17. Genetic analysis can provide species-level identification, which is usually achieved by DNA sequence polymorphism18. Simple sequence repeats (SSRs) and Single nucleotide polymorphisms (SNPs) can be used to identify individuals and can be used in population genetics or systematic geography to determine the geographical region of origin within a species19. DNA fingerprinting is built into each organism itself and cannot be forged20. When enough markers are developed, in principle every individual has its own unique DNA fingerprint. DNA fingerprinting has the potential to track wood products independently within complex global supply network21. Theoretically, DNA fingerprinting is the only forensic wood identification technology that could be used to connect seized timber to illegally felled stumps8.

SSR is the most common marker used in individual identification for its short length, high polymorphism, easy polymerase chain reaction (PCR) amplification, high reproducibility, and high sensitivity20,22,23. SSRs are divided into two broad categories by different sources: Genomic (g)-SSR and expressed sequence tag (EST)-SSR24. gSSR markers are derived from amplified genomic libraries. EST-SSRs are markers mined from EST sequence collections. gSSR markers have been reported to be more polymorphic when compared with EST-SSR in gymnosperms4 and crops25,26 because of a more diversified nucleotide sequence. Since the development of high-throughput sequencing technology, the marker development technique has been continuously advanced. Wang et al., 2018 published the first report on gSSR developed by De novo genome sequencing27. In contrast, EST-SSR, derived from the expressed sequence, is fast-acting, cost-effective and labor-saving alternative for non-model organisms24. Because of the conservative nature in gene coding regions24, newly developed EST-SSRs usually can be transferred in closely related species for marker development. The first EST-SSR based on Illumina-based de novo transcriptome was also published by Zhou et al. in 201828. A study to develop both markers would avail of their merits and functions simultaneously.

For C. taiwanensis, evaluation of genetic variation or population structure is necessary for its preservation2,29 because this species is used extensively. After mid-twentieth century, the number of C. taiwanensis plunged, which also led to a significant decrease in both genetic variation and population structure. As an important tool for genetic and subsequent breeding, SSR markers are helpful for breeding polymorphic maternal plants and increasing the diversity of progeny. The objective of this study is to establish a scientifically valid SSR mediated individual identification system for C. taiwanensis in order to provide court evidence to link the seized wood and the victim tree, and to provide traceability proof for wood supply network. In the beginning of the research, we used Next Generation Sequence (NGS) technology to establish the DNA and RNA libraries of C. taiwanensis to accelerate the development of gSSR and EST-SSR markers. A total of 96 samples from four populations were used to evaluate the polymorphism, discriminative power, and random match rate of the selected SSRs. The linkage disequilibrium between markers was calculated to estimate the availability and credibility of the individual identification system. In this study, we successfully linked 3 stolen timbers back to 3 victim trees (case number MJIB-DNA-1080413 combine 1080328), marked the first successful application of C. taiwanesis individual identification system. Finally, our work would deter illegal felling toward these precious species by manifesting law enforcement effectively.

Result and discussion

Developing C. taiwanensis individual identification system

Choice of template and library preparation

The gSSR are characterized by high polymorphism and is suitable for developing individual identification markers. The EST-SSR are highly conservative which could be used for developing markers to categorize species and populations20,22,23. In this study, both DNA and RNA libraries were constructed simultaneously as gSSR and EST-SSR markers, respectively (Fig. 1, Supplementary Sect. 1). From the three DNA libraries and from a RNA library prepared for the study, the sequences were compared between individual plants as well as between groups (Supplementary Sect. 1). With these two nucleic acid markers, we envisioned to differentiate samples within or among species.

Figure 1
figure 1

Flowchart describing the procedure of developing SSR markers and aligning illegally-felled timbers to victim trees of C. taiwanensis. (a) 35 SSR markers specific to individual identification were selected from the DNA and RNA libraries of C. taiwanensis. The cumulative random matching rate of the system reaches CPI = 5.596 × 10–12, which can be used to identify 18 million individuals with a credibility of 99.99% (b) 11 seized timbers were compared with 7 victim trees, and 3 timbers were matched with 3 victim trees successfully. The values of credibility in all matched cases were over 95%. (N number of individuals, P number of populations).

Nucleic acid sequencing and analysis

Next-generation sequencing technology enables the possible procurement of large number of sequences in a short time. In this study, we used the Illumina MiSeq platform (2 × 300 bp) to sequence the DNA and RNA libraries (Fig. 1). A total of 13,651,578 and 11,763,646 raw reads were produced from DNA and RNA libraries, respectively. The raw reads were deposited in the NCBI Sequence Read Archive (PRJNA506084). The sequences were then subjected to quality-trimming and merging and afterwards 4,236,284 contigs of the DNA pool and 4,392,534 RNA contigs were assembled. The base lengths of contigs ranged from 120–579 and 120–529, at an average of 420 for DNA and RNA, respectively. According to the work published by timber researchers23,30, the nucleic acid markers with fragment lengths of around 250 bases best meet our research goals. The lengths of contigs derived from the four libraries we have prepared were found to be suitable for screening markers within 250 bp length. A target band size below 250 bp implies a higher PCR success rate as the DNA of wood samples from seized timber and victim trees were mostly severely degraded.

SSR discovery and primer design

A sum of 318,153 gSSR and 63,390 EST-SSR candidate sequences were screened by Simple Sequence Repeat Identification Tool (SSRIT)31 (Fig. 1). The proportions of SSR in the genomic DNA and RNA libraries were 7.51% and 1.44%, respectively. Study by Squirrell et al.32 suggests that the overall success rate of SSR marker development is about 10%. With PCR, polymorphic high-quality markers could be successfully amplified resulting to a good peak pattern quality with little stuttering and absence of non-amplifying (null) alleles and other factors. Therefore, about 90% of the designed markers could be screened out. We designed a total of 395 gSSR and 105 EST-SSR primer pairs for testing in C. taiwanensis.

Marker validation

From the PCR results, 23 gSSR and 12 EST-SSR markers with polymorphism were selected (Fig. 1, Tables 1, 2), and the success rate for gSSR and EST-SSR marker was found to be 5.82% and 11.42%, respectively. Our data showed that it is easier to select SSR markers from the RNA library than from the DNA library, which is akin to previous studies4,32,33. Other reports24,34,35 suggest that SSR occurs more frequently in EST sequence than in the genome. In addition, the fact that the information content in EST is markedly lower than that in the genome promotes the calculation and analysis of EST in silico24,33.

Table 1 Characteristics of 23 gSSR loci developed in Chamaecyparis taiwanensis.
Table 2 Characteristics of 12 EST-SSR loci developed in Chamaecyparis taiwanensis.

The samples used in marker validation came from 4 ethnic groups (TP, SY, DS, FR), with 20 to 30 individuals in each group (N = 25, 29, 21, 21), qualified the basic requirement of at least 15 individuals per group and 3 groups per study (Fig. 1, Supplementary Sect. 1). Among the 96 individuals sampled in this study, the number of alleles per gSSR is between 2 and 14 with 6.5 in average, whereas the number of alleles per EST-SSR is between 2 and 16, 7 in average (Tables 3, 4). The levels of Ho are from 0.000 to 0.802 and 0.021 to 0.604, with average of 0.399 and 0.379, respectively. The levels of He of gSSR and EST-SSR are ranged from 0.041 to 0.833 and 0.205 to 0.872, with average of 0.488 and 0.528, respectively. Significant (P < 0.001) deviations of Hardy–Weinberg equilibrium (HWE) in terms of heterozygosity deficiency were detected in 9 gSSR loci: CoTW76, CoTW77, CoTW539, CoTW545, CoTW554, CoTW556, CoTW561, CoTW585 and CoTW595 (9/23 = 39.13%) and also in 6 EST-SSR loci: CoTW383, CoTW502, CoTW511, CoTW513, CoTW514 and CoTW528(6/12 = 50%). The levels of PIC of gSSR and EST-SSR are ranged from 0.058 to 0.821 and 0.187 to 0.858, with average 0.459 and 0.482. The levels of PD from 0.041 to 0.749 and 0.205 to 0.885, with average 0.494 and 0.555. The levels of PE of gSSR and EST-SSR are ranged from 0.000 to 0.479 and 0.000 to 0.312, with average 0.169 and 0.180. The levels of PI of gSSR and EST-SSR from 0.029 to 0.939 and 0.114 to 0.794, with average 0.505 and 0.443. Two EST-SSR markers, CoTW383 and CoTW581, have putative functions found by BLAST hit (Table 2). Heterozygosity, being one of the first parameters that appear often in a data set, reveals lot of information including population structure and other historical clue. High heterozygosity means a lot of genetic variation, whereas low heterozygosity means almost no genetic variation. The heterozygosity data echo the results of PIC, PD and PI, suggesting that the SSR marker developed in this study has moderate genetic variation. In addition, most of these markers show Ho < He (except CoTW495, CoTW556, CoTW559, CoTW598, CoTW424), which suggests that the population of C. taiwanensis is an inbred. A total of 15 sets of SSR marker used in this study deviated from HWE, which suggest the population may be not under the ideal status of HWE. The reason for this deviation could be artificial selection, non-panmixia or genetic drift36. Generally, EST-SSR markers are less polymorphic than gSSR in plants because of high conservation in transcribed regions24. Moreover, other factors33,37 such as SSR motif type, sample size, population and species may also differentiate gSSR and EST-SSR markers. However in this study, in terms of polymorphism and cross-species transferability, there was no significant difference between gSSR and EST-SSR groups (Supplementary Sects. 2, 3), but the difference rather occurred among individual markers. This fact might be explained by polymorphism and detection limit as markers with higher PD are often selected for individual identification. Also in our study, the differences in polymorphism and cross-species transferability between gSSR and EST-SSR are not significant, but those among markers are significant. It might be because of the giant genome size in taxa Chamaecyparis (20.03–27.40 pg/2C)38,39 which leads a deviation from random sampling in marker selection. When using the system to perform individual identification assay, a marker with higher PD should be considered as priority.

Table 3 Genetic characterization of 23 polymorphic gSSR loci of Chamaecyparis taiwanensis.
Table 4 Genetic characterization of 12 polymorphic EST-SSR loci of Chamaecyparis taiwanensis.

Probability of identity and power of discrimination analysis

Continued multiplication can be used to calculate the cumulative random probability of identity (CPI) and the combined power of discrimination (CPD) for non-linked markers, where CPI is the probability of two individuals most likely the same genotype, CPD is the probability of individuals being identified, and CPI + CPD = 1. The credibility of the system is calculated based on "Random match probability in population size and confidence levels" published by Budowle et al.43: Confidence levels = (1 − CPI)N where N = Population size.

While applying the system in criminal cases, for the sake of objective and impartiality, practically the court will use the credibility of 95%, 99%, or 99.99% as aacceptance criteria40 (Wall 2002, ISO ISO/IEC 17025). In this study, only one marker in the same linkage group is used for CPI analysis, and up to 30 markers can be continuously accumulated (Table 5). Also, the individual identification system’s CPI is as small as 5.596 × 10–12, and the CPD is as high as 0.99999999994404 (extremely close to 1). Applying the court's strictest credibility standard of 99.99%, when the number of markers reached up to 30, the system can identify 18 million individuals, which actually exceed the whole C. taiwanensis population of 7.39 ± 0.73 million in Taiwan41. While applying to the lowest acceptable credibility standard of 95%, the system could identify at least 2300 plants with a minimum number of 6 markers (Table 5).

Table 5 The discrimination power in SSR marker combination.

Aligning seized timbers to victim trees

In this case (MJIB-DNA-1080413 combine 1,080,328), we successfully matched 3 seized timbers back to 3 victim trees by using 19 pairs of non-linkage SSR markers (Fig. 2, Table 6, Supplementary Sect. 4). The credibility values of the 3 cases are all above 99.99%. In our experiments, DNA samples were extracted twice or more from each sample in order to optimize the DNA concentration. Since 2007, forestry researchers have noticed that molecular markers can be used to provide direct evidence linking stolen timber and victim trees42. Although many techniques have developed for extracting DNA from fresh and dried leaves (including published literature43,44 and commercial reagents), yet few studies have reported on extracting DNA from dried wood, which is still considered the most challenging part in this field of research45. In forensic science field of study, it has been established that the validated DNA concentration range is between 0.625 and 10 ng/μL46. False negative result cannot be ruled out from over-concentrated sample and vice versa. Therefore, it is necessary to extract DNA two or more times for dry timber, as abovementioned, because its DNA extraction is challenging. Several studies suggested that the error rate increases along with PCR cycles47,48. Base misincorporation incurred by PCR occurs randomly throughout the sequence without hot spots48. The probability of base misincorporation is 1.85 × 10–5 per base per cycle48. After comparing the results of positive and negative endpoint, we discovered 36 cycles is the upper limit which leads to positive PCR product without false-positive result. From comparing the results of positive and negative endpoint, we discovered 36 cycles is the upper limit which leads to positive PCR product without false-positive result. Therefore, the cycles were controlled below 35 cycles in our study, but not increasing cycles without limit. In addition, the SSR types of each marker were analyzed at least twice with ABI 3130XL. Signals below 150 RFU peak height threshold were considered not detectable. We developed a protocol of two sessions of instrumental operation and setup threshold value from pilot test result for illegal felling investigation cases. By comparing the profiles from positive and negative controls with test samples, we can obtain objective data with least erroneous possibility to conclude our investigation for court evidence.

Figure 2
figure 2

The photos and matrices of the three groups in which seized timbers and victim trees are successfully linked via 19 non-linked SSR markers matches. (a) Group 6TB/6TC, in wild. (b) Group 7TA/7TB/7TC, sampled. (c) Group 8TA/8TB, chopped. *A1 Freq Allele 1 Frequency. **A2 Freq = Allele 2 Frequency. Allele frequencies are from 96 individuals of C. taiwanensis. PI = p2 or 2pq. CPI combined probability of identity, CL confidence levels.

Table 6 individual PI and CPI of four matched groups.

Thirty-seven victim trees were reported by Luodong Forest District Office in December 2018. According to census data, the crime scene forest area is 281.03 hectare and the density of C. taiwanensis is 16 ± 1.6 individuals/hectare. However, in order to protect suspects’ rights, we took an excess of the maximum possible population size: 10,000 into the calculation. Among 22 samples in this case, 7 succeeded in analysis, which is, by our definition, showing positive result in just 35 PCR cycles. The rest were denoted “Not detected” due to low positive PCR result (all tests comply the standard of accredited laboratory ISO/IEC 17025) or CL < 95%. It is worthwhile to note that seized illegally-felled timber 6TC matched the victim tree 6 TB (CPI = 3.342 × 10–13, CPD = 0.999999999999666, CL = 99.9999999%). In addition, seized illegally-felled timber 7TC matched with victim tree 7TA and 7 TB (CPI = 1.631 × 10–13, CPD = 0.999999999999837, CL = 99.9999999%), and seized illegally-felled timber 8 TB matched with victim tree 8TA (CPI = 4.468 × 10–10, CPD = 0.999999999553151, CL = 99.999532%). In this individual identification system test case (Table 6, Fig. 2, Supplementary Sect. 4), the minimal amount of matching marker was 17 among the positive-matched groups (CL = 99.999532%). The credibility increased along with the matching marker amount. The credibility is dependent on population size and matching marker amount. In addition, while aligning the evidence to the victim individuals, it is a common scenario that the sample DNA might have been degraded. Successful extraction is one of the crucial steps to identify same individual using DNA matching techniques. The extractable DNA in desiccated timber is low in quantity and poor in quality. The extracted DNA can only be used for individual identification using markers developed for specific species. In this regard, SSR marker is a traditional marker for individual identification, which has been widely used in human and gradually extended to other species. All the SSR markers designed in this study are shorter than 300 bp, which would be suitable for amplifying the lysed DNA fragments from desiccated timbers. Although SSR marker has the merits above mentioned, the overall success rate of DNA extraction and genotyping from timber is relatively low (31.81%, 7 out of 22 samples tested successfully). The low quantity and quality of DNA in timber sample might have limited our success rate. An improvement on DNA extraction method would enhance our success rate on timber samples. In addition, increasing the number of SSR markers capable of individual identification would decrease the overall CPI and increase CL. Overall, we provide scientific proof that can be used directly as court evidence in illegal felling cases. This is the first time study reporting the SSR individual identification system which could be applied to various precious species. A warning to forestall illegal felling is the most valuable impact of this study: DNA types of these precious trees have been filed. The illegal felling crime rate is dropping after public propagation of cypress individual identification system. Moreover, the individual identification system would also provide certificate for legal timber trading21. This system would also deter dishonest businessman piggybacking illegal material in legal timber auction, which would further forestall illegal logging. In addition, these markers can be also used in population genetic analysis studies that facilitate the conservation and breeding of C. taiwanensis.

Conclusions

In this study, we developed an individual identification system for C. taiwanensis and provided the scientific evidence. This methodology can be adopted by the courts to link seized timber and victim trees. The C. taiwanensis individual identification system of this study includes 23 gSSR and 12 EST-SSR markers revealing polymorphism. When the 30 non-linkage markers were applied to C. taiwanensis identification, the lowest CPI was 5.596 × 10–12 and the highest CPD was 0.999999999994404, which was sufficient to identify 18 million random samples of C. taiwanensis (CL = 99.99%). While applied in the criminal cases of C. taiwanensis illegal logging, this SSR marker system successfully matched five seized illegally-felled timbers to three victim trees with minimal 99.99% CL. To the best of our knowledge, this is the first time the SSR technology is being applied to provide molecular evidence for court conviction on C. taiwanensis illegal logging. Our study would provide not only the scientific evidence correlating seized timber and victim tree, but also could inherent unique serial number to identify every single C. taiwanensis timber. We demonstrated the feasibility of matching seized/ illegally-felled timber with victim tree by modern SSR technology, which would prevent illegal logging by warning the criminals that the woodland trees could be identified on the basis of molecular level. Additionally, these markers can be also used in population genetic analysis studies that facilitate the conservation and breeding of C. taiwanensis.

Materials and methods

Developing C. taiwanensis individual identification system

Library preparation and SSR enrichment

In this study, we constructed both DNA and RNA libraries of C. taiwanensis (Fig. 1.). Three DNA libraries were created from individuals of TP (Voucher no. Chung 2448) and 100R (Voucher no. Chung 2603, 2621) (Supplementary Sect. 1). To build the DNA libraries, genomic DNA was extracted from fresh leaves using the cetyltrimethylammonium bromide (CTAB) method49. The quality and concentration of DNA were measured by NanoDrop 2000 (Thermo Fisher Scientific, San Diego, California, USA) and Qubit 2.0 Fluorometer (Thermo Fisher Scientific). From the total genomic DNA, microsatellites enriched in SSR markers was followed the magnetic bead enrichment method of Glenn and Schable50. Briefly, DNA was digested using AluI/XmnI and HaeIII/XmnI (New England Biolabs, Ipswich, Massachusetts, USA). The double-stranded SuperSNX linkers (SuperSNX24 Forward: 5′-GTTTAAGGCCTAGCTAGCAGAATC-3′; SuperSNX24 + 4p: 5′-pGATTCTGCTAGCTAGGCCTTAAACAAA-3′) were ligated to the digested DNA fragments. The linker-conjugated DNA fragments were hybridized with Biotin-labeled microsatellite probes containing Mix 2: (AG)12, (TG)12; Mix 3: (AAC)6, (AAG)8, (AAT)12, (ATC)8, (ACT)12; Mix4: (AAAC)6, (AAAG)6, (AATC)6, (AATG)6, (ACAG)6, (ACCT)6, (ACTG)6, (ACTC)6, (AAAT)8, (AACT)8, (ACAT)8, (AAGT)8, and (AGAT)8. The SSR hybridized fragments were extracted using Streptavidin M-280 Dynabeads (Invitrogen, Carlsbad, Calsbad, California, USA) and recovered by PCR using the SuperSNX24 Forward primers. The concentration and quality of SSR-enriched libraries were measured by Nanodrop 2000 (Thermo Fisher Scientific, Carlsbad, San Diego, California, USA) and Qubit 2.0 Fluorometer (Thermo Fisher Scientific, USA).

One individual of C. taiwanensis (Voucher no.: Chung 2627) from XI was used to prepare RNA library. RNA was extracted from fresh leaves by using the CTAB method51. The quality and concentration of RNA were measured by NanoDrop 2000 and Qubit 2.0 Fluorometer. The RNA was reverse transcribed into complementary DNA (cDNA) using Ovation RNA-Seq System V2 (NuGEN, San Carlos, California, USA) and the cDNA was quantitated using Nanodrop 2000 and Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, California, USA) by Tri-I Biotech, Inc. (New Taipei City, Taiwan). The cDNA was fragmented by Covaris S220 focused-ultrasonicator (Covaris, Woburm, Massachusetts, USA) and the cDNA library was prepared according to the manual of Ovation Ultralow DR Multiplex System 1–96 (NuGEN).

Sequencing and analysis

Three DNA and one RNA libraries were sequenced using the Illumina MiSeq System (2 × 300 bp paired-end; Illumina, San Diego, California, USA) at Tri-I Biotech (New Taipei City, Taiwan). The raw reads were prescreened to remove adapter sequences and reads with greater than 0.1% error or with an average quality less than QV30. High-quality filtered DNA and cDNA reads were merged by CLC Genomics Workbench version 7.5 (QIAGENE, Aarhus, Denmark).

SSR screening and primer design

SSRIT was applied to screen the gSSR and EST-SSR containing sequences from contigs. To design gSSR and EST-SSR primers, sequences with at least five di-, tri-, tetra-, penta-, and hexa-nucleotide repeats were selected using BatchPrimer352, with optimized conditions set length at 18–23 bp, melting temperature 45–62 ℃, and a product size of 80–300 bp.

Marker validation

A total of 75 markers including 23 gSSR and 12 EST-SSR markers newly designed in this study, and 40 published SSR4,53,54 (Supplementary Sect. 2) were subjected to validation test on 96 samples from four C. taiwanensis populations (TP, SY, DS and FR, see Supplementary Sect. 1). In addition, we also tested cross-species transferability of the designed gSSR and EST-SSR markers (Supplementary Sect. 3). The samples used in marker validation and cross-species transferability of DNA were extracted using the VIOGENE plant DNA extraction kit (VIOGENE, New Taipei City, Taiwan). The PCR reaction was conducted with a final volume 20 μL containing 2 ng of genomic DNA, 0.25 μL of 10 μM each primer and 10 μL of Q-Amp 2 × Screening Fire Taq Master Mix (Bio-Genesis Technologies, Taipei, Taiwan). The following PCR conditions were used: an initial denaturation of 95 ℃ for 2 min; 30 cycles of 95 ℃ for 45 s, a primer-specific annealing temperature (Tables 1, 2) for 45 s, and 72 ℃ for 45 s; followed by a 15-min extension at 72 ℃ (Tables 1, 2). The amplified products were evaluated on the ABI 3130XL (Applied Biosystems, Waltham, Massachusetts, USA) with GeneScan 500 ROX Size Standard (Applied Biosystems). Fragment size was determined by using GeneMapper version 3.2 (Applied Biosystems).

Marker analysis

GenAlex 6.51b255 was used to calculate number of alleles (A), observed heterozygosity (Ho), expected heterozygosity (He), Hardy–Weinberg equilibrium (HWE) of the newly developed gSSR and EST-SST markers. PowerMarker V3.2556 was used to calculate polymorphism information content or power of information content (PIC)57. Power of discrimination (PD)58, PD = 1 − ΣPi2, where Pi is the frequency of genotype i . Power of exclusion or probability of exclusion (PE)58, PE = h2[1 − 2 h(1 − h)2], where h is the frequency of heterozygotes. Probability of identity (PI)59, PI = 1 − PD. Combined power of discrimination (CPD)58, here we calculated CPD of 30 markers. CPD = 1 − [(1 − PD1)(1 − PD2)…(1 − PD30)].Combined probability of identity (CPI)59. Microsoft Excel (Microsoft Office 2016) was used to calculate PD, PI, PE, CPD, CPI. GENEPOP 4.260 was used to test for linkage disequilibrium.

Aligning seized timbers to victim trees

Samples from five seized timbers of Taiwan Yilan District Prosecutors Office, six illegally-felled timbers found at crime scene woodland and seven victim trees (Supplementary Sect. 4) were collected. Duplicates of a victim tree (7TA and 7TB) was sourced out in order to ensure the reproducibility of the identical SSR type in individual tree. Two grams of each sample was powdered in liquid nitrogen and the total genomic DNA was extracted following the protocol of VIOGENE plant DNA extraction kit (VIOGENE, New Taipei City, Taiwan). Nineteen non-linkage markers were selected for DNA typing. The sample succeeded in typing were further combined to the aforementioned database to calculation the CPI.