Online Resources Calvert et al. Melaleuca TPS - Springer10.1007/s00606-017-1454... · construction...
Online Resources Calvert et al. Melaleuca TPS
TERPENE SYNTHASE GENES IN MELALEUCA ALTERNIFOLIA: COMPARATIVE ANALYSIS OF LINEAGE-SPECIFIC SUBFAMILY VARIATION WITHIN MYRTACEAE
Plant Systematics and Evolution
Authors: Jed Calvert*, Abdul Baten*, Jakob ButlerΨ, Bronwyn Barkla* and Mervyn Shepherd*^
*Southern Cross Plant Science, Southern Cross University, Lismore Australia NSW 2480.
ΨSchool of Biological Science, University of Tasmania, Hobart TAS 7005 Australia.
^Corresponding author: Dr Mervyn Shepherd, Southern Cross Plant Science, Southern Cross University, Lismore Australia NSW 2480.
Email: [email protected]
Phone: +61 2 66203412
Online Resource 1 Methods and results for generating a draft genome sequence for M.
alternifolia
The study used a draft genome sequence from a reference genotype of M. alternifolia
(SCU01). This individual has Chemotype 4 terpene chemistry (high 1,8-cineole and
intermediate terpinen-4-ol) and was clonally replicated and archived in a germplasm resource
collection located at the Lismore campus of SCU (Shepherd et al. 2015). DNA was first
prepared using the DNeasy Plant Maxi Kit (Qiagen P/L), following the procedure for frozen
tissue described by Shepherd et al. (2010). Small-insert (300, 350 and 550 bp) libraries were
constructed with the TruSeq DNA PCR-Free Library Preparation Kit (Illumina P/L); library
construction and sequencing were carried out by Macrogen P/L, Singapore. Four lanes of
Illumina HiSeq were used to generate a total of 100 Gb of high-quality sequence from 100 bp
paired-end reads.
Raw sequencing reads were trimmed to remove low quality bases and adaptor sequences.
A draft assembly of M. alternifolia was constructed using the CLC de novo assembler (CLC
Bio, Aarhus, Denmark).
The draft genome comprised a total of 221,396 contigs with a total length of 356 Mb and
an N50 of 8,778 bp.
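As a reminder of what the N50 statistic reported above measures, the following minimal Python sketch computes it from a list of contig lengths: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the CLC assembler output or from the M. alternifolia assembly.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of a collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk through contigs from longest to shortest.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that takes the running sum to >= 50% of the
        # assembly defines the N50.
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, total 1,500 bp):
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 400, because 500 + 400 = 900 >= 750
```

Applied to the real assembly, the input would be the lengths of all 221,396 contigs.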
A gene annotation for M. alternifolia was generated with the MAKER pipeline v2.31.8
(Cantarel et al. 2008) using protein sequence evidence from E. grandis, C. citriodora and
Vitis sp. MAKER generated 33,184 draft gene models for M. alternifolia with an Annotation
Edit Distance of >0.35. An annotated version of the contigs was uploaded to CoGe for homology
searching with CoGe BLAST and comparative analysis of gene structure with CoGe GEvo
(Lyons and Freeling 2008; Lyons et al. 2008). To check MAKER's efficacy, tBLASTn searches
were run against the M. alternifolia genome assembly to look for TPS genes lying outside
MAKER gene models (amino acid queries from Kulheim et al. 2015, SM3). Two query
sequences (TPSb line 1 and TPSf line 2) returned no hits. Hits to all other queries (116 in
total) overlapped, or were contained within, gene models predicted by MAKER (see Table SM5
for tabulated results). This suggests that the pipeline, which drew its gene model predictions
from E. grandis, C. citriodora and Vitis sp. protein sequence evidence, is at least as effective
as a straight homology search, because its search parameters are relaxed enough to allow for
some missing consensus sequences and it uses multiple lines of evidence.
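The check described above, whether each tBLASTn hit overlaps or is contained within a predicted gene model, comes down to an interval-overlap test on contig coordinates. The sketch below shows one way this could be done. It is not the procedure the authors ran: the `Interval` record, the function names and the toy coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from collections import namedtuple

# Hypothetical record for a feature on a contig (1-based, inclusive coordinates).
Interval = namedtuple("Interval", ["contig", "start", "end"])


def overlaps(a, b):
    """True if two intervals lie on the same contig and share at least one base."""
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end


def hits_outside_models(hits, gene_models):
    """Return the hits that do not overlap any gene model."""
    return [h for h in hits if not any(overlaps(h, g) for g in gene_models)]


# Toy data (assumed values, for illustration only).
models = [Interval("contig_1", 100, 900), Interval("contig_2", 50, 400)]
hits = [
    Interval("contig_1", 150, 300),   # contained within a model
    Interval("contig_1", 850, 1000),  # partially overlapping a model
    Interval("contig_2", 500, 700),   # same contig, but no overlap
    Interval("contig_3", 10, 90),     # contig with no model at all
]
for hit in hits_outside_models(hits, models):
    print(hit)
```

An empty result would correspond to the outcome reported here, in which every hit was associated with a MAKER gene model. The nested loop is adequate for a few hundred hits; a genome-wide comparison would normally use an interval index instead.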
Gene set completeness and assembly quality were evaluated from the genes predicted by
MAKER by determining how many of a set of 429 low-copy-number, highly conserved eukaryote
genes (Benchmarking Universal Single-Copy Orthologs, BUSCO; Simão et al. 2015) were
captured in our set of contigs. BUSCO analysis reports the recovered genes
as “complete” when their lengths are within two standard deviations of the BUSCO group mean
length and “duplicated” when more than one copy is detected. A BUSCO analysis was also
carried out on the E. grandis v2 genome for comparative purposes (Table SM1).
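The classification rule described above can be written as a small decision function. This is a simplified sketch based only on the description in the text, not BUSCO's actual implementation. The function name and arguments are hypothetical. The "fragmented" and "missing" labels are BUSCO's other standard reporting categories and are not described in the passage above.

```python
def classify_busco(match_lengths, mean_length, sd_length):
    """Classify one BUSCO group from the lengths of its recovered matches.

    match_lengths: lengths of all candidate matches found for the group.
    mean_length, sd_length: mean and standard deviation of the group's length.
    """
    if not match_lengths:
        return "missing"
    lower = mean_length - 2 * sd_length
    upper = mean_length + 2 * sd_length
    # A match counts as complete when its length is within two SD of the mean.
    complete = [n for n in match_lengths if lower <= n <= upper]
    if len(complete) > 1:
        return "duplicated"
    if len(complete) == 1:
        return "complete"
    # Something was found, but nothing of the expected length.
    return "fragmented"


# Toy group with an assumed mean length of 300 and SD of 20.
for lengths in ([310], [310, 295], [120], []):
    print(lengths, classify_busco(lengths, 300, 20))
```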
The fragmented state of our draft genome for M. alternifolia precluded the localisation of
genes to chromosomes, so this was not evaluated in this study. However, we noted where
TPS gene models co-occurred on the same contig, as TPS genes tend to cluster in gene families in
the Myrtaceae (Kulheim et al. 2015). Although the chromosome number for M. alternifolia is
unknown, the 56 Melaleuca studied so far have a haploid number of 11, the base number for
the Myrtaceae (Rye 1979).
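Finding TPS gene models that co-occur on the same contig amounts to grouping the models by contig and keeping the contigs that carry more than one. The sketch below assumes a simple mapping from gene model identifier to contig identifier; the identifiers and the function name are hypothetical and do not come from the annotation described here.

```python
from collections import defaultdict


def contigs_with_multiple_models(model_to_contig):
    """Return {contig: [models]} for contigs carrying more than one gene model."""
    by_contig = defaultdict(list)
    for model, contig in model_to_contig.items():
        by_contig[contig].append(model)
    return {c: sorted(m) for c, m in by_contig.items() if len(m) > 1}


# Toy mapping of TPS gene models to contigs (assumed identifiers).
tps_models = {
    "tps_model_a": "contig_12",
    "tps_model_b": "contig_12",
    "tps_model_c": "contig_40",
}
print(contigs_with_multiple_models(tps_models))
```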