Online Resources Calvert et al. Melaleuca TPS - Springer10.1007/s00606-017-1454... · construction...

Online Resources Calvert et al. Melaleuca TPS

TERPENE SYNTHASE GENES IN MELALEUCA ALTERNIFOLIA: COMPARATIVE ANALYSIS OF LINEAGE-SPECIFIC SUBFAMILY VARIATION WITHIN MYRTACEAE

Plant Systematics and Evolution

Authors: Jed Calvert*, Abdul Baten*, Jakob ButlerΨ, Bronwyn Barkla* and Mervyn Shepherd*^

*Southern Cross Plant Science, Southern Cross University, Lismore Australia NSW 2480.

ΨSchool of Biological Science, University of Tasmania, Hobart TAS 7005 Australia.

^Corresponding author: Dr Mervyn Shepherd, Southern Cross Plant Science, Southern Cross University, Lismore Australia NSW 2480.

Email: [email protected]

Phone: +61 2 66203412

Online Resource 1 Methods and results for generating a draft genome sequence for M. alternifolia

The study used a draft genome sequence from a reference genotype of M. alternifolia (SCU01). This individual has Chemotype 4 terpene chemistry (high 1,8-cineole and intermediate terpinen-4-ol) and was clonally replicated and archived in a germplasm resource collection located at the Lismore campus of SCU (Shepherd et al. 2015). DNA was first prepared using the DNeasy Plant Maxi Kit (Qiagen P/L), following the procedure for frozen tissue described by Shepherd et al. (2010). Small-insert (300, 350 and 550 bp) libraries were constructed with the TruSeq DNA PCR-Free Library Preparation Kit (Illumina P/L), with library construction and sequencing by Macrogen P/L, Singapore. Four lanes of Illumina HiSeq were used to generate a total of 100 Gb of high-quality sequence from 100 bp paired-end reads.

Raw sequencing reads were trimmed to remove low-quality bases and adaptor sequences. A draft assembly of M. alternifolia was constructed using the CLC de novo assembler (CLC Bio, Aarhus, Denmark).

The draft genome comprised 221,396 contigs with a total length of 356 Mb and an N50 of 8,778 bp.
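For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the CLC assembler output or from the study data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not the M. alternifolia contigs):
# total = 54, half = 27, and 10 + 9 + 8 reaches 27 at the contig of length 8.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 8
```

An N50 of 8,778 bp therefore means that half of the 356 Mb assembly lies in contigs of at least that length.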

A gene annotation for M. alternifolia was generated with the MAKER pipeline v2.31.8 (Cantarel et al. 2008) using protein sequence evidence from E. grandis, C. citriodora and Vitis sp. MAKER generated 33,184 draft gene models for M. alternifolia with an Annotation Edit Distance of >0.35. An annotated version of the contigs was uploaded for homology searching using CoGe BLAST and for comparative analysis of gene structure using CoGe GEvo (Lyons and Freeling 2008; Lyons et al. 2008). To check MAKER’s efficacy, tBLASTn was used against the M. alternifolia genome assembly to look for TPS genes outside the MAKER gene models (amino acid queries from Kulheim et al. 2015 SM3). Two query sequences (TPSb line 1 and TPSf line 2) returned no hits. Hits to all other queries (116 in total) were associated with (overlapping or contained within) gene models predicted by MAKER; see Table SM5 for tabulated results. This suggests that the pipeline, which drew its gene model predictions from E. grandis, C. citriodora and Vitis sp. protein sequence evidence, is at least as effective as a straight homology search: its search parameters are relaxed enough to allow for some missing consensus sequences, and it uses multiple lines of evidence.
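The check described above, whether each tBLASTn hit overlaps or is contained within a predicted gene model, reduces to an interval-overlap test per contig. The sketch below illustrates that idea only; it is not the procedure used to produce Table SM5. The function names, the `(contig, start, end)` tuple layout and all coordinates are hypothetical, and the coordinates are assumed to be normalised so that start <= end (tBLASTn reports minus-strand hits with start > end).

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed intervals share at least one base.

    Containment is a special case of overlap, so it is covered too.
    """
    return a_start <= b_end and b_start <= a_end


def hits_outside_models(hits, models):
    """Return the hits that overlap no gene model on the same contig.

    hits, models: iterables of (contig, start, end) with start <= end.
    """
    by_contig = defaultdict(list)
    for contig, start, end in models:
        by_contig[contig].append((start, end))
    return [
        (contig, start, end)
        for contig, start, end in hits
        if not any(
            overlaps(start, end, m_start, m_end)
            for m_start, m_end in by_contig.get(contig, [])
        )
    ]


# Hypothetical coordinates for illustration only.
models = [("contig_1", 100, 900), ("contig_2", 50, 400)]
hits = [
    ("contig_1", 200, 500),   # contained within a model
    ("contig_1", 850, 1200),  # partially overlapping a model
    ("contig_2", 500, 700),   # same contig, no overlap
    ("contig_3", 10, 90),     # contig without any model
]
print(hits_outside_models(hits, models))
```

An empty result from such a check corresponds to the outcome reported here, where every query that produced a hit was associated with a MAKER gene model.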

Gene set completeness and assembly quality were evaluated from the genes predicted by MAKER by determining how many of a set of 429 low-copy-number, highly conserved eukaryote genes were captured in our set of contigs (Benchmarking Universal Single-Copy Orthologs, BUSCO; Simão et al. 2015). BUSCO analysis reports recovered genes as “complete” when their lengths are within two standard deviations of the BUSCO group mean length and as “duplicated” when more than one copy is detected. A BUSCO analysis was also carried out on the E. grandis v2 genome for comparative purposes (Table SM1).
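The length criterion stated above can be illustrated with a small sketch. This is a simplified reading of the rule as worded in this section, not BUSCO's own code: the function name `classify_busco`, its parameters and the example numbers are assumptions, and the "fragmented" and "missing" labels are the other standard BUSCO report categories, added here so that every input receives a label.

```python
def classify_busco(match_lengths, group_mean, group_sd, n_sd=2.0):
    """Classify one BUSCO group from the lengths of its recovered matches.

    A match counts as complete when its length lies within n_sd standard
    deviations of the group's mean length. More than one complete match
    is reported as duplicated.
    """
    complete = [
        length
        for length in match_lengths
        if abs(length - group_mean) <= n_sd * group_sd
    ]
    if len(complete) > 1:
        return "duplicated"
    if len(complete) == 1:
        return "complete"
    if match_lengths:
        return "fragmented"  # something was found, but outside the length window
    return "missing"


# Hypothetical group with mean length 400 aa and SD 20 aa (window 360-440).
print(classify_busco([410], 400, 20))
print(classify_busco([410, 395], 400, 20))
print(classify_busco([200], 400, 20))
print(classify_busco([], 400, 20))
```

In a highly fragmented assembly, genes split across contigs tend to fall outside the length window, which is one reason the comparison with the E. grandis v2 genome is informative.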

The fragmented state of our draft genome for M. alternifolia precluded the localisation of genes to chromosomes, so this was not evaluated in this study. However, it was noted where TPS gene models co-occurred on the same contig, as TPS genes tend to cluster in gene families in the Myrtaceae (Kulheim et al. 2015). Although the chromosome number for M. alternifolia is unknown, the 56 Melaleuca studied so far have a haploid number of 11, the base number for the Myrtaceae (Rye 1979).
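Noting which TPS gene models share a contig amounts to grouping model identifiers by contig and keeping the contigs that carry two or more. The sketch below shows one way this could be done; the function name, the input mapping and the identifiers are hypothetical and do not correspond to gene models from the study.

```python
from collections import defaultdict


def co_occurring_models(model_to_contig):
    """Group gene model IDs by contig and keep contigs carrying two or more."""
    by_contig = defaultdict(list)
    for model_id, contig in model_to_contig.items():
        by_contig[contig].append(model_id)
    return {
        contig: sorted(ids)
        for contig, ids in by_contig.items()
        if len(ids) > 1
    }


# Hypothetical identifiers for illustration only.
tps_models = {
    "tps_model_a": "contig_17",
    "tps_model_b": "contig_17",
    "tps_model_c": "contig_204",
}
print(co_occurring_models(tps_models))
```

With an N50 of 8,778 bp most contigs are too short to hold several genes, so co-occurrence detected this way is likely to understate the true extent of clustering.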
