Supporting Information - PNAS...2014/06/06 · 5. Kessner D, Chambers M, Burke R, Agus D, Mallick P...

Supporting InformationSalomon et al. 10.1073/pnas.1406110111SI Materials and MethodsBacterial Strains and Media. The Vibrio parahaemolyticus RIMD2210633 derivative strain POR1 (RIMD 2210633 ΔtdhAS) (1)and its derivatives, as well as Vibrio alginolyticus 12G01 and itsderivatives, were routinely cultured in marine LB (MLB) broth[LB broth containing 3% (wt/vol) sodium chloride] or on marineminimal media agar (2) at 30 °C. Escherichia coli DH5α andE. coli S17 (λ pir) were routinely cultured in 2× yeast-extract andtryptone broth at 37 °C. The medium was supplemented withkanamycin (30 μg/mL for E. coli and 250 μg/mL for Vibriosspecies) or chloramphenicol (25 μg/mL), when necessary, tomaintain a plasmid.

Plasmids. For arabinose-inducible expression of vp1389, vp1416,and vti2, the genes coding the sequences were amplified andcloned into the pBAD/Myc-His vector (Invitrogen), in which theantibiotic resistance was changed from ampicillin to kanamycin,or into the pBAD33 vector (Addgene). Empty pBAD/Myc-Hisand pBAD33 vectors were used to provide resistance to kana-mycin and chloramphenicol, respectively.

Construction of Deletion Strains.Gene deletions were performed aspreviously described using the pDM4 vector (3, 4) for vp1388,vp1388–vp1389, vp1415, vp1415–vp1416, vpa1263, vpa1263–vti2,v12G01_02265, v12G01_02265–v12G01_02260, and v12G01_01540(V. alginolyticus hcp1). Construction of POR1Δhcp1 was describedpreviously (4).

Bacterial Killing Assays. Bacterial killing assays were performed aspreviously described (4). Cells were cocultured at an OD600 ratioof 1:4 (prey/attacker) on MLB (for V. parahaemolyticus attacker)or LB (for V. alginolyticus attacker) agar plates for 4 h at 30 °C intriplicate. Where appropriate, plates were supplemented with0.1% arabinose to induce expression from plasmids. Each ex-periment was done in triplicate and repeated at least twice, withsimilar results. A representative experiment is presented. A two-tailed Student t test was used to determine significance betweenindicated sample groups unless stated otherwise.

Proteomics and MS. V. parahaemolyticus cultures of POR1ΔopaR[type 6 secretion system (T6SS1)+] and POR1ΔopaRΔhcp1(T6SS1−) were grown in triplicate in 50 mL of M9 minimalmedia with 3% (wt/vol) NaCl media containing 20 μM phenamilfor 5 h at 30 °C. Media were collected, and protein was pre-cipitated as previously described (4). Protein samples were run10 mm into the top of an SDS/PAGE gel, stained with Coo-massie Blue, and excised. Overnight digestion with trypsin(Promega) was performed after reduction and alkylation withDTT and iodoacetamide (Sigma–Aldrich). The resultingsamples were analyzed by tandem MS using a QExactive massspectrometer (Thermo Electron) coupled to an Ultimate 3000RSLC-Nano liquid chromatography system (Dionex). Peptideswere loaded onto a 180–μm i.d., 15-cm long, self-packed columncontaining 1.9 μm C18 resin (Dr. Maisch, Ammerbuch, Ger-many) and eluted with a gradient of 0–40% buffer B for 60 min.Buffer A consisted of 2% (vol/vol) acetonitrile (ACN) and 0.1%formic acid in water. Buffer B consisted of 80% (vol/vol) ACN,10% (vol/vol) trifluoroethanol, and 0.08% formic acid in water.The QExactive mass spectrometer acquired up to 10 high-en-ergy, collision-induced dissociation fragment spectra for each fullspectrum acquired.

Raw MS data files were converted to peak list format usingProteoWizard msconvert (version 3.0.3535) (5). The resultingfiles were analyzed using the central proteomics facilities pipeline(CPFP), version 2.0.3 (6, 7). Peptide identification was performedusing the X!Tandem (8) and openMS search algorithm (OMSSA) (9)search engines against a database consisting of V. parahaemolyticusRIMD 2210633 sequences from the National Center for Bio-technology Information, with common contaminants and reverseddecoy sequences appended (10). Fragment and precursor toler-ances of 20 ppm and 0.1 Da were specified, and three missedcleavages were allowed. Carbamidomethylation of Cys was spec-ified as a fixed modification, and oxidation of Met was specified asa variable modification. Label-free quantitation of proteins acrosssamples was performed using SINQ normalized spectral indexsoftware (11).To identify statistically significant differences in protein amount

between T6SS1+ and T6SS1− strains, SINQ quantitation results forthree biological replicates per strain were processed using thepower law global error model (PLGEM) package in R (12, 13).Protein identifications were filtered to an estimated 1% proteinfalse discovery rate (FDR) using the concatenated target-decoymethod (10). An additional requirement of two unique peptidesequences per protein was imposed, resulting in a final proteinFDR <1%. Spectral index quantitation was performed usingpeptide-to-spectrum matches (PSMs) with a q-value ≤0.01, corre-sponding to a 1% FDR rate for PSMs.The dataset of tandem MS results was uploaded to the Pep-

tideAtlas repository (www.peptideatlas.org/PASS/PASS00442;accession no. PASS00442).

Bioinformatics.TheN-terminal sequence fromVP1388 (GIj28898162,range 144–297) identifies conserved sequence motifs (PhhPhR andGhhYhhh) near the N terminus of numerous bacterial sequences.One of these sequences (GIj520945140, range 1–249) was queriedagainst the nonredundant (NR) database using the position-spe-cific iterative (PSI)-BLAST application of BLAST+ (14) [10 iter-ations, E-value cutoff of 0.005, collecting all subject GIs (NCBIprotein sequence identifier)] to collect all potential MIX (markerfor type six effectors) sequences. Collected sequences were filteredto include only those from complete genomes in the ReferenceSequences (RefSeq) database (15) and were clustered usingCLANS (16). Sequences from five resulting broad groups wereindependently aligned using the MAFFT server (17), and werecolored according to conservation using Jalview (18) to high-light hydrophobicity patterns surrounding the conserved motifs.The corresponding domain sequences from each of the fivemultiple sequence alignments were aligned manually using themotifs and surrounding hydrophobicity patterns and PROMALS3D(19) alignments of representatives as a guide. Motifs were high-lighted with WebLogo (20).The domain organization of representative sequences was

defined using Batch Entrez and conserved domains (CD)-search(21) with default cutoffs. Identified domains were sorted ac-cording to E-value, and incomplete domains with E-values>1.6e-4 were excluded from counts (excluded partial domainsinclude mainly low-complexity sequence, such as coiled coils andtransmembrane helices). Two low-complexity coil domains withE-values better than the cutoff were also excluded. Domainsidentified within this lower E-value range (0.01–1.6e−4) thatwere also identified in other sequences with higher confidencewere included in the counts (i.e., 38 LysM domains were iden-tified with an E-value range from 0.00015 to 6.8e−14). Unknown

Salomon et al. www.pnas.org/cgi/content/short/1406110111 1 of 6

http://www.peptideatlas.org/PASS/PASS00442

www.pnas.org/cgi/content/short/1406110111

sequences represent 855 sequences that identified no additionaldomains and 58 sequences that identified domains with low-complexity regions that were excluded. Given the presence ofa number of excluded transmembrane helix (TMH)-containingdomain predictions, Phobius (22) was used to predict TMHs inthe C-terminal sequence of all unknown MIX-containing pro-teins. Several of the sequences with TMH-containing domainspredicted by CD-search that were excluded due to low confi-dence were submitted to the HHpred server (23) for domainvalidation. All of the submitted sequences identified variouscolicin pore-forming toxins (ranging from 9–93.3% probability)in their predicted TMH-containing regions.T6SS components VrgG (GIj28898168, range 1–525) and

haemolysin coregulated proteins (Hcp) (GIj256599595, range 1–163) were queried against the NR database (two iterations withan E-value cutoff of 0.0001 and 10 iterations with an E-valuecutoff of 0.001, respectively, collecting all subject GIs) using PSI-BLAST (14). Subject sequences were filtered to include onlythose from complete genomes in the RefSeq database (15), andVrgG sequences were clustered using CLANS (16) to purgefalse-positive hits. Species containing the resulting VrgG andHcp sequence representatives were sorted and compared withthose containing MIX using GeneVenn (24).

Genomic neighborhoods encompassing the T6SS machinerywere explored using the Microbial Genome Database for Com-parative Analysis (25) and the Search Tool for the Retrieval ofInteracting Genes (STRING) (26). The STRING database v9.1describing protein association networks was used to generategenome neighborhood information for (i) core T6SS compo-nents and (ii) MIX-containing groups. For the core T6SS com-ponents, a network was generated in the clusters of orthologousgroups (COG) mode from Hcp (COG3157). Only neighborhoodand gene fusion scores with a high-confidence cutoff (>0.7) wereconsidered for generating the T6SS network. MIX sequencesfor each group were submitted to the STRING database using“multiple sequences” submission in the COG mode to identifyrelevant groups for neighborhood analysis. The nonorthologousgroup populated with the most MIX sequences was selected forneighborhood analysis, which was limited to neighborhood andgene fusion scores with a medium-confidence cutoff for networkgeneration. For each of the five MIX groups, combined STRINGscores were ranked, and all links with high confidence scores areshown in Table 1. For links of lower confidence, only those scoresup to and including the top core T6SS component were reported.For the MIX II group, a low-confidence link to a T6SS componentwas also reported.

1. Park KS, et al. (2004) Functional characterization of two type III secretion systems ofVibrio parahaemolyticus. Infect Immun 72(11):6659–6665.

2. Eagon RG (1962) Pseudomonas natriegens, a marine bacterium with a generationtime of less than 10 minutes. Journal of Bacteriology 83:736–737.

3. Salomon D, et al. (2013) Effectors of animal and plant pathogens use a commondomain to bind host phosphoinositides. Nat Commun 4:2973.

4. Salomon D, Gonzalez H, Updegraff BL, Orth K (2013) Vibrio parahaemolyticus type VIsecretion system 1 is activated in marine conditions to target bacteria, and isdifferentially regulated from system 2. PLoS ONE 8(4):e61086.

5. Kessner D, Chambers M, Burke R, Agus D, Mallick P (2008) ProteoWizard: Open sourcesoftware for rapid proteomics tools development. Bioinformatics 24(21):2534–2536.

6. Trudgian DC, Mirzaei H (2012) Cloud CPFP: A shotgun proteomics data analysispipeline using cloud and high performance computing. J Proteome Res 11(12):6282–6290.

7. Trudgian DC, et al. (2010) CPFP: A central proteomics facilities pipeline. Bioinformatics26(8):1131–1132.

8. Craig R, Beavis RC (2004) TANDEM: Matching proteins with tandem mass spectra.Bioinformatics 20(9):1466–1467.

9. Geer LY, et al. (2004) Open mass spectrometry search algorithm. J Proteome Res 3(5):958–964.

10. Elias JE, Gygi SP (2007) Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods 4(3):207–214.

11. Trudgian DC, et al. (2011) Comparative evaluation of label-free SINQ normalizedspectral index quantitation in the central proteomics facilities pipeline. Proteomics11(14):2790–2797.

12. Pavelka N, et al. (2008) Statistical similarities between transcriptomics and quantitativeshotgun proteomics data. Mol Cell Proteomics 7(4):631–644.

13. Pavelka N, et al. (2004) A power law global error model for the identification ofdifferentially expressed genes in microarray data. BMC Bioinformatics 5:203.

14. Camacho C, et al. (2009) BLAST+: Architecture and applications. BMC Bioinformatics10:421.

15. Pruitt KD, Tatusova T, Brown GR, Maglott DR (2012) NCBI Reference Sequences(RefSeq): Current status, new features and genome annotation policy. Nucleic AcidsRes 40(Database issue):D130–D135.

16. Frickey T, Lupas A (2004) CLANS: A Java application for visualizing protein familiesbased on pairwise similarity. Bioinformatics 20(18):3702–3704.

17. Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7:Improvements in performance and usability. Mol Biol Evol 30(4):772–780.

18. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ (2009) Jalview Version 2—Amultiple sequence alignment editor and analysis workbench. Bioinformatics 25(9):1189–1191.

19. Pei J, Tang M, Grishin NV (2008) PROMALS3D web server for accurate multiple proteinsequence and structure alignments. Nucleic Acids Res 36(Web Server issue):W30–W34.

20. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: A sequence logogenerator. Genome Res 14(6):1188–1190.

21. Marchler-Bauer A, et al. (2011) CDD: A Conserved Domain Database for the functionalannotation of proteins. Nucleic Acids Res 39(Database issue):D225–D229.

22. Käll L, Krogh A, Sonnhammer EL (2004) A combined transmembrane topology andsignal peptide prediction method. J Mol Biol 338(5):1027–1036.

23. Hildebrand A, Remmert M, Biegert A, Söding J (2009) Fast and accurate automaticstructure prediction with HHpred. Proteins 77(Suppl 9):128–132.

24. Pirooznia M, Nagarajan V, Deng Y (2007) GeneVenn—A web application forcomparing gene lists using Venn diagrams. Bioinformation 1(10):420–422.

25. Uchiyama I, Mihara M, Nishide H, Chiba H (2013) MBGD update 2013: The microbialgenome database for exploring the diversity of microbial world. Nucleic Acids Res41(Database issue):D631–D635.

26. Franceschini A, et al. (2013) STRING v9.1: Protein-protein interaction networks, withincreased coverage and integration. Nucleic Acids Res 41(Database issue):D808–D815.

* *Prey: E. coli

hcp1

opaR

opaRhcp1

102103104105106107108109

PreyAttacker

Attacker:

Log[CFU/ml]

Fig. S1. V. parahaemolyticus ΔopaR strain kills E. coli in a T6SS1-dependent manner. Viability counts of E. coli prey 4 h after coculturing with V. para-haemolyticus POR1Δhcp1, POR1ΔopaR, or POR1Δhcp1ΔopaR attacker strains. Mixed cultures at an OD600 ratio of 1:4 (prey/attacker) were spotted on MLBplates at 30 °C for 4 h. Asterisks mark statistical significance between prey sample groups at t = 4 h (P < 0.05).



C

A*

B

Prey: vp1388-vp1389

Empty

pVP1389

104

105

106

107

108

t=0 ht=4 h

Prey:

Log[CFU/ml ]

Attacker: POR1

Prey: vpa1263-vti2

Empty

pVti2

104105106107108109

t=0 ht=4 h

Prey:

Log[CFU/ml]

Attacker: POR1

*

*

Prey: vp1415-vp1416

Empty

pVP1416

104

105

106

107

t=0 ht=4 h

Prey:

Log[CFU/ml]

Attacker: POR1

Fig. S2. vp1389, vti2, and vp1416 encode T6SS immunity proteins. Viability counts of prey strains before (0 h) and after (4 h) coculture with the attacker strainPOR1 mixed at an OD600 ratio of 1:4 (prey/attacker) on MLB plates containing 0.1% arabinose at 30 °C are shown. Prey strains used were as follows:POR1Δvp1388-vp1389 (A), POR1Δvpa1263-vti2 (B), and POR1Δvp1415-vp1416 (C); they contained an arabinose-inducible expression vector that was eitherempty or encoding the putative immunity protein. Asterisks mark statistical significance between sample groups at t = 4 h based on a one-tailed Student t test(P < 0.05).

MIX

Fig. S3. MIX-containing proteins are found in bacterial species that encode additional T6SS components. A Venn diagram comparing species with MIX-containingproteins (red) identified in the RefSeq database of complete genomes with those containing valine-glycine repeat protein G (VgrG) domains (green) and Hcpdomains (yellow) is shown.



Fig. S4. MIX regions are grouped into five clans. Pairwise BLAST scores of all identified MIX sequences were used to cluster similar sequences into groupsdepicted in a 2D representation. Nodes are colored and labeled according to visual grouping: MIX I (green), MIX II (magenta), MIX III (blue), MIX IV (cyan), andMIX V (red), with connections between nodes representing pairwise BLAST scores better than E = 0.0001. MIX sequences of known T6SS effectors are labeled.



, * *, * -012 3 4550 1% 3-2'6' % 3*, 4,* 3 3520%,2 , ,, * *& 1 3-3 4550 1% 6610 3- ', % % 4 * 3 32 ,6 , ,, * *& 2 2 212 3-3 4550 1% 3- ', % % 4 *, * *,302 4 3612 % 3 1% ,7 6215 3- '&222% ' 4,* 3,332 ,2 ,, * * 502 3 '1% 3 1% 7 7515 %-2' ',2,37 % 4,* .+/' ,332 ,5677, ,, * *,3-, % &12 3 40'0 1% , 3-2' 5 '33 % 4 * 2,332 ,2 , -, * *& ,3016 --3 4050 1% 5015 3- 33 5 6 2 4,* .+/' 3 327%2, ,, * *& 3 , %12 --3 1% 6 '216 3- ,' % % 2 4,* .+/' 3 232 ,2 22, )89, * *- 2,3212 3 4555 1% ,0 601'37-&' ', 5 % 4 * 7*76,3 2,*7,6''7, ,, * %*, , 2',%1 -,3 ,41% & 337' ' % -4 * ,0 -2 ,, * %*, ,'.+/0'0 2 ,%1 3,3 ,41% & 6'10 33%' ' % -4 * .9/ ,0 -2*070%-67 , )88, * %*, , 2 ,%1 -,3 ,41% , 5'10 337' ' % -4 * .9/ ,0 -2*070%-07 ,, * % , ,6.+/0,5 , 3,- ,41% -3 '515 337' ' % -4 * .9/ ,0,--2*070*-57 ,,&* *& * , %1* -,3,- ,41% ,7 *7-%' %, %% 4 * 7'72, -2 ,2 , ,,* * 6,, 7 , %1 -,3 ,41% %. /623 ,603 3% 4 * , -2 4 ,* *&26& * %1* 3,- 1% , 3 %,2'- % 4,* 3 4 , &

7 * *- , * , %1 3 ,41% , 7.8/563 ' % 4,* ,7 -2 ,2 , ,7 * *- , * , %1 3 ,41% , 7.8/263 ' % 4,* ,7 -2 ,2 , ,7 * *- , * , %1 3 3%7%,41% , %.8/663 ' % 4,* -2 ,2 , ,

* *, 4 ,* %1* 3,3 1% -%' ', % 4,* 0 -2 42 ,53,,7* *& 4 ,**%1- 3,3,- ,41% ,, &%' %, % ,4 * - -2 42 53,

* *& 4 ,*,%12 3,3,- ,41% -%' ', %% 4,* -2 42 ,53,* *& 4 ,*,%1 -,3 ,41% , ,%' ', %% ,4,* ,% -2 42 , ,

% ,* *,6, , * %1* 3 ,41% , -,' %,6'3 % 4 * -7 -2 422%% ,, *,6,, * %1 3 1% 7 2. /32,0' 5,373 % 4 * -2 42 ,53,, *,6,, 3 *2%15 3-3 1% & -0' ', %'66 4,* 32 42 ,53,0,& *, , * %1 -,3 37'0 1% 7 *6-7' ',273 %0*6 ,4,* -2 42 6, * *,64, %12 3,3,- 1% % -6' ', %% 4,* ,, -2 42 , -,,-* *, & 7 ,,%12 3,3 ,41% ,76 0&%' ', % 2 ,4,* 42 42 ,,& 2 *&73, * %1 -,3 1% , %. /* - 2' %.9/ %'36 , ,* % 32 4 ,6-, *, , ,*%1 % 3,- ,41% , 5, ', %.8/0% , * .8/%36 32 4*423 '4,6,, *, , ,*%1 % 3 ,41% , 5&0' ', % ,,-* .8/53, 32 46 74,6, %*, , - &,%1* 3,3 ,41% , %,2' ', % , * .8/0 % 32 42 %4,6, %*, , - &,%1* 3 ,41% , %,2' ', % , * .8/0 % 32 42 %4,77 * 3- ,%1' 4,3 ,41% ,7 3%' 5, %%23*% ,* 7 -2 42 '7,, * %3 ,%1' -,3 ,41% ,0 6.8/6%-5' 5, % *50,5 7 -2 42407 7,-, *,%& ,%1 3 ,41% , -0' 5, 0.)9/% 6 *5%,3 32 42407 -, *-,3 2* 612 3 ,41% ,7 ,-2' 5,347 % 0*622 , * 42 42 ,,-4*,& * '- ,%1' 4,3,- ,41% 0 ,511 33' 5, -.8/,% 6%,22, * 0 -2'6742 70,-6 * *0%3 ,%1 4,3 ,41% ,7 ,*21 -,0' 5,%%3 % * 32 424*% &% * ,*1* 4,3 ,41 3 , ', 76 %* 6,* 7 -2 42 ,* *& 13 333 ,41% ,2' ',2,3 7'-% 2 ,,* 7 -2 42 ,5 *& 3-3 ,41% ,*71 , ',%,3 05-%2-56 , * ,7 -2 424'3 50&* & , 333 ,41% , ', % ,0,* -2 42473 -2,

0 * *& 333-- ,41% &, ', 72-%*55 ,6,* 37 -2 424'3 ,F **6 *6, - -1 4,3,-%.9/,41% 3 %, 5, '1,%%*6 ,,,* , -2 ,F 5 ,* * %3 3.)(/3,,%,,%1 3 0 7 1 -%3 0- 33- '6 6-3* - 11110 2 32 4 '0&5''F 5 * * %3 3.)(/3,,%,,%1 4,3 0 *-1 -03 0- 33- '6 26-3* - 11110 2 32 42 '5&F 5 * * %3 3.)(/3,,%,,%1 3 0 7 1 -%3 7- 33- ' 26-3* - 11110 2 32 4 '0&52'

0 * * %3 3.)9/3, %,,%1 3 % 3-1 -03 7-603,- '' 3* - 1111 2 36 , '5,5250 * * %3 3 3 %,,%10 3 '.9/0,1 -03 - 30- 3 - 11110 2 36 , '0 72'' * * %3 3 3 %,,%1* 3 0'.+/6-1 -03 3%0- - * 33 - 11115 2 32 4

* * %3 3 3 %,,%1* 3 0 3,1 -23 -303 -0050. / 2 3' 4 11110 2 36 ,2 , '03* * * %3 3.)(/3 %,,%1 3 % ,,1 -03 - 3,-0050.9/6 7,* - 11110 2 36 ,2*5, '0&5 * * %3 3.)8/32 %&*%1* 3 0 5,1, -0- 3603 7 , 3*-, 11117 2 32 , ,50&'5'

* * %3 3 -2 %,,%16 3 7,1, -0 3'5 3 2 73*3,* 11115 ,-,-2 ,2'3 ,'0&5 * %3 3 -2 % ,%1' 3 ,1, -7 - ,, 7 %3*3,* 11117 2 -2 ,2 ,00 07'

% *6%- , 3' %%,%1' 3 413 3%3 -2,- *6* , 11112 3&&-6 ,*&5,,%00' %,3* %,6, 3',%%,%1' 3 413 3'3 3', 2 2, 11115 32 ,*32,,*'34,3* %, , 3 ,%%*%12 3 413 3 3 -67-% 4 ,*7* *, 1111 32 , -, ,05 * %, , 3 ,%%,%1' 3 7,13,373 -77,-3 * 07 , 11116 3 32'&,,2 ,,20,05 * %, , 3 ,%%,%1' 3 7,13,303 -%7,-3 * 07 , 11116 ' 32'&,,2 ,,26,&0'

, * %, , 3 %%,,%1 3 413 303 -673 - 7 &2,* , 1111 ' 32 , -,70,2 * * %32- & %&,%1 '.+/541 *- 35.9/ 0-603 4 3* -, 11115-2 -6 42 6- ,0 * *,% 3 - ,%0,%10 3 412 ---'. / 2-20&74 4 , 6 3 11115 4 -2 42 -

* *6%- ,%% %17 3 341& -%37.)+/2-%7324 4 -5'2 3* 3 32 4 --2 * * %- - 3 ,%% %17 3 341& -%3 -%&374 40 3* - 32 4 3-' * %-2, 3 ,%%,%17 3-3 041, -03 -67304 0 -*3*3 32 4 ,- ,' * * %-2, 3 ,%%,%17 3-3 041, -03 -673 4 0 -* 3 32 4 ,- ,0 * * %-2, 3 ,%%,%17 3-3 041, -03 -673 4 0 3* - 32 4 ,- ,' ,-* %-2& 3',%%,%17 3,3 %41, - 3 -673 4 00,* ,* 3 32 4 &- ,' * * %-2& 3',%%,%17 3,3 %41, - 3 -673 4 00,*%2-* 3 32 4 &- ,* *,3* %3 , 3 ,&7,%17 3 041, -%3 -,, 04 -56 - ' 32 42 ,* *,3* %3 , 2,&7,%17 3 041, -%3 -,, 04 -56 - ' 32 42 -5 * %, , 3 ,%,,%1 3 413 3 - - 3% 111111111111111111111111111111*, * *,%3 ,.) / 7, ,%17 3 041& -%3 - 374 &' - '067711114 3 32 42 ,'7,, -3* % 3 3%%, 3,3 5, 35 - & 07062 ,,', ',-32 ,2 57-, -,* % 3 ,%%,'16 3,3 37 3, ,,' 32 42 &0 ,* %32 .)+/3 %&,% 3,3,% 4 3 3,77-* 4 ,4 - ,0 ,* %3 .)+/3 %&,%1* 3,3 % 4 3 3, -' 4 ,53&- ,6 ,* %3 .)+/35 %&,%16 3,3 % 4 3 3,6%-' 4 3,- ,200 ,* %3 .)+/35 %,,% 3,3 4 3 3, -' 4 ,43,-2*'* -056 ,* %3 .)+/35,%&,% 3,3 4 3 -,50-' 4 43&- , ,0%%%- ,,*,%3 3 %*,% -,3,, 4 3% 2,,3 6, ,%*,-2*2*,6*%&,'53%,,%6-2-,*,%3 3 ,%*,% -,3 4 3% ,'- '* &,%, ,%-,32 ,6 -%7,02 5 *,%32 36 %-,% 3,3 %4 33 ,,2% 5* 2 3- 3&,-,*'*,2 , ,025,6,6 * % 6,%,,%16 3,3 ,. / 417737 0*15 ',6%- * , 244,-6*'*,6 37 ,,,,*,%3 .)(/36 %&,% 3,3 04 4 5217 3 *, 2 ,0 -,*'*,6 ,,06 5 *,032 36 %& '12 3,, 4 3 65010 - - 2 6 3,, ,6 --- * 23 36 0& '1% 3 -. / 4 3 6 3 -* 2 2 3 6 ,2 -755

0 -5 *,%36 3 & 516 3%, 4 30 6*15 ,23-* 2-522 3 6 &2 3, ,7 *,%36 366 & '12 3 4 3 6*10 3,22-*,' 6 3 36 &6 3'2--,-*6%3 36,03,516 3,3 %4 3% -, , , 3,,-2 ,6 ,0,36 * ,,* %36 3 ,% ,%17 -,3 241'642 , 2 6, 64 -2*2&42 32 * ,,* %3 3 ,% ,%17 -,3 24 42 6'1' ,407, 2 2 -2*2*42 ,037 * ,,* %3 3 ,% &%1% 3 3.+/'4 42 2, -,6, 4 -6 46 , 3* * %3 * 3 ,% &% -,3 3.+/ 4 42 6*1' , 2 , ,4 36 4 , 3

0 ,6 * %3 36,% ,%17 3,3 34 32 2,4,77 -&2 4 -273*4 ,'%3%6%6 , * %36 3 %6,%17 &,3,- 74 3 6, 6, 4 -2 ,6 ,0%36 , * %36 3 ,%6,%17 &,3,- 4 32 6, * 6 4 -2 46 ,'%36 , % 3 ,%6,%17 3 -4 3 6, 4 -2 46*'6,'%37 , *2%3 3 ,%7,% 3,, 4 32 6, 3, 4 -2 427'6 0%36 ,* * %3 3 ,%6,%1' 4,3,- 34 30 %, 3 -2 4 -'037 , %3 3 ,%7,% 3-, 4 3 6, , 4 -2 42 7%36 ,* * %3 3 ,%6,% 34 3% 6*1' %,407', ,7&' 6 4 -2 4 -073' , %3 3 ,%6,% 3,- 04 3 %, 4 -2 42 -0 ,* *,%3 366%,,%1* 3,3 '.+/ 4 3 3,-7-'* 6 4 -2 46 ,7 ,* *,%3 .)8/36,%,,% 3,3 4 3 3, -'* 6 4 -2 46* ,* *,%3 .)(/366%,,%1* 3,3 '.+/04 3 23,3'- * 0 4 -2'6*46- -* %3 .))/36,%,,%12 3 4 47 ', -' ,4 -2 46,* * %3 36,%,,% -,3 5.+/%4 33 6'1 -,2%3 6 ,3 -2''%46

7 ,* *,%3 .)8/36,%,,%1* 3,3 5.+/24 37 15 3,3*3 & 3 -2 460 ,* * %3 36,%,&%1* 3,3 5.+/24 3 3,3*- *3, % & 4 -2'6*46,* *,%3 .))/36,%, %17 3,3 4 37 621'23, -'* 2 % 6 4 -6 4 5%%2

2 , * % .))/36 %6,% -,3 4 34 671% 3,&0,* 6 4 32 46 52 , * %* .))/36 % ,%12 3,3,3 4 34 6'13 3-&0 * 6 4 -2 46% ,* %' 360%,,%12 '.+/%4 3- ,,30,2, 6 4 -2'6,46 5,* %3 7%,,%12 3,3 '.+/%4 3- 6*1' 33&*-', 6, 4 -2 46,* *,%3 3 %,,%1* 3,3 , 33 6,35-', 2,,6 4 32 46

0 ,* * %3 .))/36,%6,%13 3 3.+/%4 3 3,3'3* 7, 7 6 3 -2 46 37 ,, * %3 .)8/3-,%2 %10 3.+/04 3 3 -53*, 7 2 3 -2 46 30 ,* * %* 36,%,&%12 3,5 4 3- -33*, 2 6, -4-242 467 ,* %-2 36,%5,%12 3,3 '.+/2 10632, 1 , 6, 5 6 - 32 460 ,* * %3 .)8/36 %6,%1* 3,3,- 4 3 12 ,22 *, 2 6 4 -2 46 57 ,* %- - 3 6* %06 3 '.+/2-133- 0'71- 6- & 6 4 32 46 ,* , * %3 .)(/36,% &%1* -,3 4 3, ,51 ,40 * 362&3-6 4 32 46 505772,, * %3 .)(/36,% ,%1* 3,363'.+/0, 3& ,'1' &45-** % 2 ,4 42 46 5'72,, * %3 .)(/36,% ,% 3,363'.+/5, 3& ,'15 ,4%-* % ' ,4 42 4602,,,3* %3 36,% &% 3,3 , 3& ,*12 &&42- * 3 ' 4 42 465 ,, * %3 32,% ,%1* 3,363'.+/%, 3 &*16 , 3*, 3 ' -4 42 46,* * %3 3 ,%,&% 3,3,3'.+/%4 4& 621'2%, -** ,,,6 3 32 462%&,''&,* * %3 3 ,%,,% 3,3,-'.+/%4 4& 621' %, -** 2, 6 3 32 42 &,* * %3 3 ,%7,%17 3,3-3 4 3 621' 0337-* 2 6, ,3 -2 4,* * %3 .8)/3 ,%%,% 3,3,3 4 37 6'1 -47-** 2 , 4 3 42,* %, , 3 ,%6,% 3 4 -% 1 3,6* *, &,- *7*076110% 4 -262*42 3,* % / %6,%1* 4 -0 1 3,6* *, &,- 4 32 42 3,* %* ,.)(/3 &%6,% 4 -7 2'1 3, 3*, &,, 4 32 42 3-* %*6 .))/* %,, 3.+/04 4%2351' 6,00, 6, 4 6 4 ,%'366',7 6*2 .))/* % % 0,3-.+/%4 47 15 ,6%0 2 4 36*%,42 ,773,* *,% 3 ,%6,%1& 3,3 4 37 1 3,%7-5* 2 5 4 -22'*42 , 3

*,% 3 ,%6,% 3 4 37 6*17 -,%&-5 22 5 -25'*4 , 33* *,%3 .)(/36,% ,%10 41763 6'1' 3,%,, 2 5 4 -2 4 -,* *,%% , & ,%6,%1& 3,3 4 3 17 ,%%-5* 22-,5 4 -22'*4 , 3,* *,% , , ,%6,% 3,3 4 37 2*1' 3,%7-5* 2-,5 4 - 4 , 3,* % 3 7,%,,%17 3 4 3 1 6,%%-* 2 5 '6,,3111%0 4 - 4 ,'%3,* *,% 3 7*%6,%1% -,3 4 37 6*1 &,%7-** 5 '6'0-11126 4 4 , 3,* *,% 3 7*% ,%1% -,3 ,' 34 37 6*1 ,%7-* 2 5 4 4 , 37,* *,% 3 7*% ,%17 3,3 ,'%% 4 3 2*1 ,27-%*37& 2 5 4 4 ,'23,,2 %*6 .)+/3 ,%,,% 3 24 3% 17 &,* - * ,%6 4 -2&5442 0'

22,2,,,,%*2*.)8/- ,%,,% ,. 54 3% 17 ', - ,0 2 ,4 32 4 50*, * * 3- % * 3,3 07'064 3,,'010 33%0 * & 67 6 4 32 4 3 ''* %4 3.9/0 2%,,% 3 '5 4 3 671 ,02-6 52 % ,4 32 46 755

,* *&% 3 3&,% 3 '.8/ 4 - 16 7& * ''' 2 5 4 32 46 ,2%',0 %3 - 26,%10 -,3 417 30 10 -,0% 3 , 3-032 42 ,%50,7 %3 3 ,66,% 3,3 &.+/04 3 2717 -,0% '55'3 , 6 3&032 420'1100 50* * %3 .)(/36,% ,%1* 3 1 3 11 ,3 ,%3 , 6 4 4 32 46 6

*% * * %3 .)(/36,% ,%1* 3 1 3' 11 3 ,63 , 22%% 4 4 -2 46 &2% * * %3 .)(/36,% ,%1* 3 1 ' 7111 3 ,53 , &2 4 4 32 46 2'

* * %3 .)+/36,% ,% 363 1 5111 ,3 ,, 75, 2 5 4 32 46707 2' ,22%,3,,% *.)+/3 ,5*,%10 3 31% 3 2,65, 42' %*3,*, ,4 -2 42703 ,,5'

'3 %,3,2,3 .) /3 ,%10 3 -10 3 , 42'& 65 ,%073'11100 4 -2 42073 ,%,3--%3 ,. 8/ ,51* 3 5.8/531- ,7 7 54732 '-0 3 % 32 42 ,565' %3 , 3 ,51* 3,3 '.8/,413 3 %3 ,4 5 ,' 6- ,4 32 42 372 %2'6 *-%3 , 3 6*, 4,3&35.+/2 10 37 ,-%, , 6 460 3& -2 4 '0&,05

06 ,3* %4 3 %0,61* 3&-2.+/6,1 3% , 7* * -,0- 46 7 %&'*, * %3 -.)(/36,%7,%1' 4,3 '.+/6,1 3 0,60 '4 % ,3* ,2 3 46 &%252' * %3 ,.)(/36,%7,%1' 4,3 '.+/6-1 37 -60 '4222'.9/0% 3* 3 - 464%& &%

' * %3 .)(/36,%7,%1' 4,3 2.+/6-1 37 27- &'4 % 23205'111116,6 -2 464%, &%2500 * %3 ,.)(/3 ,%7,%1' 4,3 '.+/6-1' 3 26- 1 ,,3* ,2 32,7%46 0 & 02

' * %3 36,%7,%1' 4,3 - 3 2 - 3', % 575*-,63 42 24%,-5- ,& ' * %3 36,%7,%1' 4,3 2.+/6-1 3 67,0035, % -,63 ,3 4207%464%,-53 ,652 * %3 -6,%7,%1' 4,3 0.9/7-1 37 5, 3'4 % 0626 6350%'111116,4 -205%42 &%250 * %3 36,%7,%1' 4,3 ,1 637 - 63'2''111116,4,232 46*%, -20 * %3 , 35,%7,%1' 4,3 2.+/0,1 630 - 63'2''111116,4,2- 46 6,-2*5 * %3 , 36,%7,31' 3 '.+/6,1 3 ,2% 4 % ' -,,3* , -2*&7464%& &%*' *&%3 , 36,%7,310 3 '.+/6,1 3232626 7,6% 04 % ,,3*%05111116, 32*5746 %*' *&%3 , ,6,%7,%1' 3 ,1 3 - 2,6% 4 % ,,3*%52111110,% -2*47464%& &%

* %3 ,.)(/36,%7,%1' 4,3 '.+/6-1 37 - 4 %55* 3,,3*%%5111117,% 32 464%& &%' ' * %3 36,%7,%1' 4,3 1' 3 6- ,63* 32 46 ,,* * %3 36,%7,%1' 4,3 '.+/2,1' 3 67-6,30, % -,63 ,2 32 4645&-&% ,556 5 * %3 .) /36,%7,%1' 4,3 ,1' 3 5-673,& % 632 2 -2 42 3 %

* %3 .))/36,%7,%1' 4,3 5.+/6 1' 37 67- -0, % -,23* ,2,,32 42 -7' ,'0 7 * %3 , %,,%1' 4,3 &. /,,1 3%6 7 66,,4 % 5, - ,3 32 46 27 05,0 7,,*,%3 , %,&%1' 4,3 ,1 3% 2011 67,,45502.8/0% 2, - ,3 32 46 %7 ,72

7 *,%3 , , %, %1' 4,3 &1 3% 7511 67,,4 % -*%22111110,3 32 46,7,,*,%3 , ,%% %1' 3,3 ,1 37,6511 5 ,,477'0.8/2%626 , -* ,3 32 46 27 ,7,,*,%3 , , %,,%1' 4,3 ,1 37 ,511 , ,, % -2%%2111116, - 46 67 ,

26 7,,*,%3 , , %,,%1' 4,3 ,12 34 3511 &,- % 6, -*062111116,3 3 42 %7 5'55 7,,*,%3 % , ,%,,%1' 3,3 ,1 3% *20,,4 % -* , 32 46 ,%6 5,,* %3 % , 3%,,%13 3,3 ,1 3, , , 3* ,3 - ,6 ,'250 7,,* %3 , %, 23,3 13 3 , , 2 7 32 2 ,,%2

6,,*,%3 , %, % 3,3 ,13 -%, ,6%, % ,6- ,7 3 42 5%&,,, * %3 .)9/3 %*,%15 3 ,.+/%,1 3% 2 , 6 * 3 * 42 42 - %',,2 *,%3 ,.) /36 %,,%1' 4,3 7.+/7,1 3% -676 0 %2,*3,,,* & 42*%742'27-52 0,,2 * %3 .) /36 %,,%1' 4,3 7.+/%,1, 37 7, 6 0 3,, * , 42 42 - 605',, * %3 3 *%&,%1' 4,3 ,15 3% 2 ,0*6 3,&, ,242 42 3 077

- * %3 , & %7,%12 -,3 7 ,17%- & *7 , 64,,36 , ,2- * %3 & & ,%%,% 3,3,4 ,1 %3 , * ,502611111' 4&,32 ,2 ,002

2 * %3 .) /36,%7,%1' 4,3 1 37 ,711 0,-%-, % ,,376'%11111*3- 3 46 3* * % %, ,*6111 333 - 17 30 4 -3 672 -60'72 340, 111 73 3 6 ,'5,2* *7 7 11 %, - 6111 3 - 170374,'1' 11111116,2 2 340, 0' 111 7 3 6 ,'5,5''* *6 7 %, - 61116 3,3 - 170374 2 340 7, 1116'72 3 6 , 5* * 7 %, 6 1116 3,3 -7'6% 17 374 2 356372 -40, 111 75 3 6 '5,256 ))(*,,*2, 0, 42,1 -333- ,,13 ,' 4 4 672 0 32% ,6 ,*,,*2, 0, 4 ,111 -333- ,,13 ,' 04 4% 672 340 1116'7 ,,3%2 ,6 , ,22'* *6 7 %, -*6111 -33 - 17 3%, 4'1-3 6,2 340 1110'73 3 ,2 57 2* * 7 %, ,*6111 3-3 - 17 30, 421-3 6%2 340 1116'7 3%26*-6 ,'7,2* *2 7 %, ,*6111 333 - 17 30, 04 -3 6%2 356'72 -40, 111 73 3,7%7 6 ,'5,6

2- * 6*6 7 %, -*2111 3-3 - 17 30& - -3 6%6 2 340 111'573 3 77 6 ,'5,'* *2 7 %, ,*6111 333 - 17 30, 4 -3 6%2 2 4% '5 ,111'57 ,,3,*%0,6 - '* * 7 11'%, -*6111 3-3 - 17 30 %4 -7 6%2 ,5037 340 '2 1116'7 3,'2* 6 , 5* *6 7 112%, -*6111 -33 - 17 30 4 -3 6%2 65037 340 '2 1110'7 3,' 6 ,'5,5

E , * 7 %, , 61117 333 - 17 37 5 %1- ,632 22 * 0 *111 7 32% ,6 , 2* * ,, 4 1117 -333- ,&13 & -2147& 2 * 00 111 7 3%7 ,6 , 5

- * 0, 7 , 4 6111 3-3 - 1& 37 0421,7 6-2 *30 576, 111 73 370 ,6 ,27 '-- * 0*2 7 11'%, 4*0111 --3 - 15 37, 221-%&632 - *40 1112%7 ,,360 &6 , 5- * 0*2 7 %, 4*6111 3-3 - ,61 30 4 -5 2 26 *30 '5 111 7 ,,3 ,27',,,'' 2

Fig. S5. Alignment of MIX sequences. Multiple sequence alignment of MIX representatives (partial sequences and redundant sequences removed at 90%identity cutoff) is shown. Sequences are labeled to the left by National Center for Biotechnology Information GI number, species, and gene name, with residuenumbers corresponding to the first and last alignable residues indicated to the left and right of the sequence, respectively. Domain neighborhoods sur-rounding the MIX-encoding gene (represented by X, colored according to group as in Fig. S4) are indicated on the left, with T6SS components highlighted inblue (VgrG), lavender (Hcp), and gray (proline-alanine-alanine-arginine), and linked Duf4123 highlighted in yellow. uk, genes with unknown domains. TheTMH column represents MIX-encoding genes that have predicted TMHs. Residue positions are highlighted according to group-wise conservation: mainlyhydrophobic (yellow), mainly small (gray), mainly basic (blue), mainly acidic (red), mainly S/T (orange), and MIX motif Y (black). Unalignable regions betweenconserved core secondary structures (indicated above alignment) are omitted, with the number of deleted residues shown in brackets.



Hcp

COG3515ImpA

IglA

COG3913ImpMVgrG

COG4455ImpE

DotU

Fha1

COG3521

IcmFCOG3520Duf1305

COG3519Duf879

IglB

COG3522ImpJ

COG3518ImpF

COG5351

COG0542

Neighborhood > 0.7Gene Fusion > 0.7Neighborhood 0.553Neighborhood 0.546

Fig. S6. T6SS functional network. A network of connections representing genomic neighborhood and gene fusion scores was derived by the STRING database.Nodes that represent conserved T6SS component COGs are labeled (black) and connected with solid lines for confident (>0.7) neighborhood (green) and genefusion (red) scores, or with dashed lines for neighborhood scores (green) with medium confidence. One additional COG that was not defined as a conservedT6SS component yet retains a large number of confident connections is labeled in gray.

Other Supporting Information Files

Dataset S1 (XLSX)Dataset S2 (XLSX)


http://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1406110111/-/DCSupplemental/pnas.1406110111.sd01.xlsx

http://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1406110111/-/DCSupplemental/pnas.1406110111.sd02.xlsx


Supporting Information - PNAS...2014/06/06 · 5. Kessner D, Chambers M, Burke R, Agus D, Mallick P...

Documents

Transcript of Supporting Information - PNAS...2014/06/06 · 5. Kessner D, Chambers M, Burke R, Agus D, Mallick P...