SASBDB Small Angle Scattering Biological Data Bank

SASBDB: Small Angle Scattering Biological Data Bank. Erica Valentini, Dmitri Svergun group. Solution Scattering from Biological Macromolecules, EMBO course 2014.

Page 1: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Small Angle Scattering Biological Data Bank

Erica Valentini, Dmitri Svergun group

Solution Scattering from biological macromolecules EMBO course 2014

Page 2: SASBDB Small Angle Scattering Biological Data Bank

Index

1. Introduction:

– What is SAS?

– Do we need a SAS database?

2. SASBDB:

– Features

– Usage

– Quality check

– Missing

3. Conclusions

SAS EMBO Course 2014, 11/2/2014


Page 4: SASBDB Small Angle Scattering Biological Data Bank

What is SAS? SAS Experiment

|s| = 4π sin θ / λ

s: scattering vector

2θ: scattering angle

λ: wavelength

I(s): intensity

[Figure: X-ray/neutron beam scattered through the angle 2θ; plot of scattering intensity, log I(s); ATSAS; low-resolution model]
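The definition of the scattering vector above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and only serve the example; the formula itself is the one on the slide.

```python
import math


def scattering_vector(two_theta_deg: float, wavelength: float) -> float:
    """Return |s| = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle.

    The result carries the inverse unit of `wavelength` (nm gives nm^-1).
    """
    theta = math.radians(two_theta_deg) / 2.0  # theta is half the scattering angle
    return 4.0 * math.pi * math.sin(theta) / wavelength


# Hypothetical example: scattering angle of 1 degree, wavelength of 0.15 nm.
s = scattering_vector(two_theta_deg=1.0, wavelength=0.15)
print(f"|s| = {s:.4f} nm^-1")
```

Note that some communities define the momentum transfer with the same formula under the symbol q; the slide uses s.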


Page 5: SASBDB Small Angle Scattering Biological Data Bank

What is SAS? ATSAS Package

Rg

MM

Dmax

Volume

Shape

Rigid body modelling

Missing fragments

Oligomeric mixtures

Flexible systems

Page 6: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? SA(X)S advantages

Increasing popularity of SAXS

Solution

Broad size range

New developments in software and hardware

From a few kDa to GDa

Fast experiments: micro- or milliseconds. Small amount of sample: 5–30 μl.

Monitor alteration in environmental conditions.

Page 7: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? SAS database motivations

• Increasing number of publications about SAS and the ATSAS package.

• Increasing amount of data collected with a single experiment.

• Importance of making the data underlying scientific publications available for the community.

Graewert, M.A. and Svergun, D.I. (2013) Impact and progress in small and wide angle X-ray scattering (SAXS and WAXS). Curr. Opin. Struct. Biol., 23, 748–54.

Franke, D., Kikhney, A.G. and Svergun, D.I. (2012) Automated acquisition and analysis of small angle X-ray scattering data. Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip., 689, 52–59.

Collins, F.S. and Tabak, L.A. (2014) Policy: NIH plans to enhance reproducibility. Nature, 505, 612–3.

[Figure: number of publications referring to biological SAS per year, 2002–2012; vertical axis 0–400; series: ATSAS, bioSAS]

Page 8: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? wwPDB SAS task force

Trewhella, J., Hendrickson, W.A., Kleywegt, G.J., Sali, A., Sato, M., Schwede, T., Svergun, D.I., Tainer, J.A., Westbrook, J. and Berman, H.M. (2013) Report of the wwPDB Small-Angle Scattering Task Force: Data Requirements for Biomolecular Modeling and the PDB. Structure, 21, 875–881.

“…a global repository is needed that holds standard format X-ray and neutron SAS data that is searchable and freely accessible for download”

Database and small angle scattering experts

SASBDB

Page 9: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? Existing DB including SAS data

Database | SAS data included | Missing

PDB | 47 models where SAS was used for refinement | Primary data used to calculate the models

DARA | Scattering curves from 20,000 PDB structures | Models and possibility to deposit SAS data

BIOISIS | SAXS data and models | Complete search, cross-references to other databases, quality check on data

pE-DB | Scattering curves and ensemble models from disordered proteins | SAS data and models from “not disordered proteins”

Page 10: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? Existing DB including SAS data

Berman, H., Henrick, K., Nakamura, H. and Markley, J.L. (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res., 35, D301–3.


Page 11: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? Existing DB including SAS data

dara.embl-hamburg.de

Sokolova, A.V., Volkov, V.V. and Svergun, D.I. (2003) Prototype of a database for rapid protein classification based on solution scattering data. J. Appl. Crystallogr., 36, 865–868.


Page 12: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? Existing DB including SAS data

Hura, G.L., Menon, A.L., Hammel, M., Rambo, R.P., Poole, F.L., Tsutakawa, S.E., Jenney, F.E., Classen, S., Frankel, K.A., Hopkins, R.C., et al. (2009) Robust, high-throughput solution structural analyses by small angle X-ray scattering (SAXS). Nat. Methods, 6, 606–12.


Page 13: SASBDB Small Angle Scattering Biological Data Bank

Do we need a SAS DB? Existing DB including SAS data

Varadi, M., Kosol, S., Lebrun, P., Valentini, E., Blackledge, M., Dunker, A.K., Felli, I.C., Forman-Kay, J.D., Kriwacki, R.W., Pierattelli, R., et al. (2014) pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins. Nucleic Acids Res., 42, D326–35.



Page 15: SASBDB Small Angle Scattering Biological Data Bank

SASBDB features:

1. Entries

2. Cross links

3. Searching

4. Browsing

5. Benchmark

6. Plots

7. Interactivity

8. Availability


Page 16: SASBDB Small Angle Scattering Biological Data Bank

1. Entries


www.sasbdb.org


Page 18: SASBDB Small Angle Scattering Biological Data Bank

2. Cross links


Page 19: SASBDB Small Angle Scattering Biological Data Bank

3. Searching

1. Simple search:

Page 20: SASBDB Small Angle Scattering Biological Data Bank

3. Searching

1. Simple search:

2. Advanced search:

Page 21: SASBDB Small Angle Scattering Biological Data Bank

3. Searching

Browsing unit

Page 22: SASBDB Small Angle Scattering Biological Data Bank

4. Browsing


Scattering curve

Model

Kratky plot

Experiment information

Publication

Structural parameters

Unique code format: SASXXXN
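As a rough illustration of the unique code format, the sketch below checks whether a string looks like a SASBDB code. It rests on an assumption that the slide does not state: each "X" is taken to be an uppercase letter or a digit and "N" a single digit. The names `SASBDB_CODE` and `is_sasbdb_code` are hypothetical.

```python
import re

# Assumption (not stated on the slide): X = uppercase letter or digit, N = digit.
SASBDB_CODE = re.compile(r"SAS[A-Z0-9]{3}[0-9]")


def is_sasbdb_code(text: str) -> bool:
    """Return True if `text` matches the assumed SASXXXN pattern (case-insensitive)."""
    return SASBDB_CODE.fullmatch(text.strip().upper()) is not None


print(is_sasbdb_code("SASDA52"))
print(is_sasbdb_code("1ABC"))
```

A check like this only validates the shape of a code; whether the entry exists has to be confirmed against the database itself.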


Page 23: SASBDB Small Angle Scattering Biological Data Bank

4. Browsing


Chronological order

Browse according to the selected field


Page 24: SASBDB Small Angle Scattering Biological Data Bank

5. Benchmark


Page 25: SASBDB Small Angle Scattering Biological Data Bank

5. Benchmark


• 17 Entries from a set of 14 “standard proteins”

• SAXS and WAXS data

• Extra purification steps

• Benchmark for algorithm testing purposes

• Dissemination

Page 26: SASBDB Small Angle Scattering Biological Data Bank

6. Plots


Scattering plot

Guinier region

Kratky plot

P(r) distribution
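Two of the plots listed above can be sketched in a few lines of Python. The Guinier approximation, ln I(s) = ln I(0) − (Rg² / 3) s², is fitted by ordinary least squares, and the Kratky transform is s² I(s). This is a minimal sketch on synthetic, noise-free data, not the procedure SASBDB or ATSAS uses; the function names and the values of Rg and I(0) are hypothetical.

```python
import math


def guinier_fit(s, intensity):
    """Least-squares fit of ln I(s) = ln I(0) - (Rg^2 / 3) * s^2.

    Returns (Rg, I0). Pass only low-angle points, conventionally those
    with s * Rg below roughly 1.3.
    """
    x = [si * si for si in s]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope: Rg is undefined")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def kratky(s, intensity):
    """Return the Kratky transform s^2 * I(s) for each data point."""
    return [si * si * ii for si, ii in zip(s, intensity)]


# Synthetic Guinier-law data with hypothetical Rg = 3.0 nm and I(0) = 100.
rg_true, i0_true = 3.0, 100.0
s = [0.05 + 0.01 * k for k in range(30)]  # nm^-1, so s * Rg stays near or below 1
intensity = [i0_true * math.exp(-((rg_true * si) ** 2) / 3.0) for si in s]

rg, i0 = guinier_fit(s, intensity)
print(f"Rg = {rg:.3f} nm, I(0) = {i0:.1f}")
```

On real data the fit range matters: points at very low s may be affected by aggregation or the beamstop, and points beyond the Guinier range bias Rg.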


Page 27: SASBDB Small Angle Scattering Biological Data Bank

6. Plots

Radius of gyration

Maximum distance

MWs & Porod volume

Page 28: SASBDB Small Angle Scattering Biological Data Bank

7. Interactivity

Fitting 1 Model 1

Fitting 2 Model 2

Page 29: SASBDB Small Angle Scattering Biological Data Bank

7. Interactivity

Fitting 3: Model 1, Model 2, Model 3, Model 4

Page 30: SASBDB Small Angle Scattering Biological Data Bank

7. Interactivity

Experimental details

Molecule details

Page 31: SASBDB Small Angle Scattering Biological Data Bank

8. Availability



Page 33: SASBDB Small Angle Scattering Biological Data Bank

8. Availability

• Possibility to log in using ATSAS account

• Submission form

• Users can choose between:

– “on hold”

– “public”


Page 35: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Usage

More than 500 users since August 2014.

We are also currently monitoring search terms and the number of downloads.

Page 36: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Usage: use cases

SAS user, SAS novice, article referee



Page 48: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Difference between Rg (Guinier) and Rg (p(r))

[Figure: panels A and B]
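To make this check concrete, the sketch below computes Rg from a p(r) distribution with the standard relation Rg² = ∫ r² p(r) dr / (2 ∫ p(r) dr) and compares it with a Guinier value. It is only an illustration under stated assumptions: the relative-difference measure and the function names are hypothetical, not taken from SASBDB, and the input is the analytical p(r) of a solid sphere, for which Rg = sqrt(3/5) R.

```python
import math


def rg_from_pr(r, p):
    """Rg from p(r): Rg^2 = int r^2 p(r) dr / (2 * int p(r) dr), trapezoid rule."""

    def trapezoid(y):
        return sum(
            0.5 * (y[i] + y[i + 1]) * (r[i + 1] - r[i]) for i in range(len(r) - 1)
        )

    second_moment = trapezoid([ri * ri * pv for ri, pv in zip(r, p)])
    return math.sqrt(second_moment / (2.0 * trapezoid(p)))


def relative_difference(rg_guinier, rg_pr):
    """Hypothetical measure: absolute difference divided by the mean of both values."""
    return abs(rg_guinier - rg_pr) / (0.5 * (rg_guinier + rg_pr))


# Analytical p(r) of a solid sphere with radius 5 nm (Dmax = 10 nm).
radius = 5.0
d_max = 2.0 * radius
n = 2000
r = [d_max * k / n for k in range(n + 1)]
p = [ri**2 * (1.0 - 1.5 * ri / d_max + 0.5 * (ri / d_max) ** 3) for ri in r]

rg_pr = rg_from_pr(r, p)
rg_guinier = math.sqrt(0.6) * radius  # exact Rg of the sphere, stands in for a Guinier fit
print(f"Rg(p(r)) = {rg_pr:.4f} nm, relative difference = "
      f"{relative_difference(rg_guinier, rg_pr):.2e}")
```

A large discrepancy between the two Rg values usually points to a problem at low angles or an ill-chosen Dmax.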


Page 50: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Difference between MW (expected) and MW (experimental)

[Figure: panels A and B]

Page 51: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Quality of the p(r) distribution

[Figure: panels A and B]

Page 52: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Quality of the Guinier region

[Figure: panels A and B]

Page 53: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Quality of the fit

[Figure: panels A and B]

Page 54: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check: Quality of the data

[Figure: panels A and B]


Page 56: SASBDB Small Angle Scattering Biological Data Bank

SASBDB Quality check:

• Difference between structural parameters

• Quality of the Guinier region

• Quality of the p(r) distribution

• Discrepancy between expected and experimental MW

• Overall quality of the data

• Goodness of fit of the model

Quality score based on the comparison between the selected entry and all the other entries.
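The slide does not say how the comparison with the other entries is turned into a score. One possible reading, shown here purely as an assumption, is a rank-based score: for each metric where lower is better, take the share of other entries that are no better than the selected one, then average over metrics. All names and numbers below are hypothetical.

```python
def fraction_not_better(value, others):
    """Share of the other entries whose metric is >= value (lower metric = better)."""
    if not others:
        raise ValueError("need at least one other entry to compare against")
    return sum(1 for other in others if other >= value) / len(others)


def quality_score(entry_metrics, database_metrics):
    """Average the per-metric shares into one score between 0 and 1."""
    shares = [
        fraction_not_better(value, database_metrics[name])
        for name, value in entry_metrics.items()
    ]
    return sum(shares) / len(shares)


# Hypothetical metrics for the other entries and for the selected entry.
database = {
    "rg_difference": [0.01, 0.03, 0.05, 0.20],
    "mw_discrepancy": [0.02, 0.10, 0.15, 0.40],
}
entry = {"rg_difference": 0.04, "mw_discrepancy": 0.12}

score = quality_score(entry, database)
print(f"quality score: {score:.2f}")
```

A rank-based score has the advantage of needing no absolute thresholds, at the cost of depending on what has already been deposited.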


Page 58: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing

Network of SAS databases

Validation/Quality check:

– Pipeline to compare values

– Assessment of the angular range

– Difference between curves

– Validation of models

Standard format: sasCIF

Submission interface: automatic

Page 59: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Berman, H., Henrick, K., Nakamura, H. and Markley, J.L. (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res., 35, D301–3.

Page 60: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Read, R.J., Adams, P.D., Arendall, W.B., III, Brunger, A.T., Emsley, P., Joosten, R.P., Kleywegt, G.J., Krissinel, E.B., Lutteke, T., Otwinowski, Z., Perrakis, A., Richardson, J.S., Sheffler, W.H., Smith, J.L., Tickle, I.J., Vriend, G. and Zwart, P.H. (2011) A new generation of crystallographic validation tools for the Protein Data Bank. Structure, 19, 1395–1412.

Page 61: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Franke, D., Kikhney, A.G. and Svergun, D.I. (2012) Automated acquisition and analysis of small angle X-ray scattering data. Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip., 689, 52–59.


Page 62: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Konarev, P. and Svergun, D.I. (2014) Submitted.


Page 63: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Franke, D., Jeffries, C.M. and Svergun, D.I. (2014) Submitted.


Page 64: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Tuukkanen, A. and Svergun, D.I. (2015) In preparation.


Page 65: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Malfois, M. and Svergun, D.I. (2000) sasCIF: an extension of core Crystallographic Information File for SAS. J. Appl. Crystallogr., 33, 812–816.


Page 66: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: missing


Yang, H., Guranovic, V., Dutta, S., Feng, Z., Berman, H.M. and Westbrook, J.D. (2004) Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank. Acta Cryst., D60, 1833–1839.


Page 68: SASBDB Small Angle Scattering Biological Data Bank

SASBDB: Conclusions

• With 100 entries and 163 models, SASBDB is currently the largest repository of SAS data available.

• Entirely browsable according to different criteria.

• Highly flexible search.

• Embedded JavaScript to display interactive 3D models.

• Set of SAXS and WAXS data from “standard proteins”.

• Cross links to other biological databases.

• Aimed at different types of users.

• Several validation methods under development.

• Development of the standard format: sasCIF.

• Network of interconnected SAS databases.

• Paper about SASBDB in N.A.R. 2015 Database issue.

Page 69: SASBDB Small Angle Scattering Biological Data Bank

Thanks for your attention!
