Protein structure.pdf

Protein structure

Protein structure is the biomolecular structure of a protein molecule. Proteins are polymers, specifically polypeptides, formed from sequences of L-α-amino acids. Each unit of a protein is called an amino acid residue because it is what remains of an amino acid after it loses a water molecule in forming a peptide bond. By convention, a chain under 40 residues is often identified as a peptide rather than a protein. To perform their biological function, proteins fold into one or more specific spatial conformations, driven by a number of non-covalent interactions such as hydrogen bonding, ionic interactions, van der Waals forces, and hydrophobic packing. To understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography, NMR spectroscopy, and dual polarisation interferometry to determine the structure of proteins.

Protein structures range in size from tens to several thousand residues.[1] By physical size, proteins are classified as nanoparticles, between 1 and 100 nm. Very large aggregates can be formed from protein subunits. For example, many thousands of actin molecules assemble into a microfilament.

A protein may undergo reversible structural changes in performing its biological function. The alternative structures of the same protein are referred to as different conformations, and transitions between them are called conformational changes.
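The loss of one water molecule per peptide bond, which is why the units of a protein are called residues, can be illustrated with a small mass calculation. The sketch below is only an illustration: the function name `peptide_mass` and the three-entry mass table are assumptions introduced here, and the masses are rounded average values for the free amino acids.

```python
# Rounded average molecular masses in g/mol (illustrative subset, not a full table).
WATER_MASS = 18.015
FREE_AMINO_ACID_MASS = {
    "G": 75.067,   # glycine
    "A": 89.094,   # alanine
    "S": 105.093,  # serine
}


def peptide_mass(sequence):
    """Mass of a linear peptide: free amino acid masses minus one water per peptide bond."""
    if not sequence:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    # A chain of n residues contains n - 1 peptide bonds, each formed with loss of one water.
    return total - (len(sequence) - 1) * WATER_MASS


if __name__ == "__main__":
    print(round(peptide_mass("GA"), 3))
```

A single free amino acid keeps its full mass, while every additional residue contributes its mass minus one water.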

1 Levels of protein structure

Protein structure, from primary to quaternary structure.

There are four distinct levels of protein structure.

1.1 Amino acid residues

Main article: Amino acid
Main article: Proteinogenic amino acid

Each α-amino acid consists of a backbone part that is present in all the amino acid types, and a side chain that is unique to each type of residue. An exception to this rule is proline, whose side chain is also bonded to the backbone nitrogen. Because the α-carbon (Cα) atom is bound to four different groups it is chiral; however, only one of the isomers, the L-form, occurs in biological proteins. Glycine, however, is not chiral since its side chain is a hydrogen atom. A simple mnemonic for the correct L-form is “CORN”: when the Cα atom is viewed with the H in front, the residues read “CO-R-N” in a clockwise direction.
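The CORN rule can also be checked numerically from atomic coordinates. The following is a minimal sketch, assuming the sign of the triple product (N−Cα)·((C−Cα)×(Cβ−Cα)) is used as the handedness test; the function name `chirality` and the idealized coordinates are illustrative assumptions, not taken from any real structure.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def chirality(n, ca, c, cb):
    """Classify a residue as 'L' or 'D' from the sign of the chiral volume around C-alpha."""
    volume = _dot(_sub(n, ca), _cross(_sub(c, ca), _sub(cb, ca)))
    if volume > 0:
        return "L"
    if volume < 0:
        return "D"
    return "planar"


# Idealized geometry (hypothetical): C-alpha at the origin, H pointing towards a viewer on +z,
# and CO, R (C-beta), N placed clockwise as seen by that viewer, as the CORN rule states.
CA = (0.0, 0.0, 0.0)
C = (0.0, 1.0, -0.33)
CB = (0.866, -0.5, -0.33)
N = (-0.866, -0.5, -0.33)

if __name__ == "__main__":
    print(chirality(N, CA, C, CB))
```

Mirroring the coordinates, for example by flipping the sign of every z value, reverses the sign of the triple product and therefore the reported handedness.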

1.2 Primary structure

Main article: Protein primary structure

The primary structure of a protein refers to the linear sequence of amino acids in the polypeptide chain. The primary structure is held together by covalent bonds such as peptide bonds, which are made during the process of protein biosynthesis or translation. The two ends of the polypeptide chain are referred to as the carboxyl terminus (C-terminus) and the amino terminus (N-terminus) based on the nature of the free group on each extremity. Counting of residues always starts at the N-terminal end (NH2-group), which is the end where the amino group is not involved in a peptide bond.

The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation. Frederick Sanger was the first to determine the amino acid sequence of a protein, that of insulin. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code. The human body contains over 10,000 proteins, which are composed of different arrangements of 20 types of amino acid residues. The term “amino acid residues” is preferred when discussing proteins because a water molecule is lost when a peptide bond is formed, so proteins are made up of amino acid residues rather than intact amino acids.

Post-translational modifications such as disulfide bond formation, phosphorylation and glycosylation are usually also considered a part of the primary structure, and cannot be read from the gene. For example, insulin is composed of 51 amino acids in 2 chains linked by disulfide bonds: one chain has 30 amino acids and the other has 21.
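Reading a protein sequence directly from a gene with the genetic code can be sketched in a few lines. This is a minimal illustration that assumes the standard genetic code and a coding sequence that is already in frame; the names `CODON_TABLE` and `translate` are introduced here for the example, and, as noted above, post-translational modifications are not recoverable this way.

```python
from itertools import product

# Standard genetic code, written as one-letter amino acids for codons
# enumerated in T, C, A, G order at each position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA or mRNA coding sequence, N-terminus first, up to the first stop."""
    seq = coding_sequence.upper().replace("U", "T")
    residues = []
    # Step through complete codons only; trailing bases that do not fill a codon are ignored.
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

The returned string lists residues from the N-terminus to the C-terminus, matching the counting convention described above.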

1.3 Secondary structure

Main article: Protein secondary structure

Secondary structure refers to highly regular local sub-structures. Two main types of secondary structure, the alpha helix and the beta strand (which assembles into beta sheets), were suggested in 1951 by Linus Pauling and coworkers.[2] These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles ψ and φ on the Ramachandran plot.
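The dihedral angles plotted on a Ramachandran plot can be computed from backbone atom coordinates. The sketch below assumes the usual definitions, φ from the atoms C(i−1), N(i), Cα(i), C(i) and ψ from N(i), Cα(i), C(i), N(i+1); the function names `dihedral` and `phi_psi` are illustrative and not part of any particular library.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Remove the components along the central bond, leaving the two
    # outer bonds projected onto the plane perpendicular to it.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi_psi(prev_c, n, ca, c, next_n):
    """Backbone (phi, psi) of one residue from its own atoms and its two neighbours."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)
```

Applying `phi_psi` to every interior residue of a chain gives the points of a Ramachandran plot; the first and last residues lack one neighbour and so have only one of the two angles.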

An alpha-helix with hydrogen bonds (yellow dots)

Both the alpha helix and the beta sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone. Some parts of the protein are ordered but do not form any regular structures. They should not be confused with random coil, an unfolded polypeptide chain lacking any fixed three-dimensional structure. Several sequential secondary structures may form a “supersecondary unit”.[3]

1.4 Tertiary structure

Main article: Protein tertiary structure

Tertiary structure refers to the three-dimensional structure of a single protein molecule, that is, of one polypeptide chain. The alpha helices and beta pleated sheets are folded into a compact globular structure. The folding is driven by non-specific hydrophobic interactions, the burial of hydrophobic residues from water, but the structure is stable only when the parts of a protein domain are locked into place by specific tertiary interactions, such as salt bridges, hydrogen bonds, the tight packing of side chains, and disulfide bonds. Disulfide bonds are extremely rare in cytosolic proteins, since the cytosol (intracellular fluid) is generally a reducing environment.

1.5 Quaternary structure

Main article: Protein quaternary structure

Quaternary structure is the three-dimensional structure of a multi-subunit protein and how the subunits fit together. The quaternary structure is stabilized by the same non-covalent interactions and disulfide bonds as the tertiary structure. Complexes of two or more polypeptides (i.e. multiple subunits) are called multimers. Specifically, a multimer is called a dimer if it contains two subunits, a trimer if it contains three subunits, a tetramer if it contains four subunits, and a pentamer if it contains five subunits. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of “homo-” (e.g. a homotetramer) and those made up of different subunits are referred to with a prefix of “hetero-”, for example, a heterotetramer, such as the two alpha and two beta chains of hemoglobin.
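The naming scheme for multimers is mechanical enough to express as a short function. This is only a sketch of the convention described above: the name `name_multimer`, the subunit labels, and the generic “n-mer” fallback for sizes the text does not name are assumptions made for illustration.

```python
# Size names given in the text; larger assemblies fall back to a generic "n-mer" label.
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer", 5: "pentamer"}


def name_multimer(subunits):
    """Name a complex from a list of subunit labels, e.g. ["alpha", "alpha", "beta", "beta"]."""
    count = len(subunits)
    if count < 2:
        return "monomer"
    # Identical subunits give a "homo-" prefix, any mixture gives "hetero-".
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + SIZE_NAMES.get(count, f"{count}-mer")


if __name__ == "__main__":
    # Hemoglobin: two alpha and two beta chains.
    print(name_multimer(["alpha", "alpha", "beta", "beta"]))
```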

2 Domains, motifs, and folds in protein structure

Protein domains. The two shown protein structures share a common domain (maroon), the PH domain, which is involved in phosphatidylinositol (3,4,5)-trisphosphate binding.

Proteins are frequently described as consisting of several structural units. These units include domains, motifs, and folds. Despite the fact that there are about 100,000 different proteins expressed in eukaryotic systems, there are many fewer different domains, structural motifs and folds.

2.1 Structural domain

A structural domain is an element of the protein’s overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains often are named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the “calcium-binding domain of calmodulin”. Because they are independently stable, domains can be “swapped” by genetic engineering between one protein and another to make chimera proteins.

2.2 Structural and sequence motif

Structural and sequence motifs are short segments of protein three-dimensional structure or amino acid sequence that are found in a large number of different proteins.

2.3 Supersecondary structure

The supersecondary structure refers to a specific combination of secondary structure elements, such as beta-alpha-beta units or a helix-turn-helix motif. Some of them may also be referred to as structural motifs.

2.4 Protein fold

A protein fold refers to the general protein architecture, like a helix bundle, beta-barrel, Rossmann fold or the different “folds” provided in the Structural Classification of Proteins database.[4]

2.5 Superdomain

A superdomain consists of two or more nominally unrelated structural domains that are inherited as a single unit and occur in different proteins.[5] An example is provided by the protein tyrosine phosphatase domain and C2 domain pair in PTEN, several tensin proteins, auxilin and proteins in plants and fungi. The PTP-C2 superdomain evidently came into existence prior to the divergence of fungi, plants and animals and is therefore likely to be about 1.5 billion years old.

3 Protein folding

Main article: Protein folding

Once translated by a ribosome, each polypeptide folds into its characteristic three-dimensional structure from a random coil.[6] Since the fold is maintained by a network of interactions between amino acids in the polypeptide, the native state of the protein chain is determined by the amino acid sequence (Anfinsen’s dogma).[7]

4 Protein structure determination

Examples of protein structures from the PDB

Around 90% of the protein structures available in the Protein Data Bank have been determined by X-ray crystallography. This method allows one to measure the three-dimensional (3-D) density distribution of electrons in the protein, in the crystallized state, and thereby infer the 3-D coordinates of all the atoms to a certain resolution. Roughly 9% of the known protein structures have been obtained by nuclear magnetic resonance techniques. The secondary structure composition can be determined via circular dichroism. Vibrational spectroscopy can also be used to characterize the conformation of peptides, polypeptides, and proteins.[8] Cryo-electron microscopy has recently become a means of determining protein structures to high resolution, less than 5 angstroms (0.5 nanometer), and is anticipated to increase in power as a tool for high resolution work in the next decade. This technique is still a valuable resource for researchers working with very large protein complexes such as virus coat proteins and amyloid fibers. A more qualitative picture of protein structure is often obtained by proteolysis, which is also useful to screen for more crystallizable protein samples. Novel implementations of this approach, including fast parallel proteolysis (FASTpp), can probe the structured fraction and its stability without the need for purification.[9]

5 Structure classification

Protein structures can be grouped based on their similarity or a common evolutionary origin. The Structural Classification of Proteins database[10] and the CATH database[11] provide two different structural classifications of proteins. Shared structure between proteins is considered evidence of evolutionary relatedness between proteins and is used to group proteins together into protein superfamilies.[12]

6 Computational prediction of protein structure

Main article: Protein structure prediction

The generation of a protein sequence is much easier than the determination of a protein structure. However, the structure of a protein gives much more insight into the function of the protein than its sequence. Therefore, a number of methods for the computational prediction of protein structure from its sequence have been developed.[13] Ab initio prediction methods use just the sequence of the protein. Threading and homology modeling methods can build a 3-D model for a protein of unknown structure from experimental structures of evolutionarily related proteins, which together form a protein family.
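One simple way to judge whether a protein of known structure is a plausible relative of a target sequence is the percent identity between the two aligned sequences. The sketch below is a toy illustration and not a description of any specific homology modeling tool: it assumes the sequences are already aligned (with “-” for gaps), and the names `percent_identity` and `pick_template` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the aligned, non-gap positions of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def pick_template(aligned_target, aligned_templates):
    """Return the name of the candidate template with the highest identity to the target."""
    return max(
        aligned_templates,
        key=lambda name: percent_identity(aligned_target, aligned_templates[name]),
    )


if __name__ == "__main__":
    # Hypothetical, already-aligned toy sequences.
    candidates = {"template_1": "MKTAF", "template_2": "AAAAA"}
    print(pick_template("MKTAY", candidates))
```

Real pipelines rely on proper sequence alignment and more sensitive similarity measures; identity over a fixed alignment is used here only because it is easy to follow.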

7 References

[1] Brocchieri L, Karlin S (2005-06-10). “Protein length in eukaryotic and prokaryotic proteomes”. Nucleic Acids Research 33 (10): 3390–3400. doi:10.1093/nar/gki615. PMC 1150220. PMID 15951512.

[2] Pauling L, Corey RB, Branson HR (1951). “The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain”. Proc Natl Acad Sci USA 37 (4): 205–211. doi:10.1073/pnas.37.4.205. PMC 1063337. PMID 14816373.

[3] Chiang YS, Gelfand TI, Kister AE, Gelfand IM (2007). “New classification of supersecondary structures of sandwich-like proteins uncovers strict patterns of strand assemblage”. Proteins 68 (4): 915–921. doi:10.1002/prot.21473. PMID 17557333.

[4] Govindarajan S, Recabarren R, Goldstein RA (17 September 1999). “Estimating the total number of protein folds”. Proteins 35 (4): 408–414. doi:10.1002/(SICI)1097-0134(19990601)35:4<408::AID-PROT4>3.0.CO;2-A. PMID 10382668.

[5] Haynie DT, Xue B (2015). “Superdomain in the protein structure hierarchy: the case of PTP-C2”. Protein Science. doi:10.1002/pro.2664. PMID 25694109.

[6] Alberts, Bruce; Alexander Johnson; Julian Lewis; Martin Raff; Keith Roberts; Peter Walters (2002). “The Shape and Structure of Proteins”. Molecular Biology of the Cell; Fourth Edition. New York and London: Garland Science. ISBN 0-8153-3218-1.

[7] Anfinsen, C. (1972). “The formation and stabilization of protein structure”. Biochem. J. 128 (4): 737–49. PMC 1173893. PMID 4565129.

[8] Krimm, Samuel; Bandekar, J. (1986). “Vibrational Spectroscopy and Conformation of Peptides, Polypeptides, and Proteins”. Advances in Protein Chemistry 38 (C): 181–364. doi:10.1016/S0065-3233(08)60528-8. ISBN 9780120342389.

[9] Minde DP, Maurice MM, Rüdiger SG (2012). Uversky, Vladimir N, ed. “Determining biophysical protein stability in lysates by a fast proteolysis assay, FASTpp”. PLoS ONE 7 (10): e46147. doi:10.1371/journal.pone.0046147. PMC 3463568. PMID 23056252.

[10] Murzin, A. G.; Brenner, S.; Hubbard, T.; Chothia, C. (1995). “SCOP: A structural classification of proteins database for the investigation of sequences and structures”. Journal of Molecular Biology 247 (4): 536–540. doi:10.1016/S0022-2836(05)80134-2. PMID 7723011.

[11] Orengo, C. A.; Michie, A. D.; Jones, S.; Jones, D. T.; Swindells, M. B.; Thornton, J. M. (1997). “CATH--a hierarchic classification of protein domain structures”. Structure (London, England : 1993) 5 (8): 1093–1108. doi:10.1016/S0969-2126(97)00260-8. PMID 9309224.

[12] Holm, L; Rosenström, P (July 2010). “Dali server: conservation mapping in 3D”. Nucleic Acids Research 38 (Web Server issue): W545–9. doi:10.1093/nar/gkq366. PMID 20457744.

[13] Zhang Y (2008). “Progress and challenges in protein structure prediction”. Curr Opin Struct Biol 18 (3): 342–348. doi:10.1016/j.sbi.2008.02.004. PMC 2680823. PMID 18436442.

8 Further reading

• 50 Years of Protein Structure Determination Timeline - HTML Version - National Institute of General Medical Sciences at NIH

9 Text and image sources, contributors, and licenses

9.1 Text

• Protein structure Source: http://en.wikipedia.org/wiki/Protein%20structure?oldid=649694311 Contributors: AxelBoldt, Ghakko, Lexor,

Ahoerstemeier, Darkwind, Lfh, Ike9898, Chris 73, Sverdrup, Academic Challenger, Graeme Bartlett, NeoJustin, Dmb000006, AlanAu, Christopherlin, Klemen Kocjancic, Thorwald, Mike Rosoft, Rich Farmbrough, Nina Gerlach, SocratesJedi, Foolip, Konstantin,Bobo192, Cmdrjameson, Password, Arcadian, Kjkolb, Alansohn, Etxrge, Moleculesoflife, Theyeti, Wtmitchell, Gene Nygaard, K3rb, Re-cury, RyanGerbil10, LadyofHats, SeventyThree, Jfx319, Grammarbot, Rjwilmsi, Smoe, Yamamoto Ichiro, Elmer Clark, Malenkylizards,YurikBot, Mushin, Team6and7, Tralala, Splette, Gaius Cornelius, Pseudomonas, Thane, ENeville, Kkmurray, Dtemp, Phgao, CWenger,Curpsbot-unicodify, Banus, DVD R W, SmackBot, Pavlovič, Kjaergaard, Gilliam, Bluebot, RDBrown, Deli nk, Miguel Andrade, Can'tsleep, clown will eat me, OrphanBot, DMacks, Tim Ross, Clicketyclack, SashatoBot, Mikelr, Euchiasmus, Dwpaul, ClarkFreifeld, Loodog,Wheedhee, Beetstra, Saganatsu, Beno1000, Cryptic C62, Shrimp wong, CmdrObot, Sir Vicious, A876, WillowW, Anonymi, Carstensen,Omicronpersei8, Opabinia regalis, Rasmusw, Speedyboy, Michael A.White, AntiVandalBot, Pro crast in a tor, TimVickers, Qwerty Binary,Sluzzelin, Timur lenk, Iulus, Berky, Blueskylab, DerHexer, JaGa, Hbent, MartinBot, J.delanoy, Trusilver, Hodja Nasreddin, Gurchzilla,Jcwf, Belovedfreak, Antony-22, Abaighv, Edwardzou, Pdcook, Rkirian, CardinalDan, TXiKiBoT, A4bot, Albval, Jackfork, Winged-submariner, Alexbateman, Enviroboy, Lynnbridgebook, Langtucodoc, Flyer22, Kochipoik, MadmanBot, Retama, Kanonkas, ImageR-emovalBot, Webridge, ClueBot, Jan1nad, Akjohnson, Niceguyedc, Jordell 000, NuclearWarfare, Vriend, Zlacroix, La Pianista, Floul1,AC+79 3888, Apparition11, Alchemist Jack, XLinkBot, Chymæra, Addbot, Element16, Friginator, Debresser, Ginosbot, LinkFA-Bot,LuK3, Luckas-bot, Yobot, Fraggle81, Shalley303, Choij, Law, Grafened, Materialscientist, Citation bot, Biophysik, Vijaykumarutkam,P99am, Fuzzball24816, Pravinhiwale, Q31245, Dcrjsr, Shadowjams, Much noise, Fdardel, Sms1371, 
ROBE0191, OgreBot, Citation bot1, Pinethicket, Jujutacular, Stefano Garibaldi, Jesse V., RjwilmsiBot, Mandolinface, Tommymac10, Tommy2010, Swanandgore, Math-ghamhainn, Donner60, ChuispastonBot, Last Lost, 28bot, ClueBot NG, Lalo1121, Calabe1992, BG19bot, Roberticus, MusikAnimal,GKFX, PRKelleher, Bigegar, ChrisGualtieri, Dexbot, Holger87, Makecat-bot, David P Minde, AmaryllisGardener, Axelgriewel, Jakec,Evolution and evolvability, Ginsuloft, ML19962, Jkirby754, Monkbot, Bubbles11264, Superdomain and Anonymous: 310

9.2 Images

• File:Alpha_helix.png Source: http://upload.wikimedia.org/wikipedia/commons/7/75/Alpha_helix.png License: CC-BY-SA-3.0 Contributors: en:Alpha.png Original artist: Zsolt Bikadi / en:User:Bikadi

• File:Domain_Homology.png Source: http://upload.wikimedia.org/wikipedia/commons/1/19/Domain_Homology.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Fdardel

• File:Main_protein_structure_levels_en.svg Source: http://upload.wikimedia.org/wikipedia/commons/c/c9/Main_protein_structure_levels_en.svg License: Public domain Contributors: Own work based on what i could get. in between others: [1], [2], [3], [4], [5], [6], [7], [8]. Original artist: LadyofHats

• File:Protein_structure.png Source: http://upload.wikimedia.org/wikipedia/commons/0/05/Protein_structure.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Holger87

• File:Protein_structure_examples.png Source: http://upload.wikimedia.org/wikipedia/commons/2/24/Protein_structure_examples.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Axel Griewel

• File:Question_book-new.svg Source: http://upload.wikimedia.org/wikipedia/en/9/99/Question_book-new.svg License: Cc-by-sa-3.0 Contributors: Created from scratch in Adobe Illustrator. Based on Image:Question book.png created by User:Equazcion Original artist: Tkgd2007

9.3 Content license

• Creative Commons Attribution-Share Alike 3.0