4 Tertiary Protein Structure and folds
Chapters 1 and 2 introduced α-helices and β-sheets (secondary structure), and some common "motifs" composed of two or three of these elements (super-secondary structure). Tertiary structure describes the folding of the polypeptide chain to assemble the different secondary structure elements in a particular arrangement. Just as helices and sheets are the units of secondary structure, so the domain is the unit of tertiary structure. In multi-domain proteins, tertiary structure includes the arrangement of domains relative to each other as well as that of the chain within each domain.
There is a blurred distinction between "super-secondary structure" and "tertiary structure". The introduction of the term "super-secondary structure" was necessary when it became clear that certain arrangements of two or three secondary structures are present in many different protein structures, even with completely different sequences.
Note that some proteins do not consist of an assembly of these super-secondary motifs. For example, proteins of the globin family consist of eight α-helices in contact, but the helices do not pack against the helices adjacent to them in the sequence, with the exception of the final two, which form an antiparallel helix-turn-helix motif.
Although the term "motif" is often used to describe super-secondary structures (e.g. Branden and Tooze, 1991), it may also be used to describe a consensus sequence of amino acids identified in a number of different proteins, rather than a repeated three-dimensional conformation. Such a consensus in primary structure generally implies a similarity in tertiary structure. Bear in mind, however, that there are very many protein sequences whose three-dimensional structures are not known for certain, so in these cases the term "motif" strictly applies to primary rather than super-secondary or tertiary structure.
4.2 All-α topologies
4.2.1 The lone helix
There are a number of examples of small proteins (or peptides) which consist of little more than a single helix. A striking example is alamethicin, a peptide antibiotic that forms voltage-gated transmembrane ion channels.
4.2.2 The helix-turn-helix motif
The simplest packing arrangement of a domain of two helices is for them to lie antiparallel, connected by a short loop. This constitutes the structure of the small (63 residue) RNA-binding protein Rop, which is found in certain plasmids (small circular molecules of double-stranded DNA occurring in bacteria and yeast) and is involved in their replication. There is a slight twist in the arrangement as shown.
4.2.3 The four-helix bundle
The four-helix bundle is found in a number of different proteins. In many cases the helices are part of a single polypeptide chain, connected to each other by three loops. However, the Rop molecule is in fact a dimer of two of the two-helix units shown above.
In four-helix-bundle proteins the interfaces between the helices consist mostly of hydrophobic residues while polar side chains on the exposed surfaces interact with the aqueous environment, as indicated below:
Compare this with the arrangement of residues that would be expected in a membrane-spanning helical domain. The central helices of the photosynthetic reaction centre are in fact arranged similarly to a four-helix bundle.
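The segregation of hydrophobic and polar side chains onto opposite faces of a bundle helix can be illustrated with a rough numerical sketch. The function below is in the spirit of a hydrophobic moment calculation, but it uses a simple two-class residue set rather than a published hydrophobicity scale. The hydrophobic set and the function name `sidedness` are assumptions made for illustration; the 100° rotation per residue follows from 3.6 residues per turn.

```python
import cmath
import math

# Assumed two-class split, for illustration only; real scales are graded.
HYDROPHOBIC = set("AVILMFWC")
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn


def sidedness(seq):
    """Return a 0..1 score for how strongly hydrophobic residues
    cluster on one face of an ideal alpha-helix."""
    if not seq:
        return 0.0
    total = 0j
    for k, res in enumerate(seq.upper()):
        # +1 for hydrophobic, -1 for polar, placed at the residue's
        # angular position around the helix axis.
        weight = 1.0 if res in HYDROPHOBIC else -1.0
        angle = math.radians(DEGREES_PER_RESIDUE * k)
        total += weight * cmath.exp(1j * angle)
    return abs(total) / len(seq)
```

A helix with one hydrophobic face and one polar face gives a high score, whereas a uniformly hydrophobic helix, as might be expected in a membrane-spanning segment, gives a score near zero.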
Other examples exhibit a much more open packing arrangement, as in the steroid-binding proteins uteroglobin and Clara cell 17 kDa protein.
The four helices may be arranged in a simple up-and-down topology, as indicated. A good example is myohemerythrin.
Others are cytochrome c' and cytochrome b-562.
A more complex arrangement is also possible, as in ferritin:
A number of cytokines consist of four α-helices in a bundle. Here is a diagram of interleukin-2, human growth hormone, granulocyte-macrophage colony-stimulating factor (GM-CSF) and interleukin-4.
4.2.3.1 α domains which bind DNA
Transcription factors are proteins which bind to control regions of DNA. These regions are "upstream" of the structural gene (the sequence which actually codes for a protein) whose transcription they regulate. Transcription factors have a DNA-binding domain and a domain that activates transcription.
The RNA-binding two-helix protein Rop has already been mentioned. A three-helix bundle forms the basis of a DNA-binding domain which occurs in a number of proteins, for example the homeodomain proteins. Examine the crystal structure of the engrailed homeodomain bound to DNA.
And here is the structure of the cro repressor from phage 434.
The globin fold usually consists of eight α-helices. The two helices at the end of the chain are antiparallel, forming a helix-turn-helix motif, but the remainder of the fold does not include any characterised super-secondary structures. These helices pack against each other with larger angles between them (around 50°) than occur between antiparallel helices (approximately 20°). See the section below on helix-helix packing. Jane Richardson (1981) describes the globin fold as a "Greek key helix bundle", due to the topological similarity with the Greek key arrangement of antiparallel β-sheets (see section 4.3 on all-β topologies).
In all, fifty-six categories of "mostly α" folds are listed in the Structural Classification of Proteins database. A number of the entries have links to appropriate diagrams by Manuel Peitsch.
4.2.4 Helix-helix packing
When α-helices pack against each other, the side chains in their interface are buried. The two interface areas should have complementary surfaces. The surface of an α-helix can be thought of as consisting of grooves and ridges, like a screw thread: for instance, the side chains of every 4th residue form a ridge (because there are 3.6 residues per turn). The direction of this ridge is 26° from the direction of the helix axis. Therefore if two helices pack such that a ridge of this kind from each fits into the other's groove, the expected angle between the two is 52°. In fact, in the distribution of this angle between packed α-helices, there is a sharp peak at 50°. Besides the type of ridge described, ridges can be formed by other stacking patterns of residues, such as every 3rd residue, or indeed every residue. Which ridges are used for packing depends on the size and conformations of the side chains at these relative positions. The "i+4" ridge is believed to be the most common because residues at every 4th position have side chains which are more closely aligned than in "i+3" or "i+1" ridges, as indicated below.
Two other types of packing do occur, however: between an "i+4" ridge and an "i+3" ridge (there is an angle of 23° between the two helix axes) and between an "i+4" and an "i+1" ridge (the helices are 105° apart). The "ridges and grooves" model does not describe all helix-helix packings, as there are examples with unusual inter-axial angles. For instance, in the globin fold a pair of helices (B and E) pack such that their ridges cross each other, by means of a notch formed at a pair of glycine residues.
Below is a diagram of the notch in the ridges of helices B and E:
and here is one for a slice through a space-filling model of the two helices packing against each other.
The inter-axial distance between packed helices varies from 6.8 to 12.0 Å, the mean being 9.4 Å; the mean inter-penetration of atoms at the interface is 2.3 Å. Therefore it is mainly side chains which make the contacts between the helices.
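The ridge geometry described in this section can be checked with a little trigonometry. Moving from residue i to residue i+n along an ideal helix means rotating n × 100° about the axis while rising n × 1.5 Å along it, so the ridge through those residues is inclined to the axis by an angle that depends on how far the side chains sit from the axis. The sketch below is a simplification under stated assumptions: the effective radius of 4.2 Å is a hypothetical value chosen for illustration, the 1.5 Å rise per residue is the standard α-helix figure, and the sign of the returned angle is a convention of this sketch only.

```python
import math

RESIDUES_PER_TURN = 3.6   # from the text
RISE_PER_RESIDUE = 1.5    # angstroms along the axis, standard alpha-helix value
ASSUMED_RADIUS = 4.2      # angstroms; hypothetical effective side-chain radius


def ridge_angle(step, radius=ASSUMED_RADIUS):
    """Inclination in degrees, relative to the helix axis, of the ridge
    running through residues i, i+step, i+2*step, ..."""
    rotation = step * 360.0 / RESIDUES_PER_TURN
    # Keep only the net rotation, wrapped to the range [-180, 180).
    rotation = (rotation + 180.0) % 360.0 - 180.0
    arc = radius * math.radians(rotation)   # distance travelled around the axis
    rise = step * RISE_PER_RESIDUE          # distance travelled along the axis
    return math.degrees(math.atan2(arc, rise))


def packing_angle_4_4(radius=ASSUMED_RADIUS):
    """Inter-axial angle when an i+4 ridge of each helix fits into the
    matching groove of the other: twice the ridge inclination."""
    return 2.0 * ridge_angle(4, radius)
```

With the assumed radius, `ridge_angle(4)` comes out close to the 26° quoted above and `packing_angle_4_4()` close to 52°. The "i+3" ridge is inclined in the opposite sense, which is why combining it with an "i+4" ridge gives a much smaller inter-axial angle.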
4.2.5 Other distinctive all-α proteins include:
- Annexin V
- Glutathione S-transferase
- Calmodulin- and Parvalbumin-like calcium-binding proteins
4.3 All-β topologies
Protein folds which consist almost entirely of β-sheets exhibit a completely or mostly antiparallel arrangement. Many of these antiparallel domains consist of two sheets packed against each other, with hydrophobic side chains forming the interface. Bearing in mind that the side chains of a β-strand point alternately to opposite sides of the sheet, this means that such structures will tend to have sequences of alternating hydrophobic and polar residues.
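The alternating pattern mentioned above can be looked for in a sequence with a very simple count. The following is a minimal sketch, not a prediction method: the hydrophobic residue set is an assumption made for illustration, and `alternation_fraction` is a hypothetical helper name.

```python
# Assumed two-class split, for illustration only; other classifications exist.
HYDROPHOBIC = set("AVILMFWC")


def alternation_fraction(seq):
    """Fraction of adjacent residue pairs that switch between the
    hydrophobic and polar classes (1.0 means perfect alternation)."""
    flags = [res in HYDROPHOBIC for res in seq.upper()]
    if len(flags) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(flags, flags[1:]))
    return switches / (len(flags) - 1)
```

A strand-like stretch such as `"VKVKVKVK"` scores 1.0, whereas a blocky sequence such as `"VVVVKKKK"` scores low because it switches class only once.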
4.3.1 β sandwiches and β barrels
In the immunoglobulin fold the strands form two sheets packed against each other, forming a "β sandwich".
4.3.1.1 Aligned and orthogonal β sandwiches
In the immunoglobulin and fibronectin type-3 folds, the two sheets are approximately aligned. In fact the mean angle between the two sheets is approximately 30° (designated -30° because the uppermost sheet is rotated clockwise with respect to the lower). The two sheets are usually independent in that the linking residues between them are not in β-sheet conformation. The angle between the sheets is determined by their right-handed twist. The observed angle varies between -20° and -50°; this is due to variation in the twist. Also, side chains are not always ideally aligned at the interface.
Orthogonal β-sheet packings consist of β-sheets folded on themselves; the two sheets make an angle of -90°. The strands at one corner, or at two diagonally opposite corners, go uninterrupted from one layer to the other. Local coiling at the corner or a β bulge facilitates the right-angled bend. These bends are right-handed, due to the permitted φ and ψ angles. The figure below illustrates this model.
Only along one diagonal do the two sheets make contact. Large side-chains in loops usually fill the spaces between the splayed corners.
Diagram of this β-sheet arrangement in the lipocalin family, whose members bind small molecules between the sheets of the sandwich.
4.3.1.2 β barrels
Some antiparallel β-sheet domains are better described as β-barrels than as β-sandwiches, for example streptavidin and porin. Note that some structures are intermediate between the extreme barrel and sandwich arrangements.
4.3.2 Up-and-down antiparallel β sheets
The simplest topology for an antiparallel β-sheet involves loops connecting adjacent strands.
4.3.2.1 The Greek Key topology
The Greek Key topology, named after a pattern that was common on Greek pottery, is shown below. Three up-and-down β-strands connected by hairpins are followed by a longer connection to the fourth strand, which lies adjacent to the first.
Folds including the Greek key topology have been found to have 5-13 strands. Examples are given below.
- Plastocyanin. Notice that this has a mixed sheet: there are two parallel pairs of strands.
- Gamma-crystallin. This has two domains, each of which is an eight-stranded β-barrel-type structure composed of two Greek keys. In fact, the structure is more accurately described as consisting of two β-sheets, one consisting of strands 2,1,4,7 (white) and the other of strands 6,5,8,3 (red) as indicated in the diagram. Sequence homology has been found between the two Greek key motifs within each domain, and also between the two domains themselves. The latter homology is higher than the former; this implies that the structure evolved from a single Greek key fold by means of a gene duplication to produce a domain of two Greek keys, followed by a second duplication resulting in two similar domains. This is supported by the fact that in some crystallins each Greek key motif is coded by a different exon, with introns between them.
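The difference between the up-and-down and Greek key topologies described above comes down to how far apart in the sheet two sequence-consecutive strands lie. The sketch below makes that explicit for an idealised four-stranded sheet. The function name and the two strand orders are illustrative assumptions, with strands numbered in sequence order and listed as they sit side by side in the sheet.

```python
def connection_spans(sheet_order):
    """For each pair of sequence-consecutive strands, return how many
    positions apart they lie in the sheet (1 means a simple hairpin)."""
    position = {strand: idx for idx, strand in enumerate(sheet_order)}
    return [
        abs(position[s + 1] - position[s])
        for s in range(1, len(sheet_order))
    ]


# Strand numbers in the order they appear across the sheet.
UP_AND_DOWN = [1, 2, 3, 4]   # every strand next to its sequence neighbour
GREEK_KEY = [4, 1, 2, 3]     # strand 4 lies next to strand 1
```

For `UP_AND_DOWN` every connection spans one position, so all links are hairpins. For `GREEK_KEY` the first two connections are hairpins and the third spans three positions, which corresponds to the longer connection that brings the fourth strand back beside the first.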
4.3.2.2 The Jellyroll Topology
Richardson (1981) describes the jellyroll fold as being formed by the addition of an extra "swirl" to a Greek key:
This fold is illustrated by the coat protein of satellite tobacco necrosis virus.
One molecule is composed of four of these subunits. Each is a superbarrel of six four-stranded antiparallel sheets. The whole structure has a basically up-and-down topology and is called a β-propeller.
This fold has an approximately 3-fold axis of symmetry.
This dramatically unusual fold was discovered quite recently. The β-strands wind round the structure, describing a helical topology.
4.4 α/β topologies
The most regular and common domain structures consist of repeating β-α-β super-secondary units, such that the outer layer of the structure is composed of α-helices packing against a central core of parallel β-sheets. These folds are called α/β, or wound α β.
Many enzymes, including all those involved in glycolysis, are α/β structures. Most α/β proteins are cytosolic.
The β-α-β unit is almost always right-handed. In α/β structures, there is a repetition of this arrangement, giving a β-α-β-α... sequence. The β-strands are parallel and hydrogen bonded to each other, while the α-helices are all parallel to each other, and are antiparallel to the strands. Thus the helices form a layer packing against the sheet.
The β-α-β-α-β subunit, often present in nucleotide-binding proteins, is named the Rossmann fold, after Michael Rossmann (Rao and Rossmann, 1973).
Richardson (1981) names the α/β structures "parallel α/β domains", to denote the fact that each of the two secondary structures forms a parallel arrangement. Note that there is no obvious reason why one would not expect to find "parallel all α" (α-α-α subunit) folds, or "parallel all β" (β-β-β) folds in equally large numbers, but these do not occur. However, the marked tendency for helices to pack aligned with sheets has been explained by the "complementary twist" model (Chothia et al., 1977). The right-handed twist of β-sheets and the right-handed twist of the row of every 4th residue of the helices (the "i+4" ridges; see section 4.2.4 on helix-helix packing) mean that the two have complementary surfaces when aligned. This model is supported by the observation that approximately 90% of the helix residues which interface with a sheet are indeed a multiple of 4 residues apart. Helices packing side by side on a sheet would be rotated with respect to each other, due to the sheet twist; the observed interhelical angle is in agreement with this model in 80% of cases. In the other cases the helices are splayed from the sheet, with only one end in contact.
4.4.1 α/β horseshoe
The structure of the remarkable placental ribonuclease inhibitor (Kobe, B. & Deisenhofer, J. (1993) Nature 366, 751) takes the concept of the repeating α/β unit to extremes. It is a cytosolic protein that binds extremely strongly to any ribonuclease that may leak into the cytosol. Look at the image below and you will see the 17-stranded parallel β-sheet curved into an open horseshoe shape, with 16 α-helices packed against the outer surface. It doesn't form a barrel although it looks as though it should. The strands are only very slightly slanted, being nearly parallel to the central 'axis'.
Consider a sequence of eight β-α motifs:
If the first strand hydrogen bonds to the last, then the structure closes on itself forming a barrel-like structure. This is shown in the picture of triose phosphate isomerase.
Note that the "staves" of the barrel are slanted, due to the twist of the β-sheet. Also notice that there are effectively four layers to this structure. The direction of the sheet does not change (it is anticlockwise in the diagram). Such a structure may therefore be described as singly wound.
In a structure which is open rather than closed like the barrel, helices would be situated on only one side of the β-sheet if the sheet direction did not reverse. Therefore open α/β structures must be doubly wound to cover both sides of the sheet.
The chain starts in the middle of the sheet and travels outwards, then returns to the centre via a loop and travels outwards to the opposite edge:
Doubly-wound topologies where the sheet begins at the edge and works inwards are rarely observed.
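The link between the direction in which the chain crosses the sheet and the side on which the helices end up can be sketched in a few lines. The sketch assumes that every connection between consecutive strands is a right-handed β-α-β crossover, so that the helix falls on one face when the chain moves one way across the sheet and on the other face when it moves the opposite way. The function names, the labels "front" and "back", and the two example strand orders are illustrative assumptions rather than data for any particular protein.

```python
def helix_sides(sheet_order):
    """Given strand numbers (sequence order) listed as they sit across a
    parallel sheet, return the face on which each connecting helix lies,
    assuming right-handed crossovers throughout."""
    position = {strand: idx for idx, strand in enumerate(sheet_order)}
    sides = []
    for s in range(1, len(sheet_order)):
        step = position[s + 1] - position[s]
        # Direction of travel across the sheet decides the face.
        sides.append("front" if step > 0 else "back")
    return sides


def is_doubly_wound(sheet_order):
    """True if helices are found on both faces of the sheet."""
    return len(set(helix_sides(sheet_order))) == 2


SINGLY_WOUND_OPEN = [1, 2, 3, 4, 5]     # chain always moves the same way
DOUBLY_WOUND = [3, 2, 1, 4, 5, 6]       # starts in the middle, then reverses
```

For `SINGLY_WOUND_OPEN` all helices land on the same face, leaving the other face of an open sheet exposed. For `DOUBLY_WOUND` the chain first works outwards to one edge and then returns to the centre and works outwards to the other edge, so both faces are covered.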
4.4.3 Alpha+Beta Topologies
This is where we collect together all those folds which include significant alpha and beta secondary structural elements, but for which those elements are 'mixed', in the sense that they do NOT exhibit the wound alpha-beta topology. This class of folds is therefore referred to as α+β.
This class includes:
- Bacterial and mammalian pancreatic ribonucleases.
- Histidine-Carrier protein.
- Cysteine proteases such as papain and actinidin.
- Zinc Metallo-proteases.
- SH2 domains.
- Protein G (prokaryotic Ig-binding).
- FK506 binding protein (peptidyl-prolyl isomerase).
- Carbonic anhydrase.
- Serine protease inhibitors (serpins).
- Thymidylate synthase.
4.5 Small disulphide-rich folds
Here we see a few examples of the main families of small disulphide-rich domains of known structure. The members of these families contain a large number of disulphide bonds which stabilise the fold.
- Serine proteinase inhibitor
- Sea anemone toxin (NMR structure)
- EGF-like domain
- Complement C-module domain
- Wheat Plant Toxin; Naja (Cobra) neurotoxin; green Mamba anticholinesterase.
- Kringle domain