SWISS-MODEL Repository (SMR) | Help

Introduction

The SWISS-MODEL Repository is a database of annotated three-dimensional comparative protein structure models generated by the fully automated homology-modelling pipeline SWISS-MODEL. The repository is developed at the Biozentrum Basel within the Swiss Institute of Bioinformatics.

The repository currently contains three-dimensional models for sequences from UniProtKB, generated using the automated SWISS-MODEL homology-modelling pipeline. Experimental structures of proteins are mapped between the PDB and UniProtKB using SIFTS. The content of the repository is updated regularly to incorporate new sequences, take advantage of newly available template structures, and reflect improvements in the underlying modelling algorithms. The current data status is given on the entry page.

If you are using models from the SWISS-MODEL Repository, please cite the following articles:

Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, Schwede T (2017). The SWISS-MODEL Repository - new features and functionality. Nucleic Acids Res. 45(D1):D313-D319.

Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Cassarino TG, Bertoni M, Bordoli L, Schwede T (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 42(W1):W252-W258.

Kiefer F, Arnold K, Künzli M, Bordoli L, Schwede T (2009). The SWISS-MODEL Repository and associated resources. Nucleic Acids Res. 37:D387-D392.

Kopp J, Schwede T (2006). The SWISS-MODEL Repository: new features and functionalities. Nucleic Acids Res. 34:D315-D318.

Repository main page

The Repository can be searched with UniProt accession codes (UniProt AC). Alternatively, users can run a free-text search on protein names, functional descriptions and organisms. The main page offers general statistics about the Repository as well as statistics about the core species, which are updated regularly when the PDB releases new structures: the number of sequences in a given proteome, the number of sequences with at least one model, the total number of models, and a sequence coverage plot. The vertical bars of the coverage plot show the coverage for every protein in the reference proteome. Colours from dark green to red represent the coverage of the targets: targets with high coverage (more than 80% of the target's length covered by models) are shown in dark green, whereas low coverage is shown in red. The size of each box is proportional to the number of target sequences with a given coverage. For E. coli, for instance, more than 60% of the proteins had more than 80% of their sequence length covered by structures and models as of October 2016.
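The coverage figures described above can be approximated with a few lines of Python. This is only a minimal sketch: the function names and the input layout (a target length plus a list of 1-based, inclusive residue ranges) are assumptions made for illustration, and the way SMR itself computes coverage may differ in detail.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue range (assumed layout)


def covered_fraction(length: int, ranges: List[Range]) -> float:
    """Fraction of a target of `length` residues covered by at least one range.

    Overlapping ranges are counted only once.
    """
    covered = 0
    last_end = 0  # last residue position that has already been counted
    for start, end in sorted(ranges):
        start = max(start, last_end + 1, 1)  # skip residues already counted
        end = min(end, length)               # clip to the target length
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / length


def fraction_highly_covered(targets: List[Tuple[int, List[Range]]],
                            threshold: float = 0.8) -> float:
    """Share of targets whose coverage exceeds `threshold`.

    The default of 0.8 corresponds to the "more than 80%" class that the
    coverage plot shows in dark green.
    """
    high = sum(1 for length, ranges in targets
               if covered_fraction(length, ranges) > threshold)
    return high / len(targets)
```

For a 200-residue target with models covering residues 1-50 and 40-100, covered_fraction returns 0.5, because the overlapping residues 40-50 are counted once.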

The Download option lets you download all the metadata about the models and structures of a given core species (see the help file in the downloaded folder as well as the SMR API chapter below).

Model species page

The model species page consists of six elements (details are explained on the page itself): i) a short general description of the species at the top; ii) the coverage plot, as shown on the main page; iii) the evolution of structural information over the years; iv) a summary of the quality of the models; v) the oligomeric state of the models; vi) a download of the metadata of models and structures.

Description of the various sections of the SMR web interface

Graphical representation of the target coverage by models and experimental structures

This view shows a schematic representation of the UniProtKB sequence of the protein in question. The sequence is shown in grey with its UniProt AC. All other features of the sequence are annotated on this sequence arc. If there are models or structures, they are shown below the UniProtKB sequence. The boxes on each track show the coverage for each structure/model, with dotted blue outlines indicating a model and solid green lines a PDB structure. The sequence arc also shows other sequence annotations such as transmembrane helices, InterPro domains and alternative splice variants. Mousing over them reveals the specific sequence details. Showing or hiding a sequence annotation affects both the sequence arc and the protein structure viewer: when a specific feature is shown, it appears on the arc and is highlighted in the relevant section of the 3D structure.

For some sequences there might be a large number of structures available. In order to only show relevant information, the structures are clustered based on their coverage and oligomeric state. Thus, one track in the sequence arc might be listed as a monomer, but this monomer cluster contains more than one structure.
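The idea of grouping structures into tracks can be illustrated with a short Python sketch. The exact clustering criteria used by SMR are not described here, so the example simply assumes that structures with the same oligomeric state and the same covered range fall into one cluster. The function name and the dictionary keys ("id", "oligo_state", "from", "to") are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def cluster_structures(structures: List[dict]) -> Dict[Tuple[str, int, int], List[str]]:
    """Group structures by (oligomeric state, first residue, last residue).

    Assumption for illustration: two structures belong to the same cluster
    only if their state and covered range match exactly. A real
    implementation would probably tolerate small differences in range.
    """
    clusters: Dict[Tuple[str, int, int], List[str]] = defaultdict(list)
    for s in structures:
        key = (s["oligo_state"], s["from"], s["to"])
        clusters[key].append(s["id"])
    return dict(clusters)


# Placeholder entries, not real repository data.
example = [
    {"id": "structure_1", "oligo_state": "monomer", "from": 1, "to": 230},
    {"id": "structure_2", "oligo_state": "monomer", "from": 1, "to": 230},
    {"id": "structure_3", "oligo_state": "homo-2-mer", "from": 1, "to": 230},
]
tracks = cluster_structures(example)
```

Here the monomer track contains two structures although it would be drawn as a single track, which mirrors the situation described above.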

Sequence Features

All identified features can be seen on the graphical representation of the target protein (see above) as well as in a dropdown list. The different features are colour coded. Clicking on a feature name shows all features of that type in the structure.

Model/structure detailed information

In this section we show information regarding the selected model/structure. In the case of a PDB structure, we show the PDB ID, the title, a link to download the biological assembly and links to external protein structure databases. In the case of a homology model, we indicate the template the model was based on, the sequence identity and sequence similarity (based on the BLOSUM62 substitution matrix) between the target and template sequences in the alignment, and a global quality score, GMQE (Global Model Quality Estimation). GMQE is a quality estimate that combines properties from the target-template alignment with the QMEAN score (see Model Quality for details). The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template. Higher numbers indicate higher reliability. The date on which the model was last updated and a link to download the model coordinates are also provided.

Model Quality

Each model in SMR is evaluated by QMEAN to provide model quality measures on a per-residue basis as well as on a global scale. QMEAN (Benkert et al.) is a composite scoring function for the estimation of the global and local model quality. QMEAN consists of four structural descriptors: the local geometry is analysed by a torsion angle potential over three consecutive amino acids; two pairwise distance-dependent potentials are used to assess all-atom and C-beta interactions; and a solvation potential describes the burial status of the residues.

In a nutshell, a score of 0 means that the quality is estimated to be comparable to that of other high-resolution structures of similar size. A positive score means the model is predicted to be of better quality; conversely, a negative score means the model is predicted to be of lower quality.

Expanding “Model Quality” provides plots of the estimated local quality of each part of the model as well as of how the model compares to other structures in the PDB.

Local estimates of the model quality based on the QMEAN scoring function are shown as a per-residue plot. Each residue is assigned a reliability score between 0 and 1, describing the expected similarity to the native structure. Higher numbers indicate higher reliability of the residues.
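Per-residue scores of this kind can be scanned for stretches of low reliability. The following Python sketch shows one way to do so. The cutoff of 0.6 and the minimum stretch length of 3 residues are arbitrary assumptions chosen for illustration and are not thresholds defined by SMR or QMEAN. The function name is hypothetical.

```python
from typing import List, Tuple


def low_reliability_regions(scores: List[float],
                            cutoff: float = 0.6,
                            min_length: int = 3) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) stretches with scores below `cutoff`.

    Only stretches of at least `min_length` consecutive residues are reported.
    Both parameters are illustrative assumptions.
    """
    regions = []
    start = None  # first residue of the current low-scoring stretch, if any
    for position, score in enumerate(scores, start=1):
        if score < cutoff:
            if start is None:
                start = position
        else:
            if start is not None and position - start >= min_length:
                regions.append((start, position - 1))
            start = None
    # Close a stretch that runs to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions
```

Requiring a minimum stretch length keeps isolated low-scoring residues from being reported as separate regions.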

The comparison with a non-redundant set of PDB structures helps to identify outliers, i.e. models whose expected quality is not comparable to that of experimental structures.

3D Structure viewer

Next to the model/structure information there is a 3D protein structure viewer implemented using PV, an interactive JavaScript/WebGL-based 3D structure viewer. Any model/structure that is selected in the sequence arc will be displayed here. Any features that are shown or hidden will also affect this display. The user can interact with the 3D representation by using the mouse buttons to rotate, translate and zoom. The buttons below the viewer provide extra functionality such as taking a snapshot or running PV in a separate window.

Sequence Alignments

Below the graphical representation of the target protein, the sequence alignment between the target and the template (or experimental structure) sequences is shown.

The alignment can be coloured in various ways by clicking on the cog icon. The colouring will also change the 3D view and the sequence arc. From the same menu the alignments can be downloaded in FASTA or Clustal format, or as a picture.

List of Experimental Structures and Homology Models

The All Models button allows you to see all the models/structures for the specific UniProtKB entry. If a cluster contains more than one model/structure, they are all listed here. Clicking on a specific model displays it in the 3D viewer.

Colouring

Clicking on the Colours button, below the 3D viewer or in the alignment section, allows the user to colour the sequence/structure based on various parameters, such as a per-residue model quality score (QMEAN), as well as to export the alignment or an image of the alignment.

No models or structure available

If no structure or model is available for the UniProtKB entry, you can ask the SWISS-MODEL Repository to add a model automatically by clicking the "Build Models" button. Model building typically takes no more than 15 minutes.

Please note that models with a QMEAN score below -5 (generally a sign of a low-quality model) are not added to the repository.
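The acceptance rule and the reading of the QMEAN score given under Model Quality can be summarised in a small Python sketch. The function and constant names are hypothetical. Only the cutoff of -5 and the interpretation of the sign are taken from this help page.

```python
QMEAN_CUTOFF = -5.0  # models scoring below this are not added to the repository


def is_added_to_repository(qmean: float) -> bool:
    """True if a model with this QMEAN score passes the -5 cutoff."""
    return qmean >= QMEAN_CUTOFF


def interpret_qmean(qmean: float) -> str:
    """Describe a QMEAN score relative to high-resolution structures of similar size."""
    if qmean > 0:
        return "predicted to be of better quality"
    if qmean < 0:
        return "predicted to be of lower quality"
    return "comparable quality"
```

For example, a model with a QMEAN score of -5.3 would not be added, whereas one with -2.1 would be added although it is predicted to be of lower quality than comparable experimental structures.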

If you prefer to interactively explore various models built on different templates, please use the link to the SWISS-MODEL server.

SMR API

The new SWISS-MODEL Repository is continuously updated to provide you with the best available model/experimental structure for a given UniProt AC.

Because of this dynamic environment, there are no fixed URLs to specific versions of models, and a model/structure may cease to exist at any moment once a better template becomes available.

To find out what is currently available in SMR, first get a list of available models and/or experimental structures in JSON format (here UniProt AC P07900 is used as an example).

From the list of available models, you will be able to find the model/structure you are interested in based on QMEAN, SeqId, range, etc., and fetch the coordinates using the URL given as the value of "coordinates". As previously stated, the continuous update of SMR means this model may not exist, so no specific model ID is available. The API will allow you to get the best model for your query. Specifying provider, template and range should be enough to return the exact model.
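Selecting an entry from such a list might look like the following Python sketch. The layout of the JSON response is not documented on this page, so the example assumes a flat list of dictionaries. Apart from "coordinates", all key names ("from", "to", "qmean", "seq_id") and the function name are hypothetical, and the sample entries are placeholders rather than real repository data.

```python
from typing import List, Optional


def pick_model(entries: List[dict], start: int, end: int,
               min_seq_id: float = 0.0) -> Optional[dict]:
    """Return the entry with the highest QMEAN that spans residues start..end.

    Entries below `min_seq_id` are ignored. Returns None if nothing matches.
    The key names are assumptions about the response layout.
    """
    candidates = [
        e for e in entries
        if e["from"] <= start and e["to"] >= end and e["seq_id"] >= min_seq_id
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e["qmean"])


# Placeholder entries for illustration only.
entries = [
    {"from": 1, "to": 230, "qmean": -1.2, "seq_id": 98.0,
     "coordinates": "https://example.org/model_a.pdb"},
    {"from": 10, "to": 700, "qmean": -3.4, "seq_id": 62.0,
     "coordinates": "https://example.org/model_b.pdb"},
]
best = pick_model(entries, start=14, end=223)
```

The URL to download from is then best["coordinates"], provided that an entry was found.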

An example URL would look like https://swissmodel.expasy.org/repository/uniprot/P07900.pdb?from=14&to=223&template=2k5b&provider=pdb

If no model is found for your requested coordinates, a 404 Not Found error is returned.
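A request of this form can be assembled and sent from Python using only the standard library. This is a minimal sketch: the function names are hypothetical, and the URL pattern is inferred from the single example URL given above rather than from a formal API specification.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example URL on this page.
BASE_URL = "https://swissmodel.expasy.org/repository/uniprot"


def coordinates_url(accession: str, start: int, end: int,
                    template: str, provider: str) -> str:
    """Build a coordinates URL from provider, template and residue range."""
    query = urlencode({
        "from": start,
        "to": end,
        "template": template,
        "provider": provider,
    })
    return f"{BASE_URL}/{accession}.pdb?{query}"


def fetch_coordinates(url: str, timeout: float = 30.0) -> Optional[str]:
    """Download coordinates as text, or return None on 404 Not Found.

    Any other HTTP error is re-raised for the caller to handle.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except HTTPError as err:
        if err.code == 404:
            return None  # the requested model/structure is not available
        raise
```

Calling coordinates_url("P07900", 14, 223, "2k5b", "pdb") reproduces the example URL shown above. Because models can disappear between updates, callers should treat a None result from fetch_coordinates as a normal outcome and query the list of available models again.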