Structure Assessment Introduction
The Model Results page of SWISS-MODEL provides the user with an essential, first-glance view of a homology model, showing ligands, global and local quality, and target-template alignments.
The Structure Assessment page aims to provide more detailed structural information about those homology models, with alternative displays and by running additional software tools.
It is also possible to upload your own structures to be structurally assessed. The uploaded structure can be assessed against a known reference structure, if a reference structure is uploaded on the input page. This allows calculation of lDDT, the local difference distance test, as well as other comparison-to-reference scores.
Structure Assessment Input
SWISS-MODEL homology models can be sent directly to the Structure Assessment page. Open the drop down menu within the model details on the results page of your SWISS-MODEL project.
Upload your own structure in PDB / mmCIF format. It is possible to add several structures at a time in a tar or zip archive. If you are logged in, or if you provide an email, the new project will be associated with your account at SWISS-MODEL.
If you want to assess your model against a known reference structure, again PDB / mmCIF format files are allowed.
Projects will be deleted automatically from our servers after two weeks.
Membrane Prediction
An implicit solvation model implemented in OpenStructure (mol.alg.FindMembrane) estimates the optimal membrane location and performs a classification based on energetic and geometric criteria.
The original algorithm and the energy function used are described by Lomize et al.
Ramachandran Plots
A Ramachandran plot visualizes the energetically favoured regions of the backbone dihedral angles Φ and Ψ of amino acid residues in a protein structure, plotting one angle against the other.
To determine the contours of favoured regions, data was extracted from 12,521 non-redundant experimental structures (pairwise sequence identity cutoff 30%, X-ray resolution cutoff 2.5Å) as culled from PISCES. Histograms with a binning of 4 degrees were then used to count Φ (Phi; C-N-CA-C) / Ψ (Psi; N-CA-C-N) occurrences for all displayed categories. The number of observed Φ / Ψ pairs determines the contour lines.
[Figure: Ramachandran plots with Φ on the horizontal axis and Ψ on the vertical axis, one panel per category: General (No Proline or Glycine), Glycine Only, Proline Only, Pre-Proline Only]
- 99.7% of the observed Φ / Ψ pairs are within the first contour line
- 95.0% of the observed Φ / Ψ pairs are within the second contour line
- 80.0% of the observed Φ / Ψ pairs are within the third contour line
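The binning and contour derivation described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the SWISS-MODEL implementation: the function names and the toy data are hypothetical, and the contour definition used here (the per-bin count at which the most populated bins together cover a given fraction of all observations) is an assumption about how the number of observed Φ / Ψ pairs determines a contour line.

```python
from collections import Counter

BIN_SIZE = 4  # degrees, as stated in the text
N_BINS = 360 // BIN_SIZE


def bin_index(angle):
    """Map an angle in [-180, 180] degrees to a bin index in [0, N_BINS)."""
    idx = int((angle + 180.0) // BIN_SIZE)
    return min(idx, N_BINS - 1)  # an angle of exactly 180 goes into the last bin


def histogram(phi_psi_pairs):
    """Count (phi, psi) observations per 4 x 4 degree bin."""
    return Counter((bin_index(phi), bin_index(psi)) for phi, psi in phi_psi_pairs)


def contour_threshold(counts, fraction):
    """Assumed contour definition: the smallest per-bin count such that the
    most populated bins down to that count hold `fraction` of all observations."""
    total = sum(counts.values())
    covered = 0
    for count in sorted(counts.values(), reverse=True):
        covered += count
        if covered >= fraction * total:
            return count
    return 0


# Toy data only: 8 helix-like and 2 strand-like observations.
pairs = [(-60.0, -45.0)] * 8 + [(-120.0, 130.0)] * 2
counts = histogram(pairs)
```

With real data, calling `contour_threshold` with 0.997, 0.95 and 0.80 would give the three count levels at which the contour lines are drawn under this assumed definition.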
MolProbity
MolProbity is a structure-validation web service that provides evaluation of model quality at both the global and local levels for both proteins and nucleic acids.
The SWISS-MODEL Structure Assessment page runs MolProbity version 4.4 as available from https://github.com/rlabduke/MolProbity.
The Structure Assessment page aims to show the most relevant scores provided by MolProbity and to help the user easily identify where low-quality residues lie in their model or structure. A table of results is presented to the user. For per-residue (or per-residue-pair) scores, the residues are sorted in increasing order of quality, so that the lowest-quality residue (or residue pair) is presented first. A tooltip provides the score for the residue/pair.
Score | Comment | Ideal Case |
---|---|---|
MolProbity Score | Combined protein quality score that reflects the crystallographic resolution at which such a quality would be expected | As low as possible |
Clash Score | Clashes show > 0.45Å non-H-bond | Zero |
Ramachandran Favoured | | > 98% |
Ramachandran Outliers | At resolutions below 3.0Å, any outliers should be considered errors. | < 0.2% |
Rotamer Outliers | At resolutions below 3.0Å, any outliers should be considered errors. | < 1% |
C-Beta Deviations | Position deviates from ideal by > 0.25Å | Zero |
Bad Bonds | > 4σ deviations from ideal | Zero |
Bad Angles | > 4σ deviations from ideal | Zero |
Cis/Twisted Prolines/Non-Prolines | < 30° from ideal defined as CIS; >150° from ideal defined as Twisted | Zero |
Note that the following steps are performed on the user's uploaded structure before MolProbity is run.
Uploaded Structure |
If mmCIF, converted to PDB, first biounit is selected |
Chains with < 15 residues are discarded |
MolProbity |
Quality Estimate
QMEAN (Studer et al.) is a composite estimator based on different geometrical properties and provides both global (i.e. for the entire structure) and local (i.e. per residue) absolute quality estimates on the basis of one single model. See the SWISS-MODEL help for an overview and the publication for full details.
For SWISS-MODEL Homology models, QMEAN will have been calculated during the normal modelling pipeline process. For uploaded structures, QMEAN will be started as a separate job using the QMEANDisCo method.
Residue Quality
The residues found in all protein polypeptide chains of the model are displayed in an interactive sequence display. Each residue is displayed by its one letter code below a bar chart displaying the QMEAN local quality estimation value.
The secondary structure of the protein, either DSSP or PSIPRED, is displayed above each residue code as a single letter. DSSP is calculated using the DSSP implementation in OpenStructure.
- B = residue in isolated β-bridge
- C = loop or irregular
- E = extended strand, participates in β ladder
- G = 3-helix (3₁₀ helix)
- H = α-helix
- I = 5-helix (π-helix)
- T = hydrogen bonded turn
- S = bend
To switch between the DSSP and PSIPRED representations, click the icon to the top left of the sequence. The chosen secondary structure is also applied to the structure in the 3D Viewer.
Comparison to Reference
Chain Mapping
Automatic detection of a one-to-one relationship of reference and model chains.
The mapped chains are displayed in a table, non-mapped chains are displayed separately if present.
In detail: to be considered, chains must have at least 10 residues for peptides, 4 for nucleotides. The chain mapping is a three-step process: 1) Group chemically identical chains in the reference using pairwise alignments and a sequence identity threshold of 95%. 2) Map model chains to these groups based on pairwise alignments. 3) Obtain a one-to-one mapping by enumerating possible chain assignments and optimizing for QS-score.
Chain mapping comes with factorial complexity. If reference and model are homo-octamers, there are 8!=40320 possible mappings. If the number of chains in model and reference are ≤ 12, the full solution space is enumerated and optimal QS-score is guaranteed. For larger complexes, a greedy heuristic is used instead.
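The factorial growth, and what exhaustive enumeration means in practice, can be illustrated with a short sketch. The names `n_mappings` and `best_mapping` and the `score` callable are hypothetical; in the real pipeline the score being optimized is the QS-score, which is not computed here.

```python
from itertools import permutations
from math import factorial


def n_mappings(n_chains):
    """Number of one-to-one mappings between two sets of n equivalent chains."""
    return factorial(n_chains)


def best_mapping(ref_chains, mdl_chains, score):
    """Enumerate every one-to-one mapping and keep the best-scoring one.

    `score` is a hypothetical callable taking a dict {ref_chain: mdl_chain}
    and returning a number (a stand-in for the QS-score).
    """
    best, best_score = None, float("-inf")
    for perm in permutations(mdl_chains):
        mapping = dict(zip(ref_chains, perm))
        s = score(mapping)
        if s > best_score:
            best, best_score = mapping, s
    return best, best_score
```

For a homo-octamer, `n_mappings(8)` gives 40320 candidate mappings; at 12 chains the count is already 479001600, which gives a sense of why larger complexes fall back to a heuristic.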
Comparison Scores
lDDT
The local distance difference test (Mariani et al.). lDDT assesses the differences in interatomic distances between model and reference structure and is thus inherently superposition free. As opposed to the original publication, lDDT displayed here natively supports oligomers as well as RNA/DNA. Interatomic distances across interfaces get treated as any other pairwise distance.
The global lDDT for the full model is displayed in the result table. Per-residue lDDT values are available in the interactive sequence display, which also allows lDDT to be mapped as a color gradient onto the model in the 3D viewer by clicking on the icon. If stereochemical irregularities are present, they are displayed right underneath the global lDDT score.
In detail: Each interatomic distance ≤15Å in the reference structure is compared with its model counterpart. For a given threshold d, lDDT computes the fraction of distance differences below d. The reported value is the average over 4 fractions computed with the thresholds [0.5, 1.0, 2.0, 4.0] Å.
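The core of this computation can be sketched in a few lines. This is a deliberately simplified illustration, not the OpenStructure implementation: it treats all atom pairs alike, ignores symmetry handling and stereochemistry checks, and assumes that a distance whose atoms are missing in the model counts as not preserved. The function name and input layout (a dict from atom identifier to coordinates) are hypothetical.

```python
from itertools import combinations
from math import dist

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # Angstrom
INCLUSION_RADIUS = 15.0            # Angstrom


def lddt_sketch(ref, mdl):
    """ref, mdl: dict mapping an atom identifier to (x, y, z) coordinates."""
    # Only distances present in the reference within the radius are assessed.
    pairs = [(a, b) for a, b in combinations(sorted(ref), 2)
             if dist(ref[a], ref[b]) <= INCLUSION_RADIUS]
    if not pairs:
        return 0.0
    fractions = []
    for threshold in THRESHOLDS:
        preserved = 0
        for a, b in pairs:
            if a in mdl and b in mdl:  # assumption: missing atoms count as not preserved
                diff = abs(dist(ref[a], ref[b]) - dist(mdl[a], mdl[b]))
                if diff < threshold:
                    preserved += 1
        fractions.append(preserved / len(pairs))
    return sum(fractions) / len(THRESHOLDS)
```

Because only distances are compared, no superposition of model and reference is needed at any point.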
Some residues are symmetric, i.e. allow different mappings that are chemically equivalent. An example is the OD1/OD2 atom pair in aspartic acid (ASP). Symmetries can additionally be found in GLU, LEU, VAL, ARG, PHE, TYR. lDDT first pre-processes each residue that contains symmetric atoms and computes lDDT of the symmetric atoms with respect to all fixed atoms (i.e. all atoms from other residues in the whole reference that are not symmetric). This is done twice: lDDT1 with the original mapping and lDDT2 with the swapped mapping (ASP example: OD1 becomes OD2 and vice versa). Whichever mapping scores higher is used in the final lDDT computation.
Furthermore, lDDT considers stereochemistry. Prior to lDDT computations, the model is checked for serious stereochemical irregularities. Clashes: atoms that are closer than their van der Waals radii minus a tolerance of 1.5Å. Bad bonds: bond lengths that are more than 12 standard deviations from expected lengths. Bad angles: bond angles that are more than 12 standard deviations from expected angles. Bond length and angle statistics are taken from CCP4 MON_LIB. lDDT is reduced by removing the full sidechain if any of the sidechain atoms is involved in such an irregularity, or by removing the full residue if the backbone is involved. The latter results in a per-residue lDDT of 0.0. Backbone definition for amino acids: (N, CA, C, O), for nucleotides: (P, OP1, OP2, OP3, O5', C5', C4', C3', C2', C1', O4', O3', O2').
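The three checks reduce to simple threshold tests, sketched below. The function names are hypothetical, the numeric inputs in any call are up to the caller (the real statistics come from CCP4 MON_LIB), and the clash test assumes that "their van der Waals radii" means the sum of the two radii.

```python
CLASH_TOLERANCE = 1.5  # Angstrom, as stated in the text
N_SIGMA = 12.0         # standard deviations, as stated in the text


def is_clash(distance, vdw_radius_a, vdw_radius_b, tolerance=CLASH_TOLERANCE):
    """Assumption: two atoms clash if they are closer than the sum of
    their van der Waals radii minus the tolerance."""
    return distance < vdw_radius_a + vdw_radius_b - tolerance


def is_bad_geometry(observed, expected, stddev, n_sigma=N_SIGMA):
    """True if a bond length or bond angle deviates from its expected
    value by more than n_sigma standard deviations."""
    return abs(observed - expected) > n_sigma * stddev
```

The same `is_bad_geometry` test serves for both bonds and angles; only the expected value and standard deviation differ.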
QS-score / QS-best
QS-score (Bertoni et al.) quantifies the similarity between interfaces as a function of shared interface contacts. As opposed to many interface comparison tools, it does not process single interfaces but compares the full complexes at once. QS-score thereby discriminates between alternative quaternary structures and binding modes. QS-best is a variant which will be described below.
Bertoni et al. introduce a 4-bin classification scheme, Incorrect (QS-score<0.1), Acceptable (0.1≤QS-score<0.3), Medium (0.3≤QS-score<0.7) and High (QS-score≥0.7), that resembles the interface quality classification in CAPRI (Lensink et al.). QS-score and QS-best occurrences are colored accordingly.
In detail:
- QS-score = weighted_scores / (weight_sum + weight_extra_all)
- QS-best = weighted_scores / (weight_sum + weight_extra_mapped)
- weighted_scores = sum(w(min(d1,d2)) * (1 - abs(d1-d2)/12)) for shared
- weight_sum = sum(w(min(d1,d2))) for shared
- weight_extra_mapped = sum(w(d)) for all mapped but non-shared
- weight_extra_all = sum(w(d)) for all non-shared
- w(d) = 1 if d ≤ 5Å, exp(-2 * ((d-5.0)/4.28)^2) else
- d: CB-CB (CA for GLY, C3' for nucleotides) distance of an inter-chain contact (d1, d2 for shared contacts).
- mapped: we could map chains of two structures and align the respective residues
- shared: pairs of residues which are mapped and have inter-chain contact in both structures.
- inter-chain contact: CB-CB pairs (CA for GLY, C3' for nucleotides) with distance ≤ 12Å
- w(d): weighting function (prob. of 2 res. to interact given distance) from Xu et al.
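The definitions above translate fairly directly into code. The sketch below assumes the contacts have already been extracted and sorted into three groups (shared, mapped but non-shared, and involving unmapped residues); the function names, that input layout, and the choice to return 0.0 for an empty denominator are assumptions made for illustration. The classification bins are those of Bertoni et al. as quoted in this section.

```python
from math import exp


def w(d):
    """Weighting function: interaction probability given a distance d in Angstrom."""
    return 1.0 if d <= 5.0 else exp(-2.0 * ((d - 5.0) / 4.28) ** 2)


def qs_scores(shared, extra_mapped, extra_unmapped):
    """shared: list of (d1, d2) for contacts present in both structures.
    extra_mapped: distances of mapped contacts present in only one structure.
    extra_unmapped: distances of contacts involving unmapped residues.
    Returns (QS-score, QS-best)."""
    weighted_scores = sum(w(min(d1, d2)) * (1.0 - abs(d1 - d2) / 12.0)
                          for d1, d2 in shared)
    weight_sum = sum(w(min(d1, d2)) for d1, d2 in shared)
    weight_extra_mapped = sum(w(d) for d in extra_mapped)
    weight_extra_all = weight_extra_mapped + sum(w(d) for d in extra_unmapped)

    def ratio(denominator):
        # Assumption: no contacts at all yields a score of 0.0.
        return weighted_scores / denominator if denominator > 0.0 else 0.0

    return ratio(weight_sum + weight_extra_all), ratio(weight_sum + weight_extra_mapped)


def classify_qs(score):
    """4-bin classification as quoted from Bertoni et al."""
    if score < 0.1:
        return "Incorrect"
    if score < 0.3:
        return "Acceptable"
    if score < 0.7:
        return "Medium"
    return "High"
```

Since every contact counted in `weight_extra_mapped` is also counted in `weight_extra_all`, QS-best can never be lower than QS-score.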
In words: QS-score assesses distance differences of inter-chain contacts that are present in both reference and model and can be mapped (see inter-chain contact and mapped definitions above). Their contribution is weighted by the probability that they actually interact with each other (see w(d) definition above). QS-score is lowered by any inter-chain contact that is only present in one of the structures by adding the respective interaction probability to the denominator (see weight_extra_all formula above). This has two consequences: 1) QS-score is symmetric, i.e. it does not matter which structure is the model and which structure is the reference. 2) Different quaternary structures or additional interfaces/contacts in general have a negative impact on QS-score.
The QS-best variant only considers inter-chain contacts between residues that can be mapped between model and reference (see inter-chain contact and mapped definitions above). It thus only considers the part of the structure that is present in both the model and the reference, which can be useful if the model is a substructure of the reference, or if the reference is not fully resolved.
DockQ-wave
DockQ-wave is a variant of the DockQ score (Basu et al.). DockQ strictly processes single interfaces and aims to provide a continuous protein-protein docking model quality measure derived by combining 3 measures commonly used in CAPRI (Lensink et al.): Fnat, LRMS and iRMS. DockQ-wave scores full complexes and is a weighted average of per-interface DockQ scores. Weights are derived from the number of native contacts (contacts in reference) of each interface.
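As a rough sketch of the weighted average, assuming each interface's weight is simply its number of native contacts (the text only says the weights are derived from that number), the per-interface DockQ values can be combined as follows. The function name and input layout are hypothetical.

```python
def dockq_wave(interfaces):
    """interfaces: list of (dockq, n_native_contacts), one entry per interface.
    Assumption: the weight of an interface equals its native contact count."""
    total_contacts = sum(n for _, n in interfaces)
    if total_contacts == 0:
        return 0.0
    return sum(dockq * n for dockq, n in interfaces) / total_contacts
```

For example, an interface with DockQ 0.8 and 30 native contacts combined with one of DockQ 0.2 and 10 native contacts gives (0.8 * 30 + 0.2 * 10) / 40 = 0.65, so large interfaces dominate the complex-level score.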
Basu et al. introduce a 4-bin classification scheme, Incorrect (DockQ<0.23), Acceptable (0.23≤DockQ<0.49), Medium (0.49≤DockQ<0.8) and High (DockQ≥0.8), that resembles the interface quality classification in CAPRI (Lensink et al.). DockQ-wave and DockQ occurrences are colored accordingly.
In detail: DockQ operates on single interfaces between two chains and is based on the following three quality measures common in CAPRI:
- Fnat: Fraction of native contacts that are conserved in model. A contact is defined as a pair of residues in distinct chains that have at least one heavy atom within 5 Å of each other
- LRMS: RMSD based on N, CA, C, O atoms. The larger reference chain is considered receptor whereas the smaller is the ligand. The full model is superposed onto the reference based on the mapped receptor positions using a Kabsch superposition. LRMS is defined as the RMSD of the resulting ligand positions.
- iRMS: RMSD based on N, CA, C, O atoms. The reference interface is defined as all residues participating in contacts as for Fnat but with increased distance threshold of 10Å. iRMS is defined as the minimum RMSD between these interface positions and the respective model positions as computed with a Kabsch superposition.
DockQ(Fnat, LRMS, iRMS, d1, d2)=(Fnat+RMSscaled(LRMS,d1)+RMSscaled(iRMS,d2))/3
with:
- d1: 8.5Å
- d2: 1.5Å
- RMSscaled(RMS,di)=1/(1+(RMS/di)^2)
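The DockQ formula and its classification bins can be written out as a small sketch, with the scaling term read as a squared ratio. Fnat, LRMS and iRMS are taken as precomputed inputs; the function names are hypothetical.

```python
D1 = 8.5  # Angstrom, scaling for LRMS
D2 = 1.5  # Angstrom, scaling for iRMS


def rms_scaled(rms, d):
    """Map an RMSD in [0, inf) to (0, 1]; equals 0.5 when rms == d."""
    return 1.0 / (1.0 + (rms / d) ** 2)


def dockq(fnat, lrms, irms, d1=D1, d2=D2):
    """Average of Fnat and the two scaled RMSD terms."""
    return (fnat + rms_scaled(lrms, d1) + rms_scaled(irms, d2)) / 3.0


def classify_dockq(score):
    """4-bin classification as quoted from Basu et al."""
    if score < 0.23:
        return "Incorrect"
    if score < 0.49:
        return "Acceptable"
    if score < 0.8:
        return "Medium"
    return "High"
```

A perfect interface (Fnat = 1, both RMSDs 0) scores 1.0, while Fnat = 0.5 with LRMS = 8.5Å and iRMS = 1.5Å scores exactly 0.5, which falls into the Medium bin.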
Oligo-GDTTS
Oligo-GDTTS is conceptually similar to GDT as defined in Zemla, with distance cutoffs of 1, 2, 4 and 8Å.
It uses the Cα (C3' for nucleotides) positions of all mapped residues to first derive a Kabsch superposition. After superposition, these positions are used to derive the Oligo-GDTTS score. This may be susceptible to outliers.
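Once the superposition has been applied, the scoring step itself is short. The sketch below takes the per-residue distances between superposed model and reference positions as given (the Kabsch superposition is not shown) and assumes a position counts towards a cutoff if its distance is less than or equal to it. The function name is hypothetical.

```python
GDT_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # Angstrom


def gdt_ts_sketch(distances, cutoffs=GDT_CUTOFFS):
    """distances: per-residue distances (Angstrom) between superposed
    model and reference positions. Returns the mean, over all cutoffs,
    of the fraction of positions within that cutoff."""
    if not distances:
        return 0.0
    fractions = [sum(1 for d in distances if d <= c) / len(distances)
                 for c in cutoffs]
    return sum(fractions) / len(cutoffs)
```

For distances of 0.5, 1.5, 3.0 and 10.0Å, the fractions are 1/4, 2/4, 3/4 and 3/4, giving a score of 0.5625.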
RMSD
Uses the Cα (C3' for nucleotides) positions of all mapped residues to derive an RMSD using a Kabsch superposition. This may be susceptible to outliers.
Interface Scores
Evaluation on a per-interface basis. Lists all pairs of chains in the reference and their respective counterparts in the model with non-zero native contacts as defined for Fnat. Displays:
- Reference: Pair of reference chains - interactively hooked up with 3D viewer
- Model: Same for the respective model chains
- Contacts: Number of native contacts as defined for Fnat
- DockQ: DockQ as described above
- QS-score: QS-score as described above but only considering that pair of chains
Mismatching Residues
The described scores rely on sequence alignments to establish a residue-residue relationship between model and reference. In case of mismatches (>5% of aligned columns), we display a warning. Backbone-only scores (QS-score, DockQ, Oligo-GDTTS, RMSD) may still be informative in this case, but we leave it to the user to check whether the displayed alignments are sensible. Full-atom scores, on the other hand, expect all atoms of the reference to be present in the model. In case of mismatches, atoms beyond the backbone are different and the scores become invalid. This is the case for lDDT. The score is still displayed but greyed out.
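The mismatch check can be pictured with a small sketch. It assumes that an "aligned column" is one where neither sequence has a gap and that the alignment is given as two equal-length strings with "-" as the gap character; the function names are hypothetical.

```python
MISMATCH_WARNING_THRESHOLD = 0.05  # >5% of aligned columns, as stated in the text


def mismatch_fraction(ref_aln, mdl_aln, gap="-"):
    """Fraction of aligned (gap-free) columns in which the residues differ."""
    if len(ref_aln) != len(mdl_aln):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(r, m) for r, m in zip(ref_aln, mdl_aln) if r != gap and m != gap]
    if not aligned:
        return 0.0
    return sum(1 for r, m in aligned if r != m) / len(aligned)


def needs_warning(ref_aln, mdl_aln):
    """True if the mismatch fraction exceeds the warning threshold."""
    return mismatch_fraction(ref_aln, mdl_aln) > MISMATCH_WARNING_THRESHOLD
```

A single substitution in a 10-residue alignment already gives a mismatch fraction of 0.1 and would trigger the warning, whereas gaps alone do not count as mismatches under this assumption.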