Local Distance Difference Test

lDDT web server documentation

Introduction

Many structural biology techniques and protocols require a measurement of the structural similarity between two proteins. Traditionally, such comparisons are carried out using superposition-based similarity scores (Global Distance Test; GDT) on Cα atoms. However, GDT scores, being based on rigid-body superposition, cannot account for changes of relative domain orientation in multi-domain proteins, requiring each domain to be compared separately. Additionally, as a Cα measure they do not account for accuracy differences of non-Cα atoms, which constitute almost 90% of all atoms in a protein model.

To overcome these limitations, in our assessment of the 9th instalment of the CASP experiment we introduced the local Distance Difference Test (lDDT) score, which evaluates how well inter-atomic distances in a reference protein structure are reproduced in a second structure that is compared to it [1]. Being superposition-free, the lDDT score can be used to compare multi-domain structures without any prior processing. Furthermore, due to its focus on the conservation of the chemical environment including all atoms, it naturally lends itself to local comparisons of functionally relevant regions of proteins (e.g. binding sites or interaction surfaces). We later introduced an improved version of the lDDT score (Mariani et al., 2013; see the citation at the end of this page), which includes stereo-chemical quality checks and allows the use of multiple reference structures at the same time. A protein model can, for example, be compared directly with an ensemble of NMR structures, without the need to define a single representative structure. This version of the lDDT score was used during the assessment of the 10th instalment of CASP.

This website allows you to calculate the global and local lDDT scores for one or more protein structures, using single or multiple references, and performing stereo-chemical checks on the input structures. Reasonable options are used by default, but can be changed as desired by advanced users. These notes will help you use the website and will describe in detail the options available at every step of the calculation.

Overview

The Local Distance Difference Test (lDDT)

The Local Distance Difference Test score measures how well local interactions in a reference structure are conserved in the protein model being assessed. It is computed over the distances between all pairs of atoms in the reference structure that lie closer than a predefined threshold (called the inclusion radius) and do not belong to the same residue. A distance is considered conserved in the model being evaluated if it has, within a tolerance threshold, the same length as in the reference. If the atoms defining the distance are not present in the model, the distance is considered not conserved. The fraction of conserved distances is computed for each of four tolerance thresholds (0.5, 1, 2, and 4 Å), and the final score is the average of these four fractions. Local lDDT scores can also be computed. They are computed on a per-residue basis, and represent the average fraction of conserved distances that involve atoms of the residue.
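The definition above can be illustrated with a minimal Python sketch. It is not the server's implementation: the function names, the representation of a structure as a dictionary mapping (residue number, atom name) to coordinates, and the treatment of a difference exactly equal to a threshold as conserved are all assumptions made for this example.

```python
import math
from itertools import combinations

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # tolerance thresholds in Å, as listed above


def reference_pairs(reference, inclusion_radius=15.0):
    """Distances between reference atoms of different residues that lie
    closer than the inclusion radius."""
    pairs = {}
    for a, b in combinations(sorted(reference), 2):
        if a[0] == b[0]:  # same residue number: pair is skipped
            continue
        d = math.dist(reference[a], reference[b])
        if d < inclusion_radius:
            pairs[(a, b)] = d
    return pairs


def conserved_count(pair, d_ref, model):
    """Number of thresholds (0 to 4) at which the distance is conserved."""
    a, b = pair
    if a not in model or b not in model:
        return 0  # missing atoms: not conserved at any threshold
    diff = abs(math.dist(model[a], model[b]) - d_ref)
    return sum(diff <= t for t in THRESHOLDS)


def global_lddt(reference, model, inclusion_radius=15.0):
    """Fraction of conserved distances, averaged over the four thresholds."""
    pairs = reference_pairs(reference, inclusion_radius)
    if not pairs:
        return 0.0
    hits = sum(conserved_count(p, d, model) for p, d in pairs.items())
    return hits / (len(pairs) * len(THRESHOLDS))


def local_lddt(reference, model, inclusion_radius=15.0):
    """Per-residue score over the distances that involve atoms of the residue."""
    hits, total = {}, {}
    for pair, d_ref in reference_pairs(reference, inclusion_radius).items():
        n = conserved_count(pair, d_ref, model)
        for resnum in {pair[0][0], pair[1][0]}:
            hits[resnum] = hits.get(resnum, 0) + n
            total[resnum] = total.get(resnum, 0) + len(THRESHOLDS)
    return {r: hits[r] / total[r] for r in total}


# Toy example: three CA atoms, with the third one displaced in the model.
reference = {(1, "CA"): (0.0, 0.0, 0.0),
             (2, "CA"): (3.8, 0.0, 0.0),
             (3, "CA"): (7.6, 0.0, 0.0)}
model = dict(reference)
model[(3, "CA")] = (8.7, 0.0, 0.0)
print(global_lddt(reference, model), local_lddt(reference, model))
```

In the toy example the displaced residue receives the lowest local score, because both distances that involve it are conserved only at the two loosest thresholds.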

The Local Distance Difference Test can be computed against multiple structures of the same protein at the same time. The score is then computed using a set of distances determined from all the reference structures. The set includes all pairs of corresponding atoms which, in all structures where they appear, lie at a distance closer than a reference threshold. For each atom pair, the minimum and the maximum distances observed across all the reference structures are used to determine distance conservation and are compared with the distance between the corresponding atoms in the structure being evaluated. The distance is considered conserved if its length falls within the interval defined by the minimum and the maximum reference distances, or if it lies outside of the interval by less than a predefined length threshold.
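As a rough illustration of the multi-reference rule, the sketch below collects the minimum and maximum distance for each atom pair and tests a model distance against that interval. The function names and the dictionary representation of a structure are assumptions for this example, not part of the server.

```python
import math
from itertools import combinations


def reference_intervals(references, inclusion_radius=15.0):
    """Return {(atom_a, atom_b): (d_min, d_max)} for atom pairs of different
    residues that are closer than the inclusion radius in every reference
    in which both atoms appear."""
    observed = {}
    for ref in references:
        for a, b in combinations(sorted(ref), 2):
            if a[0] != b[0]:
                observed.setdefault((a, b), []).append(math.dist(ref[a], ref[b]))
    return {pair: (min(ds), max(ds))
            for pair, ds in observed.items()
            if max(ds) < inclusion_radius}


def is_conserved(d_model, d_min, d_max, tolerance):
    """Conserved if inside [d_min, d_max] or outside it by at most the tolerance."""
    return d_min - tolerance <= d_model <= d_max + tolerance


# Two hypothetical references in which the same atom pair is 3.8 Å and 4.0 Å apart.
ref_a = {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (3.8, 0.0, 0.0)}
ref_b = {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (4.0, 0.0, 0.0)}
d_min, d_max = reference_intervals([ref_a, ref_b])[((1, "CA"), (2, "CA"))]
print(is_conserved(4.4, d_min, d_max, tolerance=0.5))
```

With a single reference the interval collapses to one value, and the test reduces to the single-reference tolerance check.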

At the user’s option, the calculation of the Local Distance Difference Test can also be preceded by a series of structure quality checks. These checks make the final lDDT score representative not only of the accuracy with which the target structure was modeled, but also of the general stereo-chemical quality of the model, and of its physical plausibility. The checks are performed in two steps. During the first step, all bond lengths and all angle widths are compared with reference values defined by Engh and Huber for each amino-acid type [2]. When a value diverges from the average reference value by more than a predefined number of standard deviations (12 by default), a stereo-chemical violation is detected. During the second step of the structure quality checks, all distances between pairs of non-bonded atoms in the model being evaluated are compared with the sums of the corresponding van der Waals atomic radii [3]. The two atoms of a pair are considered to be clashing when the distance between them is shorter than the sum of their atomic radii by more than 1.5 Å.
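The two checks reduce to simple comparisons, sketched below. The function names are hypothetical, and the numbers in the usage lines are illustrative placeholders only; they are not taken from the Engh and Huber tables or from the Cambridge Structural Database.

```python
def is_stereo_violation(observed, mean, std, tolerance=12.0):
    """True if a bond length or angle width deviates from the reference mean
    by more than `tolerance` standard deviations."""
    return abs(observed - mean) > tolerance * std


def is_clash(distance, radius_a, radius_b, clash_tolerance=1.5):
    """True if two non-bonded atoms are closer than the sum of their atomic
    radii by more than `clash_tolerance` (in Å)."""
    return distance < radius_a + radius_b - clash_tolerance


# Illustrative placeholder values, not real reference parameters.
print(is_stereo_violation(observed=1.50, mean=1.33, std=0.01))
print(is_clash(distance=1.0, radius_a=1.7, radius_b=1.7))
```

With the default tolerance of 12 standard deviations, only severe distortions are flagged; lowering the tolerance makes the check stricter.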

The lDDT Web Server

This website allows you to calculate the global and local lDDT scores for:

  • One or more models with respect to a single reference structure
  • One or more models with respect to multiple structures used as reference at the same time

For both models and references, the website can take as input:

  • A PDB file containing a single structure
  • A PDB file containing multiple structures separated by TER tags
  • A compressed archive containing one or more PDB files, each with single or multiple structures

After pre-processing each input structure to prepare it for the calculation, the website returns information on the global and local lDDT scores, and on the general stereo-chemical quality of the model, if the user chooses to perform quality checks. In addition to being shown on the output web-page, this information is also made available for download in the form of a compressed archive.

Using the Web-Server

Uploading the Input Structures

On the web-site home page, the Model and Reference entries can be used to upload the test and reference files respectively. For each entry, the user can click on the Choose File button and select the relevant file. The following file types are allowed:

  • files in PDB format containing a single structure
  • files in PDB format containing several structures separated by TER records
  • Compressed archives with one of the following extensions:
    • zip
    • gz
    • tar.gz
    • bz2
    • tar.bz2
    • tgz
    • tar
  • Compressed archives can contain one of the following (any different content will generate an error):
    • A single PDB file
    • Several PDB files
    • A single folder containing one or more PDB files

Gzipped files in PDB format are correctly processed by the web-server, but only if their extension is strictly 'pdb.gz'. Non-compressed files in PDB format will instead be recognized irrespective of their extension. Hidden files and files whose name begins with a dot will be ignored.
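The naming rules above can be checked locally before uploading. The following sketch is only a guess at how such a check might look: the function name, the returned labels, the case-sensitive matching, and the order in which the rules are applied are assumptions, not the server's documented behaviour.

```python
# Archive extensions accepted according to the list above.
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".tar", ".zip", ".bz2", ".gz")


def classify_upload(filename):
    """Classify a file name as 'ignored', 'gzipped_pdb', 'archive' or 'pdb'."""
    name = filename.rsplit("/", 1)[-1]  # drop any leading folder names
    if name.startswith("."):
        return "ignored"  # hidden files and dot files are skipped
    if name.endswith(".pdb.gz"):
        return "gzipped_pdb"  # gzipped PDB files need exactly this extension
    if name.endswith(ARCHIVE_EXTENSIONS):
        return "archive"
    return "pdb"  # anything else is treated as a plain PDB-format file


for example in ("model.pdb.gz", "models.tar.gz", "model.ent", "folder/.hidden.pdb"):
    print(example, "->", classify_upload(example))
```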

Please note that the lDDT score has been designed to compare only single chains. The web server will only compare the first chain in the model structure with the first chain in the reference structure, irrespective of their names. Furthermore, before computing the score, the web server checks that residue names in the structures being compared are the same. This requirement is also enforced when using multiple reference structures at the same time: residues with the same number in all references must have the same name. Finally, only chains of amino acids linked by peptide bonds are considered for the calculation; everything else (water, ligands, DNA) is ignored.

Email Address and Job Name

After choosing the structures to upload, the user can enter an email address and a job name on the main page of the web-server. An email containing a link to the Results page of the lDDT score calculation is sent to the provided address, with the job name as part of the subject line. Aside from bookmarking the Results page in the user's browser, the link provided in the email is the only way to access the Results page after closing the browser's window where the calculation has been started. The result of each lDDT score calculation is stored on the server for 48 hours, then deleted.

Preprocessing

The uploaded structures are preprocessed before the calculation begins. By default, the lDDT score is computed only on structures containing standard residues in their unmodified form. The preprocessing guarantees that the uploaded structures meet these requirements. Specifically:

  • All non-standard residues are removed from the structure
  • Hydrogen atoms are removed
  • Terminal OXT atoms are removed
  • Atoms that do not belong to the unmodified form of each specific residue type are removed

These default preprocessing steps guarantee that the lDDT score can be calculated on any structure uploaded by the user.
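The four preprocessing rules can be pictured as a filter over atom records. The sketch below is an illustration under stated assumptions: the record layout (dictionaries with `resname`, `name` and `element` keys) and the function name are hypothetical, and the atom table covers only three residue types, whereas a full table would list all 20 standard amino acids.

```python
# Heavy atoms of the unmodified residue, shown for three residue types only.
STANDARD_ATOMS = {
    "GLY": {"N", "CA", "C", "O"},
    "ALA": {"N", "CA", "C", "O", "CB"},
    "SER": {"N", "CA", "C", "O", "CB", "OG"},
}


def preprocess(atoms):
    """Keep only atoms that survive the four preprocessing rules."""
    kept = []
    for atom in atoms:
        allowed = STANDARD_ATOMS.get(atom["resname"])
        if allowed is None:
            continue  # non-standard residue: removed entirely
        if atom["element"] == "H":
            continue  # hydrogen atoms are removed
        if atom["name"] == "OXT":
            continue  # terminal OXT atoms are removed
        if atom["name"] not in allowed:
            continue  # atom not part of the unmodified residue
        kept.append(atom)
    return kept


atoms = [
    {"resname": "SER", "name": "OG", "element": "O"},
    {"resname": "SER", "name": "HG", "element": "H"},
    {"resname": "ALA", "name": "OXT", "element": "O"},
    {"resname": "MSE", "name": "SE", "element": "SE"},
]
print([a["name"] for a in preprocess(atoms)])
```

The last rule already covers hydrogens and OXT atoms when the table lists heavy atoms only; the explicit checks are kept to mirror the list above.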

lDDT Calculation

After uploading the input structures, clicking on the Run lDDT button in the bottom left starts the calculation. By default the lDDT score is computed:

  • Performing stereo-chemical checks
  • Using a tolerance of 12 standard deviations for bond length and angle width violations
  • Checking distances with an inclusion radius of 15 Å

An advanced user can modify the default parameters by clicking on the lDDT - Advanced Options entry on the main web-page. Available options are described in detail in the next paragraphs.

Advanced lDDT Options

This paragraph describes in detail all the parameters of the Local Distance Difference Test that can be changed by the user.

Inclusion Radius

The local Distance Difference Test evaluates distances between atoms lying closer than a pre-determined inclusion radius. The default value of 15 Å can be changed by editing this field. For an explanation of the choice of the default value, see [1].

Perform structural checks on the input data

When this option is ticked, stereo-chemical and steric clash checks are carried out during the calculation of the lDDT score. The lengths of all bonds and the widths of all angles in the input test structure are compared with average values derived by Engh and Huber and published in [2]. When a bond length or angle width deviates from the tabulated value by more than a predefined number of standard deviations, a violation is detected. Distances between non-bonded atoms are also compared to the sum of their atomic radii as defined in the Cambridge Structural Database [3], and if they are found to be shorter than the sum by more than 1.5 Å, a violation is detected. When a violation is found in the side-chain of a residue, all distances involving atoms of that side-chain are automatically considered as non-conserved during the calculation of the lDDT score. When a violation involves the backbone atoms of a residue, all distances containing atoms of the residue are considered non-conserved. The final lDDT score thus reflects the stereo-chemical quality of the test model, as well as its accuracy.
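The way a violation spreads to the distances of a residue can be sketched as follows. This is an illustration only: the function name and data layout are hypothetical, the backbone is assumed to consist of the N, CA, C and O atoms, and a violation that spans both backbone and side-chain atoms is assumed here to count as a backbone violation.

```python
BACKBONE = {"N", "CA", "C", "O"}  # assumed definition of backbone atoms


def invalidated_atoms(residue_atoms, violating_atoms):
    """Return the set of (residue number, atom name) whose distances are
    treated as non-conserved.

    residue_atoms: {residue number: set of atom names in that residue}
    violating_atoms: iterable of (residue number, atom name) with a violation
    """
    invalid = set()
    for resnum, name in violating_atoms:
        names = residue_atoms[resnum]
        if name in BACKBONE:
            # backbone violation: every atom of the residue is affected
            invalid.update((resnum, n) for n in names)
        else:
            # side-chain violation: only side-chain atoms are affected
            invalid.update((resnum, n) for n in names if n not in BACKBONE)
    return invalid


serine = {5: {"N", "CA", "C", "O", "CB", "OG"}}
print(sorted(invalidated_atoms(serine, [(5, "OG")])))
```

Any distance with at least one atom in the returned set would then be counted as not conserved at every tolerance threshold.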

Tolerance for bond and angle deviations (in standard deviations)

These two parameters are used only when structural checks are performed during the calculation of the lDDT score. When bond lengths and angle widths deviate from the ideal values by more than the number of standard deviations specified by these parameters, a serious stereo-chemical violation is detected. See the description of the previous parameter for more details.

Fault tolerant loading of the input structures

This parameter allows PDB files with mild format inconsistencies to be processed by the lDDT server (for example, it prevents structures containing duplicate atoms from being rejected). By default this parameter is ticked. Unticking it results in stricter processing of the input structures, with a higher number of rejected structures.

Sequence Separation

By default, the lDDT score is computed considering distances between atoms belonging to different residues, but no minimal sequence separation between the two residues is required. When this parameter is set to a value other than 0, only distances between atoms in residues separated by at least the specified number of sequence positions are considered. This parameter can be used, for example, to downweight the importance of highly structured regions in the calculation of the lDDT score. Regions with rigid and tightly packed secondary structure tend to show a higher number of conserved distances irrespective of the accuracy of the model. See Mariani et al. (2013), cited at the end of this page, for more details.
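In code, the sequence separation acts as an extra filter on residue pairs. The helper below is a hypothetical sketch, assuming that separation is measured as the absolute difference between residue numbers.

```python
def pair_is_considered(resnum_a, resnum_b, sequence_separation=0):
    """True if distances between atoms of the two residues enter the score."""
    if resnum_a == resnum_b:
        return False  # atoms of the same residue are never compared
    return abs(resnum_a - resnum_b) >= sequence_separation


# With the default of 0, neighbouring residues are included;
# with a separation of 3, residues 10 and 12 are no longer compared.
print(pair_is_considered(10, 11))
print(pair_is_considered(10, 12, sequence_separation=3))
```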

Atoms with zero occupancy

This option removes from the calculation all atoms which are present in the structure, but whose Occupancy column in the PDB file reports a value of 0.
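The effect of this option can be reproduced on a PDB file before uploading it. In the fixed-column PDB format the occupancy occupies columns 55 to 60 of ATOM and HETATM records; the function name below is hypothetical, and leaving records with a blank occupancy field untouched is an assumption of this sketch.

```python
def drop_zero_occupancy(pdb_lines):
    """Return the lines with zero-occupancy ATOM/HETATM records removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            field = line[54:60].strip()  # occupancy, columns 55-60
            if field and float(field) == 0.0:
                continue  # atom present in the file but with occupancy 0
        kept.append(line)
    return kept
```

A typical use would read a file with `open(path).read().splitlines()`, pass the lines through the function, and write the result back out.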

The Waiting Page

While the lDDT scores for the uploaded structures are being calculated, a waiting page is shown to the user. A log of the operations carried out by the server is shown in real time. If it turns out to be too verbose, the log can be closed by clicking on the Log header at the top of the page.

The Results Page

The waiting page automatically loads the results page when the calculation is done. Alternatively, the page can be reached by clicking on the link received by the user via email when a valid address is provided on the web server's main page. The results page remains available for 48 hours after the completion of the score calculation, and it is then deleted from the server.

The Results section occupies most of the page. At the top, a link allows the user to download a zipped archive containing the results of the calculation. Below the link, the page shows the information about the reference structure (or structures) used for the calculation. A static picture of the reference structure is shown, colored according to the identity of secondary structure elements (red for helices and yellow for sheets). If the user uploaded multiple references, the picture shows a sequence-based superposition of all references used for the calculation. Clicking on the picture allows the user to zoom in. By clicking on the PNG link below the figure, the user can download a copy. Clicking on the 3D link in a Java-enabled browser opens a window showing an interactive representation of the reference structure (or structures). Finally, clicking on the PDB entry allows the user to download a PDB-format file of the reference structure (or structures) shown in the picture.

Below the reference information, the Results page shows the outcome of the lDDT score calculation for all uploaded structures, in four columns.

The first column on the left, called Model name, lists all the structures for which the lDDT score was computed. The structure name corresponds to the name of the uploaded file. If a file contains several structures, a progressive number is appended to the file name to denote each structure, in the order in which it appears in the file. As all structures are preprocessed before entering the lDDT score calculation, all names listed in this column end with the string '_preprocessed'. Clicking on each structure's name allows the user to download a file in PDB format containing the preprocessed structure, as used to compute the lDDT score.

The second column, called Global lDDT, shows the value of the global lDDT score. As explained before, this is the average fraction of conserved distances in the structure over four different tolerance thresholds: 0.5 Å, 1 Å, 2 Å, and 4 Å.

The third column, called Local lDDT plot, shows a plot of the local lDDT scores vs. the residue number. The local lDDT scores are computed on a per-residue basis, and represent the average fraction of conserved distances that involve atoms of the residue. Clicking on the image allows the user to enlarge it, while clicking on the PNG link below the image allows the user to download a copy of the plot. The link marked as TABLE allows the user to download a tab-separated file describing in detail the outcome of the lDDT score calculation. Both the global and local lDDT scores are listed therein, plus information about coverage of the reference structure, stereo-chemical quality of the model, etc.

Finally, the fourth column, Local lDDT - 3D, shows a static picture of the 3D structure of the molecule, colored according to the values of the local lDDT score. Once again, clicking on the picture allows the user to enlarge it, while clicking on the PNG link downloads a copy of the picture to the user's computer. Clicking on the 3D link in a Java-enabled browser opens a window showing an interactive representation of the 3D structure, which can be rotated, manipulated, etc.

At the bottom of the page the user can still access the Log section, which is identical to the one in the waiting page. This section is closed by default, but can be opened by clicking on the header: the Results section closes automatically and the log messages are shown. Clicking back on the header of the Results section reverses the operation.

The zipped archive that can be downloaded at the top of the page contains a folder for each structure. Each folder has the same name as the structure with the string '_lddt' added at the end, and contains:

  • The file in PDB format with the preprocessed structure
  • The tab-separated-value file with the result of the lDDT score calculation
  • The plot of the local lDDT scores vs the residue numbers
  • The static image with the 3D structure colored according to the local lDDT scores

Frequently asked questions

Why doesn't the server show the superposition between the model and the reference structures?

The local Distance Difference Test is inherently a non-superposition based score, hence the server doesn't show a superposition of the two structures. In fact, one of the reasons for designing this score was precisely to overcome the problems of superposition-based similarity scores. Please see Mariani et al. (2013), cited at the end of this page, for more details.

I uploaded a PDB-format file with multiple chains. Why is the lDDT score only computed for the first chain?

The lDDT score has been designed to compare only single chains. The web server will compare the first chain in the model structure with the first chain in the reference structure only, irrespective of their names. Furthermore, only chains of amino acids linked by peptide bonds are considered for the calculation; everything else (water, ligands, DNA) is ignored.

The web server is complaining about inconsistencies in residue names. What is going on?

Before computing the lDDT score the web server checks that residue names in the structures being compared are the same. If they are not (for example, if the web-server is asked to compare a tryptophan with an arginine), the server reports an error and does not proceed with the calculation. This requirement is also enforced when using multiple reference structures at the same time: residues with the same number in all reference structures must have the same name.

References

  1. Mariani, V., F. Kiefer, et al. (2011). Assessment of template based protein structure predictions in CASP9. Proteins 79 Suppl 10: 37-58.

  2. Engh, R. A. and R. Huber (2006). Structure quality and target parameters. International Tables for Crystallography, John Wiley & Sons, Ltd.

  3. Allen, F. H. (2002). The Cambridge Structural Database: a quarter of a million crystal structures and rising. Acta Crystallogr B 58 (Pt 3 Pt 1): 380-388.

If you use the local Distance Difference Test (lDDT), please cite the following reference:
  • Mariani V., Biasini M., Barbato A., and Schwede T. (2013). lDDT: A local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29 (21), 2722-2728.