SWISS-MODEL Help

Introduction to SWISS-MODEL

SWISS-MODEL is a web-based integrated service dedicated to protein structure homology modelling. It guides the user in building protein homology models at different levels of complexity.

Building a homology model comprises four main steps: (i) identification of structural template(s), (ii) alignment of target sequence and template structure(s), (iii) model-building, and (iv) model quality evaluation. These steps require specialised software and integrate up-to-date protein sequence and structure databases. Each of the above steps can be repeated interactively until a satisfactory modelling result is achieved.


The SWISS-MODEL Workspace

The SWISS-MODEL Workspace (Waterhouse et al.) is a personal web-based working environment, where several modelling projects can be carried out in parallel. Protein sequence and structure databases necessary for modelling are accessible from the workspace and are updated at regular intervals. Tools for template selection, model building, and structure quality evaluation can be invoked from within the workspace directly or via the web page menu.

From the workspace, the user accesses the current modelling projects, inspects their status and visualises the results upon job completion. Project names can be changed retroactively by clicking on the symbol next to the project title. Alternatively, the project title can also be changed by double-clicking on the title when the project results are displayed. By default, projects are stored for two weeks on the server with an option to extend the project lifetime. The remaining time until a given project is deleted from the server is displayed accordingly.


Model Building

Models are computed by the SWISS-MODEL server homology modelling pipeline (Waterhouse et al.; Bordoli et al.), which relies on ProMod3, an in-house comparative modelling engine based on OpenStructure (Biasini et al.).

ProMod3 extracts initial structural information from the template structure. Insertions and deletions, as defined by the sequence alignment, are resolved by first searching for viable candidates in a structural database. Final candidates are then selected using statistical potentials of mean force scoring methods. If no candidates can be found, a conformational space search is performed using Monte Carlo techniques. Non-conserved side chains are modelled using the 2010 backbone-dependent rotamer library from the Dunbrack group (Shapovalov et al.). The optimal configuration of rotamers is estimated using the graph-based TreePack algorithm (Xu et al.) by minimising the SCWRL4 energy function (Krivov et al.). As a final step, small structural distortions, unfavourable interactions or clashes introduced during the modelling process are resolved by energy minimisation. ProMod3 uses the OpenMM library (Eastman et al.) to perform the computations and the CHARMM27 force field (Mackerell et al.) for parameterisation.


Modelling Modes

Depending on the difficulty of the modelling task, three different types of modelling modes are provided, which differ in the amount of user intervention: automated mode, alignment mode, and project mode.


Automated Mode

The Automated Mode only requires the amino acid sequence or the UniProtKB accession code of the target protein as input.

The automatic pipeline identifies suitable templates based on BLAST (Camacho et al.) and HHblits (Remmert et al.). The resulting templates are ranked according to the expected quality of the resulting models (see the Template Ranking section for more details). Top-ranked templates and alignments are compared to verify whether they represent alternative conformational states or cover different regions of the target protein. In such cases, multiple templates are selected automatically and different models are built accordingly.

This mode is subject to continuous evaluation within the Continuous Automated Model Evaluation (CAMEO) platform (Haas et al.).

Please note that it is unnecessary to first run the automated mode by pressing "Build Model" and then restart the project with "Search for Templates" only. Both options start the same template search, and its results remain accessible in the first case once the models are built.


Alignment Mode

If the desired template for modelling is known and available in the SWISS-MODEL Template Library (SMTL), a target–template alignment in either FASTA or Clustal format may be used to start the modelling process, thereby skipping the template search.

The template sequence(s) should be named using the PDB ID format (e.g. “1CNR” or “1CNR_A”). The user is asked to select, from a drop-down list, which sequence in the alignment corresponds to the target and/or the template protein.

The Alignment mode allows the advanced user to invoke the modelling step starting from alternative alignments and to evaluate the quality of these alternative models.

>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-
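
As a rough illustration of the expected input, the following Python sketch reads a two-sequence FASTA alignment such as the one above and checks that all aligned rows have the same length. The function name `read_fasta_alignment` is hypothetical and not part of SWISS-MODEL; the server performs its own validation, which may differ.

```python
def read_fasta_alignment(text):
    """Parse FASTA-formatted alignment text into a list of (name, aligned_sequence).

    Hypothetical helper for illustration only; not part of SWISS-MODEL.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))

    # Rows of an alignment must have equal length (gaps included).
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return records


alignment = """>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-
"""
records = read_fasta_alignment(alignment)
```

Here `records` holds the target and template rows in input order, so either one can be picked by name afterwards.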



Project Mode

In difficult modelling situations, where the correct alignment between target and template cannot be clearly determined by sequence-based methods, visual inspection and manual manipulation of the alignment can help improve the quality of the resulting model significantly.

The program DeepView - Swiss-PdbViewer (Guex et al.) can be used to generate, display, analyse, and manipulate modelling project files in the SWISS-MODEL workspace. Project files contain the superposed template structures and the alignment between the target and the template. In this mode, the user has full control over essential modelling parameters, i.e. the choice of template structures, the correct alignment, and the placement of insertions and deletions in the context of the 3D structure. Project files can also be generated by the workspace template selection tools.

The program DeepView can be downloaded for free from the ExPASy web site. SWISS-MODEL supports DeepView legacy projects by relying on the previous version of the PROMOD modelling pipeline.


Ligand Modelling

Biologically relevant ligands and cofactors are modelled using a conservative homology transfer approach from the templates identified in the SMTL. Ligands in the SMTL are annotated as one of: (i) relevant, non-covalently bound ligands, (ii) covalent modifications, or (iii) non-functional binders (e.g. buffer or solvent). A non-covalently bound ligand is considered for the model if the coordinating residues are conserved in the target–template alignment. The relative coordinates of the ligand(s) are transferred from the template if the resulting atomic interactions in the model are within the expected range for van der Waals interactions and water-mediated contacts.
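
The conservation criterion can be pictured with a small Python sketch. It is only an approximation under stated assumptions: "conserved" is taken to mean "identical residue in the same alignment column", the coordinating residues are given as 0-based positions in the ungapped template sequence, and the function name is hypothetical. The actual pipeline may use a different definition and also applies the geometric checks described above.

```python
def coordinating_residues_conserved(target_aln, template_aln, template_positions):
    """Return True if every coordinating template residue is aligned to an
    identical target residue.

    template_positions: 0-based indices into the ungapped template sequence.
    Treating "conserved" as "identical" is an assumption made for this sketch.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    wanted = set(template_positions)
    seen = set()
    position = -1  # running index in the ungapped template sequence
    for target_res, template_res in zip(target_aln, template_aln):
        if template_res == "-":
            continue  # gap in the template: no template residue in this column
        position += 1
        if position in wanted:
            seen.add(position)
            if target_res != template_res:
                return False
    # Positions that do not exist in the template cannot be conserved.
    return seen == wanted
```

For example, with the THN_DENCL/1crnA alignment shown in the Alignment Mode section, template positions 2 and 3 (both cysteines) are aligned to cysteines in the target, whereas position 0 (T in the template, K in the target) is not conserved.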

If the homology transfer approach described above is too restrictive, the user needs to use an appropriate ligand docking tool. To simplify this process, a model can be sent directly to SwissDock (Grosdidier et al.) by choosing the appropriate entry in the drop-down menu next to the model preview image. The user can then use SwissDock to predict molecular interactions between the model and a ligand.


Oligomeric Modelling

In SWISS-MODEL, the quaternary structure annotation of the template is used to model the target sequence in its oligomeric form. The method (Bertoni et al.) is based on a supervised machine learning algorithm, Support Vector Machines (SVM), which combines interface conservation, structural clustering, and other template features to provide a quaternary structure quality estimate (QSQE). The QSQE score is a number between 0 and 1, reflecting the expected accuracy of the interchain contacts for a model built based on a given alignment and template. In general, a higher QSQE indicates a more reliable prediction, and a value above 0.7 suggests that the predicted quaternary structure can be followed with confidence in the modelling process. The QSQE complements the GMQE score, which estimates the accuracy of the tertiary structure of the resulting model.
QSQE is only computed if it is possible to build an oligomer and only for the top ranked templates.


The SWISS-MODEL Template Library (SMTL)

The SWISS-MODEL Template Library is a large structural database of experimentally determined protein structures derived from the Protein Data Bank (Berman et al.).

It serves as the main repository of structural information for the modelling pipeline. It provides atomic coordinates of protein structures and maintains sequence and profile databases that can be searched by BLAST and HHblits. Alignment-independent properties of the templates are precalculated and stored in the database, e.g. a mapping between residues resolved in the experiment and corresponding residues in the full protein sequence, predicted solvent accessibility and secondary structure information.

Individual entries of the SMTL can be inspected using the web interface. The sequence features are linked to a 3D structure viewer and can be interactively explored. SMTL IDs consist of the PDB ID, an integer representing the biounit and a capital letter for the chain ID. The SMTL chain ID is not necessarily the same as the PDB chain ID. The mapping is shown in "SMTL:PDB".
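
The three components of an SMTL ID can be separated with a short Python sketch. Note the assumptions: the text above does not state how the parts are joined, so the dot-separated layout (for example "1crn.1.A") is assumed here, and `SmtlId` and `parse_smtl_id` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class SmtlId(NamedTuple):
    pdb_id: str   # four-character PDB ID
    biounit: int  # integer representing the biounit
    chain: str    # capital letter for the SMTL chain ID


# Assumed layout: <PDB ID>.<biounit number>.<chain letter>, e.g. "1crn.1.A".
_SMTL_ID = re.compile(r"^([0-9][A-Za-z0-9]{3})\.([0-9]+)\.([A-Z])$")


def parse_smtl_id(text):
    """Split an SMTL ID into its PDB ID, biounit number and chain letter."""
    match = _SMTL_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised SMTL ID: {text!r}")
    pdb_id, biounit, chain = match.groups()
    return SmtlId(pdb_id.lower(), int(biounit), chain)
```

Because the SMTL chain ID may differ from the PDB chain ID, the parsed chain letter should not be used to look up chains in the original PDB entry without consulting the "SMTL:PDB" mapping.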

The web interface also includes an online annotation system for ligands contained in the experimental structures. Ligands can be marked as synthetic, natural or part of crystallisation buffer. This information is used by the modelling pipeline to determine whether a ligand is considered for inclusion into the final model.


Biological Assemblies (Biounit) of Templates

The biological assembly (biounit) describes the oligomeric state, or quaternary assembly, which is thought of as the biologically most relevant form of the molecule. For a detailed description see Biological Assemblies on PDB-101.

The biological assembly reported in the SMTL is retrieved from the PDB entry.

SMTL entries are organised (if more than one assembly is available) by likely quaternary structure assemblies, which are created according to the author and software-annotated oligomeric states listed in the PDB deposition. If not all chains of the asymmetric unit are included in any biounit of a PDB entry, the asymmetric unit is included as a template.


Input Data


Protein amino acid sequence or UniProtKB identifier

The amino acid sequence of the target protein can be submitted either as plain text, or in FASTA format.

- Example of plain text sequence:

MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

- Example of FASTA sequence:

>sp|P00321|FLAV_MEGEL Flavodoxin - Megasphaera elsdenii.
MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

If the protein sequence is deposited in the UniProtKB (The UniProt Consortium) database, the UniProtKB identifier of the entry can be provided as input (e.g. P00321). In this case, the identifier is immediately validated and replaced with the corresponding sequence.
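
The three accepted input forms can be told apart with a few lines of Python. This is a minimal sketch, not the server's own validation: the accession pattern is the one published in the UniProtKB documentation, the restriction to the 20 standard amino acid letters is an assumption, and `classify_target_input` is a hypothetical name.

```python
import re

# Accession number pattern as published in the UniProtKB documentation.
UNIPROT_AC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)

# Assumption: only the 20 standard one-letter codes are accepted.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def classify_target_input(text):
    """Return 'fasta', 'uniprot_ac', 'plain_sequence' or 'unknown'."""
    stripped = text.strip()
    if stripped.startswith(">"):
        return "fasta"
    if UNIPROT_AC.match(stripped):
        return "uniprot_ac"
    # Plain text may be wrapped over several lines; ignore all whitespace.
    letters = "".join(stripped.split()).upper()
    if letters and set(letters) <= AMINO_ACIDS:
        return "plain_sequence"
    return "unknown"
```

An accession always has a digit in its second position, so a plain amino acid sequence can never be mistaken for one.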

The "Add Hetero Target" button is provided to input multiple target sequences representing different subunits of a hetero-oligomer. The target sequences must be unique and can be submitted as plain text, FASTA sequences, or UniProtKB ACs. If a hetero-oligomer is requested, we only look for biounits of templates that contain connected chains with all desired subunits.


Target–template alignment

The following formats are currently supported: FASTA and Clustal.

- Examples for FASTA:

>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-

- Examples for Clustal:

CLUSTAL W (1.82) multiple sequence alignment
THN_DENCL       KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46
1crnA           TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN- 46
                .:***  ..*  :  **: * .. :**  :** **..: **  *   
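
For comparison with the FASTA form, the following Python sketch reads a Clustal alignment like the one above into a dictionary of aligned sequences. It is a simplified reader written for illustration (the name `read_clustal_alignment` is hypothetical): it relies on the header line starting with "CLUSTAL", treats lines beginning with whitespace as conservation lines, and ignores the optional residue count at the end of each row.

```python
def read_clustal_alignment(text):
    """Parse Clustal-formatted text into {sequence_name: aligned_sequence}.

    Simplified, hypothetical reader for illustration only.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("missing CLUSTAL header line")

    sequences = {}
    for line in lines[1:]:
        # Skip blank lines and conservation lines (which start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]  # fields[2], if present, is a residue count
        # Long alignments are split into blocks; concatenate chunks per name.
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


clustal_text = """CLUSTAL W (1.82) multiple sequence alignment
THN_DENCL       KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46
1crnA           TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN- 46
                .:***  ..*  :  **: * .. :**  :** **..: **  *
"""
sequences = read_clustal_alignment(clustal_text)
```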


User Template

If the user knows the structure of the template to use for modelling, the coordinates can be uploaded in PDB format(*) together with the target protein sequence.
Oligomeric templates are accepted, and it is also possible to build heteromers by adding multiple target sequences to the input.
To start a modelling job with your own template:

  1. Press the "User Template" button
  2. Enter the target sequence as normal.
    • Optional: to start a hetero project, click "Add Hetero Target" to add another target sequence
  3. Click the "Add Template File..." button
  4. Locate your coordinate file. Important: Make sure that there are no chemically modified amino acids!
  5. Click "Build Model"

(*) A PDB-like file containing the coordinates of the template structure. For more information about the PDB file format, please see the PDB file format documentation.

Please note that the mmCIF format is currently not supported.
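
Before uploading, a template file can be screened for possibly modified amino acids with a short Python sketch like the one below. It is a heuristic under stated assumptions, not the check the server applies: it flags residues listed in MODRES records and ATOM records whose residue name is not one of the 20 standard amino acids, using the fixed column positions of the PDB format. The function name is hypothetical, and nucleic acid residues in ATOM records would also be flagged.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_modified_residues(pdb_text):
    """Return sorted (chain, residue_number, residue_name) tuples for residues
    that look like chemically modified amino acids (heuristic only)."""
    flagged = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODRES" and len(line) >= 22:
            # MODRES: residue name in columns 13-15, chain in 17, number in 19-22.
            flagged.add((line[16], int(line[18:22]), line[12:15].strip()))
        elif record == "ATOM" and len(line) >= 26:
            # ATOM: residue name in columns 18-20, chain in 22, number in 23-26.
            res_name = line[17:20].strip()
            if res_name not in STANDARD_RESIDUES:
                flagged.add((line[21], int(line[22:26]), res_name))
    return sorted(flagged)


# Usage (hypothetical file name):
# with open("my_template.pdb") as handle:
#     print(find_modified_residues(handle.read()))
```

An empty result does not guarantee that the file will be accepted; it only means this particular heuristic found nothing.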


DeepView Project

Project files containing the superposed template structures, and the alignment between the target and template can be directly uploaded into the SWISS-MODEL Workspace. See the “Project Mode” section for further details.

An example of a DeepView project and its application in modelling oligomeric proteins can be found here.


Immunoglobulin sequence input

If an antibody sequence is present in the input, the user is presented with a notice as well as a link to a dedicated server for antibody modelling (PIGSPro). By clicking on the link, the user is redirected to the PIGSPro server home page where the input form is pre-filled with the detected antibody variable domains. The link is displayed while the SWISS-MODEL pipeline continues running and thus, the user still has the option to use SWISS-MODEL for the modelling.
An “Antibody detected” label is also shown on the different pages of the project.
Antibody sequences are identified by aligning the sequence against Hidden Markov Models developed specifically for immunoglobulins.


Display of modelling results

Coordinates of the model, the corresponding alignment and quality evaluations can be accessed and downloaded via web browser from the workspace.


Model details

This section displays the 3D structure of the models and their target–template sequence alignment, and allows the model coordinates to be downloaded. To assist inspection, many sequence features and scoring schemes are synchronised with the 3D molecular view.

The colouring of the alignment can be changed by clicking on the "Options" button (cog icon) and selecting the desired colouring scheme.

Model coordinates are available in two different formats:

If the model has been built using the Automated Mode, the information about the selected template(s) is provided with cross-references to structural databases via the link to the SWISS-MODEL Template Library.

By default, the final model is presented in colours based on the QMEAN model quality. This allows instant visualisation of regions of the model that are well or poorly modelled. Information about the oligomeric state, as well as bound ligands and cofactors, is provided. Alternatively, the user can view the results in a formatted report page, which presents all results in a readable form that can be copied and pasted into other documents. The user can download an archive file containing all the models and reports for the given target sequence.

[Figure: model results view, with numbered annotations (1) to (5) referred to in the sections below]


Model evaluation


GMQE

GMQE (Global Model Quality Estimation) is a quality estimate that combines properties from the target–template alignment and the template search method. The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template and the coverage of the target. Higher numbers indicate higher reliability. Once a model is built, the GMQE ((1) in the figure above) is updated for this specific case by also taking into account the QMEAN score of the obtained model, in order to increase the reliability of the quality estimate.


QMEAN

QMEAN (Benkert et al.) is a composite estimator based on different geometrical properties and provides both global (i.e. for the entire structure) and local (i.e. per residue) absolute quality estimates on the basis of one single model.

The QMEAN Z-score ((2) in the figure above) provides an estimate of the ‘degree of nativeness’ of the structural features observed in the model on a global scale. It indicates whether the QMEAN score of the model is comparable to what one would expect from experimental structures of similar size. QMEAN Z-scores around zero indicate good agreement between the model structure and experimental structures of similar size. Scores of -4.0 or below are an indication of models with low quality. This is also highlighted by a change of the "thumbs-up" symbol to a "thumbs-down" symbol next to the score.

QMEAN consists of four individual terms, which are also listed for the global QMEAN quality score ((3) in the figure above). The white area in the bar plots (numerical values close to zero) indicates that the property is similar to what one would expect from experimental structures of similar size. Positive values indicate that the model scores higher than experimental structures on average. Negative numbers indicate that the model scores lower than experimental structures on average. The QMEAN Z-score itself is shown on top. The individual Z-scores compare the interaction potential between Cβ atoms only, the interaction potential between all atoms, the solvation potential and the torsion angle potential. For details, please refer to the publication.

In addition to using the same terms as the global score, the QMEAN local scores gain accuracy from QMEANDisCo. QMEANDisCo assesses the consistency of observed interatomic distances in the model with ensemble information extracted from experimentally determined protein structures that are homologous to the target sequence. The “Local Quality” plot ((4) in the figure above) shows, for each residue of the model (reported on the x-axis), the expected similarity to the native structure (y-axis). Typically, residues showing a score below 0.6 are expected to be of low quality. Different model chains are shown in different colours. If the model is downloaded, the local score is reported in the B-factor column of the PDB file. The local quality can also be visualised by choosing the colour scheme "QMEAN".
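
Since the local score is stored in the B-factor column of the downloaded PDB file, low-scoring residues can be listed with a short script. The Python sketch below is an illustration under stated assumptions: the B-factor values are taken to be on the same 0 to 1 scale as the plot, the first atom of each residue is used as the residue's score, insertion codes are ignored, and the function name is hypothetical.

```python
def low_quality_residues(pdb_text, threshold=0.6):
    """Return (chain, residue_number, residue_name, score) for each residue
    whose local score (read from the B-factor column) is below threshold."""
    seen = set()
    flagged = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        # PDB columns: residue name 18-20, chain 22, residue number 23-26.
        key = (line[21], int(line[22:26]), line[17:20].strip())
        if key in seen:
            continue  # only the first atom of each residue is inspected
        seen.add(key)
        score = float(line[60:66])  # B-factor occupies columns 61-66
        if score < threshold:
            flagged.append(key + (score,))
    return flagged


# Usage (hypothetical file name):
# with open("model_01.pdb") as handle:
#     for chain, number, name, score in low_quality_residues(handle.read()):
#         print(chain, number, name, score)
```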

In the “Comparison” plot ((5) in the figure above), model quality scores of individual models are related to scores obtained for experimental structures of similar size. The x-axis shows protein length (number of residues). The y-axis is the normalised QMEAN score. Every dot represents one experimental protein structure. Black dots are experimental structures with a normalised QMEAN score within 1 standard deviation of the mean (|Z-score| between 0 and 1); experimental structures with a |Z-score| between 1 and 2 are grey. Experimental structures that are even further from the mean are light grey. The actual model is represented as a red star. The mean and standard deviation of the experimental structures around the x-location of the star are the basis for calculating the QMEAN Z-score of the model ((2) in the figure above), i.e. how many standard deviations the model's score lies from the mean.
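
The Z-score calculation described above amounts to a standard normalisation, sketched here in Python. The function names are hypothetical, the reference scores are assumed to have been preselected for structures of similar size, and the use of the sample standard deviation is an assumption; the server's exact procedure for choosing reference structures is not specified here.

```python
from statistics import mean, stdev


def qmean_z_score(model_score, reference_scores):
    """Number of standard deviations by which model_score deviates from the
    mean of reference_scores (scores of experimental structures of similar size)."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    return (model_score - mu) / sigma


def is_low_quality(z_score, cutoff=-4.0):
    """Scores of -4.0 or below indicate a model of low quality."""
    return z_score <= cutoff
```

A model scoring below the reference mean gets a negative Z-score, and a value near zero indicates agreement with experimental structures of similar size.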


Modelling report

The SWISS-MODEL Homology Modelling Report offers a summary of all models built in the project.

Note: The report is accessible (i) per model via a drop-down menu, next to the model in the Models view or (ii) for all models in report.html in the downloaded file when choosing to download the project by pressing the download button below the project title.

It is structured in the following sections:

  • Model building Report: Contains project name, project date and references. The target sequence is in Table T1 of the Report.
  • Results: Version of the SWISS-MODEL template library and PDB release. All identified templates are listed in Table T2.
  • Models: Models are listed sequentially with each entry showing a picture of the model, a link to the PDB file, the version of the modelling engine, the oligomeric state, the ligands (if any), the global model quality estimate, and the QMEAN score.
    A graphical representation of the QMEAN score and its four separate terms, the local quality estimate plot, and the comparison with a non-redundant set of PDB structures are also provided. For the template, a link to the template itself is provided together with the following information: the title of the structure, the target sequence coverage, the sequence identity to the target, the experimental method used to obtain the structure (and the resolution, if applicable), the oligomeric state, the ligands (if any), the sequence similarity to the target, and the template search method used.
  • Save Project Locally: Allows the project to be downloaded as a zip file.
    The main folder contains the model report (report.html), the images folder (banner for the report), and the model folder. Each model has its own subfolder.