Introduction to SWISS-MODEL Workspace

The SWISS-MODEL Workspace is a web-based integrated service dedicated to protein structure homology modelling. It assists and guides the user in building protein homology models at different levels of complexity.

Building a homology model comprises four main steps: identification of structural template(s), alignment of target sequence and template structure(s), model building, and model quality evaluation. These steps can be repeated until a satisfying modelling result is achieved. Each of the four steps requires specialized software and access to up-to-date protein sequence and structure databases.

Protein sequence and structure databases necessary for modelling are accessible from the workspace and are updated at regular intervals. Software tools for template selection, model building, and structure quality evaluation can be invoked from within the workspace.

A personal working environment (workspace), where several modelling projects can be carried out in parallel, is provided for each user.

This help file provides references and illustrates the use of the individual tools available from within the SWISS-MODEL Workspace.

Workspace

The SWISS-MODEL Workspace provides a personal web-based area for each user in which protein homology models can be built and the results of completed modelling projects are stored and visualized.

The workspace displays a list of the current Modelling Projects and their status. Once a job has completed, the final results are available; otherwise the status is shown as running (the job is running and programs are calculating) or failed (something went wrong during the process).

The results are stored for two weeks on the server. The remaining time before deletion of a Project is also displayed. The user can decide to either delete a Modelling Project or to prolong its life.

Modelling

Three different types of modelling requests (automated mode, alignment mode, project mode) are provided, which differ in the amount of user intervention.

Modelling requests are computed by the SWISS-MODEL server homology modelling pipeline (Biasini et al.) for the top-ranking templates using ProMod3.

ProMod3 is an in-house comparative modelling engine based on OpenStructure. Loop modelling uses a database approach to find appropriate candidates for the loop modelling problem at hand. The loop candidates are then adapted to their environment using CCD (cyclic coordinate descent), and a final candidate is selected using statistical potentials of mean force. The sidechain modelling is inspired by SCWRL4 and uses the 2010 backbone-dependent rotamer library from the Dunbrack lab. A final energy minimization is performed using the OpenMM molecular mechanics library.

Automated Mode

This submission requires only the amino acid sequence or the UniProtKB accession code of the target protein as input data. The pipeline will automatically identify suitable templates based on BLAST (Altschul et al.) and HHblits (Remmert et al.). The quality of each template is estimated from its properties, and the templates are then ranked according to their estimated quality.

Depending on the planned model application, it can be necessary to select a different structural template than the ones found in the automated template identification process. To specify the structure to be used as modelling template please upload a file in PDB format (*) with coordinates of the template structure. Please make sure that this file contains only a single protein chain, and does not contain chemically modified amino acids, hetero atoms, ligands, etc.

(*) A simple PDB-like file containing the coordinates of the template structure. For more information about PDB file format please see link: http://www.wwpdb.org/docs.html - mmCIF upload is not currently supported.

Alignment Mode

A multiple sequence alignment can be used as the starting point if the three-dimensional structure is known for at least one of the sequences.
The "alignment mode" allows the user to test several alternative alignments and evaluate the quality of the resulting models in order to achieve an optimal result.

1. Prepare a multiple sequence alignment.

2. Submit your alignment to the SWISS-MODEL Alignment Mode.

CLUSTAL W (1.82) multiple sequence alignment
THN_DENCL       KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46
THNX_TEST       KSCCPDTTGRDIYNTCRFGGGSRQVCARISGCKIISASTCPS-YPNK 46
1crnA           TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN- 46
                .:***  ..*  :  **: * .. :**  :** **..: **  *   

DeepView Project Mode

In difficult modelling situations, where the correct alignment between target and template cannot be clearly determined by sequence-based methods, visual inspection and manual manipulation of the alignment can significantly help improve the quality of the resulting model.

Project files contain the superposed template structures, and the alignment between the target and template. Project files can be generated inside the program DeepView (Swiss-PdbViewer, Guex et al.), by the SWISS-MODEL template identification tools, and are also one of the output formats of the modelling pipeline. This allows analysing and iteratively improving the models generated by the "Automated mode" and "Alignment mode" modelling approaches.

The program DeepView can be downloaded freely from the ExPASy web site.

Oligomeric structure prediction

In SWISS-MODEL, the quaternary structure annotation of the template is used to model the target sequence in its oligomeric form. The model is built based on the quaternary form of the template structure, if conservation of the oligomeric state can be assumed with high confidence as described in (Biasini et al.). Currently homo-oligomeric assemblies are computed by the "Automated Mode" of the SWISS-MODEL server. Hetero-oligomeric assemblies can be computed using the beta-oligo-version of the SWISS-MODEL server.
Alternatively "DeepView Project Mode" can be used to build the oligomeric structure of the target protein (see below).

DeepView Project Mode and its application in modelling of oligomeric proteins

Example: Modelling a dimeric protein

In order to demonstrate how to use DeepView to build oligomeric assemblies, we are going to build a model of the protease of murine leukemia virus based on the PDB file 3S43. (Please keep in mind that this is just an example to illustrate the workflow; presumably there would be much better templates available.)

  1. Get the template in the correct quaternary state
    First, check the correct biological assembly of your template protein. Copies of the asymmetric unit of the PDB file can be generated by applying the correct crystallographic symmetry operators. If you are unsure how to do this, the PDB will most likely have the correctly assembled coordinate file ("Biological Assembly") for you. In our example, it is the Biological Assembly coordinate file for 3S43.
    Download and save the template coordinates as a PDB file to your local disk.
  2. Remove all non-amino acid residues
    Open the file in DeepView and remove all non-amino acid groups such as ions, ligands, OXT, etc. from the template (unless they are at the very end of the file). You can do this by selecting the groups in the control panel of DeepView and then removing the selected residues ("Build" menu) [3s43_dimer.pdb].
  3. Ensure unique chain IDs
    Make sure each chain has a unique name, e.g. "A","B", etc. Colouring the molecule by chain helps to check.
  4. Target sequence
    In our example, we will model the protease domain of murine leukemia virus (UniProt AC: P03356). As you can see, the virus-encoded polyprotein consists of several domains. Before modelling, it makes things easier to focus on the segment of interest. You may use e.g. the IprScan utility to identify the individual domains. In our case, we will use residues 3-100.
    Create a FASTA file with your target sequences for each chain in the SAME order as in the template, i.e. "A", then "B", etc., separated by semicolons. [target.txt]

    >TARGET
    QGQEPPPEPRITLTVGGQPVTFLVDTGAQH
    SVLTQNPGPLSDRSAWVQGATGGKRYRWTT
    DRKVHLATGKVTHSFLHVPDCPYPLLGRDL
    LTKLKAQI
    ;
    QGQEPPPEPRITLTVGGQPVTFLVDTGAQH
    SVLTQNPGPLSDRSAWVQGATGGKRYRWTT
    DRKVHLATGKVTHSFLHVPDCPYPLLGRDL
    LTKLKAQI


  5. Adjust target-template alignment in DeepView
    - Load the FASTA file containing the target sequence into DeepView (Menu: Swissmodel -> Load raw sequence ...).
    - Open the template file with the correct biological assembly (Menu: File -> Open PDB File).
    - Generate a preliminary target-template alignment (Menu: Fit -> Fit raw sequence).
    - Open the alignment window and adjust the alignment. Make sure NOT to align residues of different chains. Do not align to "non amino acid residues" like HET groups or OXT. Make sure all insertions & deletions are correctly positioned in the structural context.



  6. SWISS-MODEL submission
    Save the project to your local disk [e.g. dimer_3s43_project.pdb] and submit the file to the project mode of the SWISS-MODEL workspace for model building. A new workunit will be created, containing the modelling results, including the log file, quality evaluation, and model project file of the modelled dimer.
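The target file described in step 4 can also be written by a small script. The following is a minimal sketch in Python, not part of SWISS-MODEL or DeepView; the function name oligomer_target_fasta and its parameters are hypothetical, and the layout (one header line, chains separated by a line containing a single semicolon) simply mirrors the example above.

```python
def oligomer_target_fasta(chain_sequences, header="TARGET", width=30):
    """Build FASTA-like text with one sequence per chain, separated by ';' lines.

    chain_sequences must be given in the same order as the chains in the
    template (chain "A" first, then "B", and so on).
    """
    blocks = []
    for sequence in chain_sequences:
        # Wrap each chain sequence to a fixed line width.
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        blocks.append("\n".join(lines))
    return ">" + header + "\n" + "\n;\n".join(blocks) + "\n"


# Protease segment (residues 3-100) taken from the example above.
protease = (
    "QGQEPPPEPRITLTVGGQPVTFLVDTGAQH"
    "SVLTQNPGPLSDRSAWVQGATGGKRYRWTT"
    "DRKVHLATGKVTHSFLHVPDCPYPLLGRDL"
    "LTKLKAQI"
)

# A homodimer: the same sequence once for chain A and once for chain B.
target_text = oligomer_target_fasta([protease, protease])
print(target_text)
```

For a hetero-oligomer, the list would hold a different sequence for each chain, still in template chain order.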

Model of the dimeric protease.

 

Input target sequence and UniProt AC code

The amino acid sequence of a protein to be modelled or analysed can be submitted in FASTA or raw format. If the protein sequence is deposited in the UniProt (Bairoch et al.) knowledgebase, the AC (ACcession number) of the entry can be specified instead.

Examples:

- Raw format: the amino acid sequence of the protein in plain text:

MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

- FASTA format consists of a single-line description, followed by lines of sequence data. The first character of the description line is a greater-than (">") symbol:

>sp|P00321|FLAV_MEGEL Flavodoxin - Megasphaera elsdenii.
MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

- UniProt Accession number: P00321
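The three input types above can be told apart with a few simple checks. The sketch below is illustrative only and does not describe how the SWISS-MODEL server itself validates input; the function name classify_input is hypothetical, and the accession pattern is the regular expression published in the UniProt documentation, reproduced here on the assumption that it is still current.

```python
import re

# Accession pattern as published by UniProt (assumed to be up to date).
UNIPROT_AC = re.compile(
    r"^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def classify_input(text):
    """Return 'fasta', 'uniprot_ac', 'raw' or 'unknown' for a submitted string."""
    text = text.strip()
    if text.startswith(">"):
        return "fasta"
    if UNIPROT_AC.match(text):
        return "uniprot_ac"
    # Raw format: only one-letter amino acid codes, possibly split over lines.
    residues = "".join(text.split()).upper()
    if residues and set(residues) <= AMINO_ACIDS:
        return "raw"
    return "unknown"


print(classify_input("P00321"))
print(classify_input(">sp|P00321|FLAV_MEGEL Flavodoxin\nMVEIVYWSGTGNTEAMANEIEAAVKAAGADV"))
print(classify_input("MVEIVYWSGTGNTEAMANEIEAAVKAAGADV\nESVRFEDTNVDDVASKDVILLGCPAMGSE"))
```

Because accession numbers always contain digits and raw sequences never do, the two cases cannot be confused by these checks.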

Supported alignment formats

The following formats are currently supported: FASTA and CLUSTALW.

Examples:

Fasta:

>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>THNX_TEST
KSCCPDTTGRDIYNTCRFGGGSRQVCARISGCKIISASTCPS-YPNK
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-

Clustal:

CLUSTAL W (1.82) multiple sequence alignment
THN_DENCL       KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46
THNX_TEST       KSCCPDTTGRDIYNTCRFGGGSRQVCARISGCKIISASTCPS-YPNK 46
1crnA           TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN- 46
                .:***  ..*  :  **: * .. :**  :** **..: **  *   
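The two formats carry the same information, so an alignment can be converted from one to the other before submission. The following sketch, with the hypothetical helper names parse_clustal and to_fasta, reads a CLUSTAL W alignment such as the one above and writes it as an aligned FASTA file. It assumes that sequence names contain no spaces and that the conservation line starts with whitespace.

```python
def parse_clustal(text):
    """Return {name: aligned_sequence} from CLUSTAL W formatted text."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and the conservation line.
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        name, fragment = fields[0], fields[1]  # a trailing residue count is ignored
        sequences[name] = sequences.get(name, "") + fragment
    return sequences


def to_fasta(sequences):
    """Write aligned sequences in FASTA format, checking that the lengths agree."""
    if len({len(seq) for seq in sequences.values()}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "".join(">" + name + "\n" + seq + "\n" for name, seq in sequences.items())


clustal_text = "\n".join([
    "CLUSTAL W (1.82) multiple sequence alignment",
    "THN_DENCL       KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46",
    "THNX_TEST       KSCCPDTTGRDIYNTCRFGGGSRQVCARISGCKIISASTCPS-YPNK 46",
    "1crnA           TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN- 46",
    "                .:***  ..*  :  **: * .. :**  :** **..: **  *   ",
])

alignment = parse_clustal(clustal_text)
print(to_fasta(alignment))
```

Fragments belonging to the same name are concatenated, so alignments split over several blocks are handled as well.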

User specified template:

Depending on the planned model application, it can be necessary to select a different structural template than the ones found in the automated template identification process. To specify the structure to be used as modelling template please upload a file in PDB format (*) with coordinates of the template structure. Please make sure that this file contains only a single protein chain, and does not contain chemically modified amino acids, hetero atoms, ligands, etc.

(*) A simple PDB-like file containing the coordinates of the template structure. For more information about PDB file format please see link: http://www.wwpdb.org/docs.html - mmCIF is not currently supported.
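Before uploading a template, it can be useful to check the file against the requirements above. The following is a rough sketch, not the check performed by the server: the function name check_template_pdb is hypothetical, and it relies only on the fixed-column layout of the PDB format (record name in columns 1-6, residue name in columns 18-20, chain identifier in column 22). It assumes that modified residues, ligands and water appear as HETATM records.

```python
def check_template_pdb(pdb_text):
    """Return a list of problems found; an empty list means none were detected."""
    chains = set()
    hetero_residues = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM":
            chains.add(line[21:22])                   # chain identifier, column 22
        elif record == "HETATM":
            hetero_residues.add(line[17:20].strip())  # residue name, columns 18-20
    problems = []
    if len(chains) > 1:
        problems.append("multiple chains: " + ", ".join(sorted(chains)))
    if hetero_residues:
        problems.append("hetero groups: " + ", ".join(sorted(hetero_residues)))
    return problems


# Made-up coordinate lines, for illustration only.
clean_lines = [
    "ATOM      1  N   THR A   1      17.047  14.099   3.625  1.00 13.79           N",
    "ATOM      2  CA  THR A   1      16.967  12.784   4.338  1.00 10.80           C",
]
extra_lines = [
    "ATOM      3  N   THR B   1      27.047  24.099  13.625  1.00 13.79           N",
    "HETATM  328  O   HOH A 101      10.000  10.000  10.000  1.00 20.00           O",
]

print(check_template_pdb("\n".join(clean_lines)))
print(check_template_pdb("\n".join(clean_lines + extra_lines)))
```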

DeepView Project file

The program DeepView (Swiss-PdbViewer, Guex et al.) can be used to generate, display, analyse and manipulate modelling project files for the SWISS-MODEL workspace.

Project files contain the superposed template structures, and the alignment between the target and template. The user therefore has full control over essential modelling parameters, i.e. the choice of template structures, the correct alignment of residues, and the placement of insertions and deletions in the context of the three-dimensional structure.

Project files can be generated inside DeepView, by the SWISS-MODEL template identification tools, and are also one of the supported output formats of the modelling pipeline. This allows analysing and iteratively improving the output of the different modelling tools.

DeepView allows the user to visualize the model and the templates, and to analyse certain structural features, e.g. Ramachandran plots or electrostatic properties. Moreover, it allows manual adjustment of the placement of insertions and deletions in the alignment on which the initial modelling process was based. The project with the modified alignment can then be re-submitted to the SWISS-MODEL workspace for model building.

DeepView can be downloaded at: http://www.expasy.org/spdbv/

DeepView does not require administrator privileges for installation. E.g. under MS Windows, simply uncompress the distributed archive at any location you like (e.g. c:\spdbv or on your desktop) and start working by starting the spdbv.exe application.

Template identification (search for templates)

The degree of difficulty in identifying a suitable template for a target sequence can range from "trivial" for well-characterized protein families to "impossible" for proteins with an unknown fold. The SWISS-MODEL server provides access to a set of increasingly complex and computationally demanding methods to search for templates.

Display of template identification results

The page serves both as an overview of available templates and as an interactive template selection tool. The top part of the screen contains a summary of the top-ranking templates identified by the template search methods. Three types of views are available: (i) a Templates summary table, listing all templates in tabular form and providing an overview of relevant attributes of each template, (ii) an interactive chart showing the templates in relation to each other in Sequence Similarity space and (iii) the sequence Alignment of Selected Templates. Templates for a subsequent modelling step can be selected in any of the three views. Since the selections are synchronized between the views, selecting a template in the table automatically selects it in the two similarity plots and vice versa. Additionally, selected templates are superposed in the 3D viewer of choice to allow instant visualization of structural differences between templates.

In the detailed Templates summary table (i), the template annotations and target-template alignment can be retrieved by clicking on the small arrows at the end of the template lines to expand the box with the description of the individual templates.

Target-template sequence similarity is calculated from the BLOSUM62 substitution matrix. The values in the substitution matrix have been normalized, such that the largest value in the BLOSUM62 matrix gets a 1, the smallest a zero. The sequence similarity of the alignment is then the sum of the substitution score divided by the number of aligned residue pairs. Gaps are not taken into account. Sequence similarity is more robust than sequence identity when measuring evolutionary relationship between two sequences.
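The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the server's implementation: the function name normalized_similarity is hypothetical, the BLOSUM62 range is taken to be -4 to 11, and the small matrix below holds only a handful of entries (believed to match BLOSUM62) so that the example stays self-contained. A real calculation would use the full matrix.

```python
def normalized_similarity(target, template, matrix, lowest=-4, highest=11):
    """Average normalized substitution score over aligned residue pairs.

    target and template are aligned sequences of equal length; '-' marks a gap.
    lowest and highest are the smallest and largest values of the full
    substitution matrix (assumed here to be -4 and 11 for BLOSUM62).
    """
    total = 0.0
    pairs = 0
    for a, b in zip(target, template):
        if a == "-" or b == "-":
            continue  # gaps are not taken into account
        score = matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]
        total += (score - lowest) / (highest - lowest)  # map onto 0..1
        pairs += 1
    return total / pairs if pairs else 0.0


# A few substitution scores only; values should be checked against a full matrix.
toy_matrix = {
    ("A", "A"): 4,
    ("S", "S"): 4,
    ("W", "W"): 11,
    ("A", "S"): 1,
}

# Three aligned pairs (A/A, W/W, S/A) and one gap column.
similarity = normalized_similarity("AWS-", "AWAS", toy_matrix)
print(round(similarity, 3))
```

In this example the three pairs contribute 8/15, 15/15 and 5/15, and the gap column is skipped, so the result is their mean.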

In the Sequence Similarity chart (ii), each template is shown as a circle. The distance between two templates in the plot reflects the sequence identity between them, so that similar sequences cluster together. Perform a "sweep select" of one of the clusters to analyse its templates structurally.

In the Alignment of Selected Templates view (iii), alignments can be obtained as a DeepView project file. The latter allows the user to visualize the different alignments in the structural context of the template, to correct misplaced insertions and deletions, and to manually adjust misaligned regions. The modified project can then be saved to disk and submitted as "project mode" to the workspace for model building by the SWISS-MODEL pipeline.

Template Results

To change the colouring of the alignment please click on the "Options" button (cog icon) and select the preferred colouring scheme.

Colour Schemes

Score Schemes

SOA (Solvent Accessibility)    Solvent Accessibility
b-Factor                       <10 < <15 < <20 < <25 < <30 < <35 < 40    Low disorder -> High disorder
b-Factor Range                 Low disorder -> High disorder
Entropy                        Low Entropy -> High Entropy; High Conservation -> Low Conservation

Model Schemes

QMEAN     Low Quality -> High Quality
Indels    Highlights insertions/deletions in model, e.g.
          MODEL    XXXXXXXXXXX---XXX
          TEMPLATE XXXXXXXXXXX----XX

Alignment Index Schemes

Chain      Cycle of 5 colours
Rainbow    N-terminus -> C-terminus

Residue Schemes

Hydrophobic    RKDENQHPYWSTGAMCFLVI    Least hydrophobic -> Most hydrophobic
Size           GASPVTCLINDKQEMHFRYW    Smallest -> Largest
Charged        DK
Polar          RKDENQ
Proline        P
Ser/Thr        ST
Cysteine       C
Aliphatic      ILV
Aromatic       FYWH

Clustal Scheme

Rules are specified in this way: (A,C,D): {50%, p,q,rstv}{85%, w,y} The column residue is given first in the round brackets; more than one may be specified, in which case the rules apply to each of these residues. Next, the rule or rules are given in curly braces; only one rule has to be met for the colour to be applied. The minimum percentage is given first, followed by the residue or residues which must meet or exceed this percentage within the column. If a group of residues is concatenated together, such as 'rstv', then any combination of these residues in total must meet or exceed the given percentage for the colour to be applied. For residues or residue groups separated by commas, at least one of these must by itself exceed the percentage.

(WLVIMF): {50%, p}{60%, wlvimafcyhp}
(A): {50%, p}{60%, wlvimafcyhp}{85%, t,s,g}
(C): {50%, p}{60%, wlvimafcyhp}{85%, s}
(KR): {60%, kr}{85%, q}
(T): {50%, ts}{60%, wlvimafcyhp}
(S): {50%, ts}{80%, wlvimafcyhp}
(N): {50%, n}{85%, d}
(Q): {50%, qe}{60%, kr}
(C): {85%, c}
(D): {50%, de,n}
(E): {50%, de,qe}
(G): {always}
(HY): {50%, p}{60%, wlvimafcyhp}
(P): {always}
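To make the rule syntax concrete, the sketch below parses the part in curly braces and tests it against one alignment column. It is an illustration only; the function names parse_rules and colour_applies are hypothetical. Two assumptions are made: percentages are computed over all rows of the column, gaps included, and a threshold counts as met when the percentage is equal to or greater than it. The list of column residues in round brackets is not handled here; it only determines which residues in the column receive the colour.

```python
from collections import Counter


def parse_rules(spec):
    """Parse '{50%, p,q,rstv}{85%, w,y}' into [(0.5, ['p', 'q', 'rstv']), (0.85, ['w', 'y'])].

    The special rule '{always}' is returned as (0.0, None).
    """
    rules = []
    for chunk in spec.strip().strip("{}").split("}{"):
        if chunk.strip() == "always":
            rules.append((0.0, None))
            continue
        percent, _, groups = chunk.partition(",")
        threshold = float(percent.strip().rstrip("%")) / 100.0
        rules.append((threshold, [g.strip() for g in groups.split(",")]))
    return rules


def colour_applies(column, rules):
    """Return True if at least one rule is met by the residues in the column."""
    counts = Counter(column.lower())
    total = len(column)  # assumption: gaps count towards the column size
    for threshold, groups in rules:
        if groups is None:  # '{always}'
            return True
        for group in groups:
            # Residues concatenated in one group are counted together.
            if sum(counts[residue] for residue in group) / total >= threshold:
                return True
    return False


rules = parse_rules("{50%, p,q,rstv}{85%, w,y}")
print(colour_applies("RSTVAAAA", rules))  # r+s+t+v make up 4 of 8 residues
print(colour_applies("WAAAAAAA", rules))  # no group reaches its threshold
```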

Display of modelling results

Coordinates of the model, the underlying alignment, log files, and quality evaluations can be accessed and downloaded via web-browser from the workspace.

Model details

This section provides access to the model display, the model-template sequence alignment, and the download of the model coordinates. To assist interpretation, many sequence features and scoring schemes are synchronised with the 3D molecular view.
To change the colouring of the alignment please click on the "Options" button (cog icon) and select the preferred colouring scheme.

The model coordinates are available in two different formats:

If the model has been built using the Automated Mode, information about the template(s) used for modelling is provided with cross references to structural information databases via the link to the SWISS-MODEL Template Library.

The final model is initially presented coloured by the model quality assigned by QMEAN. This allows instant visualisation of regions of the model that are well or poorly modelled. Information about the oligomeric state and about included ligands and cofactors is provided. Alternatively, the user can view the results in a formatted report page, which presents all results in a readable form that can be copied and pasted into other documents. The user can also download an archive file containing all the models and reports for the given target sequence.


QMEAN

QMEAN (Benkert et al.) is a composite scoring function for the estimation of the global and local model quality. QMEAN consists of four structural descriptors: The local geometry is analysed by a torsion angle potential over three consecutive amino acids. Two pairwise distance-dependent potentials are used to assess all-atom and C-beta interactions. A solvation potential describes the burial status of the residues.

The pseudo energies returned from the four structural descriptors and the final QMEAN score are directly related to what we would expect from high resolution X-ray structures of similar size using a Z-score scheme. The score of a model is also shown in relation to a set of high-resolution PDB structures (Z-score). The plot relates the obtained global QMEAN value to scores calculated from a set of high-resolution X-ray structures.
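The Z-score idea can be illustrated with a short calculation. The sketch below is generic and does not reproduce the QMEAN implementation: the function name z_score is hypothetical, the reference values are invented for illustration, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def z_score(model_score, reference_scores):
    """Number of standard deviations between a score and the reference mean."""
    return (model_score - mean(reference_scores)) / pstdev(reference_scores)


# Invented scores standing in for high-resolution structures of similar size.
reference = [0.70, 0.75, 0.80, 0.85, 0.90]

z = z_score(0.66, reference)
print(round(z, 2))
```

A negative value indicates that the model scores lower than the reference structures do on average; values close to zero indicate agreement with what is expected from experimental structures.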

Local estimates of the model quality based on the QMEAN scoring function are shown as a per-residue plot. Each residue is assigned a reliability score between 0 and 1, describing the expected similarity to the native structure. Higher numbers indicate higher reliability of the residues.

GMQE

GMQE (Global Model Quality Estimation) is a quality estimation which combines properties from the target-template alignment. The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template. Higher numbers indicate higher reliability. Once a model is built, the GMQE gets updated for this specific case by also taking into account the QMEAN score of the obtained model in order to increase reliability of the quality estimation.

SWISS-MODEL Template Library (SMTL)

The SWISS-MODEL template library is a large structural database of experimentally determined protein structures derived from the Protein Data Bank (Westbrook et al.).

It serves as the main repository of structural information for the modelling pipeline. It provides atomic coordinates of protein structures and maintains sequence and profile databases which can be searched by BLAST and HHblits. Alignment-independent properties of the templates are precalculated and stored in the database, e.g. a mapping between the residues resolved in the experiment and the corresponding residues in the full protein sequence, or predicted solvent accessibility and secondary structure information.

Individual entries of the SMTL can be viewed in the web-interface. The sequence features are linked to a 3D structure viewer and can be interactively explored.

The web interface also includes an online annotation system for ligands contained in the experimental structures. Ligands can be marked as synthetic, natural or part of crystallization buffer. This information is then used by the modelling pipeline to determine whether a ligand is to be included in a model.
