Introduction

The SWISS-MODEL Workspace is a web-based integrated service dedicated to protein structure homology modelling. It assists and guides the user in building protein homology models at different levels of complexity. Protein sequence and structure databases necessary for modelling are accessible from the workspace and are updated at regular intervals. Software tools for template selection, model building, and structure quality evaluation can be invoked from within the workspace. A personal working environment (workspace), where several modelling projects can be carried out in parallel, is provided for each user.

The following tutorial aims to facilitate the first steps of working with SWISS-MODEL Workspace. Please let us know if you would like to see other features explained in this tutorial (help-swissmodel@unibas.ch).

How do I work with SWISS-MODEL Workspace?
How can I build a homology model using SWISS-MODEL Workspace?
How can I assess protein structure model quality?
1. How do I work with SWISS-MODEL Workspace? => How can I create an account?

The SWISS-MODEL Workspace provides a personal web-based area for each user in which protein homology models can be built and the results of completed modelling projects are stored and visualized. It is not necessary to create an account; you may continue to use SWISS-MODEL as before by just providing an email address in the submission form, or by bookmarking the submission window. However, you will then not be able to manage your projects inside the Workspace, and we therefore strongly recommend creating your own account.
1. How do I work with SWISS-MODEL Workspace? => How can I manage my projects?

In the workspace, a list of the current modelling work units is displayed, including the work unit type, a title provided by the user, and the status of the work unit.
The current status of a work unit is indicated by a graphical symbol: submitted (the job has been submitted to the queuing system and is waiting for execution), running (job is currently running and programs are calculating), finished (job has been completed, results are available) or failed/stopped (something went wrong during the process).
Beware: Work units are kept on the server for one week before they are deleted automatically. You may postpone deletion by one week by pressing the green "refresh arrow". Please download the modelling results to your local system within this timeframe. Each user can store a maximum of 25 work units simultaneously.
2. My protein is quite large, and I would like to identify individual domains I could model separately.

Many proteins are modular and made up of several structurally distinct domains, which often reflect evolutionary relationships and may correspond to units of molecular function. The sensitivity and performance of profile-based template search methods can often be improved when the template search is performed on individual domains rather than on the whole target sequence. The member databases of InterPro (Mulder et al.) allow for both the identification of protein domains and the assignment of protein function. Using the InterPro Domain Scan (IprScan, Zdobnov et al.), protein domains and functional sites can be assigned to regions of a target sequence.

See: [Tools] [ Secondary Structure Prediction and Domain Assignment ]
Let's use the example of Collagen alpha 3(VI) chain (UniProt accession code: P12111) to identify individual domains in the target sequence. The result looks like this:
The location of the individual domains is provided in tabular form below the graphics. Links to the motif definition in InterPro are provided.

InterPro Scan has finished. Here are the results:

IPR002035: von Willebrand factor, type A, Domain
PF00092: 39 - 213 VWA
PF00092: 242 - 415 VWA
PF00092: 445 - 620 VWA
PF00092: 639 - 812 VWA
PF00092: 837 - 1009 VWA
PF00092: 1029 - 1201 VWA
PF00092: 1233 - 1404 VWA
PF00092: 1436 - 1609 VWA
PF00092: 1639 - 1812 VWA
PF00092: 2402 - 2581 VWA
PF00092: 2619 - 2810 VWA
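The domain boundaries reported above can be turned into separate modelling targets with a few lines of code. The following is only a minimal sketch: it assumes that the positions are 1-based and inclusive, and the helper names `parse_hits` and `extract_domain` are hypothetical, not part of the Workspace.

```python
import re

# Pfam hits copied from the InterPro Scan result listed above.
IPRSCAN_HITS = """\
PF00092: 39 - 213 VWA
PF00092: 242 - 415 VWA
PF00092: 445 - 620 VWA
PF00092: 639 - 812 VWA
PF00092: 837 - 1009 VWA
PF00092: 1029 - 1201 VWA
PF00092: 1233 - 1404 VWA
PF00092: 1436 - 1609 VWA
PF00092: 1639 - 1812 VWA
PF00092: 2402 - 2581 VWA
PF00092: 2619 - 2810 VWA
"""

# One hit per line: accession, start, end, short name.
HIT_PATTERN = re.compile(r"^(\S+):\s*(\d+)\s*-\s*(\d+)\s+(\S+)$")


def parse_hits(text):
    """Return (accession, start, end, name) tuples for every hit line."""
    hits = []
    for line in text.splitlines():
        match = HIT_PATTERN.match(line.strip())
        if match:
            accession, start, end, name = match.groups()
            hits.append((accession, int(start), int(end), name))
    return hits


def extract_domain(sequence, start, end):
    """Cut residues start..end (assumed 1-based, inclusive) out of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain boundaries lie outside the sequence")
    return sequence[start - 1:end]


hits = parse_hits(IPRSCAN_HITS)
for accession, start, end, name in hits:
    print(f"{name} ({accession}): residues {start}-{end}, {end - start + 1} aa")
```

Applied to the full target sequence (for example the UniProt entry P12111 downloaded in FASTA format), `extract_domain` would yield one sub-sequence per domain, each of which could then be used for a separate template search.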
I have no idea about possible templates for my target, and I want to identify possible template structures.

The degree of difficulty in identifying a suitable template for a target sequence can range from "trivial" for well-characterized protein families to "impossible" for proteins with an unknown fold. The SWISS-MODEL Workspace provides access to a set of increasingly complex and computationally demanding methods to search for templates within the SWISS-MODEL Template Library (SMTL).

SwissModel Template Library (ExPDB)

[Tools] [ SwissModel Template Library ]

You may query whether a certain PDB entry is part of the SMTL. In this example, we search for chains of PDB entry "1HIV". The SMTL provides information about the experimental methods used for structure determination, the resolution (if applicable), and links to the original PDB entry as well as to the protein structure classification by SCOP and CATH.

Caveat: A significant fraction of proteins are multimeric in their biologically active state. Single chains or raw PDB entries often do not represent the biologically correct assembly. The PQS Protein Quaternary Structure Server (Henrick et al.) allows searching the list of likely quaternary structures generated at the EBI. In our example, HIV-1 protease is known to be active as a dimer. Multimeric proteins can be modelled in SWISS-MODEL Workspace using the Project Mode.
The target sequence can be used to query the SMTL for suitable template structures using "Template Identification" in the Tools menu:

[Tools] [ Template Identification ]

A condensed graphical view of the modelling task is provided, containing the target sequence and the template matches, sorted and coloured according to the associated E-value. Clickable bars indicate the matched regions and guide the user to the underlying original program output.
[Display Alignment in DeepView]

Target-template alignments from the search tools (BLAST or SAM) can be visualized in DeepView to correct misplaced insertions and deletions in the structural context of the template, and to manually adjust misaligned regions. The modified project can then be saved to disk and submitted as "project mode" to the workspace for model building by the SWISS-MODEL pipeline.
How do I use the fully automatic mode of SWISS-MODEL Workspace?

The "automated mode" is suited for cases where the target-template similarity is sufficiently high to allow for fully automated modelling. As a rule of thumb, automated sequence alignments are sufficiently reliable when target and template share more than 50% sequence identity. This submission requires only the amino acid sequence (FASTA format or single-letter raw sequence) or the UniProt accession code of the target protein as input data. The modelling pipeline automatically selects suitable templates based on a BLAST E-value limit (Altschul et al.), which can be adjusted upon submission. The automated template selection will favour high-resolution template structures with reasonable stereochemical properties as assessed by the ANOLEA mean force potential (Melo et al.) and the Gromos96 force field energy (van Gunsteren et al.).

Example: Modelling the catalytic domain of Cyclodextrin glucanotransferase from Bacillus stearothermophilus (UniProt AC code: Q9ZAQ0).

[ Modelling ] [ Automated Mode ]

Note: Work units will be automatically deleted from the server after one week. When the modelling project is finished, please download the results and save them locally.
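The 50% rule of thumb can be checked on any pairwise target-template alignment before choosing the automated mode. The sketch below is an illustration only: it assumes that identity is counted over the columns where both sequences have a residue (other definitions of the denominator exist), and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percentage of identical residues over all columns without a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # columns with a gap are not counted (assumption)
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


def suits_automated_mode(identity, threshold=50.0):
    """Apply the rule of thumb: more than 50% sequence identity."""
    return identity > threshold


# Hypothetical toy alignment, not a real target-template pair.
identity = percent_identity("ACDE-F", "ACDQGF")
print(f"{identity:.1f}% identity, automated mode suitable: "
      f"{suits_automated_mode(identity)}")
```

For alignments below the threshold, the alignment and project modes described in the following sections give more control over the target-template alignment.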
Alignment Mode

Multiple sequence alignments are a common tool in many molecular biology projects. If the three-dimensional structure is known for at least one of the members, this alignment can be used as a starting point for comparative modelling using the "alignment mode". In order to facilitate the use of alignments in different formats, the submission is implemented as a stepwise procedure:

1. Prepare a multiple sequence alignment.
2. Submit your alignment to the Workspace Alignment Mode.
3. Select Target and Template.
4. Check Alignment and Submit.
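For step 1, an alignment produced by any alignment program can be written out as aligned FASTA before it is submitted. The following is a minimal sketch under stated assumptions: the function name, the sequence names and the sequences are hypothetical, and gaps are written as "-".

```python
def format_fasta_alignment(aligned_sequences, width=60):
    """Format a {name: gapped sequence} mapping as aligned FASTA text."""
    lengths = {len(seq) for seq in aligned_sequences.values()}
    if len(lengths) != 1:
        # Rows of an alignment must all have the same number of columns.
        raise ValueError("all aligned sequences must have the same length")
    lines = []
    for name, seq in aligned_sequences.items():
        lines.append(f">{name}")
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


# Hypothetical two-sequence alignment of a target and a template.
alignment = {
    "target": "MK-LVAG",
    "template": "MKALV-G",
}
print(format_fasta_alignment(alignment), end="")
```

Checking that all rows have the same length before submission catches the most common formatting problem, a row that lost or gained gap characters during manual editing.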
Supported Alignment Formats

The following formats are currently supported: FASTA, MSF, CLUSTALW, PFAM and SELEX.
clustal:
msf:
How do I use the Project Mode of SWISS-MODEL Workspace?

Main application: Visual inspection of alignments; modelling of oligomeric proteins.

In difficult modelling situations, where the correct alignment between target and template cannot be clearly determined by sequence-based methods, visual inspection and manual manipulation of the alignment can significantly help to improve the quality of the resulting model. Project files containing the superposed template structures and the alignment between the target and the template can be generated using the program DeepView (Swiss-PdbViewer, Guex et al.). The user therefore has full control over essential modelling parameters, i.e. the choice of template structures, the correct alignment of residues, and the placement of insertions and deletions in the context of the three-dimensional structure.

The program DeepView can be downloaded freely from the ExPASy web site. DeepView does not require administrator privileges for installation. For example, under MS Windows, simply uncompress the distributed archive at any location you like (e.g. c:\spdbv or on your desktop) and start the spdbv.exe application. Tutorials, manuals and a discussion group for DeepView can be found on the DeepView web site.

Example: Modelling a dimeric protein

In order to demonstrate oligomer modelling, we are going to build a model of the protease of murine leukemia virus based on the structure of Nelfinavir-resistant HIV-1 protease (D30N/N88D) in complex with Darunavir [3HVP]. (Please keep in mind that this is just an example to illustrate the workflow. Using this template will most likely not make much scientific sense in most cases.)
[ Modelling ] [ Project Mode ]
Model of the dimeric protease.

What accuracy can I expect for a model built by the automated mode of SWISS-MODEL?

Evaluation of template structure and model quality is a crucial step in homology modelling. The reliability of different protein modelling methods can be assessed by evaluating the results of blind predictions after the corresponding protein structures have been determined experimentally. The overall performance of the SWISS-MODEL pipeline is evaluated by the EVA project. SWISS-MODEL was the first comparative modelling server to join the EVA project in May 2000, and has been continuously evaluated since then. As of summer 2005, EVA-CM is based on the assessment of 261 weekly releases of the PDB database, resulting in 48098 protein models for 19698 protein target chains from five different prediction servers, including 18314 from SWISS-MODEL. All models generated by the SWISS-MODEL server, evaluation results, score definitions and detailed statistics are available from the EVA project website.

The C-alpha RMSD after global superimposition of the model and the experimental target structure was computed and plotted against the percentage of sequence identity between target and best template, to give an estimate of the overall accuracy of the different modelling servers at different levels of target-template sequence identity.
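The two accuracy measures used in this evaluation, the C-alpha RMSD and the percentage of equivalent C-alpha positions within 3.5 Angstroms, can be sketched as follows. This is an illustration, not the EVA implementation: it assumes that the model has already been superimposed on the target and that the two coordinate lists are paired residue by residue; the function names and coordinates are hypothetical.

```python
import math


def ca_distances(model_ca, target_ca):
    """Distances between paired C-alpha atoms of superimposed structures."""
    if len(model_ca) != len(target_ca):
        raise ValueError("coordinate lists must be paired residue by residue")
    return [math.dist(a, b) for a, b in zip(model_ca, target_ca)]


def rmsd(distances):
    """Root mean square deviation over all paired atoms."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def percent_equivalent(distances, cutoff=3.5):
    """Percentage of C-alpha positions lying within the cutoff (Angstroms)."""
    return 100.0 * sum(d <= cutoff for d in distances) / len(distances)


# Hypothetical coordinates in Angstroms; the last residue is displaced by 4 A.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 4.0)]

distances = ca_distances(model, target)
print(f"RMSD: {rmsd(distances):.2f} A")
print(f"Equivalent positions: {percent_equivalent(distances):.0f}%")
```

The example also shows why the two numbers complement each other: a single badly placed residue raises the RMSD noticeably, while the percentage of equivalent positions still reports that most of the chain is placed correctly.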
In general, major differences between the individual prediction methods are only observed for target-template pairs sharing sequence identities of less than 40%, where methods favouring higher coverage of the target sequences are more likely to generate models with a higher RMSD. As expected, model RMSD increases with decreasing alignment accuracy, as defined by the percentage of equivalent C-alpha positions (within 3.5 Angstroms) between the optimally superimposed target and model structures.

How can I assess a structure or model with empirical force-field and Mean Force Potential methods?

Evaluation of model quality is a crucial step in homology modelling. While the performance of the automated SWISS-MODEL (Schwede et al.) pipeline in general is continuously evaluated by the EVA project (Koh et al.), the quality of individual models can vary significantly.

Anolea:
The atomic empirical mean force potential ANOLEA (Melo et al.) is used to assess the packing quality of the models. The program performs energy calculations on a protein chain, evaluating the "Non-Local Environment" (NLE) of each heavy atom in the molecule. The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent a favourable energy environment, whereas positive values (in red) represent an unfavourable energy environment for a given amino acid.

Gromos:
The y-axis of the plot represents the GROMOS (van Gunsteren et al.) empirical force field energy for each amino acid of the protein chain. Negative energy values (in green) represent a favourable energy environment, whereas positive values (in red) represent an unfavourable energy environment for a given amino acid.
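The sign convention of both plots (negative is favourable, positive is unfavourable) can also be applied to a list of per-residue energies in order to locate problematic regions of a model. The sketch below is an illustration only: the energy values are made up, and the function name and the `min_length` parameter are hypothetical choices, not part of ANOLEA or GROMOS.

```python
def unfavourable_stretches(energies, min_length=3):
    """Return (start, end) residue ranges (1-based, inclusive) in which at
    least min_length consecutive residues have a positive energy."""
    stretches = []
    run_start = None
    for index, energy in enumerate(energies, start=1):
        if energy > 0:
            if run_start is None:
                run_start = index  # a new unfavourable run begins here
        else:
            if run_start is not None and index - run_start >= min_length:
                stretches.append((run_start, index - 1))
            run_start = None
    # Close a run that extends to the last residue.
    if run_start is not None and len(energies) - run_start + 1 >= min_length:
        stretches.append((run_start, len(energies)))
    return stretches


# Made-up per-residue energies for an 11-residue chain.
energies = [-1.2, -0.5, 0.8, 1.1, 0.3, -0.7, 0.4, -0.2, 2.0, 1.5, 0.9]
print(unfavourable_stretches(energies))
```

Isolated positive values are ignored here on the assumption that longer unfavourable stretches are more informative than single residues; setting `min_length=1` reports every residue with a positive energy.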
How can I assess the geometrical accuracy of a structure or model?

Evaluation of model quality is a crucial step in homology modelling. While the performance of the automated SWISS-MODEL (Schwede et al.) pipeline in general is continuously evaluated by the EVA project (Koh et al.), the quality of individual models can vary significantly.

Procheck

The PROCHECK suite of programs (Laskowski et al.) assesses the "stereochemical quality" of a given protein structure. The aim of PROCHECK is to assess how normal, or conversely how unusual, the geometry of the residues in a given protein structure is, as compared with stereochemical parameters derived from well-refined, high-resolution structures.

What Check

Example outputs: