Introduction to SWISS-MODEL

SWISS-MODEL is a web-based integrated service dedicated to protein structure homology modelling. It guides the user in building protein homology models at different levels of complexity.

Building a homology model comprises four main steps: (i) identification of structural template(s), (ii) alignment of target sequence and template structure(s), (iii) model-building, and (iv) model quality evaluation. These steps require specialised software and integrate up-to-date protein sequence and structure databases. Each of the above steps can be repeated interactively until a satisfying modelling result is achieved.

The SWISS-MODEL Workspace

The SWISS-MODEL Workspace (Waterhouse et al.) is a personal web-based working environment, where several modelling projects can be carried out in parallel. Protein sequence and structure databases necessary for modelling are accessible from the workspace and are updated at regular intervals. Tools for template selection, model building, and structure quality evaluation can be invoked from within the workspace directly or via the web page menu.

From the workspace, the user accesses the current modelling projects, inspects their status and visualises the results upon job completion. Project names can be changed retroactively by clicking on the symbol next to the project title. Alternatively, the project title can also be changed by double-clicking on the title when the project results are displayed. By default, projects are stored for two weeks on the server with an option to extend the project lifetime. The remaining time until a given project is deleted from the server is displayed accordingly.

If you have built a model which you would like to maintain indefinitely and the model will be cited in a journal, you may consider depositing your model at the ModelArchive where it will receive a DOI once the journal citation is available.

Model Building

Models are computed by the SWISS-MODEL server homology modelling pipeline (Waterhouse et al.) which relies on ProMod3 (Studer et al.), an in-house comparative modelling engine based on OpenStructure (Biasini et al.).

ProMod3 extracts initial structural information from the template structure. Insertions and deletions, as defined by the sequence alignment, are resolved by first searching for viable candidates in a structural database. Final candidates are then selected using statistical potentials of mean force scoring methods. If no candidates can be found, a conformational space search is performed using Monte Carlo techniques. Non-conserved side chains are modelled using an in-house backbone-dependent rotamer library. The optimal configuration of rotamers is estimated using the graph-based TreePack algorithm (Xu et al.) by minimising the SCWRL4 energy function (Krivov et al.). As a final step, small structural distortions, unfavourable interactions or clashes introduced during the modelling process are resolved by energy minimisation. ProMod3 uses the OpenMM library (Eastman et al.) to perform the computations and the CHARMM22/CMAP force field (Mackerell et al.) for parameterisation.

Modelling Modes

Depending on the difficulty of the modelling task, three different types of modelling modes are provided, which differ in the amount of user intervention: automated mode, alignment mode, and project mode.

Automated Mode

The Automated Mode only requires the amino acid sequence or the UniProtKB accession code of the target protein as input.

The automatic pipeline identifies suitable templates based on BLAST (Camacho et al.) and HHblits (Steinegger et al.). The resulting templates are ranked according to the expected quality of the resulting models (see the Template Ranking section for more details). Top-ranked templates and alignments are compared to verify whether they represent alternative conformational states or cover different regions of the target protein. If so, multiple templates are selected automatically and different models are built accordingly.

This mode is subject to continuous evaluation within the Continuous Automated Model Evaluation (CAMEO) platform (Haas et al.).

Please note that there is no need to run the automated mode with "Build Model" and then start the project again with "Search for Templates". Both options start the same template search, and its results are also accessible after "Build Model" once the models are built.

Alignment Mode

If the desired template for modelling is known and available in the SWISS-MODEL Template Library (SMTL), a target–template alignment in either FASTA or Clustal format may be used to start the modelling process, thereby skipping the template search.

The template sequence(s) should be named using the PDB ID format (e.g. “1CNR” or “1CNR_A”). The user will be asked to specify which sequence in the alignment corresponds to the target and/or the template protein from a drop-down list.

The Alignment mode allows the advanced user to invoke the modelling step starting from alternative alignments and to evaluate the quality of these alternative models.

>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-

It is possible to edit your alignment further in the input window by clicking on the edit icon to the left of the validated input alignment. This starts edit mode, and a cursor appears in the first row of the alignment. Use the arrow keys to move the cursor, then press the spacebar to insert a gap or the Del key to delete a character. The sequence identity of the new alignment is displayed, and non-identical residues in a column are faded to light grey. Use Ctrl-Z to undo any editing, or click the reset button to go back to the start. Click the edit icon again to exit alignment editing mode.

Project Mode

In difficult modelling situations, where the correct alignment between target and template cannot be clearly determined by sequence-based methods, visual inspection and manual manipulation of the alignment can help improve the quality of the resulting model significantly.

The program DeepView - Swiss-PdbViewer (Guex et al.) can be used to generate, display, analyse, and manipulate modelling project files in the SWISS-MODEL workspace. Project files contain the superposed template structures and the alignment between the target and the template. In this mode, the user has full control over essential modelling parameters, i.e. the choice of template structures, the correct alignment, and the placement of insertions and deletions in the context of the 3D structure. Project files can also be generated by the workspace template selection tools.

Download the program from the DeepView website. SWISS-MODEL supports DeepView legacy projects by relying on the previous version of the PROMOD modelling pipeline.

Ligand Modelling

Biologically relevant ligands and cofactors are modelled using a conservative homology transfer approach from the templates identified in the SMTL. Ligands in the SMTL are annotated either as: (i) relevant, non-covalently bound ligand, (ii) covalent modifications, or (iii) non-functional binders (e.g. buffer or solvent). A non-covalently bound ligand is considered for the model if it has at least three coordinating residues in the protein and those residues are conserved in the target–template alignment. The relative coordinates of the ligand(s) are transferred from the template, if the resulting atomic interactions in the model are within the expected range for van der Waals interactions and water-mediated contacts.
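The conservation criterion for non-covalently bound ligands can be sketched as follows. This is only an illustration and not the pipeline's code: the function name, the input layout (alignment column indices of the coordinating template residues plus the two aligned sequences) and the reading of "conserved" as an identical residue in the same alignment column are all assumptions. The subsequent geometric check on atomic interactions is not covered.

```python
def ligand_passes_conservation(coordinating_columns, target_aln, template_aln,
                               min_coordinating=3):
    """Return True if a ligand would pass the conservation criterion.

    coordinating_columns: 0-based alignment column indices of the template
    residues that coordinate the ligand (hypothetical input layout).
    """
    # The ligand needs at least three coordinating residues in the protein.
    if len(coordinating_columns) < min_coordinating:
        return False
    # Assumption: "conserved" means the target carries the identical residue
    # in the same alignment column, and the column is not a gap.
    return all(
        target_aln[i] == template_aln[i] and target_aln[i] != "-"
        for i in coordinating_columns
    )
```

A ligand with only two coordinating residues, or with one coordinating residue mutated in the target, would be rejected by this sketch.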

In PDB format, the ligands are all stored in a separate chain named '_' with different residue numbers distinguishing different ligands.

Protein-ligand interactions

When ligands are present in the model, non-covalent protein-ligand interactions are annotated with PLIP (Salentin et al.). Seven types of interactions are covered: hydrogen bonds, hydrophobic contacts, pi-stacking, pi-cation interactions, salt bridges, water bridges and halogen bonds.

Oligomeric Modelling

In SWISS-MODEL, the quaternary structure annotation of the template is used to model the target sequence in its oligomeric form. The method (Bertoni et al.) is based on a supervised machine learning algorithm, Support Vector Machines (SVM), which combines interface conservation, structural clustering, and other template features to provide a quaternary structure quality estimate (QSQE). The QSQE score is a number between 0 and 1, reflecting the expected accuracy of the interchain contacts for a model built based on a given alignment and template. In general, a higher QSQE is better, and with a value above 0.7 the predicted quaternary structure can be considered reliable enough to follow in the modelling process. This complements the GMQE score, which estimates the accuracy of the tertiary structure of the resulting model. QSQE is only computed if it is possible to build an oligomer, and only for the top-ranked templates.

The SWISS-MODEL Template Library (SMTL)

The SWISS-MODEL Template Library is a large structural database of experimentally determined protein structures derived from the Protein Data Bank (Berman et al.).

It serves as the main repository of structural information for the modelling pipeline: it provides atomic coordinates of protein structures and maintains sequence and profile databases which can be searched by BLAST and HHblits. Alignment-independent properties of the templates are precalculated and stored in the database, e.g. a mapping between residues resolved in the experiment and corresponding residues in the full protein sequence, predicted solvent accessibility and secondary structure information.

Individual entries of the SMTL can be inspected using the web interface. The sequence features are linked to a 3D structure viewer and can be interactively explored. SMTL IDs consist of the PDB ID, an integer representing the biounit and a capital letter for the chain ID. The SMTL chain ID is not necessarily the same as the PDB chain ID. The mapping is shown in "SMTL:PDB".
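An SMTL ID can be split into its three parts with a few lines of Python. This is a minimal sketch: the dot-separated layout is an assumption inferred from the file name "3l9y.1.A.pdb" used in the Modelling API examples of this page, and the helper name is hypothetical.

```python
import re

# Assumed layout: <PDB ID>.<biounit number>.<chain letter>, e.g. "3l9y.1.A".
# Classic PDB IDs are four characters long and start with a digit.
_SMTL_ID = re.compile(r"^([0-9][A-Za-z0-9]{3})\.(\d+)\.([A-Z])$")


def parse_smtl_id(smtl_id):
    """Return (pdb_id, biounit, chain) or raise ValueError."""
    match = _SMTL_ID.match(smtl_id)
    if match is None:
        raise ValueError(f"not a recognised SMTL ID: {smtl_id!r}")
    pdb_id, biounit, chain = match.groups()
    return pdb_id.lower(), int(biounit), chain


print(parse_smtl_id("3l9y.1.A"))
```

Keep in mind that the returned chain letter is the SMTL chain ID, which may differ from the PDB chain ID.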

Ligands can be marked as synthetic, natural or part of crystallisation buffer. This information is used by the modelling pipeline to determine whether a ligand is considered for inclusion into the final model.

Biological Assemblies (Biounit) of Templates

The biological assembly (biounit) describes the oligomeric state, or quaternary assembly, which is thought of as the biologically most relevant form of the molecule. For a detailed description see Biological Assemblies on PDB-101.

The biological assembly reported in the SMTL is retrieved from the PDB entry.

SMTL entries are organised (if more than one assembly is available) by likely quaternary structure assemblies which are created according to the author and software-annotated oligomeric states listed in the PDB deposition. If not all chains of the asymmetric unit are included in any biounit of a PDB entry, the asymmetric unit is included as a template.

Input Data

Protein amino acid sequence or UniProtKB identifier

The amino acid sequence of the target protein can be submitted either as plain text, or in FASTA format.

Example of plain text sequence:

MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

Example of FASTA sequence:

>sp|P00321|FLAV_MEGEL Flavodoxin - Megasphaera elsdenii.
MVEIVYWSGTGNTEAMANEIEAAVKAAGADVESVRFEDTNVDDVASKDVILLGCPAMGSE
ELEDSVVEPFFTDLAPKLKGKKVGLFGSYGWGSGEWMDAWKQRTEDTGATVIGTAIVNEM
PDNAPECKELGEAAAKA

If the protein sequence is deposited in the UniProtKB (The UniProt Consortium) database, the UniProtKB identifier of the entry can be provided as input (e.g. P00321). In this case, the identifier is immediately validated and replaced with the corresponding sequence.

The "Add Hetero Target" button is provided to input multiple target sequences representing different subunits of a hetero-oligomer. The target sequences must be unique and can be submitted as plain text, FASTA sequences, or UniProtKB ACs. If a hetero-oligomer is requested, we only look for biounits of templates that contain connected chains with all desired subunits.

Target–template alignment

The following formats are currently supported: FASTA and Clustal.

Example for FASTA:

>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-
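Before uploading, it can be useful to check locally that both rows of a FASTA alignment have the same length and to get a rough sequence identity. The sketch below is an assumption-laden helper, not SWISS-MODEL code: the function names are hypothetical, and identity is computed over columns where both sequences have a residue, which may differ from the value displayed by the server.

```python
def read_fasta_alignment(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:].strip(), ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def sequence_identity(seq_a, seq_b):
    """Fraction of identical residues over columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    return sum(a == b for a, b in aligned) / len(aligned)


alignment = """>THN_DENCL
KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH-
>1crnA
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN-
"""
(target_name, target), (template_name, template) = read_fasta_alignment(alignment)
print(target_name, template_name, round(sequence_identity(target, template), 2))
```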

Example for Clustal:

CLUSTAL W (1.82) multiple sequence alignment

THN_DENCL  KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH- 46
1crnA      TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN- 46
           .:*** ..* : **: * .. :** :** **..: ** *

User Template

If the user knows the structure of the template to use for modelling, the coordinates can be uploaded in PDB format(*) together with the target protein sequence. Oligomeric templates are accepted, and it is also possible to build heteromers by adding multiple target sequences to the input. To start a modelling job with your own template:

  1. Press the "User Template" button
  2. Enter the target sequence as normal
    • Optional: to start a hetero project, you can now click "Add Hetero Target" to add another target sequence
  3. Click the "Add Template File..." button
  4. Click "Build Model"

Important: Make sure that there are no chemically modified amino acids!

If the file is not accepted, you may first try removing non-standard residues (HETATM records).

(*) A PDB-like file containing the coordinates of the template structure. For more information about PDB file format please see this link.

Please note that the mmCIF format is currently not supported.

DeepView Project

Project files containing the superposed template structures and the alignment between the target and template can be directly uploaded into the SWISS-MODEL Workspace. See the “Project Mode” section for further details. An example of a DeepView project and its application in modelling oligomeric proteins can be found here.

Display of modelling results

Coordinates of the model, the corresponding alignment and quality evaluations can be accessed and downloaded via web browser from the workspace.

Model details

This section displays the 3D structure of models and their target–template sequence alignment, and allows the model coordinates to be downloaded. To assist interpretation, many sequence features and scoring schemes are synchronised with the 3D molecular view.

The colouring of the alignment can be changed by clicking on the "Options" button (cog icon) and selecting the desired colouring scheme.

Model coordinates are available in two different formats:

If the model has been built using the Automated Mode, the information about the selected template(s) is provided with cross-references to structural databases via the link to the SWISS-MODEL Template Library.

By default, the final model is presented in colours based on the QMEAN model quality. This allows instant visualisation of regions of the model that are well or poorly modelled. Information about the oligomeric state, as well as bound ligands and cofactors, is provided. Alternatively, the user can choose to see the results in a well-formatted report page which shows all the results in a readable format that can be copied and pasted into other documents. The user can also download an archive file containing all the models and reports for the given target sequence.

There are very rare cases where modelling fails because the template structure does not contain enough backbone atoms in the aligned region (we need at least N-CA-C to be available and we skip D-peptides). In such a case, we do not return any model structure.

Currently, there are three very rare cases where major modelling issues appear. These issues are displayed with a prominent warning sign and a potentially sub-optimal model is displayed. The models may locally still be valid and useful. We suggest looking carefully at the local QMEAN scores to judge the model. The three issues are:

  • If the target–template alignment contains very large deletions mixed with small aligned patches, we may return an incomplete model as we are unable to cleanly resolve the deletion. Apart from the unresolved deletion, these could still be high quality models.
  • A "ring punch" is defined by a bond passing through the carbon ring of another amino acid (His, Pro, Phe, Tyr, Trp). This is an unfortunate and unpredictable effect of the final energy minimisation in the modelling process. Apart from the residues involved in the "ring punch", these could still be high quality models.
  • If a very bad template structure is used (so far we have only seen this with user-uploaded structures), the energy minimisation may fail. This is usually caused by coordinates of different atoms occupying almost the same position. In such a case, we return the model without any energy minimisation applied to it. Unless the failure was caused by a local problem in the template structure, this is expected to lead to very low quality models.

Future versions of ProMod3 may resolve the issues above.

[Figure: model]

Model evaluation

Global model evaluation

GMQE and QMEANDisCo global give an overall model quality measurement between 0 and 1, with higher numbers indicating higher expected quality. GMQE is coverage dependent, i.e. a model covering only half of the target sequence is unlikely to get a score above 0.5. QMEANDisCo on the other hand evaluates the model 'as is' without explicit coverage dependency.

GMQE (Global Model Quality Estimate) is a quality estimate which combines properties from the target–template alignment and the template structure. These properties are combined using a multilayer perceptron trained to predict the lDDT score of the resulting model. The GMQE is available before an actual model is built and is thus helpful in selecting optimal templates for the modelling problem at hand. Once a model is built, the GMQE ((1) in the figure above) is updated for this specific case by also taking into account the QMEANDisCo global score of the obtained model in order to increase the reliability of the quality estimate. If a template structure originates from AlphaFold DB, GMQE is instead a heuristic that sums the per-residue plDDT values of the aligned template residues and normalises by the target sequence length. Once such a model is built, the GMQE is updated and represents the summed per-residue quality estimates normalised by the target sequence length. Again, the per-residue quality estimates are derived with an AlphaFold DB specific heuristic.
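The AlphaFold DB heuristic described above amounts to a sum divided by the target length, which also makes the coverage dependence of GMQE easy to see. The sketch below is illustrative only: the function name and input layout are hypothetical, and dividing plDDT (range 0-100) by 100 to obtain a value between 0 and 1 is an assumption about how the scales are reconciled.

```python
def afdb_gmqe_estimate(template_plddt, aligned_template_positions, target_length):
    """Sum plDDT over aligned template residues, normalised by target length.

    template_plddt: per-residue plDDT values of the template (0-100).
    aligned_template_positions: 0-based indices of template residues that are
    aligned to a target residue (hypothetical input layout).
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    total = sum(template_plddt[i] for i in aligned_template_positions)
    # Assumption: plDDT is rescaled from 0-100 to 0-1 before normalising.
    return total / 100.0 / target_length


# A template covering only half of a 4-residue target cannot exceed 0.5.
print(afdb_gmqe_estimate([90.0, 80.0], [0, 1], target_length=4))
```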

QMEANDisCo global score (Studer et al., (2) in the figure above) is the average per-residue QMEANDisCo score (see below) which has been found to correlate well with the lDDT score (Mariani et al.). The provided error estimate is based on QMEANDisCo global scores estimated for a large set of models and represents the root mean squared difference (i.e. standard deviation) between QMEANDisCo global score and lDDT (the ground truth). As the reliability of the prediction depends on model size, the provided error estimate is calculated based on models of similar size to the input. The QMEANDisCo global score is not computed for models that use AlphaFold DB templates.

QMEAN Z-score analysis (Benkert et al.) is deprecated and the GMQE and QMEANDisCo global scores should be consulted for global model quality estimates instead. It is based on 4 statistical potentials of mean force and their linear combination: the "QMEAN" score. All scores, 5 in total, are compared with what one would expect from experimentally determined structures of similar size using Z-scores ((4) in the figure above). In other words: how many standard deviations from the mean is my model score, given a score distribution from a large set of experimentally determined structures? Z-scores around 0.0 therefore reflect a "native-like" structure and, as a rule of thumb, a "QMEAN" Z-score below -4.0 indicates a model with low quality. This is illustrated by the "Comparison" plot ((5) in the figure above). The x-axis shows protein length (number of residues). The y-axis is the "QMEAN" score. Every dot represents one experimental protein structure. Black dots are experimental structures with a "QMEAN" score within 1 standard deviation of the mean (|Z-score| between 0 and 1); experimental structures with a |Z-score| between 1 and 2 are grey. Experimental structures that are even further from the mean are light grey. The actual model is represented as a red star. The QMEAN Z-score analysis is not computed for models that use AlphaFold DB templates.
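The Z-score idea and the shading of the "Comparison" plot can be written down in a few lines. This is a generic sketch, not the QMEAN implementation: the function names are hypothetical, the reference scores are made up for the example, and the use of the population standard deviation and of inclusive boundaries at 1 and 2 are assumptions.

```python
from statistics import mean, pstdev


def z_score(model_score, reference_scores):
    """Number of standard deviations between a score and the reference mean."""
    return (model_score - mean(reference_scores)) / pstdev(reference_scores)


def comparison_shade(z):
    """Shade used for an experimental structure in the Comparison plot."""
    if abs(z) <= 1:
        return "black"
    if abs(z) <= 2:
        return "grey"
    return "light grey"


def is_low_quality(qmean_z):
    """Rule of thumb from the text: a QMEAN Z-score below -4.0 is low quality."""
    return qmean_z < -4.0


reference = [1, 2, 3, 4, 5]  # made-up scores of similar-sized structures
z = z_score(5, reference)
print(round(z, 2), comparison_shade(z), is_low_quality(z))
```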

Local model evaluation

Per residue scores are estimated with the QMEANDisCo scoring function (Studer et al.). QMEANDisCo is a composite score for single model quality estimation. It employs single model scores suitable for assessing individual models, extended with a consensus component by additionally leveraging information from experimentally determined protein structures that are homologous to the model being assessed. The "Local Quality" plot ((3) in the figure above) shows, for each residue of the model (reported on the x-axis), the expected similarity to the native structure (y-axis). Typically, residues showing a score below 0.6 are expected to be of low quality. Different model chains are shown in different colours. If the model is downloaded, the local score is reported in the B-factor column of the PDB file. The local quality can also be visualised by choosing the colour scheme "Confidence". If the model is built using an AlphaFold DB template, per-residue scores are transferred plDDT values from the underlying template. The ProMod3 modelling engine resolves insertions/deletions by remodelling stretches that may be longer than the ones defined in the alignment. The assigned local qualities in these stretches linearly decrease from the anchors as a function of distance.
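Because the local score is written to the B-factor column, it can be recovered from a downloaded PDB file without any special tooling. The following sketch relies on the fixed-width PDB ATOM record layout (temperature factor in columns 61-66); the function names are hypothetical, and it assumes the values in the file are on the same 0-1 scale as the 0.6 rule of thumb above.

```python
def local_scores_from_pdb(pdb_text):
    """Return {(chain, residue_number): score} read from the B-factor column.

    Only the CA atom of each residue is used, since all atoms of a residue
    are expected to carry the same per-residue value.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # ATOM records only; atom name is in columns 13-16.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]                   # column 22
        residue_number = int(line[22:26])  # columns 23-26
        scores[(chain, residue_number)] = float(line[60:66])  # columns 61-66
    return scores


def low_quality_residues(scores, threshold=0.6):
    """Residues whose local score is below the threshold, sorted by position."""
    return sorted(key for key, value in scores.items() if value < threshold)
```

A typical use would be `low_quality_residues(local_scores_from_pdb(open("model.pdb").read()))`, where "model.pdb" stands for whatever file name the model was saved under.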

Modelling report

The SWISS-MODEL Homology Modelling Report offers a summary of all models built in the project.

Note: The report is accessible (i) per model via a drop-down menu, next to the model in the Models view or (ii) for all models in report.html in the downloaded file when choosing to download the project by pressing the download button below the project title.

It is structured in the following sections:

  • Model building Report: Contains project name, project date and references. The target sequence is in Table T1 of the Report.
  • Results: Version of the SWISS-MODEL template library and PDB release. All identified templates are listed in Table T2.
  • Models: Models are listed sequentially with each entry showing a picture of the model, a link to the PDB file, the version of the modelling engine, the oligomeric state, the ligands (if any), the global model quality estimate, and the QMEAN score.
    A graphical representation of the QMEAN score and its four terms separately, the local quality estimate plot, and the comparison with non-redundant set of PDB structures are also provided. For the template, a link to the template itself is provided together with the following information: the title of the structure, the target sequence coverage, the sequence identity to the target, the experimental method used to obtain the structure (and the resolution, if applicable), the oligomeric state, the ligands (if any), the sequence similarity to the target, the template search method used.
  • Save Project Locally: Allows the project to be downloaded as a zip file.
    The main folder contains the Model report (report.html), images folder (banner for the Report), and the model folder. Each model has its own subfolder.

Model Annotations

To add annotations, click the small pen icon below the 3D view of the model to open the input textarea, in which you can freely paste or type and amend your input.

To get started with a new annotation you can also click directly on a residue in the 3D view, or select a region of residues in the target-template alignments.

MOTIF=EHFG[DL]+ST
    Find a sequence of residues to annotate. Regular expressions are allowed.
MOTIF=(EHFG)(.+)(ST)
    Tip: to identify a region defined by flanking residues, describe the leading, target and trailing residues using round brackets. The second (middle) group will be the annotated region.
TARGET=1
    For heteromeric protein models, specify the target sequence (1-based index).
CHAIN=B
    Be aware that chain names depend on the template the model was based on. For heteromeric models, check the chain names per target in the expanded model-template alignment.
START=35
    Starting residue number, 1-based. Can be combined with END and MOTIF to define the annotated region.
END=50
    Ending residue number, 1-based. Can be combined with START and MOTIF to define the annotated region.
COLOR=olive
COLOR=#808000
COLOR=rgb(128,128,0)
    Accepted colour formats are common name, hex code or rgb. If the input colour cannot be parsed, the colour will be 'white'.
LABEL=default
    The default label is residue name, chain name and residue number, e.g. "ALA C45". Every residue in the region will have a label.
LABEL="this is the second helical region"
    For non-default labels with spaces, double quotes must be used. The label will appear on the central residue of the annotated region.
LABEL_COLOR=olive
LABEL_COLOR=#808000
LABEL_COLOR=rgb(128,128,0)
    Colour of the label.
LABEL_SCALE=1.5
    Scale the label against the default size for the viewer.
SIDECHAINS=on
    Sidechains will be shown in ball-and-stick representation.
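To preview which residues a MOTIF expression would select, the same pattern can be tried locally. The sketch below uses Python's `re` module purely as an illustration: the regular expression engine used by the SWISS-MODEL viewer may behave differently, and the helper name and example sequence are made up.

```python
import re


def motif_region(sequence, pattern):
    """Return (start, end) as 1-based inclusive residue numbers, or None.

    If the pattern has exactly three groups (leading, target, trailing),
    the middle group is reported, mirroring the tip for flanking residues.
    """
    match = re.search(pattern, sequence)
    if match is None:
        return None
    group = 2 if match.re.groups == 3 else 0
    start, end = match.span(group)
    # Convert the 0-based, end-exclusive span to 1-based, inclusive numbering.
    return start + 1, end


sequence = "AAEHFGKLMSTAA"  # made-up example sequence
print(motif_region(sequence, r"(EHFG)(.+)(ST)"))
```

For the example sequence, the middle group covers the residues "KLM", i.e. positions 7 to 9 in 1-based numbering.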

If you are logged in as the owner of the project, the annotations will be saved to the modelling project. If you are not the owner of the project, you are still free to edit and view annotations in your own browser window.

Colour Schemes

Score Schemes

SOA (Solvent Accessibility): Low SOA -> High SOA
B-factor (<10, <15, <20, <25, <30, <35, <40): Low disorder -> High disorder
B-factor range: Low disorder -> High disorder. The range is between the minimum and maximum B-factor values present in the structure.
Entropy: Low entropy -> High entropy; High conservation -> Low conservation

Model Schemes

Confidence gradient: Low confidence -> High confidence
Confidence class:
   Very high (score > 0.9)
   Confident (0.9 > score > 0.7)
   Low (0.7 > score > 0.5)
   Very low (score < 0.5)
For both confidence colour schemes, residues are coloured by their local quality value. For SWISS-MODEL models, QMEANDisCo is used (range 0-1). For AlphaFold models, the score will be pLDDT (range 0-100). If a model is known to be an experimental structure, the B-factor range colour scheme will be used.
Indels:
MODEL    AAAAAAAA---AAAAAA-AA
TEMPLATE AA---AAAAAAAAA-AAAAA
Highlights insertions / deletions in the model.
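The confidence classes listed above map a local quality value to one of four labels. A minimal sketch of that mapping is shown here; the function name is hypothetical, the handling of values that fall exactly on a boundary (0.5, 0.7, 0.9) is an assumption because the scheme does not specify it, and pLDDT is rescaled from 0-100 to 0-1 so that one set of thresholds can be used.

```python
def confidence_class(score, is_alphafold=False):
    """Map a per-residue score to a confidence class label.

    score: QMEANDisCo (0-1), or pLDDT (0-100) when is_alphafold is True.
    """
    if is_alphafold:
        score = score / 100.0  # bring pLDDT onto the 0-1 scale
    if score > 0.9:
        return "Very high"
    if score > 0.7:
        return "Confident"
    if score > 0.5:
        return "Low"
    # Assumption: boundary values fall into the lower class.
    return "Very low"


print(confidence_class(0.82), confidence_class(45.0, is_alphafold=True))
```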

Alignment Index Schemes

Chain: Cycle of colours
Rainbow: N-terminus -> C-terminus

Residue Schemes

Hydrophobic: RKDENQHPYWSTGAMCFLVI   Least hydrophobic -> Most hydrophobic
Size: GASPVTCLINDKQEMHFRYW   Smallest -> Largest
Charged: ED (Negative), HKR (Positive)
Polar: STNQ
Proline: P
Ser/Thr: ST
Cysteine: C
Aliphatic: ILV
Aromatic: FYWH

Clustal Scheme

This is an emulation of the default colour scheme used for alignments in Clustal X, a graphical interface for the ClustalW multiple sequence alignment program. Each residue in the alignment is assigned a colour if the amino acid profile of the alignment at that position meets some minimum criteria specific for the residue type.

The table below gives these criteria as clauses: { > X% xx,y }, where X is the threshold percentage presence for any of the xx (or y) residue types. For example, K or R is coloured red if the column includes more than 60% K or R (combined), or more than 80% of either K or R or Q (individually).

Category         Residue at position  {Threshold, residue group}
Hydrophobic      A I L M F W V        { 60% WLVIMAFCYHP }
                 C                    { 60% WLVIMAFCYHP }
Positive charge  K R                  { 60% KR }, { 80% K,R,Q }
Negative charge  E                    { 50% ED }, { 50% QE }, { 60% KR }, { 85% D,E,Q }
                 D                    { 50% ED }, { 60% KR }, { 85% D,E,N }
Polar            N                    { 50% N }, { 85% N,Y }
                 Q                    { 50% QE }, { 60% KR }, { 85% Q,E,K,R }
                 S T                  { 50% TS }, { 60% WLVIMAFCYHP }, { 85% S,T }
Cysteine         C                    { 85% C }
Glycine          G                    { 0% G }
Proline          P                    { 0% P }
Aromatic         H Y                  { 60% WLVIMAFCYHP }, { 85% W,Y,A,C,P,Q,F,H,I,L,M,V }
Unconserved      any / gap            If none of the above criteria are met
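The clause notation can be made concrete with a small evaluator for a single alignment column. This is a sketch of one reading of the rules, not the viewer's implementation: the function names are hypothetical, groups written without commas are counted together, comma-separated groups are tested residue by residue, and percentages are taken over all rows of the column including gaps (an assumption).

```python
def clause_met(column, threshold_percent, group):
    """Check one clause such as { 60% KR } or { 80% K,R,Q } for a column.

    column: string with one character per aligned sequence at this position.
    """
    total = len(column)
    if total == 0:
        return False
    if "," in group:
        # Comma-separated: some single residue type must exceed the threshold.
        return any(
            100.0 * column.count(residue) / total > threshold_percent
            for residue in group.split(",")
        )
    # No commas: the residue types are counted together.
    combined = sum(column.count(residue) for residue in group)
    return 100.0 * combined / total > threshold_percent


def k_or_r_is_coloured(column):
    """Positive-charge rule for K or R: { 60% KR } or { 80% K,R,Q }."""
    return clause_met(column, 60, "KR") or clause_met(column, 80, "K,R,Q")


print(k_or_r_is_coloured("KKRRAA"), k_or_r_is_coloured("KRAAA"))
```

In the first column, K and R together make up four of six residues, which is above 60%, so a K or R there would be coloured; in the second they reach only 40%.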

Membrane Prediction

Biounits of transmembrane proteins are identified in the SMTL solely based on structural information. The most likely membrane location is computed based on structural information using the membrane finding algorithm of the QMEANBrane tool (Studer et al.) which is based on the solvation model described for the Orientations of Proteins in Membranes database (Lomize et al.). The results serve as input to classify each biounit based on energetic and geometric criteria.

The membrane annotation is transferred to a model if at least 80% of all biounit transmembrane residues are aligned with the target sequence(s).
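The 80% transfer rule can be expressed as a simple coverage check. The sketch below is illustrative only; the function name and the representation of residues as sets of positions are assumptions.

```python
def transfer_membrane_annotation(transmembrane_residues, aligned_residues,
                                 min_fraction=0.8):
    """Return True if enough transmembrane residues are aligned to the target.

    Both arguments are iterables of residue identifiers of the template
    biounit (hypothetical input layout, e.g. (chain, number) tuples).
    """
    transmembrane = set(transmembrane_residues)
    if not transmembrane:
        return False
    covered = len(transmembrane & set(aligned_residues))
    return covered / len(transmembrane) >= min_fraction


# 8 of 10 transmembrane residues aligned: exactly at the 80% threshold.
print(transfer_membrane_annotation(range(10), range(8)))
```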

Modelling API

The Modelling API is intended to be used programmatically for submissions of many modelling jobs where clicking through a website to submit and view the results is not practical.

SWISS-MODEL projects can be started from the command line, or using an interactive user-interface such as Swagger UI or Core API.

If you are using the coreapi auto-generated code snippets, you will need to add authentication to start a modelling project. You will find detailed help here.

Job submission and status checks are rate limited: if you send too many requests, you will receive a 429 response. The response should indicate the current submission rate. The submission rate may change at any time, depending on demand for the service. Currently, the rapid submission rate is 100 per minute and the prolonged rate is 2000 per 6 hours.
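A client that submits many jobs should be prepared for 429 responses. One common way to handle them is to retry with an increasing delay, as in the sketch below. The helper name, the number of attempts and the delays are arbitrary choices for illustration, not values recommended by SWISS-MODEL; the function takes any callable that returns an object with a `status_code` attribute, such as a `requests` response.

```python
import time


def call_with_backoff(send_request, max_attempts=5, initial_delay=1.0,
                      sleep=time.sleep):
    """Call send_request() until the response is not a 429, doubling the delay.

    Returns the last response, which may still be a 429 if every attempt
    was rate limited.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        response = send_request()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2
    return response
```

With `requests`, a status check could then be wrapped as `call_with_backoff(lambda: requests.get(url, headers=headers))`, where `url` and `headers` are the values used in the examples further down.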

The API uses a token based authentication system, so the first step is to retrieve a token for your SWISS-MODEL user account. This token is to be placed in the header of subsequent API calls.

1: Obtain a token

This can be done from the command line, but the recommended method to discover (and regenerate) your API token is to visit your SWISS-MODEL account page.

2: Start an Automodel project

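import requests

# Placeholder: replace with the API token from your SWISS-MODEL account page
token = "YOUR_API_TOKEN"
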
response = requests.post(
    "https://swissmodel.expasy.org/automodel",
    headers={ "Authorization": f"Token {token}" },
    json={ 
        "target_sequences": 
            [
                "VLSPADKTNVKAAWAKVGNHAADFGAEALERMFMSFPSTKTYFSHFDLGHNSTQVKGHGKKVADALTKAVGHLDTLPDALSDLSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPGDFTPSVHASLDKFLASVSTVLTSKYR",
                "VHLTGEEKSGLTALWAKVNVEEIGGEALGRLLVVYPWTQRFFEHFGDLSTADAVMKNPKVKKHGQKVLASFGEGLKHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVVVLARHFGKEFTPELQTAYQKVVAGVANALAHKYH"
            ],
        "project_title":"This is an example using multiple targets for hemoglobin"
      })

2: Start an Alignment project

response = requests.post(
    "https://swissmodel.expasy.org/alignment",
    headers={ "Authorization": f"Token {token}" },
    json={
          "target_sequences":  "KSCCPTTAARNQYNICRLPGTPRPVCAALSGCKIISGTGCPPGYRH",
          "template_sequence": "TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN",
          "template_seqres_offset": 0,
          "pdb_id": "1crn",
          "auth_asym_id": "A",
          "assembly_id": 1,
          "project_title": "This is an example of Alignment mode based on 1crn"
        })

2: Start a User Template project

with open("3l9y.1.A.pdb") as f:
    template_coordinates = f.read()
    
response = requests.post(
    "https://swissmodel.expasy.org/user_template",
    headers={ "Authorization": f"Token {token}" },
    json={
          "target_sequences": "MVVKAVCVINGDAKGTVFFEQESSGTPVKVSGEVCGLAKGLHGFHVHEFGDNTNGCMSSGPHFNPYGKEHGAPVDENRHLGDLGNIEATGDCPTKVNITDSKITLFGADSIIGRTVVVHADADDLGQGGHELSKSTGNAGARIGCGVIGIAKV",
          "template_coordinates": template_coordinates,
          "project_title":"This is an example of User Template based on SODC_DROME"
        })

At this point, it is worth checking the status code of the response. A 202 means that valid input was received and a new modelling job will be started when resources become available.

A 200 response means that the valid input has been seen before with the same SMTL version and so the project is already completed / failed.

3: Fetch the results

# Obtain the project_id from the response created above
project_id = response.json()["project_id"]

# And loop until the project completes
import time
while True:
    # We wait for some time
    time.sleep(10)

    # Update the status from the server 
    response = requests.get(
        f"https://swissmodel.expasy.org/project/{ project_id }/models/summary/", 
        headers={ "Authorization": f"Token {token}" })

    # Update the status
    status = response.json()["status"]

    print('Job status is now', status)

    if status in ["COMPLETED", "FAILED"]:
        break

4: Check if the job is COMPLETED and fetch the model coordinates

response_object = response.json()
if response_object['status']=='COMPLETED':
    for model in response_object['models']:
        print(model['coordinates_url'])

Bulk download of coordinates, metadata and overall summary file.

By default, all projects created using the API are considered for the bulk download. This can be filtered by creation date, using the parameters "from_datetime" and/or "to_datetime".

# Start a new job which will package all modelling jobs in a single zip archive
# If any jobs are still running, a download_id will not be available and the status code will be 400
response = requests.post("https://swissmodel.expasy.org/projects/download/",
    headers={ "Authorization": f"Token {token}" })

# check that the status_code of the response is either 200 or 202 before proceeding
if response.status_code not in [200, 202]:
    print(response.text)
    import sys
    sys.exit()

# Obtain the download_id for the packaged file
download_id = response.json()['download_id']

while True:
    time.sleep(5)
    response = requests.get(
        f"https://swissmodel.expasy.org/projects/download/{ download_id }/", 
        headers={ "Authorization": f"Token {token}" })

    # Wait for the response status to be COMPLETED
    if response.json()['status'] in ['COMPLETED', 'FAILED']:
        break

# Fetch the bulk download of results from the parameter "download_url"
print("Fetch the results from", response.json()["download_url"])