On the occasion of the 25th Anniversary of SWISS-MODEL, this symposium will provide an overview of recent progress in the Computational Structural Biology field and highlight the key challenges for the coming years. SWISS-MODEL was the first fully automated Web-based server for protein structure modelling. A major driver behind its development has been to mask the complexity associated with protein modelling behind an intuitive user interface, thereby broadening the uses and applications of protein models. Today SWISS-MODEL processes over 1 million model requests per year and is one of the most widely used structure modelling servers worldwide.
The symposium will feature a full day of lectures and presentations by international experts. Leaders in their field will present their perspective on how breakthrough developments in computational structural biology have contributed to our understanding of biological processes and diseases. The event will be open to anybody interested in structural biology and computational methods in this area. We particularly encourage the participation of students and junior scientists, who will have the opportunity to exchange ideas with international leaders in the field and expand their collaborative network.
SWISS-MODEL: How it all started
In my talk I will share the motivation that led to the creation of SWISS-MODEL, how SWISS-MODEL was developed in "the early days", and two important lessons learned from this journey.
Computational Enzymology – Enzymes in the PDB
We seek to understand how enzymes work, how they evolve to perform new functions and how we might learn to design new enzymes and new metabolic pathways. We are using computational approaches to address these questions, developing new data resources and novel algorithms/protocols, based in part on structures in the Protein Data Bank (PDB) and models.
Since we are trying to get an overview of enzyme evolution, we analyse many enzyme families and attempt to relate their function to structure. One of the challenges in this work is how to find the most relevant structure in the PDB and, if none is available, to make a reasonable model of the protein including relevant ligands. There are currently over 50,000 known enzyme structures, so analysing each one by hand is impossible.
In this talk I will describe our efforts to develop tools and databases to improve the annotation of ligands in enzymes and to generate appropriate models. I will focus on coenzymes, cognate substrates/products and their binding sites. This is part of the FunPDBe project, led by PDBeurope, which aims to improve functional annotations for PDB entries.
SWISS-MODEL Next Generation
In this talk we will present the latest developments of the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been developed continuously ever since. In particular we will describe the recently extended functionality for modelling homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology. Other major improvements that will be discussed include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. Finally, we will highlight how SWISS-MODEL and its accompanying services, such as the SWISS-MODEL Repository, are integrated with other bioinformatics resources and modelling tools.
Application of protein structure prediction and protein complex structure prediction to G-protein-coupled receptors
We have been developing computational tools that can be utilized for atomic-level understanding of protein structure and function. They include methods for template-based protein structure prediction, protein loop and terminus structure prediction, protein model structure refinement, template-based protein complex structure prediction and refinement, template-based ligand binding site prediction, protein-protein docking, protein-peptide docking, and protein-ligand docking. All these methods involve optimization of hybrid energy functions designed to describe both local and global structures of proteins accurately. These methods are available as free web servers or downloadable software at http://galaxy.seoklab.org. We have also extended these methods to docking ligands onto predicted GPCR structures by allowing receptor structure relaxation, and to GPCR extracellular loop structure prediction by combining them with conserved disulfide bond prediction. However, some limitations of these methods have been recognized during application studies, such as community-wide blind prediction experiments. We have also generated candidate model structures for the Norrin-Frizzled4 complex in a collaborative effort with experimentalists. This involved template-based modeling of the transmembrane domain of the Frizzled4 receptor, docking of the crystal structure of the Norrin-cysteine-rich domain of Frizzled4 to the model structure of the transmembrane domain, modeling of the linker region connecting the cysteine-rich domain and the transmembrane domain, and molecular dynamics simulation in the presence of an explicit lipid bilayer and water molecules. The models could be used to explain the results of hydrogen-deuterium exchange mass spectrometry experiments and binding affinity measurements.
Dynamically generating benchmark sets of large heteromeric protein complexes
Multi-protein machines are responsible for most cellular tasks. Unfortunately, the atomic details necessary to understand their function are available for a tiny fraction of them. The computational biology community is developing strategies to model ever-larger molecular machines. However, there is no gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations.
In this talk, I present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes. By changing the thresholds for sequence identity and component overlap between assemblies, we can create sets of representative complexes with known 3D structure (target complexes). Moreover, for each target complex, we identify sets of assemblies of varying degrees of similarity that can be readily used as input in complex modelling exercises (template subcomplexes). The results of this strategy are made available through a user-friendly web resource. We hope that resources like these will help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did.
A brief history of CASP: ideas, successes, failures, and surprises
CASP (Critical Assessment of Structure Prediction) was founded about the same time as SWISS-MODEL. Its goal is to determine the state of the art in modeling protein structure from amino acid sequence, and so advance the field. Community experiments are conducted every two years, with participants submitting models that are evaluated by independent assessors. Participants do not know the experimental structures of the target proteins, and assessors are blinded to participant identity, bolstering rigor and objectivity. During the quarter century of these experiments, the field of protein structure modeling has been transformed from an arcane academic pursuit into a widely accepted tool for obtaining structure. How did this come about? In this talk I’ll address that question from a CASP perspective, considering how the ideas underlying modeling methods have evolved, which methods succeeded, which failed, and why. Notably, both successes and failures have often been surprising.
Modelling and quantitative description of antibody complementarity determining regions
Our ability to accurately model the structure of antibodies stems from the recognition that five out of the six loops forming the antigen-binding site exhibit only a limited number of main-chain conformations, called canonical structures. In contrast, the third loop of the heavy chain, H3, has proved very difficult to model given its high variability in both length and structure. We used a machine-learning method, combining sequence- and structure-related features and contact propensities, to select structural templates for H3 loops from a dataset of candidates. The method has led to a significant improvement in the prediction of the H3 region and the overall antigen-binding site. We next analyzed how differences between antigen-binding sites might be linked to their function, that is, the recognition of the antigen. To this end, we developed a superposition-free method to compare the surface of a given antigen-binding site with those of a large dataset. We showed that similar antigen-binding sites are better detected based on surface descriptors than using traditional structure similarity measures, and that a classification method based on surface comparison can provide information on the recognized antigen.
CATH Functional families (FunFams) - insights into impacts of genetic variations
Powerful tools for comparing protein structures and protein sequences have allowed us to analyse proteins from more than 20,000 completed genomes and identify 5500 evolutionary domain superfamilies, comprising a total of ~90 million domains. These superfamilies cover nearly 70% of domains from all kingdoms of life and are captured in our CATH resource. Some structural frameworks seem particularly suited to supporting different residue arrangements in the active sites and structural variations on the surfaces of the domains which can modify protein functions. Sub-classification of CATH superfamilies into functional families (FunFams) allows us to examine the structural mechanisms of function evolution in these superfamilies.
We have used the CATH-FunFams to analyse the impacts of genetic variations. For example, we observe that a particular mode of alternative splicing – Mutually Exclusive Exons (MXE) – is typically associated with variation in a small subset of residues on the surface of the protein and close to known or predicted functional sites for that protein. We analysed these effects using publicly available MXE data from 5 model organisms. Some compelling examples of MXE events in glycolytic proteins have been explored in more detail. The CATH-FunFams have also been used to determine whether genetic variations linked to human disease, e.g. cancer, result in changes in residues close to functional sites, thereby modifying the functions of the proteins and affecting specific pathways and processes.
Modelling protein structure and missense variants: Phyre2, Missense3D, EzMol and BioBlox
First, our protein structure prediction server Phyre2 will be outlined, highlighting features that have led to widespread adoption by the community. The talk will then consider to what extent one can rely on a predicted structure to assess the structural effect of a missense variant, a question regularly asked by users of Phyre2. Accordingly, we have developed Missense3D, which was designed to use cut-offs applicable to both experimental and Phyre2-predicted structures. We show that, surprisingly, our benchmarked performance falls only slightly on predicted structures (even those with only 30% identity to the template) compared to that obtainable from the corresponding experimental structure.
Finally, two resources of interest for both scientific research and education will be outlined. The first is EzMol, an exceptionally simple-to-use web-based molecular graphics program. We have just launched an EzMol educational portal. The second resource is a suite of computer games (BioBlox) on the theme of protein docking, including a 2D version, BioBlox2D, available free from the App Store and Google Play.
Computational resources for the study of intrinsically disordered proteins
Intrinsic disorder in proteins is difficult to detect by experimental methods, with comparatively little direct evidence available. Indirect evidence, such as missing electron density in X-ray structures, provides more instances at the cost of lower confidence. Computational disorder predictors have flourished over the last decade, enabling large-scale inference of disordered segments in proteins, albeit with somewhat diverging results. The reproducibility and impact of studies relating intrinsic disorder to biological features will benefit greatly from the standardization of the underlying computational methods and data. The DisProtCentral consortium is intended to fill this gap by providing high-quality databases for the scientific community interested in intrinsic disorder. The DisProt database (URL: www.disprot.org) has recently undergone a complete technological overhaul and now contains over 800 fully re-annotated proteins provided by a community of expert curators. The MobiDB database complements DisProt with indirect disorder evidence from PDB structures and large-scale predictions for all known protein sequences. A new generation of computational tools is increasing reproducibility and providing novel annotations for intrinsically disordered protein segments. The ongoing Critical Assessment of Intrinsic protein Disorder (CAID) will soon provide the first true blind test since CASP stopped assessing intrinsic disorder in 2012. As computational resources mature, a clearer view of the different types of functions important for intrinsic disorder is emerging.
Structural understanding of the regulatory interactions of proapoptotic Par-4
Par-4 is a unique 332-amino-acid proapoptotic protein with the ability to induce apoptosis selectively in cancer cells. It is predicted to be largely disordered with two important functional domains: an N-terminal SAC domain, which is the minimum region sufficient to induce apoptosis in cancer cells, and a C-terminal domain, which was predicted to form a coiled coil and is involved in most of its interactions. This C-terminal region, which regulates its apoptotic function, was also suggested to contain a leucine zipper domain. The X-ray crystal structure of the C-terminal domain of Par-4 (Par-4CC) was obtained by MAD phasing. Par-4 homodimerizes by forming a parallel coiled-coil structure and contains the homodimerization subdomain in the N-terminal half of Par-4CC. This structure has a nuclear export signal (Par-4NES) sequence, which is masked upon dimerization, indicating a potential mechanism for nuclear localization. To understand the mechanism of heteromeric interactions of Par-4, some of its binding partner sequences were analyzed to predict coiled-coil regions, modelled by SWISS-MODEL and docked against the Par-4CC monomer using GRAMM-X. The selected models were energy minimized using KoBaMIN and their quality was estimated using QMEAN. These heteromeric-interaction models specifically showed that charge interaction is an important factor in the stability of heteromers of the C-terminal leucine zipper subdomain of Par-4 (Par-4LZ). These heteromer models also displayed NES masking capacity and therefore the ability to influence intracellular localization.
Round table discussion
Celebration party @ Safran Zunft
Situated in Switzerland at the heart of Europe, Basel is one of the continent’s most convenient locations for major events. Basel is easily accessible from many European cities by plane or high-speed trains. Approximate travel times to the Congress Center:
Train station: Basel Badischer Bahnhof (DB) The train station is within walking distance of the conference site. Alternatively, 3 minutes by tram line 2 or 6; exit at "Messeplatz".
Train station: Basel SBB or SBB/SNCF Ca. 20 minutes by public transport (3.80 CHF/ 3.50 EUR). Take tram line 1 or 2 and exit at "Messeplatz".
Basel EuroAirport (BSL) Ca. 15 minutes by taxi (ca. 40 CHF / 35 EUR); ca. 20 minutes by public transport (3.80 CHF/ 3.50 EUR). When arriving at Basel airport, take the exit to "Switzerland" (not France). For public transport, take bus line 50 to SBB train station and from there tram lines 1 or 2 and exit at "Messeplatz".
Zürich Airport (ZRH) Ca. 90 minutes by train and public transport to the conference site (ca. 45 CHF / 37 EUR). Trains to Basel leave from Zürich airport every 30 minutes (either direct or via Zürich main station); train schedules are displayed at the luggage belts. Take the train to either Basel SBB or Basel Badischer Bahnhof.
In Basel you will find more than 4'000 hotel beds: first-class establishments with style and tradition, comfortable middle-class hotels and a choice of friendly small hotels and guest houses to suit the smaller budget.
The hotels in Basel are centrally located. All attractions, museums, restaurants and bars, as well as diverse opportunities for shopping, are therefore easy to reach on foot or by public transport. When checking in as a guest at a hotel in Basel, you will receive a Mobility Ticket which gives you free use of public transport during your stay.
SWISS-MODEL 25 Years is organized by the SIB Swiss Institute of Bioinformatics and the Biozentrum University of Basel.
For information please contact:
SWISS-MODEL 25 Years Symposium c/o Prof. Torsten Schwede, SIB Swiss Institute of Bioinformatics, Biozentrum University of Basel, Klingelbergstrasse 50-70, CH 4056 Basel, Switzerland Email: firstname.lastname@example.org Tel: +41 61 207 15 86
Swiss Institute of Bioinformatics & Biozentrum, University of Basel, CH