Escherichia coli is a Gram-negative gammaproteobacterium commonly found in the lower intestine of warm-blooded organisms (endotherms). Most E. coli strains are harmless and are part of the normal flora of the gut.
Since E. coli can survive outside the body for a limited amount of time, it is an indicator organism for fecal contamination of the environment. The descendants of two isolates, K-12 and B strain, are used routinely in molecular biology as both a tool and a model organism. It is the most widely studied prokaryotic model organism due to its ease of culturing and short generation time.
The first E. coli genome was sequenced in 1997 (K12 strain).
("Escherichia coli", Wikipedia: The Free Encyclopedia)
From left to right, the table shows: i) the number of proteins in the reference proteome of Escherichia coli, ii) the number of unique protein sequences for which at least one model is available, iii) the total number of models and iv) a coverage bar plot.
The bar plot shows the coverage for every protein in the reference proteome of Escherichia coli for which there is at least one model. Different colours (dark green to red boxes) represent the coverage of the targets. Targets with high coverage are represented in dark green (more than 80% of the target's length is covered by models), whereas low coverage is shown in red. The size of each box is proportional to the number of target sequences with a given coverage.
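As an illustration of how such a coverage value could be derived, the following minimal sketch computes the fraction of a target's length covered by at least one model and checks it against the 80% threshold mentioned above. The function names and the representation of models as 1-based inclusive residue ranges are assumptions made for this example; they are not taken from the SWISS-MODEL Repository itself.

```python
def covered_fraction(target_length, model_ranges):
    """Return the fraction of target residues covered by at least one model.

    model_ranges is a list of (start, end) residue ranges, 1-based and
    inclusive (a hypothetical representation chosen for this sketch).
    Overlapping ranges are counted only once.
    """
    covered = 0
    last_counted = 0  # highest residue index already counted
    for start, end in sorted(model_ranges):
        # Skip residues that an earlier range has already covered,
        # and clip the range to the target sequence.
        start = max(start, last_counted + 1, 1)
        end = min(end, target_length)
        if end >= start:
            covered += end - start + 1
            last_counted = end
    return covered / target_length


def is_high_coverage(fraction, threshold=0.8):
    """True if more than 80% of the target's length is covered."""
    return fraction > threshold


# Two overlapping models on a 100-residue target cover residues 1-90.
fraction = covered_fraction(100, [(1, 50), (40, 90)])
print(fraction, is_high_coverage(fraction))
```

Only the 80% threshold for the dark green category comes from the description above; the boundaries of the remaining colour categories are not stated and are therefore not modelled here.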
For information on the latest proteome for Escherichia coli, please visit UniProtKB.
You can download the latest protein sequences for the Escherichia coli proteome here. Please note that this download reflects the current UniProtKB release, whereas the numbers in the table below are for the most recent SWISS-MODEL Repository, which was built on 2019_03.
| Proteins in proteome | Sequences modelled | Models | Sequence coverage of models |
The plot shows how the fraction of Escherichia coli reference proteome residues for which structural information is available (y-axis) has evolved over the years (x-axis). Different colours (light blue to dark blue) represent the quality of the sequence alignment between the reference proteome sequences (targets) and the sequences of the protein structure database (templates). Alignments with low sequence identity are shown in light blue, whereas alignments with high sequence identity are shown in dark blue. Target-template alignments were computed using HHblits. The NR20 database was used to calculate profiles, which were then used to search a database derived from all unique PDB protein sequences.
The global quality of SWISS-MODEL Repository models is estimated with the QMEAN4 composite scoring function. The quality bar shows the fractions of models that fall into categories of varying quality. High QMEAN4 values correspond to high-quality models (left side of the bar plot). Models with QMEAN4 values below -4.0 (right side of the plot) are often no longer of reliable quality.
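The -4.0 cutoff can be applied to a list of scores as in the sketch below, which reports the fraction of models falling below it. The function name and the input list are hypothetical and used only for illustration. The code does not compute QMEAN4 itself, and the intermediate quality categories shown in the bar are not specified here, so only the single cutoff is used.

```python
def fraction_below_cutoff(qmean4_scores, cutoff=-4.0):
    """Return the fraction of models whose QMEAN4 value is below the cutoff.

    qmean4_scores is a list of precomputed QMEAN4 values (hypothetical
    input for this sketch). An empty list yields 0.0.
    """
    if not qmean4_scores:
        return 0.0
    below = sum(1 for score in qmean4_scores if score < cutoff)
    return below / len(qmean4_scores)


# Illustrative values only, not taken from the repository.
scores = [0.5, -1.2, -4.5, -6.0]
print(fraction_below_cutoff(scores))
```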
Detailed numbers are obtained by hovering the mouse over one of the boxes.
Many proteins form oligomeric structures either by self-assembly (homo-oligomeric) or by assembly with other proteins (hetero-oligomeric) to accomplish their function. In SWISS-MODEL Repository, the quaternary structure annotation of the template is used to model the target sequence in its oligomeric form. Currently our method is limited to the modelling of homo-oligomeric assemblies. The oligomeric state of the template is only considered if the interface is conserved.