The large number of uncharacterized protein structures highlights the need for computational methods that annotate proteins from their tertiary structure. These include methods that annotate function by characterizing local protein surfaces. To facilitate structure-based protein annotation, 3D-SURFER offers web-based tools for rapid protein surface analysis and comparison. The server integrates several methods to assist in high-throughput screening and visualization of protein surface comparisons. These methods are discussed in detail below.
3D-SURFER's Integrative Web Interface:
3D-SURFER provides a web-based platform that integrates 3D Zernike Descriptors and VisGrid. The results obtained using these methods can be visualized together in a single user interface.
3D Zernike Descriptors (3DZD):
3D Zernike Descriptors (3DZD) are utilized for the efficient comparison of protein surfaces [Sael L et al., Proteins, 2008].
The descriptor is a combination of coefficients calculated from a well-defined set of orthogonal 3D basis polynomials that approximate a given 3D function (here, a grid holding the discretized surface). 3DZD has several desirable properties when applied to protein surfaces:
- Rotational invariance: Prior structural alignment is not required for protein comparisons.
- Compactness: The protein surface can be represented compactly as a feature vector with only 121 numbers called invariants. Comparisons of these vectors can be done very quickly, thus allowing for rapid shape retrieval.
- Hierarchical Resolution: Invariants of lower resolution are also part of the higher resolution. For example, the first 12 numbers among the 121 invariants represent the same protein at a lower resolution.
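The counts 121 and 12 quoted above can be checked with a short sketch. It assumes the usual 3D Zernike indexing, in which one invariant exists for each pair (n, l) with 0 <= l <= n and (n - l) even, and it assumes the invariants are stored in order of increasing n. The maximum orders 20 and 5 are inferred from the counts and are not stated in the text; the function names are hypothetical.

```python
def num_invariants(max_order: int) -> int:
    """Count invariants up to max_order.

    Assumes one invariant per pair (n, l) with 0 <= l <= n <= max_order
    and (n - l) even.
    """
    return sum(
        1
        for n in range(max_order + 1)
        for l in range(n + 1)
        if (n - l) % 2 == 0
    )


def lower_resolution(invariants, max_order: int):
    """Return the leading invariants that describe the shape up to max_order.

    Assumes the vector is ordered by increasing n (an assumption, not
    something the text states).
    """
    return list(invariants)[: num_invariants(max_order)]


print(num_invariants(20))  # 121
print(num_invariants(5))   # 12
```

Under these assumptions, truncating a 121-value vector with `lower_resolution(vec, 5)` yields the 12-value low-resolution description mentioned in the list.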
3DZD extraction procedure:
- Voxelization: The protein surface triangulation/mesh is extracted using the MSROLL program in Molecular Surface Package version 3.9.3 [Connolly M, 1983]. The mesh is then discretized to form a cubic grid.
- 3D Zernike transformation: The 3DZD program [Novotni M. and Klein R, 2003] takes the cubic grid as input and generates 3DZDs (the 121 invariants).
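The discretization step can be pictured with the following minimal sketch. It is not the server's implementation: it marks only the voxels that contain a surface point (for example a mesh vertex) rather than rasterizing whole triangles, and the function name `voxelize` and the default `grid_size` are assumptions made for illustration.

```python
def voxelize(points, grid_size=32):
    """Mark the voxels of a cubic grid that contain at least one surface point.

    points: iterable of (x, y, z) coordinates, e.g. mesh vertices.
    Returns a set of (i, j, k) indices, each in range(grid_size).
    """
    points = list(points)
    if not points:
        return set()
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # Use the largest extent for every axis so that the grid stays cubic.
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    scale = (grid_size - 1) / extent
    return {
        tuple(int(round((p[a] - mins[a]) * scale)) for a in range(3))
        for p in points
    }


# Toy input: three points on a 5 x 5 x 5 grid.
print(sorted(voxelize([(0, 0, 0), (1, 1, 1), (2, 0, 0)], grid_size=5)))
```

A grid produced this way would then be passed to the 3D Zernike transformation, which this sketch does not attempt to reproduce.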
VisGrid (visibility criterion):
- The VisGrid algorithm facilitates the characterization of local geometric features of protein surfaces in an interactive manner using various features provided by the visibility criterion [Li B et al., Proteins, 2007]. The visibility is defined as the fraction of visible directions from a target position on a protein surface. A pocket or a hollow is recognized as a cluster of positions with a small visibility. A large protrusion in a protein structure is recognized as a pocket in the negative image of the structure. While existing methods restrict themselves to locating pockets with potential ligand binding site behavior, VisGrid can also focus on the dominant geometric features in the protein structure by identifying large protrusions, hollow and flat regions on the surface.
- The above figure illustrates various examples of protein surface cavities (blue), protrusions (red), and flat regions (green) identified by the visibility criterion using VisGrid.
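The definition of visibility as a fraction of visible directions can be illustrated with a small sketch on a voxel grid. This is a simplified illustration under stated assumptions, not the VisGrid implementation: rays are cast only along the 26 directions to neighbouring voxels, a direction counts as visible if no occupied voxel is met within `max_steps` steps, and the names `DIRECTIONS`, `visibility` and `max_steps` are hypothetical.

```python
from itertools import product

# 26 ray directions to the neighbouring voxels (an assumption; the published
# method may sample directions differently).
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def visibility(position, occupied, max_steps=10):
    """Fraction of DIRECTIONS along which no occupied voxel is hit."""
    visible = 0
    for dx, dy, dz in DIRECTIONS:
        x, y, z = position
        blocked = False
        for _ in range(max_steps):
            x, y, z = x + dx, y + dy, z + dz
            if (x, y, z) in occupied:
                blocked = True
                break
        if not blocked:
            visible += 1
    return visible / len(DIRECTIONS)


# Toy shapes: a flat slab at z = 0, and the same slab with a ring of walls
# around the probe position, which turns the spot into a small pocket.
slab = {(x, y, 0) for x in range(-12, 13) for y in range(-12, 13)}
walls = {(x, y, 1) for x in (-1, 0, 1) for y in (-1, 0, 1) if (x, y) != (0, 0)}

flat_value = visibility((0, 0, 1), slab)
pocket_value = visibility((0, 0, 1), slab | walls)
print(flat_value, pocket_value)  # the pocket position has the smaller value
```

In this toy setting, clustering positions with small visibility would pick out the pocket, and applying the same function to the negative image of the occupied set would pick out protrusions, in line with the description above.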
Basic features available through 3D-SURFER:
- Viewing surface comparison results
- Comparisons are performed by calculating the Euclidean distance (the square root of the sum of the squares of the differences between corresponding values) between the Zernike feature vectors (121 scalar values) representing the proteins. In the 3D-SURFER results, this distance is shown after the label "EucD:".
- Viewing surface analysis results
- The Jmol applet can be used to rotate the query structure and color the surface by cavity, protrusion, and flatness. Clicking on the buttons called "Cavity", "Protrusion", or "Flat" will render the corresponding regions in three different colors based on their rank in terms of geometric visibility: Red (1st), Green (2nd), and Blue (3rd). Also shown are the volumes and surface areas of the convex hull formed from the atom coordinates of the residues identified by VisGrid.
- Rotatable protein surface figures
- Protein surfaces can be rotated by moving the mouse over each of the images of the results. The images will spin 360° along both the X and Y axes to give a complete view of the protein surface.
- Structure alignment calculations
- Structure-based alignments of the proteins can be obtained by using the Combinatorial Extension (CE) program. To execute CE, check the "Rmsd:" box of each protein you want to compare with your query structure. If the calculation succeeds, the RMSD value will be displayed and a new button will appear; clicking this button shows the visualization of the CE alignment in the left panel.
- Viewing CATH codes
- CATH codes for each of the results can also be viewed next to the "CATH: " section.
- CATH code filtering
- Returned results often have very similar CATH codes. The CATH filter suppresses such repeats. If, for example, the user wants results that differ in the first two levels (specify the CATH filter as "CA", for Class and Architecture), the query will return at most one structure for each distinct combination of the first two levels. In other words, if two structures have CATH codes that both begin with 3.40 (such as 3.40.390.10 and any other 3.40.x.x code), only one of them would be returned, because 3.40 is repeated.
- Length filtering
- When "Residue Length Filter" is enabled, only structures with a number of residues similar to that of the query are returned. Two structures are considered similar in size if the ratio of their residue counts lies between 0.57 and 1.75.
- PDB Link
- Each reported result displays the corresponding PDB ID and is directly linked to the PDB website.
- Zernike Invariants
- The 121 Zernike Invariants (or Zernike Descriptors) that characterize each structure are displayed in text and graphic forms below the molecule visualization component.
- Uploading structure for custom comparison and analysis
- A custom structure may also be uploaded instead of entering a PDB ID. To do this, use the upload link on the main page, select the PDB file, enter the chain identifier, and click on "Submit". Your custom PDB structure can then be compared against all PDB structures.
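The comparison and filtering features listed above can be combined in a short sketch. It illustrates the described behavior and is not the server's code. The dictionary keys, the function names, and the mapping of the "CA" filter to `cath_levels=2` are assumptions, and the length ratio is taken as hit length divided by query length, which the text leaves open (the two bounds are close to reciprocal, so the direction matters little).

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences of corresponding values."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def length_ok(query_len, hit_len, low=0.57, high=1.75):
    """True if hit_len / query_len lies within [low, high]."""
    return low <= hit_len / query_len <= high


def cath_prefix(code, levels):
    """First `levels` dot-separated fields of a CATH code."""
    return ".".join(code.split(".")[:levels])


def search(query_vec, query_len, database, cath_levels=None, use_length_filter=False):
    """Rank entries by distance to the query, then apply the optional filters.

    database: list of dicts with hypothetical keys 'id', 'vec', 'length', 'cath'.
    """
    ranked = sorted(database, key=lambda e: euclidean_distance(query_vec, e["vec"]))
    seen, hits = set(), []
    for entry in ranked:
        if use_length_filter and not length_ok(query_len, entry["length"]):
            continue
        if cath_levels is not None:
            prefix = cath_prefix(entry["cath"], cath_levels)
            if prefix in seen:
                continue  # a closer structure with the same prefix was kept
            seen.add(prefix)
        hits.append(entry["id"])
    return hits


# Made-up toy entries with 3-value vectors; real 3DZD vectors hold 121 values.
toy_db = [
    {"id": "a", "vec": [0, 0, 0], "length": 100, "cath": "3.40.390.10"},
    {"id": "b", "vec": [1, 0, 0], "length": 110, "cath": "3.40.50.300"},
    {"id": "c", "vec": [0, 2, 0], "length": 300, "cath": "1.10.10.10"},
]
print(search([0, 0, 0], 100, toy_db))                          # ['a', 'b', 'c']
print(search([0, 0, 0], 100, toy_db, cath_levels=2))           # ['a', 'c']
print(search([0, 0, 0], 100, toy_db, use_length_filter=True))  # ['a', 'b']
```

Ranking before filtering means that, among structures sharing a CATH prefix, the one closest to the query is the one kept, which is one plausible reading of the filter description.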
References:
- Lee Sael, Bin Li, David La, Yi Fang, Karthik Ramani, Raif Rustamov and Daisuke Kihara. Fast protein tertiary structure retrieval based on global surface shape similarity. Proteins: Structure, Function, and Bioinformatics 72:1259-1273(2008).
- Bin Li, Srinivasan Turuvekere, Manish Agrawal, David La, Karthik Ramani and Daisuke Kihara. Characterization of local geometry of protein surfaces with the Visibility Criterion. Proteins: Structure, Function, and Bioinformatics 71:670-683(2008).
- Connolly ML. Solvent-accessible surfaces of proteins and nucleic acids. Science 1983;221(4612):709-713.
- Novotni M, Klein R. 3D Zernike descriptors for content based shape retrieval. ACM Symposium on Solid and Physical Modeling, Proceedings of the eighth ACM symposium on Solid modeling and Applications 2003;216-225