To run the PI-LZerD program for protein-protein docking prediction, you need the receptor and ligand structures in PDB format and the interface prediction results from the meta-PPISP server.
For example, to run PI-LZerD for protein 1A2K, the files 1A2K_R.pdb and 1A2K_L.pdb are needed in the first step (usually the larger protein is named the receptor and the smaller one the ligand). The prediction is started with a single command:
./PI_LZerD.sh <id> <rec_chn> <lig_chn>
For 1A2K, the command will be:
./PI_LZerD.sh 1A2K R L
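As a convenience, the call above can be wrapped in a small check that both input files exist before the script is started. This is only a sketch: the helper name build_pi_lzerd_command and its workdir parameter are hypothetical, and it assumes the two PDB files sit in one directory, which this document does not state.

```python
from pathlib import Path


def build_pi_lzerd_command(pdb_id, rec_chn, lig_chn, workdir="."):
    """Return the PI_LZerD.sh argument list after checking the two input PDB files.

    Assumption: <id>_<chn>.pdb files are located in `workdir` (hypothetical parameter).
    """
    workdir = Path(workdir)
    required = [
        workdir / f"{pdb_id}_{rec_chn}.pdb",
        workdir / f"{pdb_id}_{lig_chn}.pdb",
    ]
    missing = [str(p) for p in required if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing input file(s): " + ", ".join(missing))
    # Same argument order as in the command shown above: <id> <rec_chn> <lig_chn>
    return ["./PI_LZerD.sh", pdb_id, rec_chn, lig_chn]


# Possible usage (not executed here):
#   import subprocess
#   subprocess.run(build_pi_lzerd_command("1A2K", "R", "L"), check=True)
```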
The output is the 1A2K.i61 file, in which each prediction is listed on one line in the format:
<score> <interface-RMSD> <rotation-matrix>
For example,
**** Begin 1A2K.i61 ****
...
-6438.12 1.14679 0.988 -0.09 0.126 0.108 0.984 -0.145 -0.111 0.156 0.981 -9.879 5.268 6.523
...
**** End 1A2K.i61 ****
In this line, the physics-based score is -6438.12, the interface RMSD is 1.14679 Å, and the rotation-translation matrix is given by the remaining 12 numbers (0.988 -0.09 0.126 0.108 0.984 -0.145 -0.111 0.156 0.981 -9.879 5.268 6.523): the first 9 numbers are the rotation matrix, and the last 3 numbers are the translations along the X, Y, and Z coordinates.
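The layout of one prediction line can be illustrated with a short parser. This is a sketch under assumptions that this document does not confirm: the 9 rotation values are read in row-major order, and a point x is moved as R*x + t. The function names parse_i61_line and transform_point are hypothetical.

```python
def parse_i61_line(line):
    """Split one prediction line into score, interface RMSD, rotation, translation."""
    fields = line.split()
    if len(fields) != 14:
        raise ValueError(f"expected 14 fields, got {len(fields)}")
    values = [float(f) for f in fields]
    score, irmsd = values[0], values[1]
    # Assumption: the 9 rotation values are stored row by row.
    rotation = [values[2:5], values[5:8], values[8:11]]
    translation = values[11:14]
    return score, irmsd, rotation, translation


def transform_point(point, rotation, translation):
    """Apply x' = R * x + t to a single (x, y, z) point (assumed convention)."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


example = ("-6438.12 1.14679 0.988 -0.09 0.126 0.108 0.984 -0.145 "
           "-0.111 0.156 0.981 -9.879 5.268 6.523")
score, irmsd, rotation, translation = parse_i61_line(example)
```

If the program uses a different convention (for example column-major storage), only the construction of `rotation` needs to change.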
The details of the process are listed below. Note that in each step a script named job.sh is
called to read the results of the previous step and generate the outputs for the next step. The program
can be accessed here:
Pre-LZerD process:
A. Data preparation:
Step 1. Mark surface residues and rename the receptor and ligand files to the format <id>_<chn>.pdb, where id is the 4-character ID and chn is the chain ID; the receptor and ligand share the same id. (LZerD/01.BASE/[rec/lig])
Input: Receptor and Ligand proteins
Method: Find the surface residues using the mark_sur program
Output: Receptor and Ligand proteins with surface residue information
Example: 1A2K_R.pdb, 1A2K_L.pdb
Step 2. Generate critical points from the PDB files using the GETPOINTS program (LZerD/01.BASE/CP)
Input: Receptor and Ligand proteins with surface residue information
Method: Run GETPOINTS_32 to create gts and cp (critical points) files. The gts files are used for visualization; the cp files are extracted from the gts files and used for protein docking prediction.
Output: gts and cp files
Example: 1A2K_R.gts, 1A2K_R.cp, 1A2K_L.gts, 1A2K_L.cp
Step 3. Compute Zernike descriptors based on each critical point (LZerD/01.BASE/INV)
Input: gts and cp files
Method: Compute the Zernike descriptors for each critical point
Output: Zernike descriptors in inv format.
Example: 1A2K_R.inv, 1A2K_L.inv
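The file names produced by Steps 1 to 3 follow one pattern, <id>_<chn> plus a suffix, so a quick completeness check can be sketched as below. The helper missing_data_prep_files is hypothetical, and it assumes all files have been gathered in a single directory, whereas the steps above write to separate subdirectories.

```python
from pathlib import Path

# Suffixes taken from the examples in Steps 1 to 3.
DATA_PREP_SUFFIXES = (".pdb", ".gts", ".cp", ".inv")


def missing_data_prep_files(pdb_id, chains, directory="."):
    """Return the names of expected data preparation files that are not present.

    Assumption: every file lives directly in `directory` (hypothetical layout).
    """
    directory = Path(directory)
    expected = [
        directory / f"{pdb_id}_{chn}{suffix}"
        for chn in chains
        for suffix in DATA_PREP_SUFFIXES
    ]
    return [p.name for p in expected if not p.is_file()]


# Possible usage: missing_data_prep_files("1A2K", ["R", "L"])
```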
B. Pre-LZerD:
Step 4. Protein docking prediction using the LZerD program (LZerD/02.MPI/01.LZerD_MPI)
Input: Receptor and ligand, critical points, and Zernike descriptors from data preparation
Method: LZerD program
Output: Protein-protein docking predictions from LZerD
Example: 1A2K.out.gz
Step 5. Extract the rotation/translation matrices to prepare for physics-based scoring and IRMSD computation (LZerD/02.MPI/02.LZerD_MAT)
Input: Output of LZerD
Method: Extract the rotation/translation matrices
Output: Extracted rotation/translation matrices
Example: 1A2K.mat
Step 6. Compute the physics-based score and IRMSD for each prediction (LZerD/02.MPI/03.SRB_MPI & LZerD/02.MPI/04.IRMSD_MPI)
Input: Rotation/translation matrices and receptor/ligand proteins
Method: Compute the physics-based scores and IRMSDs (interface RMSDs) in parallel
Output: Physics-based scores and IRMSDs.
Example: 1A2K.srb, 1A2K.irmsd
Step 7. Compute the critical point dependencies (LZerD/03.Votes)
Input: Protein-protein docking predictions
Method: For each prediction, extract the critical point dependencies
Output: The critical point dependency information
Example: 1A2K.Votes
Step 8. Sort predictions by physics-based scores (LZerD/04.ORD_SRB)
Input: Protein-protein docking predictions
Method: Sort the predictions by physics-based scores
Output: Sorted predictions
Example output: 1A2K.LZerD
Step 9. Compute pairwise Common Interface RMSD (CI_RMSD) on top 1000 predictions (LZerD/05.CI_RMSD)
Input: Protein-protein docking predictions
Method: Compute Common Interface RMSD (CI_RMSD) for top 1000 predictions
Output: Common Interface RMSD (CI_RMSD) for top 1000 predictions
Example output: 1A2K.cirmsd
Step 10. Cluster predictions using CI_RMSD (LZerD/06.LZerD_CIRMSD)
Input: Top 1000 predictions
Method: CI_RMSD clustering
Output: Clustered top 1000 predictions
Example output: 1A2K.t1k
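Steps 8 to 10 above (sort by score, keep the top predictions, cluster by pairwise CI_RMSD) can be sketched as follows. This is not the program's own implementation: the document does not name the clustering algorithm or its cutoff, so a simple greedy scheme is swapped in for illustration. It also assumes that a lower (more negative) score is better. The names top_by_score and greedy_cluster are hypothetical.

```python
def top_by_score(predictions, n=1000):
    """Return the n best predictions from a list of (score, payload) tuples.

    Assumption: lower scores are better.
    """
    return sorted(predictions, key=lambda p: p[0])[:n]


def greedy_cluster(dist, cutoff):
    """Greedy clustering on a symmetric pairwise distance matrix.

    `dist[i][j]` is the distance between predictions i and j, with predictions
    already ordered best-first. The best unassigned prediction becomes a
    cluster center and absorbs every unassigned prediction within `cutoff`.
    Returns a list of clusters (lists of indices); the first index of each
    cluster is its center.
    """
    unassigned = list(range(len(dist)))
    clusters = []
    while unassigned:
        center = unassigned[0]
        members = [i for i in unassigned if dist[center][i] <= cutoff]
        clusters.append(members)
        unassigned = [i for i in unassigned if dist[center][i] > cutoff]
    return clusters


# Toy example with four predictions and a hypothetical cutoff of 2.0.
toy_dist = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 5.0, 6.0],
    [5.0, 5.0, 0.0, 1.0],
    [6.0, 6.0, 1.0, 0.0],
]
toy_clusters = greedy_cluster(toy_dist, cutoff=2.0)
```

Ordering the predictions best-first before clustering means each cluster center is the best-scoring member of its cluster.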
C. PI-LZerD Program:
Step 11. Add the prediction results from meta-PPISP server in PDB format (PI_LZerD/01.Pred_PPI/01.Pred_PPI_pdb)
Input: Receptors and Ligands
Method: meta-PPISP server
Output: Predicted protein interface
Example: 1A2K_R.rec.pdb, 1A2K_L.lig.pdb
Step 12. Compute the critical points belonging to predicted interface residues (PI_LZerD/01.Pred_PPI/01.Pred_PPI_pdb)
Input: Receptors/Ligands and predicted protein interfaces
Method: Compute the critical points belonging to predicted interface residues
Output: Critical points on predicted interface residues
Example: 1A2K_R.rec.pdb, 1A2K_L.lig.pdb
Step 13. Compute the predicted interface residue numbers (PI_LZerD/01.Pred_PPI/03.PPI_RES)
Input: Predicted protein interface
Method: Extract predicted interface residues
Output: List of predicted interface residues
Example output: res.txt
Step 14. First LZerD iteration using predicted interface residues (PI_LZerD/02.Sim_LZerD)
Input: Receptor/Ligand + predicted interface residues
Method: Fast LZerD iteration using predicted interface residues
Output: First iteration LZerD prediction
Example output: 1A2K.sim
Step 15. Compute pair-wise CI_RMSD distances on top 1000 predictions (PI_LZerD/03.LZerD_CIRMSD)
Input: First iteration LZerD predictions
Method: Compute pair-wise CI_RMSD distances on top 1000 predictions
Output: CI_RMSD distances of top 1000 predictions
Example output: 1A2K.cirmsd
Step 16. Cluster the top 1000 predictions based on CI_RMSD distances (PI_LZerD/04.CIRMSD_CLUST)
Input: Top 1000 predictions from first iteration LZerD program
Method: CI_RMSD clustering
Output: Clusters of the top 1000 predictions
Example output: 1A2K.t1k
Step 17. Select top 60 clustered predictions (PI_LZerD/05.CLUST_T60)
Input: Clustered top 1000 predictions
Method: Select top 60 clustered predictions
Output: Selected top 60 clustered predictions
Example output: 1A2K.40A.t60, 1A2K.25A.t60
Step 18. Use simple residue filtering method on predicted interface residues (PI_LZerD/06.SRF)
Input: Top 1000 predictions from first iteration LZerD program
Method: Sort by the percentage of agreement with the predicted interface residues
Output: Top 1000 predictions sorted by the percentage of agreement with the predicted interface residues
Example output: 1A2K.rcf
Step 19. Second LZerD iteration using the 60 clustered predictions (PI_LZerD/07.T60_Iter)
Input: Selected top 60 clustered predictions
Method: Second LZerD iteration
Output: Second-iteration LZerD predictions
Example output: 1A2K.iter.gz, 1A2K.k60
Step 20. 60 x Pair-wise CI_RMSD distances on top 1000 predictions (PI_LZerD/08.K60_CIRMSD)
Input: Selected top 60 clustered predictions
Method: Pair-wise CI_RMSD distances
Output: 60 x Pair-wise CI_RMSD distances
Example output: 1A2K.cirmsd, 1A2K.trans
Step 21. Re-rank using the 61 prediction lists (PI_LZerD/09.T61)
Input: Selected top 60 clustered predictions
Method: Re-rank from the 61 prediction lists
Output: Re-ranked predictions
Example output: 1A2K.i61
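The residue filtering idea in Step 18 can be illustrated with a small ranking function. The exact definition of "percentage of agreement" is not given in this document, so the sketch assumes it is the share of a prediction's interface residues that also appear in the predicted interface (for example the residue numbers listed in res.txt). The names interface_agreement and rank_by_agreement are hypothetical.

```python
def interface_agreement(pose_residues, predicted_residues):
    """Percentage of a pose's interface residues found in the predicted set.

    Assumed definition; returns 0.0 for a pose with no interface residues.
    """
    pose = set(pose_residues)
    if not pose:
        return 0.0
    return 100.0 * len(pose & set(predicted_residues)) / len(pose)


def rank_by_agreement(poses, predicted_residues):
    """Sort poses by agreement, highest first.

    `poses` maps a pose identifier to its interface residue numbers.
    Returns a list of (pose_id, percentage) tuples.
    """
    scored = [
        (pose_id, interface_agreement(residues, predicted_residues))
        for pose_id, residues in poses.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: residue numbers are made up for illustration only.
toy_predicted = {10, 11, 12, 13}
toy_poses = {"a": [10, 11, 50, 51], "b": [10, 11, 12, 99], "c": []}
toy_ranking = rank_by_agreement(toy_poses, toy_predicted)
```

Other definitions, such as the share of predicted residues recovered by the pose, fit the same structure by changing only interface_agreement.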