Supplementary Materials For the Paper:

Statistical Potential-based Amino Acid Similarity Matrices for Aligning Distantly Related Protein Sequences

Y. H. Tan, H. Huang, and D. Kihara
Proteins: Structure, Function and Bioinformatics 64: 587-600 (2006). (See Publication page)

Lindahl and Elofsson's Dataset:


  1. BLAJ Matrix built from structural superposition data for identifying potential remote homologues (Blake-Cohen, 2001)
  2. BLOSUM45 BLOSUM45 substitution matrix (Henikoff-Henikoff, 1992)
  3. JOHM Structure-based amino acid scoring table (Johnson-Overington, 1993)
  4. KOLA Conformational similarity weight matrix (Kolaskar-Kulkarni-Kale, 1992)
  5. KOSJ Context-dependent optimal substitution matrices for all residues (Koshi-Goldstein, 1995)
  6. MIYS Base-substitution-protein-stability matrix (Miyazawa-Jernigan, 1993)
  7. OVEJ STR matrix from structure-based alignments (Overington et al., 1992)
  8. PRLA1 Structure derived matrix (SDM) for alignment of distantly related sequences (Prlic et al., 2000)
  9. PRLA2 Homologous structure derived matrix (HSDM) for alignment of distantly related sequences (Prlic et al., 2000)
  10. QUC1 Cross-correlation coefficients of preference factors main chain (Qu et al., 1993)
  11. QUC2 Cross-correlation coefficients of preference factors side chain (Qu et al., 1993)
  12. QUIB STROMA score matrix for the alignment of known distant homologs (Qian-Goldstein, 2002)

  13. CCPC
  14. CCPG
  15. CCPQ
  17. CC6PC
  18. CC6PG
  19. CC6PQ

  20. CC10 10% CCPC + 90% KOLA
  21. CC20 20% CCPC + 80% KOLA
  22. CC30 30% CCPC + 70% KOLA
  23. CC40 40% CCPC + 60% KOLA
  24. CC50 50% CCPC + 50% KOLA
  25. CC60 60% CCPC + 40% KOLA
  26. CC70 70% CCPC + 30% KOLA
  27. CC80 80% CCPC + 20% KOLA
  28. CC90 90% CCPC + 10% KOLA

  29. Fam01 Downhill simplex optimization of matrix CCPC using family level data
  30. SFam01 Downhill simplex optimization of matrix CCPC using superfamily level data
  31. Fold01 Downhill simplex optimization of matrix CCPC using fold level data
  32. RANDMATRIX Matrix containing random values between -5 and 15
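
The CC10 to CC90 entries above are weighted mixtures of CCPC and KOLA, and RANDMATRIX is a table of random values between -5 and 15. The Python sketch below illustrates both ideas under stated assumptions; it is not the authors' code. It assumes that a matrix is stored as a dictionary keyed by residue pairs, that the mixture is a plain entry-by-entry weighted sum (any rescaling applied before mixing is not stated on this page), and that the random values are real-valued, uniformly distributed, and symmetric. The names blend, random_matrix, toy_ccpc, and toy_kola are hypothetical, and the toy scores are made up rather than taken from the real CCPC or KOLA matrices.

```python
import random

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def blend(a, b, weight_a):
    """Return weight_a * a + (1 - weight_a) * b for every residue pair.

    Assumption: a plain weighted sum with no rescaling of either matrix.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("both matrices must cover the same residue pairs")
    return {pair: weight_a * a[pair] + (1.0 - weight_a) * b[pair] for pair in a}


def random_matrix(low=-5.0, high=15.0, seed=None):
    """Build a symmetric 20 x 20 matrix of uniform random scores.

    Assumption: real-valued, uniform, symmetric entries; the page only
    gives the range of the values.
    """
    rng = random.Random(seed)
    matrix = {}
    for i, x in enumerate(AMINO_ACIDS):
        for y in AMINO_ACIDS[i:]:
            score = rng.uniform(low, high)
            matrix[(x, y)] = score
            matrix[(y, x)] = score
    return matrix


# Toy two-residue matrices with made-up scores (hypothetical values).
toy_ccpc = {("A", "A"): 4.0, ("A", "R"): -2.0, ("R", "A"): -2.0, ("R", "R"): 6.0}
toy_kola = {("A", "A"): 10.0, ("A", "R"): 0.0, ("R", "A"): 0.0, ("R", "R"): 10.0}

# Analogue of the CC30 entry: 30% of the first matrix plus 70% of the second.
cc30 = blend(toy_ccpc, toy_kola, 0.30)
for pair in sorted(cc30):
    print(pair, round(cc30[pair], 3))
```

Storing both (x, y) and (y, x) keeps lookups simple at the cost of some redundancy; a triangular layout would serve equally well.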

Golden Alignments:

Benchmarking Datasets:

Contact Information:

B235 Lilly Hall of Life Sciences
915 West State Street
West Lafayette, IN 47907-2054
Email: D. Kihara