Protein tertiary structure provides indispensable information for elucidating protein function and evolution. We are developing computational methods for predicting protein tertiary structure from sequence [35, 23, 19] and methods for error estimation of computational models. We actively participate in the CASP protein structure prediction assessment. In particular, we ranked at the top in the novel fold category in CASP11 and among the top in the protein structure refinement category in CASP12.
The protein surface is where the function of a protein is realized. In particular, interactions with other proteins and chemical compounds occur at specific sites on the protein surface. Hence, finding characteristic sites for protein function, e.g. active sites of enzymes and protein interaction interfaces, is a promising way to predict the function of a protein. The aims of this project include the development of methods for protein surface shape comparison for fast database search and the characterization of surface geometrical properties of proteins [30, 28]. We use versatile shape descriptors, namely 3D Zernike descriptors and 2D and 3D Krawtchouk moments, for this task. We have developed 3D-Surfer and EM-Surfer, web-based software for fast protein comparison and surface analysis. We have also revealed how the 3D shapes of proteins and protein complexes are distributed, i.e. the 3D protein shape universe. Please check out this cool paper with video clips of the protein shape universe.
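To illustrate why rotation-invariant descriptors allow fast database search, here is a minimal sketch: once each surface is reduced to a fixed-length numeric vector, comparing two shapes needs no structural alignment, only a vector distance. The function names, the toy four-dimensional vectors, and the use of Euclidean distance are assumptions made for illustration; this is not the implementation behind 3D-Surfer or EM-Surfer, and real descriptor vectors are much longer.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, database, top_k=3):
    """Return the top_k (name, distance) pairs closest to the query."""
    hits = [(name, descriptor_distance(query, vec))
            for name, vec in database.items()]
    hits.sort(key=lambda hit: hit[1])  # smallest distance first
    return hits[:top_k]


# Hypothetical toy database: names and vectors are made up for illustration.
database = {
    "protein_A": [0.9, 0.1, 0.3, 0.0],
    "protein_B": [0.2, 0.8, 0.5, 0.1],
    "protein_C": [0.85, 0.15, 0.25, 0.05],
}
query = [0.88, 0.12, 0.28, 0.02]

results = search(query, database, top_k=2)
for name, dist in results:
    print(f"{name}\t{dist:.3f}")
```

Because each comparison is a single pass over two short vectors, a linear scan like this stays cheap even for large collections, which is the property that makes descriptor-based search fast.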
Computational pre-screening of drugs can drastically reduce the cost of developing new drugs for target proteins. We developed PL-PatchSurfer, a molecular surface-based drug screening tool. We also actively participate in various drug screening contests and have achieved successful outcomes. Papers from these contests include , , .
Function annotation of genes is a foundation of almost all molecular biology studies. Conventional methods for function annotation are homology search methods, such as BLAST and FASTA. These methods perform well when obvious homologs exist for a query protein, but do not provide any functional information otherwise. As a consequence, typically only about half of the genes in a newly sequenced genome are annotated. For large-scale omics analysis, it is helpful if function annotation coverage is larger, even with less specific or low-resolution function [32, 26]. The goal of this project is to develop methods that can predict function for a larger number of genes than conventional homology search by providing low-resolution function when necessary, without losing accuracy. Our methods, PFP, ESG, Phylo-PFP, and the Consensus method, have been ranked as the best prediction method in CASP7 and at the Automatic Function Prediction Meeting (AFP-SIG, ISMB 2005), and among the top ranks in CAFA1, 2, and 3. Please try our servers:
Intergenic regions contain important information for gene regulation. In recent years, various families of small non-coding RNAs (sRNAs) have been discovered in both bacterial and eukaryotic genomes. We have developed an ensemble approach to DNA motif discovery, which outperforms standalone programs [24, 20]. We have also computationally identified sRNAs in 30 bacterial genomes and conducted a comparative study.