Documentation
Getting Started
Learn how to use the ProteinFP tools for protein function prediction.
What is ProteinFP?
ProteinFP is a web-based platform for predicting protein functions using Gene Ontology (GO) terms. It provides four complementary prediction methods:
- PFP - Baseline prediction using PSI-BLAST sequence similarity
- Phylo-PFP - Enhanced prediction with phylogenetic distance weighting
- ESG - Extended similarity group for distant homolog discovery
- Domain-PFP - Self-supervised domain embedding representations
Input Format
All tools accept protein sequences in FASTA format:
>protein_name optional description
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLML
SPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPS
WPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFC
Requirements
- Header line starting with >
- Standard amino acid letters
- Minimum 10 residues
Accepted Characters
- A-Z amino acid codes
- X for unknown residues
- * for stop codons
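The requirements and accepted characters above can be checked locally before submitting a sequence. The following is a minimal sketch, not ProteinFP's own validation code: the function names `parse_fasta` and `validate_record` are hypothetical, and the strict uppercase-only check is an assumption, since the documentation does not say whether lowercase letters are accepted.

```python
import re

MIN_RESIDUES = 10                      # "Minimum 10 residues"
ALLOWED = re.compile(r"[A-Z*]+")       # A-Z codes (including X) and * for stop codons


def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.replace(" ", ""))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate_record(header, sequence):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if len(sequence) < MIN_RESIDUES:
        problems.append(f"{header}: only {len(sequence)} residues (minimum {MIN_RESIDUES})")
    if sequence and not ALLOWED.fullmatch(sequence):
        # Assumption: lowercase letters are rejected; the docs do not specify this.
        bad = sorted(set(ALLOWED.sub("", sequence)))
        problems.append(f"{header}: unexpected characters {bad}")
    return problems


if __name__ == "__main__":
    example = """>protein_name optional description
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLML
SPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPS
"""
    for name, seq in parse_fasta(example):
        print(name, validate_record(name, seq) or "OK")
```

Collecting problems in a list rather than raising on the first one lets a caller report every issue in a multi-record file at once.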
Output Format
Results are provided for three GO categories:
Molecular Function (MF)
Biochemical activities of the protein
Biological Process (BP)
Larger biological goals the protein contributes to
Cellular Component (CC)
Cellular locations where the protein is active
Each GO term prediction includes a probability score between 0 and 1; higher scores indicate greater confidence in the prediction.
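As an illustration of working with these scores, the sketch below groups predictions by GO category and keeps only those at or above a chosen cutoff. The record layout (dictionaries with `go_id`, `category`, and `score` keys), the function name `group_confident`, the 0.5 threshold, and the example scores are all assumptions made for illustration; the actual result format of the tools is not specified here.

```python
from collections import defaultdict


def group_confident(predictions, threshold=0.5):
    """Group predictions by category, keeping scores >= threshold, best first."""
    grouped = defaultdict(list)
    for pred in predictions:
        score = pred["score"]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {pred['go_id']}: {score}")
        if score >= threshold:
            grouped[pred["category"]].append(pred)
    # Sort each category so the most confident terms come first.
    for terms in grouped.values():
        terms.sort(key=lambda p: p["score"], reverse=True)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical records: the scores are made up for illustration only.
    example = [
        {"go_id": "GO:0003677", "category": "MF", "score": 0.91},
        {"go_id": "GO:0006915", "category": "BP", "score": 0.42},
        {"go_id": "GO:0005634", "category": "CC", "score": 0.77},
    ]
    for category, terms in group_confident(example).items():
        print(category, [(t["go_id"], t["score"]) for t in terms])
```

A suitable threshold depends on how many false positives are tolerable for the task, so it is left as a parameter rather than fixed.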