
Tutorials

Protocol

MAINMAST is a de novo modeling method for EM maps of near-atomic resolution (less than 4.0 angstroms).

The MAINMAST protocol consists of four main steps: (1) identify local dense points (LDPs) in an EM map with the mean shift clustering algorithm; (2) connect all LDPs with a minimum spanning tree (MST); (3) refine the tree structure with a tabu search algorithm; (4) thread the sequence onto the longest path.
The MAINMAST program performs steps (1)-(3).
The ThreadCA program performs the final step, threading the amino acid sequence onto the longest path.
Flow Chart of MAINMAST

1. Identifying local dense points

The grid points of an EM map are clustered with a non-parametric clustering algorithm (mean shift). After clustering, the representative point of each cluster is called a local dense point (LDP).
Map and LDPs
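The clustering step can be illustrated with a minimal mean shift sketch in Python. It is not the MAINMAST implementation: the function name, the toy grid points and the stopping rule are assumptions made for illustration. The kernel width follows the usage text of the MAINMAST command (sigma = 0.5 * bandwidth), and converged points closer than the merge distance are collapsed into one representative.

```python
import math


def mean_shift(points, densities, bandwidth=2.0, merge=0.5, max_iter=200, tol=1e-4):
    """Move every grid point uphill to a density mode and merge nearby modes.

    points    : list of (x, y, z) grid positions (assumed already thresholded)
    densities : positive map values at those positions, used as weights
    """
    sigma = 0.5 * bandwidth  # same relation as in the MAINMAST usage text
    modes = []
    for start in points:
        x = list(start)
        for _ in range(max_iter):
            total = 0.0
            mean = [0.0, 0.0, 0.0]
            for p, rho in zip(points, densities):
                # Gaussian kernel weighted by the map density at p
                w = rho * math.exp(-math.dist(x, p) ** 2 / (2.0 * sigma ** 2))
                total += w
                for i in range(3):
                    mean[i] += w * p[i]
            mean = [m / total for m in mean]
            moved = math.dist(x, mean)
            x = mean
            if moved < tol:
                break
        # Keep x as a new representative unless an earlier one is within `merge`
        if all(math.dist(m, x) >= merge for m in modes):
            modes.append(x)
    return modes


# Toy input (hypothetical): two small blobs of grid points, 10 angstroms apart
blob = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]
points = blob + [(x + 10, y, z) for x, y, z in blob]
densities = [2.0, 1.0, 1.0, 1.0, 1.0] * 2

ldps = mean_shift(points, densities)
for ldp in ldps:
    print([round(c, 2) for c in ldp])
```

Each blob collapses to a single representative point near its center, which plays the role of an LDP in the later steps.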

2. Connect all LDPs by Minimum Spanning Tree

A minimum spanning tree (MST) is a graph structure that connects all vertices (here, the LDPs) with the minimum total edge length.
Map and LDPs
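As an illustration of this definition, the sketch below builds an MST with Prim's algorithm. It assumes a complete graph over the points with Euclidean edge lengths; MAINMAST itself restricts which edges are kept (see the -Dkeep option), so this is a simplified stand-in and the function name and toy coordinates are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm; returns a list of (i, j, length) tree edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # shortest known link from the tree to each vertex
    best_from = [-1] * n        # tree vertex at the other end of that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the attachment cost of the remaining vertices
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


# Toy LDP coordinates (hypothetical)
ldps = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1.5, 0)]
mst = minimum_spanning_tree(ldps)
for i, j, length in mst:
    print(i, j, round(length, 3))
print("total length:", round(sum(e[2] for e in mst), 3))
```

A tree over n points always has n - 1 edges, which is a quick sanity check on the result.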

3. Refine Tree structure by Tabu Search algorithm

The tree structure is further refined to find the protein main-chain path. Starting from the initial tree (the MST), the tree is refined iteratively using a tabu search. A tabu search explores a large search space while keeping a list of forbidden (tabu) moves.
Predicted path
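The idea of a tabu search over trees can be sketched as follows. This is a simplified illustration, not the MAINMAST code. It assumes that a move replaces one tree edge with another edge that reconnects the two halves, that the objective is the length of the longest path in the tree, and that recently removed edges may not be re-added while they are on the tabu list. The parameter names mirror the -Nround, -Nnb, -Ntb and -Const options, but the values and the toy points are hypothetical.

```python
import math
import random
from collections import deque


def edge_length(pts, edge):
    return math.dist(pts[edge[0]], pts[edge[1]])


def adjacency(n, edges):
    adj = {i: [] for i in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def reachable(n, edges, src):
    """Vertices connected to src using only the given edges."""
    adj = adjacency(n, edges)
    seen, stack = {src}, [src]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def longest_path(pts, edges):
    """Length of the longest path in a tree, found with two farthest-vertex sweeps."""
    adj = adjacency(len(pts), edges)

    def farthest(src):
        dist, stack = {src: 0.0}, [src]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + math.dist(pts[u], pts[v])
                    stack.append(v)
        end = max(dist, key=dist.get)
        return end, dist[end]

    return farthest(farthest(0)[0])[1]


def tabu_search(pts, start_edges, n_round=200, n_nb=10, n_tb=5, const=1.05, seed=0):
    rng = random.Random(seed)
    n = len(pts)
    # Total(Tree) must stay below const * Total(starting tree)
    limit = const * sum(edge_length(pts, e) for e in start_edges)
    current = set(start_edges)
    best, best_score = set(current), longest_path(pts, current)
    tabu = deque(maxlen=n_tb)  # recently removed edges, forbidden to re-add
    for _ in range(n_round):
        candidates = []
        for _ in range(n_nb):
            removed = rng.choice(sorted(current))
            rest = current - {removed}
            side = reachable(n, rest, removed[0])
            a = rng.choice(sorted(side))
            b = rng.choice(sorted(set(range(n)) - side))
            added = (min(a, b), max(a, b))
            if added == removed or added in tabu:
                continue
            cand = rest | {added}
            if sum(edge_length(pts, e) for e in cand) >= limit:
                continue
            candidates.append((longest_path(pts, cand), removed, cand))
        if not candidates:
            continue
        # Move to the best admissible neighbor, even if it is worse than the current tree
        score, removed, current = max(candidates, key=lambda c: c[0])
        tabu.append(removed)
        if score > best_score:
            best, best_score = set(current), score
    return best, best_score


# Toy points and a hand-written starting tree (edges stored as (low, high) index pairs)
pts = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (2, 1, 0), (3, 1.1, 0)]
start = [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5)]
best_tree, best_len = tabu_search(pts, start)
print(sorted(best_tree), round(best_len, 3))
```

Accepting the best neighbor even when it is worse lets the search leave local optima, and the tabu list keeps it from immediately undoing the move it just made.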

4. Thread sequence on the longest path

The longest path in the tree is aligned with the amino acid sequence using the Smith-Waterman dynamic programming algorithm.
Predicted C-alpha model
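For readers unfamiliar with Smith-Waterman, the sketch below shows the local alignment recurrence and traceback on two short strings. The match, mismatch and gap values are toy assumptions; ThreadCA scores the sequence against points on the path using its own parameters (20AA.param and the SPIDER2 prediction), which are not reproduced here.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("MQIVAG", "QIV")
print(score)
print(top)
print(bottom)
```

Because the alignment is local, only the best-matching stretch is reported, so residues at either end of the sequence may remain unaligned.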

Commands

The MAINMAST protocol consists of two commands (MAINMAST and ThreadCA).

MAINMAST

This command identifies local dense points (LDPs) in an EM map. The LDPs are then connected by a minimum spanning tree (MST), and the MST is refined by a tabu search algorithm.

Usage: MAINMAST -m [MAP file (situs format)] (option)
Option ver2.0:
  -Tree       : Show MSTree mode
  -Graph      : Graph mode
---Parameters in MeanShift----
  -gw [f]     : bandwidth of the gaussian filter def=2.0, sigma = 0.5*[float]
  -Dkeep [f]  : Keep edge where distance < [f] def=0.5
  -t [f]      : Threshold of density values. def=0.0
  -allow [f]  : Max shift distance < [f] def=10.0
  -filter [f] : Filter of representative points def=0.1
  -merge [f]  : After MeanShifting, merge d<[f] def=0.5
---Parameters in Tabu-search----
  -Nround [i] : Number of Iterations def=5000
  -Nnb [i]    : Number of Neighbors def=30
  -Ntb [i]    : Size of tabu-list def=100
  -Rlocal [f] : Radius of Local MST def=10
  -Const [f]  : Constraint of total length of edge def=1.01, Total(Tree) < [f]*Total(MST)

ThreadCA

ThreadCA determines the direction of the protein chain by threading the amino acid sequence of the protein onto the longest path in the refined tree graph.

Usage: ThreadCA -i [OUT file from MAINMAST] -a [20AA.param] -spd [*.spd3] (option)
Option ver1.0:
  -i [file]   : Result file of MAINMAST
  -a [file]   : 20AA.param
  -spd [file] : Result of SPIDER2
  -fw [f]     : Filter width def=1.0
  -Ab [f]     : Average length of CA-CA Bond def=3.5
  -Wb [f]     : Weight of Bond score def=0.9
  -r          : Reverse mode, reverse mainchain order
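Since the -Ab option sets the average CA-CA bond length, it can be useful to check the CA-CA spacing of a model that ThreadCA writes. The following Python sketch is a hypothetical helper, not part of the MAINMAST package. It assumes ATOM records with whitespace-separated fields, as in the example output shown in this tutorial; strictly formatted PDB files use fixed columns instead.

```python
import math


def ca_coordinates(pdb_text):
    """Collect (x, y, z) of CA atoms from whitespace-separated ATOM records."""
    coords = []
    for line in pdb_text.splitlines():
        fields = line.split()
        if len(fields) >= 9 and fields[0] == "ATOM" and fields[2] == "CA":
            coords.append(tuple(float(v) for v in fields[6:9]))
    return coords


def ca_distances(coords):
    """Distances between consecutive CA atoms."""
    return [math.dist(p, q) for p, q in zip(coords, coords[1:])]


# First four CA atoms of the CA.pdb listing in Example 1
model = """\
ATOM      1  CA  MET A   1      52.643  15.257  36.319  1.00  1.00
ATOM      2  CA  GLN A   2      54.566  17.208  34.977  1.00  1.00
ATOM      3  CA  ILE A   3      56.169  17.272  32.172  1.00  1.00
ATOM      4  CA  VAL A   4      57.479  18.569  30.500  1.00  1.00
"""

distances = ca_distances(ca_coordinates(model))
print([round(d, 2) for d in distances])
print("mean:", round(sum(distances) / len(distances), 2))
```

To run it on a real file, the model text can be read with `open("CA.pdb").read()` in place of the inline string.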

Examples

Example 1: Simulated map at 5.0 angstrom resolution

(Optional) Prepare a map file (1yfq.situs) as an input from a PDB file (1yfq.pdb)

Generate a simulated map from 1yfq.pdb with e2pdb2mrc.py from the EMAN2 package.

e2pdb2mrc.py 1yfq.pdb 1yfq.mrc --res 5.0

Convert the MRC file to SITUS format with map2map from the SITUS package.

echo 2|map2map 1yfq.mrc 1yfq.situs

Density Map, 1yfq.mrc

Trace main-chain from MAP

Trace the main chain from the map (1yfq.situs) with MAINMAST. The predicted main-chain paths are saved to path.pdb.

MAINMAST -m 1yfq.situs -t 9 -filter 0.3 -Dkeep 1.0 -Ntb 10 -Rlocal 5 -Nlocal 50 -Nround 50 > path.pdb

(Optional) Visualize the main-chain path in PyMOL. bondmk.pl generates the PyMOL script.

bondmk.pl path.pdb > tmp
pymol -u tmp

Predicted Main-chain Path

Thread the amino acid sequence on the longest path

ThreadCA requires an output file (*.spd3) from SPIDER2. Predict the secondary structure with SPIDER2.

run_local.sh 1yfq.seq

Thread the sequence on the main-chain path (required: path.pdb, 20AA.param and 1yfq.spd3).

../ThreadCA -i path.pdb -a ./20AA.param -spd 1yfq.spd3 -fw 1.3 -Ab 3.3 -Wb 0.9 >CA.pdb
../ThreadCA -i path.pdb -a ./20AA.param -spd 1yfq.spd3 -fw 1.3 -Ab 3.3 -Wb 0.9 -r >CA_r.pdb

Compare CA.pdb and CA_r.pdb by their threading scores. In this case, CA.pdb has a better threading score than CA_r.pdb.
Threading Score of CA.pdb is 108.092064
MODEL           1   108.092064      Wh=   1.00000000      We=  0.8
  
ATOM      1  CA  MET A   1      52.643  15.257  36.319  1.00  1.00
ATOM      2  CA  GLN A   2      54.566  17.208  34.977  1.00  1.00
ATOM      3  CA  ILE A   3      56.169  17.272  32.172  1.00  1.00
ATOM      4  CA  VAL A   4      57.479  18.569  30.500  1.00  1.00
	
Threading Score of CA_r.pdb is -13.4560547
 MODEL           1  -13.4560547      Wh=   1.29999995      We=  0.8
  
ATOM      1  CA  MET A   1      40.354  14.523  39.657  1.00  1.00
ATOM      2  CA  GLN A   2      41.278  16.075  39.879  1.00  1.00
ATOM      3  CA  ILE A   3      42.087  16.872  38.689  1.00  1.00
ATOM      4  CA  VAL A   4      44.132  18.503  37.508  1.00  1.00
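The comparison of the two threading scores can also be scripted. The sketch below is a hypothetical helper, not part of the MAINMAST package. It assumes that the score is the third whitespace-separated field of the first MODEL line, as in the two listings above, and that a higher score is better.

```python
def threading_score(pdb_text):
    """Return the score from the first MODEL line of a ThreadCA output."""
    for line in pdb_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "MODEL":
            return float(fields[2])
    raise ValueError("no MODEL line found")


# MODEL lines copied from the CA.pdb and CA_r.pdb listings above
forward = "MODEL           1   108.092064      Wh=   1.00000000      We=  0.8"
reverse = " MODEL           1  -13.4560547      Wh=   1.29999995      We=  0.8"

scores = {"CA.pdb": threading_score(forward), "CA_r.pdb": threading_score(reverse)}
best_name = max(scores, key=scores.get)
print(scores)
print("better direction:", best_name)
```

With real files, each string can be replaced by the file contents, for example `open("CA.pdb").read()`.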
	
Predicted CA model (CA.pdb)

Predicted CA model (CA_r.pdb)

Green: CA model, Red: 1yfq.pdb

(Optional) Visualize the Minimum Spanning Tree

Generate the MST on the EM map. bondtree.pl generates the PyMOL script.

MAINMAST -m 1yfq.situs -t 9 -filter 0.3 -Dkeep 1.0 -Ntb 10 -Rlocal 5 -Nlocal 50 -Nround 50 -Tree > tree.pdb

bondtree.pl tree.pdb > tmp
pymol -u tmp

Minimum Spanning Tree

(Optional) Visualize all possible paths

Generate all possible connections (edges) on the EM map. bondtree.pl generates the PyMOL script.

MAINMAST -m 1yfq.situs -t 9 -filter 0.3 -Dkeep 1.0 -Ntb 10 -Rlocal 5 -Nlocal 50 -Nround 50 -Graph > graph.pdb

bondtree.pl graph.pdb > tmp
pymol -u tmp

All edges

Example 2: A segmented map from EMD-6374

(Optional) Prepare a map file (6374.situs)

Prepare a segmented map (6374.mrc) with Chimera.
Convert the MRC file to SITUS format with map2map from the SITUS package.

echo 2|map2map 6374.mrc 6374.situs

Density Map, 6374.mrc

Trace main-chain from MAP

Trace the main chain from the map (6374.situs) with MAINMAST. The predicted main-chain paths are saved to path.pdb.

MAINMAST -m 6374.situs -t 1.00 -filter 0.3 -Rlocal 10 > path.pdb

(Optional) Visualize the main-chain path in PyMOL. bondmk.pl generates the PyMOL script.

bondmk.pl path.pdb > tmp
pymol -u tmp

Predicted Main-chain Path

Thread the amino acid sequence on the longest path

ThreadCA requires an output file (*.spd3) from SPIDER2. Predict the secondary structure with SPIDER2.

run_local.sh 6374.seq

Thread the sequence on the main-chain path (required: path.pdb, 20AA.param and 6374.spd3).

ThreadCA -i path.pdb -a ./20AA.param -spd 6374.spd3 -fw 1.4 -Ab 3.4 -Wb 0.9 > CA.pdb

ThreadCA -i path.pdb -a ./20AA.param -spd 6374.spd3 -fw 1.4 -Ab 3.4 -Wb 0.9 -r > CA_r.pdb

Compare CA.pdb and CA_r.pdb by their threading scores. In this case, CA_r.pdb has a better threading score than CA.pdb.
Threading Score of CA.pdb is -30.2037010
 MODEL           9  -30.2037010      Wh=   1.10000002      We=   1.0
ATOM      1  CA  MET A   1    -106.996 -28.676 264.784  1.00  1.00
ATOM      2  CA  LEU A   2    -105.789 -31.728 266.138  1.00  1.00
ATOM      3  CA  GLN A   3    -106.515 -31.803 269.158  1.00  1.00
ATOM      4  CA  GLN A   4    -104.319 -32.575 272.131  1.00  1.00
	
Threading Score of CA_r.pdb is 7.75097609E-02
 MODEL           1   7.75097609E-02  Wh=   1.20000005      We=   1.0  
  
ATOM      1  CA  MET A   1    -106.996 -28.676 264.784  1.00  1.00
ATOM      2  CA  LEU A   2    -105.789 -31.728 266.138  1.00  1.00
ATOM      3  CA  GLN A   3    -106.515 -31.803 269.158  1.00  1.00
ATOM      4  CA  GLN A   4    -104.319 -32.575 272.131  1.00  1.00
	
Predicted CA model (CA.pdb)

Predicted CA model (CA_r.pdb)

Green: CA model, Red: 1yfq.pdb

(Optional) Visualize the Minimum Spanning Tree

Generate the MST on the EM map. bondtree.pl generates the PyMOL script.

MAINMAST -t 1.00 -filter 0.3 -Rlocal 10 -Tree -m 6374.situs > tree.pdb

bondtree.pl tree.pdb > tmp
pymol -u tmp

Minimum Spanning Tree

(Optional) Visualize all possible paths

Generate all possible connections (edges) on the EM map. bondtree.pl generates the PyMOL script.

MAINMAST -t 1.00 -filter 0.3 -Rlocal 10 -Graph -m 6374.situs > graph.pdb

bondtree.pl graph.pdb > tmp
pymol -u tmp

All edges