CryoREAD is a computational tool that uses deep learning to automatically build
full DNA/RNA atomic structures from cryo-EM maps.
A 20-minute video on the CryoREAD algorithm is available on our lab's YouTube channel. The algorithm has five steps:
(1) Structure Detection by Deep Learning: the locations of phosphates, sugars, and bases, along with the four base types, are detected by two-stage networks.
(2) Structure Node Clustering: representative nodes are identified by clustering the detected grid positions.
(3) Backbone Tracing: the backbone is traced through a graph constructed from the representative sugar nodes.
(4) Sequence Assignment: sequences are assigned to local fragments along the backbone path, which are then assembled in the subsequent step.
(5) Full Atom Model: full nucleotides are constructed from triangles of sugar, phosphate, and base (S-P-B) nodes, followed by atomic structure refinement.
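To make step (2) more concrete, here is a minimal sketch of how densely detected grid points could be reduced to a few representative nodes. It is not CryoREAD's actual clustering procedure: it swaps in a simple greedy, radius-based selection as a stand-in, and the function name `pick_representative_nodes`, the `radius` and `min_prob` values, and the toy coordinates are all hypothetical.

```python
import math


def pick_representative_nodes(points, radius=2.0, min_prob=0.5):
    """Greedy stand-in for clustering detected grid points.

    points: list of ((x, y, z), probability) tuples.
    Points below min_prob are discarded. The remaining points are visited
    from highest to lowest probability, and a point is kept as a
    representative only if no kept point lies within `radius` of it.
    """
    candidates = sorted(
        (p for p in points if p[1] >= min_prob),
        key=lambda p: p[1],
        reverse=True,
    )
    representatives = []
    for coord, prob in candidates:
        if all(math.dist(coord, rep) > radius for rep, _ in representatives):
            representatives.append((coord, prob))
    return representatives


# Toy input: two tight groups of detections plus one low-probability point.
detected = [
    ((0.0, 0.0, 0.0), 0.90),
    ((0.5, 0.0, 0.0), 0.80),
    ((6.0, 0.0, 0.0), 0.70),
    ((6.2, 0.1, 0.0), 0.95),
    ((3.0, 3.0, 3.0), 0.20),
]
nodes = pick_representative_nodes(detected)
for coord, prob in nodes:
    print(coord, prob)
```

Each tight group collapses to its highest-probability member, which is the general idea behind using representative nodes rather than every detected grid point when building the backbone graph.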
To run detection only, without structure modeling:
python3 main.py --mode=0 -F=example/21051.mrc -M=best_model --contour=0.3 --gpu=0 --batch_size=4 --prediction_only
The predicted probability maps are saved in [Predict_Result/(map_name)/2nd_stage_detection] in MRC format. The directory contains 8 MRC files, one for each of the 8 detected classes. Here (map_name) is 21051.
To build a structure without sequence information, run:
python3 main.py --mode=0 -F=example/21051.mrc -M=best_model --contour=0.3 --gpu=0 --batch_size=4 --resolution=3.7 --no_seqinfo --refine
The automatically built atomic structure is saved in [Predict_Result/(map_name)/Output/Refine_cycle[k].pdb] in PDB format, where k is 3 by default. If refinement fails because your dependencies are not properly installed, you may find only Refine_cycle1.pdb or Refine_cycle2.pdb. Here (map_name) is 21051.
To build a structure with sequence information, run:
python3 main.py --mode=0 -F=example/21051.mrc -M=best_model -P=example/21051.fasta --contour=0.3 --gpu=0 --batch_size=4 --rule_soft=0 --resolution=3.7 --refine
The automatically built atomic structure is saved in [Predict_Result/(map_name)/Output/Refine_cycle[k].pdb] in PDB format, where k is 3 by default. If refinement fails because your dependencies are not properly installed, you may find only Refine_cycle1.pdb or Refine_cycle2.pdb. Structures modeled without considering sequence information are also saved as [Predict_Result/(map_name)/Output/CryoREAD_noseq.pdb] (without refinement). For reference, structures that consider only the sequence information, without connecting gap regions, are saved as [Predict_Result/(map_name)/Output/CryoREAD_seqonly.pdb] (without refinement). Here (map_name) is 21051. Compared with the structures modeled without sequence information, only the sequence assignments change; the overall structures are similar.
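The three commands above differ only in a few flags, and the refined model may stop at an earlier cycle if refinement fails. The sketch below assembles the same argument lists and picks the highest-numbered Refine_cycle[k].pdb that exists. The helper names `build_command` and `find_refined_model` are hypothetical and not part of CryoREAD; only the flags and paths quoted above are taken from this page, and the sketch assumes it is pointed at the Predict_Result directory.

```python
import re
from pathlib import Path


def build_command(map_path, contour, *, fasta=None, resolution=None,
                  prediction_only=False, model="best_model",
                  gpu=0, batch_size=4):
    """Return one of the three main.py invocations shown above as an argv list."""
    cmd = ["python3", "main.py", "--mode=0", f"-F={map_path}", f"-M={model}"]
    if fasta is not None:
        cmd.append(f"-P={fasta}")
    cmd += [f"--contour={contour}", f"--gpu={gpu}", f"--batch_size={batch_size}"]
    if prediction_only:
        # Detection only: no modeling or refinement flags are added.
        cmd.append("--prediction_only")
        return cmd
    if resolution is None:
        raise ValueError("resolution is required for structure modeling")
    if fasta is not None:
        cmd.append("--rule_soft=0")
    cmd.append(f"--resolution={resolution}")
    if fasta is None:
        cmd.append("--no_seqinfo")
    cmd.append("--refine")
    return cmd


def find_refined_model(result_root, map_name):
    """Return the Refine_cycle[k].pdb with the largest k, or None if absent."""
    output_dir = Path(result_root) / map_name / "Output"
    if not output_dir.is_dir():
        return None
    pattern = re.compile(r"Refine_cycle(\d+)\.pdb$")
    found = []
    for path in output_dir.glob("Refine_cycle*.pdb"):
        match = pattern.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    if not found:
        return None
    return max(found, key=lambda item: item[0])[1]


# Print the modeling command that uses sequence information.
print(" ".join(build_command("example/21051.mrc", 0.3,
                             fasta="example/21051.fasta", resolution=3.7)))
# Look for the final model of the example map (None if nothing has been run).
print(find_refined_model("Predict_Result", "21051"))
```

The argument list can be passed to `subprocess.run` from the CryoREAD source directory. If `find_refined_model` returns a cycle lower than 3, refinement probably stopped early, which per the note above usually points to a dependency problem.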
CryoREAD is available on three public platforms, which offer essentially the same functionality.
Input: a cryo-EM map and an optional sequence file. Output: the modeled structure. The input and output are the same across all platforms.
Simply upload your files to run CryoREAD; the full structure is then visualized online.
Step-by-step instructions are available. Because of redistribution constraints on Coot and Phenix, the structure here is not refined and may include atom clashes. For free users, Colab has a 4-hour running time limit and may not work for large structures (>=1000 nucleotides).
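As a rough pre-check against the size limit mentioned above, the nucleotides in the sequence file can be counted before choosing a platform. This is only a sketch: the function names are hypothetical, the 1000-nucleotide threshold is the one quoted above, and the count from a FASTA file is only a proxy for the size of the structure in the map (for example, a map may contain several copies of a chain).

```python
COLAB_LARGE_STRUCTURE_NT = 1000  # threshold quoted in the note above


def count_nucleotides(fasta_text):
    """Count sequence characters in FASTA text, skipping header and blank lines."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total


def may_be_too_large_for_free_colab(fasta_text):
    """True if the sequence length reaches the 'large structure' threshold."""
    return count_nucleotides(fasta_text) >= COLAB_LARGE_STRUCTURE_NT


example = ">chain_A\nACGU\nACG\n>chain_B\nGGCC\n"
print(count_nucleotides(example))                # 11
print(may_be_too_large_for_free_colab(example))  # False
```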
The full code is available here, which makes it easier for users to modify it and develop their own tools.
It provides two additional features:
1. Detection Output: this option outputs the probability values of detected phosphates, sugars, bases, and base types in the map, computed by deep learning, for the user's reference.
2. Refinement pipeline: structures from other sources can be refined in the specified EM map.
CPU: >=8 cores
Memory: >=50 GB. For maps with more than 3,000 nucleotides, more than 300 GB of memory is needed if a sequence is provided.
GPU: any CUDA-capable GPU with more than 12 GB of memory.
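The CPU and memory figures above can be expressed as a small pre-flight check. This is a sketch with hypothetical function names. It encodes only the thresholds listed above and does not measure installed memory or query the GPU.

```python
import os

MIN_CPU_CORES = 8
BASE_MEMORY_GB = 50
LARGE_MAP_MEMORY_GB = 300
LARGE_MAP_NUCLEOTIDES = 3000


def memory_threshold_gb(n_nucleotides, sequence_provided):
    """Return the memory figure from the requirements list for a given job.

    Maps with more than 3,000 nucleotides need more than 300 GB when a
    sequence is provided; otherwise the 50 GB baseline applies.
    """
    if sequence_provided and n_nucleotides > LARGE_MAP_NUCLEOTIDES:
        return LARGE_MAP_MEMORY_GB
    return BASE_MEMORY_GB


def has_enough_cpu_cores(min_cores=MIN_CPU_CORES):
    """Check the logical core count reported by the operating system."""
    cores = os.cpu_count()
    return cores is not None and cores >= min_cores


print(memory_threshold_gb(3500, sequence_provided=True))   # 300
print(memory_threshold_gb(3500, sequence_provided=False))  # 50
print(has_enough_cpu_cores())
```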
CryoREAD is free software for academic and non-commercial users.
It is released under the terms of the GNU General Public License Ver.3 (https://www.gnu.org/licenses/gpl-3.0.en.html).
Commercial users, please contact dkihara@purdue.edu for alternative licensing.
Any publication that uses data or results generated by the CryoREAD program should cite the following reference.
Xiao Wang, Genki Terashi, & Daisuke Kihara. De novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nature Methods. 20(11): 1739-1747 (2023)
The output structures and detection maps benchmarked in this paper are available here.
© 2024 KIHARA Bioinformatics LABORATORY, PURDUE University | Design by TEMPLATED.