Tutorial 2: Ensemble refinement with EMMIVox
These are the steps to do ensemble modelling using a cryo-EM map and EMMIVox.
Note 1: To do ensemble modelling you need to first complete the single-structure refinement tutorial.
Note 2: all the python scripts are contained in the scripts
folder. Please run each step of the tutorial in the dedicated subfolder of tutorials/2-ensemble
referred to as Working directory.
0. Production
-
Prepare the input files and directories for ensemble modelling with 16 replicas using:
bash prepare_PLUMED_Ensemble_input.sh 16
You can specify fewer or more replicas depending on the number of CPU cores and GPUs available.
The generated PLUMED input file, called
plumed_EMMI.dat
, should look like this:
# include topology info MOLINFOThis command is used to provide information on the molecules that are present in your system. More details STRUCTUREa file in pdb format containing a reference structure=../../../1-refinement/3-Map-Scaling/step3_input_xtc.pdb WHOLE The reference structure is whole, i # define map atoms system-map: GROUPDefine a group of atoms so that a particular list of atoms can be referenced with a single label in definitions of CVs or virtual atoms. More details NDX_FILEthe name of index file (gromacs syntax)=../../../1-refinement/0-Building/index.ndx NDX_GROUPthe name of the group to be imported (gromacs syntax) - first group found is used by default=System-MAP # make map atoms whole WHOLEMOLECULESThis action is used to rebuild molecules that can become split by the periodic boundary conditions. More details ... ADDREFERENCE Define the reference position of the first atom of each entity using a PDB file EMST only for backward compatibility, as of PLUMED 2 ENTITY0the atoms that make up a molecule that you wish to align=system-map STRIDE the frequency with which molecules are reassembled=4 ... WHOLEMOLECULES # create EMMI score EMMIVOXBayesian single-structure and ensemble refinement with cryo-EM maps. More details ... # name of this action LABELa label for the action so that its output can be referenced in the input to other actions=emmi # general parameters TEMPtemperature=300.0 NL_STRIDEneighbor list update frequency=50 NL_DIST_CUTOFFneighbor list distance cutoff=1.0 NL_GAUSS_CUTOFFneighbor list Gaussian sigma cutoff=3.0 # define atoms for cryo-EM restraint and read experimental data ATOMSatoms used in the calculation of the density map, typically all heavy atoms=system-map DATA_FILEfile with cryo-EM map=../../../1-refinement/1-Map-Preparation/emd_plumed_aligned.dat # info about the experimental map NORM_DENSITYintegral of experimental density=684.480896 RESOLUTIONcryo-EM map resolution=0.19 # data likelihood (or noise model): Marginal SIGMA_MINminimum density error=0.2 GPU calculate EMMIVOX on GPU with Libtorch # output: in production write with the frequency at which XTC/TRR are written STATUS_FILEwrite a file with all the data useful for restart=EMMIStatus WRITE_STRIDEstride for writing status file=5000 # comment this if you have a hetero-complex BFACT_NOCHAIN Do not use chain ID for Bfactor MC # in ensemble modelling, you should *NOT* sample Bfactors, but read from input # the minimum Bfactor found in single-structure refinement # and keep constant and the same for all residues BFACT_READ Read Bfactor on RESTART (automatic with DBFACT>0) # scale factor SCALEscale factor=1.3 # correlation CORRELATION calculate correlation coefficient ... # in production, apply bias to system emr: BIASVALUETakes the value of one variable and use it as a bias More details ARGthe labels of the scalar/vector arguments whose values will be used as a bias on the system=emmi.scoreb STRIDEthe frequency with which the forces due to the bias should be calculated=4 # print output to file PRINTPrint quantities to a file. More details ARGthe labels of the values that you would like to print to the file=emmi.* FILEthe name of the file on which to output these quantities=COLVAR STRIDE the frequency with which the quantities of interest should be output=5000
-
Run a 10-ns long simulation in parallel using 16 MPI processes, each one parallelized on multiple CPU cores (
$OMP_NUM_THREADS
). You might need to adapt this line depending on your command to submit parallel jobs, i.e. srun, mpiexec, or mpirun.srun -N 16 gmx_mpi mdrun -pin on -ntomp $OMP_NUM_THREADS -plumed plumed_EMMI.dat -s production.tpr -multidir rep-* -nsteps 5000000
Note: Usually a good setup for multiple-replica simulations is to allocate 1 GPU to each replica, and 6-10 CPU cores (
$OMP_NUM_THREADS
) per replica depending on the system size. You can safely allocate (using your job manager, i.e. slurm or pbs) 2 replicas on the same GPU without significant loss in performance.
Working directory: 0-Production
1. Postprocessing and validation
-
We first need to concatenate the trajectories of all replicas and fix discontinuities due to Periodic Boundary Conditions:
bash prepare_PLUMED_Ensemble_analysis.sh
-
Now we need to transform back the concatenated trajectory
traj-all-PBC.xtc
to fit the original cryo-EM map using thetransformation.dat
created during the map preparation stage of single-structure refinement:python align-XTC.py conf_map.pdb traj-all-PBC.xtc traj-all-PBC-align.xtc ../../1-refinement/1-Map-Preparation/transformation.dat
-
We add a Bfactor column to the reference PDB (
conf_map.pdb
). These Bfactors are the same for all residues, they were set before doing ensemble modelling equal to the minimum Bfactor found in the single-structure refinement:python add-BFACT.py conf_map.pdb ../0-Production/EMMIStatus conf_EMMIVOX.pdb
-
And finally we calculate model fit to the data (CC_mask) of the original PDB
7P6A.pdb
and of our ensemble:python cryo-EM_validate.py ../../1-refinement/1-Map-Preparation/emd_13223.map --pdbA=../../1-refinement/1-Map-Preparation/7P6A.pdb --pdbC=conf_EMMIVOX.pdb --trjC=traj-all-PBC-align.xtc --threshold=0.0
Note: if you performed energy minimization and validation after single-structure refinement, you can compare also with the EMMIVOX model:
python cryo-EM_validate.py ../../1-refinement/1-Map-Preparation/emd_13223.map --pdbA=../../1-refinement/5-Analysis/conf_EMMIVOX.pdb --pdbC=conf_EMMIVOX.pdb --trjC=traj-all-PBC-align.xtc --threshold=0.0
Working directory: 1-Analysis