GROMACS Tutorial

Step One: Prepare the Protein Topology

We must download the protein structure file we will be working with. For this tutorial, we will utilize T4 lysozyme L99A/M102Q (PDB code 3HTB). Go to the RCSB website and download the PDB text for the crystal structure.

Once you have downloaded the structure, you can visualize it using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters, PO4, and BME. Note that such a procedure is not universally appropriate (i.e., the case of a bound active site water molecule). For our intentions here, we do not need crystal water or other species, which are just crystallization co-solvents. We will instead focus on the ligand called "JZ4," which is 2-propylphenol.

If you want a clean version of the .pdb file to check your work, you can download it here. The problem we now face is that the JZ4 ligand is not a recognized entity in any of the force fields provided with GROMACS, so pdb2gmx will give a fatal error if you were try to pass this file through it. Topologies can only be assembled automatically if an entry for a building block is present in the .rtp (residue topology) file for the force field. Since this is not the case, we will prepare our system topology in two steps:

  1. Prepare the protein topology with pdb2gmx
  2. Prepare the ligand topology using external tools

Since we will be preparing these two topologies separately, we must save the protein and JZ4 ligand into separate coordinate files. Save the JZ4 coordinates like so:

grep JZ4 3HTB_clean.pdb > jz4.pdb

Then simply delete the JZ4 lines from 3HTB_clean.pdb. At this point, preparing the protein topology is trivial. The force field we will be using in this tutorial is CHARMM36, obtained from the MacKerell lab website. While there, download the latest CHARMM36 force field tarball and the "cgenff_charmm2gmx.py" conversion script, which we will use later. Download the version of the conversion script that corresponds to your installed Python version (Python 2.x or 3.x).

Unarchive the force field tarball in your working directory:

tar -zxvf charmm36-jul2022.ff.tgz

There should now be a "charmm36-jul2022.ff" subdirectory in your working directory. Write the topology for the T4 lysozyme with pdb2gmx:

gmx pdb2gmx -f 3HTB_clean.pdb -o 3HTB_processed.gro -ter

You will be prompted to make 3 selections.

  1. Force field
  2. Water model
  3. Terminus type
Select the Force Field:
From current directory:
 1: CHARMM36 all-atom force field (July 2022)
From '/usr/local/gromacs/share/gromacs/top':
 2: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
 3: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
 4: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
 5: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
 6: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
 7: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
 8: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
 9: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
10: GROMOS96 43a1 force field
11: GROMOS96 43a2 force field (improved alkane dihedrals)
12: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
13: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
15: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40,, 843-856, DOI: 10.1007/s00249-011-0700-9)
16: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)

For this tutorial, choose the CHARMM36 force field (option 1), listed first under "From current directory" in the list.

Choose the default water model (CHARMM-modified TIP3P) and then choose "NH3+" and "COO-" for the termini. This interactive selection is necessary due to the N-terminal residue being methionine (MET), which causes pdb2gmx to choose an incompatible terminus type that is intended for carbohydrates. You must select the protein-specific termini, otherwise you will get a fatal error about non-matching atom names.

Back: Introduction Next: The Ligand Topology

Site design and content copyright Justin Lemkul
Problems with the site? Send them to the Webmaster