Step Two: Prepare the Ligand Topology
We must now deal with the ligand. But how does one come up with parameters for some species that the force field does not automatically recognize? Proper treatment of ligands is one of the most challenging tasks in molecular simulation. Force field authors spend years of their lives deriving self-consistent force fields, and it is no small task to introduce a new species into this framework. Force field parameters for any new species must be derived and validated in a manner that is consistent with the original force field. For the OPLS, AMBER, and CHARMM force fields, this derivation often takes the form of various quantum mechanical calculations. The primary literature for these force fields describes the required procedure. For GROMOS force fields, parameterization methodology is less clear, relying on empirical fitting of condensed-phase behavior. That is, some initial charges and Lennard-Jones parameters are calculated for each atom type, evaluated for their accuracy, and refined. While the end result is very satisfactory, i.e. fluids resemble their real-world counterparts, the derivation process can be laborious and frustrating. For this reason, automated tools are greatly preferred. For each force field, there are methodologies or software programs that purport to give parameters compatible with various force fields. Not all of them are equally accurate. For a few examples, refer to the following table:
Add Hydrogen Atoms to JZ4In this tutorial, we will be generating the JZ4 topology with the CGenFF server. Click the link in the table above to visit the site. Register for a (free) account and activate it. CGenFF requires a Sybyl .mol2 file as its input, to collect rudimentary atom type information and bonded connectivity. CHARMM is also an all-atom force field, meaning all H atoms are explicitly represented. Crystal structures do not typically assign H coordinates, so they have to be built in. To produce a .mol2 file and add H atoms, use the Avogadro program. Open jz4.pdb in Avogadro, and from the "Build" menu, choose "Add Hydrogens." Avogadro will build all of the H atoms onto the JZ4 ligand. Save a .mol2 file (File -> Save As... and choose Sybyl Mol2 from the drop-down menu) named "jz4.mol2." Several corrections must be made to jz4.mol2 before it can be used. Open jz4.mol2 in your favorite plain-text editor and you will find: @<TRIPOS>MOLECULE ***** 22 22 0 0 0 SMALL GASTEIGER @<TRIPOS>ATOM 1 C4 24.2940 -24.1240 -0.0710 C.3 167 JZ4167 -0.0650 2 C7 21.5530 -27.2140 -4.1120 C.ar 167 JZ4167 -0.0613 3 C8 22.0680 -26.7470 -5.3310 C.ar 167 JZ4167 -0.0583 4 C9 22.6710 -25.5120 -5.4480 C.ar 167 JZ4167 -0.0199 5 C10 22.7690 -24.7300 -4.2950 C.ar 167 JZ4167 0.1200 6 C11 21.6930 -26.4590 -2.9540 C.ar 167 JZ4167 -0.0551 7 C12 22.2940 -25.1870 -3.0750 C.ar 167 JZ4167 -0.0060 8 C13 22.4630 -24.4140 -1.8080 C.3 167 JZ4167 -0.0245 9 C14 23.9250 -24.7040 -1.3940 C.3 167 JZ4167 -0.0518 10 OAB 23.4120 -23.5360 -4.3420 O.3 167 JZ4167 -0.5065 11 H 25.3133 -24.3619 0.1509 H 1 UNL1 0.0230 12 H 23.6591 -24.5327 0.6872 H 1 UNL1 0.0230 13 H 24.1744 -23.0611 -0.1016 H 1 UNL1 0.0230 14 H 21.0673 -28.1238 -4.0754 H 1 UNL1 0.0618 15 H 21.9931 -27.3472 -6.1672 H 1 UNL1 0.0619 16 H 23.0361 -25.1783 -6.3537 H 1 UNL1 0.0654 17 H 21.3701 -26.8143 -2.0405 H 1 UNL1 0.0621 18 H 21.7794 -24.7551 -1.0588 H 1 UNL1 0.0314 19 H 22.2659 -23.3694 -1.9301 H 1 UNL1 0.0314 20 H 24.5755 -24.2929 -2.1375 H 1 UNL1 0.0266 21 H 24.0241 -25.7662 -1.3110 H 1 UNL1 0.0266 22 H 23.7394 -23.2120 -5.1580 H 1 UNL1 0.2921 @<TRIPOS>BOND 1 4 3 ar 2 4 5 ar 3 3 2 ar 4 10 5 1 5 5 7 ar 6 2 6 ar 7 7 6 ar 8 7 8 1 9 8 9 1 10 9 1 1 11 1 11 1 12 1 12 1 13 1 13 1 14 2 14 1 15 3 15 1 16 4 16 1 17 6 17 1 18 8 18 1 19 8 19 1 20 9 20 1 21 9 21 1 22 10 22 1 The first change that needs to be made is in the @<TRIPOS>MOLECULE JZ4 Next, fix the residue names and numbers such that they are all the same. This .mol2 file only contains one molecule, therefore there should only be one residue name and number specified. After editing, the corrected ATOM section of jz4.mol2 should read: @<TRIPOS>ATOM 1 C4 24.2940 -24.1240 -0.0710 C.3 1 JZ4 -0.0650 2 C7 21.5530 -27.2140 -4.1120 C.ar 1 JZ4 -0.0613 3 C8 22.0680 -26.7470 -5.3310 C.ar 1 JZ4 -0.0583 4 C9 22.6710 -25.5120 -5.4480 C.ar 1 JZ4 -0.0199 5 C10 22.7690 -24.7300 -4.2950 C.ar 1 JZ4 0.1200 6 C11 21.6930 -26.4590 -2.9540 C.ar 1 JZ4 -0.0551 7 C12 22.2940 -25.1870 -3.0750 C.ar 1 JZ4 -0.0060 8 C13 22.4630 -24.4140 -1.8080 C.3 1 JZ4 -0.0245 9 C14 23.9250 -24.7040 -1.3940 C.3 1 JZ4 -0.0518 10 OAB 23.4120 -23.5360 -4.3420 O.3 1 JZ4 -0.5065 11 H 25.3133 -24.3619 0.1509 H 1 JZ4 0.0230 12 H 23.6591 -24.5327 0.6872 H 1 JZ4 0.0230 13 H 24.1744 -23.0611 -0.1016 H 1 JZ4 0.0230 14 H 21.0673 -28.1238 -4.0754 H 1 JZ4 0.0618 15 H 21.9931 -27.3472 -6.1672 H 1 JZ4 0.0619 16 H 23.0361 -25.1783 -6.3537 H 1 JZ4 0.0654 17 H 21.3701 -26.8143 -2.0405 H 1 JZ4 0.0621 18 H 21.7794 -24.7551 -1.0588 H 1 JZ4 0.0314 19 H 22.2659 -23.3694 -1.9301 H 1 JZ4 0.0314 20 H 24.5755 -24.2929 -2.1375 H 1 JZ4 0.0266 21 H 24.0241 -25.7662 -1.3110 H 1 JZ4 0.0266 22 H 23.7394 -23.2120 -5.1580 H 1 JZ4 0.2921 Last, notice the strange bond order in the perl sort_mol2_bonds.pl jz4.mol2 jz4_fix.mol2 Use "jz4_fix.mol2" in the next step. Generate the JZ4 Topology with CGenFFThe jz4_fix.mol2 file is now ready for use to produce a topology. Visit the CGenFF server, log into your account, and and click "Upload molecule" at the top of the page. Upload jz4_fix.mol2 and the CGenFF server will quickly return a topology in the form of a CHARMM "stream" file (extension .str). Save its contents from your web browser into a file called "jz4.str." You can also download a copy of this file here. The CHARMM stream file contains all of the topology information - atom types, charges, and bonded connectivity. It also has sections for additional bonded parameters that were generated by analogy for any internal interactions not covered by the force field. CGenFF also provides penalty scores for each parameter, that is, an assessment of how reliable the assigned parameter is. Anything below 10 is considered acceptable for immediate use. Values from 10 - 50 imply that some validation of the topology is warranted, and any penalties larger than 50 generally require manual reparametrization. This penalty scoring is one of the most important features of the CGenFF server. Many other servers generate topologies and are "black boxes," which users are simply left to trust implicitly. Staking your entire research project on an automatic program without verification is very dangerous. A poor ligand topology can lead to significant wasted time and unreliable results. Always validate the topologies of newly parametrized species! At minimum, check the magnitudes of charges and the atom types assigned to the ligand against existing moieties in the force field. Examine the contents of jz4.str and look at the penalties for the charges and the new dihedral parameters. All of them are very low, suggesting that this topology is of very good quality and can be used directly for our simulation. The JZ4 topology in CHARMM format is all well and good, but it's not useful if we are trying to run our simulation in GROMACS. Download a suitable version of the cgenff_charmm2gmx.py script from this GitHub site. The script name will include the version, _py2 or _py3, but here for simplicity this information is omitted but you will need to use the actual name of the script that you downloaded for your Python version. Perform the conversion with: python cgenff_charmm2gmx.py JZ4 jz4_fix.mol2 jz4.str charmm36-jul2022.ff PLEASE NOTE: This script requires NetworkX package, and there are very specific versions that have been tested. Please see the above-linked GitHub site for version recommendations and the combinations of Python and NetworkX that have been tested. Deviations from these versions may lead to syntax issues that may cause the scripts to fail. NOTE the following screen output at the end of the successful conversion: ============ DONE ============ Conversion complete. The molecule topology has been written to jz4.itp Additional parameters needed by the molecule are written to jz4.prm, which needs to be included in the system .top ============ DONE ============ This ligand introduces new bonded parameters that are not part of the existing force field, and these parameters are written to a file called "jz4.prm," which is in the format of a GROMACS .itp file. We will deal with this file shortly, but it is important to note its existence. The ligand topology is now written to "jz4.itp," which contains the ligand Build the ComplexFrom pdb2gmx, we have a file called "3HTB_processed.gro" that contains the processed, force field-compliant structure of our protein. We also have "jz4_ini.pdb" from cgenff_charmm2gmx.py that has all of the necessary H atoms and matches the atom names in the JZ4 topology. Convert this .pdb file to .gro format with editconf: gmx editconf -f jz4_ini.pdb -o jz4.gro Copy 3HTB_processed.gro to a new file to be manipulated, for instance, call it "complex.gro," as the addition of JZ4 to the protein will generate our protein-ligand complex. Next, simply copy the coordinate section of jz4.gro and paste it into complex.gro, below the last line of the protein atoms, and before the box vectors, like so: 163ASN C 1691 0.621 -0.740 -0.126 163ASN O1 1692 0.624 -0.616 -0.140 163ASN O2 1693 0.683 -0.703 -0.011 5.99500 5.19182 9.66100 0.00000 0.00000 -2.99750 0.00000 0.00000 0.00000 becomes (added text in bold green)... 163ASN C 1691 0.621 -0.740 -0.126 163ASN O1 1692 0.624 -0.616 -0.140 163ASN O2 1693 0.683 -0.703 -0.011 1JZ4 C4 1 2.429 -2.412 -0.007 1JZ4 C7 2 2.155 -2.721 -0.411 1JZ4 C8 3 2.207 -2.675 -0.533 1JZ4 C9 4 2.267 -2.551 -0.545 1JZ4 C10 5 2.277 -2.473 -0.430 1JZ4 C11 6 2.169 -2.646 -0.295 1JZ4 C12 7 2.229 -2.519 -0.308 1JZ4 C13 8 2.246 -2.441 -0.181 1JZ4 C14 9 2.392 -2.470 -0.139 1JZ4 OAB 10 2.341 -2.354 -0.434 1JZ4 H1 11 2.531 -2.436 0.015 1JZ4 H2 12 2.366 -2.453 0.069 1JZ4 H3 13 2.417 -2.306 -0.010 1JZ4 H4 14 2.107 -2.812 -0.407 1JZ4 H5 15 2.199 -2.735 -0.617 1JZ4 H6 16 2.304 -2.518 -0.635 1JZ4 H7 17 2.137 -2.681 -0.204 1JZ4 H8 18 2.178 -2.476 -0.106 1JZ4 H9 19 2.227 -2.337 -0.193 1JZ4 H10 20 2.458 -2.429 -0.214 1JZ4 H11 21 2.402 -2.577 -0.131 1JZ4 H12 22 2.374 -2.321 -0.516 5.99500 5.19182 9.66100 0.00000 0.00000 -2.99750 0.00000 0.00000 0.00000 Since we have added 22 more atoms into the .gro file, increment the second line of complex.gro to reflect this change. There should be 2636 atoms in the coordinate file now. Build the TopologyIncluding the parameters for the JZ4 ligand in the system topology is very easy. Just insert a line that says ; Include Position restraint file #ifdef POSRES #include "posre.itp" #endif ; Include water topology #include "./charmm36-jul2022.ff/tip3p.itp" becomes... ; Include Position restraint file #ifdef POSRES #include "posre.itp" #endif ; Include ligand topology #include "jz4.itp" ; Include water topology #include "./charmm36-jul2022.ff/tip3p.itp" The ligand introduces new dihedral parameters, which were written to "jz4.prm" by the cgenff_charmm2gmx.py script. At the TOP of topol.top, insert an ; Include forcefield parameters #include "./charmm36-jul2022.ff/forcefield.itp" ; Include ligand parameters #include "jz4.prm" [ moleculetype ] ; Name nrexcl Protein_chain_A 3 The placement of this The last adjustment to be made is in the [ molecules ] ; Compound #mols Protein_chain_A 1 JZ4 1 The topology and coordinate file are now in agreement with respect to the contents of the system.
|
Site design and content copyright Justin Lemkul
Problems with the site? Send them to the Webmaster