Copyright © 2007-2008 Washington University in St. Louis
Table of Contents
List of Figures
List of Tables
List of Equations
This tutorial assumes you've completed the APBS tutorial and are comfortable with PDB2PQR and APBS calculations for those applications. In order to perform the provided examples, you will need APBS 0.5.1, access to PDB2PQR (web or command line), and a molecular graphics program such as VMD or PyMOL.
Why pKa calculations? Although pKa calculations may not seem like the most routine applications for demonstrating continuum electrostatics concepts, they have important scientific and pedagogic value. From a scientific standpoint, pKa values are important determinants of biomolecular (particularly enzymatic) function and can be used to assess functional activity and identify active sites. From a pedagogical perspective, pKa calculations require all of the important continuum electrostatics concepts and therefore serve as a "self-contained" introduction to solvation and binding energies.
Table of Contents
This is a very brief introduction to the concepts behind biomolecular pKas and titration states. More information can be obtained from most biochemistry or biophysics textbooks as well as some of the original articles on pKa evaluation [1].
Recall that the acid dissociation constant Ka describes the dissocation of an acid into its components:

by way of the activities

Under “conditions of ideality,” [2] activities can be replaced with concentrations to give

You should also recall that chemical equilibrium constants can be related to free energies by

However, chemists found it easier to use base-10 logarithms than natural logarithms for measuring pH, therefore the pKa is defined as

In many calculations, pKa values are assigned based on model values for amino acid side chains to mimic

for an isolated amino acid in solution. Some model pKa values are given in Table 2.1, “Model amino acid pKa values for common titratable groups; data from Nielsen et al (see footnote) ”.
Table 2.1. Model amino acid pKa values for common titratable groups; data from Nielsen et al (see footnote)
| Amino acid | Model pKa |
|---|---|
| Arginine | 13.0 |
| Aspartic acid | 4.0 |
| Cysteine | 8.7 |
| C-terminus | 3.8 |
| Glutamic acid | 4.4 |
| Histidine | 6.3 |
| Lysine | 10.4 |
| N-terminus | 8.0 |
| Tyrosine | 9.6 |
As we'll see in the next section, these model values provide the basis for calculating pKa values in proteins.
The role of the model pKa values introduced in the previous sections is to move all the chemical (bond making and breaking) complexity of protonation into the model values. In particular, pKa values in proteins are calculated as pertubations of the model compounds according to the free energy cycles shown below.
The pKa for the amino acid in the context of the protein is given by the free energy cycle:
We are interested in determining the unknown ΔaGHA from the known model ΔaGHA,model and the unknown ΔxferGHA and ΔxferGA- according to:
In general, the quantities ΔxferGHA and ΔxferGA- are obainted from a computational approach. Nearly any free energy calculation method could be used to obtain these energies according to a scheme where the (de)solvation energies of the charged and uncharged amino acids are calculated according to:
Note that these energies have all assumed the same background state of the protein for the calculations; in other words, the same states are used for other titratable groups in the protein for charged and uncharged amino acid states.
We'll discuss the implications of this assumption a bit later in this tutorial.
Although nearly any free energy method could be used to evaluate the energies of transferring the protonated and unprotonated amino acids from solution into the protein environment, continuum electrostatics offer a (usually) satisfying compromise between accuracy and computational efficiency.
The transfer free energies to be calculated, ΔxferGHA and ΔxferGA-, can be determined from Poisson-Boltzmann (PB) energies. In particular, these energies can be calculated as effective “binding energy calculations” similar to those covered in the APBS examples and tutorial:
where
Gprotein with charged X is the electrostatic energy of the protein with group X bound and all charges on X set to their normal values.
Gprotein with uncharged X is the electrostatic energy of the protein with group X bound and all charges on X set to zero.
Gcharged X in solution is the electrostatic energy of group X in solution with all charges set to their normal values.
Note that, as with binding energies, Equation 2.2, “Transfer free energy” can be evaluated two ways:
Ways to evaluate transfer free energies
Directly in a PB equation by computing the difference of total electrostatic energies (including self energies) from PB calculations for each of the states. In order for this to work, all conformations/grid positions/charge states must be the same in each PB calculation.
Indirectly through PB calculations of solvation free energies and Coulomb's law calculations (in a dielectric εp) of electrostatic interaction energies. For a sufficiently fine grid, this method is much more stable.
Both of these methods can be combined, via free energy cycles, to give the desired ΔxferGX. However, when care is taken to use the same grids and conformations for all calculations, the direct method using total electrostatic energies is usually the most efficient.
Note that none of the methods discussed above have explicitly allowed for changes in titration state of other groups in the protein during protonation/deprotonation of the acid group of interest. Additionally, none of these methods explicitly provide for conformational changes in the protein coupled to protonation/deprotonoation. As such, we cannot claim to be computing true pKa values with this method. Instead, we are calculating so-called intrinsic pKa values.
[1] The following references provide a good introduction to biomolecular pKa calculations:
Bashford D, Karplus M. pKa's of ionizable groups in proteins: Atomic detail from a continuum electrostatic model. Biochemistry. 29 (44), 10219-25, 1990. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=2271649
Antosiewicz J, McCammon JA, Gilson MK. The determinants of pKas in proteins. Biochemistry. 35 (24), 7819-33, 1996. http://pubs.acs.org/cgi-bin/pagelookup?bichaw/35/7819
Nielsen JE, Vriend G. Optimizing the hydrogen-bond network in Poisson-Boltzmann equation-based pKa calculations. Proteins-Structure Function and Genetics. 43 (4), 403-12, 2001. http://www3.interscience.wiley.com/cgi-bin/abstract/78505732/ABSTRACT
[2] Such conditions might happen at extremely low ion and coion concentrations in the absence of other interacting species, etc. In other words, for real biological systems, these conditions are rarely encountered.
Table of Contents
Hen egg white lysozyme (HEWL) is a very popular system for pKa calculations as it has a number of interesting values for its titratable residues. Early pKa work on this enzyme is presented in Tanford C, Roxby R. Interpretation of protein titration curves. Application to lysozyme. Biochemistry. 11 (11), 2192-8, 1972 which also contains the pKa values used in this lab. More recent pKa calculations and a review of some of the methodology can be found in Nielsen JE, Vriend G. Optimizing the hydrogen-bond network in Poisson-Boltzmann equation-based pk(a) calculations. Proteins-Structure Function and Genetics. 43 (4), 403-12, 2001. Finally, the biological relevance of lysozyme is briefly reviewed at Wikipedia.
HEWL has two active site residues GLU 35 and ASP 52 whose titration states determine the catalytic competency of the enzyme:
In particular, the enzyme is only active when ASP 52 is ionized (pKa ≈ 1.2) and GLU 35 is neutral (pKa = 6.3).
Additionally, lysozyme has ASP 66 with a pKa of 6.6 and HIS 15 with a pKa of 5.7.
In what follows, we will demonstrate the steps for determining the intrinsic pKa of GLU35. However, this actually doesn't work very well (you'll need to decide why!). Therefore, it is important that you also follow the same procedure for HIS15 and ASP66 to get better examples of working continuum solvent pKa calculations.
Download the PDB entry 2LZT from the PDB; save it as 2LZT-GLU35.pdb.
If you have time, you should also visit the PDBSum analysis page as well for additional information about the structure.
![]() | Problems with explicit water! |
|---|---|
It is very important that you remove all explicit water from the PDB file before proceeding. (Why?) |
We're going to need to generate a protonated form of GLU 35 to perform our pKa calculations. We will do this with the PDB2PQR web service. PDB2PQR “cleans up” PDB files for electrostatics calculations by performing a number of operations described on the PDB2PQR homepage.
Unless otherwise directed, PDB2PQR adds hydrogens to residues based on model pKa values.
Therefore, we will need to specify the titration state of GLU 35 for our pKa calculation by changing the residue name from GLU to GLH using your favorite text editor.
Save the result as 2LZT-GLH35.pdb
We're now ready to run PDB2PQR to generate protonated versions of our PDB files.
Use the command line version of PDB2PQR or one of the web servers listed on the PDB2PQR homepage to generate protonoated PQR files for for 2LZT-GLU35.pdb and 2LZT-GLH35.pdb.
Name your results 2LZT-GLU35.pqr and 2LZT-GLH35.pqr, respectively.
Although it is always important to test sensitivity to various force fields, I'd recommend starting with PARSE.
![]() | Note |
|---|---|
You can use PDB2PQR to assign titration states with PROPKA but don't do it for the above steps since we need to set the titration states explicit for our calculations. |
Recall that we're also going to need the isolated residue for our electrostatics calculations of intrinsic pKa.
Use your favorite text editor to extract the entire GLU 35 or GLH 35 residue from 2LZT-GLU35.pqr and
2LZT-GLH35.pqr.
Save the results as GLU35.pqr and GLH35.pqr, respectively.
Finally, we'll also need to perform electrostatics calculations on HEWL with the uncharged sidechains.
Use your favorite text editor to zero out the charges in 2LZT-GLU35.pqr and 2LZT-GLH35.pqr to create 2LZT-noGLU35.pqr and 2LZT-noGLH35.pqr.
This can be done by setting the second-to-last column in the PQR file to zero; e.g.
ATOM 534 N GLU 35 6.456 16.408 22.487 -0.5163 1.8240 ATOM 535 CA GLU 35 6.354 15.205 23.349 0.0397 1.9080 ATOM 536 C GLU 35 6.889 13.941 22.695 0.5366 1.9080 ATOM 537 O GLU 35 7.705 13.192 23.277 -0.5819 1.6612 ATOM 538 CB GLU 35 4.906 14.973 23.796 0.0560 1.9080 ATOM 539 CG GLU 35 4.476 15.932 24.948 0.0136 1.9080 ATOM 540 CD GLU 35 5.171 15.736 26.255 0.8054 1.9080 ATOM 541 OE1 GLU 35 5.844 14.786 26.570 -0.8188 1.6612 ATOM 542 OE2 GLU 35 5.038 16.682 27.056 -0.8188 1.6612 ATOM 543 H GLU 35 5.671 16.799 22.056 0.2936 0.6000 ATOM 544 HA GLU 35 6.856 15.382 24.215 0.1105 1.3870 ATOM 545 HB2 GLU 35 4.318 15.158 23.042 -0.0173 1.4870 ATOM 546 HB3 GLU 35 4.832 14.066 24.142 -0.0173 1.4870 ATOM 547 HG2 GLU 35 4.654 16.857 24.629 -0.0425 1.4870 ATOM 548 HG3 GLU 35 3.500 15.796 25.084 -0.0425 1.4870
might become
ATOM 534 N GLU 35 6.456 16.408 22.487 0.0000 1.8240 ATOM 535 CA GLU 35 6.354 15.205 23.349 0.0000 1.9080 ATOM 536 C GLU 35 6.889 13.941 22.695 0.0000 1.9080 ATOM 537 O GLU 35 7.705 13.192 23.277 0.0000 1.6612 ATOM 538 CB GLU 35 4.906 14.973 23.796 0.0000 1.9080 ATOM 539 CG GLU 35 4.476 15.932 24.948 0.0000 1.9080 ATOM 540 CD GLU 35 5.171 15.736 26.255 0.0000 1.9080 ATOM 541 OE1 GLU 35 5.844 14.786 26.570 0.0000 1.6612 ATOM 542 OE2 GLU 35 5.038 16.682 27.056 0.0000 1.6612 ATOM 543 H GLU 35 5.671 16.799 22.056 0.0000 0.6000 ATOM 544 HA GLU 35 6.856 15.382 24.215 0.0000 1.3870 ATOM 545 HB2 GLU 35 4.318 15.158 23.042 0.0000 1.4870 ATOM 546 HB3 GLU 35 4.832 14.066 24.142 0.0000 1.4870 ATOM 547 HG2 GLU 35 4.654 16.857 24.629 0.0000 1.4870 ATOM 548 HG3 GLU 35 3.500 15.796 25.084 0.0000 1.4870
We will be using focusing calculations to calculate the electrostatic potential and free energies for the systems of interest.
![]() | Warning |
|---|---|
In what follows, we are evaluating total electrostatic free energies -- e.g., energies which contain charge self-interaction terms. We will cancel these self-interaction terms in subsequent steps when we calculate solvation or transfer free energies. Therefore, it is very important that you use the same grid parameters (grid centers, dimensions, spacings, etc.) for every calculation. |
Here is a template input that we will use for each of the solvation energy calculations (download it here):
read mol pqr compound.pqr # This is the compound for which we will calculate solvation energies mol pqr ref.pqr # This is a compound used as a reference for grid centering end elec name inhom mg-auto # Focusing calculations dime 129 129 129 # This is a good grid spacing for this system cglen 52.0 66.0 79.0 # These are reasonable coarse grid settings for this system (PDB2PQR-recommended) fglen 51.0 59.0 67.0 # These are reasonable fine grid settings for this system (PDB2PQR-recommended) cgcent mol 2 # Center the grid on the reference molecule fgcent mol 2 # Center the grid on the reference molecule mol 1 lpbe bcfl sdh pdie 20.00 sdie 78.54 srfm smol sdens 40.0 chgm spl2 srad 1.40 swin 0.30 temp 298.15 calcenergy total calcforce no end # Print the final energy print energy inhom end quit
There are a number of aspects to this input file which are worth noting:
In general, compound.pqr will change for each calculation but ref.pqr will not.
Choose one molecule to be ref.pqr (2LZT-GLU35.pqr is a good choice) and use it in every calculation.
We are using a solute dielectric constant (εp) of 20 (see pdie).
This is a common choice for pKa since
[3]
it is thought to implicitly represent internal relaxation and rearrangement of the solute
[4].
We are using a molecular surface (srfm smol) with a reasonably high density of surface discretization points (sdens 40.0). pKa and other electrostatics results can be very sensitive to surface choice.
You now have enough information to calculate total electrostatic energies for all of the relevant molecules so far: 2LZT-GLU35.pqr, 2LZT-GLH35.pqr, 2LZT-noGLU35.pqr, 2LZT-noGLH35.pqr, GLU35.pqr, and GLH35.pqr.
You should be able to construct APBS input files for each of these systems by modifying the template above. Once these input files are constructed, you can run the PB calculation by
$apbsfoo.in| teefoo.out
where foo.in is the input file of interest and the output is saved in foo.out.
Recall from our earlier discussion that the transfer free energies can be evaluated in two ways. It is up to you to decide which method to use; both should give comparable results for a sufficiently fine grid. As mentioned above, the direct subtraction of total electrostatic energies is usually the most stable route, assuming identical grids and solute conformations are used for all calculations.
Any differences in results between these two transfer free energy methods are typically due to a lack of convergence in the calculations and can be resolved by decreasing the grid spacing (e.g., increasing the number of grid points).
If you choose to decompose the transfer free energies into solvation and Coulombic energy changes, then you will need to perform two additional electrostatics calculations beyond the APBS calculations outlined in the previous section:
First, you will need to re-run all of your APBS calculations with a homogeneous dielectric εs = εp to calculate solvation energies. This can be accomplished by appropriate modification of the input files from the previous section.
Second, you will need to calculate Coulombic interaction energies. The APBS program coulomb will evaluate the Coulomb's law electrostatic energy of a set of charges in a vacuum. The program's usage is
$coulomb foo.pqr
where foo.pqr is the particular PQR file you are interested in.
Note that this program can also be run with no arguments to obtain an expanded list of options.
Don't forget to scale by εp!
At this point, you should have everything you need to calculate the intrinsic pKa of interest. However, if you get stuck, I've included some example files here that might be helpful:
GLU35.pqr, GLH35.pqr, 2LZT-GLH35.pqr, 2LZT-GLU35.pqr, 2LZT-noGLH35.pqr, 2LZT-noGLU35.pqrPQR files described above
2LZT-noGLH35.in, 2LZT-GLH35.in, GLH35.inAPBS input files needed to calculate the transfer free energy for GLH directly using total electrostatic energies.
2LZT-noGLH35-vac.in, 2LZT-GLH35-vac.in, GLH35-vac.inAdditional APBS input files needed to calculate the transfer free energy for GLH using polar solvation energies.
2LZT-noGLU35.in, 2LZT-GLU35.in, GLU35.inAPBS input files needed to calculate the transfer free energy for GLU directly using total electrostatic energies.
2LZT-noGLU35-vac.in, 2LZT-GLU35-vac.in, GLU35-vac.inAdditional APBS input files needed to calculate the transfer free energy for GLU using polar solvation energies.
run-apbs.shBash script to run all the APBS input files listed above. Assumes the apbs binary is in your PATH.
run-coul.shBash script to run supporting Coulombic energy calculations for solvation energy-based evaluation of the transfer free energies. Assumes the APBS coulomb binary (usually found in apbs/tools/manip) is in your PATH.
process.shBash script to process APBS output and calculate pKa shifts directly using total electrostatic energies. Assumes results were obtained by running run-apbs.sh, linked above.
Based on this brief introduction, you should be in a good position to go back and try to evaluate intrinsic pKas for HIS 15 and ASP 66. How well do your results for those residues agree with experiment? What's different with those residues?
So far, we've only examined intrinsic pKas in this lab and have ignored coupling between titratable groups. Jens Nielsen has developed a very nice software package called pKaTool which allows you to explore coupling between titratable sites and its impact on titration events in a protein system. He has provided a tutorial (PDF) which you can use to explore coupled titration states and use to familiarize yourself with pKaTool.