Protein pKa calculations


In computational biology, protein pKa calculations are used to estimate the pKa values of amino acids as they exist within proteins. These calculations complement the pKa values reported for amino acids in their free state, and are used frequently within the fields of molecular modeling, structural bioinformatics, and computational biology.

Amino acid pKa values

The pKa values of amino acid side chains play an important role in defining the pH-dependent characteristics of a protein. The pH-dependence of the activity displayed by enzymes and the pH-dependence of protein stability, for example, are properties that are determined by the pKa values of amino acid side chains.
The pKa value of an amino acid side chain in solution is typically inferred from the pKa values of model compounds. See Amino acid for the pKa values of all amino acid side chains inferred in such a way. There are also numerous experimental studies that have yielded such values, for example by use of NMR spectroscopy.
The table below lists the model pKa values that are often used in a protein pKa calculation, and contains a third column based on protein studies.
Amino acid    pKa (model)    pKa (protein studies)
Asp           3.9            4.0
Glu           4.3            4.4
Arg           12.0           13.5
Lys           10.5           10.4
His           6.08           6.8
Cys           8.28           8.3
Tyr           10.1           9.6
N-term                       8.0
C-term                       3.6
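
As a rough illustration of how such model values are used, the sketch below evaluates the Henderson–Hasselbalch protonated fraction of each isolated side chain at a given pH. The dictionary name `MODEL_PKA` and the function name `protonated_fraction` are hypothetical, and the calculation deliberately ignores any effect of the protein environment.

```python
# Model side-chain pKa values, copied from the first value column listed above.
MODEL_PKA = {
    "Asp": 3.9,
    "Glu": 4.3,
    "Arg": 12.0,
    "Lys": 10.5,
    "His": 6.08,
    "Cys": 8.28,
    "Tyr": 10.1,
}


def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of an isolated group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    ph = 7.0  # illustrative pH, chosen arbitrarily
    for residue, pka in MODEL_PKA.items():
        print(f"{residue}: protonated fraction at pH {ph} = {protonated_fraction(pka, ph):.3f}")
```

A group is half-protonated when the pH equals its pKa, so in this isolated-group picture Asp and Glu are almost fully deprotonated at neutral pH while Lys and Arg remain protonated.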

The effect of the protein environment

When a protein folds, the titratable amino acids in the protein are transferred from a solution-like environment to an environment determined by the three-dimensional structure of the protein. For example, in an unfolded protein an aspartic acid is typically in an environment that exposes the titratable side chain to water. When the protein folds, the aspartic acid could find itself buried deep in the protein interior with no exposure to solvent.
Furthermore, in the folded protein the aspartic acid will be closer to other titratable groups in the protein and will also interact with permanent charges and dipoles in the protein.
All of these effects alter the pKa value of the amino acid side chain, and pKa calculation methods generally calculate the effect of the protein environment on the model pKa value of an amino acid side chain.
Typically the effects of the protein environment on the amino acid pKa value are divided into pH-independent effects and pH-dependent effects. The pH-independent effects are added to the model pKa value to give the intrinsic pKa value. The pH-dependent effects cannot be added in the same straightforward way and have to be accounted for using Boltzmann summation, Tanford–Roxby iterations or other methods.
The interplay of the intrinsic pKa values of a system with the electrostatic interaction energies between titratable groups can produce quite spectacular effects such as non-Henderson–Hasselbalch titration curves and even back-titration effects.
Figure: a theoretical system consisting of three acidic residues. One group displays a back-titration event.
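
The Boltzmann summation mentioned above can be sketched for a small system by enumerating every protonation microstate. This is a minimal illustration under stated assumptions, not the scheme of any particular program: the energy function, the names `pk_int` and `w`, the use of RT units, and the example numbers are all hypothetical. Each site is treated as an acid that is neutral when protonated and charged when deprotonated, so the pairwise term `w[i][j]` only applies when both sites are deprotonated.

```python
import itertools
import math

LN10 = math.log(10.0)


def titration_curves(pk_int, w, ph_values):
    """Protonation probabilities from an exact sum over all 2**N microstates.

    pk_int    : intrinsic pKa of each acidic site (assumed input)
    w         : symmetric matrix, w[i][j] = interaction energy in units of RT
                between sites i and j when both are deprotonated (assumed input)
    ph_values : pH values at which to evaluate the curves
    Returns one list of per-site protonation probabilities for each pH.
    """
    n = len(pk_int)
    states = list(itertools.product((0, 1), repeat=n))  # 1 = protonated
    curves = []
    for ph in ph_values:
        weights = []
        for x in states:
            # Cost of holding each proton at this pH, in units of RT.
            g = sum(x[i] * LN10 * (ph - pk_int[i]) for i in range(n))
            # Pairwise interaction between charged (deprotonated) sites.
            g += sum(
                w[i][j] * (1 - x[i]) * (1 - x[j])
                for i in range(n)
                for j in range(i + 1, n)
            )
            weights.append(math.exp(-g))
        z = sum(weights)  # partition function
        curves.append(
            [sum(wt * x[i] for wt, x in zip(weights, states)) / z for i in range(n)]
        )
    return curves


if __name__ == "__main__":
    # Three hypothetical acidic sites with made-up couplings.
    pk_int = [4.0, 4.5, 5.0]
    w = [[0.0, 3.0, 1.0], [3.0, 0.0, 3.0], [1.0, 3.0, 0.0]]
    ph_values = [i * 0.5 for i in range(0, 21)]
    for ph, probs in zip(ph_values, titration_curves(pk_int, w, ph_values)):
        print(ph, [round(p, 3) for p in probs])
```

The cost of exact enumeration grows as 2**N, which is why Monte Carlo sampling or mean-field approximations are used for larger numbers of titratable sites.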

pKa calculation methods

Several software packages and web servers are available for the calculation of protein pKa values; see the software section below.

Using the Poisson–Boltzmann equation

Some methods are based on solutions to the Poisson–Boltzmann equation (PBE) and are often referred to as FDPB-based methods (FDPB stands for finite-difference Poisson–Boltzmann). The PBE is a modification of Poisson's equation that incorporates a description of the effect of solvent ions on the electrostatic field around a molecule.
Several web servers and programs use the FDPB method to compute pKa values of amino acid side chains.
FDPB-based methods calculate the change in the pKa value of an amino acid side chain when that side chain is moved from a hypothetical fully solvated state to its position in the protein. To perform such a calculation, one needs theoretical methods that can calculate the effect of the protein interior on a pKa value, and knowledge of the pKa values of amino acid side chains in their fully solvated states.
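
The bookkeeping behind such a calculation can be sketched as follows. The function name `pka_in_protein` and its sign convention are assumptions made for illustration: `ddg_deprot_kcal` stands for the difference between the deprotonation free energy in the protein and in the fully solvated state, which in practice would come from an electrostatics calculation and is simply passed in here as a number.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pka_in_protein(pka_model: float, ddg_deprot_kcal: float, temperature: float = 298.15) -> float:
    """Shift a model pKa by the environment-induced change in deprotonation free energy.

    ddg_deprot_kcal > 0 means deprotonation is less favorable in the protein
    than in solution, which raises the pKa (assumed sign convention).
    """
    rt_ln10 = R_KCAL * temperature * math.log(10.0)  # about 1.36 kcal/mol at 298 K
    return pka_model + ddg_deprot_kcal / rt_ln10


if __name__ == "__main__":
    # Hypothetical example: an Asp side chain whose charged form is
    # destabilized by 2.0 kcal/mol on transfer into the protein.
    print(round(pka_in_protein(3.9, 2.0), 2))
```

At room temperature, roughly 1.36 kcal/mol of transfer free energy corresponds to a shift of one pKa unit.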

Empirical methods

A set of empirical rules relating the protein structure to the pKa values of ionizable residues has been developed. These rules form the basis of the program PROPKA, which gives rapid predictions of pKa values.
A more recent empirical pKa prediction program has also been released together with an online server.

Molecular dynamics (MD)-based methods

Molecular dynamics methods of calculating pKa values make it possible to include the full flexibility of the titrated molecule.
Molecular dynamics-based methods are typically much more computationally expensive, and not necessarily more accurate, ways to predict pKa values than approaches based on the Poisson–Boltzmann equation. Limited conformational flexibility can also be realized within a continuum electrostatics approach, e.g., by considering multiple amino acid side chain rotamers. In addition, commonly used molecular force fields do not take electronic polarizability into account, which could be an important property in determining protonation energies.

Determining pKa values from titration curves or free energy calculations

From the titration curve of a protonatable group, one can read the so-called pK½ value, which is equal to the pH value at which the group is half-protonated. The pK½ is equal to the Henderson–Hasselbalch pKa (written $pK_a^{HH}$ below) if the titration curve follows the Henderson–Hasselbalch equation. Most pKa calculation methods silently assume that all titration curves are Henderson–Hasselbalch shaped, and pKa values in pKa calculation programs are therefore often determined in this way. In the general case of multiple interacting protonatable sites, the pK½ value is not thermodynamically meaningful. In contrast, the Henderson–Hasselbalch pKa value can be computed from the protonation free energy $\Delta G^{prot}$ via
$$pK_a^{HH} = pH - \frac{\Delta G^{prot}}{RT \ln 10}$$
and is thus in turn related to the protonation free energy of the site via
$$\Delta G^{prot} = RT \ln 10 \, \left( pH - pK_a^{HH} \right)$$
The protonation free energy can in principle be computed from the protonation probability of the group, $\langle x \rangle (pH)$, which can be read from its titration curve:
$$\Delta G^{prot} = -RT \ln \frac{\langle x \rangle}{1 - \langle x \rangle}$$
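
The relation between protonation probability, protonation free energy, and the Henderson–Hasselbalch pKa can be checked numerically with a short sketch. The function names are hypothetical, and the free energy is taken as -RT ln(x / (1 - x)) for a protonation probability x, with units of kcal/mol chosen here for convenience.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def protonation_free_energy(x_mean: float, temperature: float = 298.15) -> float:
    """Protonation free energy in kcal/mol from a protonation probability."""
    if not 0.0 < x_mean < 1.0:
        raise ValueError("protonation probability must lie strictly between 0 and 1")
    return -R_KCAL * temperature * math.log(x_mean / (1.0 - x_mean))


def henderson_hasselbalch_pka(ph: float, x_mean: float, temperature: float = 298.15) -> float:
    """Henderson-Hasselbalch pKa implied by a protonation probability at a given pH."""
    rt_ln10 = R_KCAL * temperature * math.log(10.0)
    return ph - protonation_free_energy(x_mean, temperature) / rt_ln10


if __name__ == "__main__":
    # A group that is 0.1 % protonated at pH 7 (illustrative numbers only).
    print(round(henderson_hasselbalch_pka(7.0, 1.0 / 1001.0), 3))
```

The guard against probabilities of exactly 0 or 1 reflects the fact that the logarithm diverges there, which is also where estimates taken from sampled titration curves become statistically unreliable.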
Titration curves can be computed within a continuum electrostatics approach with formally exact but more elaborate analytical or Monte Carlo (MC) methods, or with inexact but fast approximate methods. MC methods that have been used to compute titration curves include Metropolis MC and Wang–Landau MC. Approximate methods that use a mean-field approach for computing titration curves are the Tanford–Roxby method and hybrids of this method that combine an exact statistical mechanics treatment within clusters of strongly interacting sites with a mean-field treatment of intercluster interactions.
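
A mean-field iteration in the spirit of the Tanford–Roxby method can be sketched as below. This is an illustrative toy model rather than a published implementation: all sites are treated as acids, `w[i][j]` is a hypothetical interaction energy in units of RT between two deprotonated sites, and the damping factor `mixing` is an added assumption to help the iteration converge. Each site sees only the average charge of the others, which shifts its effective pKa, and the probabilities are updated until they are self-consistent.

```python
import math

LN10 = math.log(10.0)


def tanford_roxby(pk_int, w, ph, tol=1e-8, max_iter=1000, mixing=0.5):
    """Self-consistent mean-field protonation probabilities at a single pH.

    pk_int : intrinsic pKa of each acidic site (assumed input)
    w      : symmetric matrix of pair interactions in units of RT (assumed input)
    mixing : fraction of the new estimate used in each damped update
    """
    n = len(pk_int)
    # Start from the non-interacting Henderson-Hasselbalch probabilities.
    x = [1.0 / (1.0 + 10.0 ** (ph - pk)) for pk in pk_int]
    for _ in range(max_iter):
        # Effective pKa: intrinsic value shifted by the mean charge of the other sites.
        pk_eff = [
            pk_int[i] + sum(w[i][j] * (1.0 - x[j]) for j in range(n) if j != i) / LN10
            for i in range(n)
        ]
        x_new = [1.0 / (1.0 + 10.0 ** (ph - pk)) for pk in pk_eff]
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = [mixing * a + (1.0 - mixing) * b for a, b in zip(x_new, x)]
        if change < tol:
            return x
    raise RuntimeError("mean-field iteration did not converge")


if __name__ == "__main__":
    # Two hypothetical, identical acids coupled by 2 RT.
    print(tanford_roxby([4.0, 4.0], [[0.0, 2.0], [2.0, 0.0]], ph=4.0))
```

Because correlations between sites are replaced by averages, the result generally differs from an exact sum over microstates when the coupling is strong, which is the trade-off for the much lower cost.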
In practice, it can be difficult to obtain statistically converged and accurate protonation free energies from titration curves if the protonation probability of the group is close to 1 or 0. In this case, one can use various free energy calculation methods to obtain the protonation free energy, such as biased Metropolis MC, free-energy perturbation, thermodynamic integration, the non-equilibrium work method or the Bennett acceptance ratio method.
Note that the Henderson–Hasselbalch pKa value does in general depend on the pH value.
This dependence is small for weakly interacting groups like well-solvated amino acid side chains on the protein surface, but can be large for strongly interacting groups like those buried in enzyme active sites or integral membrane proteins.

Software for protein pKa calculations