Scoring functions are widely used in drug discovery and other molecular modelling applications. These include:
Virtual screening of small molecule databases of candidate ligands to identify novel small molecules that bind to a protein target of interest and therefore are useful starting points for drug discovery
De novo design of novel small molecules that bind to a protein target
Lead optimization of screening hits to optimize their affinity and selectivity
A potentially more reliable but much more computationally demanding alternative to scoring functions are free energy perturbation calculations.
Prerequisites
Scoring functions are normally parameterized against a data setconsisting of experimentally determined binding affinities between molecular species similar to the species that one wishes to predict. For currently used methods aiming to predict affinities of ligands for proteins the following must first be known or predicted:
Protein tertiary structure – arrangement of the protein atoms in three-dimensional space. Protein structures may be determined by experimental techniques such as X-ray crystallography or solution phase NMR methods or predicted by homology modelling.
Ligand active conformation – three-dimensional shape of the ligand when bound to the protein
Binding-mode – orientation of the two binding partners relative to each other in the complex
The above information yields the three-dimensional structure of the complex. Based on this structure, the scoring function can then estimate the strength of the association between the two molecules in the complex using one of the methods outlined below. Finally the scoring function itself may be used to help predict both the binding mode and the active conformation of the small molecule in the complex, or alternatively a simpler and computationally faster function may be utilised within the docking run.
Classes
There are four general classes of scoring functions:
Force field – affinities are estimated by summing the strength of intermolecularvan der Waals and electrostatic interactions between all atoms of the two molecules in the complex using a force field. The intramolecular energies of the two binding partners are also frequently included. Finally since the binding normally takes place in the presence of water, the desolvation energies of the ligand and of the protein are sometimes taken into account using implicit solvation methods such as GBSA or PBSA.
Empirical – based on counting the number of various types of interactions between the two binding partners. Counting may be based on the number of ligand and receptor atoms in contact with each other or by calculating the change in solvent accessible surface area in the complex compared to the uncomplexed ligand and protein. The coefficients of the scoring function are usually fit using multiple linear regression methods. These interactions terms of the function may include for example:
* number of rotatable bonds immobilized in complex formation.
Knowledge-based – based on statistical observations of intermolecular close contacts in large 3D databases which are used to derive "potentials of mean force". This method is founded on the assumption that close intermolecular interactions between certain types of atoms or functional groups that occur more frequently than one would expect by a random distribution are likely to be energetically favorable and therefore contribute favorably to binding affinity.
Machine-learning – Unlike these classical scoring functions, machine-learning scoring functions are characterized by not assuming a predetermined functional form for the relationship between binding affinity and the structural features describing the protein-ligand complex. In this way, the functional form is inferred directly from the data. Machine-learning scoring functions have consistently been found to outperform classical scoring functions at binding affinity prediction of diverse protein-ligand complexes. This has also been the case for target-specific complexes, although the advantage is target-dependent and mainly depends on the volume of relevant data available. When appropriate care is taken, machine-learning scoring functions perform at least as well as classical scoring functions at the related problem of structure-based virtual screening. These reviews provide a broader overview on machine-learning scoring functions for structure-based drug design
The first three types, force-field, empirical and knowledge-based, are commonly referred to as classical scoring functions and are characterized by assuming their contributions to binding are linearly combined. Due to this constraint, classical scoring functions are unable to take advantage of large amounts of training data.
Refinement
Since different scoring functions are relatively co-linear, consensus scoring functions may not improve accuracy significantly. This claim went somewhat against the prevailing view in the field, since previous studies had suggested that consensus scoring was beneficial. A perfect scoring function would be able to predict the binding free energy between the ligand and its target. But in reality both the computational methods and the computational resources put restraints to this goal. So most often methods are selected that minimize the number of false positive and false negative ligands. In cases where an experimental training set of data of binding constants and structures are available a simple method has been developed to refine the scoring function used in molecular docking.