The cumulative accuracy profile (CAP) is used in data science to visualize the discriminative power of a model. The CAP of a model plots the cumulative number of positive outcomes along the y-axis against the corresponding cumulative number of a classifying parameter along the x-axis. The CAP is distinct from the receiver operating characteristic (ROC), which plots the true-positive rate against the false-positive rate.
Example
An example is a model that predicts whether a product is bought by each individual from a group of people, based on factors such as their gender, age, and income. If group members were contacted at random, the cumulative number of products sold would rise linearly toward a maximum value corresponding to the total number of buyers within the group. This distribution is called the "random" CAP. A perfect prediction, on the other hand, determines exactly which group members will buy the product, so that the maximum number of products sold is reached with the minimum number of calls. This produces a steep line on the CAP curve that stays flat once the maximum is reached, which is the "perfect" CAP. A successful model predicts the likelihood of each individual purchasing the product and ranks these probabilities to produce a list of potential customers to be contacted first. The resulting cumulative number of products sold increases rapidly and eventually flattens out at the given maximum as more group members are contacted. This results in a distribution that lies between the random and the perfect CAP curves.
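The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `cap_curve`, the ten-person group, and the model scores are all hypothetical, and both axes are normalized to fractions of their maximum for convenience.

```python
def cap_curve(scores, outcomes):
    """Return CAP points as (fraction contacted, fraction of buyers reached)."""
    # Rank individuals from highest to lowest predicted score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    positives = sum(outcomes)
    xs, ys = [0.0], [0.0]
    found = 0
    for i, (_, outcome) in enumerate(ranked, start=1):
        found += outcome              # cumulative number of buyers reached so far
        xs.append(i / total)          # share of the group contacted
        ys.append(found / positives)  # share of all buyers reached
    return xs, ys


# Hypothetical group of 10 people: 1 = bought the product, 0 = did not.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
# Hypothetical purchase likelihoods produced by some model (assumed values).
model_scores = [0.9, 0.2, 0.8, 0.6, 0.1, 0.7, 0.3, 0.4, 0.35, 0.05]

model_x, model_y = cap_curve(model_scores, outcomes)
# Perfect CAP: rank by the true outcome, so every buyer is contacted first.
perfect_x, perfect_y = cap_curve(outcomes, outcomes)
# Random CAP: on average, contacting a share x of the group reaches a share x of buyers.
random_y = list(model_x)

for x, r, m, p in zip(model_x, random_y, model_y, perfect_y):
    print(f"contacted={x:.1f}  random={r:.2f}  model={m:.2f}  perfect={p:.2f}")
```

With these assumed inputs the model curve lies on or between the random and perfect curves at every point, which is the situation the example describes.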
Analyzing a CAP
The CAP can be used to evaluate a model by comparing its curve to the perfect CAP, in which the maximum number of positive outcomes is achieved directly, and to the random CAP, in which the positive outcomes are distributed equally. A good model will have a CAP between the perfect CAP and the random CAP, with a better model tending toward the perfect CAP. The accuracy ratio (AR) is defined as the ratio of the area between the model CAP and the random CAP to the area between the perfect CAP and the random CAP. For a successful model the AR has values between zero and one, with a higher value for a stronger model. Another indication of model strength is given by the cumulative number of positive outcomes at 50% of the classifying parameter. For a successful model this value should lie between 50% and 100% of the maximum, with a higher percentage for stronger models. In very rare cases the accuracy ratio can be negative. In this case, the model is performing worse than the random CAP.
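As a rough illustration of these two measures, the sketch below computes the accuracy ratio and the share of positive outcomes reached at 50% of the group. The function names, the sample data, and the choice of the trapezoid rule on axes normalized to the range 0 to 1 are assumptions made for this example; other conventions for the area calculation are possible.

```python
def ranked_outcomes(scores, outcomes):
    """Outcomes ordered from the highest to the lowest model score."""
    pairs = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    return [outcome for _, outcome in pairs]


def accuracy_ratio(scores, outcomes):
    """AR = (area between model and random CAP) / (area between perfect and random CAP)."""
    n = len(outcomes)
    positives = sum(outcomes)
    # Area under the model CAP by the trapezoid rule, both axes normalized to [0, 1].
    twice_area = 0
    found = 0
    for outcome in ranked_outcomes(scores, outcomes):
        previous = found
        found += outcome
        twice_area += previous + found
    area_model = twice_area / (2 * positives * n)
    # The random CAP is the diagonal, so the area under it is 1/2.
    area_random = 0.5
    # The perfect CAP rises linearly to 1 at x = positives / n and then stays flat.
    area_perfect = 1.0 - positives / (2 * n)
    return (area_model - area_random) / (area_perfect - area_random)


def share_at_half(scores, outcomes):
    """Fraction of all positive outcomes found in the top-ranked 50% of the group."""
    ranked = ranked_outcomes(scores, outcomes)
    return sum(ranked[: len(ranked) // 2]) / sum(outcomes)


# Hypothetical data: 1 = positive outcome, scores are assumed model outputs.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.6, 0.1, 0.7, 0.3, 0.4, 0.35, 0.05]

print(f"accuracy ratio: {accuracy_ratio(scores, outcomes):.3f}")
print(f"positives reached at 50%: {share_at_half(scores, outcomes):.0%}")

# A model that ranks every positive last performs worse than random: negative AR.
inverted = [-o for o in outcomes]
print(f"inverted ranking AR: {accuracy_ratio(inverted, outcomes):.3f}")
```

For these assumed inputs the accuracy ratio is about 0.83 and 75% of the positive outcomes are reached at the halfway point, consistent with a model lying between the random and perfect CAP. The deliberately inverted ranking gives an accuracy ratio of about -1, illustrating the negative case.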
Applications
The CAP and the ROC are both commonly used by banks and regulators to analyze the discriminatory ability of rating systems that evaluate the credit risks