March 25, 2014 · machine learning

Simple guide to confusion matrix terminology

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

I wanted to create a "quick reference guide" for confusion matrix terminology because I couldn't find an existing resource that suited my requirements: compact in presentation, using numbers instead of arbitrary variables, and explained both in terms of formulas and sentences.

Let's start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):

[Figure: Example confusion matrix for a binary classifier]

What can we learn from this matrix?

Let's now define the most basic terms, which are whole numbers (not rates):
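For a binary classifier, these whole-number terms are the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Here is a minimal sketch of how they could be tallied from a list of actual labels and a list of predicted labels. The function name `confusion_counts`, the "yes"/"no" labels, and the data are all made up for illustration; they are not the numbers from the example matrix.

```python
def confusion_counts(y_true, y_pred, positive="yes"):
    """Return (tp, tn, fp, fn) for a binary problem.

    `positive` names the label treated as the positive class
    (an assumption of this sketch; any other label counts as negative).
    """
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted yes, actually yes
            else:
                fp += 1  # predicted yes, actually no
        else:
            if actual == positive:
                fn += 1  # predicted no, actually yes
            else:
                tn += 1  # predicted no, actually no
    return tp, tn, fp, fn


# Hypothetical labels, purely for illustration.
y_true = ["yes", "yes", "no", "no", "yes", "no"]
y_pred = ["yes", "no", "no", "yes", "yes", "no"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
```

Note that the four counts always add up to the number of observations, since every prediction lands in exactly one cell.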

I've added these terms to the confusion matrix, and also added the row and column totals:

[Figure: Example confusion matrix for a binary classifier, with the basic terms and the row and column totals added]
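If you want to reproduce this kind of table yourself, here is a rough sketch that prints a 2x2 confusion matrix together with its row and column totals. The counts are hypothetical, and the layout (actual classes as rows, predicted classes as columns) is an assumption of this sketch; it is a common convention, but some tools transpose it.

```python
# Hypothetical counts, chosen only to make the arithmetic easy to follow.
tn, fp, fn, tp = 40, 10, 5, 45

n = tn + fp + fn + tp          # total number of observations
actual_no = tn + fp            # row total: actually "no"
actual_yes = fn + tp           # row total: actually "yes"
predicted_no = tn + fn         # column total: predicted "no"
predicted_yes = fp + tp        # column total: predicted "yes"

corner = "n=" + str(n)
print(f"{corner:<12}{'Pred: NO':>10}{'Pred: YES':>10}{'Total':>8}")
print(f"{'Actual: NO':<12}{tn:>10}{fp:>10}{actual_no:>8}")
print(f"{'Actual: YES':<12}{fn:>10}{tp:>10}{actual_yes:>8}")
print(f"{'Total':<12}{predicted_no:>10}{predicted_yes:>10}{n:>8}")
```

The row totals and the column totals each sum to n, which is a handy sanity check when you fill in a matrix by hand.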

This is a list of rates that are often computed from a confusion matrix for a binary classifier:
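To make the formulas concrete, here is a small sketch that computes the standard rates from the four basic counts. The function name `binary_rates` and the counts passed to it are hypothetical, and the sketch assumes every denominator is nonzero (that is, there is at least one actual "yes", one actual "no", and one predicted "yes").

```python
def binary_rates(tp, tn, fp, fn):
    """Compute common rates from the four cells of a binary confusion matrix."""
    n = tp + tn + fp + fn
    actual_yes = tp + fn
    actual_no = tn + fp
    predicted_yes = tp + fp
    return {
        # Overall, how often is the classifier correct?
        "accuracy": (tp + tn) / n,
        # Overall, how often is it wrong? Equal to 1 - accuracy.
        "misclassification_rate": (fp + fn) / n,
        # When it's actually yes, how often does it predict yes?
        # Also called sensitivity or recall.
        "true_positive_rate": tp / actual_yes,
        # When it's actually no, how often does it predict yes?
        "false_positive_rate": fp / actual_no,
        # When it's actually no, how often does it predict no?
        # Also called specificity; equal to 1 - false positive rate.
        "true_negative_rate": tn / actual_no,
        # When it predicts yes, how often is it correct?
        "precision": tp / predicted_yes,
        # How often does the yes condition actually occur in the sample?
        "prevalence": actual_yes / n,
    }


# Hypothetical counts, not taken from the example matrix.
rates = binary_rates(tp=45, tn=40, fp=10, fn=5)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```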

A couple of other terms are also worth mentioning:

And finally, for those of you from the world of Bayesian statistics, here's a quick summary of these terms from Applied Predictive Modeling:

In relation to Bayesian statistics, the sensitivity and specificity are the conditional probabilities, the prevalence is the prior, and the positive/negative predicted values are the posterior probabilities.
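To see why the predicted values behave like posteriors, here is a minimal sketch that applies Bayes' theorem to get the positive and negative predictive values from the sensitivity, specificity, and prevalence. The function name `predictive_values` and the input numbers are hypothetical, chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Apply Bayes' theorem to get (PPV, NPV).

    sensitivity = P(predict yes | actual yes)
    specificity = P(predict no  | actual no)
    prevalence  = P(actual yes), the prior
    """
    # P(predict yes) = true positives + false positives, as probabilities.
    p_pred_yes = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    # P(predict no) = true negatives + false negatives, as probabilities.
    p_pred_no = specificity * (1 - prevalence) + (1 - sensitivity) * prevalence

    ppv = sensitivity * prevalence / p_pred_yes        # P(actual yes | predict yes)
    npv = specificity * (1 - prevalence) / p_pred_no   # P(actual no  | predict no)
    return ppv, npv


# Hypothetical inputs: 90% sensitivity, 80% specificity, 50% prevalence.
ppv, npv = predictive_values(sensitivity=0.9, specificity=0.8, prevalence=0.5)
print(f"PPV: {ppv:.3f}, NPV: {npv:.3f}")

# The same classifier looks much less trustworthy when the condition is rare.
rare_ppv, rare_npv = predictive_values(sensitivity=0.9, specificity=0.8, prevalence=0.01)
print(f"PPV at 1% prevalence: {rare_ppv:.3f}")
```

The second call shows why the prior matters: with the sensitivity and specificity held fixed, lowering the prevalence drags the positive predictive value down sharply.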

Want to learn more?

In my new 35-minute video, Making sense of the confusion matrix, I explain these concepts in more depth and cover more advanced topics:

Let me know if you have any questions!
