Precision/Recall vs FN/TN/FP/TP
1 Measuring classification results
Precision and recall are evaluation metrics used in machine learning to measure the performance of a binary classification model. The concepts of false negatives (FN), true negatives (TN), false positives (FP), and true positives (TP) are closely related to these metrics.
Classification | Abbr. | Occurs when a |
---|---|---|
True positive | TP | positive instance is correctly classified as positive. |
True negative | TN | negative instance is correctly classified as negative. |
False negative | FN | positive instance is incorrectly classified as negative. |
False positive | FP | negative instance is incorrectly classified as positive. |
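To make the four outcomes concrete, here is a minimal plain-Python sketch that counts them from a pair of label lists. The example labels are made up purely for illustration.

```python
# Hypothetical example data: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # model predictions

# Count each of the four outcome types by comparing actual and predicted labels.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # positive, predicted positive
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # negative, predicted negative
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # positive, predicted negative
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # negative, predicted positive

print(f"TP={tp} TN={tn} FN={fn} FP={fp}")  # TP=3 TN=3 FN=1 FP=1
```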
2 Precision and recall
These numbers are expressed in absolute terms. Sometimes it is more helpful to focus on relative numbers. If we are interested in how many of the instances predicted as positive are actually positive, we are interested in the precision. Precision is the ratio of true positives to the total number of instances that the model classifies as positive. It is given by:
Precision = TP / (TP + FP)
If we are interested in the model's ability to identify all positive instances, we look at the recall. Recall is the ratio of true positives to the total number of actual positive instances in the data. It is given by:
Recall = TP / (TP + FN)
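Both formulas follow directly from the four counts. A small self-contained sketch, reusing the (hypothetical) counts from the earlier example:

```python
# Counts taken from the earlier made-up example: TP=3, FP=1, FN=1.
tp, fp, fn = 3, 1, 1

# Guard against empty denominators in the degenerate case of no positive predictions/instances.
precision = tp / (tp + fp) if (tp + fp) else 0.0   # 3 / (3 + 1) = 0.75
recall    = tp / (tp + fn) if (tp + fn) else 0.0   # 3 / (3 + 1) = 0.75

print(f"precision={precision:.2f} recall={recall:.2f}")
```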
These metrics are important because in many cases, precision and recall have an inverse relationship. That is, improving one metric may come at the cost of the other. For example, a model that is overly conservative in making positive predictions may have high precision but low recall, as it is less likely to make false positive errors but may also miss many true positive instances. On the other hand, a model that is more aggressive in making positive predictions may have high recall but low precision, as it may capture more true positives but also generate more false positives.
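The tradeoff can be seen by sweeping a decision threshold over classifier scores: a high threshold is conservative, a low one aggressive. The scores and thresholds below are assumptions chosen only to illustrate the effect.

```python
# Hypothetical probability scores from some classifier, with the true labels.
y_true   = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
y_scores = [0.95, 0.80, 0.60, 0.55, 0.45, 0.40, 0.35, 0.30, 0.20, 0.10]

def precision_recall(threshold):
    """Classify as positive when score >= threshold, then compute both metrics."""
    y_pred = [1 if s >= threshold else 0 for s in y_scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

for thr in (0.9, 0.5, 0.1):   # conservative -> aggressive
    p, r = precision_recall(thr)
    print(f"threshold={thr}: precision={p:.2f} recall={r:.2f}")
# threshold=0.9: precision=1.00 recall=0.20   (conservative: high precision, low recall)
# threshold=0.1: precision=0.50 recall=1.00   (aggressive: low precision, high recall)
```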
By considering the confusion matrix with FN/TN/FP/TP, precision and recall can be calculated to evaluate the performance of a classification model. The confusion matrix shows the number of true and false predictions for each class, and it can be used to calculate metrics such as accuracy, precision, and recall.
3 Confusion matrix
A confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the correct classifications are known. Here is an example of a confusion matrix:
| | Predicted A | Predicted B |
---|---|---|
Actual A | 4 | 34 |
Actual B | 23 | 35 |
The columns represent the predicted class labels and the rows represent the actual class labels. This can be generalized to *n* labels.
In binary classification, the confusion matrix simplifies to:
| | Predicted Negative | Predicted Positive |
---|---|---|
Actual Negative | TN (true negative) | FP (false positive) |
Actual Positive | FN (false negative) | TP (true positive) |
As can be seen, the confusion matrix shows the number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) predictions made by a classification model.
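As a short sketch, scikit-learn (assuming it is installed) can build this binary confusion matrix from the same made-up labels used earlier; its row/column order matches the table above.

```python
from sklearn.metrics import confusion_matrix

# Same hypothetical labels as in the earlier example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes, ordered negative then positive.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()   # flatten in row order: TN, FP, FN, TP

print(cm)                     # [[TN FP]
                              #  [FN TP]]
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")  # TN=3 FP=1 FN=1 TP=3
```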
4 Which is better?
Precision measures the accuracy of positive predictions, while recall measures the ability to identify all positive instances. The confusion matrix provides a detailed breakdown of predictions, including true positive, true negative, false positive, and false negative counts. The choice of metric depends on the context and purpose of the analysis. Precision and recall are useful when the costs of false positives and false negatives differ, while the confusion matrix is useful when the costs are similar and when you want to pinpoint specific areas of model performance, especially on imbalanced datasets.
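As a hedged illustration of why the choice matters on imbalanced data, the plain-Python sketch below uses made-up labels for a model that almost always predicts "negative": overall accuracy looks high, while recall exposes the weakness.

```python
# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A lazy model that predicts negative for everything except one lucky positive.
y_pred = [0] * 95 + [1] + [0] * 4

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy  = (tp + tn) / len(y_true)                # 96 / 100 = 0.96, looks great
precision = tp / (tp + fp) if (tp + fp) else 0.0   # 1 / 1 = 1.0
recall    = tp / (tp + fn) if (tp + fn) else 0.0   # 1 / 5 = 0.2, reveals the weakness

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```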