Which would you care more about, precision or recall, for a spam filtering problem?

To understand precision and recall, it is important to first know the four possible outcomes for each mail:

  1. False positives (FP): mail was NOT spam but WAS LABELLED as spam
  2. False negatives (FN): mail WAS spam but was NOT LABELLED as spam
  3. True positives (TP): mail WAS spam and WAS LABELLED as spam
  4. True negatives (TN): mail was NOT spam and was NOT LABELLED as spam
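  • Precision is defined as TP / (TP + FP) and recall as TP / (TP + FN).
  • Increasing precision means decreasing FP, and increasing recall means decreasing FN. Lowering one kind of error usually raises the other, which leads to the precision-recall tradeoff.
  • Ideally, users don’t want to miss important mails. A false positive sends a legitimate mail to the spam folder, where it may never be read, while a false negative only lets a spam mail into the inbox. Hence decreasing FP is the priority, and we care more about precision.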
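
To make the definitions and the tradeoff concrete, here is a minimal Python sketch. The scores, labels, thresholds, and the helper names `confusion_counts` and `precision_recall` are all made up for illustration; it assumes a filter that outputs a spam score per mail and labels a mail as spam when the score reaches a threshold.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    # Guard against division by zero when nothing is predicted or nothing is spam.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def confusion_counts(scores, labels, threshold):
    """Count (TP, FP, FN, TN) when mail with score >= threshold is labelled spam."""
    tp = fp = fn = tn = 0
    for score, is_spam in zip(scores, labels):
        predicted_spam = score >= threshold
        if predicted_spam and is_spam:
            tp += 1      # spam, labelled spam
        elif predicted_spam and not is_spam:
            fp += 1      # not spam, but labelled spam
        elif not predicted_spam and is_spam:
            fn += 1      # spam, but not labelled spam
        else:
            tn += 1      # not spam, not labelled spam
    return tp, fp, fn, tn


# Hypothetical spam scores (higher = more spam-like) and true labels (True = spam).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, False, True, True, False, False, False]

for threshold in (0.5, 0.85):
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    precision, recall = precision_recall(tp, fp, fn)
    print(f"threshold={threshold}: TP={tp} FP={fp} FN={fn} TN={tn} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.5 to 0.85 removes the single false positive (precision goes from 0.80 to 1.00) but causes two spam mails to be missed (recall drops from 1.00 to 0.50), which is the kind of trade a spam filter that favours precision would accept.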

If you’re unsure which metric to choose in general, read here about the best strategy for choosing a metric.
