What is the difference between stemming and lemmatisation?

In short, the difference between stemming and lemmatisation is:

  • Stemming replaces each word with its stem by stripping suffixes such as “es”, “ies”, “s”. For example, “cats” => “cat” and “computers” => “computer”. It is a heuristic approach that uses no grammar or dictionary.
  • Lemmatisation has the same purpose, but instead of applying heuristics it replaces each word with its dictionary form (its lemma). It therefore needs a database of words, such as a dictionary.
  • An advantage of stemming is that it doesn’t require a predefined dictionary of valid words.
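
To make the contrast concrete, here is a minimal sketch in plain Python. It is a toy, not the Porter stemmer or any library's API: the names `SUFFIX_RULES`, `toy_stem`, `LEMMA_TABLE` and `toy_lemmatise` are made up for illustration, and the five-entry table only stands in for a real dictionary.

```python
# Toy illustration only: not the Porter algorithm or any library's API.

# Hypothetical suffix rules, tried in order (longest suffix first).
SUFFIX_RULES = [("ies", "y"), ("es", ""), ("s", "")]


def toy_stem(word: str) -> str:
    """Strip a plural-style suffix using fixed rules, with no dictionary."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Keep at least two characters of stem so words like "is" are untouched.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: len(word) - len(suffix)] + replacement
    return word


# Tiny hand-written table standing in for a real dictionary (assumption).
LEMMA_TABLE = {
    "cats": "cat",
    "computers": "computer",
    "mice": "mouse",
    "horses": "horse",
    "ponies": "pony",
}


def toy_lemmatise(word: str) -> str:
    """Look the word up; words missing from the table are returned unchanged."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


if __name__ == "__main__":
    for w in ["cats", "computers", "ponies", "horses", "mice", "gadgets"]:
        print(f"{w}: stem={toy_stem(w)} lemma={toy_lemmatise(w)}")
```

With these toy rules, “horses” stems to “hors” (the heuristic over-strips) and “mice” is left as it is, while the lookup gives “horse” and “mouse”. In the other direction, “gadgets” is not in the table, so the lookup returns it unchanged, yet the stemmer still reduces it to “gadget” without any dictionary.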

