What is a language model? Why do you need a language model?

A language model is a probability distribution over sequences of words given by

    \[p(w) \,=\, p(w_{1}, w_{2}, \dots, w_{k})\]

It enables us to measure the relative likelihood of different phrases. Measuring the likelihood of a sequence of words is useful in many NLP tasks such as speech recognition, machine translation, POS tagging, parsing, and so on.
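By the chain rule of probability, this joint distribution factorizes into a product of conditional word probabilities, which is what n-gram and neural language models actually estimate:

    \[p(w_{1}, w_{2}, \dots, w_{k}) \,=\, \prod_{i=1}^{k} p(w_{i} \mid w_{1}, \dots, w_{i-1})\]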

Example: In any generative model where a target sequence is generated from a source sequence, for instance in machine translation:

    \[\text{target\_seq}^{*} \,=\, \arg\max_{\text{target\_seq}} \, p(\text{target\_seq}, \text{source\_seq})\]

    \[\phantom{\text{target\_seq}^{*}} \,=\, \arg\max_{\text{target\_seq}} \, p(\text{target\_seq}) \, p(\text{source\_seq} \mid \text{target\_seq})\]

p(target_seq) is given by the language model, while p(source_seq | target_seq) depends on the specific statistical model used for machine translation.
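The argmax above can be sketched with toy numbers. The candidate translations and both probability tables below are invented for illustration; a real system would estimate the language model from monolingual text and the translation model from parallel corpora.

```python
# Hypothetical language-model probabilities p(target_seq)
lm = {
    "the cat sat": 0.6,
    "cat the sat": 0.1,
}

# Hypothetical translation-model probabilities p(source_seq | target_seq)
tm = {
    ("le chat est assis", "the cat sat"): 0.5,
    ("le chat est assis", "cat the sat"): 0.4,
}

def decode(source_seq, candidates):
    """Pick the target sequence maximizing p(target_seq) * p(source_seq | target_seq)."""
    return max(candidates, key=lambda t: lm[t] * tm[(source_seq, t)])

best = decode("le chat est assis", ["the cat sat", "cat the sat"])
print(best)  # "the cat sat": 0.6 * 0.5 = 0.30 beats 0.1 * 0.4 = 0.04
```

Note how the fluent candidate wins even though its translation-model score is only slightly higher: the language model is what penalizes the ungrammatical word order.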

Another example: In speech recognition, the task is to convert a sequence of sounds into a sequence of words. The language model makes it possible to distinguish between candidate phrases that sound alike, based on the relative likelihood of each phrase occurring. For example, "I am eating an ice cream" is more likely than "I am eating and I scream", even though the two sound similar. The language model assigns these sequences different probabilities, and the one with the highest probability is chosen.

Common language models include n-gram models, where each word depends on the previous n − 1 words (unigram or bag of words, bigram, trigram), HMM-based models, and neural language models.
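A minimal bigram model makes the speech-recognition example concrete. The tiny corpus below is invented for illustration, and the maximum-likelihood estimates use no smoothing, so any unseen bigram zeroes out the whole sequence; real systems would smooth instead.

```python
from collections import Counter

# Toy training corpus (invented for illustration)
corpus = [
    "i am eating an ice cream",
    "i am eating an apple",
    "she is eating an ice cream",
    "they scream loudly",
]

unigrams = Counter()
bigrams = Counter()
for sent in corpus:
    words = ["<s>"] + sent.split()  # <s> marks the sentence start
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def bigram_prob(sentence):
    """p(w1..wk) ~ product of MLE estimates p(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1})."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        if bigrams[(prev, cur)] == 0:
            return 0.0  # unseen bigram; a real model would apply smoothing
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

print(bigram_prob("i am eating an ice cream") >
      bigram_prob("i am eating and i scream"))  # True
```

Under this model the acoustically similar transcription "i am eating and i scream" contains the unseen bigram ("eating", "and") and scores zero, so the recognizer would prefer the ice-cream reading.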
