A smoothing n-gram language model addresses the main limitation of the simple [[n-gram language model]]: because of data sparsity, n-grams unseen in the training corpus would otherwise receive zero probability. Smoothing reassigns some probability mass from frequent n-grams to rare or unseen ones.
Strategies to overcome this sparsity include the following (a short sketch follows the list):
- [[discounting smoothing]]: redistribute probability mass from high-frequency to low-frequency sequences.
- [[additive smoothing]]: add a small pseudocount to every n-gram count before normalizing; a simple form of discounting.
- [[backoff smoothing]]: fall back to a lower-order n-gram model when the higher-order n-gram is unseen.
- [[interpolation smoothing]]: combine information from n-gram models of different lengths.
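
As a minimal sketch, the snippet below counts unigrams and bigrams over a toy corpus and illustrates additive (Laplace) smoothing plus linear interpolation between bigram and unigram estimates. The toy corpus, the pseudocount `alpha`, and the mixing weight `lam` are illustrative assumptions, and context counts are approximated by unigram counts for brevity.

```python
from collections import Counter

tokens = "the cat sat on the mat the cat ate".split()
vocab = set(tokens)
V = len(vocab)

# Raw counts for unigrams and bigrams.
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
total = len(tokens)

def p_additive(w, prev, alpha=1.0):
    """Additive (Laplace) smoothing: add alpha to every bigram count,
    so an unseen bigram still gets a small nonzero probability."""
    return (bigrams[(prev, w)] + alpha) / (unigrams[prev] + alpha * V)

def p_unigram(w, alpha=1.0):
    """Additively smoothed unigram probability."""
    return (unigrams[w] + alpha) / (total + alpha * V)

def p_interpolated(w, prev, lam=0.7, alpha=1.0):
    """Linear interpolation: mix the bigram estimate with the unigram
    estimate so rare contexts still lean on unigram statistics."""
    return lam * p_additive(w, prev, alpha) + (1 - lam) * p_unigram(w, alpha)

print(p_additive("ate", "mat"))      # unseen bigram, but probability > 0
print(p_interpolated("ate", "mat"))  # blends bigram and unigram evidence
```

Backoff would instead use the bigram estimate only when the bigram was observed and otherwise drop to the (discounted) unigram estimate, rather than always mixing the two.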