Autoregressive generation is the process of generating a random sequence that reflects the distribution of some data by feeding each prediction back in as input to the next prediction. This is the basis of most [[generative AI]] applications today.
The concept is attributed to [[Claude Shannon]] and his work in the late 1940s and early 1950s on [[information theory]], especially the notion of [[entropy]].
An autoregressive model is one in which the prediction at step $t$ depends on all of the preceding points:
$x_t = f(x_{t-1}, x_{t-2}, \ldots)$
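As a minimal sketch, the linear special case (a classical AR($n$) process, where $f$ is a weighted sum of the last $n$ points plus noise) can be generated like this; the coefficients and noise scale are illustrative choices, not part of the definition:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_ar(coeffs, n_steps, noise_scale=0.1):
    """Generate a sequence where each new point is a function of the preceding ones."""
    n = len(coeffs)
    x = list(rng.normal(scale=noise_scale, size=n))  # seed the first n points
    for _ in range(n_steps):
        # the next value depends on the n most recent values
        prediction = sum(c * x[-(i + 1)] for i, c in enumerate(coeffs))
        x.append(prediction + rng.normal(scale=noise_scale))
    return np.array(x)

series = generate_ar(coeffs=[0.6, 0.3], n_steps=100)  # an AR(2) example
```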
Contrast this with an ordinary regression model, which predicts a point $y$ from separate inputs $x$ rather than from the sequence's own history.
A [[large language model]] is autoregressive because each next token it predicts is conditioned on all of the preceding tokens, both the prompt and the tokens it has already generated. GPT is an autoregressive language model.
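A minimal sketch of that generation loop, with a hypothetical `next_token_logits` standing in for the model's forward pass (a real LLM would score every vocabulary token conditioned on the sequence so far; the random scores here just show the shape of the loop):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 50  # illustrative toy vocabulary

def next_token_logits(tokens):
    """Hypothetical stand-in for a trained model's forward pass."""
    return rng.normal(size=VOCAB_SIZE)

def generate(prompt_tokens, max_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)          # condition on all tokens so far
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                        # softmax over the vocabulary
        next_tok = rng.choice(VOCAB_SIZE, p=probs)  # sample the next token
        tokens.append(int(next_tok))                # feed the prediction back in
    return tokens

print(generate([1, 2, 3], max_new_tokens=10))
```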