Thomas Bayes is credited with discovering Bayes rule, but Laplace is more responsible for the development and current popularity of Bayesian statistics. Bayesian methods were historically maligned by frequentists such as Fisher and Neyman, yet they have been used successfully in many applications, for example by Alan Turing to break Enigma during WWII. Possibly the first successful commercial application was the use of naive Bayes filters for spam filtering[^1]. The field of Bayesian statistics is built upon [[Bayes rule]], specifically [[diachronic Bayes]].

## contrast with frequentist statistics

The difference between Bayesian and [[frequentist]] (classical) methods comes down to an interpretation of how [[probability]] can enter [[statistics]]. Frequentists never assign probabilities to hypotheses (the [[p-value]] is the probability of seeing such an extreme test statistic *given* the null hypothesis). The defining characteristic of Bayesian statistics is that it places probability distributions over hypotheses as well as over data.

A frequentist statistical model can be written as $(X, f(x; \theta))$, where the semicolon indicates that the set of parameters $\theta$ is fixed. A Bayesian statistical model can be written as $(X, f(x, \theta))$, where the comma indicates a joint probability distribution over the data and the parameters: each parameter is modeled as a random variable.

Bayesians can show that in the limit, with enough data, the prior does not matter (provided the prior does not assign zero probability to any possible value). The two sketches after the comic below make both points concrete.

> [!NOTE]
> Any prior, no matter how wrong, will be overcome given sufficient data, provided it does not assign zero probability to any possible outcome. This is another way of saying "never say never": Bayes rule suggests we always keep an open mind!

![bayesian](https://imgs.xkcd.com/comics/frequentists_vs_bayesians.png)

[^1]: https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Spam_filtering
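
As a concrete illustration of the parameter-as-random-variable point, here is a minimal sketch in Python (NumPy and SciPy assumed available). The coin-flip setup and all numbers are hypothetical choices for illustration, not from the source. Because $\theta$ is treated as a random variable and the Beta prior is conjugate to the binomial likelihood, Bayes rule yields the posterior over $\theta$ in closed form.

```python
import numpy as np
from scipy import stats

# Hypothetical coin-flip model: x | theta ~ Binomial(n, theta), and the
# bias theta is itself a random variable with a Beta prior, so the model
# is a joint distribution f(x, theta) over data and parameter.
rng = np.random.default_rng(0)
n, true_theta = 100, 0.7          # assumed ground truth, used only to simulate data
heads = rng.binomial(n, true_theta)

# Beta is conjugate to the binomial likelihood, so Bayes rule gives the
# posterior in closed form: Beta(a + heads, b + tails).
a, b = 1, 1                       # flat Beta(1, 1) prior over theta
posterior = stats.beta(a + heads, b + (n - heads))

lo, hi = posterior.interval(0.95)
print(f"posterior mean of theta:         {posterior.mean():.3f}")
print(f"95% credible interval for theta: ({lo:.3f}, {hi:.3f})")
```

Note that the output is a full distribution over the hypothesis $\theta$, not a point estimate with a p-value, which is exactly the contrast with the frequentist model above.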
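
A second sketch, under the same hypothetical coin, illustrates the "never say never" point: two sharply different priors converge to essentially the same posterior once the data swamp them, because neither assigns zero probability to any value of $\theta$.

```python
import numpy as np
from scipy import stats

# Compare two very different priors as the sample grows. Both priors put
# nonzero density on every theta in (0, 1), so neither one "says never",
# and both are eventually overwhelmed by the data.
rng = np.random.default_rng(0)
true_theta = 0.7

for n in (10, 10_000):            # a small sample vs. a large one
    heads = rng.binomial(n, true_theta)
    for a, b, label in [(1, 1, "flat Beta(1,1)"), (50, 2, "biased Beta(50,2)")]:
        posterior = stats.beta(a + heads, b + (n - heads))
        print(f"n={n:>6}  prior={label:<18}  posterior mean={posterior.mean():.3f}")
```

At $n = 10$ the biased prior pulls the posterior mean well away from the flat-prior answer; at $n = 10{,}000$ the two posteriors nearly coincide at the true value.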