The exponential distribution is a continuous probability distribution widely used in engineering and science to model **time-to-event data**. Here we are interested in the amount of time until an event, rather than the number of events, as in a Poisson distribution. If the number of events occurring in a unit of time is a Poisson random variable with parameter $\lambda$, then the time between events is exponential, also with parameter $\lambda$. For example, the lifetime of a lightbulb with an expected lifetime of 2,000 hours can be modeled as $X \sim exp(\frac{1}{2000})$.

The exponential distribution is **memoryless**: the expected amount of time to the next event does not change with the amount of time that has already passed. Formally, we say $P(X > s + t \mid X > s) = P(X > t) \text{ for all } s, t \ge 0$, where $s$ is the amount of time already passed.

Keep in mind that many systems are not memoryless. For example, the time until a machine breaks down depends on how long it has already been in service, so the exponential distribution is not an appropriate model there. When a situation *is* well modeled by an exponential distribution, for example the time a customer is in service at a bank, knowing that the customer has already been in service for $s$ units of time is irrelevant to the probability that the total service time exceeds $s + t$: by the memoryless property, $P(X > s + t \mid X > s)$ is simply $P(X > t)$.
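The memoryless property follows directly from the exponential's survival function, $P(X > x) = e^{-\lambda x}$:

$\begin{align} P(X > s + t \mid X > s) &= \frac{P(X > s + t, X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)} \\ &= \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t) \end{align}$

For the 2,000-hour lightbulb, the probability of surviving another 1,000 hours is $e^{-1000/2000} \approx 0.61$ no matter how long the bulb has already burned.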
Some common examples include

- Time until birth
- Time until a light bulb fails
- Waiting time in a queue
- Length of service time
- Time between customer arrivals

## Notation

$X \sim exp(rate = \lambda)$

## Probability Density Function

$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$

## Expected Value

$E(X) = \frac{1}{\lambda}$

## Variance

$V(X) = \frac{1}{\lambda ^ 2}$

## Cumulative Distribution Function

$F(x) = 1 - e^{-\lambda x}$

## Alternative notation

When the mean time between events is supplied, rather than the rate, the pdf becomes

$f(x) = \frac{1}{\lambda}e^{-\frac{x}{\lambda}}$

For completeness, specify whether the rate or the mean is supplied as $\lambda$ and use the corresponding pdf.

## R notation

The exponential distribution is `exp` in R, parameterized by the rate.

```R
lambda <- 1 / 2000  # rate, e.g. one failure per 2,000 hours on average

# Probability density function
prob <- dexp(x = 1000, rate = lambda)

# Cumulative distribution function: P(X <= 1000)
cum_prob <- pexp(q = 1000, rate = lambda)

# Quantile function: the x such that P(X <= x) = p
quantile_val <- qexp(p = 0.5, rate = lambda)

# Random number generation: n simulated values
random_values <- rexp(n = 10, rate = lambda)
```

## Distribution of the minimum of a sample from the exponential distribution

The minimum value of a sample of size $n$ from the exponential distribution with rate $\lambda$ also has an exponential distribution, with rate $n\lambda$.

Let $Y_n = min(X_1, X_2, \dots, X_n)$ where $X_1, X_2, \dots, X_n$ is a sample from the exponential distribution with rate $\lambda$: $X \sim exp(\lambda)$. The [[cumulative density function|cdf]] for each $X_i$ is

$F(x) = P(X_i \le x) = 1 - e^{-\lambda x}$

The cdf for $Y_n$ is

$F_{Y_n}(y) = P(Y_n \le y) = P(min(X_1, X_2, \dots, X_n) \le y)$

We can use our knowledge that each individual $X_i$ follows an exponential distribution with rate $\lambda$ and reframe this probability in terms of the individual $X_i$s. Consider that there are many ways for a sample to fall such that the minimum is less than some value $y$: one, two, three, and up to $n$ values in the sample could be less than $y$, and the minimum would also be less than $y$.
However, there is only one way for the [[complement]] of this event, the minimum is *greater than* $y$, to occur: all of the $X_i$s must be greater than $y$ for the minimum to also be greater than $y$. We can state this as

$F_{Y_n}(y) = P(Y_n \le y) = 1 - P(min(X_1, X_2, \dots, X_n) > y) = 1 - P(X_1 > y, X_2 > y, \dots, X_n > y)$

Because all $X_i$s are [[independent]], this joint probability is the product of the individual probabilities, and because all $X_i$s are identically distributed, every factor in the product is the same:

$1 - P(X_1 > y, X_2 > y, \dots, X_n > y) \overset{indep}{=} 1 - \prod_{i=1}^{n} P(X_i > y) \overset{ident}{=} 1 - [P(X_1 > y)]^n$

Next, we can bring the cdf back in to solve for $F_{Y_n}$.

$\begin{align} F_{Y_n}(y) = P(Y_n \le y) &= 1 - [P(X_1 > y)]^n \\ &= 1 - [1 - F(y)]^n \\ &= 1 - [1 - (1 - e^{-\lambda y})]^n \\ &= 1 - [e^{-\lambda y}]^n \\ &= 1 - e^{-n \lambda y} \end{align}$

Taking the derivative with respect to $y$ will convert the cdf to a [[probability density function|pdf]].

$f_{Y_n}(y) = \frac{d}{dy}F_{Y_n}(y) = n \lambda e^{-n \lambda y}$

(We're using the [[chain rule]] and the [[derivative of an exponential]] to take this derivative.)

This is the pdf of an exponential distribution with rate $n\lambda$, so the minimum value of a sample from the exponential distribution is $Y_n \sim exp(n \lambda)$.

#diagram

## relationship to the gamma distribution

The exponential distribution is a special case of the [[gamma distribution]] with shape parameter 1: $X \sim exp(\lambda)$ is equivalent to $X \sim \Gamma(1, \lambda)$.

> [!Tip]+ Additional Resources
> - [Expected value of the Exponential distribution (Wrath of Math)](https://www.youtube.com/watch?v=JdMHahIvo0E)
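The minimum-of-a-sample result can be sanity-checked by simulation. A minimal sketch in Python with numpy (rather than the R used elsewhere in this note; the parameter values $\lambda = 2$, $n = 5$ are arbitrary choices for illustration): if $Y_n \sim exp(n\lambda)$, its sample mean should approach $\frac{1}{n\lambda}$ and its sample variance $\frac{1}{(n\lambda)^2}$.

```python
import numpy as np

rng = np.random.default_rng(42)
lam, n, trials = 2.0, 5, 200_000

# numpy parameterizes the exponential by the mean (scale = 1/rate), not the rate
samples = rng.exponential(scale=1 / lam, size=(trials, n))

# minimum of each sample of size n
mins = samples.min(axis=1)

print(mins.mean())  # should be close to 1 / (n * lam) = 0.1
print(mins.var())   # should be close to 1 / (n * lam)^2 = 0.01
```

The same check in R is one line: `mean(replicate(200000, min(rexp(5, rate = 2))))`.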