# Memorylessness

In probability theory, memorylessness is a property of certain probability distributions: the exponential distributions and the geometric distributions.

## Discrete memorylessness

Suppose X is a discrete random variable whose values lie in the set { 0, 1, 2, ... } or in the set { 1, 2, 3, ... }. The probability distribution of X is memoryless precisely if, for any x and y in { 0, 1, 2, ... } or in { 1, 2, 3, ... } (as the case may be), we have

$P(X>x+y \mid X>x)=P(X>y).$

It can readily be shown that the only probability distributions that enjoy this discrete memorylessness are geometric distributions. These are the distributions of the number of independent Bernoulli trials needed to get one "success", with a fixed probability p of "success" on each trial.
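One direction of this claim can be checked in a line. As a sketch, take X to be the number of trials, with values in { 1, 2, 3, ... }, assume 0 < p < 1, and write q = 1 − p (a symbol introduced here for convenience). The event X > n means that the first n trials all failed, so P(X > n) = q^n, and therefore

$P(X>x+y \mid X>x)=\frac{P(X>x+y)}{P(X>x)}=\frac{q^{x+y}}{q^{x}}=q^{y}=P(X>y).$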

## Example and motivation for the name memorylessness

For example, suppose a die is thrown as many times as it takes to get a "1", so that the probability of "success" on each trial is 1/6, and the random variable X is the number of times the die must be thrown. Then X has a geometric distribution, and the conditional probability that the die must be thrown at least four more times to get a "1", given that it has already been thrown 10 times without a "1" appearing, is no different from the original probability that the die would be thrown at least four times. In effect, the random process does not "remember" how many failures have occurred so far.
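The die example can be checked with exact arithmetic. The following is a minimal sketch, not a general-purpose routine. The helper name `survival` is hypothetical, and it assumes that "at least four more times after 10 throws" corresponds to the event X > 13 given X > 10, while "at least four times" corresponds to X > 3.

```python
from fractions import Fraction


def survival(n, p):
    """Return P(X > n): the probability that the first n trials all fail."""
    return (1 - p) ** n


p = Fraction(1, 6)  # probability of rolling a "1" on a fair die

# P(X > 13 | X > 10) = P(X > 13) / P(X > 10), since {X > 13} implies {X > 10}
conditional = survival(13, p) / survival(10, p)

# P(X > 3), i.e. at least four throws are needed
unconditional = survival(3, p)

print(conditional, unconditional)  # 125/216 125/216
```

Using `Fraction` rather than floating point makes the equality exact, so the two probabilities can be compared with `==`.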

### A frequent misunderstanding

Memorylessness is often misunderstood by students taking courses on probability: the fact that P(X > 16 | X > 12) = P(X > 4) does not mean that the events X > 16 and X > 12 are independent; i.e., it does not mean that P(X > 16 | X > 12) = P(X > 16). To summarize: "memorylessness" of the probability distribution of the number of trials X until the first success means

$\mathrm{(Right)}\ P(X>16 \mid X>12)=P(X>4).$

It does not mean

$\mathrm{(Wrong)}\ P(X>16 \mid X>12)=P(X>16).$

(That would be independence. These two events are not independent.)
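The distinction can be made concrete with a small exact computation. This sketch assumes a success probability of 1/6 (borrowed from the die example; any value strictly between 0 and 1 would do), and the helper `tail` is a hypothetical name for P(X > n).

```python
from fractions import Fraction

p = Fraction(1, 6)  # assumed success probability; any 0 < p < 1 works
q = 1 - p           # failure probability on each trial


def tail(n):
    """Return P(X > n): the first n trials are all failures."""
    return q ** n


# {X > 16} is contained in {X > 12}, so P(X > 16 and X > 12) = P(X > 16).
cond = tail(16) / tail(12)

print(cond == tail(4))                  # True:  the (Right) statement
print(cond == tail(16))                 # False: the (Wrong) statement
print(tail(16) == tail(16) * tail(12))  # False: joint probability != product
```

The last line checks independence directly: the joint probability of the two events is P(X > 16), which differs from the product P(X > 16) P(X > 12).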

## Continuous memorylessness

Suppose that, rather than considering the discrete number of trials until the first "success", we consider the continuous waiting time T until the arrival of the first phone call at a switchboard. To say that the probability distribution of T is memoryless means that for any positive real numbers s and t, we have

$P(T>t+s \mid T>t)=P(T>s).$

The only difference between this and the discrete version is that the discrete version restricts its two arguments (there called x and y) to positive (or, in some cases, nonnegative) integers, thus achieving discreteness, whereas here s and t are allowed to be real numbers that are not necessarily integers.
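The continuous condition can be illustrated by simulation. The sketch below draws exponentially distributed waiting times with Python's standard `random.expovariate` and compares the two sides of the equation empirically. The rate 0.5, the times t = 2 and s = 3, the sample size, and the function name `estimate` are all arbitrary choices made for illustration. The two estimates should agree only up to sampling noise.

```python
import random


def estimate(rate, t, s, n=200_000, seed=0):
    """Estimate P(T > t+s | T > t) and P(T > s) from n simulated waiting times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = [rng.expovariate(rate) for _ in range(n)]

    # Keep only the runs in which no call has arrived by time t.
    survivors = [x for x in samples if x > t]

    # Among those, the fraction still waiting after a further s time units.
    conditional = sum(x > t + s for x in survivors) / len(survivors)

    # Fraction of all runs in which the wait exceeds s.
    unconditional = sum(x > s for x in samples) / n
    return conditional, unconditional


conditional, unconditional = estimate(rate=0.5, t=2.0, s=3.0)
print(f"P(T > t+s | T > t) ~ {conditional:.3f}")
print(f"P(T > s)           ~ {unconditional:.3f}")
```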

It can be shown that the only probability distributions that enjoy this continuous memorylessness are the exponential distributions.
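That exponential distributions do have the property is a one-line check. Writing λ > 0 for the rate parameter (notation assumed here), an exponential distribution has P(T > t) = e^{−λt} for t ≥ 0, so

$P(T>t+s \mid T>t)=\frac{P(T>t+s)}{P(T>t)}=\frac{e^{-\lambda(t+s)}}{e^{-\lambda t}}=e^{-\lambda s}=P(T>s).$

For the converse, a sketch of the usual argument: writing G(t) = P(T > t), memorylessness says that

$G(t+s)=G(t)\,G(s),$

and the only nonincreasing solutions of this equation that take values strictly between 0 and 1 have the form G(t) = e^{−λt} for some λ > 0.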