Bernoulli Distribution
| Quantity | Expression |
|---|---|
| Parameters | ${\displaystyle 0<p<1}$ |
| Support | ${\displaystyle k\in \{0,1\}\,}$ |
| pmf | ${\displaystyle {\begin{cases}q=(1-p)&{\text{for }}k=0\\p&{\text{for }}k=1\end{cases}}}$ |
| CDF | ${\displaystyle {\begin{cases}0&{\text{for }}k<0\\1-p&{\text{for }}0\leq k<1\\1&{\text{for }}k\geq 1\end{cases}}}$ |
| Mean | ${\displaystyle p\,}$ |
| Median | ${\displaystyle {\begin{cases}0&{\text{if }}q>p\\0.5&{\text{if }}q=p\\1&{\text{if }}q<p\end{cases}}}$ |
| Mode | ${\displaystyle {\begin{cases}0&{\text{if }}q>p\\0,1&{\text{if }}q=p\\1&{\text{if }}q<p\end{cases}}}$ |
| Variance | ${\displaystyle p(1-p)(=pq)\,}$ |
| Skewness | ${\displaystyle {\frac {1-2p}{\sqrt {pq}}}}$ |
| Excess kurtosis | ${\displaystyle {\frac {1-6pq}{pq}}}$ |
| Entropy | ${\displaystyle -q\ln(q)-p\ln(p)\,}$ |
| MGF | ${\displaystyle q+pe^{t}\,}$ |
| CF | ${\displaystyle q+pe^{it}\,}$ |
| PGF | ${\displaystyle q+pz\,}$ |
| Fisher information | ${\displaystyle {\frac {1}{p(1-p)}}}$ |

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli,[1] is the probability distribution of a random variable which takes the value 1 with probability ${\displaystyle p}$ and the value 0 with probability ${\displaystyle q=1-p}$. Equivalently, it is the probability distribution of any single experiment that asks a yes-no question: the outcome is boolean-valued, a single bit whose value is success/yes/true/one with probability ${\displaystyle p}$ and failure/no/false/zero with probability ${\displaystyle q}$. It can be used to represent a coin toss, where 1 and 0 represent "heads" and "tails" respectively (or vice versa). A fair coin has ${\displaystyle p=0.5}$, whereas an unfair coin has ${\displaystyle p\neq 0.5}$.

The Bernoulli distribution is a special case of the binomial distribution in which a single trial is conducted (${\displaystyle n=1}$). It is also a special case of the two-point distribution, for which the two possible outcomes need not be 0 and 1.

## Properties of the Bernoulli Distribution

If ${\displaystyle X}$ is a random variable with this distribution, we have:

${\displaystyle \Pr(X=1)=p=1-\Pr(X=0)=1-q.}$

The probability mass function ${\displaystyle f}$ of this distribution, over possible outcomes k, is

${\displaystyle f(k;p)={\begin{cases}p&{\text{if }}k=1,\\[6pt]1-p&{\text{if }}k=0.\end{cases}}}$

This can also be expressed as

${\displaystyle f(k;p)=p^{k}(1-p)^{1-k}\!\quad {\text{for }}k\in \{0,1\}}$

or as

${\displaystyle f(k;p)=pk+(1-p)(1-k)\!\quad {\text{for }}k\in \{0,1\}.}$
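The three expressions above agree on the support ${\displaystyle \{0,1\}}$. As a minimal sketch, the check below evaluates each form with exact rational arithmetic; the function names and the example value of `p` are illustrative choices, not part of the original text.

```python
from fractions import Fraction


def pmf_cases(k, p):
    """Piecewise form: p if k == 1, 1 - p if k == 0."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    raise ValueError("k must be 0 or 1")


def pmf_power(k, p):
    """Product form: p**k * (1 - p)**(1 - k)."""
    return p**k * (1 - p) ** (1 - k)


def pmf_linear(k, p):
    """Linear form: p*k + (1 - p)*(1 - k)."""
    return p * k + (1 - p) * (1 - k)


p = Fraction(3, 10)  # arbitrary example value (assumption)
for k in (0, 1):
    # All three forms give the same probability for each outcome.
    assert pmf_cases(k, p) == pmf_power(k, p) == pmf_linear(k, p)
```

The product form is usually the convenient one for likelihood calculations, because it turns into a sum after taking logarithms.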

The Bernoulli distribution is a special case of the binomial distribution with ${\displaystyle n=1}$.[2]

The excess kurtosis goes to infinity as ${\displaystyle p}$ approaches 0 or 1. For ${\displaystyle p=1/2}$, the two-point distributions, including the Bernoulli distribution, have a lower excess kurtosis than any other probability distribution, namely ${\displaystyle -2}$.
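Both statements can be checked against the standard expression for the excess kurtosis of a Bernoulli variable, ${\displaystyle (1-6pq)/pq}$. At ${\displaystyle p=q=1/2}$ we have ${\displaystyle pq=1/4}$, so

${\displaystyle {\frac {1-6pq}{pq}}={\frac {1-{\tfrac {6}{4}}}{\tfrac {1}{4}}}=-{\frac {1}{2}}\cdot 4=-2,}$

while for ${\displaystyle p\to 0}$ or ${\displaystyle p\to 1}$ the product ${\displaystyle pq}$ tends to 0, the numerator tends to 1, and the ratio grows without bound.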

The Bernoulli distributions for ${\displaystyle 0\leq p\leq 1}$ form an exponential family.
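One common way to see the exponential-family structure, sketched here under the assumption ${\displaystyle 0<p<1}$ so that the logarithms are defined, is to rewrite the probability mass function as

${\displaystyle f(k;p)=p^{k}(1-p)^{1-k}=\exp \left(k\ln {\frac {p}{1-p}}+\ln(1-p)\right)\quad {\text{for }}k\in \{0,1\}.}$

In this form the sufficient statistic is ${\displaystyle k}$ and the natural parameter is the log-odds ${\displaystyle \eta =\ln {\frac {p}{1-p}}}$. The symbol ${\displaystyle \eta }$ is notation introduced here for illustration and does not appear elsewhere in this article.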

The maximum likelihood estimator of ${\displaystyle p}$ based on a random sample is the sample mean.
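A small simulation can illustrate this. The sketch below draws samples with Python's standard `random` module, computes the sample mean, and checks that no candidate on a coarse grid achieves a higher log-likelihood. The helper names, the seed, the sample size, and the true value 0.3 are all illustrative assumptions.

```python
import math
import random


def sample_bernoulli(p, n, seed=0):
    """Draw n Bernoulli(p) samples as 0/1 integers (seeded for repeatability)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def log_likelihood(p, samples):
    """Log-likelihood of p for 0/1 samples; requires 0 < p < 1."""
    successes = sum(samples)
    failures = len(samples) - successes
    return successes * math.log(p) + failures * math.log(1 - p)


def mle_p(samples):
    """Maximum likelihood estimate of p: the sample mean."""
    return sum(samples) / len(samples)


samples = sample_bernoulli(p=0.3, n=10_000)
p_hat = mle_p(samples)

# Candidate values 0.01, 0.02, ..., 0.99; none should beat the sample mean.
grid = [i / 100 for i in range(1, 100)]
best_on_grid = max(log_likelihood(c, samples) for c in grid)
print(p_hat, log_likelihood(p_hat, samples) >= best_on_grid)
```

The grid comparison is only a sanity check, not a proof. The exact result follows from setting the derivative of the log-likelihood with respect to ${\displaystyle p}$ to zero.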

## Mean

The expected value of a Bernoulli random variable ${\displaystyle X}$ is

${\displaystyle \operatorname {E} \left(X\right)=p}$

This follows because, for a Bernoulli distributed random variable ${\displaystyle X}$ with ${\displaystyle \Pr(X=1)=p}$ and ${\displaystyle \Pr(X=0)=q}$, we find

${\displaystyle \operatorname {E} [X]=\Pr(X=1)\cdot 1+\Pr(X=0)\cdot 0=p\cdot 1+q\cdot 0=p}$

## Variance

The variance of a Bernoulli distributed ${\displaystyle X}$ is

${\displaystyle \operatorname {Var} [X]=pq=p(1-p)}$

We first find

${\displaystyle \operatorname {E} [X^{2}]=\Pr(X=1)\cdot 1^{2}+\Pr(X=0)\cdot 0^{2}=p\cdot 1^{2}+q\cdot 0^{2}=p}$

From this follows

${\displaystyle \operatorname {Var} [X]=\operatorname {E} [X^{2}]-\operatorname {E} [X]^{2}=p-p^{2}=p(1-p)=pq}$
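The mean and variance derivations can be reproduced numerically by summing over the two-point support. The following is a minimal sketch using exact fractions; the helper `moment` and the example value of `p` are hypothetical names and values chosen for illustration.

```python
from fractions import Fraction


def moment(p, order):
    """E[X**order] for X ~ Bernoulli(p), summed over the support {0, 1}."""
    pmf = {0: 1 - p, 1: p}
    return sum(prob * k**order for k, prob in pmf.items())


p = Fraction(1, 4)  # arbitrary example value (assumption)
mean = moment(p, 1)                  # E[X]
variance = moment(p, 2) - mean**2    # E[X^2] - E[X]^2

assert mean == p
assert variance == p * (1 - p)
```

Because ${\displaystyle 0^{m}=0}$ and ${\displaystyle 1^{m}=1}$ for every positive integer ${\displaystyle m}$, every raw moment of positive order equals ${\displaystyle p}$, which is why the first and second moments coincide.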

## Skewness

The skewness is ${\displaystyle {\frac {q-p}{\sqrt {pq}}}={\frac {1-2p}{\sqrt {pq}}}}$. When we take the standardized Bernoulli distributed random variable ${\displaystyle {\frac {X-\operatorname {E} [X]}{\sqrt {\operatorname {Var} [X]}}}}$ we find that this random variable attains ${\displaystyle {\frac {q}{\sqrt {pq}}}}$ with probability ${\displaystyle p}$ and attains ${\displaystyle -{\frac {p}{\sqrt {pq}}}}$ with probability ${\displaystyle q}$. Thus we get

${\displaystyle {\begin{aligned}\gamma _{1}&=\operatorname {E} \left[\left({\frac {X-\operatorname {E} [X]}{\sqrt {\operatorname {Var} [X]}}}\right)^{3}\right]\\&=p\cdot \left({\frac {q}{\sqrt {pq}}}\right)^{3}+q\cdot \left(-{\frac {p}{\sqrt {pq}}}\right)^{3}\\&={\frac {1}{{\sqrt {pq}}^{3}}}\left(pq^{3}-qp^{3}\right)\\&={\frac {pq}{{\sqrt {pq}}^{3}}}(q-p)\\&={\frac {q-p}{\sqrt {pq}}}\end{aligned}}}$
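As a quick numerical check of this derivation, the sketch below computes the third moment of the standardized variable directly and compares it with the closed form. The function names and the test values of `p` are illustrative assumptions, and the comparison uses a floating-point tolerance.

```python
import math


def skewness_by_definition(p):
    """Third moment of the standardized Bernoulli(p) variable, 0 < p < 1."""
    q = 1 - p
    std = math.sqrt(p * q)
    z_one = (1 - p) / std   # standardized value when X = 1, equal to q / sqrt(pq)
    z_zero = (0 - p) / std  # standardized value when X = 0, equal to -p / sqrt(pq)
    return p * z_one**3 + q * z_zero**3


def skewness_closed_form(p):
    """Closed form (q - p) / sqrt(pq)."""
    q = 1 - p
    return (q - p) / math.sqrt(p * q)


for p in (0.1, 0.5, 0.8):
    assert math.isclose(
        skewness_by_definition(p),
        skewness_closed_form(p),
        rel_tol=1e-9,
        abs_tol=1e-12,
    )
```

The skewness is positive for ${\displaystyle p<1/2}$, zero at ${\displaystyle p=1/2}$, and negative for ${\displaystyle p>1/2}$.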

## Related distributions

• If ${\displaystyle X_{1},\dots ,X_{n}}$ are independent, identically distributed (i.i.d.) random variables, all Bernoulli distributed with success probability p, then
${\displaystyle Y=\sum _{k=1}^{n}X_{k}\sim \mathrm {B} (n,p)}$ (binomial distribution).

The Bernoulli distribution is simply ${\displaystyle \mathrm {B} (1,p)}$.
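The relationship to the binomial distribution can be verified exactly for small ${\displaystyle n}$ by convolving the Bernoulli probability mass function with itself. The sketch below does this with rational arithmetic; all function names and the example values of `n` and `p` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def convolve(a, b):
    """PMF of the sum of two independent variables whose PMFs are lists indexed by value."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, prob_a in enumerate(a):
        for j, prob_b in enumerate(b):
            out[i + j] += prob_a * prob_b
    return out


def sum_of_bernoullis_pmf(n, p):
    """PMF of X_1 + ... + X_n for i.i.d. Bernoulli(p) variables."""
    bernoulli = [1 - p, p]
    pmf = [Fraction(1)]  # PMF of the empty sum, which is always 0
    for _ in range(n):
        pmf = convolve(pmf, bernoulli)
    return pmf


def binomial_pmf(n, p):
    """PMF of B(n, p): C(n, k) * p**k * (1 - p)**(n - k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


n, p = 5, Fraction(1, 3)  # arbitrary example values (assumption)
assert sum_of_bernoullis_pmf(n, p) == binomial_pmf(n, p)
```

With `n = 1` the convolution loop runs once and returns the Bernoulli probability mass function itself, matching ${\displaystyle \mathrm {B} (1,p)}$.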

## Notes

1. ^ James Victor Uspensky: Introduction to Mathematical Probability, McGraw-Hill, New York 1937, page 45
2. ^ McCullagh and Nelder (1989), Section 4.2.2.
