Proof for the calculation of mean in negative binomial distribution
I am trying to derive the mean of the negative binomial distribution but have run into a mistake. I know there are other posts on deriving the mean, but I am attempting to derive it in my own way. I wonder if any of you can point out where my mistake is:
In the negative binomial distribution, the probability is: $$ p(X=x) = \frac{(x-1)!}{(r-1)!(x-r)!}p^r(1-p)^{x-r}, $$ where $X$ is a random variable for the number of trials required, $x$ is the number of trials, $p$ is the probability of success, and $r$ is the number of successes obtained by the $x$th trial. Therefore, to calculate the expectation:
$$ E(X) = \sum_{x=r}^{\infty}x\,p(x)=\sum_{x=r}^{\infty}x\frac{(x-1)!}{(r-1)!(x-r)!}p^r(1-p)^{x-r}=\sum_{x=r}^{\infty}\frac{x!}{(r-1)!(x-r)!}p^r(1-p)^{x-r} $$
Let $k=x-r$; then the formula becomes: $$ E(X)=\sum_{k=0}^{\infty}\frac{(k+r)!}{(r-1)!k!}p^r(1-p)^k= r\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k $$
By the binomial theorem, $\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$ becomes $[p+(1-p)]^{k+r} = 1$, and thus $E(X) = r$, which is obviously wrong.
I cannot figure out what is wrong with my proof, so any help would be appreciated. For reference, someone else has done a similar proof here, but I still have trouble understanding the mistake(s) in my own:
Deriving Mean for Negative Binomial Distribution.
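For what it's worth, a quick numerical sanity check in Python (truncating the infinite sum; the helper names `negbin_pmf` and `negbin_mean` and the truncation length are my own choices) suggests the answer should be $r/p$, not $r$:

```python
from math import comb

def negbin_pmf(x, r, p):
    # P(X = x) = C(x-1, r-1) * p^r * (1-p)^(x-r), for x = r, r+1, ...
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

def negbin_mean(r, p, terms=10000):
    # truncate the infinite sum; 10000 terms is far more than enough here
    return sum(x * negbin_pmf(x, r, p) for x in range(r, r + terms))

print(negbin_mean(3, 0.4))  # ≈ 7.5, i.e. r/p, not r = 3
```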
Here is a purely algebraic approach. We begin by first showing that the PMF for a negative binomial distribution does in fact sum to $1$ over its support. Suppose $X \sim \operatorname{NegBinomial}(r,p)$, with PMF $$\Pr[X = x] = \binom{x-1}{r-1} p^r (1-p)^{x-r}, \quad x = r, r+1, r+2, \ldots.$$ This is the parametrization you chose. Consider the function $$f_m(z) = \sum_{k=0}^\infty \binom{k+m}{m} z^k.$$ We recall the identity $$\binom{k+m}{m} = \binom{k+m-1}{m-1} + \binom{k+m-1}{m},$$ from which it follows that $$\begin{align*} f_m(z) &= \sum_{k=0}^\infty \binom{k+m-1}{m-1}z^k + \binom{k-1+m}{m} z^k \\ &= f_{m-1}(z) + z \sum_{k=1}^\infty \binom{k-1+m}{m} z^{k-1} \\ &= f_{m-1}(z) + z f_m(z). \end{align*}$$ Consequently, $$f_m(z) = \frac{f_{m-1}(z)}{1-z}.$$ But because $$f_0(z) = \sum_{k=0}^\infty \binom{k}{0} z^k = \frac{1}{1-z},$$ it immediately follows that $$f_m(z) = (1-z)^{-(m+1)}.$$

Now letting $m = r-1$, $z = 1-p$, and $k = x-r$, we obtain $$\sum_{x=r}^\infty \Pr[X = x] = p^r (1 - (1-p))^{-(r-1+1)} = p^r p^{-r} = 1, \quad 0 < p < 1.$$ This proves that $\Pr[X = x]$ does define a valid PMF.
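Both facts, $f_m(z) = (1-z)^{-(m+1)}$ and the normalization of the PMF, are easy to confirm numerically. A rough Python sketch (the truncation lengths and the parameter values are arbitrary choices of mine):

```python
from math import comb

def f(m, z, terms=2000):
    # partial sum of f_m(z) = sum_{k>=0} C(k+m, m) z^k, valid for |z| < 1
    return sum(comb(k + m, m) * z**k for k in range(terms))

# f_m(z) should agree with the closed form (1-z)^(-(m+1))
print(f(4, 0.3), (1 - 0.3) ** -5)

# and the PMF should sum to 1 over its support x = r, r+1, ...
r, p = 3, 0.4
total = sum(comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
            for x in range(r, r + 2000))
print(total)  # ≈ 1.0
```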
Next, we use this property to calculate $\operatorname{E}[X]$. By definition, $$\operatorname{E}[X] = \sum_{x=r}^\infty x \Pr[X = x].$$ But since $$x \binom{x-1}{r-1} = \frac{x!}{(r-1)!(x-r)!} = r \frac{x!}{r! (x-r)!} = r \binom{x}{r},$$ we find $$\operatorname{E}[X] = \sum_{x=r}^\infty r \binom{x}{r} p^r (1-p)^{x-r} = \frac{r}{p} \sum_{x=r+1}^\infty \binom{x-1}{(r+1)-1} p^{r+1} (1-p)^{x-(r+1)},$$ where we obtained this last expression by incrementing the lower index of summation by $1$, and decrementing the index in the summand by $1$. But you will notice that we have also rewritten the summand so that it is now apparent that it is the sum of the PMF of a negative binomial distribution with parameters $r+1$ and $p$. Thus this sum equals $1$, and we conclude $\operatorname{E}[X] = r/p$.
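As a sanity check, both claims above, the value of the mean and the reindexed sum equalling $1$, can be verified numerically. A Python sketch (parameter values and truncation length are arbitrary):

```python
from math import comb

def pmf(x, r, p):
    # P(X = x) for NegBinomial(r, p) in the trial-count parametrization
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p = 5, 0.3
mean = sum(x * pmf(x, r, p) for x in range(r, r + 5000))
# the reindexed sum is the PMF of NegBinomial(r+1, p), so it totals 1
shifted = sum(pmf(x, r + 1, p) for x in range(r + 1, r + 5000))
print(mean, r / p, shifted)  # mean ≈ r/p ≈ 16.667, shifted ≈ 1.0
```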
It is worth noting that for this purely algebraic approach, we have spent most of our effort to show that this parametrization is a valid PMF. The calculation of the expectation is quite straightforward by contrast. Also, if the variance is desired, it is best to consider $\operatorname{E}[X(X-1)],$ rather than $\operatorname{E}[X^2]$, since the former expression more readily yields to the same type of binomial coefficient manipulation that we used for $\operatorname{E}[X]$. I leave this computation as an exercise for the reader.
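That exercise can also be checked numerically against the standard closed form $\operatorname{Var}[X] = r(1-p)/p^2$ for this trial-count parametrization (a Python sketch, not the algebra itself; parameters are arbitrary):

```python
from math import comb

def pmf(x, r, p):
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p = 2, 0.5
ex  = sum(x * pmf(x, r, p) for x in range(r, r + 3000))
exx = sum(x * (x - 1) * pmf(x, r, p) for x in range(r, r + 3000))
# Var[X] = E[X(X-1)] + E[X] - E[X]^2
var = exx + ex - ex**2
print(var, r * (1 - p) / p**2)  # both 4.0 for these parameters
```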
A final word: perhaps the most elegant computation is to exploit the fact that the negative binomial distribution is the distribution of a sum of IID geometric random variables (and in that sense a generalization of the geometric distribution). But the purpose of this answer is to show how the computation can be done purely as an algebraic manipulation with very few prerequisites.
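For completeness, that route is easy to confirm numerically: a single Geometric($p$) trial count has mean $\sum_{k\ge 1} k\,p(1-p)^{k-1} = 1/p$, so by linearity the sum of $r$ independent copies has mean $r/p$. A quick Python sketch (truncation length arbitrary):

```python
# mean of one Geometric(p) variable (number of trials until first success):
# sum_{k>=1} k * p * (1-p)^(k-1) = 1/p
p, r = 0.25, 4
geo_mean = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 5000))
print(geo_mean, r * geo_mean)  # ≈ 4.0 and 16.0, i.e. 1/p and r/p
```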
Note that in your argument
- By binomial theorem, $\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$ becomes $[p+(1-p)]^{k+r} = 1$
the equation $[p+(1-p)]^{k+r} = 1$ depends on the summation index $k$, which is not permissible. Furthermore, the expression
\begin{align*} [p+(1-p)]^{k+r}=\sum_{j=0}^{k+r}\binom{k+r}{j}p^j(1-p)^{k+r-j}=1 \end{align*} is a sum with a finite number of summands and is not helpful in your calculation. But aside from this step, you are on the right track.
Starting from your last expression we obtain \begin{align*} E(X)&=r\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k\\ &=rp^r\sum_{k=0}^{\infty}\binom{k+r}{k}(1-p)^k\tag{1}\\ &=rp^r\sum_{k=0}^{\infty}\binom{-(r+1)}{k}(-1)^k(1-p)^k\tag{2}\\ &=rp^r(1-(1-p))^{-(r+1)}\tag{3}\\ &=\frac{r}{p} \end{align*}
Comment:
- In (2) we use the identity $\binom{-n}{k}=\binom{n+k-1}{k}(-1)^k$ with $n=r+1$:
\begin{align*} \binom{-n}{k}&=\frac{(-n)(-n-1)\cdots(-n-k+1)}{k!}\\ &=(-1)^k\frac{n(n+1)\cdots(n+k-1)}{k!}\\ &=(-1)^k\frac{(n+k-1)!}{k!(n-1)!}\\ &=(-1)^k\binom{n+k-1}{k} \end{align*}
- In (3) we use the binomial series expansion $(1+z)^{-(r+1)}=\sum_{k=0}^{\infty}\binom{-(r+1)}{k}z^k$ with $z=-(1-p)$
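The identity $\binom{-n}{k}=(-1)^k\binom{n+k-1}{k}$ can also be checked numerically; here is a small Python sketch, where `gbinom` is a hypothetical helper for generalized binomial coefficients:

```python
from math import comb, prod
from fractions import Fraction

def gbinom(a, k):
    # generalized binomial coefficient: a(a-1)...(a-k+1) / k!
    num = prod(Fraction(a - i) for i in range(k))
    den = prod(Fraction(i) for i in range(1, k + 1))
    return num / den

# check C(-n, k) = (-1)^k C(n+k-1, k) exactly, for a range of n and k
for n in range(1, 6):
    for k in range(8):
        assert gbinom(-n, k) == (-1) ** k * comb(n + k - 1, k)
print("identity verified")
```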
Your application of the binomial theorem is wrong. In
$$\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$$
the exponent $r+k$ isn't constant: it varies with the summation index $k$, so the terms do not belong to a single binomial expansion.
For instance, with $r=2$,
$$\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^rq^k=\frac{2!}{2!}p^2+\frac{3!}{2!}p^2q+\frac{4!}{2!2!}p^2q^2+\frac{5!}{2!3!}p^2q^3\cdots\\ =\frac{p^2}2\left(2\cdot1+3\cdot2q+4\cdot3q^2+5\cdot4q^3+\cdots\right)=\frac{p^2}2\frac2{(1-q)^3}=\frac1p.$$ Multiplying by $r=2$ then gives $E(X)=\frac2p=\frac rp$, as expected.
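This worked example, too, checks out numerically (a Python sketch; the value of $p$ and the truncation length are arbitrary):

```python
p = 0.3
q = 1 - p
# sum_{k>=0} (k+2)(k+1) q^k = 2/(1-q)^3, so the full sum is
# (p^2/2) * 2/(1-q)^3 = p^2/p^3 = 1/p
s = sum((k + 2) * (k + 1) * q**k for k in range(4000))
print(s, 2 / (1 - q) ** 3)    # both ≈ 74.074 = 2/p^3
print(p**2 / 2 * s, 1 / p)    # both ≈ 3.333...
```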