
Binomial Distribution

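  1. Let p be the probability that a particular event A occurs in a single trial.
  2. Consider the number of times A occurs in n independent trials.
  3. The probability distribution of that count is called the binomial distribution.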

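In the example below:

  • The probability of getting any one question right is 1/4, and of getting it wrong is 3/4.
  • The distribution describes the number of questions answered correctly out of 3 questions (3 trials).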


x    P(X=x)                       power of 0.75    power of 0.25
0    0.75 * 0.75 * 0.75           3                0
1    3 * (0.75 * 0.75 * 0.25)     2                1
2    3 * (0.75 * 0.25 * 0.25)     1                2
3    0.25 * 0.25 * 0.25           0                3

$$P(X = r) = {\huge\text{?} \cdot 0.25^{r} \cdot 0.75^{3-r}} $$
$$P(X = r) = {\huge_{3}C_{r}} \cdot 0.25^{r} \cdot 0.75^{3-r}$$

If $_{n}C_{r}$ is the number of ways to choose r items out of n (without regard to order), then there are $_{3}C_{1} = 3$ ways to get exactly one of the three questions right.
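To see where the factor of 3 comes from, one can list every right/wrong pattern for the three questions and add up the pattern probabilities by the number of correct answers. The following is a minimal sketch; the names outcomes, n.right, pattern.prob, and by.count are illustrative and do not come from the original page.

```r
# Every right/wrong pattern for 3 questions (1 = right, 0 = wrong): 2^3 = 8 rows
outcomes <- expand.grid(q1 = 0:1, q2 = 0:1, q3 = 0:1)
p <- 0.25

# Number of correct answers in each pattern
n.right <- rowSums(outcomes)

# Probability of each individual pattern: 0.25^(right) * 0.75^(wrong)
pattern.prob <- p^n.right * (1 - p)^(3 - n.right)

# How many patterns give 0, 1, 2, 3 correct answers (the nCr coefficients)
table(n.right)
choose(3, 0:3)

# Adding the pattern probabilities by count reproduces the binomial probabilities
by.count <- tapply(pattern.prob, n.right, sum)
by.count
dbinom(0:3, 3, p)
```

The table of pattern counts should match choose(3, 0:3), which is the coefficient that multiplies each product of powers.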

Probability of getting exactly one question right
\begin{eqnarray*} P(X = 1) & = & _{3}C_{1} \cdot 0.25^{1} \cdot 0.75^{3-1} \\ & = & \frac{3!}{1! \cdot (3-1)!} \cdot 0.25 \cdot 0.75^2 \\ & = & 3 \cdot 0.25 \cdot 0.5625 \\ & = & 0.421875 \end{eqnarray*}

$$P(X = r) = _{n}C_{r} \cdot 0.25^{r} \cdot 0.75^{n-r}$$
$$P(X = r) = _{n}C_{r} \cdot p^{r} \cdot q^{n-r}$$

  1. You're running a series of independent trials (n trials in total).
  2. Each trial ends in either success or failure, and the probability of success (and therefore of failure) is the same for every trial.
  3. There is a finite, fixed number of trials, n. This differs from the geometric distribution, where the number of trials is not fixed in advance.

If X denotes the number of successful outcomes in n trials, the probability of exactly r successes is given by the formula below.

\begin{eqnarray*} P(X = r) & = & _{n}C_{r} \cdot p^{r} \cdot q^{n-r} \;\;\; \text{Where,} \\ _{n}C_{r} & = & \frac {n!}{r!(n-r)!} \end{eqnarray*}

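p = probability of success on each trial
q = 1 - p = probability of failure on each trial
n = number of trials
r = number of successes (e.g., correct answers) whose probability is wanted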
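The formula translates almost directly into R. The sketch below assumes a helper named binom.prob, which is hypothetical and not part of the original page, and compares it with R's built-in dbinom.

```r
# Hypothetical helper (not from the original page): binomial probability
# P(X = r) for n trials with success probability p.
binom.prob <- function(r, n, p) {
  q <- 1 - p                      # probability of failure
  choose(n, r) * p^r * q^(n - r)  # nCr * p^r * q^(n-r)
}

# Three questions, each answered correctly with probability 0.25
binom.prob(1, 3, 0.25)      # exactly one correct: 0.421875
binom.prob(0:3, 3, 0.25)    # all four probabilities at once
dbinom(0:3, 3, 0.25)        # R's built-in equivalent
```

Because choose and the power operator are vectorized, the helper accepts a vector of r values just as dbinom does.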

$$X \sim B(n,p)$$

Expectation and Variance of Binomial Distribution

Toss a fair coin once. What is the distribution of the number of heads?

  • A single trial
  • The trial can be one of two possible outcomes – success and failure
  • P(success) = p
  • P(failure) = 1-p

X = 0, 1 (failure and success)
$P(X=x) = p^{x}(1-p)^{1-x}$ or
$P(x) = p^{x}(1-p)^{1-x}$

For reference:

x       0            1
p(x)    q = (1-p)    p

When x = 0 (failure), $P(X = 0) = p^{0}(1-p)^{1-0} = (1-p)$ = Probability of failure
When x = 1 (success), $P(X = 1) = p^{1}(1-p)^{0} = p $ = Probability of success

This is called the Bernoulli distribution.
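As a quick check of the formula $P(X=x) = p^{x}(1-p)^{1-x}$, the sketch below evaluates it at x = 0 and x = 1 for a fair coin. The helper name bern.prob is an assumption made for illustration; it is not defined in the original page.

```r
# Hypothetical helper: Bernoulli probability p^x * (1-p)^(1-x) for x in {0, 1}
bern.prob <- function(x, p) p^x * (1 - p)^(1 - x)

p <- 0.5                 # fair coin, as in the coin-toss question
bern.prob(0, p)          # P(X = 0) = 1 - p
bern.prob(1, p)          # P(X = 1) = p
dbinom(0:1, 1, p)        # the same values from a binomial with n = 1
```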

  • The Bernoulli distribution is the building block for the binomial distribution, the geometric distribution, etc.
  • Binomial distribution = the distribution of the number of successes in n independent Bernoulli trials.
  • Geometric distribution = the distribution of the number of trials needed to get the first success in independent Bernoulli trials.

$$X \sim B(1,p)$$

\begin{eqnarray*} E(X) & = & \sum{x * p(x)} \\ & = & (0*q) + (1*p) \\ & = & p \end{eqnarray*}

\begin{eqnarray*} Var(X) & = & E((X - E(X))^{2}) \\ & = & \sum_{x}(x-E(X))^2p(x) \ldots \ldots \ldots E(X) = p \\ & = & (0 - p)^{2}*q + (1 - p)^{2}*p \\ & = & (0^2 - 2p0 + p^2)*q + (1-2p+p^2)*p \\ & = & p^2*(1-p) + (1-2p+p^2)*p \\ & = & p^2 - p^3 + p - 2p^2 + p^3 \\ & = & p - p^2 \\ & = & p(1-p) \\ & = & pq \end{eqnarray*}
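The two derivations can be verified numerically by applying the definitions of expectation and variance to the two-point distribution. This is a minimal sketch; the value p = 0.25 is chosen only for illustration.

```r
p <- 0.25            # illustrative value; any p between 0 and 1 works
q <- 1 - p
x  <- c(0, 1)        # possible values of X
px <- c(q, p)        # their probabilities

e.x   <- sum(x * px)              # E(X) = sum of x * p(x)
var.x <- sum((x - e.x)^2 * px)    # Var(X) = sum of (x - E(X))^2 * p(x)

e.x      # equals p
var.x    # equals p * q
```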

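To generalize, treat $X$ as the sum of n independent Bernoulli trials, $X = X_{1} + X_{2} + ... + X_{n}$, each with $E(X_{i}) = p$ and $Var(X_{i}) = pq$: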

$$X \sim B(n,p)$$

\begin{eqnarray*} E(X) & = & E(X_{1}) + E(X_{2}) + ... + E(X_{n}) \\ & = & n * E(X_{i}) \\ & = & n * p \end{eqnarray*}

\begin{eqnarray*} Var(X) & = & Var(X_{1}) + Var(X_{2}) + ... + Var(X_{n}) \\ & = & n * Var(X_{i}) \\ & = & n * p * q \end{eqnarray*}
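Adding the variances term by term relies on the trials being independent. The results $np$ and $npq$ can also be checked directly from the probabilities that dbinom returns. The sketch below uses n = 10 and p = 0.3, which are illustrative values and not taken from the page.

```r
n <- 10; p <- 0.3; q <- 1 - p   # illustrative values
r  <- 0:n                       # every possible number of successes
pr <- dbinom(r, n, p)           # P(X = r) for each r

e.x   <- sum(r * pr)              # expectation from the definition
var.x <- sum((r - e.x)^2 * pr)    # variance from the definition

c(e.x, n * p)          # both entries should agree
c(var.x, n * p * q)    # both entries should agree
```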

e.g.,

In the latest round of Who Wants To Win A Swivel Chair, there are 5 questions. The probability of
getting a successful outcome in a single trial is 0.25.

  1. What’s the probability of getting exactly two questions right?
  2. What’s the probability of getting exactly three questions right?
  3. What’s the probability of getting two or three questions right?
  4. What’s the probability of getting no questions right?
  5. What are the expectation and variance?

Ans 1.

p <- .25
q <- 1-p
r <- 2
n <-5
# combinations of 5,2
c <- choose(n,r) 
ans1 <- c*(p^r)*(q^(n-r))
ans1    # or

choose(n, r)*(p^r)*(q^(n-r))

dbinom(r, n, p)
> p <- .25
> q <- 1-p
> r <- 2
> n <-5
> # combinations of 5,2
> c <- choose(n,r)
> ans1 <- c*(p^r)*(q^(n-r))
> ans1
[1] 0.2636719
>
> choose(n, r)*(p^r)*(q^(n-r))
[1] 0.2636719
>
> dbinom(r, n, p)
[1] 0.2636719
> 
> 

Ans 2.

p <- .25
q <- 1-p
r <- 3
n <-5
# combinations of 5,3
c <- choose(n,r)
ans2 <- c*(p^r)*(q^(n-r))
ans2

choose(n, r)*(p^r)*(q^(n-r))

dbinom(r, n, p)
> p <- .25
> q <- 1-p
> r <- 3
> n <-5
> # combinations of 5,3
> c <- choose(n,r)
> ans2 <- c*(p^r)*(q^(n-r))
> ans2
[1] 0.08789062
> 
> choose(n,r)*(p^r)*(q^(n-r))
[1] 0.08789062
> 
> dbinom(r, n, p)
[1] 0.08789063
> 
> 

Ans 3. (Important)

ans1 + ans2
dbinom(2, 5, .25) + dbinom(3, 5, .25) 
dbinom(2:3, 5, .25)
sum(dbinom(2:3, 5, .25))
pbinom(3, 5, .25) - pbinom(1, 5, .25)
> ans1 + ans2
[1] 0.3515625
> dbinom(2, 5, .25) + dbinom(3, 5, .25) 
[1] 0.3515625
> dbinom(2:3, 5, .25)
[1] 0.26367187 0.08789063
> sum(dbinom(2:3, 5, .25))
[1] 0.3515625
> pbinom(3, 5, .25) - pbinom(1, 5, .25)
[1] 0.3515625
> 
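The last form works because pbinom(k, n, p) returns the cumulative probability $P(X \le k)$, so subtracting pbinom(1, ...) from pbinom(3, ...) leaves only $P(X = 2) + P(X = 3)$. A minimal sketch of that relationship, with cdf as an illustrative variable name:

```r
n <- 5; p <- 0.25
cdf <- cumsum(dbinom(0:n, n, p))   # cdf[k + 1] = P(X <= k)
all.equal(cdf, pbinom(0:n, n, p))  # running sum of dbinom matches pbinom

# P(2 <= X <= 3) = P(X <= 3) - P(X <= 1)
cdf[3 + 1] - cdf[1 + 1]
pbinom(3, n, p) - pbinom(1, n, p)
```

Note that the lower bound passed to pbinom is 1, not 2, because the value 2 itself has to stay in the range.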

Ans 4.

p <- .25
q <- 1-p
r <- 0
n <-5
# combinations of 5,0
c <- choose(n,r)
ans4 <- c*(p^r)*(q^(n-r))
ans4
> p <- .25
> q <- 1-p
> r <- 0
> n <-5
> # combinations of 5,0
> c <- choose(n,r)
> ans4 <- c*(p^r)*(q^(n-r))
> ans4
[1] 0.2373047
> 
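With r = 0 the formula collapses to a single power of q, since $_{5}C_{0} = 1$ and $0.25^{0} = 1$:

\begin{eqnarray*} P(X = 0) & = & _{5}C_{0} \cdot 0.25^{0} \cdot 0.75^{5-0} \\ & = & 1 \cdot 1 \cdot 0.75^{5} \\ & = & 0.2373046875 \end{eqnarray*}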

Ans 5.

p <- .25
q <- 1-p
n <- 5
exp.x <- n*p
exp.x
> p <- .25
> q <- 1-p
> n <- 5
> exp.x <- n*p
> exp.x
[1] 1.25
p <- .25
q <- 1-p
n <- 5
var.x <- n*p*q
var.x
> p <- .25
> q <- 1-p
> n <- 5
> var.x <- n*p*q
> var.x
[1] 0.9375
> 
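The two values can also be approximated by simulation. The sketch below is not part of the original answer: it draws 100,000 simulated rounds with rbinom and compares the sample mean and variance with $np$ and $npq$. The seed and the number of draws are arbitrary choices.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 5; p <- 0.25; q <- 1 - p
sims <- rbinom(100000, n, p)      # 100,000 simulated rounds of 5 questions

mean(sims)    # should be close to n * p = 1.25
var(sims)     # should be close to n * p * q = 0.9375
```

The simulated values will not match exactly; they only get closer to 1.25 and 0.9375 as the number of draws grows.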

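Q. The probability of getting a single question right is 1/4. If there are six questions in total, what is the probability of getting between 0 and 5 of them right? Find it using dbinom.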

p <- 1/4
q <- 1-p
n <- 6
pbinom(5, n, p)

1 - dbinom(6, n, p)
> p <- 1/4
> q <- 1-p
> n <- 6
> pbinom(5, n, p)
[1] 0.9997559
> 1 - dbinom(6, n, p)
[1] 0.9997559
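The same probability can be obtained by summing the six individual dbinom values, which is the most literal reading of the question. A short sketch showing that the three routes agree:

```r
p <- 1/4
n <- 6
sum(dbinom(0:5, n, p))     # add up P(X = 0), ..., P(X = 5)
pbinom(5, n, p)            # cumulative form, P(X <= 5)
1 - p^n                    # complement of getting all six right
```

The complement is the shortest route here because only one outcome, all six correct, is excluded.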

Important: the probabilities of all possible outcomes (r = 0, 1, ..., n) add up to 1.

# http://commres.net/wiki/mean_and_variance_of_binomial_distribution
# ##################################################################
#
p <- 1/4
q <- 1 - p
n <- 5
r <- 0
all.dens <- dbinom(0:n, n, p)
all.dens
sum(all.dens)

choose(5,0)*p^0*(q^(5-0))
choose(5,1)*p^1*(q^(5-1))
choose(5,2)*p^2*(q^(5-2))
choose(5,3)*p^3*(q^(5-3))
choose(5,4)*p^4*(q^(5-4))
choose(5,5)*p^5*(q^(5-5))
all.dens

choose(5,0)*p^0*(q^(5-0)) + 
  choose(5,1)*p^1*(q^(5-1)) + 
  choose(5,2)*p^2*(q^(5-2)) + 
  choose(5,3)*p^3*(q^(5-3)) + 
  choose(5,4)*p^4*(q^(5-4)) + 
  choose(5,5)*p^5*(q^(5-5))
sum(all.dens)
# 
(p+q)^n
# note that (p+q)^n = 1 for any n, because p + q = 1
> # http://commres.net/wiki/mean_and_variance_of_binomial_distribution
> # ##################################################################
> #
> p <- 1/4
> q <- 1 - p
> n <- 5
> r <- 0
> all.dens <- dbinom(0:n, n, p)
> all.dens
[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
[5] 0.0146484375 0.0009765625
> sum(all.dens)
[1] 1
> 
> choose(5,0)*p^0*(q^(5-0))
[1] 0.2373047
> choose(5,1)*p^1*(q^(5-1))
[1] 0.3955078
> choose(5,2)*p^2*(q^(5-2))
[1] 0.2636719
> choose(5,3)*p^3*(q^(5-3))
[1] 0.08789062
> choose(5,4)*p^4*(q^(5-4))
[1] 0.01464844
> choose(5,5)*p^5*(q^(5-5))
[1] 0.0009765625
> all.dens
[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
[5] 0.0146484375 0.0009765625
> 
> choose(5,0)*p^0*(q^(5-0)) + 
+   choose(5,1)*p^1*(q^(5-1)) + 
+   choose(5,2)*p^2*(q^(5-2)) + 
+   choose(5,3)*p^3*(q^(5-3)) + 
+   choose(5,4)*p^4*(q^(5-4)) + 
+   choose(5,5)*p^5*(q^(5-5))
[1] 1
> sum(all.dens)
[1] 1
> # 
> (p+q)^n
[1] 1
> # note that (p+q)^n = 1 for any n, because p + q = 1
> 
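The reason the six terms add up to 1 is the binomial theorem: the terms $_{n}C_{r} \cdot p^{r} \cdot q^{n-r}$ are exactly the terms in the expansion of $(p+q)^{n}$, and $p + q = 1$.

$$\sum_{r=0}^{n} {_{n}C_{r}} \cdot p^{r} \cdot q^{n-r} = (p + q)^{n} = 1^{n} = 1$$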
b/head_first_statistics/binomial_distribution.1759787023.txt.gz · Last modified: by hkimscil
