b:head_first_statistics:binomial_distribution

Differences

This shows you the differences between two versions of the page.

b:head_first_statistics:binomial_distribution [2025/10/07 06:40] – created hkimscil
b:head_first_statistics:binomial_distribution [2025/10/07 06:45] (current) – [Proof of Binomial Expected Value and Variance] hkimscil
Line 1: Line 1:
-====== Binomial Distributions ======
+======= Binomial Distribution =======
  
   - If the probability that a particular event A occurs in a single trial is p,
Line 54: Line 54:
 $$X \sim B(n,p)$$
  
 +====== Expectation and Variance of Binomial Distribution ======
 +Toss a fair coin once. What is the distribution of the number of heads?
 +  * A single trial
 +  * The trial has one of two possible outcomes -- success or failure
 +  * P(success) = p
 +  * P(failure) = 1-p
 +
 +X = 0, 1 (failure and success)
 +$P(X=x) = p^{x}(1-p)^{1-x}$ or 
 +$P(x) = p^{x}(1-p)^{1-x}$
 +
 +Note:
 +| x     | 0          | 1  |
 +| p(x)  | q = (1-p)  | p  | 
 +
 +When x = 0 (failure), $P(X = 0) = p^{0}(1-p)^{1-0} = (1-p)$ = Probability of failure
 +When x = 1 (success), $P(X = 1) = p^{1}(1-p)^{0} = p $ = Probability of success
 +
 +
 +This is called the Bernoulli distribution.
 +  * The Bernoulli distribution is the building block for the binomial distribution, the geometric distribution, etc.
 +  * Binomial distribution = the distribution of the number of successes in n independent Bernoulli trials.
 +  * Geometric distribution = the distribution of the number of trials needed to get the first success in independent Bernoulli trials.
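
The two definitions above can be illustrated with R's built-in density functions. This is a minimal sketch, assuming p = 0.25, n = 5, and k = 3 purely as illustrative values. Note that R's `dgeom(x, p)` counts the number of failures before the first success, so the probability that the first success occurs on trial k is `dgeom(k - 1, p)`.

```r
p <- 0.25   # assumed success probability (illustrative only)
n <- 5      # assumed number of trials (illustrative only)
k <- 3      # assumed trial on which the first success occurs

# Binomial: P(exactly 2 successes in n independent Bernoulli trials)
dbinom(2, n, p)

# Geometric: P(first success occurs on trial k)
# dgeom() counts failures before the first success, hence k - 1
dgeom(k - 1, p)
(1 - p)^(k - 1) * p   # same value from the formula q^(k-1) * p
```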
 +
 +$$X \sim B(1,p)$$
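
The single-trial probabilities can be checked numerically in R. This is a sketch under stated assumptions: p = 0.25 is an illustrative value only, and `bern.pmf` is a hypothetical helper name that does not appear elsewhere on this page.

```r
# Assumption: p = 0.25 is an illustrative value only
p <- 0.25

# Hypothetical helper: P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}
bern.pmf <- function(x, p) p^x * (1 - p)^(1 - x)

bern.pmf(0:1, p)    # P(X = 0) and P(X = 1)
dbinom(0:1, 1, p)   # same values, since a single trial is B(1, p)
```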
 +
 +\begin{eqnarray*}
 +E(X) & = & \sum{x * p(x)} \\
 +& = & (0*q) + (1*p) \\
 +& = & p 
 +\end{eqnarray*} 
 +
 +
 +\begin{eqnarray*}
 +Var(X) & = & E((X - E(X))^{2}) \\
 +& = & \sum_{x}(x-E(X))^2p(x) \quad (\text{where } E(X) = p) \\
 +& = & (0 - p)^{2}*q + (1 - p)^{2}*p  \\
 +& = & (0^2 - 2 \cdot p \cdot 0 + p^2)*q + (1-2p+p^2)*p \\
 +& = & p^2*(1-p) + (1-2p+p^2)*p \\
 +& = & p^2 - p^3 + p - 2p^2 + p^3 \\
 +& = & p - p^2 \\
 +& = & p(1-p) \\
 +& = & pq
 +\end{eqnarray*}
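
The two single-trial results, E(X) = p and Var(X) = pq, can be verified by applying the definitions directly to the two outcomes. A minimal sketch, assuming p = 0.25 as an illustrative value; the names `bern.mean` and `bern.var` are hypothetical.

```r
p <- 0.25            # assumed value, for illustration only
q <- 1 - p
x  <- c(0, 1)        # possible outcomes (failure, success)
px <- c(q, p)        # their probabilities

bern.mean <- sum(x * px)                   # E(X) = sum of x * p(x)
bern.var  <- sum((x - bern.mean)^2 * px)   # Var(X) = sum of (x - E(X))^2 * p(x)

c(bern.mean, p)      # the two entries agree
c(bern.var, p * q)   # the two entries agree
```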
 +
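 +More generally, let $X = X_{1} + X_{2} + \ldots + X_{n}$ be the number of successes in n independent Bernoulli trials, each with $E(X_{i}) = p$ and $Var(X_{i}) = pq$. The variance of the sum equals the sum of the variances because the trials are independent.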
 +
 +$$X \sim B(n,p)$$
 +
 +\begin{eqnarray*}
 +E(X) & = & E(X_{1}) + E(X_{2}) + ... + E(X_{n}) \\
 +& = & n * E(X_{i}) \\
 +& = & n * p 
 +\end{eqnarray*}
 +
 +\begin{eqnarray*}
 +Var(X) & = & Var(X_{1}) + Var(X_{2}) + ... + Var(X_{n}) \\
 +& = & n * Var(X_{i}) \\
 +& = & n * p * q 
 +\end{eqnarray*}
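
The formulas E(X) = np and Var(X) = npq can also be checked against the full probability distribution. The sketch below assumes n = 5 and p = 0.25 as illustrative values and computes the mean and variance directly from the definition, using `dbinom` for the probabilities; `pmf.mean` and `pmf.var` are hypothetical names.

```r
n <- 5        # assumed number of trials (illustrative only)
p <- 0.25     # assumed success probability (illustrative only)
q <- 1 - p

x  <- 0:n                 # all possible numbers of successes
px <- dbinom(x, n, p)     # P(X = x) for each x

pmf.mean <- sum(x * px)                  # E(X) from the definition
pmf.var  <- sum((x - pmf.mean)^2 * px)   # Var(X) from the definition

c(pmf.mean, n * p)       # the two entries agree
c(pmf.var, n * p * q)    # the two entries agree
```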
 +
 +====== Example ======
 +<WRAP box>
 +In the latest round of Who Wants To Win A Swivel Chair, there are 5 questions. The probability of
 +getting a successful outcome in a single trial (answering one question correctly) is 0.25.
 +  - What’s the probability of getting exactly two questions right?
 +  - What’s the probability of getting exactly three questions right? 
 +  - What’s the probability of getting two or three questions right? 
 +  - What’s the probability of getting no questions right?
 +  - What are the expectation and variance?
 +</WRAP>
 +
 +Ans 1. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 2
 +n <-5
 +# combinations of 5,2
 +c <- choose(n,r) 
 +ans1 <- c*(p^r)*(q^(n-r))
 +ans1    # or
 +
 +choose(n, r)*(p^r)*(q^(n-r))
 +
 +dbinom(r, n, p)
 +
 +</code>
 +
 +<code>
 +> p <- .25
 +> q <- 1-p
 +> r <- 2
 +> n <-5
 +> # combinations of 5,2
 +> c <- choose(n,r)
 +> ans1 <- c*(p^r)*(q^(n-r))
 +> ans1
 +[1] 0.2636719
 +>
 +> choose(n, r)*(p^r)*(q^(n-r))
 +[1] 0.2636719
 +>
 +> dbinom(r, n, p)
 +[1] 0.2636719
 +
 +
 +</code>
 +
 +
 +
 +
 +
 +
 +Ans 2. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 3
 +n <-5
 +# combinations of 5,3
 +c <- choose(n,r)
 +ans2 <- c*(p^r)*(q^(n-r))
 +ans2
 +
 +choose(n, r)*(p^r)*(q^(n-r))
 +
 +dbinom(r, n, p)
 +
 +</code>
 +<code>
 +> p <- .25
 +> q <- 1-p
 +> r <- 3
 +> n <-5
 +> # combinations of 5,3
 +> c <- choose(n,r)
 +> ans2 <- c*(p^r)*(q^(n-r))
 +> ans2
 +[1] 0.08789062
 +
 +> choose(n,r)*(p^r)*(q^(n-r))
 +[1] 0.08789062
 +
 +> dbinom(r, n, p)
 +[1] 0.08789063
 +
 +
 +</code>
 +
 +Ans 3. Important
 +<code>
 +ans1 + ans2
 +dbinom(2, 5, .25) + dbinom(3, 5, .25) 
 +dbinom(2:3, 5, .25)
 +sum(dbinom(2:3, 5, .25))
 +pbinom(3, 5, .25) - pbinom(1, 5, .25)
 +</code>
 +
 +<code>
 +> ans1 + ans2
 +[1] 0.3515625
 +> dbinom(2, 5, .25) + dbinom(3, 5, .25) 
 +[1] 0.3515625
 +> dbinom(2:3, 5, .25)
 +[1] 0.26367187 0.08789063
 +> sum(dbinom(2:3, 5, .25))
 +[1] 0.3515625
 +> pbinom(3, 5, .25) - pbinom(1, 5, .25)
 +[1] 0.3515625
 +
 +</code>
 +
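
The `pbinom` version works because P(X ≤ 3) − P(X ≤ 1) leaves exactly P(X = 2) + P(X = 3). That pattern can be wrapped in a small function; the sketch below assumes integer bounds with a ≤ b, and `binom.range` is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: P(a <= X <= b) for X ~ B(n, p), with integers a <= b
binom.range <- function(a, b, n, p) {
  # P(X <= b) minus P(X <= a - 1); pbinom() returns 0 for negative quantiles
  pbinom(b, n, p) - pbinom(a - 1, n, p)
}

binom.range(2, 3, 5, 0.25)    # two or three questions right
sum(dbinom(2:3, 5, 0.25))     # same value by summing the densities
```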
 +Ans 4. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 0
 +n <-5
 +# combinations of 5,0
 +c <- choose(n,r)
 +ans4 <- c*(p^r)*(q^(n-r))
 +ans4
 +</code>
 +
 +<code>> p <- .25
 +> q <- 1-p
 +> r <- 0
 +> n <-5
 +> # combinations of 5,0
 +> c <- choose(n,r)
 +> ans4 <- c*(p^r)*(q^(n-r))
 +> ans4
 +[1] 0.2373047
 +> </code>
 +
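
For r = 0 the formula simplifies: choose(n, 0) = 1 and p^0 = 1, so the probability of getting no questions right is just q^n. A short sketch of that shortcut, using the same values as the example:

```r
p <- 0.25
q <- 1 - p
n <- 5

q^n               # only q^n remains when r = 0
dbinom(0, n, p)   # same value via the built-in density
```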
 +Ans 5
 +<code>
 +p <- .25
 +q <- 1-p
 +n <- 5
 +exp.x <- n*p
 +exp.x
 +</code>
 +<code>> p <- .25
 +> q <- 1-p
 +> n <- 5
 +> exp.x <- n*p
 +> exp.x
 +[1] 1.25</code>
 +
 +<code>
 +p <- .25
 +q <- 1-p
 +n <- 5
 +var.x <- n*p*q
 +var.x
 +</code>
 +<code>> p <- .25
 +> q <- 1-p
 +> n <- 5
 +> var.x <- n*p*q
 +> var.x
 +[1] 0.9375
 +> </code>
 +
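
As an informal cross-check, the expectation and variance can be approximated by simulation. This is only a sketch: the seed and the number of simulated rounds (100,000) are arbitrary choices, and the simulated values will be close to, not exactly equal to, n*p and n*p*q.

```r
set.seed(1)                           # arbitrary seed, for reproducibility
sims <- rbinom(100000, 5, 0.25)       # 100,000 simulated rounds of 5 questions

mean(sims)   # should be close to n*p   = 1.25
var(sims)    # should be close to n*p*q = 0.9375
```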
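 +Q. The probability of getting a single question right is 1/4. If there are six questions in total, what is the probability of getting 0 to 5 questions right? Use dbinom to find it.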
 +<code>
 +p <- 1/4
 +q <- 1-p
 +n <- 6
 +pbinom(5, n, p)
 +
 +1 - dbinom(6, n, p)
 +</code> 
 +<code>
 +> p <- 1/4
 +> q <- 1-p
 +> n <- 6
 +> pbinom(5, n, p)
 +[1] 0.9997559
 +> 1 - dbinom(6, n, p)
 +[1] 0.9997559
 +
 +</code>
 +
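
Since the question asks for `dbinom`, another way to get the same answer is to sum the six densities for 0 through 5 correct answers directly. A minimal sketch using the same p and n:

```r
p <- 1/4
n <- 6

# P(X = 0) + P(X = 1) + ... + P(X = 5)
sum(dbinom(0:5, n, p))

# Complement: everything except getting all six right
1 - p^n
```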
 +Important:
 +<code>
 +# http://commres.net/wiki/mean_and_variance_of_binomial_distribution
 +# ##################################################################
 +#
 +p <- 1/4
 +q <- 1 - p
 +n <- 5
 +r <- 0
 +all.dens <- dbinom(0:n, n, p)
 +all.dens
 +sum(all.dens)
 +
 +choose(5,0)*p^0*(q^(5-0))
 +choose(5,1)*p^1*(q^(5-1))
 +choose(5,2)*p^2*(q^(5-2))
 +choose(5,3)*p^3*(q^(5-3))
 +choose(5,4)*p^4*(q^(5-4))
 +choose(5,5)*p^5*(q^(5-5))
 +all.dens
 +
 +choose(5,0)*p^0*(q^(5-0)) + 
 +  choose(5,1)*p^1*(q^(5-1)) + 
 +  choose(5,2)*p^2*(q^(5-2)) + 
 +  choose(5,3)*p^3*(q^(5-3)) + 
 +  choose(5,4)*p^4*(q^(5-4)) + 
 +  choose(5,5)*p^5*(q^(5-5))
 +sum(all.dens)
 +
 +(p+q)^n
 +# note that p + q = 1, so (p+q)^n = 1 for any n
 +
 +</code>
 +
 +<code>
 +> # http://commres.net/wiki/mean_and_variance_of_binomial_distribution
 +> # ##################################################################
 +> #
 +> p <- 1/4
 +> q <- 1 - p
 +> n <- 5
 +> r <- 0
 +> all.dens <- dbinom(0:n, n, p)
 +> all.dens
 +[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
 +[5] 0.0146484375 0.0009765625
 +> sum(all.dens)
 +[1] 1
 +
 +> choose(5,0)*p^0*(q^(5-0))
 +[1] 0.2373047
 +> choose(5,1)*p^1*(q^(5-1))
 +[1] 0.3955078
 +> choose(5,2)*p^2*(q^(5-2))
 +[1] 0.2636719
 +> choose(5,3)*p^3*(q^(5-3))
 +[1] 0.08789062
 +> choose(5,4)*p^4*(q^(5-4))
 +[1] 0.01464844
 +> choose(5,5)*p^5*(q^(5-5))
 +[1] 0.0009765625
 +> all.dens
 +[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
 +[5] 0.0146484375 0.0009765625
 +
 +> choose(5,0)*p^0*(q^(5-0)) + 
 ++   choose(5,1)*p^1*(q^(5-1)) + 
 ++   choose(5,2)*p^2*(q^(5-2)) + 
 ++   choose(5,3)*p^3*(q^(5-3)) + 
 ++   choose(5,4)*p^4*(q^(5-4)) + 
 ++   choose(5,5)*p^5*(q^(5-5))
 +[1] 1
 +> sum(all.dens)
 +[1] 1
 +> # 
 +> (p+q)^n
 +[1] 1
 +> # note that p + q = 1, so (p+q)^n = 1 for any n
 +
 +</code>
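
The reason the probabilities always sum to 1 is the binomial theorem: the individual terms computed above are exactly the terms in the expansion of $(p+q)^{n}$, and $p + q = 1$. Written out:

$$\sum_{r=0}^{n} \binom{n}{r} p^{r} q^{n-r} = (p+q)^{n} = 1^{n} = 1$$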
 +====== Proof of Expected Value and Variance in Binomial Distribution ======
 +[[:Mean and Variance of Binomial Distribution|Mathematical proof of the expected value and variance of the binomial distribution]]
b/head_first_statistics/binomial_distribution.1759786841.txt.gz · Last modified: by hkimscil
