Var(X1 + X2 + \ldots + Xn) & = & nVar(X) \\
\\
\text{note that there are k Xs below}  \\
Var(X + X + \ldots + X) & = & Var(kX) \\
& = & k^2 Var(X) \\
\end{eqnarray*}
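A quick numerical check of the two results above in R (a minimal sketch; the sample size, k = 3, and the N(0, sd = 2) distribution are arbitrary choices):

<code>
# Var(X1 + ... + Xn) = n Var(X)  versus  Var(kX) = k^2 Var(X)
set.seed(1)
k <- 3
x <- replicate(k, rnorm(100000, mean = 0, sd = 2))  # k independent copies of X
var(rowSums(x))   # sum of k independent copies: close to k * Var(X) = 12
var(k * x[, 1])   # one copy scaled by k: exactly k^2 * var(x[, 1]), close to 36
</code>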
  
----
===== e.g. =====
Eg. 
{{discrete.prob.fortune.cookie.eg.jpg?500}}
p.fc <- c(0.8, 0.1, 0.07, 0.03)

exp.fc <- sum(v.fc * p.fc)
var.fc <- sum((v.fc - exp.fc)^2 * p.fc)
exp.fc
var.fc
p.fc <- c(0.8, 0.1, 0.07, 0.03)

exp.fc2 <- sum(v.fc2 * p.fc)
var.fc2 <- sum((v.fc2 - exp.fc2)^2 * p.fc)
exp.fc2
var.fc2
===== e.g.2 =====
A restaurant offers two menus, one for weekdays and the other for weekends. Each menu offers four set prices, and the probability distributions for the amount someone pays are as follows:
----
SEE [[:Expected value and variance properties]]

====== Theorems ======
| $E(X)$ | $\sum{x}\cdot P(X=x)$  |
| $E(X^2)$ | $\sum{x^{2}}\cdot P(X=x)$  |
| $E(aX + b)$ | $aE(X) + b$  |
| $E(f(X))$ | $\sum{f(x)} \cdot P(X=x)$  |
| $E(aX - bY)$ | $aE(X)-bE(Y)$  |
| $E(X1 + X2 + X3)$ | $E(X) + E(X) + E(X) = 3E(X) \;\;\; $ ((X1, X2, and X3 are sets with the same statistics as X, that is, the same mean, variance, and standard deviation as the set X))   |
| $Var(X)$ | $E[(X-\mu)^{2}] = E(X^{2})-E(X)^{2} \;\;\; $   see $\ref{var.theorem.1} $ |
| $Var(c)$  | $0 \;\;\; $ see $\ref{var.theorem.41}$   |
| $Var(aX + b)$ | $a^{2}Var(X) \;\;\; $  see $\ref{var.theorem.2}$ and $\ref{var.theorem.3}$ |
| $Var(aX - bY)$ | $a^{2}Var(X) + b^{2}Var(Y)$, for independent X and Y; see the derivation below  |
| $Var(X1 + X2 + X3)$ | $Var(X) + Var(X) + Var(X) = 3 Var(X) \;\;\; $ ((X1, X2, and X3 are three independent sets with identical statistics (e.g., mean = 0, sd = 1); because they are independent, the variance of their sum is the sum of the three equal variances, 3Var(X), not Var(3X) = 9Var(X), which applies only when the same set is added to itself))  |
| $Var(X1 + X1 + X1)$  | $Var(3X) = 3^2 Var(X) = 9 Var(X) $  |

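A quick numerical check of the $E(aX + b)$ and $Var(aX + b)$ rows in R (a minimal sketch; the pmf below is made up for illustration):

<code>
# numerical check of E(aX + b) = aE(X) + b and Var(aX + b) = a^2 Var(X)
x <- c(1, 2, 3, 4)
p <- c(0.4, 0.3, 0.2, 0.1)
a <- 2
b <- 5
e.x <- sum(x * p)                         # E(X) = 2
v.x <- sum((x - e.x)^2 * p)               # Var(X) = 1
sum((a * x + b) * p)                      # E(aX + b)
a * e.x + b                               # same value: 9
sum((a * x + b - (a * e.x + b))^2 * p)    # Var(aX + b)
a^2 * v.x                                 # same value: 4
</code>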
\begin{eqnarray*}
Var(aX - bY) & = & Var(aX + (-b)Y) \\
& = & Var(aX) + Var(-bY), \;\;\; \text{assuming X and Y are independent} \\
& = & a^{2}Var(X) + b^{2}Var(Y)
\end{eqnarray*}

see also [[:why n-1]]
====== Variance Theorem 1 ======
\begin{align}
Var[X] & = E[(X-\mu)^2]  \nonumber \\
& = E[X^2 - 2 X \mu + \mu^2] \nonumber \\
& = E[X^2] - 2 \mu E[X] + E[\mu^2] \nonumber \\
& = E[X^2] - 2 \mu^2 + \mu^2, \; \text{because } E[X] = \mu \text{ and } E[\mu^2] = \mu^2 \nonumber \\
& = E[X^2] - \mu^2 \nonumber \\
& = E[X^2] - E[X]^2 \label{var.theorem.1} \tag{variance theorem 1}
\end{align}
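A quick check of this identity on a sample in R (a sketch; the sample mean and mean((x - mean(x))^2) play the roles of $\mu$ and $E[(X-\mu)^2]$):

<code>
# Var[X] = E[X^2] - E[X]^2, checked on a sample
set.seed(1)
x <- rnorm(100000, mean = 3, sd = 2)
mean((x - mean(x))^2)     # E[(X - mu)^2]
mean(x^2) - mean(x)^2     # E[X^2] - E[X]^2; identical up to rounding
</code>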

====== Theorem 2: Why square ======

According to $\ref{var.theorem.1}$,
$$ Var[X] = E[X^2] - E[X]^2 $$
so

\begin{align*}
Var[aX] & = E[(aX)^2] - (E[aX])^2 \\
 & = a^2 E[X^2] - (a E[X])^2 \\
 & = a^2 E[X^2] - a^2 E[X]^2 \\
 & = a^2 (E[X^2] - E[X]^2) \\
 & = a^2 Var[X] \label{var.theorem.2} \tag{variance theorem 2} \\
\end{align*}
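Checking $Var[aX] = a^2 Var[X]$ on a sample (a sketch; a = 5 and the sample are arbitrary choices):

<code>
# Var[aX] = a^2 Var[X]: sample variance scales by a^2
set.seed(1)
x <- rnorm(100000)
a <- 5
var(a * x)     # scaling by a multiplies the variance by a^2
a^2 * var(x)   # same value
</code>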
====== Theorem 3: Why Var[X+c] = Var[X] ======
\begin{align}
Var[X + c] = Var[X] \nonumber
\end{align}

According to $\ref{var.theorem.1}$,
$$ Var[X] = E[X^2] - E[X]^2 $$
so

\begin{align}
Var[X + c] 
= & E[(X+c)^2] - E[X+c]^2 \nonumber \\
= & E[X^2 + 2cX + c^2] \label{tmp.1} \tag{temp 1} \\
  & - E(X + c)E(X + c)  \label{tmp.2} \tag{temp 2} \\
\end{align}

From $\ref{tmp.1}$,
\begin{align}
E(X^2 + 2cX + c^2) = E(X^2) + 2cE(X) + c^2  \\
\end{align} 

And from $\ref{tmp.2}$,
\begin{align}
E(X + c)E(X + c) = & E(X)(E(X + c)) + E(c)(E(X + c)) \nonumber \\
= & E(X)^2 + cE(X) + cE(X) + c^2 \nonumber \\
= & E(X)^2 + 2cE(X) + c^2 \\
\end{align} 

Putting the two together,
\begin{align}
Var(X + c) = & E(X^2) + 2cE(X) + c^2 - E(X)^2 - 2cE(X) - c^2 \nonumber \\
= & E(X^2) - E(X)^2 \nonumber \\
= & Var(X) \label{var.theorem.3} \tag{variance theorem 3} \\
\end{align}
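Checking $Var[X + c] = Var[X]$ on a sample (a sketch; c = 100 is arbitrary):

<code>
# Var[X + c] = Var[X]: shifting by a constant does not change the spread
set.seed(1)
x <- rnorm(100000)
var(x + 100)   # identical to var(x)
var(x)
</code>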

====== Theorem 4: Var(c) = 0 ======
\begin{align}
Var(X) = & 0 \;\;\;\; \text{if } X = c \text{, a constant}  \label{var.theorem.41} \tag{variance theorem 4.1} \\
\text{otherwise} \;\; & \nonumber \\
Var(X) \ge & 0 \label{var.theorem.42} \tag{variance theorem 4.2} \\
\end{align}
By definition, variance is as follows. If $X = c$ (where c is a constant), then
\begin{align}
Var(X) & = E[(X - E(X))^2] \nonumber \\
& = E[(c-c)^2], \;\; \text{because } X = c \text{ and } E(X) = c \nonumber  \\ 
& = 0   \nonumber  \\
\text{if } X \ne c & \text{, then} \nonumber  \\
Var(X) & \ge 0, \;\; \text{because } (X - E(X))^2 \ge 0 \nonumber 
\end{align}
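A minimal check of both parts in R:

<code>
# Var(c) = 0: a constant never deviates from its mean
var(rep(3, 10))   # exactly 0
# otherwise Var(X) >= 0, e.g.
var(rnorm(10))    # always positive for a non-constant sample
</code>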

====== Theorem 5: Var(X+Y) ======
\begin{align}
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y) \label{var.theorem.51} \tag{variance theorem 5-1} \\
Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y) \label{var.theorem.52} \tag{variance theorem 5-2} \\
\end{align}

From $\ref{var.theorem.1}$,
$$ Var[X] = E[X^2] - E[X]^2  $$
so substituting $X + Y$ for $X$ gives

\begin{align}
Var[X+Y] = & E[(X + Y)^2]  \label{tmp.03} \tag{temp 3} \\ 
- & E[(X + Y)]^2  \label{tmp.04} \tag{temp 4} 
\end{align}
$\ref{tmp.03}$ and $\ref{tmp.04}$ expand as follows:

\begin{align*}
  &  E[(X + Y)^2] = E[X^2 + 2XY + Y^2] = E[X^2] + 2E[XY] + E[Y^2] \\
- & [E(X + Y)]^2 = [E(X) + E(Y)]^2 = E(X)^2 + 2E(X)E(Y) + E(Y)^2 \\
\end{align*}

Looking at the rightmost expression on each line,

\begin{align*}
Var[(X+Y)] = 
  & E[X^2] & + & 2E[XY] & + & E[Y^2] \\
- & E(X)^2 & - & 2E(X)E(Y) & - & E(Y)^2 \\
  & Var[X] & + & 2E[XY] - 2E(X)E(Y) & + & Var[Y] \\
\end{align*}

The middle term is the covariance:
\begin{align}
E(XY) - E(X)E(Y) = Cov[X,Y] \label{cov} \tag{covariance} \\
\end{align}

Therefore 
\begin{align*}
Var[(X+Y)] = Var[X] + 2 Cov[X,Y] + Var[Y] \\
\end{align*}

====== Questions ======
Which one is correct?

\begin{align}
Var(X+X) & = Var(X) + Var(X) & = 2 * Var(X)  \label{tmp.05} \tag{1} \\
Var(X+X) & = Var(2X) & = 2^2 * Var(X) \label{tmp.06} \tag{2}
\end{align}

Looking again at $\ref{var.theorem.51}$:
\begin{align*}
Var(X+Y) = Var(X) + 2 Cov(X,Y) + Var(Y) \\ 
\end{align*}

If X and Y are independent events (groups), then 
$ Cov(X,Y) = 0 $, so 
\begin{align*}
Var[(X+Y)] = Var[X] + Var[Y] \\
\end{align*}

Usually X1 and X2 denote two independent sets with the same statistics, so
\begin{align*}
Var(X1 + X2)  = Var(X1) + Var(X2)  \\
\end{align*}

That is, when X1 and X2 are mutually independent sets with the same distribution (say, each with n = 10000, mean = 0, and var = 4), the variance of their sum equals the sum of the two variances. By contrast, the following maps a single set onto itself linearly (X to 2X):

\begin{align}
Var(X1 + X1) & = Var(2*X1) \nonumber \\ 
& = 2^2 Var(X1) \nonumber \\
& = 4 Var(X1)  \nonumber \\
\end{align}

Therefore equation $(\ref{tmp.06})$ is the correct one. 
This can also be seen as follows. 

Substituting $X$ for $Y$ in $\ref{var.theorem.51}$ gives 
\begin{align*}
Var(X + X) & = Var(X) + 2 Cov(X, X) + Var(X)  \\
& \;\;\;\;\; \text{because, } \\
& \;\;\;\;\; \text{according to } \ref{cov.xx} \text{ below,} \\
& \;\;\;\;\; Cov(X,X) = Var(X) \\ 
& = Var(X) + 2 Var(X) + Var(X)   \\
& = 4 Var(X) 
\end{align*}

\begin{align}
Cov[X,Y] & = E(XY) - E(X)E(Y) \nonumber \\
Cov[X,X] & = E(XX) - E(X)E(X) \nonumber \\
& = E(X^2) - E(X)^2 \nonumber \\
& = V(X) \label{cov.xx} \tag{3}
\end{align}
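A quick sample check of $Cov(X,X) = Var(X)$, and hence $Var(X+X) = 4Var(X)$ (a minimal sketch):

<code>
# Cov(X, X) = Var(X), so Var(X + X) = Var(2X) = 4 Var(X)
set.seed(1)
x <- rnorm(100000)
cov(x, x)      # identical to var(x)
var(x)
var(x + x)     # adding a set to itself: 4 * var(x)
4 * var(x)
</code>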
====== e.g. in R ======
Checking these theorems in R:
<code>
# variance theorems 5-1 and 5-2
# http://commres.net/wiki/variance_theorem
# need a function, rnorm2, that returns a sample with
# exactly the requested mean and sd
rnorm2 <- function(n,mean,sd) { mean+sd*scale(rnorm(n)) }

m <- 0
v <- 1
n <- 10000
set.seed(1)
x1 <- rnorm2(n, m, sqrt(v))
x2 <- rnorm2(n, m, sqrt(v))
x3 <- rnorm2(n, m, sqrt(v))

m.x1 <- mean(x1)
m.x2 <- mean(x2)
m.x3 <- mean(x3)
m.x1
m.x2
m.x3

v.x1 <- var(x1)
v.x2 <- var(x2)
v.x3 <- var(x3)
v.x1
v.x2
v.x3

v.12 <- var(x1 + x2)
v.12
######################################
## v.12 should be near var(x1) + var(x2)
######################################
## it is not exactly 2*v because x1 and x2 are
## very slightly (randomly) dependent, that is,
## slightly correlated
## from theorem 5-1:
## var(x1+x2) = var(x1) + var(x2) + (2*cov(x1,x2))

cov.x1x2 <- cov(x1,x2)

var(x1 + x2)
var(x1) + var(x2) + (2*cov.x1x2)

# check theorem 5-2 as well
var(x1 - x2)
var(x1) + var(x2) - (2 * cov.x1x2)

# only when x1, x2 are independent (orthogonal):
# var(x1+x2) == var(x1) + var(x2)
########################################

## and for the identical (not independent) set x1:
v.11 <- var(x1 + x1) 
# var(2*x1) = 2^2 var(x1)
v.11

</code>
    