Differences

This shows you the differences between two versions of the page.

--- b:head_first_statistics:constructing_confidence_intervals [2019/12/09 01:40] – [Four steps for finding confidence intervals] hkimscil
+++ b:head_first_statistics:constructing_confidence_intervals [2025/10/29 04:03] (current) – [Four steps for finding confidence intervals] hkimscil
@@ Line 10: / Line 10: @@
 Rather than specify an exact value, we can specify two values we expect flavor duration to lie between.
-[{{:b:head_first_statistics:pasted:20191203-121916.png  }}] : As an example, you may want to choose a and b so that there’s a 95% chance of the interval containing the population mean. Finding the exact spot of a and b is the problem we are trying to solve.
+<WRAP group>
+<WRAP 25% column>
+$\Large{P(a < \mu < b) = 0.95} $
+</WRAP>
+<WRAP 50% column>
+As an example, you may want to choose a and b so that there’s a 95% chance of the interval containing the population mean. Finding the exact spot of a and b is the problem we are trying to solve.
+</WRAP>
+</WRAP>
 The interval between the two end points, (a, b), is called a **//confidence interval//**.
@@ Line 21: / Line 28: @@
 <fs large>**Step 1:**</fs> Choose your population statistic
 If we go back to the work we did in the last chapter, then the sampling distribution of means has the following expectation and variance:
-{{:b:head_first_statistics:pasted:20191203-122301.png}}
+\begin{eqnarray*}
+E(\overline{X}) & = & \mu \\
+V(\overline{X}) & = & \dfrac{\sigma^{2}} {n} \\
+\end{eqnarray*}
 <fs large>**Step 2**</fs>: Find its __**sampling distribution**__
 The variance of the sample means ($Var(\overline{X})$) depends on a population parameter, which we cannot know, so we use the sample variance ($s^{2}$) as shown below to build the distribution of the sample means.
-{{:b:head_first_statistics:pasted:20191203-122550.png}}
+\begin{eqnarray*}
+E(\overline{X}) & = & \mu \\
+V(\overline{X}) & = & \dfrac{s^{2}} {n} \\
+\end{eqnarray*}
 Mighty Gumball used a sample of 100 gumballs to measure how long the flavor lasts, and this sample's
@@ Line 33: / Line 47: @@
 Using this to estimate the variance of the distribution of sample means (for n = 100) gives 0.25.
-{{:b:head_first_statistics:pasted:20191203-122843.png}}
+\begin{eqnarray*}
+V(\overline{X}) & = & \dfrac{s^{2}} {n} \\
+& = & \dfrac{25}{100} \\
+& = & 1/4 \;\;\; (0.25) \\
+\end{eqnarray*}
 Generalizing the above: if $X \sim N(\mu, \sigma^{2})$ and the sample size is sufficiently large (such as n = 100), then $E(\overline{X})$ and $Var(\overline{X})$ are as follows.
-{{:b:head_first_statistics:pasted:20191203-122946.png}}
+\begin{eqnarray*}
+\overline{X} & \sim & N \left( \mu, \dfrac{s^{2}}{n} \right) \\
+& & \text{for the above case   } \\
+\overline{X} & \sim & N \left( \mu, 0.25 \right) \\
+\end{eqnarray*}
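The arithmetic above can be checked with a minimal Python sketch. The variable names are illustrative only; the values $s^{2} = 25$ and $n = 100$ are the ones used in the equations above.

```python
import math

s_squared = 25   # sample variance used in the equations above
n = 100          # sample size

var_of_mean = s_squared / n          # estimated Var(X-bar) = s^2 / n
std_error = math.sqrt(var_of_mean)   # standard deviation of X-bar (standard error)

print(var_of_mean)  # 0.25
print(std_error)    # 0.5
```

The square root of the variance, 0.5, is the standard error that scales the critical values when the confidence limits are computed.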
 <fs large>**Step 3:**</fs> Decide on the level of confidence
@@ Line 49: / Line 75: @@
 {{:b:head_first_statistics:pasted:20191203-123432.png}}
-$$P(z_{a} < Z < z_{b}) = 0.95$$
+\begin{eqnarray*}
-$$P(Z < z_{a}) = 0.025$$
+P(z_{a} < Z < z_{b}) & = & 0.95 \\
-$$z_{a} = -1.96$$
+P(Z < z_{a}) & = & 0.025 \\
-$$P(Z > z_{b}) = 0.025$$
+z_{a} & = & -1.96 \\
-$$z_{b} = +1.96$$
+P(Z > z_{b}) & = & 0.025 \\
+z_{b} & = & +1.96 \\
+\end{eqnarray*}
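As a cross-check of the table lookup, the critical values can be recovered from the standard normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are assumptions chosen to mirror $z_{a}$ and $z_{b}$ above.

```python
from statistics import NormalDist

confidence = 0.95
tail = (1 - confidence) / 2          # probability left in each tail (0.025)

z = NormalDist()                     # standard normal, N(0, 1)
z_a = z.inv_cdf(tail)                # lower critical value, P(Z < z_a) = 0.025
z_b = z.inv_cdf(1 - tail)            # upper critical value, P(Z > z_b) = 0.025

print(round(z_a, 2), round(z_b, 2))  # -1.96 1.96
```

Changing `confidence` to another level, such as 0.99, gives the matching critical values without a table.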
 \begin{eqnarray*}
@@ Line 90: / Line 120: @@
 We take $(61.72, 63.68)$ as the interval estimate for the flavor duration of the whole population.
 <WRAP box>
@@ Line 98: / Line 129: @@
   * So in the case above, the probability corresponding to 95% is
     * $P(-2 < z < 2) = .95$
-    * $P(-2 < \dfrac {X - \overline{X}}{sd} < 2) = .95$
+    * $P(-2 < \dfrac {\overline{X} - \mu}{sd} < 2) = .95$
     * Carrying out this calculation gives
     * $P(\overline{X} -1 < \mu < \overline{X} + 1) = .95 $
@@ Line 144: / Line 175: @@
 {{:b:head_first_statistics:pasted:20191203-133241.png}}
-v is called the **<fc #ff0000><fs large>number of degrees of freedom</fs></fc>**
+v is called the number of **<fc #ff0000><fs large>degrees of freedom</fs></fc>**
 {{:b:head_first_statistics:pasted:20191203-133508.png}}
@@ Line 153: / Line 185: @@
 ==== Step 4: Find the confidence limits ====
 {{:b:head_first_statistics:pasted:20191203-133742.png}}
+Look up the critical value of $t$ using the degrees of freedom together with alpha (the significance level).
 ===== The t-distribution vs. the normal distribution =====
 {{:b:head_first_statistics:pasted:20191203-133845.png}}
+===== Exercise =====
+<WRAP help>
+Mighty Gumball has noticed a problem with their gumball dispensers. They have taken a sample of 30 machines, and found that the mean number of malfunctions is 15. Construct a 99% confidence interval for the number of malfunctions per month.
+</WRAP>
+The above follows a Poisson distribution, so $X \sim Po(15)$, with $E(X) = \lambda$ and $Var(X) = \lambda$. Therefore
+$$\text {confidence interval} = (\overline{X} - c * se, \;\; \overline{X} + c * se)$$
+$$\text{se} = \sqrt{(15/30)}$$ and
+$$\text{c} = 2.58 \;\; (\text{roughly 3 as a rule of thumb}) $$ so
+\begin{eqnarray*}
+\text {confidence interval} & = & (\overline{X} - c * se, \;\; \overline{X} + c * se) \\
+& = & (15 - 2.58 * \sqrt{(15/30)}, \;\; 15 + 2.58 * \sqrt{(15/30)}) \\
+& = & (15 - 2.58 * 0.707, \;\; 15 + 2.58 * 0.707) \\
+& = & (15 - 1.824, \;\; 15 + 1.824) \\
+& = & (13.176, \;\; 16.824)
+\end{eqnarray*}
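The exercise can be reproduced with a short Python sketch. The helper `normal_ci` is a hypothetical name introduced here for illustration; it assumes the normal approximation used above, with the Poisson variance $\lambda = 15$ standing in for the population variance. It uses the unrounded critical value (about 2.576) instead of 2.58, so the third decimal place differs slightly from the hand calculation.

```python
import math
from statistics import NormalDist

def normal_ci(mean, variance, n, confidence):
    """Return (lower, upper) limits of mean +/- c * se under a normal approximation."""
    se = math.sqrt(variance / n)                         # standard error of the sample mean
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    return mean - c * se, mean + c * se

# Poisson case from the exercise: E(X) = Var(X) = 15, sample of 30 machines
low, high = normal_ci(mean=15, variance=15, n=30, confidence=0.99)
print(round(low, 2), round(high, 2))  # 13.18 16.82
```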