Differences

This shows you the differences between two versions of the page.

--- b:head_first_statistics:constructing_confidence_intervals [2019/12/09 08:42] – [The problem with precision] hkimscil
+++ b:head_first_statistics:constructing_confidence_intervals [2023/11/15 08:24] (current) – [Four steps for finding confidence intervals] hkimscil
@@ Line 10: / Line 10: @@
 Rather than specify an exact value, we can specify two values we expect flavor duration to lie between.
-[{{:b:head_first_statistics:pasted:20191203-121916.png  }}] : As an example, you may want to choose a and b so that there’s a 95% chance of the interval containing the population mean. Finding the exact spot of a and b is the problem we are trying to solve.
+<WRAP group>
+<WRAP 25% column>
+$\Large{P(a < \mu < b) = 0.95} $
+</WRAP>
+<WRAP 50% column>
+As an example, you may want to choose a and b so that there’s a 95% chance of the interval containing the population mean. Finding the exact spot of a and b is the problem we are trying to solve.
+</WRAP>
+</WRAP>
 The range between the two end points, (a, b), is called a **//confidence interval//**.
@@ Line 25: / Line 32: @@
 <fs large>**Step 2**</fs>: Find its __**sampling distribution**__
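 The variance of the sample means ($Var(\overline{X})$) depends on a population characteristic (a parameter) that we cannot know, so we use the sample variance ($s^{2}$) as shown below to build the distribution of the sample means.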
 {{:b:head_first_statistics:pasted:20191203-122550.png}}
@@ Line 47: / Line 55: @@
 {{:b:head_first_statistics:pasted:20191203-123432.png}}
 $$P(z_{a} < Z < z_{b}) = 0.95$$
 $$P(Z < z_{a}) = 0.025$$
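The two critical values above can also be looked up numerically. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; `critical_z` is a helper name introduced here for illustration, and `z_a`, `z_b` mirror the symbols in the formulas.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Return (z_a, z_b) with P(z_a < Z < z_b) = confidence for Z ~ N(0, 1)."""
    tail = (1 - confidence) / 2          # probability left in each tail
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    z_a = std_normal.inv_cdf(tail)       # P(Z < z_a) = tail
    z_b = std_normal.inv_cdf(1 - tail)   # P(Z < z_b) = 1 - tail
    return z_a, z_b


z_a, z_b = critical_z(0.95)
print(round(z_a, 2), round(z_b, 2))
```

For a 95% level each tail holds 0.025, which matches $P(Z < z_{a}) = 0.025$ above.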
@@ Line 87: / Line 96: @@
 Since $\overline{X} =  62.7$, the interval we want runs from $62.7 - 0.98$ to $62.7 + 0.98$. That is,
-$(61.72, 63.68)$
+$(61.72, 63.68)$ is taken as the interval estimate of the flavor duration for the whole population.
+<WRAP box>
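+The 1.96 above tends to make the idea harder to follow.
+  * It helps to replace it with the 68, 95, 99% property of the standard deviation from the instructor's earlier lectures.
+  * In standard scores, the probability (area) within ±1, 2, 3 SD is roughly 68%, 95%, and 99.7% respectively.
+  * So in the case above, the probability corresponding to 95% is
+    * $P(-2 < z < 2) = .95$
+    * $P(-2 < \dfrac {\overline{X} - \mu}{sd} < 2) = .95$
+    * Working this out (the standard error here is $0.98 / 1.96 = 0.5$, so $2 \times 0.5 = 1$) gives
+    * $P(\overline{X} -1 < \mu < \overline{X} + 1) = .95 $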
+</WRAP>
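The comparison between the table value 1.96 and the rule-of-thumb value 2 can be checked with a short sketch. It assumes a standard error of 0.5, which is implied by the margin 0.98 and the multiplier 1.96 rather than stated directly; `confidence_interval` is a helper name introduced here for illustration.

```python
def confidence_interval(x_bar, se, c):
    """Return (lower, upper) = (x_bar - c * se, x_bar + c * se)."""
    margin = c * se
    return x_bar - margin, x_bar + margin


x_bar = 62.7        # sample mean from the text
se = 0.98 / 1.96    # assumption: standard error implied by the quoted margin

exact = confidence_interval(x_bar, se, 1.96)  # table value for 95%
rough = confidence_interval(x_bar, se, 2)     # rule-of-thumb value for 95%

print([round(v, 2) for v in exact])
print([round(v, 2) for v in rough])
```

The two intervals differ only in the second decimal place, which is why the rule of thumb is a reasonable way to think about the 95% case.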
 ===== Handy shortcuts for confidence intervals =====
@@ Line 130: / Line 152: @@
 {{:b:head_first_statistics:pasted:20191203-133241.png}}
-v is called the **<fc #ff0000><fs large>number of degrees of freedom</fs></fc>**
+v is called the number of **<fc #ff0000><fs large>degrees of freedom</fs></fc>**
 {{:b:head_first_statistics:pasted:20191203-133508.png}}
@@ Line 139: / Line 162: @@
 ==== Step 4: Find the confidence limits ====
 {{:b:head_first_statistics:pasted:20191203-133742.png}}
+Look up the critical t value using the degrees of freedom together with alpha (the significance level).
 ===== The t-distribution vs. the normal distribution =====
 {{:b:head_first_statistics:pasted:20191203-133845.png}}
+===== Exercise =====
+<WRAP help>
+Mighty Gumball has noticed a problem with their gumball dispensers. They have taken a sample of 30 machines, and found that the mean number of malfunctions is 15. Construct a 99% confidence interval for the number of malfunctions per month.
+</WRAP>
+The count above follows a Poisson distribution, so $X \sim Po(15)$ with $E(X) = \lambda$ and $Var(X) = \lambda$, which gives $Var(\overline{X}) = \lambda / n$. Therefore
+$$\text {confidence interval} = (\overline{X} - c * se, \;\; \overline{X} + c * se)$$
+$$\text{se} = \sqrt{(15/30)}$$ and
+$$\text{c} = 2.58 \;\; (\text{roughly } 3) $$ so
+\begin{eqnarray*}
+\text {confidence interval} & = & (\overline{X} - c * se, \;\; \overline{X} + c * se) \\
+& = & (15 - 2.58 * \sqrt{(15/30)}, \;\; 15 + 2.58 * \sqrt{(15/30)}) \\
+& = & (15 - 2.58 * 0.707, \;\; 15 + 2.58 * 0.707) \\
+& = & (15 - 1.824, \;\; 15 + 1.824) \\
+& = & (13.176, \;\; 16.824)
+\end{eqnarray*}
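The arithmetic in this exercise can be reproduced with a short sketch. It assumes the normal approximation used in the worked solution and rounds the critical value to two decimals to match the table value 2.58; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

lam_hat = 15        # sample mean number of malfunctions, used as the estimate of lambda
n = 30              # number of machines sampled
confidence = 0.99

# For a Poisson variable Var(X) = lambda, so Var(X-bar) = lambda / n.
se = sqrt(lam_hat / n)

# Two-sided critical value from the standard normal, rounded like a table lookup.
c = round(NormalDist().inv_cdf(1 - (1 - confidence) / 2), 2)

lower = lam_hat - c * se
upper = lam_hat + c * se
print(c, round(se, 3), round(lower, 3), round(upper, 3))
```

Using the unrounded critical value instead of 2.58 would shift the limits slightly in the third decimal place.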