b:head_first_statistics:estimating_populations_and_samples

Differences

This shows you the differences between two versions of the page.

b:head_first_statistics:estimating_populations_and_samples [2019/12/10 13:27] – [Estimating population variance] hkimscil
b:head_first_statistics:estimating_populations_and_samples [2022/11/17 12:47] (current) – [Exercise] hkimscil
Line 17: Line 17:
 </WRAP> </WRAP>
  
-\begin{eqnarray*} +\begin{align*} 
-\overline{X} & = \frac {\sum{X}}{n} \\ +\overline{X} & = \frac {\sum{X}}{n} \\ 
-& = \frac {\sum_{i=1}^{n} X_{i}}{n} \\ +& = \frac{ \sum_{i=1}^{n} X_{i} } {n} \\ 
-& = \hat{\mu} +& = \hat{\mu} 
-\end{eqnarray*}+\end{align*}
  
 ====== Estimating population variance ====== ====== Estimating population variance ======
Line 44: Line 44:
 {{:b:head_first_statistics:pasted:20191125-103603.png}} {{:b:head_first_statistics:pasted:20191125-103603.png}}
 {{:b:head_first_statistics:pasted:20191125-103510.png}} {{:b:head_first_statistics:pasted:20191125-103510.png}}
 +
 +[[:Why N-1]]
  
 <code> <code>
Line 129: Line 131:
 \end{eqnarray*} \end{eqnarray*}
    
-Here, the expected value of the proportion ($\hat{P}$) in each trial is: +Here, with $n = 100$, the expected value of the proportion ($\hat{P}$) in each trial is: 
-\begin{eqnarray} + 
-\hat{P_{1}} & = {X_{1}}/{100} = 0.\\ +\begin{align*} 
-\hat{P_{2}} & = {X_{2}}/{100} = 0.\\ +n = 100, \\ 
-\hat{P_{3}} & = {X_{3}}/{100} = 0.\\ +\hat{P_{1}} & = \frac{X_{1}}{n} = 0.34, (X_{1} = 34) \\ 
-\hat{P_{4}} & = {X_{4}}/{100} = 0.4 \\ +\hat{P_{2}} & = \frac{X_{2}}{n} = 0.43, (X_{2} = 43) \\ 
-\cdots \cdots \cdots              \\ +\hat{P_{3}} & = \frac{X_{3}}{n} = 0.32, (X_{3} = 32) \\ 
-\hat{P_{k}} & = {X_{k}}/{100} = 0. +\hat{P_{4}} & = \frac{X_{4}}{n} = 0.42, (X_{4} = 42) \\ 
-\end{eqnarray}+\cdots \cdots \cdots \\ 
 +\hat{P_{k}} & = \frac{X_{k}}{n} = 0.24, (X_{k} = 24) \\  
 +\end{align*}
  
-That is, when $X \sim B(n, p)$, the sample proportion follows $P_{s} = \displaytype \frac{X}{n}$ (X = number of red gumballs drawn, n = sample size). +That is, when $X \sim B(n, p)$, the sample proportion follows $P_{s} = \dfrac{X}{n}$ ($X$ = number of red gumballs drawn, $n$ = sample size).
 {{:b:head_first_statistics:pasted:20191126-073028.png}} {{:b:head_first_statistics:pasted:20191126-073028.png}}
  
Line 224: Line 228:
 q <- 1-p q <- 1-p
 n <- 100 n <- 100
-var <- (p*q)/(n-1) +var <- (p*q)/(n) 
-se  <- sqrt((p*q)/(n-1)) +se  <- sqrt((p*q)/(n)) 
-pnorm(.395, p, se, lower.tail = F) +o <- .4 
 +o.c <- .4 - (1/(2*n)) 
 +o.c  
 +pnorm(o.c, p, se, lower.tail = F)
 </code> </code>
  
 <code> <code>
 +
 > p <- 0.25 > p <- 0.25
 > q <- 1-p > q <- 1-p
 > n <- 100 > n <- 100
-> var <- (p*q)/(n-1) +> var <- (p*q)/(n) 
-> se  <- sqrt((p*q)/(n-1)) +> se  <- sqrt((p*q)/(n)) 
-> pnorm(.395, p, se, lower.tail = F) +> o <- .4 
-[1] 0.0004313594+> o.c <- .4 - (1/(2*n)) 
 +> o.c  
 +[1] 0.395 
 +> pnorm(o.c, p, se, lower.tail = F) 
 +[1] 0.0004060586
 </code> </code>
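
As a hand check of the continuity-corrected calculation above, here is a sketch that uses only the values shown in the code ($p = 0.25$, $n = 100$, and the corrected cutoff $0.395$). The symbol $z$ is introduced here for illustration and does not appear in the original.

\begin{align*}
\text{se} & = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.25 \times 0.75}{100}} \approx 0.0433 \\
z & = \frac{0.395 - 0.25}{0.0433} \approx 3.35 \\
P(P_{s} > 0.395) & \approx P(Z > 3.35) \approx 0.0004
\end{align*}

This agrees with the pnorm output of 0.0004060586 in the newer revision. The older revision divided by $n - 1$ instead of $n$, which gives a slightly larger standard error and hence the slightly larger tail probability 0.0004313594.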
  
 </WRAP> </WRAP>
  
-====== How many gumballs? -- Probability of sample means ======+====== Sampling distribution of sample mean ======
  
 <WRAP info 60%> <WRAP info 60%>
Line 258: Line 270:
 \overline{X} = \frac{X_{1} + X_{2} + . . . + X_{n}}{n}  \overline{X} = \frac{X_{1} + X_{2} + . . . + X_{n}}{n} 
 \end{eqnarray*} \end{eqnarray*}
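 +The expression above describes the mean of one sample made up of 30 gumball packets, 
 +while the expression below describes the mean of those sample means when samples are collected repeatedly. 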
 \begin{eqnarray*} \begin{eqnarray*}
 E(\overline{X}) & = & E\left(\frac{X_{1} + X_{2} + . . . + X_{n}}{n}\right)  \\ E(\overline{X}) & = & E\left(\frac{X_{1} + X_{2} + . . . + X_{n}}{n}\right)  \\
Line 275: Line 288:
 \end{eqnarray*} \end{eqnarray*}
  
-\begin{eqnarray*} +\begin{align*} 
-Var(\overline{X}) & = Var \left(\frac{X_{1} + X_{2} + . . . + X_{n}}{n}\right) \\ +Var(\overline{X}) & = Var \left(\frac{X_{1} + X_{2} + . . . + X_{n}}{n}\right) \\ 
-& = \frac{1}{n^2} Var \left({X_{1} + X_{2} + . . . + X_{n}\right) \\ +& = \frac {1}{n^2} Var \left(X_{1} + X_{2} + . . . + X_{n} \right) \\ 
-& = \frac{1}{n^2} (\sigma^2 + \sigma^2 + . . . + \sigma^2) \\ +& = \frac{1}{n^2} (\sigma^2 + \sigma^2 + . . . + \sigma^2) \\ 
-& = \frac{1}{n^2} n * (\sigma^2) \\ +& = \frac{1}{n^2} n * (\sigma^2) \\ 
-& = \frac{\sigma^2}{n}  +& = \frac{\sigma^2}{n}  
-\end{eqnarray*}+ 
 + 
 +\end{align*}
  
  
Line 290: Line 305:
 \end{eqnarray} \end{eqnarray}
  
-$$\text{standard error of the sample means} = \frac{\sigma}{\sqrt{n}}$$+\begin{eqnarray*} 
 +\text{standard error} & = & \text{standard deviation of sample means} \\ 
 +& = & \frac{\sigma}{\sqrt{n}} \\
 +& = & \sqrt{\frac{\sigma^{2}}{n}}   
 +\end{eqnarray*}
  
 {{:b:head_first_statistics:pasted:20191126-093924.png}} {{:b:head_first_statistics:pasted:20191126-093924.png}}
Line 307: Line 326:
  
 ===== Using CLT for the binomial distribution ===== ===== Using CLT for the binomial distribution =====
-For $X \sim B(n, p)$ with $n$ greater than 30, $\mu = np$ and $\sigma^2 = npq$, so substituting these into $\overline{X} \sim N(\mu, \frac{\sigma^2}{n})$ gives: +For $X \sim B(n, p)$, $\mu = np$ and $\sigma^2 = npq$, and 
 +since the binomial distribution is said to be approximately normal when $n$ exceeds 30,   
 +substituting these into $\overline{X} \sim N(\mu, \frac{\sigma^2}{n})$ gives: 
 $$\overline{X} \sim N(np, \; pq) $$ $$\overline{X} \sim N(np, \; pq) $$
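
The variance term in this result comes from a one-line substitution, written out here as a sketch of the source's own step (no new symbols are introduced):

\begin{align*}
\frac{\sigma^2}{n} & = \frac{npq}{n} = pq
\end{align*}

For illustration only, with the values $p = 0.25$ and $n = 100$ from the exercise earlier on this page, this gives $np = 25$ and $pq = 0.25 \times 0.75 = 0.1875$.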
  
Line 345: Line 366:
 </code> </code>
 discrepancy? discrepancy?
 +<code>
 +> a <- sqrt(1/30)
 +> b <- 8.5-10
 +> b/a
 +[1] -8.215838
 +> pnorm(b/a)
 +[1] 1.053435e-16
 +
 +</code>
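
The arithmetic in the added block can be written out as follows. This is a hedged reading: the hunk does not show where the numbers come from, so it assumes that 10 is the population mean $\mu$, 1 is the population variance $\sigma^2$, 30 is the sample size $n$, and 8.5 is the cutoff for the sample mean.

\begin{align*}
z & = \frac{8.5 - 10}{\sqrt{1/30}} \approx \frac{-1.5}{0.1826} \approx -8.22 \\
P(\overline{X} < 8.5) & = P(Z < -8.22) \approx 1.05 \times 10^{-16}
\end{align*}

Under those assumptions, a is the standard error $\sigma / \sqrt{n}$ and b/a is the z-score, which matches the printed values -8.215838 and 1.053435e-16.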
  
b/head_first_statistics/estimating_populations_and_samples.1575952054.txt.gz · Last modified: 2019/12/10 13:27 by hkimscil
