====== Week 3 Content ======
===== SPSS =====
Chapter 3, Chapter 4
* SPSS
* [[http://www.uvm.edu/~dhowell/fundamentals7/DataFiles/MentalRotation.dat|Table 3.1 data file]]. For the SPSS and Excel formats, see below.
* Explanation: Read the textbook for yourself (Chapter 3)
* frequency distribution
* histogram
* stem and leaf display.
* watch [[https://www.youtube.com/watch?v=6JM80zb2fes|How to create a Stem and Leaf Plot in Microsoft Excel]]
* watch [[https://www.youtube.com/watch?v=atWwZmIEZ9Q|Spss]]
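As a rough illustration of the frequency distribution and histogram listed above, the following Python sketch tallies a small set of scores and prints a text histogram. The scores are hypothetical (they are not taken from the MentalRotation file), and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical scores, for illustration only (not from MentalRotation.dat).
scores = [3, 5, 4, 5, 2, 5, 4, 3, 4, 5]

def frequency_distribution(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

def text_histogram(values):
    """Build one line per distinct value, with one '*' per case."""
    return [f"{value:>3} | {'*' * count}"
            for value, count in frequency_distribution(values)]

freq = frequency_distribution(scores)
for line in text_histogram(scores):
    print(line)
```

A real histogram usually groups values into intervals first; this sketch skips that step because the example scores take only a few distinct values.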
===== Central Tendency =====
* Central Tendency
* data: {{:data_rtsec.sav|SPSS data file, rtsec}} or {{:data_rtsec.xlsx|Excel file}}
Statistics (RTsec)
^ Statistic ^ Value ^
| N, Valid | 600 |
| N, Missing | 0 |
| Mean | 1.6245 |
| Median | 1.5300 |
| Mode | 1.33 |
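Descriptives (RTsec)
^ Measure ^ Statistic ^ Std. Error ^
| Mean | 1.6245 | .02603 |
| 95% Confidence Interval for Mean, Lower | 1.5734 | |
| 95% Confidence Interval for Mean, Upper | 1.6756 | |
| 5% Trimmed Mean | 1.5672 | |
| Median | 1.5300 | |
| Variance | .407 | |
| Std. Deviation | .63772 | |
| Minimum | .72 | |
| Maximum | 4.44 | |
| Range | 3.72 | |
| Interquartile Range | .77 | |
| Skewness | 1.465 | .100 |
| Kurtosis | 2.849 | .199 |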
{{:hist.jpg}}
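The measures of central tendency reported above can be sketched in Python with the standard `statistics` module. The reaction times below are hypothetical (they are not the rtsec data), and `trimmed_mean` is a simplified helper written for this example; SPSS may treat the cases at the trimming boundary differently.

```python
import statistics

# Hypothetical reaction times in seconds (not the rtsec data file).
rt = [1.2, 1.3, 1.3, 1.4, 1.5, 1.6, 1.8, 2.0, 2.4, 4.4]

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping int(n * proportion) cases from each tail.

    A simplification: no fractional weighting of boundary cases.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * proportion)
    kept = ordered[cut:len(ordered) - cut]
    return statistics.mean(kept)

mean_rt = statistics.mean(rt)
median_rt = statistics.median(rt)
mode_rt = statistics.mode(rt)
# 10% is used here only because the example has just 10 cases.
trimmed_rt = trimmed_mean(rt, proportion=0.10)

print(mean_rt, median_rt, mode_rt, trimmed_rt)
```

With one large value in the upper tail, the mean is pulled above the median, which is the same pattern the RTsec output shows (mean 1.6245, median 1.5300, positive skewness).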
Data file: {{:Ex3-1.sav}}, scores of students who answered questions about a passage they had not read (Katz, 1990).
NOPASSAG Stem-and-Leaf Plot
^ Frequency ^ Stem ^ Leaf ^
| 1.00 | 3 | 4 |
| 5.00 | 3 | 66689 |
| 5.00 | 4 | 33444 |
| 7.00 | 4 | 6666799 |
| 5.00 | 5 | 01224 |
| 5.00 | 5 | 55577 |
Stem width: 10.00. Each leaf: 1 case(s).
{{:Fig.4.1.jpg}}
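To show how such a display is built, here is a minimal Python sketch. The 28 scores are read back from the stems and leaves of the NOPASSAG plot above; the function `stem_and_leaf` is a hypothetical helper, not SPSS's procedure, and it assumes single-digit leaves (stem width 10).

```python
from collections import defaultdict

# Scores read back from the leaves of the NOPASSAG plot (28 cases).
scores = [34, 36, 36, 36, 38, 39, 43, 43, 44, 44, 44,
          46, 46, 46, 46, 47, 49, 49, 50, 51, 52, 52, 54,
          55, 55, 55, 57, 57]

def stem_and_leaf(values, stem_width=10):
    """Return (frequency, stem, leaves) rows.

    Each stem is split into a low half (leaves 0-4) and a high half
    (leaves 5-9), as in the plot above.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, stem_width)
        half = 0 if leaf < stem_width / 2 else 1
        rows[(stem, half)].append(leaf)
    return [(len(leaves), stem, "".join(str(leaf) for leaf in leaves))
            for (stem, half), leaves in sorted(rows.items())]

for freq, stem, leaves in stem_and_leaf(scores):
    print(f"{freq:5.2f}  {stem} . {leaves}")
```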
Chapter 5
* Dispersion (variability)
* Data file: [[http://www.uvm.edu/~dhowell/fundamentals7/DataFiles/Tab5-1.dat|Web site]] or {{:Tab5-1.sav}} (pp. 86-87)
* [[:range]]
* [[:outliers]]: beyond the scope of this course and provided for reference only. It will not appear on tests.
* Mean deviation
* [[:Variance]]
* Sample variance $ s^2 $
* Population variance $ \sigma^2 $
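Descriptives (ATTRACT, by SET)
^ SET ^ Measure ^ Statistic ^ Std. Error ^
| 4 | Mean | 2.6445 | .14651 |
| 4 | 95% Confidence Interval for Mean, Lower Bound | 2.3379 | |
| 4 | 95% Confidence Interval for Mean, Upper Bound | 2.9511 | |
| 4 | 5% Trimmed Mean | 2.6483 | |
| 4 | Median | 2.5950 | |
| 4 | Variance | .429 | |
| 4 | Std. Deviation | .65520 | |
| 4 | Minimum | 1.20 | |
| 4 | Maximum | 4.02 | |
| 4 | Range | 2.82 | |
| 4 | Interquartile Range | .82 | |
| 4 | Skewness | -.001 | .512 |
| 4 | Kurtosis | .438 | .992 |
| 32 | Mean | 3.2615 | .01541 |
| 32 | 95% Confidence Interval for Mean, Lower Bound | 3.2292 | |
| 32 | 95% Confidence Interval for Mean, Upper Bound | 3.2938 | |
| 32 | 5% Trimmed Mean | 3.2622 | |
| 32 | Median | 3.2650 | |
| 32 | Variance | .005 | |
| 32 | Std. Deviation | .06892 | |
| 32 | Minimum | 3.13 | |
| 32 | Maximum | 3.38 | |
| 32 | Range | .25 | |
| 32 | Interquartile Range | .11 | |
| 32 | Skewness | -.075 | .512 |
| 32 | Kurtosis | -.863 | .992 |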
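The standard errors and the confidence interval in the ATTRACT output can be checked by hand. The Python sketch below does so under two assumptions that are not stated on this page: that each SET has n = 20 cases (inferred because it reproduces the reported standard errors), and that the 95% interval uses the two-sided critical t for df = 19, taken here as 2.093 from a standard t table. The function names are made up for the example.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def confidence_interval(mean, se, t_crit):
    """Two-sided interval: mean +/- t_crit * se."""
    half_width = t_crit * se
    return mean - half_width, mean + half_width

# Assumption: n = 20 per SET, inferred rather than stated on this page.
N = 20
# Assumption: two-sided 95% critical t for df = 19, from a t table.
T_CRIT_DF19 = 2.093

se_set4 = standard_error(0.65520, N)    # SET = 4
se_set32 = standard_error(0.06892, N)   # SET = 32
ci_set4 = confidence_interval(2.6445, se_set4, T_CRIT_DF19)

print(round(se_set4, 5), round(se_set32, 5))
print(tuple(round(bound, 4) for bound in ci_set4))
```

Rounded to the precision SPSS prints, these values agree with the Std. Error and 95% Confidence Interval rows reported for SET = 4 and SET = 32.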
* [[:Standard Deviation]]
* Variance calculation formula
* $ \displaystyle S_x^2 = \displaystyle \frac {\Sigma X^2 - \frac{(\Sigma X)^2}{N} } {N-1} $
* $ \displaystyle \sigma_x^2 = \displaystyle \frac {\Sigma X^2 - \frac{(\Sigma X)^2}{N} } {N} = \displaystyle \frac {\Sigma X^2}{N} - \frac {(\Sigma X)^2}{N^2} = \displaystyle \frac {\Sigma X^2}{N} - \bigg(\frac {\Sigma X}{N}\bigg)^2 = \displaystyle \frac {\Sigma X^2}{N} - \mu^2 $
* [[:Degrees of Freedom]] N-1
* [[:Why n-1]]
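The two calculation formulas above differ only in the final divisor (N-1 for the sample variance, N for the population variance). The following Python sketch applies them to a small hypothetical data set and compares the results with the standard library's `statistics.variance` and `statistics.pvariance`. The helper names are invented for this example.

```python
import statistics

# Hypothetical scores, for illustration only.
x = [2, 4, 4, 4, 5, 5, 7, 9]

def sums(values):
    """Return N, the sum of X, and the sum of X squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(v * v for v in values)
    return n, sum_x, sum_x2

def sample_variance(values):
    """(sum X^2 - (sum X)^2 / N) / (N - 1)."""
    n, sum_x, sum_x2 = sums(values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

def population_variance(values):
    """(sum X^2 - (sum X)^2 / N) / N, which equals sum X^2 / N - mu^2."""
    n, sum_x, sum_x2 = sums(values)
    return (sum_x2 - sum_x ** 2 / n) / n

s2 = sample_variance(x)
sigma2 = population_variance(x)
print(s2, statistics.variance(x))
print(sigma2, statistics.pvariance(x))
```

Here the sum of X is 40 and the sum of X squared is 232, so the numerator is 232 - 1600/8 = 32; dividing by 7 or by 8 gives the two variances.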
===== Sampling Distribution, Standard Error =====
* [[:Sampling]]
* [[:Sampling Distribution]]
* [[:Central Limit Theorem]]
* [[:Standard Error]]
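===== Summary of the CLT =====
First, expected values and variances follow the rules below.
Assume $X$ and $Y$ are independent of each other (independence is needed only for the last rule, the variance of a sum):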
\begin{eqnarray}
E[aX] = a E[X] \\
E[X+Y] = E[X] + E[Y] \\
Var[aX] = a^2 Var[X] \\
Var[X+Y] = Var[X] + Var[Y]
\end{eqnarray}
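These four rules can be checked exactly on a small example. The Python sketch below uses two independent fair dice as $X$ and $Y$ and the constant a = 3; both choices are assumptions made for this illustration. Exact fractions are used so that the equalities hold without rounding error.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die

def expectation(outcomes):
    """Expected value of a list of (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)

def variance(outcomes):
    """Variance of a list of (value, probability) pairs."""
    mu = expectation(outcomes)
    return sum((value - mu) ** 2 * prob for value, prob in outcomes)

a = 3
X = [(x, p) for x in faces]
aX = [(a * x, p) for x in faces]
# Independent dice: each pair (x, y) has probability p * p.
X_plus_Y = [(x + y, p * p) for x, y in product(faces, faces)]

print(expectation(X), variance(X))
print(expectation(aX), variance(aX))
print(expectation(X_plus_Y), variance(X_plus_Y))
```

Because $Y$ has the same distribution as $X$ here, $E[X+Y]$ should equal $2E[X]$ and $Var[X+Y]$ should equal $2Var[X]$.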
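Now let $X_i$ denote the mean of the $i$-th sample, where the $X_i$ are independent with $E[X_i] = \mu$ and $Var[X_i] = \sigma^2$. The sum of these means, $S_k$, is
$$ S_{k} = X_1 + X_2 + \dots + X_k $$
The mean of these $k$ samples, $ A_k $, is
$$ A_k = \displaystyle \frac{(X_1 + X_2 + \dots + X_k)}{k} = \frac{S_{k}}{k} $$
Then: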
$$
\begin{align*}
E[S_k] & = E[X_1 + X_2 + \dots + X_k] \\
& = E[X_1] + E[X_2] + \dots + E[X_k] \\
& = \mu + \mu + \dots + \mu = k * \mu
\end{align*}
$$
$$
\begin{align*}
Var[S_k] & = Var[X_1 + X_2 + \dots + X_k] \\
& = Var[X_1] + Var[X_2] + \dots + Var[X_k] \\
& = k * \sigma^2
\end{align*}
$$
The expected value and variance of $ A_k $ are then:
$$
\begin{align*}
E[A_k] & = E[\frac{S_k}{k}] \\
& = \frac{1}{k}*E[S_k] \\
& = \frac{1}{k}*k*\mu = \mu
\end{align*}
$$
and
$$
\begin{align*}
Var[A_k] & = Var[\frac{S_k}{k}] \\
& = \frac{1}{k^2} Var[S_k] \\
& = \frac{1}{k^2}*k*\sigma^2 \\
& = \frac{\sigma^2}{k} \nonumber
\end{align*}
$$
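In other words, the mean $ A_k $ has expected value $\mu$ and variance $\sigma^2 / k$, so its standard deviation (the standard error) is $\sigma / \sqrt{k}$.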
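The results $E[A_k] = \mu$ and $Var[A_k] = \sigma^2 / k$ can also be checked by simulation. The Python sketch below assumes a Uniform(0, 1) population, for which $\mu = 0.5$ and $\sigma^2 = 1/12$. The population, the seed, the number of replications, and the function names are all arbitrary choices made for this illustration.

```python
import random

MU = 0.5         # mean of Uniform(0, 1)
SIGMA2 = 1 / 12  # variance of Uniform(0, 1)

def simulate_means(k, replications, seed=0):
    """Draw `replications` sets of k values from Uniform(0, 1); return each set's mean A_k."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(k)) / k
            for _ in range(replications)]

def mean_of(values):
    return sum(values) / len(values)

def variance_of(values):
    """Variance of the simulated values around their own mean (divisor N)."""
    m = mean_of(values)
    return sum((v - m) ** 2 for v in values) / len(values)

for k in (1, 4, 25):
    a_k = simulate_means(k, replications=20000)
    print(f"k={k:>2}  mean of A_k={mean_of(a_k):.4f}  "
          f"variance of A_k={variance_of(a_k):.5f}  sigma^2/k={SIGMA2 / k:.5f}")
```

As k grows, the simulated variance of $A_k$ should shrink roughly in proportion to 1/k while its mean stays near $\mu$. The simulated values will not match the theoretical ones exactly because they come from a finite number of replications.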