sampling_distribution
sampling_distribution [2021/04/01 08:38] – [When n = 4] hkimscil | sampling_distribution [2021/04/01 08:44] (current) – hkimscil | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Sampling Distribution, | ====== Sampling Distribution, | ||
After reading this, read the [[:mean and variance of the sample mean]] page. | After reading this, read the [[:mean and variance of the sample mean]] page. | ||
- | Also | + | [[:sampling distribution in R]] |
- | + | < | |
- | {{ : | + | |
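The probability that a sample's distribution has statistics identical to the population's parameters is not very high. For example, if we assume that the communication apprehension index of university students in our country is 70 and the [[:Standard Deviation|standard deviation]] is 15, | The probability that a sample's distribution has statistics identical to the population's parameters is not very high. For example, if we assume that the communication apprehension index of university students in our country is 70 and the [[:Standard Deviation|standard deviation]] is 15, | ||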
Line 33: | Line 32: | ||
Then what happens when we draw samples with n = 4? | Then what happens when we draw samples with n = 4? | ||
===== When n = 4 ===== | ===== When n = 4 ===== | ||
- | {{: | + | < |
- | From this population: | + |
- Draw a sample with four members (sample size, n = 4) and record its mean | - Draw a sample with four members (sample size, n = 4) and record its mean | ||
- Then put that sample back into the population | - Then put that sample back into the population | ||
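The draw, record, and replace steps above can be sketched as a small simulation. This is a minimal illustration, not the page's own method: it assumes the population is normally distributed with mean 70 and standard deviation 15, and the function name `simulate_sample_means`, the repetition count, and the seed are all hypothetical choices. Drawing each value independently from the same distribution stands in for sampling with replacement.

```python
import random
import statistics


def simulate_sample_means(mu=70.0, sigma=15.0, n=4, repetitions=10_000, seed=1):
    """Draw `repetitions` samples of size n and return the list of sample means.

    Assumption: the population is normal(mu, sigma). Independent draws from
    that distribution play the role of sampling with replacement.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]  # one sample of size n
        means.append(statistics.fmean(sample))             # record its mean
    return means


means = simulate_sample_means()
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.pstdev(means)
print(round(mean_of_means, 2))  # should be close to the population mean, 70
print(round(sd_of_means, 2))    # should be close to 15 / sqrt(4) = 7.5
```

With more repetitions the two printed values should settle closer to 70 and 7.5.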
Line 54: | Line 52: | ||
===== in R ===== | ===== in R ===== | ||
- | See [[:sampling distribution in r]] | + | It may be easier to understand this by looking at it in R. |
+ | [[:sampling distribution in R]] | ||
===== CLT ===== | ===== CLT ===== | ||
Line 63: | Line 62: | ||
(The sampling distribution is a concept that is essential for understanding the [[Central Limit Theorem]].) | (The sampling distribution is a concept that is essential for understanding the [[Central Limit Theorem]].) | ||
- | {{: | + | < |
* $\mu_{\tiny\overline{X}} = \mu = 70$ | * $\mu_{\tiny\overline{X}} = \mu = 70$ | ||
* $\sigma_{\tiny\overline{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{100}} = 1.5$ | * $\sigma_{\tiny\overline{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{100}} = 1.5$ | ||
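The standard error formula above can be evaluated for several sample sizes to see how quickly it shrinks as n grows. This is only a sketch: the helper name `standard_error` is hypothetical, and the sample sizes 1 and 25 are added for illustration (4 and 100 are the ones used on this page).

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# sigma = 15 is the population standard deviation used on this page.
errors = {n: standard_error(15, n) for n in (1, 4, 25, 100)}
for n, se in errors.items():
    print(n, se)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.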
- | |||
- | ====== English ====== | ||
- | I mentioned in the earlier article that the standard error is actually the standard deviation of the sampling distribution. I feel safe saying "standard deviation" since I have already covered that concept. However, I thought you might feel uneasy about " |
- | |||
- | Do you remember hearing something like "no matter how the population is distributed, |
- | |||
- | {{ pop-histogram.jpg | ||
- | {{ population-distribution.jpg | ||
- | |||
- | Certainly, you see that the distribution is not normal. | ||
- | |||
- | Now suppose that you took a sample from this population and recorded its mean, and that you kept doing this about 1000 times. What do you think the graph of those means would look like? Remember that you kept the mean of every sample. The graph looks like the one below: a normally distributed curve. |
- | |||
- | {{ sampling-distribution-2.jpg | ||
- | |||
- | This can be called the normal curve of the mean ($\overline{x}$). This distribution is called the sampling distribution because it is obtained by repeating the sampling a very large number of times. |
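The claim that sample means pile up in a roughly normal shape even when the population is not normal can be checked with a short simulation. This is a sketch under stated assumptions: the exponential population, the sample size of 30, the repetition counts, and the helper `skewness` are all hypothetical stand-ins, since the page's own population is only shown in a figure.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean of cubed z-scores (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(42)

# Hypothetical non-normal population: exponential with mean about 10.
population = [rng.expovariate(1 / 10) for _ in range(50_000)]

# Repeatedly draw samples of size 30 (with replacement) and keep each mean.
sample_means = [
    statistics.fmean(rng.choices(population, k=30)) for _ in range(5_000)
]

pop_skew = skewness(population)
means_skew = skewness(sample_means)
print(round(pop_skew, 2))    # strongly right-skewed population
print(round(means_skew, 2))  # much closer to 0, i.e. closer to symmetric
```

The mean of `sample_means` should also land close to the mean of `population`, matching the first characteristic listed below.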
- | |||
- | This sampling distribution has several interesting characteristics: | ||
- | |||
- | $ \mu_{\overline{x}}=\mu $ \\ | ||
- | $ \sigma_{\overline{x}}=\frac{\sigma}{\sqrt{n}} $ , | ||
- | |||
- | We all know what these signs mean: the symbols are the Greek letters $\mu$ and $\sigma$, representing " |
- | |||
- | The second is also called " | ||
- | |||
- | For reference, the standard deviation of the sampling distribution of a proportion was, |
- | |||
- | $\sigma_{\overline{p}}=\sqrt{\frac{p*q}{n}}$ \\ | ||
- | |||
- | As you see, they share the same Greek letter, $\sigma$ , " | ||
- | |||
- | What is this used for? The bottom line is that it is essential to any kind of inferential statistical analysis. Illustrating this idea requires us to expand our thinking a bit more, however. It is directly related to the t-test and z-test (therefore, I strongly recommend reading " |
- | |||
- | Instead, I want to talk about an example that is directly related to the concept above. Suppose that you are a member of a consumer group. The director called you, since you have taken statistics and media research courses at Rutgers, and asked you to test a brand of battery. She wanted to know whether the battery life that the manufacturer has announced to the public holds true. The manufacturer has claimed that the life of its best battery has a mean of 54 months and a standard deviation of 6 months. The director told you to send a sample of 50 of the batteries. |
- | |||
- | Immediately, | ||
- | |||
- | {{ stderr2.jpg | ||
- | |||
- | You expect the picture to represent the entire population of batteries: their mean life is about 54 months; about 68% of the batteries last between 48 and 60 months, about 95% between 42 and 66 months, and about 99% between 36 and 72 months. You also expect this claim to hold true. |
- | |||
- | Based on this information, you can also imagine what the sampling distribution (again, the distribution of means obtained from imaginary repeated sampling) should look like. First, you know that the mean of means (the mean of the sampling distribution of means) is the same as that of the population. And the standard deviation of the sampling distribution is the standard deviation of the population divided by the square root of the sample size. That is, |
- | |||
- | $ \mu_{\overline{x}}=\mu $ , which is known to be 54, and |
- | $ \sigma_{\overline{x}}=\frac{\sigma}{\sqrt{n}}=\frac{6}{\sqrt{50}} $ , which is about 0.85 months. |
- | |||
- | These two values again give you a picture of the sampling distribution, which will look like the graph below. |
- | |||
- | {{ stderr6.jpg | ||
- | |||
- | The inner distribution line is " | ||
- | |||
- | ^ & | ||
- | | mean (+-) 1s (68%) | 53.15 | 54.85 | @yellow: | ||
- | | mean (+-) 2s (95%) | 52.3 | 55.7 | @yellow: | ||
- | | mean (+-) 3s (99%) | 51.45 | 56.55 | @yellow: | ||
- | | ||
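The interval bounds in the table can be reproduced from the battery figures (claimed mean 54 months, standard deviation 6 months, sample size 50). The following is a minimal sketch; the function name `sampling_intervals` and its `ks` parameter are hypothetical, and the 68/95/99% labels are the usual rule-of-thumb coverages for 1, 2, and 3 standard errors.

```python
import math


def sampling_intervals(mu, sigma, n, ks=(1, 2, 3)):
    """Return the standard error and the mean +/- k standard errors for each k."""
    se = sigma / math.sqrt(n)
    return se, {k: (mu - k * se, mu + k * se) for k in ks}


# Battery example from the text: mu = 54, sigma = 6, n = 50.
se, intervals = sampling_intervals(54, 6, 50)
print(round(se, 2))
for k, (low, high) in intervals.items():
    print(k, round(low, 2), round(high, 2))
```

If the mean of the 50 tested batteries falls well outside the widest of these intervals, the manufacturer's claim of a 54-month mean becomes hard to believe.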
- | |||
- | Reference | ||
- | |||
- | Weiss, A. J., & Leets, L. L. (1998). Introduction to Statistics for the Social Sciences (2nd ed.). New York, NY: McGraw Hill. | ||
sampling_distribution.1617233901.txt.gz · Last modified: 2021/04/01 08:38 by hkimscil