
b:head_first_statistics:variability_and_spread

Differences

This shows you the differences between two versions of the page.

b:head_first_statistics:variability_and_spread [2020/09/21 12:40] hkimscil b:head_first_statistics:variability_and_spread [2023/09/06 08:06] – [Range] hkimscil
Line 65: Line 65:
 > </code> > </code>
  
 +====== Range ======
  
 [[:range]] [[:range]]
Line 80: Line 81:
 30 - 3 = 27 30 - 3 = 27
 </code> </code>
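 +However, the range does not describe the distribution of the data precisely either. The first and second data sets below both have a range of 4 (12 - 8), yet the individual scores are distributed quite differently.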
 +[{{range.no.difference.jpg}}]
 +That is,
 +[{{range.problem.jpg}}]
  
 +The problem of outliers (extreme values)
 +<code>
 +a <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5)
 +b <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5, 10)
 +</code>
  
 +range(a) vs. range(b) 
 +
 +The difference in range between these two groups is caused entirely by the outlier.
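 +
 +As a quick check of the outlier effect, the sketch below recomputes both ranges. Base R's range() returns the minimum and maximum rather than a single number, so diff(range(x)) is used here to get the range as one value. The vector b is built from a plus the single score 10, which matches the data above.
 +<code>
 +a <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5)
 +b <- c(a, 10)       # the same scores plus one outlier
 +
 +range(a)            # minimum and maximum of a: 1 5
 +range(b)            # minimum and maximum of b: 1 10
 +diff(range(a))      # range as a single number: 4
 +diff(range(b))      # range as a single number: 9
 +</code>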
 +
 +====== Quartile ======
 [[:quartile]] [[:quartile]]
 +[{{hf.measuring.variability.p95.ex.jpg}}]
 +
 +<code>
 +> basket <- c(3,3,6,7,7,10,10,10,11,13,30)
 +> basket <- sort(basket)
 +> basket
 + [1]  3  3  6  7  7 10 10 10 11 13 30
 +
 +</code>
 +
 +<code>> quantile(basket)
 +  0%  25%  50%  75% 100% 
 + 3.0  6.5 10.0 10.5 30.0 
 +
 +</code>
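 +
 +The quartiles can also be turned into a spread measure that ignores the extreme value 30. A minimal sketch, assuming R's default quantile method (type 7), which interpolates between neighbouring values and so may differ from quartiles found by hand:
 +<code>
 +basket <- c(3,3,6,7,7,10,10,10,11,13,30)
 +
 +q <- quantile(basket, c(0.25, 0.75), names = FALSE)
 +q[2] - q[1]         # upper quartile minus lower quartile: 4
 +IQR(basket)         # built-in helper, same default method: 4
 +diff(range(basket)) # the full range is pulled up by the 30: 27
 +</code>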
 +====== Percentile ======
 +<WRAP info>
 +How to find percentile 
 +  - First of all, line all your values up in ascending order. 
 +  - To find the position of the kth percentile out of n numbers, start off by calculating $ k(\frac{n}{100}) $.
 +  - If this gives you an integer, then your percentile is halfway between the value at position $ k(\frac{n}{100})$ and the next number along. Take the average of the numbers at these two positions to give you your percentile.
 +  - If $ k(\frac{n}{100})$ is not an integer, then round it up. This then gives you the position of the percentile.
 +</WRAP>
 +
 +<code>
 +> k <- c(1:125)
 +> length(k)
 +[1] 125
 +> k
 +  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
 + [21]  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40
 + [41]  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
 + [61]  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80
 + [81]  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
 +[101] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
 +[121] 121 122 123 124 125
 +
 +</code>
 +To find the 10th percentile:
 +10 * (125 / 100) = 12.5
 +This is not an integer, so round it up to 13: the 13th value is the 10th percentile (13).
 +
 +<code>
 +> k <- c(1:10)
 +> length(k)
 +[1] 10
 +> k
 + [1]  1  2  3  4  5  6  7  8  9 10
 +</code>
 +
 +To find the 20th percentile:
 +$ 20 * (10 / 100) = 2 $
 +This is an integer, so the percentile is the average of the 2nd and 3rd values, which is 2.5.
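 +
 +The rule and the two worked examples can be checked with a small function. This is only a sketch: hf_percentile is a hypothetical helper name, and the guard for positions at or past the last value is an added assumption, since the rule itself does not say what to do there. R's own quantile() uses a different interpolating method by default (type 7); its type = 2 option averages at integer positions in the same way and agrees on these examples.
 +<code>
 +# Percentile by the "position k(n/100)" rule (hypothetical helper).
 +hf_percentile <- function(x, k) {
 +  x <- sort(x)                      # line the values up in ascending order
 +  n <- length(x)
 +  pos <- k * n / 100                # position k(n/100)
 +  if (pos >= n) return(x[n])        # assumption: no "next" value past the end
 +  if (pos == floor(pos)) {
 +    mean(x[c(pos, pos + 1)])        # integer: average this value and the next
 +  } else {
 +    x[ceiling(pos)]                 # not an integer: round the position up
 +  }
 +}
 +
 +hf_percentile(1:125, 10)                       # 13
 +hf_percentile(1:10, 20)                        # 2.5
 +quantile(1:10, 0.2, type = 2, names = FALSE)   # 2.5
 +quantile(1:10, 0.2, names = FALSE)             # default type 7: 2.8
 +</code>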
 +
 +====== Boxplot ======
 +<code>
 +# j <- c(6,7,7,8,9,10,10,11,11,13)
 +j <- c(7,9,9,10,10,10,10,11,11,13)
 +# m <- c(3,3,6,7,7,10,10,10,11,13,30)
 +m <- c(3,3,6,7,8,9,9,10,11,13,30)
 +
 +median(j)
 +median(m)
 +</code>
 +
 +[{{hf.boxplot.ex.jpg}}]
 +
 +
 +<code>
 +boxplot(j)
 +boxplot(m)
 +</code>
 +
 +<code>
 +boxplot(j, m)
 +boxplot(j, m, horizontal = T)
 +</code>
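 +
 +The numbers behind the two boxplots can be printed directly. A short sketch using base R's fivenum() (minimum, lower hinge, median, upper hinge, maximum) and boxplot.stats(), whose out component lists the points drawn beyond the whiskers under the default 1.5 x box-length rule:
 +<code>
 +j <- c(7,9,9,10,10,10,10,11,11,13)
 +m <- c(3,3,6,7,8,9,9,10,11,13,30)
 +
 +fivenum(j)              # 7 9 10 11 13
 +fivenum(m)              # 3 6.5 9 10.5 30
 +boxplot.stats(j)$out    # no points beyond the whiskers: numeric(0)
 +boxplot.stats(m)$out    # 30 is drawn as an outlier
 +</code>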
 +
 +
 +
 +
 +====== Variance ======
 +
 [[:variance]] [[:variance]]
   * $ \sum \text{deviation score}^2 = \sum \text{ds}^2 $   * $ \sum \text{deviation score}^2 = \sum \text{ds}^2 $
Line 98: Line 196:
   * calculation of variance (an easy way) see [[:variance#variance_cal|variance calculation]]   * calculation of variance (an easy way) see [[:variance#variance_cal|variance calculation]]
     * $ \displaystyle \frac{\sum X_{i}^2}{N} - \mu^2$     * $ \displaystyle \frac{\sum X_{i}^2}{N} - \mu^2$
 +    * [{{variance.cal.jpg?600}}]
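 +
 +The shortcut form of the variance (the mean of the squared scores minus the squared mean) can be checked against the definition (the mean of the squared deviation scores). The sketch below uses made-up scores x purely for illustration. Note that R's var() divides by N - 1 rather than N, so it returns the sample variance and will not match.
 +<code>
 +x  <- c(1, 2, 3, 4, 5)        # illustrative scores, not from the book
 +N  <- length(x)
 +mu <- mean(x)
 +
 +sum((x - mu)^2) / N           # definition, sum of squared deviations over N: 2
 +sum(x^2) / N - mu^2           # shortcut, mean of squares minus squared mean: 2
 +var(x)                        # sample variance, divides by N - 1: 2.5
 +</code>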
 +
  
 [[:standard deviation]] [[:standard deviation]]
 +====== Standard score ======
 [[:standard score]] [[:standard score]]
 +$ z = \large\frac {x-\mu}{\sigma} $
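 +
 +A minimal sketch of the formula in R, again with made-up scores. It assumes $\sigma$ is the population standard deviation (dividing by N). R's sd() and scale() divide by N - 1 instead, so they would give slightly different z values.
 +<code>
 +x     <- c(1, 2, 3, 4, 5)              # illustrative scores
 +mu    <- mean(x)
 +sigma <- sqrt(mean((x - mu)^2))        # population standard deviation
 +
 +z <- (x - mu) / sigma                  # standard score of every value
 +round(z, 2)                            # -1.41 -0.71 0.00 0.71 1.41
 +</code>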
b/head_first_statistics/variability_and_spread.txt · Last modified: 2023/09/13 08:59 by hkimscil
