b:head_first_statistics:variability_and_spread

Differences

This shows you the differences between two versions of the page.

b:head_first_statistics:variability_and_spread [2020/09/21 12:42] hkimscilb:head_first_statistics:variability_and_spread [2023/09/13 08:59] (current) – [Variability and Spread] hkimscil
Line 59: Line 59:
 > >
 > >
 +
 > sapply(data,sd) > sapply(data,sd)
 [1] 1.825742 1.563472 7.362065 [1] 1.825742 1.563472 7.362065
Line 65: Line 66:
 > </code> > </code>
  
 +====== Range ======
  
 [[:range]] [[:range]]
Line 80: Line 82:
 30 - 3 = 27 30 - 3 = 27
 </code> </code>
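-However, the range does not accurately describe the distribution of the data either. +However, the range does not accurately describe the distribution of the data either. The first and second data sets below both have a range of 4 (from 8 to 12), but the individual scores are distributed differently. 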
 [{{range.no.difference.jpg}}] [{{range.no.difference.jpg}}]
 +That is,
 +[{{range.problem.jpg}}]
  
 +The problem of outliers (extreme values) 
 +<code>
 +a <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5)
 +b <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5, 10)
 +</code>
 +
 +range(a) vs. range(b) 
 +
 +The difference in range between these two groups is due entirely to the outlier (the single value 10 in b).
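
The comparison of range(a) and range(b) can be checked directly in R. This is a minimal sketch; the names range_a and range_b are introduced here only for illustration. Note that R's range() returns the minimum and maximum, so diff() is used to get the spread itself.

```r
# Same data as above; b is a plus one extreme value (10).
a <- c(1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,5,5,5)
b <- c(a, 10)

range(a)                    # minimum and maximum of a
range(b)                    # minimum and maximum of b

range_a <- diff(range(a))   # max - min = 5 - 1
range_b <- diff(range(b))   # max - min = 10 - 1
range_a
range_b
```

Adding one observation to nineteen more than doubles the range, which is why the range alone is a fragile measure of spread.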
  
 +====== Quartile ======
 [[:quartile]] [[:quartile]]
 +[{{hf.measuring.variability.p95.ex.jpg}}]
 +
 +<code>
 +> basket <- c(3,3,6,7,7,10,10,10,11,13,30)
 +> basket <- sort(basket)
 +> basket
 + [1]  3  3  6  7  7 10 10 10 11 13 30
 +
 +</code>
 +
 +<code>> quantile(basket)
 +  0%  25%  50%  75% 100% 
 + 3.0  6.5 10.0 10.5 30.0 
 +
 +</code>
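
As a hedged follow-up to the quartiles above, the sketch below compares the full range of basket with the distance between the 25% and 75% quartiles (the interquartile range). The variable names full_range and iqr are assumptions made for this example; quantile() and IQR() are used with R's default settings.

```r
basket <- c(3,3,6,7,7,10,10,10,11,13,30)

q <- quantile(basket)                  # 0%, 25%, 50%, 75%, 100%
full_range <- diff(range(basket))      # 30 - 3
iqr <- unname(q["75%"] - q["25%"])     # 10.5 - 6.5

full_range
iqr
IQR(basket)                            # built-in equivalent of iqr
```

The extreme value 30 dominates the full range but has no effect on the distance between the quartiles.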
 +====== Percentile ======
 +<WRAP info>
 +How to find percentile 
 +  - First of all, line all your values up in ascending order. 
 +  - To find the position of the kth percentile out of n numbers, start off by calculating $ k(\frac{n}{100}) $.
 +  - If this gives you an integer, then your percentile is halfway between the value at position $ k(\frac{n}{100})$ and the next number along. Take the average of the numbers at these two positions to give you your percentile.
 +  - If $ k(\frac{n}{100})$ is not an integer, then round it up. This then gives you the position of the percentile.
 +</WRAP>
 +
 +<code>
 +> k <- c(1:125)
 +> length(k)
 +[1] 125
 +> k
 +  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
 + [21]  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40
 + [41]  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
 + [61]  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80
 + [81]  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
 +[101] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
 +[121] 121 122 123 124 125
 +
 +</code>
 +To find the 10th percentile: 
 +10 * (125 / 100) = 12.5 
 +This is not an integer, so round it up to 13; the 13th value is the 10th percentile (13).
 +
 +<code>
 +> k <- c(1:10)
 +> length(k)
 +[1] 10
 +> k
 + [1]  1  2  3  4  5  6  7  8  9 10
 +</code>
 +
 +To find the 20th percentile: 
 +$ 20 * (10 / 100) = 2 $, which is an integer, 
 +so the percentile is the average of the 2nd and 3rd values (2 and 3), which is 2.5.
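
The percentile rule from the info box can be sketched as a small R function and checked against both worked examples. The function name percentile is hypothetical (it is not part of base R or of the original page), and the sketch assumes 0 < k < 100.

```r
# Hypothetical helper: kth percentile of x using the rule in the info box.
# Assumes 0 < k < 100.
percentile <- function(x, k) {
  x <- sort(x)                       # step 1: ascending order
  n <- length(x)
  pos <- k * n / 100                 # step 2: position k(n/100)
  if (isTRUE(all.equal(pos, round(pos)))) {
    pos <- round(pos)
    mean(x[c(pos, pos + 1)])         # step 3: integer, average with the next value
  } else {
    x[ceiling(pos)]                  # step 4: not an integer, round the position up
  }
}

p10 <- percentile(1:125, 10)         # position 12.5, so the 13th value
p20 <- percentile(1:10, 20)          # position 2, so the mean of the 2nd and 3rd values
p10
p20
```

R's built-in quantile() interpolates differently by default, so its answers can differ from this rule; as far as I can tell, quantile(x, k / 100, type = 2) is the variant that averages at integer positions in the same way.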
 +
 +====== Boxplot ======
 +<code>
 +# j <- c(6,7,7,8,9,10,10,11,11,13)
 +j <- c(7,9,9,10,10,10,10,11,11,13)
 +# m <- c(3,3,6,7,7,10,10,10,11,13,30)
 +m <- c(3,3,6,7,8,9,9,10,11,13,30)
 +
 +median(j)
 +median(m)
 +</code>
 +
 +[{{hf.boxplot.ex.jpg}}]
 +
 +
 +<code>
 +boxplot(j)
 +boxplot(m)
 +</code>
 +
 +<code>
 +boxplot(j, m)
 +boxplot(j, m, horizontal = TRUE)
 +</code>
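
To see the numbers a boxplot is drawn from, the sketch below uses boxplot.stats() from the grDevices package (loaded by default in R) on the m data. The variable name s is an assumption for this example. By default, points more than 1.5 times the box length beyond the box are reported as outliers.

```r
m <- c(3,3,6,7,8,9,9,10,11,13,30)

s <- boxplot.stats(m)
s$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out     # values plotted individually as outliers
```

Here the upper whisker stops at 13 and the value 30 is drawn as a separate point, which matches what boxplot(m) shows.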
 +
 +
 +
 +
 +====== Variance ======
 +
 [[:variance]] [[:variance]]
   * $ \sum \text{deviation score}^2 = \sum \text{ds}^2 $   * $ \sum \text{deviation score}^2 = \sum \text{ds}^2 $
Line 100: Line 197:
   * calculation of variance (an easy way) see [[:variance#variance_cal|variance calculation]]   * calculation of variance (an easy way) see [[:variance#variance_cal|variance calculation]]
     * $ \displaystyle \frac{\sum X_{i}^2}{N} - \mu^2$     * $ \displaystyle \frac{\sum X_{i}^2}{N} - \mu^2$
 +    * [{{variance.cal.jpg?600}}]
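
The shortcut calculation (mean of the squared values minus the squared mean) can be checked against the definition with a small made-up data set. The vector x and the variable names below are assumptions for illustration only. Note that R's var() divides by n - 1 (sample variance), so it does not return the population variance computed here.

```r
x <- c(1, 2, 3, 4, 5)                # illustrative data, not from the page
mu <- mean(x)

var_def    <- mean((x - mu)^2)       # definition: mean of squared deviation scores
var_short  <- mean(x^2) - mu^2       # shortcut: mean of squares minus squared mean
var_sample <- var(x)                 # R's var() uses n - 1 in the denominator

var_def
var_short
var_sample
```

The first two results agree, while var(x) is larger by a factor of n / (n - 1).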
 +
  
 [[:standard deviation]] [[:standard deviation]]
 +====== Standard score ======
 [[:standard score]] [[:standard score]]
 +$ z = \large\frac {x-\mu}{\sigma} $
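
A minimal sketch of the standard score formula in R, using made-up data and treating the data as the whole population (so $\sigma$ is computed with N, not n - 1). The names x, mu, sigma and z are assumptions for this example.

```r
x <- c(1, 2, 3, 4, 5)                # illustrative data, not from the page
mu <- mean(x)
sigma <- sqrt(mean((x - mu)^2))      # population standard deviation

z <- (x - mu) / sigma                # standard score for each value
z
mean(z)                              # standardized scores have mean 0
sqrt(mean(z^2))                      # and population standard deviation 1
```

R's scale() does a similar job but divides by the sample standard deviation (n - 1), so its values differ slightly from these.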
b/head_first_statistics/variability_and_spread.1600659769.txt.gz · Last modified: 2020/09/21 12:42 by hkimscil
