Geometric distribution
\begin{align*}
\text{Geometric Distribution: } \;\;\; X & \sim \text{Geo}(p) \\
P(X = k) & = q^{k-1} \cdot p, \quad q = 1 - p \\
E\left[ X \right] & = \frac{1}{p} \\
V\left[ X \right] & = \frac{q}{p^2}
\end{align*}
The probability of Chad making a clear run down the slope is 0.2, and he's going to keep on trying until he succeeds. After he’s made his first successful run down the slopes, he’s going to stop snowboarding and head back to the lodge triumphantly.
It’s time to exercise your probability skills. The probability of Chad making a successful run down the slopes is 0.2 for any given trial (assume trials are independent). What’s the probability he’ll need two trials? What’s the probability he’ll make a successful run down the slope in one or two trials? Remember, when he’s had his first successful run, he’s going to stop.
Hint: You may want to draw a probability tree to help visualize the problem.
P(X = 1) = P(success in the first trial) = 0.2
P(X = 2) = P(failure in the first trial and success in the second trial) = 0.8 * 0.2 = 0.16
Probability of succeeding on the first or the second trial:
P(X <= 2) = P(X = 1) + P(X = 2) = 0.2 + 0.16 = 0.36
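The arithmetic above can be checked in R. This is a minimal sketch, assuming base R's `dgeom` and `pgeom`, which count the number of failures before the first success, so "first success on trial k" corresponds to k - 1 failures. The variable names are illustrative only.

```r
p <- 0.2   # probability of a successful run on any single trial

# dgeom()/pgeom() are parameterised by the number of failures before
# the first success, so trial k corresponds to k - 1 failures.
p_first      <- dgeom(0, prob = p)   # P(X = 1)
p_second     <- dgeom(1, prob = p)   # P(X = 2)
p_within_two <- pgeom(1, prob = p)   # P(X <= 2)

print(c(p_first, p_second, p_within_two))
```

The third value should agree with the sum of the first two.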
| X | P(X=x) |
|---|---|
| 1 | 0.2 |
| 2 | 0.8 * 0.2 = 0.16 |
| 3 | 0.8 * 0.8 * 0.2 = 0.128 |
| 4 | 0.8 * 0.8 * 0.8 * 0.2 = 0.1024 |
| . . . | . . . . . |
| X | P(X=x) | Power of 0.8 | Power of 0.2 |
|---|---|---|---|
| 1 | $0.8^{0} \times 0.2$ | 0 | 1 |
| 2 | $0.8^{1} \times 0.2$ | 1 | 1 |
| 3 | $0.8^{2} \times 0.2$ | 2 | 1 |
| 4 | $0.8^{3} \times 0.2$ | 3 | 1 |
| 5 | $0.8^{4} \times 0.2$ | 4 | 1 |
| r | $0.8^{r-1} \times 0.2$ | r - 1 | 1 |
$P(X = r) = 0.8^{r-1} \times 0.2$
$P(X = r) = q^{r-1} \times p$
This formula is called the geometric distribution.
$ P(X=r) = {p \cdot q^{r-1}} $
$ P(X=r) = {p \cdot (1-p)^{r-1}} $
p = 0.20
n = 29
## geometric . . . .
## note that it starts with 0 rather than 1
## since the function uses p * q^(r),
## rather than p * q^(r-1)
dgeom(x = 0:n, prob = p)
hist(dgeom(x = 0:n, prob = p))
> p = 0.20
> n = 29
> # exact
> dgeom(0:n, prob = p)
 [1] 0.2000000000 0.1600000000 0.1280000000 0.1024000000 0.0819200000 0.0655360000 0.0524288000
 [8] 0.0419430400 0.0335544320 0.0268435456 0.0214748365 0.0171798692 0.0137438953 0.0109951163
[15] 0.0087960930 0.0070368744 0.0056294995 0.0045035996 0.0036028797 0.0028823038 0.0023058430
[22] 0.0018446744 0.0014757395 0.0011805916 0.0009444733 0.0007555786 0.0006044629 0.0004835703
[29] 0.0003868563 0.0003094850
>
> hist(dgeom(x = 0:n, prob = p))
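The off-by-one between the textbook formula and `dgeom` can be checked directly. The sketch below defines a hypothetical helper, `geo_pmf` (not part of R or of the original notes), that implements $p \cdot q^{r-1}$ with trials numbered from 1, and compares it with `dgeom` evaluated at r - 1 failures.

```r
# Hypothetical helper: P(X = r) with trials numbered from 1.
geo_pmf <- function(r, p) {
  p * (1 - p)^(r - 1)
}

p <- 0.2
r <- 1:30

by_formula <- geo_pmf(r, p)            # textbook form, r = 1, 2, ...
by_dgeom   <- dgeom(r - 1, prob = p)   # dgeom counts failures, so shift by one

print(all.equal(by_formula, by_dgeom))
```

Shifting the argument by one is all that is needed to move between the two conventions.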
The probability that the first success comes somewhere after trial r, that is, that the first r trials all fail:
$$ P(X > r) = q^{r} $$
Example: what is the probability that the first success comes somewhere after 20 trials?
Solution.
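p <- .2
q <- 1-p
n <- 19
s <- dgeom(x = 0:n, prob = p)
# total probability of succeeding on or before the 20th trial
sum(s)
# so this is the probability that the first success comes somewhere after the 20th trial
1-sum(s)
## or (as the textbook puts it) the probability of failing on all of the first 20 trials
q^20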
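> p <- .2
> q <- 1-p
> n <- 19
> s <- dgeom(x = 0:n, prob = p)
> # probability of succeeding on or before the 20th trial
> sum(s)
[1] 0.9884708
> # so this is the probability that the first success comes somewhere after the 20th trial
> 1-sum(s)
[1] 0.01152922
> ## or (as the textbook puts it) the probability of failing on all of the first 20 trials
> q^20
[1] 0.01152922
>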
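The same tail probability can also be read off the cumulative distribution function. This is a sketch assuming base R's `pgeom` with `lower.tail = FALSE`; as with `dgeom`, its first argument is the number of failures, so r trials correspond to r - 1.

```r
p <- 0.2
q <- 1 - p
r <- 20

# P(X > r): the first r trials all fail.
tail_formula <- q^r

# pgeom() counts failures before the first success, so P(X <= r) is
# pgeom(r - 1, p); lower.tail = FALSE returns the complement, P(X > r).
tail_pgeom <- pgeom(r - 1, prob = p, lower.tail = FALSE)

print(c(tail_formula, tail_pgeom))
```

Both entries should match the `q^20` value in the transcript.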
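Then what is the probability that the first success comes on or before trial r? It is the complement of the probability of failing on all of the first r trials: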
$$ P(X \le r) = 1 - q^{r} $$
Alternatively, add up the probabilities of the first success coming on trial 1, trial 2, . . . , trial r:
# suppose r = 20
p <- .2
q <- 1-p
n <- 19
s <- dgeom(x = 0:n, prob = p)
sum(s)
Note that
$$P(X > r) + P(X \le r) = 1 $$
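As a quick numerical check of this identity, the sketch below computes $P(X \le r)$ by explicit summation and $P(X > r)$ as $q^r$ for several values of r. The range 1 to 30 is an arbitrary choice for illustration.

```r
p <- 0.2
q <- 1 - p
r <- 1:30

# P(X > r): all of the first r trials fail.
upper <- q^r

# P(X <= r): sum of P(X = k) = q^(k-1) * p for k = 1, ..., r.
lower <- sapply(r, function(k) sum(q^(0:(k - 1)) * p))

print(all.equal(upper + lower, rep(1, length(r))))
print(all.equal(lower, 1 - q^r))
```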
X follows a geometric distribution with success probability p: $X \sim \text{Geo}(p)$
Reminder: expected value of a discrete random variable
$E(X) = \sum_x x \cdot P(X=x)$
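| textbook | x | P(X = x) | xP(X = x) | xP(X ≤ x): running total, $E(X) = \sum_x x \cdot P(X=x)$ |
|---|---|---|---|---|
| R code | trial | px <- q^(trial-1)*p | npx <- trial*(q^(trial-1))*p, same as npx <- trial*px | plex <- cumsum(trial*(q^(trial-1))*p), same as plex <- cumsum(npx) |
| meaning | trial number x | probability that the first success comes on trial x | contribution of trial x to the expected value (as in the dice example) | running total of those contributions up to trial x |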
| x | p(x) px | npx.0 | npx = weighted probability at a given spot | plex.0 | plex | |
|---|---|---|---|---|---|---|
| 0 | 0.1 | 0 * 0.1 | 0.00 | 0.00 | 0.00 | |
| 1 | 0.15 | 1 * 0.15 | 0.15 | 0.00 + 0.15 | 0.15 | |
| 2 | 0.4 | 2 * 0.4 | 0.80 | 0.00 + 0.15 + 0.80 | 0.95 | |
| 3 | 0.25 | 3 * 0.25 | 0.75 | 0.00 + 0.15 + 0.80 + 0.75 | 1.7 | |
| 4 | 0.1 | 4 * 0.1 | 0.40 | 0.00 + 0.15 + 0.80 + 0.75 + 0.40 | 2.1 | = this is E(x) |
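The table above can be reproduced in a few lines of R. This sketch reuses the column names `px`, `npx`, and `plex` from the table; the probabilities are copied from it and are a generic discrete example, not a geometric distribution.

```r
x  <- 0:4
px <- c(0.1, 0.15, 0.4, 0.25, 0.1)   # P(X = x) from the table; sums to 1

npx  <- x * px        # weighted value at each x
plex <- cumsum(npx)   # running total; the last entry is E(X)

print(data.frame(x, px, npx, plex))

expected_value <- sum(npx)
print(expected_value)
```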
p <- .2
q <- 1-p
trial <- c(1:8)
px <- q^(trial-1)*p
px
## npx <- trial*(q^(trial-1))*p
## the line above is equivalent to the one below
npx <- trial*px
npx
## plex <- cumsum(trial*(q^(trial-1))*p)
## the line above is equivalent to the one below
plex <- cumsum(npx)
plex
sumgeod <- data.frame(trial,px,npx,plex)
round(sumgeod,3)
> p <- .2
> q <- 1-p
> trial <- c(1,2,3,4,5,6,7,8)
> px <- q^(trial-1)*p
> px
[1] 0.20000000 0.16000000 0.12800000 0.10240000 0.08192000 0.06553600 0.05242880 0.04194304
> npx <- trial*(q^(trial-1))*p
> npx
[1] 0.2000000 0.3200000 0.3840000 0.4096000 0.4096000 0.3932160 0.3670016 0.3355443
> plex <- cumsum(trial*(q^(trial-1))*p)
> plex
[1] 0.200000 0.520000 0.904000 1.313600 1.723200 2.116416 2.483418 2.818962
> sumgeod <- data.frame(trial,px,npx,plex)
> round(sumgeod,3)
  trial    px   npx  plex
1     1 0.200 0.200 0.200
2     2 0.160 0.320 0.520
3     3 0.128 0.384 0.904
4     4 0.102 0.410 1.314
5     5 0.082 0.410 1.723
6     6 0.066 0.393 2.116
7     7 0.052 0.367 2.483
8     8 0.042 0.336 2.819
>
p <- .2
q <- 1-p
trial <- c(1:100)
px <- q^(trial-1)*p
px
npx <- trial*px
npx
## plex <- cumsum(trial*(q^(trial-1))*p)
## the line above is equivalent to the one below
plex <- cumsum(npx)
plex
sumgeod <- data.frame(trial,px,npx,plex)
sumgeod
plot(npx, type="l")
plot(plex, type="l")
>
> p <- .2
> q <- 1-p
> trial <- c(1:100)
> px <- q^(trial-1)*p
> px
[1] 2.000000e-01 1.600000e-01 1.280000e-01 1.024000e-01
[5] 8.192000e-02 6.553600e-02 5.242880e-02 4.194304e-02
[9] 3.355443e-02 2.684355e-02 2.147484e-02 1.717987e-02
[13] 1.374390e-02 1.099512e-02 8.796093e-03 7.036874e-03
[17] 5.629500e-03 4.503600e-03 3.602880e-03 2.882304e-03
[21] 2.305843e-03 1.844674e-03 1.475740e-03 1.180592e-03
[25] 9.444733e-04 7.555786e-04 6.044629e-04 4.835703e-04
[29] 3.868563e-04 3.094850e-04 2.475880e-04 1.980704e-04
[33] 1.584563e-04 1.267651e-04 1.014120e-04 8.112964e-05
[37] 6.490371e-05 5.192297e-05 4.153837e-05 3.323070e-05
[41] 2.658456e-05 2.126765e-05 1.701412e-05 1.361129e-05
[45] 1.088904e-05 8.711229e-06 6.968983e-06 5.575186e-06
[49] 4.460149e-06 3.568119e-06 2.854495e-06 2.283596e-06
[53] 1.826877e-06 1.461502e-06 1.169201e-06 9.353610e-07
[57] 7.482888e-07 5.986311e-07 4.789049e-07 3.831239e-07
[61] 3.064991e-07 2.451993e-07 1.961594e-07 1.569275e-07
[65] 1.255420e-07 1.004336e-07 8.034690e-08 6.427752e-08
[69] 5.142202e-08 4.113761e-08 3.291009e-08 2.632807e-08
[73] 2.106246e-08 1.684997e-08 1.347997e-08 1.078398e-08
[77] 8.627183e-09 6.901746e-09 5.521397e-09 4.417118e-09
[81] 3.533694e-09 2.826955e-09 2.261564e-09 1.809251e-09
[85] 1.447401e-09 1.157921e-09 9.263367e-10 7.410694e-10
[89] 5.928555e-10 4.742844e-10 3.794275e-10 3.035420e-10
[93] 2.428336e-10 1.942669e-10 1.554135e-10 1.243308e-10
[97] 9.946465e-11 7.957172e-11 6.365737e-11 5.092590e-11
> npx <- trial*px
> npx
[1] 2.000000e-01 3.200000e-01 3.840000e-01 4.096000e-01
[5] 4.096000e-01 3.932160e-01 3.670016e-01 3.355443e-01
[9] 3.019899e-01 2.684355e-01 2.362232e-01 2.061584e-01
[13] 1.786706e-01 1.539316e-01 1.319414e-01 1.125900e-01
[17] 9.570149e-02 8.106479e-02 6.845471e-02 5.764608e-02
[21] 4.842270e-02 4.058284e-02 3.394201e-02 2.833420e-02
[25] 2.361183e-02 1.964504e-02 1.632050e-02 1.353997e-02
[29] 1.121883e-02 9.284550e-03 7.675228e-03 6.338253e-03
[33] 5.229059e-03 4.310012e-03 3.549422e-03 2.920667e-03
[37] 2.401437e-03 1.973073e-03 1.619997e-03 1.329228e-03
[41] 1.089967e-03 8.932412e-04 7.316071e-04 5.988970e-04
[45] 4.900066e-04 4.007165e-04 3.275422e-04 2.676089e-04
[49] 2.185473e-04 1.784060e-04 1.455793e-04 1.187470e-04
[53] 9.682448e-05 7.892109e-05 6.430607e-05 5.238022e-05
[57] 4.265246e-05 3.472060e-05 2.825539e-05 2.298743e-05
[61] 1.869645e-05 1.520236e-05 1.235804e-05 1.004336e-05
[65] 8.160232e-06 6.628619e-06 5.383242e-06 4.370871e-06
[69] 3.548119e-06 2.879633e-06 2.336616e-06 1.895621e-06
[73] 1.537559e-06 1.246898e-06 1.010998e-06 8.195824e-07
[77] 6.642931e-07 5.383362e-07 4.361904e-07 3.533694e-07
[81] 2.862292e-07 2.318103e-07 1.877098e-07 1.519771e-07
[85] 1.230291e-07 9.958120e-08 8.059129e-08 6.521410e-08
[89] 5.276414e-08 4.268560e-08 3.452790e-08 2.792587e-08
[93] 2.258353e-08 1.826109e-08 1.476428e-08 1.193576e-08
[97] 9.648071e-09 7.798028e-09 6.302080e-09 5.092590e-09
> ## plex <- cumsum(trial*(q^(trial-1))*p)
> ## the line above is equivalent to the one below
> plex <- cumsum(npx)
> plex
[1] 0.200000 0.520000 0.904000 1.313600 1.723200 2.116416 2.483418
[8] 2.818962 3.120952 3.389387 3.625610 3.831769 4.010440 4.164371
[15] 4.296313 4.408903 4.504604 4.585669 4.654124 4.711770 4.760192
[22] 4.800775 4.834717 4.863051 4.886663 4.906308 4.922629 4.936169
[29] 4.947388 4.956672 4.964347 4.970686 4.975915 4.980225 4.983774
[36] 4.986695 4.989096 4.991069 4.992689 4.994018 4.995108 4.996002
[43] 4.996733 4.997332 4.997822 4.998223 4.998550 4.998818 4.999037
[50] 4.999215 4.999361 4.999479 4.999576 4.999655 4.999719 4.999772
[57] 4.999814 4.999849 4.999877 4.999900 4.999919 4.999934 4.999947
[64] 4.999957 4.999965 4.999971 4.999977 4.999981 4.999985 4.999988
[71] 4.999990 4.999992 4.999993 4.999995 4.999996 4.999997 4.999997
[78] 4.999998 4.999998 4.999998 4.999999 4.999999 4.999999 4.999999
[85] 4.999999 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
[92] 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
[99] 5.000000 5.000000
> sumgeod <- data.frame(trial,px,npx,plex)
> sumgeod
trial px npx plex
1 1 2.000000e-01 2.000000e-01 0.200000
2 2 1.600000e-01 3.200000e-01 0.520000
3 3 1.280000e-01 3.840000e-01 0.904000
4 4 1.024000e-01 4.096000e-01 1.313600
5 5 8.192000e-02 4.096000e-01 1.723200
6 6 6.553600e-02 3.932160e-01 2.116416
7 7 5.242880e-02 3.670016e-01 2.483418
8 8 4.194304e-02 3.355443e-01 2.818962
9 9 3.355443e-02 3.019899e-01 3.120952
10 10 2.684355e-02 2.684355e-01 3.389387
11 11 2.147484e-02 2.362232e-01 3.625610
12 12 1.717987e-02 2.061584e-01 3.831769
13 13 1.374390e-02 1.786706e-01 4.010440
14 14 1.099512e-02 1.539316e-01 4.164371
15 15 8.796093e-03 1.319414e-01 4.296313
16 16 7.036874e-03 1.125900e-01 4.408903
17 17 5.629500e-03 9.570149e-02 4.504604
18 18 4.503600e-03 8.106479e-02 4.585669
19 19 3.602880e-03 6.845471e-02 4.654124
20 20 2.882304e-03 5.764608e-02 4.711770
21 21 2.305843e-03 4.842270e-02 4.760192
22 22 1.844674e-03 4.058284e-02 4.800775
23 23 1.475740e-03 3.394201e-02 4.834717
24 24 1.180592e-03 2.833420e-02 4.863051
25 25 9.444733e-04 2.361183e-02 4.886663
26 26 7.555786e-04 1.964504e-02 4.906308
27 27 6.044629e-04 1.632050e-02 4.922629
28 28 4.835703e-04 1.353997e-02 4.936169
29 29 3.868563e-04 1.121883e-02 4.947388
30 30 3.094850e-04 9.284550e-03 4.956672
31 31 2.475880e-04 7.675228e-03 4.964347
32 32 1.980704e-04 6.338253e-03 4.970686
33 33 1.584563e-04 5.229059e-03 4.975915
34 34 1.267651e-04 4.310012e-03 4.980225
35 35 1.014120e-04 3.549422e-03 4.983774
36 36 8.112964e-05 2.920667e-03 4.986695
37 37 6.490371e-05 2.401437e-03 4.989096
38 38 5.192297e-05 1.973073e-03 4.991069
39 39 4.153837e-05 1.619997e-03 4.992689
40 40 3.323070e-05 1.329228e-03 4.994018
41 41 2.658456e-05 1.089967e-03 4.995108
42 42 2.126765e-05 8.932412e-04 4.996002
43 43 1.701412e-05 7.316071e-04 4.996733
44 44 1.361129e-05 5.988970e-04 4.997332
45 45 1.088904e-05 4.900066e-04 4.997822
46 46 8.711229e-06 4.007165e-04 4.998223
47 47 6.968983e-06 3.275422e-04 4.998550
48 48 5.575186e-06 2.676089e-04 4.998818
49 49 4.460149e-06 2.185473e-04 4.999037
50 50 3.568119e-06 1.784060e-04 4.999215
51 51 2.854495e-06 1.455793e-04 4.999361
52 52 2.283596e-06 1.187470e-04 4.999479
53 53 1.826877e-06 9.682448e-05 4.999576
54 54 1.461502e-06 7.892109e-05 4.999655
55 55 1.169201e-06 6.430607e-05 4.999719
56 56 9.353610e-07 5.238022e-05 4.999772
57 57 7.482888e-07 4.265246e-05 4.999814
58 58 5.986311e-07 3.472060e-05 4.999849
59 59 4.789049e-07 2.825539e-05 4.999877
60 60 3.831239e-07 2.298743e-05 4.999900
61 61 3.064991e-07 1.869645e-05 4.999919
62 62 2.451993e-07 1.520236e-05 4.999934
63 63 1.961594e-07 1.235804e-05 4.999947
64 64 1.569275e-07 1.004336e-05 4.999957
65 65 1.255420e-07 8.160232e-06 4.999965
66 66 1.004336e-07 6.628619e-06 4.999971
67 67 8.034690e-08 5.383242e-06 4.999977
68 68 6.427752e-08 4.370871e-06 4.999981
69 69 5.142202e-08 3.548119e-06 4.999985
70 70 4.113761e-08 2.879633e-06 4.999988
71 71 3.291009e-08 2.336616e-06 4.999990
72 72 2.632807e-08 1.895621e-06 4.999992
73 73 2.106246e-08 1.537559e-06 4.999993
74 74 1.684997e-08 1.246898e-06 4.999995
75 75 1.347997e-08 1.010998e-06 4.999996
76 76 1.078398e-08 8.195824e-07 4.999997
77 77 8.627183e-09 6.642931e-07 4.999997
78 78 6.901746e-09 5.383362e-07 4.999998
79 79 5.521397e-09 4.361904e-07 4.999998
80 80 4.417118e-09 3.533694e-07 4.999998
81 81 3.533694e-09 2.862292e-07 4.999999
82 82 2.826955e-09 2.318103e-07 4.999999
83 83 2.261564e-09 1.877098e-07 4.999999
84 84 1.809251e-09 1.519771e-07 4.999999
85 85 1.447401e-09 1.230291e-07 4.999999
86 86 1.157921e-09 9.958120e-08 5.000000 ###########
87 87 9.263367e-10 8.059129e-08 5.000000
88 88 7.410694e-10 6.521410e-08 5.000000
89 89 5.928555e-10 5.276414e-08 5.000000
90 90 4.742844e-10 4.268560e-08 5.000000
91 91 3.794275e-10 3.452790e-08 5.000000
92 92 3.035420e-10 2.792587e-08 5.000000
93 93 2.428336e-10 2.258353e-08 5.000000
94 94 1.942669e-10 1.826109e-08 5.000000
95 95 1.554135e-10 1.476428e-08 5.000000
96 96 1.243308e-10 1.193576e-08 5.000000
97 97 9.946465e-11 9.648071e-09 5.000000
98 98 7.957172e-11 7.798028e-09 5.000000
99 99 6.365737e-11 6.302080e-09 5.000000
100 100 5.092590e-11 5.092590e-09 5.000000
> plot(npx, type="l")
> plot(plex, type="l")
For proofs of $(4)$ and $(5)$, see Mean and Variance of Geometric Distribution.
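Short of a proof, the two results can at least be checked numerically. The sketch below assumes that $(4)$ and $(5)$ refer to the formulas $E[X] = 1/p$ and $V[X] = q/p^2$ given at the top of this page, and approximates both by truncating the infinite sums at 1000 trials, an arbitrary cutoff beyond which the remaining terms are negligible for p = 0.2.

```r
p <- 0.2
q <- 1 - p
trial <- 1:1000   # truncation point (an assumption; the tail is negligible)

px <- q^(trial - 1) * p                   # P(X = trial)
mean_x <- sum(trial * px)                 # approximates E[X]
var_x  <- sum(trial^2 * px) - mean_x^2    # E[X^2] - (E[X])^2

print(c(mean_x, 1 / p))     # numerical mean next to 1/p
print(c(var_x, q / p^2))    # numerical variance next to q/p^2
```

This is a sanity check only; it does not replace the derivation referenced above.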
The probability that another snowboarder will make it down the slope without falling over is 0.4. Your job is to play like you’re the snowboarder and work out the following probabilities for your slope success.