====== Normal distribution functions ======

^ Function  ^ Purpose  ^
| dnorm	 | Normal density  | 
| pnorm	 | Normal distribution function  | 
| qnorm	 | Normal quantile function  | 
| rnorm	 | Normal random variates  | 

Table 8-1. Discrete distributions

| Discrete distribution  | R name  | Parameters  |
| Binomial  | binom  | n = number of trials;  \\ p = probability of success for one trial  |
| Geometric  | geom  | p = probability of success for one trial  |
| Hypergeometric  | hyper  | m = number of white balls in urn; \\ n = number of black balls in urn;  \\ k = number of balls drawn from urn  |
| Negative binomial (NegBinomial)  | nbinom  | size = number of successful trials; \\ either prob = probability of successful trial or \\ mu = mean  |
| Poisson  | pois  | lambda = mean  |

Table 8-2. Continuous distributions
| Continuous distribution  | R name  | Parameters  |
| Beta  | beta  | shape1; shape2  |
| Cauchy  | cauchy  | location; scale  |
| Chi-squared (Chisquare)  | chisq  | df = degrees of freedom  |
| Exponential  | exp  | rate  |
| F  | f  | df1 and df2 = degrees of freedom  |
| Gamma  | gamma  | rate; either rate or scale  |
| Log-normal (Lognormal)  | lnorm  | meanlog = mean on logarithmic scale; \\ sdlog = standard deviation on logarithmic scale   |
| Logistic  | logis  | location; scale  |
| Normal  | norm  | mean; \\ sd = standard deviation  |
| Student’s t (TDist)  | t  | df = degrees of freedom  |
| Uniform  | unif  | min = lower limit; \\ max = upper limit  |
| Weibull  | weibull  | shape; scale  |
| Wilcoxon  | wilcox  | m = number of observations in first sample; \\ n = number of observations in second sample   |
===== pnorm, qnorm =====

<WRAP info>
Normal distribution
$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{\frac{-(x-\mu)^2}{2\sigma^2}} $

Assume that the test scores of a college entrance exam fits a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam?

<code>> pnorm(72, mean=72, sd=15.2, lower.tail=FALSE)
[1] 0.5

> pnorm(1.96)
[1] 0.9750021

> pnorm(1.96)-pnorm(-1.96)
[1] 0.9500042

> pnorm(c(1.96, -1.96))
[1] 0.9750021 0.0249979

> pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)
[1] .2149176

> qnorm(.2149176, mean=72, sd=15.2, lower.tail=FALSE)
[1] 84
</code></WRAP>
===== rnorm =====
Random samples from a normal distribution
<code>> set.seed(1024)
> rnorm(50)
 [1] -0.778662882 -0.389476396 -2.033798329 -0.982373104  0.247890054
 [6] -2.103864629 -0.381418049  2.074919838  1.027138407  0.473014228
[11] -1.879263193 -1.239189026  1.160418602  0.003671291 -0.095452066
[16]  1.795551228 -1.322138481 -0.276086413 -0.743976510 -1.070050125
[21] -0.349525474  0.805559661  1.605301660  1.447595754 -0.128302224
[26] -0.538926447  0.391586050  0.879217023 -0.824732092  0.732876423
[31] -0.664914510  0.360885549  1.011930957 -0.235916848  1.353589893
[36] -0.268632965  1.019877368 -0.279706500 -0.618146278 -0.499273059
[41] -0.153716777  1.220869694 -0.669570510 -1.209660342  1.024096655
[46]  0.603955311 -0.568653469 -0.891303117 -2.525145692  0.589357049</code>


===== qt, pt =====

<WRAP info>
$t = \frac{Z}{\sqrt{\frac{V}{m}}}$
<code>> qt(c(0.025, 0.975), df=5)
[1] -2.5706  2.5706

> qt(c(0.025, 0.975), df=10)
[1] -2.228139  2.228139

> qt(c(0.025, 0.975), df=20)
[1] -2.085963  2.085963

> qt(c(0.025, 0.975), df=30)
[1] -2.042272  2.042272

> qt(c(0.025, 0.975), df=40)
[1] -2.021075  2.021075

> qt(c(0.025, 0.975), df=50)
[1] -2.008559  2.008559

. . . . . .

> qt(c(0.025, 0.975), df=50000)
[1] -1.960011  1.960011

</code>
</WRAP>

====== Counting the Number of Combinations ======

A common problem in computing probabilities of discrete variables is counting combinations: the number of distinct subsets of size k that can be created from n items. The number is given by 
 $$n!/r!(n − r)!$$
But it’s much more convenient to use the choose function—especially as n and k grow larger:
<code>> choose(5,3)       # How many ways can we select 3 items from 5 items?
[1] 10
> choose(50,3)      # How many ways can we select 3 items from 50 items?
[1] 19600
> choose(50,30)     # How many ways can we select 30 items from 50 items?
[1] 4.712921e+13
</code>
These numbers are also known as **binomial coefficients**.

====== Generating Combinations ======
<code>> combn(1:5,3)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    2    2    2     3
[2,]    2    2    2    3    3    4    3    3    4     4
[3,]    3    4    5    4    5    5    4    5    5     5
</code>

The function is not restricted to numbers. We can generate combinations of strings, too. Here are all combinations of five treatments taken three at a time:
<code>> combn(c("T1","T2","T3","T4","T5"), 3)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "T1" "T1" "T1" "T1" "T1" "T1" "T2" "T2" "T2" "T3" 
[2,] "T2" "T2" "T2" "T3" "T3" "T4" "T3" "T3" "T4" "T4" 
[3,] "T3" "T4" "T5" "T4" "T5" "T5" "T4" "T5" "T5" "T5"
</code>

====== Generating Random Numbers ======

The simple case of generating **uniform random numbers** **between 0 and 1** is handled by the runif function:
<code>> runif(1)
[1] 0.5119812
</code>

Generating a vector of 10 such values:
<code>> runif(10)
 [1] 0.03475948 0.88950680 0.90005434 0.95689496 0.55829493 0.18407604
 [7] 0.87814788 0.71057726 0.11140864 0.66392239
</code>

<code>> runif(1, min=-3, max=3)          # One uniform variate between -3 and +3
[1] 2.954591
> rnorm(1)                         # One standard Normal variate
[1] 1.048491
> rnorm(1, mean=100, sd=15)        # One Normal variate, mean 100 and SD 15
[1] 108.7300
> rbinom(1, size=10, prob=0.5)     # One binomial variate
[1] 3
</code>

<code>> rpois(1, lambda=10)              # One Poisson variate
[1] 13
> rexp(1, rate=0.1)                # One exponential variate
[1] 8.430267
> rgamma(1, shape=2, rate=0.1)     # One gamma variate
[1] 20.47334
</code>


<code>> rnorm(3, mean=c(-10,0,+10), sd=1) # mean이 각 -10,0,10이고 각 mean의 sd가 1인 경우에, random score를 구할것
[1] -11.195667   2.615493  10.294831
</code>

Recycling the vector . . . 
<code>> rnorm(6, mean=c(-10,0,+10), sd=1)
[1] -11.74168122   0.56572232  11.88595452 -11.13726844
[5]   0.03274875   9.02216868
</code> 

====== Generating Reproducible Random Numbers ======
After generating random numbers, you may often want to **reproduce the same sequence of “random” numbers** every time your program executes. 

<code>> set.seed(165)          # Initialize the random number generator to a known state
> runif(10)              # Generate ten random numbers
 [1] 0.1159132 0.4498443 0.9955451 0.6106368 0.6159386 0.4261986 0.6664884
 [8] 0.1680676 0.7878783 0.4421021
> set.seed(165)          # Reinitialize to the same known state
> runif(10)              # Generate the same ten "random" numbers
 [1] 0.1159132 0.4498443 0.9955451 0.6106368 0.6159386 0.4261986 0.6664884
 [8] 0.1680676 0.7878783 0.4421021
</code> 

====== Generating a Random Sample ======
Suppose your World Series data contains a vector of years when the Series was played. You can select 10 years at random using sample:
<code>year <- c(1900:2016)     # years in vector year
world.series <- data.frame(year)
</code>

<code>> sample(world.series$year, 10)
 [1] 1906 1963 1966 1928 1905 1924 1961 1959 1927 1934
</code>

The items are randomly selected, so running sample again (usually) produces a different result:
<code>> sample(world.series$year, 10)
 [1] 1968 1947 1966 1916 1970 1961 1936 1913 1914 1958
</code>

Replacement in random sampling: Specify replace=TRUE to sample with replacement.

<code>> set.seed(121)
sample(world.series$year, 10)
 [1] 1906 1963 1966 1928 1905 1924 1961 1959 1927 1934
set.seed(121)
sample(world.series$year, 10)
 [1] 1906 1963 1966 1928 1905 1924 1961 1959 1927 1934
</code>


====== Generating Random Sequences ======
<code>> sample(set, n, replace=TRUE)

> sample(c("H","T"), 10, replace=TRUE)
 [1] "H" "H" "H" "T" "T" "H" "T" "H" "H" "T"

> sample(c(FALSE,TRUE), 20, replace=TRUE)
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE
</code>

<code>> sample(c(FALSE,TRUE), 20, replace=TRUE, prob=c(0.2,0.8))
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
</code>

====== Randomly Permuting a Vector ======
<code>> sample(1:10)
 [1]  5  8  7  4  3  9  2  6  1 10
</code>

====== Calculating Probabilities for Continuous Distributions ======
| Distribution  | Distribution function: P(X ≤ x)  |
| Normal  | pnorm(x, mean, sd)  |
| Student’s t  | pt(x, df)  |
| Exponential  | pexp(x, rate)  |
| Gamma  | pgamma(x, shape, rate)  |
| Chi-squared (χ2)  | pchisq(x, df)  |

<code>> pnorm(66, mean=70, sd=3)
[1] 0.09121122
</code>

<code>> pnorm(73, mean=70, sd=3)
[1] ??
</code>

<code>> b <- pnorm(-1)
> a <- pnorm(1)
> a-b
[1] 0.6826895

> b <- pnorm(-2)
> a <- pnorm(2)
> a-b
[1] 0.9544997

> a <- pnorm(3)
> b <- pnorm(-3)
> a-b
[1] 0.9973002
</code>

<code>
> b <- pnorm(-1.959964)
> a <- pnorm(1.959964)
> a-b
[1] 0.95
</code>

====== Converting Probabilities to Quantiles ======
<code>> qnorm(0.8413447, mean=70, sd=3)
[1] 73
</code>

<code>> pnorm(73, mean=70, sd=3)
[1] 0.8413447
</code>

<code>> qnorm(c(0.025, 0.975)) # 5% 바깥쪽의 점수는 약 +-2sd 점수인 -2, 2
[1] -1.959964  1.959964
</code>

====== Plotting a Density Function ======
<code>> x <- seq(from=-3, to=+3, length.out=100)
> plot(x, dnorm(x))
</code>

{{dnorm_x.png|Normal distribution plot}}

<code>> x <- seq(from=-3, to=+3, length.out=100)
> y <- dnorm(x)
> plot(x, y, main="Standard Normal Distribution", type='l',
+      ylab="Density", xlab="Quantile")
> abline(h=0)
</code>

{{standard_normal_distribution.png|standard_normal_distribution}}

<code>> # The body of the polygon follows the density curve where 1 <= z <= 2
> region.x <- x[1 <= x & x <= 2]
> region.y <- y[1 <= x & x <= 2]
> 
> # We add initial and final segments, which drop down to the Y axis
> region.x <- c(region.x[1], region.x, tail(region.x,1))
> region.y <- c(          0, region.y,                0)
> polygon(region.x, region.y, density=-1, col="red")
</code>

{{standard_normal_distribution_1-2.png|standard_normal_distribution}}