estimated_standard_deviation
====== Intuitive Understanding ======
We learned that variance is $SS/df$. The reason for using n-1 (rather than n) as the df when estimating the population variance from a sample is that

\begin{eqnarray*}
\sum_{i=1}^{n} {(X_{i}-\mu)^2} > \sum_{i=1}^{n} {(X_{i}-\overline{X})^2}
\end{eqnarray*}

tends to hold. Because the SS taken around the sample mean understates the SS around the population mean, we divide by n-1 instead of n to "inflate" the estimate and compensate.

Below is an example using a set k with 20 elements.

We know that the mean and the variance of this set are 8.95 and 27.2475, respectively. Here the variance 27.2475 is the SS value divided by N.

Suppose we take a sample of 3 elements from the population above and obtain S1 = {4, 11, 18}, whose mean is 11. If, when estimating the population variance from this sample, we knew the mean of the population (N=20), we would compute:
| s1 | $\mu$ | deviation score | ds<sup>2</sup> |
| 4 | 8.95 | -4.95 | 24.5025 |
| 11 | 8.95 | 2.05 | 4.2025 |
| 18 | 8.95 | 9.05 | 81.9025 |
| | | SS | 110.6075 |

This SS value, based on the population mean, is 110.6075. In practice, however, the population mean is unknown, so we use the sample mean instead:
| s1 | $\overline{X}$ | deviation score | ds<sup>2</sup> |
| 4 | 11 | -7 | 49 |
| 11 | 11 | 0 | 0 |
| 18 | 11 | 7 | 49 |
| | | SS | 98 |

The SS value obtained this way, 98, is smaller than the SS value computed around the population mean, 110.6075. Dividing this smaller SS by n would therefore understate the population variance, which is why n-1 is used instead.
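The arithmetic in the two tables can be verified directly. The short check below uses Python rather than the R used elsewhere on this page; the values are taken straight from the tables above.

```python
s1 = [4, 11, 18]
mu = 8.95                  # population mean (known here only for illustration)
xbar = sum(s1) / len(s1)   # sample mean = 11.0

# SS around the population mean vs. SS around the sample mean
ss_mu = sum((x - mu) ** 2 for x in s1)
ss_xbar = sum((x - xbar) ** 2 for x in s1)

print(round(ss_mu, 4))   # 110.6075
print(ss_xbar)           # 98.0
```

As the tables show, the SS around the sample mean is the smaller of the two.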
<code>
############
set.seed(1010)
n.pop <- 20
k <- sample(1:20, n.pop, replace = TRUE)
k
k.mean <- mean(k)
k.pvar <- var(k)*((n.pop-1)/n.pop)
k.mean
k.pvar

############
n.samp <- 3
ks <- sample(k, n.samp)
ks
ks.mean <- mean(ks)
ks.var <- var(ks)
ks.pvar <- var(ks)*((n.samp-1)/n.samp)
############
ks-k.mean
ks-ks.mean
sum((ks-k.mean)^2)
sum((ks-ks.mean)^2)
</code>
<code>
############
set.seed(3) # another sample
n.samp <- 3
ks <- sample(k, n.samp)
ks
ks.mean <- mean(ks)
ks.var <- var(ks)
ks.pvar <- var(ks)*((n.samp-1)/n.samp)
############
ks-k.mean
ks-ks.mean
sum((ks-k.mean)^2)
sum((ks-ks.mean)^2)

############
set.seed(5) # another sample
n.samp <- 3
ks <- sample(k, n.samp)
ks
ks.mean <- mean(ks)
ks.var <- var(ks)
ks.pvar <- var(ks)*((n.samp-1)/n.samp)
############
ks-k.mean
ks-ks.mean
sum((ks-k.mean)^2)
sum((ks-ks.mean)^2)

############
set.seed(7) # another sample
n.samp <- 3
ks <- sample(k, n.samp)
ks
ks.mean <- mean(ks)
ks.var <- var(ks)
ks.pvar <- var(ks)*((n.samp-1)/n.samp)
############
ks-k.mean
ks-ks.mean
sum((ks-k.mean)^2)
sum((ks-ks.mean)^2)
</code>
<code>
> ############
> set.seed(1010)
> n.pop <- 20
> k <- sample(1:20, n.pop, replace = TRUE)
> k
> k.mean <- mean(k)
> k.pvar <- var(k)*((n.pop-1)/n.pop)
> k.mean
[1] 8.95
> k.pvar
[1] 27.2475
> ############
> n.samp <- 3
> ks <- sample(k, n.samp)
> ks
[1] 11 13 18
> ks.mean <- mean(ks)
> ks.var <- var(ks)
> ks.pvar <- var(ks)*((n.samp-1)/n.samp)
> ############
> ks-k.mean
[1] 2.05 4.05 9.05
> ks-ks.mean
[1] -3 -1  4
> sum((ks-k.mean)^2)
[1] 102.5075
> sum((ks-ks.mean)^2)
[1] 26
</code>
<code>
> ############
> set.seed(3) # another sample
> n.samp <- 3
> ks <- sample(k, n.samp)
> ks
[1]  4 11 18
> ks.mean <- mean(ks)
> ks.var <- var(ks)
> ks.pvar <- var(ks)*((n.samp-1)/n.samp)
> ############
> ks-k.mean
[1] -4.95  2.05  9.05
> ks-ks.mean
[1] -7  0  7
> sum((ks-k.mean)^2)
[1] 110.6075
> sum((ks-ks.mean)^2)
[1] 98
>
> ############
> set.seed(5) # another sample
> n.samp <- 3
> ks <- sample(k, n.samp)
> ks
[1]  4  5 18
> ks.mean <- mean(ks)
> ks.var <- var(ks)
> ks.pvar <- var(ks)*((n.samp-1)/n.samp)
> ############
> ks-k.mean
[1] -4.95 -3.95  9.05
> ks-ks.mean
[1] -5 -4  9
> sum((ks-k.mean)^2)
[1] 122.0075
> sum((ks-ks.mean)^2)
[1] 122
>
> ############
> set.seed(7) # another sample
> n.samp <- 3
> ks <- sample(k, n.samp)
> ks
[1] 11  5 18
> ks.mean <- mean(ks)
> ks.var <- var(ks)
> ks.pvar <- var(ks)*((n.samp-1)/n.samp)
> ############
> ks-k.mean
[1]  2.05 -3.95  9.05
> ks-ks.mean
[1] -0.3333333 -6.3333333  6.6666667
> sum((ks-k.mean)^2)
[1] 101.7075
> sum((ks-ks.mean)^2)
[1] 84.66667
>
</code>
In the code above, ''sum((ks-k.mean)^2)'' corresponds to $\sum({X_{i}-\mu})^{2}$ and ''sum((ks-ks.mean)^2)'' corresponds to $\sum({X_{i}-\overline{X}})^{2}$. Looking at the cases above, there is a tendency for
$\sum({X_{i}-\mu})^{2} > \sum({X_{i}-\overline{X}})^{2}$ to hold.
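This tendency, and the fact that dividing SS by n-1 corrects for it on average, can be checked by simulation. The sketch below mirrors the logic of the R code above in Python; the population here is an arbitrary stand-in (the seed and range are illustrative assumptions, not the k set above), and sampling is done with replacement so that SS/(n-1) is exactly unbiased for the population variance.

```python
import random

# Stand-in population of 20 values (NOT the k set from the listing above).
random.seed(1010)
pop = [random.randint(1, 20) for _ in range(20)]
mu = sum(pop) / len(pop)
pop_var = sum((x - mu) ** 2 for x in pop) / len(pop)  # SS/N

n = 3
trials = 20000
avg_ss_n = 0.0   # running average of SS/n
avg_ss_n1 = 0.0  # running average of SS/(n-1)
for t in range(1, trials + 1):
    s = random.choices(pop, k=n)          # sample of 3, with replacement
    xbar = sum(s) / n
    ss = sum((x - xbar) ** 2 for x in s)  # SS around the sample mean
    avg_ss_n += (ss / n - avg_ss_n) / t
    avg_ss_n1 += (ss / (n - 1) - avg_ss_n1) / t

print(pop_var)    # population variance, SS/N
print(avg_ss_n)   # SS/n falls clearly below pop_var on average
print(avg_ss_n1)  # SS/(n-1) lands near pop_var on average
```

Over many samples, SS/n systematically understates the population variance, while SS/(n-1) is centered on it.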
This can also be shown in a figure: the deviations of the sample elements are measured once from the population mean (the green vertical line below) and once from the sample mean.

(figure: deviations from the population mean (green vertical line) vs. deviations from the sample mean)
====== Finding the v value that minimizes SS in R ======
<code>
# finding the v value that minimizes SS
#
rm(list=ls())
rnorm2 <- function(n, mean, sd) {
  mean+sd*scale(rnorm(n))
}
# set.seed(191)
nx <- 20
mx <- 50
sdx <- mx * 0.15
x <- rnorm2(nx, mx, sdx)

mean(x)
sd(x)
length(x)
hist(x)

x.span <- seq(from = mean(x)-6*sd(x),
              to = mean(x)+6*sd(x),
              by = 0.1)

residuals <- function(x, v) {
  return(x - v)
}

ssr <- function(x, v) {
  residuals <- (x - v)
  return(sum(residuals^2))
}

msr <- function(x, v) {
  residuals <- (x - v)
  # return((sum(residuals^2))/(length(x)-1))
  return((mean(residuals^2)))
}

srs <- c()   # sum of residuals
ssrs <- c()  # sum of square residuals
msrs <- c()  # mean square residuals = variance
vs <- c()    # the value of v in (x - v)

for (i in x.span) {
  res.x <- residuals(x, i)
  srs.x <- sum(res.x)
  ssr.x <- ssr(x, i)
  msr.x <- msr(x, i)
  srs <- append(srs, srs.x)
  ssrs <- append(ssrs, ssr.x)
  msrs <- append(msrs, msr.x)
  vs <- append(vs, i)
}
plot(srs)
plot(msrs)
plot(srs)

min(msrs)
min.pos.msrs <- which(msrs == min(msrs))
min.pos.msrs
print(vs[min.pos.msrs])

plot(vs, msrs)
plot(vs, srs)



# the above used no gradient (plain grid search)
# below, compute with mse rather than sse
# (sse values get too large)

gradient <- function(x, v){
  residuals = x - v
  dx = -2 * mean(residuals)
  return(list("ds" = dx))
} # function returns ds value

residuals <- function(x, v) {
  return(x - v)
}

ssr <- function(x, v) {
  residuals <- (x - v)
  return(sum(residuals^2))
}

msr <- function(x, v) {
  residuals <- (x - v)
  return((sum(residuals^2))/(length(x)-1))
  # return(mean(residuals^2))
}

# pick one random v in (x-v)
v <- rnorm(1)
# Train the model with scaled features
learning.rate = 1e-1

ssrs <- c()
msrs <- c()
mres <- c()
vs <- c()
# Record Loss for each epoch:
zx <- (x-mean(x))/sd(x)

nlen <- 100
for (epoch in 1:nlen) {
  residual <- residuals(zx, v)
  ssr.x <- ssr(zx, v)
  msr.x <- msr(zx, v)
  ssrs <- append(ssrs, ssr.x)
  msrs <- append(msrs, msr.x)

  grad <- gradient(zx, v)

  step.v <- grad$ds * learning.rate
  v <- v - step.v
  vs <- append(vs, v)
}

tail(srs)
tail(msrs)
tail(ssrs)
tail(vs)

plot(srs)
plot(msrs)
plot(ssrs)
plot(vs)
# scaled
v
v.orig <- (v*sd(x))+mean(x)
v.orig
</code>
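The first half of the code above is a plain grid search: evaluate the mean squared residual $MSE(v) = \frac{1}{n}\sum(x_i - v)^2$ over a range of candidate v values and pick the minimum. The same idea in a compact Python sketch, with arbitrary data (not the x generated above):

```python
# Grid search: the v minimizing MSE(v) = mean((x - v)^2) is the mean of x.
x = [4.0, 11.0, 18.0, 7.0, 10.0]   # arbitrary data, mean = 10.0

best_v, best_mse = None, float("inf")
v = -50.0
while v <= 50.0:
    mse = sum((xi - v) ** 2 for xi in x) / len(x)
    if mse < best_mse:
        best_v, best_mse = v, mse
    v = round(v + 0.1, 1)   # step of 0.1, rounded to avoid float drift

print(best_v)     # 10.0 = mean(x)
print(best_mse)   # 22.0 = SS/n evaluated at the mean
```

The minimal MSE value is exactly SS/n, which is why the variance can be read as "the smallest possible mean squared deviation".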
====== output ======
<code>
> # finding the v value that minimizes SS
> #
>
> rm(list=ls())
> rnorm2 <- function(n, mean, sd) {
+   mean+sd*scale(rnorm(n))
+ }
>
> # set.seed(191)
> nx <- 20
> mx <- 50
> sdx <- mx * 0.15
> x <- rnorm2(nx, mx, sdx)
>
> mean(x)
[1] 50
> sd(x)
[1] 7.5
> length(x)
[1] 20
> hist(x)
>
> x.span <- seq(from = mean(x)-6*sd(x),
+               to = mean(x)+6*sd(x),
+               by = 0.1)
>
> residuals <- function(x, v) {
+   return(x - v)
+ }
>
> ssr <- function(x, v) {
+   residuals <- (x - v)
+   return(sum(residuals^2))
+ }
>
> msr <- function(x, v) {
+   residuals <- (x - v)
+   # return((sum(residuals^2))/(length(x)-1))
+   return((mean(residuals^2)))
+ }
>
> srs <- c()   # sum of residuals
> ssrs <- c()  # sum of square residuals
> msrs <- c()  # mean square residuals = variance
> vs <- c()    # the value of v in (x - v)
>
> for (i in x.span) {
+   res.x <- residuals(x, i)
+   srs.x <- sum(res.x)
+   ssr.x <- ssr(x, i)
+   msr.x <- msr(x, i)
+   srs <- append(srs, srs.x)
+   ssrs <- append(ssrs, ssr.x)
+   msrs <- append(msrs, msr.x)
+   vs <- append(vs, i)
+ }
> plot(srs)
> plot(msrs)
> plot(srs)
>
> min(msrs)
[1] 53.4375
> min.pos.msrs <- which(msrs == min(msrs))
> min.pos.msrs
[1] 451
> print(vs[min.pos.msrs])
[1] 50
>
> plot(vs, msrs)
> plot(vs, srs)
>
>
> # the above used no gradient (plain grid search)
> # below, compute with mse rather than sse
> # (sse values get too large)
>
> gradient <- function(x, v){
+   residuals = x - v
+   dx = -2 * mean(residuals)
+   return(list("ds" = dx))
+ } # function returns ds value
>
> residuals <- function(x, v) {
+   return(x - v)
+ }
>
> ssr <- function(x, v) {
+   residuals <- (x - v)
+   return(sum(residuals^2))
+ }
>
> msr <- function(x, v) {
+   residuals <- (x - v)
+   return((sum(residuals^2))/(length(x)-1))
+   # return(mean(residuals^2))
+ }
>
> # pick one random v in (x-v)
> v <- rnorm(1)
> # Train the model with scaled features
> learning.rate = 1e-1
>
> ssrs <- c()
> msrs <- c()
> mres <- c()
> vs <- c()
> # Record Loss for each epoch:
> zx <- (x-mean(x))/sd(x)
>
> nlen <- 100
> for (epoch in 1:nlen) {
+   residual <- residuals(zx, v)
+   ssr.x <- ssr(zx, v)
+   msr.x <- msr(zx, v)
+   ssrs <- append(ssrs, ssr.x)
+   msrs <- append(msrs, msr.x)
+
+   grad <- gradient(zx, v)
+
+   step.v <- grad$ds * learning.rate
+   v <- v - step.v
+   vs <- append(vs, v)
+ }
>
> tail(srs)
[1] -890 -892 -894 -896 -898 -900
> tail(msrs)
[1] 1 1 1 1 1 1
> tail(ssrs)
[1] 19 19 19 19 19 19
> tail(vs)
[1] 2.936258e-11 2.349006e-11 1.879204e-11 1.503363e-11 1.202690e-11 9.621523e-12
>
> plot(srs)
> plot(msrs)
> plot(ssrs)
> plot(vs)
> # scaled
> v
[1] 9.621523e-12
> v.orig <- (v*sd(x))+mean(x)
> v.orig
[1] 50
>
</code>
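The gradient-descent loop shown in the code and output above can be condensed into a few lines. The Python sketch below implements the same update, v ← v − lr · dMSE/dv with dMSE/dv = −2 · mean(x − v), on arbitrary unscaled data; it converges to mean(x), the minimizer of the MSE.

```python
# Gradient descent on MSE(v) = mean((x - v)^2).
x = [4.0, 11.0, 18.0, 7.0, 10.0]   # arbitrary data, mean = 10.0

v = 0.0    # arbitrary starting value
lr = 0.1   # learning rate
for epoch in range(200):
    grad = -2 * sum(xi - v for xi in x) / len(x)  # dMSE/dv at current v
    v -= lr * grad

mse = sum((xi - v) ** 2 for xi in x) / len(x)
print(round(v, 6))    # 10.0, i.e. mean(x)
print(round(mse, 6))  # 22.0, i.e. SS/n at the mean
```

Each step moves v a fraction of the way toward the mean (v ← 0.8·v + 0.2·mean(x) here), so convergence is geometric; standardizing x first, as the R code does with zx, simply makes the target 0 on the z scale.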
====== tags ======
{{tag>"
estimated_standard_deviation.1694570323.txt.gz · Last modified: 2023/09/13 10:58 by hkimscil