====== Wald Test ======

The Wald test checks whether the coefficients of a regression model are significant. It does much the same job as the t-test on a single regression coefficient, but it can test several coefficients at once.

  * H0: The coefficients of some set of predictor variables are all equal to zero.
  * HA: Not all coefficients in the set are equal to zero.

<code>
# fit regression model
model <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)

# view model summary
summary(model)

Call:
lm(formula = mpg ~ disp + carb + hp + cyl, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.0761 -1.5752 -0.2051  1.0745  6.3047 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.021595   2.523397  13.482 1.65e-13 ***
disp        -0.026906   0.011309  -2.379   0.0247 *  
carb        -0.926863   0.578882  -1.601   0.1210    
hp           0.009349   0.020701   0.452   0.6551    
cyl         -1.048523   0.783910  -1.338   0.1922    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.973 on 27 degrees of freedom
Multiple R-squared:  0.788,  Adjusted R-squared:  0.7566 
F-statistic: 25.09 on 4 and 27 DF,  p-value: 9.354e-09
</code>

<code>
# coefficients
coef(model)
 (Intercept)         disp         carb           hp          cyl 
34.021594525 -0.026906182 -0.926863291  0.009349208 -1.048522632 
# term 1, 2, 3, 4, 5 (the intercept counts as term 1)
</code>

<code>
install.packages("aod")
library(aod)

# wald.test(Sigma, b, Terms)
# perform a Wald test of whether the 3rd and 4th coefficients
# (carb and hp, since the intercept is term 1) are both zero
wald.test(Sigma = vcov(model), b = coef(model), Terms = 3:4)

Wald test:
----------

Chi-squared test:
X2 = 3.6, df = 2, P(> X2) = 0.16
</code>

Arguments of ''wald.test(Sigma, b, Terms)'':
  * Sigma: The variance-covariance matrix of the regression model
  * b: A vector of regression coefficients from the model
  * Terms: A vector that specifies which coefficients to test, indexed by position in ''b''

====== Wald test in logistic regression ======

The simulation below draws gene status and cancer status independently of each other, so the true odds ratio is 1. The histogram shows the sampling distribution of the log odds ratio under that null.

<code>
odds <- function(p) p / (1 - p)
odds.ratio <- function(p1, p2) odds(p1) / odds(p2)
logit <- function(p) log(p / (1 - p))
ilogit <- function(x) exp(x) / (1 + exp(x))

iter <- 10000      # number of simulated data sets
n <- 350           # sample size of each data set
p.cancer <- 0.08   # probability of cancer
p.mutant <- 0.39   # probability of a mutated gene

logor <- rep(NA, iter)
pp0 <- rep(NA, iter)
pp1 <- rep(NA, iter)
op0 <- rep(NA, iter)
op1 <- rep(NA, iter)
or <- rep(NA, iter)

for (i in 1:iter) {
  # draw cancer status and gene status independently
  u <- runif(n, 0, 1)
  canc <- ifelse(u >= p.cancer, "nocancer", "cancer")
  u <- runif(n, 0, 1)
  gene <- ifelse(u >= p.mutant, "norm", "mutated")

  da <- data.frame(gene, canc)
  tab <- table(da)   # rows: mutated, norm; columns: cancer, nocancer

  pp0[i] <- tab[1, 1] / (tab[1, 1] + tab[1, 2])   # P(cancer | mutated)
  pp1[i] <- tab[2, 1] / (tab[2, 1] + tab[2, 2])   # P(cancer | norm)
  op0[i] <- odds(pp0[i])
  op1[i] <- odds(pp1[i])
  or[i] <- odds.ratio(pp0[i], pp1[i])
  # stats <- c(pp0, pp1, op0, op1, ortemp)
  logor[i] <- log(or[i])
}

# a group with zero cancer cases gives an infinite log odds ratio; drop those before plotting
hist(logor[is.finite(logor)], breaks = 50)
</code>
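
The simulation above stops at the log odds ratio and does not run a Wald test itself. As a minimal sketch of how the two connect, the code below draws a single data set with the same ''n'', ''p.cancer'' and ''p.mutant'', fits a logistic regression with ''glm'', and computes the Wald statistic for the gene coefficient by hand. The seed and the names ''cancer'', ''mutated'', ''fit'', ''wald_z'', ''wald_x2'' and ''p_wald'' are assumptions made for this illustration and are not part of the original notes.

```r
# One simulated data set using the same settings as the loop above.
set.seed(1)                      # arbitrary seed, chosen only for reproducibility
n <- 350
p.cancer <- 0.08
p.mutant <- 0.39

cancer  <- as.integer(runif(n) < p.cancer)   # 1 = cancer, 0 = no cancer
mutated <- as.integer(runif(n) < p.mutant)   # 1 = mutated gene, 0 = normal

# Logistic regression of cancer status on gene status.
fit <- glm(cancer ~ mutated, family = binomial)

# Wald test for the 'mutated' coefficient, computed by hand.
b       <- coef(fit)["mutated"]                    # estimated log odds ratio
se      <- sqrt(vcov(fit)["mutated", "mutated"])   # its standard error
wald_z  <- b / se                                  # z form of the Wald statistic
wald_x2 <- wald_z^2                                # chi-squared form, df = 1
p_wald  <- pchisq(wald_x2, df = 1, lower.tail = FALSE)

# Compare with the z value and p-value that summary() reports.
summary(fit)$coefficients["mutated", ]
c(z = unname(wald_z), X2 = unname(wald_x2), p = unname(p_wald))
```

With a single binary predictor, the fitted coefficient is the sample log odds ratio, the same quantity the loop stores in ''logor''. If the ''aod'' package is loaded, ''wald.test(Sigma = vcov(fit), b = coef(fit), Terms = 2)'' should report the same chi-squared statistic as ''wald_x2''.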