Differences

This shows you the differences between two versions of the page.

--- r:multiple_regression [2020/12/01 14:55] – [Partial, Semi-partial Correlation and R squared value] hkimscil
+++ r:multiple_regression [2023/10/19 08:23] (current) – hkimscil
@@ Line 104: / Line 104: @@
   * unemployment rate (UNEM) = 9%, 12%, 3%
   * spring high school graduating class (HGRAD) = 100000, 98000, 78000
-  * a per capita income (INC) of $30,000, $2800, $36000
+  * a per capita income (INC) of \$30000, \$28000, \$36000
   * Given these values, how can enrollment be predicted?
@@ Line 111: / Line 111: @@
 Substitute the values above into the equation.
+<code>
 new.data <- data.frame(UNEM=c(9, 12, 3), HGRAD=c(100000, 98000, 78000), INC=c(30000, 28000, 36000))
 predict(three.predictor.model, newdata=new.data)
+</code>
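As a minimal sketch of what ''predict()'' does with new data, the example below checks it against the regression equation evaluated by hand. It uses the built-in ''mtcars'' data and a hypothetical two-predictor model (both are assumptions for illustration, not the enrollment data or ''three.predictor.model'').

```r
# Sketch on the built-in mtcars data (an assumption; not the enrollment data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical new observations, one row per case to predict
new.data <- data.frame(wt = c(2.5, 3.0), hp = c(100, 150))

# predict() evaluates intercept + b1 * x1 + b2 * x2 for each row
pred <- predict(fit, newdata = new.data)

# The same values computed directly from the coefficients
b <- coef(fit)
by.hand <- b["(Intercept)"] + b["wt"] * new.data$wt + b["hp"] * new.data$hp

# Compare the two results, ignoring element names
all.equal(unname(pred), unname(by.hand))
```

The column names in ''new.data'' have to match the predictor names in the model formula, which is why the enrollment example names its columns UNEM, HGRAD and INC.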
 <code>
@@ Line 129: / Line 131: @@
 \end{align*}
-[[:sequential_regression#eg_3_college_enrollment_in_new_mexico_university|Sequential method]]
+Examining the beta coefficients
+see [[:beta coefficients]]
+<code>
+# install.packages('lm.beta')   # run once if the package is not installed
+library(lm.beta)
+lm.beta(three.predictor.model)
+</code>
+<code>
+> # install.packages('lm.beta')   # run once if the package is not installed
+> library(lm.beta)
+> lm.beta(three.predictor.model)
+Call:
+lm(formula = ROLL ~ UNEM + HGRAD + INC, data = datavar)
+Standardized Coefficients::
+(Intercept)        UNEM       HGRAD         INC
+  0.0000000   0.1553619   0.3656177   0.6061762
+>
+</code>
+by hand
+<code>
+# beta = coefficient * (sd(x) / sd(y)), so:
+#
+attach(datavar)
+sd.roll <- sd(ROLL)
+sd.unem <- sd(UNEM)
+sd.hgrad <- sd(HGRAD)
+sd.inc <- sd(INC)
+b.unem <- three.predictor.model$coefficients[2]
+b.hgrad <- three.predictor.model$coefficients[3]
+b.inc <- three.predictor.model$coefficients[4]
+## or
+b.unem <- 4.501e+02
+b.hgrad <- 4.065e-01
+b.inc <- 4.275e+00
+b.unem * (sd.unem / sd.roll)
+b.hgrad * (sd.hgrad / sd.roll)
+b.inc * (sd.inc / sd.roll)
+lm.beta(three.predictor.model)
+</code>
+output of the above
+<code>
+> sd.roll <- sd(ROLL)
+> sd.unem <- sd(UNEM)
+> sd.hgrad <- sd(HGRAD)
+> sd.inc <- sd(INC)
+>
+> b.unem <- three.predictor.model$coefficients[2]
+> b.hgrad <- three.predictor.model$coefficients[3]
+> b.inc <- three.predictor.model$coefficients[4]
+>
+> ## or
+> b.unem <- 4.501e+02
+> b.hgrad <- 4.065e-01
+> b.inc <- 4.275e+00
+>
+>
+> b.unem * (sd.unem / sd.roll)
+[1] 0.1554
+> b.hgrad * (sd.hgrad / sd.roll)
+[1] 0.3656
+> b.inc * (sd.inc / sd.roll)
+[1] 0.6062
+>
+> lm.beta(three.predictor.model)
+Call:
+lm(formula = ROLL ~ UNEM + HGRAD + INC, data = datavar)
+Standardized Coefficients::
+(Intercept)        UNEM       HGRAD         INC
+     0.0000      0.1554      0.3656      0.6062
+>
+</code>
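The by-hand formula above can also be cross-checked without ''lm.beta'': refitting the model on z-scored variables should give the same standardized slopes. The sketch below illustrates this on the built-in ''mtcars'' data with a hypothetical two-predictor model (assumptions for illustration, not the enrollment data).

```r
# Sketch on the built-in mtcars data (an assumption; not the enrollment data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Standardized coefficients by hand: b * sd(x) / sd(y)
b <- coef(fit)[-1]                        # drop the intercept
sd.x <- sapply(mtcars[, names(b)], sd)    # sd of each predictor
beta.by.hand <- b * sd.x / sd(mtcars$mpg)

# The same quantities from a regression on z-scored variables
z <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))
fit.z <- lm(mpg ~ wt + hp, data = z)
beta.scaled <- coef(fit.z)[-1]

# Both routes should agree up to floating-point error
all.equal(beta.by.hand, beta.scaled)
```

In the z-scored fit the intercept is zero up to rounding error, which matches the zero intercept in the standardized output.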
+see also [[:sequential_regression#eg_3_college_enrollment_in_new_mexico_university|Sequential method]] for building the regression model by hand
+see also [[:statistical regression methods]] for building the regression model computationally
+<code>
+> library(MASS)   # stepAIC() is provided by the MASS package
+> fit <- three.predictor.model
+> step <- stepAIC(fit, direction="both")
+Start:  AIC=381.2
+ROLL ~ UNEM + HGRAD + INC
+        Df Sum of Sq      RSS AIC
+<none>               11237313 381
+- UNEM   1   6522098 17759411 392
+- HGRAD  1  12852039 24089352 401
+- INC    1  33568255 44805568 419
+>
+</code>
 ====== Housing ======
 {{housing.txt}}
@@ Line 137: / Line 239: @@
 ====== etc ======
+{{:marketing_from_datarium.csv}}
 <code>
+marketing <- read.csv("http://commres.net/wiki/_media/marketing_from_datarium.csv")
+</code>
+<code>
+# install.packages("tidyverse", dep=TRUE)
 library(tidyverse)
 data("marketing", package = "datarium")
@@ Line 145: / Line 253: @@
   * Note that to use all the other variables as independent (explanatory) variables, you can write ''lm(sales ~ ., data = marketing)''.
   * You can also use the ''-'' sign to exclude IVs: ''lm(sales ~ . - newspaper, data = marketing)''.
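A small sketch of the two formula shortcuts in the list above, run on a subset of the built-in ''mtcars'' data (an assumption for illustration, not the marketing data). Note that ''data'' takes the data frame itself, not its name as a quoted string.

```r
# Sketch on a subset of built-in mtcars (an assumption; not the marketing data).
d <- mtcars[, c("mpg", "wt", "hp", "qsec")]

# "." expands to every column of d other than the response
fit.all <- lm(mpg ~ ., data = d)

# "- qsec" removes one predictor from that expansion
fit.drop <- lm(mpg ~ . - qsec, data = d)

# Inspect which terms each model ended up with
names(coef(fit.all))
names(coef(fit.drop))
```

The first model keeps ''wt'', ''hp'' and ''qsec'' as predictors; the second keeps only ''wt'' and ''hp''.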
 <code>