multiple_regression
Differences
This shows you the differences between two versions of the page.
multiple_regression [2019/05/23 10:20] – [Why overall model is significant while IVs are not?] hkimscil | multiple_regression [2023/10/19 08:39] (current) – [Determining IVs' role] hkimscil | ||
---|---|---|---|
Line 44: | Line 44: | ||
====== e.g.====== | ====== e.g.====== | ||
Data set again. | Data set again. | ||
+ | < | ||
+ | datavar <- read.csv(" | ||
^ DATA for regression analysis | ^ DATA for regression analysis | ||
Line 166: | Line 168: | ||
====== e.g., ====== | ====== e.g., ====== | ||
DATA: \\ | DATA: \\ | ||
- | <wrap indent> | + | <wrap indent> |
+ | {{: | ||
+ | </ | ||
The Academic Performance Index (**API**) is a measurement of //academic performance and progress of individual schools in California, United States//. It is one of the main components of the Public Schools Accountability Act passed by the California legislature in 1999. API scores range from a low of 200 to a high of 1000. [[https:// | The Academic Performance Index (**API**) is a measurement of //academic performance and progress of individual schools in California, United States//. It is one of the main components of the Public Schools Accountability Act passed by the California legislature in 1999. API scores range from a low of 200 to a high of 1000. [[https:// | ||
Line 242: | Line 246: | ||
| | number of students | | | number of students | ||
| a. Dependent Variable: api 2000 ||||||| | | a. Dependent Variable: api 2000 ||||||| | ||
+ | |||
====== e.g., ====== | ====== e.g., ====== | ||
Line 328: | Line 333: | ||
</ | </ | ||
- | ====== Why overall model is significant while IVs are not? ====== | + | ===== in R ===== |
- | see https://www.researchgate.net/post/Why_is_the_Multiple_regression_model_not_significant_while_simple_regression_for_the_same_variables_is_significant | + | < |
+ | mod <- lm(api00 ~ ell + acs_k3 + avg_ed + meals, data=dvar) | ||
+ | summary(mod) | ||
+ | anova(mod) | ||
+ | </ | ||
< | < | ||
- | RSS = 3:10 #Right shoe size | + | dvar <- read.csv(" |
- | LSS = rnorm(RSS, RSS, 0.1) #Left shoe size - similar to RSS | + | > mod <- lm(api00 ~ ell + acs_k3 + avg_ed |
- | cor(LSS, RSS) # | + | > summary(mod) |
- | + | ||
- | weights = 120 + rnorm(RSS, 10*RSS, 10) | + | |
- | + | ||
- | ##Fit a joint model | + | |
- | m = lm(weights ~ LSS + RSS) | + | |
- | + | ||
- | ##F-value is very small, but neither LSS or RSS are significant | + | |
- | summary(m) | + | |
- | </code> | + | |
- | + | ||
- | + | ||
- | <code>> | + | |
- | > LSS = rnorm(RSS, RSS, 0.1) #Left shoe size - similar to RSS | + | |
- | > cor(LSS, RSS) # | + | |
- | [1] 0.9994836 | + | |
- | > | + | |
- | > weights = 120 + rnorm(RSS, 10*RSS, 10) | + | |
- | > | + | |
- | > ##Fit a joint model | + | |
- | > m = lm(weights ~ LSS + RSS) | + | |
- | > | + | |
- | > ##F-value is very small, but neither LSS or RSS are significant | + | |
- | > summary(m) | + | |
Call: | Call: | ||
- | lm(formula = weights | + | lm(formula = api00 ~ ell + acs_k3 + avg_ed + meals, data = dvar) |
Residuals: | Residuals: | ||
- | | + | |
- | 4.8544 4.5254 | + | -187.020 -40.358 -0.313 36.155 173.697 |
Coefficients: | Coefficients: | ||
Estimate Std. Error t value Pr(> | Estimate Std. Error t value Pr(> | ||
- | (Intercept) | + | (Intercept) |
- | LSS -14.162 | + | ell -0.8434 |
- | RSS 26.305 | + | acs_k3 |
+ | avg_ed | ||
+ | meals -2.9374 | ||
--- | --- | ||
Signif. codes: | Signif. codes: | ||
- | Residual standard error: | + | Residual standard error: |
- | Multiple R-squared: | + | (21 observations deleted due to missingness) |
- | F-statistic: | + | Multiple R-squared: |
+ | F-statistic: | ||
+ | > anova(mod) | ||
+ | Analysis of Variance Table | ||
+ | |||
+ | Response: api00 | ||
+ | | ||
+ | ell 1 4502711 4502711 1309.762 < 2.2e-16 *** | ||
+ | acs_k3 | ||
+ | avg_ed | ||
+ | meals | ||
+ | Residuals 374 1285740 | ||
+ | --- | ||
+ | Signif. codes: | ||
> | > | ||
- | > ##Fitting RSS or LSS separately gives a significant result. | + | </code> |
- | > summary(lm(weights ~ LSS)) | + | |
+ | < | ||
Call: | Call: | ||
- | lm(formula = weights | + | lm(formula = api00 ~ ell + acs_k3 + avg_ed + meals, data = dvar) |
- | + | ||
- | Residuals: | + | |
- | | + | |
- | -6.055 -4.930 -2.925 | + | |
Coefficients: | Coefficients: | ||
- | Estimate Std. Error t value Pr(> | + | (Intercept) |
- | (Intercept) | + | 709.6388 -0.8434 3.3884 29.0724 |
- | LSS | + | |
- | --- | + | |
- | Signif. codes: | + | |
- | Residual standard error: 7.026 on 6 degrees of freedom | + | ></ |
- | Multiple R-squared: | + | $$ \hat{Y} = 709.6388 + -0.8434 \text{ell} + 3.3884 \text{acs_k3} + 29.0724 \text{avg_ed} + -2.9374 \text{meals} \\$$ |
- | F-statistic: | + | |
- | > | + | Then how much explanatory power is unique to each independent variable? |
- | </ | + | |
Line 427: | Line 420: | ||
| | Standard Multiple | | | Standard Multiple | ||
- | | r< | + | | r< |
| ::: | IV< | | ::: | IV< | ||
- | | sr< | + | | sr< |
| ::: | IV< | | ::: | IV< | ||
- | | pr< | + | | pr< |
| ::: | IV< | | ::: | IV< | ||
| IV< | | IV< | ||
Line 454: | Line 447: | ||
Multicollinearity problem = when tolerance < .10 or when VIF > 10 | Multicollinearity problem = when tolerance < .10 or when VIF > 10 | ||
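Tolerance and VIF can be computed by hand from an auxiliary regression: the tolerance of a predictor is 1 minus the R-squared obtained when that predictor is regressed on the other predictors, and VIF is the reciprocal of tolerance. The following is a minimal sketch on simulated data; the variables `x1`, `x2`, and `x3` are made up for illustration and are not the variables of the example data set.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)  # built to be strongly related to x1
x3 <- rnorm(n)                       # built to be unrelated to x1 and x2

# Auxiliary regression: how well do the other predictors explain x1?
r2_x1 <- summary(lm(x1 ~ x2 + x3))$r.squared

tolerance <- 1 - r2_x1      # share of x1's variance not shared with x2, x3
vif       <- 1 / tolerance  # variance inflation factor for x1

print(c(tolerance = tolerance, VIF = vif))
```

Repeating the auxiliary regression with each predictor in turn as the outcome gives one tolerance and one VIF per predictor.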
+ | ====== elem e.g. again ====== | ||
+ | < | ||
+ | dvar <- read.csv(" | ||
+ | mod <- lm(api00 ~ ell + acs_k3 + avg_ed + meals, data=dvar) | ||
+ | summary(mod) | ||
+ | anova(mod) | ||
+ | </ | ||
+ | < | ||
+ | dvar <- read.csv(" | ||
+ | > mod <- lm(api00 ~ ell + acs_k3 + avg_ed + meals, data=dvar) | ||
+ | > summary(mod) | ||
+ | Call: | ||
+ | lm(formula = api00 ~ ell + acs_k3 + avg_ed + meals, data = dvar) | ||
+ | Residuals: | ||
+ | | ||
+ | -187.020 | ||
+ | |||
+ | Coefficients: | ||
+ | Estimate Std. Error t value Pr(> | ||
+ | (Intercept) 709.6388 | ||
+ | ell -0.8434 | ||
+ | acs_k3 | ||
+ | avg_ed | ||
+ | meals -2.9374 | ||
+ | --- | ||
+ | Signif. codes: | ||
+ | |||
+ | Residual standard error: 58.63 on 374 degrees of freedom | ||
+ | (21 observations deleted due to missingness) | ||
+ | Multiple R-squared: | ||
+ | F-statistic: | ||
+ | |||
+ | > anova(mod) | ||
+ | Analysis of Variance Table | ||
+ | |||
+ | Response: api00 | ||
+ | | ||
+ | ell 1 4502711 4502711 1309.762 < 2.2e-16 *** | ||
+ | acs_k3 | ||
+ | avg_ed | ||
+ | meals | ||
+ | Residuals 374 1285740 | ||
+ | --- | ||
+ | Signif. codes: | ||
+ | > | ||
+ | </ | ||
+ | < | ||
+ | # install.packages(" | ||
+ | library(ppcor) | ||
+ | myvar <- data.frame(api00, | ||
+ | myvar <- na.omit(myvar) | ||
+ | spcor(myvar) | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | > library(ppcor) | ||
+ | > myvar <- data.frame(api00, | ||
+ | > myvar <- na.omit(myvar) | ||
+ | > spcor(myvar) | ||
+ | $estimate | ||
+ | | ||
+ | api00 | ||
+ | ell -0.13469956 | ||
+ | acs_k3 | ||
+ | avg_ed | ||
+ | meals -0.29972194 | ||
+ | |||
+ | $p.value | ||
+ | api00 ell acs_k3 | ||
+ | api00 0.000000e+00 0.07761805 0.5525340 0.085390280 2.403284e-10 | ||
+ | ell 8.918743e-03 0.00000000 0.2390272 0.232377348 1.558141e-03 | ||
+ | acs_k3 1.608778e-01 0.05998819 0.0000000 0.009891503 7.907183e-03 | ||
+ | avg_ed 1.912418e-02 0.27203887 0.1380449 0.000000000 7.424903e-05 | ||
+ | meals 3.041658e-09 0.04526574 0.2919775 0.006489783 0.000000e+00 | ||
+ | |||
+ | $statistic | ||
+ | | ||
+ | api00 | ||
+ | ell -2.628924 | ||
+ | acs_k3 | ||
+ | avg_ed | ||
+ | meals -6.075665 | ||
+ | |||
+ | $n | ||
+ | [1] 379 | ||
+ | |||
+ | $gp | ||
+ | [1] 3 | ||
+ | |||
+ | $method | ||
+ | [1] " | ||
+ | > | ||
+ | > | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | > spcor.test(myvar$api00, | ||
+ | estimate | ||
+ | 1 -0.3190889 2.403284e-10 -6.511264 379 3 pearson | ||
+ | > | ||
+ | </ | ||
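The semipartial correlation can also be checked without `ppcor`. The sketch below uses simulated data (the variables `x1`, `x2`, and `y` are hypothetical) and base R only. It removes `x1` from `x2`, correlates the outcome with what is left, and compares the squared result with the gain in R-squared from adding `x2` to the model.

```r
set.seed(2)
n  <- 300
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)                 # x2 overlaps with x1
y  <- 1 + 0.5 * x1 + 0.8 * x2 + rnorm(n)

# Part of x2 that x1 cannot predict.
x2_resid <- resid(lm(x2 ~ x1))

# Semipartial correlation of y with x2, controlling x1 in x2 only.
sr_x2 <- cor(y, x2_resid)

# Gain in R-squared when x2 is added to a model that already contains x1.
r2_full    <- summary(lm(y ~ x1 + x2))$r.squared
r2_reduced <- summary(lm(y ~ x1))$r.squared

# The squared semipartial correlation equals that gain.
print(c(sr2 = sr_x2^2, r2_change = r2_full - r2_reduced))
```

This is why the squared semipartial correlation is read as the explanatory power unique to one independent variable: it is the portion of R-squared that would be lost if that variable were dropped from the model.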
====== e.g., ====== | ====== e.g., ====== | ||
[[:multiple regression examples]] | [[:multiple regression examples]] | ||
Line 481: | Line 575: | ||
* Income Income seven years after College (in thousands) | * Income Income seven years after College (in thousands) | ||
+ | ====== exercise ====== | ||
+ | {{: | ||
+ | < | ||
+ | dvar <- read.csv(" | ||
+ | </ | ||
+ | |||
+ | [[:Multiple Regression Exercise]] | ||
====== Resources ====== | ====== Resources ====== | ||
Line 502: | Line 603: | ||
* https:// | * https:// | ||
* http:// | * http:// | ||
+ | |||
+ | https:// | ||
+ | |||
+ | |||
{{tag> " | {{tag> " | ||
+ |
multiple_regression.1558574411.txt.gz · Last modified: 2019/05/23 10:20 by hkimscil