c:ma:2019:multiple_regression_exercise
Class Activities
Ex. 1
- Install the ISLR package
- Use the Carseats dataset
- Build regression models with Sales as the DV and IVs of your choice
- Use the ?Carseats command for an explanation of the dataset
- Use the str function to see the characteristics of each variable. Make sure that the ShelveLoc variable is a factor, not int or anything else.
- Build hypotheses based on the variable descriptions
- DV = Sales
- IVs = one numeric variable + one categorical variable (chosen by each group)
- Multiple regression without interaction
- Multiple regression with interaction
- Build hypotheses
- DV: Sales
- IVs: several (without interaction)
- Try modeling
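The steps in the list above could be sketched roughly as follows. This is only an illustration: the exercise leaves the choice of IVs to each group, so `Price` (numeric) and `ShelveLoc` (categorical) are assumed here as one possible pair, and the model names `lm.add` and `lm.int` are made up for the example.

```r
# install.packages("ISLR")   # run once if the package is not installed
library(ISLR)

# ?Carseats                  # opens the help page describing each variable
str(Carseats)                # ShelveLoc should be listed as a Factor

# Assumed IVs for illustration: Price (numeric) and ShelveLoc (factor)
# Multiple regression without interaction
lm.add <- lm(Sales ~ Price + ShelveLoc, data = Carseats)
summary(lm.add)

# Multiple regression with interaction (main effects plus Price:ShelveLoc)
lm.int <- lm(Sales ~ Price * ShelveLoc, data = Carseats)
summary(lm.int)

# F test of whether the interaction terms improve the fit
anova(lm.add, lm.int)
```

If the interaction terms do not improve the fit in the `anova` comparison, the simpler model without interaction is usually the easier one to interpret.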
see hierarchical regression
see also statistical regression methods ← not used very often
- Make a full model (with all variables), then reduce the model until you find one that fits well.
- Make a null model (with no variables), then build up the model with additional IVs until you find one that fits well.
- Can we use the step or stepAIC (MASS package needed) function?
- Interpret the result
> step(lm.full, direction="backward")
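A minimal sketch of both directions, assuming the Carseats data from ISLR with Sales as the DV. `lm.full` follows the name used above; `lm.null`, `lm.back`, `lm.forw` and `lm.aic` are names invented for this example.

```r
library(ISLR)   # Carseats
library(MASS)   # stepAIC

lm.full <- lm(Sales ~ ., data = Carseats)   # all variables
lm.null <- lm(Sales ~ 1, data = Carseats)   # intercept only

# Backward elimination: start from the full model and drop terms
lm.back <- step(lm.full, direction = "backward", trace = 0)

# Forward selection: start from the null model and add terms,
# never going beyond the terms in the full model
lm.forw <- step(lm.null,
                scope = list(lower = formula(lm.null), upper = formula(lm.full)),
                direction = "forward", trace = 0)

# stepAIC from MASS takes the same kind of arguments
lm.aic <- stepAIC(lm.full, direction = "backward", trace = FALSE)

formula(lm.back)
formula(lm.forw)
summary(lm.back)
```

Both functions compare candidate models by AIC. Setting `trace = 0` only hides the step-by-step log, so drop it to see which term is removed or added at each step.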
Ex. 2
- Install packages tidyverse
- load the tidyverse
install.packages("car")
data("Salaries", package = "car")
- Use a dataset Salaries
- describe the data set
----
- Regress the salary variable on the sex variable
- Write the regression model
- Discuss the difference
- Use the rank variable for the same purpose
----
- Regress salary on yrs.service + rank + discipline + sex
- How do you interpret the result?
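When writing the regression model for a categorical IV, it may help to see how `lm()` codes a factor: a factor with k levels enters the model as k - 1 dummy variables, with the first level as the reference. The sketch below uses a hypothetical six-row factor (not the real data) and assumes the level order AsstProf, AssocProf, Prof.

```r
# Hypothetical six-row factor, used only to show the coding; not real data
rank <- factor(c("AsstProf", "AssocProf", "Prof", "Prof", "AsstProf", "AssocProf"),
               levels = c("AsstProf", "AssocProf", "Prof"))

levels(rank)            # the first level is the reference category
contrasts(rank)         # treatment (dummy) coding used by lm() by default
model.matrix(~ rank)    # columns: (Intercept), rankAssocProf, rankProf
```

Under this coding the model for rank can be written as salary = b0 + b1 * rankAssocProf + b2 * rankProf, where both dummies are 0 for the reference group.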
----
If the Salaries data above cannot be loaded
- download it into R from here: salaries.csv
- use read.csv to import the data set.
Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv")
- for information about Salaries (it may not be loaded), use ??Salaries to describe the data set.
----
Please copy and paste the proper R commands and output to a txt file (use Notepad or some other text editing program). You could use MS Word, but please make sure that you use a fixed-width font such as “Courier New.” The output below, as an example, includes the R command head(Salaries) and its output.
> head(Salaries)
rank discipline yrs.since.phd yrs.service sex salary
1 Prof B 19 18 Male 139750
2 Prof B 20 16 Male 173200
3 AsstProf B 4 3 Male 79750
4 Prof B 45 39 Male 115000
5 Prof B 40 41 Male 141500
6 AssocProf B 6 6 Male 97000
> lm.sal.sex <- lm(salary ~ sex, data=Salaries)
> summary(lm.sal.sex)
Call:
lm(formula = salary ~ sex, data = Salaries)
Residuals:
Min 1Q Median 3Q Max
-57290 -23502 -6828 19710 116455
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 101002 4809 21.001 < 2e-16 ***
sexMale 14088 5065 2.782 0.00567 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 30030 on 395 degrees of freedom
Multiple R-squared: 0.01921, Adjusted R-squared: 0.01673
F-statistic: 7.738 on 1 and 395 DF, p-value: 0.005667
> lm.sal.rank <- lm(salary ~ rank, data=Salaries)
> summary(lm.sal.rank)
Call:
lm(formula = salary ~ rank, data = Salaries)
Residuals:
Min 1Q Median 3Q Max
-68972 -16376 -1580 11755 104773
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 80776 2887 27.976 < 2e-16 ***
rankAssocProf 13100 4131 3.171 0.00164 **
rankProf 45996 3230 14.238 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23630 on 394 degrees of freedom
Multiple R-squared: 0.3943, Adjusted R-squared: 0.3912
F-statistic: 128.2 on 2 and 394 DF, p-value: < 2.2e-16
> lm.sal.many <- lm(salary ~ yrs.service + rank + discipline + sex, data=Salaries)
> summary(lm.sal.many)
Call:
lm(formula = salary ~ yrs.service + rank + discipline + sex,
data = Salaries)
Residuals:
Min 1Q Median 3Q Max
-64202 -14255 -1533 10571 99163
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68351.67 4482.20 15.250 < 2e-16 ***
yrs.service -88.78 111.64 -0.795 0.426958
rankAssocProf 14560.40 4098.32 3.553 0.000428 ***
rankProf 49159.64 3834.49 12.820 < 2e-16 ***
disciplineB 13473.38 2315.50 5.819 1.24e-08 ***
sexMale 4771.25 3878.00 1.230 0.219311
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22650 on 391 degrees of freedom
Multiple R-squared: 0.4478, Adjusted R-squared: 0.4407
F-statistic: 63.41 on 5 and 391 DF, p-value: < 2.2e-16
>
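One way to read the first two outputs above: with a single dummy-coded IV, the intercept is the mean salary of the reference group and each coefficient is the difference from that group. The sketch below only redoes that arithmetic with the rounded coefficients printed above, so the results are approximate; the variable names are made up for the example.

```r
# Coefficients copied from the summary() outputs above
b.sex  <- c("(Intercept)" = 101002, sexMale = 14088)
b.rank <- c("(Intercept)" = 80776, rankAssocProf = 13100, rankProf = 45996)

# With one dummy-coded IV, the intercept is the mean of the reference group
# and each slope is the difference from that group
mean.female <- b.sex[["(Intercept)"]]
mean.male   <- b.sex[["(Intercept)"]] + b.sex[["sexMale"]]

mean.asst  <- b.rank[["(Intercept)"]]
mean.assoc <- b.rank[["(Intercept)"]] + b.rank[["rankAssocProf"]]
mean.prof  <- b.rank[["(Intercept)"]] + b.rank[["rankProf"]]

c(Female = mean.female, Male = mean.male)
c(AsstProf = mean.asst, AssocProf = mean.assoc, Prof = mean.prof)
```

In the model with several IVs the same reading no longer gives raw group means: there the sexMale coefficient (4771.25, p = 0.219) is the difference between male and female salaries with yrs.service, rank and discipline held constant.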
Discussion
Common topics
- What affects students' GPA? Or what determines students' GPA?
Group topics
Making Questionnaire
Submit your questions at the ajoubb.
Then we will list the questions in Google Docs and a Google survey.
c/ma/2019/multiple_regression_exercise.1636430276.txt.gz · Last modified: by hkimscil
