c:ma:2019:multiple_regression_exercise
Class Activities
Ex. 1
- Install the ISLR package.
- Use the Carseats dataset.
- Build regression models with Sales as the DV and IVs of your choice.
- Use the ?Carseats command for an explanation of the dataset.
- Use the str function to see the characteristics of each variable. Make sure that the ShelveLoc variable is a factor, not int or anything else.
see hierarchical regression
see also statistical regression methods
- Make a full model (with all variables), then reduce the model until you find one that fits well.
- Make a null model (with no variables), then build up the model with additional IVs until you find one that fits well.
- Can we use the step or stepAIC (MASS package needed) function?
- Interpret the result.
step(lm.full, direction="backward")
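The full-model and null-model steps above can be sketched as follows. This is only one possible way to do the exercise, assuming the ISLR package is already installed; the object names lm.null, lm.back and lm.forw are made up for illustration, and trace = 0 is used only to keep the console output short.

```r
# Assumes the ISLR package is installed: install.packages("ISLR")
library(ISLR)

str(Carseats)                             # check the type of each variable
stopifnot(is.factor(Carseats$ShelveLoc))  # ShelveLoc must be a factor

# Full model: Sales regressed on all other variables
lm.full <- lm(Sales ~ ., data = Carseats)
# Null model: intercept only, no IVs
lm.null <- lm(Sales ~ 1, data = Carseats)

# Backward elimination: start from the full model and drop terms by AIC
lm.back <- step(lm.full, direction = "backward", trace = 0)

# Forward selection: start from the null model and add terms by AIC;
# scope sets the largest model that step is allowed to reach
lm.forw <- step(lm.null, direction = "forward",
                scope = formula(lm.full), trace = 0)

summary(lm.back)   # interpret the coefficients of the reduced model
formula(lm.forw)   # compare with the model found by forward selection
```

MASS::stepAIC accepts the same direction and scope arguments, so it can be swapped in for step here. Both functions select by AIC, so the final model may still contain IVs whose individual p-values are above .05.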
Ex. 2
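- Install the tidyverse package
- Load the tidyverse
install.packages("car")
# in car 3.0 and later, the Salaries data set lives in carData (installed together with car)
data("Salaries", package = "carData")
- Use the Salaries dataset
- Describe the data set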
----
- Regress the salary variable on the sex variable (salary is the DV)
- Write the regression model
- Discuss the difference
- Use the rank variable for the same purpose
----
- Regress salary on yrs.service + rank + discipline + sex
- How do you interpret the result?
----
If the Salaries data above cannot be loaded:
- Download salaries.csv, or
- use read.csv to import the data set directly:
Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv")
- For information about Salaries (its package may not be loaded), use
??Salaries
to describe the data set.
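The two ways of getting the data can be combined in one short sketch. It assumes that the CSV copy has the same column names as the packaged data set, which has not been checked here. The factor() call matters for the CSV route: read.csv orders factor levels alphabetically, so without it AssocProf, not AsstProf, would become the reference level of rank and the coefficients would not match the example output on this page.

```r
# Use the packaged data if carData is installed; otherwise read the CSV copy
if (requireNamespace("carData", quietly = TRUE)) {
  data("Salaries", package = "carData")
} else {
  # stringsAsFactors = TRUE so that rank, discipline and sex become factors
  Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv",
                       stringsAsFactors = TRUE)
}

# Put the rank levels in career order so that AsstProf is the reference level
Salaries$rank <- factor(Salaries$rank,
                        levels = c("AsstProf", "AssocProf", "Prof"))

str(Salaries)      # describe the data set: variable types and factor levels
summary(Salaries)  # distributions of each variable
```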
----
Please copy and paste the relevant R commands and their output into a txt file (use Notepad or another text editor). You could use MS Word, but please make sure that you use a monospaced font such as "Courier New." The output below, as an example, includes the R command head(Salaries)
and its output.
> head(Salaries)
       rank discipline yrs.since.phd yrs.service  sex salary
1      Prof          B            19          18 Male 139750
2      Prof          B            20          16 Male 173200
3  AsstProf          B             4           3 Male  79750
4      Prof          B            45          39 Male 115000
5      Prof          B            40          41 Male 141500
6 AssocProf          B             6           6 Male  97000

> lm.sal.sex <- lm(salary ~ sex, data=Salaries)
> summary(lm.sal.sex)

Call:
lm(formula = salary ~ sex, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-57290 -23502  -6828  19710 116455

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   101002       4809  21.001  < 2e-16 ***
sexMale        14088       5065   2.782  0.00567 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 30030 on 395 degrees of freedom
Multiple R-squared:  0.01921,   Adjusted R-squared:  0.01673
F-statistic: 7.738 on 1 and 395 DF,  p-value: 0.005667

> lm.sal.rank <- lm(salary ~ rank, data=Salaries)
> summary(lm.sal.rank)

Call:
lm(formula = salary ~ rank, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-68972 -16376  -1580  11755 104773

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)      80776       2887  27.976  < 2e-16 ***
rankAssocProf    13100       4131   3.171  0.00164 **
rankProf         45996       3230  14.238  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 23630 on 394 degrees of freedom
Multiple R-squared:  0.3943,    Adjusted R-squared:  0.3912
F-statistic: 128.2 on 2 and 394 DF,  p-value: < 2.2e-16
> lm.sal.many <- lm(salary ~ yrs.service + rank + discipline + sex, data=Salaries)
> summary(lm.sal.many)

Call:
lm(formula = salary ~ yrs.service + rank + discipline + sex, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-64202 -14255  -1533  10571  99163

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   68351.67    4482.20  15.250  < 2e-16 ***
yrs.service     -88.78     111.64  -0.795 0.426958
rankAssocProf 14560.40    4098.32   3.553 0.000428 ***
rankProf      49159.64    3834.49  12.820  < 2e-16 ***
disciplineB   13473.38    2315.50   5.819 1.24e-08 ***
sexMale        4771.25    3878.00   1.230 0.219311
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 22650 on 391 degrees of freedom
Multiple R-squared:  0.4478,    Adjusted R-squared:  0.4407
F-statistic: 63.41 on 5 and 391 DF,  p-value: < 2.2e-16
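As a hedged illustration of "Write the regression model", the summary(lm.sal.sex) coefficients give predicted salary = 101002 + 14088 * sexMale, where sexMale is 1 for Male and 0 for Female. The sketch below only redoes that arithmetic with numbers copied from the summary(lm.sal.sex) and summary(lm.sal.rank) output on this page; the variable names are made up for illustration.

```r
# Coefficients copied from the summary(lm.sal.sex) output
b0.sex  <- 101002   # (Intercept): predicted salary when sexMale = 0 (Female)
b1.male <- 14088    # sexMale: predicted Male salary minus predicted Female salary

pred.female <- b0.sex + b1.male * 0
pred.male   <- b0.sex + b1.male * 1

# Coefficients copied from the summary(lm.sal.rank) output
b0.rank  <- 80776   # (Intercept): predicted salary for the reference level, AsstProf
b1.assoc <- 13100   # rankAssocProf: AssocProf minus AsstProf
b2.prof  <- 45996   # rankProf: Prof minus AsstProf

pred.rank <- c(AsstProf  = b0.rank,
               AssocProf = b0.rank + b1.assoc,
               Prof      = b0.rank + b2.prof)

print(c(Female = pred.female, Male = pred.male))  # 101002 and 115090
print(pred.rank)                                  # 80776, 93876 and 126772
```

With a single categorical IV, these predicted values are the group means, so they can be checked against tapply(Salaries$salary, Salaries$sex, mean) once the data set is loaded. In the multiple regression output, the sexMale coefficient is 4771.25 with p = 0.219, so the sex difference is no longer significant once yrs.service, rank and discipline are in the model.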
Discussion
Common topics
- What affects students' GPA? Or what determines students' GPA?
Group topics
Making a Questionnaire
Submit your questions at the ajoubb.
Then we will list the questions in Google docs Google survey
c/ma/2019/multiple_regression_exercise.1635988299.txt.gz · Last modified: 2021/11/04 10:11 by hkimscil