c:ma:2019:multiple_regression_exercise
Class Activities
Ex. 1
- Install the ISLR package
 - Use the Carseats data set
 - Build regression models with Sales as the DV and IVs of your choice
 - Use the ?Carseats command for an explanation of the data set
 - Use the str function to see the type of each variable. Make sure that the ShelveLoc variable is a factor, not int or anything else.
 - Build hypotheses based on the variable descriptions
- Dependent variable = Sales
 - Independent variables = 1 numeric variable + 1 categorical variable (chosen by each group)
 - Multiple regression without interaction
 - Multiple regression with interaction
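The steps above can be sketched as follows. The choice of Price (numeric) and ShelveLoc (categorical) as IVs is only an illustration, not the assigned answer, since each group picks its own pair. The object names lm.add and lm.int are hypothetical.

```r
# install.packages("ISLR")   # run once if the package is not installed
library(ISLR)                # provides the Carseats data set

str(Carseats)                             # check the type of each variable
stopifnot(is.factor(Carseats$ShelveLoc))  # ShelveLoc must be a factor

# Without interaction: one common Price slope, separate intercepts per ShelveLoc level
lm.add <- lm(Sales ~ Price + ShelveLoc, data = Carseats)
summary(lm.add)

# With interaction: Price * ShelveLoc expands to
# Price + ShelveLoc + Price:ShelveLoc, so the Price slope may differ per level
lm.int <- lm(Sales ~ Price * ShelveLoc, data = Carseats)
summary(lm.int)

# F test of whether the interaction terms improve on the additive model
anova(lm.add, lm.int)
```

If the anova() comparison is not significant, the simpler additive model is usually easier to defend and to interpret.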
 
 - Build hypotheses
- Dependent variable: Sales
 - Several independent variables (without interaction)
 - Try modeling them
 
 
see hierarchical regression
see also statistical regression methods ← not widely used
- Make a full model (with all variables), then reduce the model until you find one that fits well.
 - Make a null model (with no variables), then build up the model with additional IVs until you find one that fits well.
 - Can we use the step or stepAIC (MASS package needed) function?
 - Interpret the result
> step(lm.full, direction="backward")
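The full-model and null-model procedures can be sketched with step() from base R's stats package. This is a minimal sketch on Carseats, assuming the ISLR package is installed. lm.full is the name used on this page, while lm.null, lm.back and lm.fwd are hypothetical names. step() selects by AIC, which is only one possible criterion for a "fitted" model.

```r
library(ISLR)   # provides Carseats

lm.full <- lm(Sales ~ ., data = Carseats)   # full model: all ten IVs
lm.null <- lm(Sales ~ 1, data = Carseats)   # null model: intercept only

# Backward elimination: start from the full model and drop terms while AIC improves
lm.back <- step(lm.full, direction = "backward", trace = 0)

# Forward selection: start from the null model; scope sets the largest model allowed
lm.fwd <- step(lm.null, direction = "forward",
               scope = list(lower = formula(lm.null), upper = formula(lm.full)),
               trace = 0)

formula(lm.back)   # terms kept by backward elimination
formula(lm.fwd)    # terms added by forward selection
summary(lm.back)   # interpret the selected model
```

Setting trace = 0 hides the step-by-step log. Leave it at the default to see which term is dropped or added at each step.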
Ex. 2
- Install the tidyverse package
 - Load the tidyverse
install.packages("car")
data("Salaries", package = "car")
 - Use the Salaries data set
 - Describe the data set
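In car version 3.0 and later, the data sets were moved to the companion package carData, so asking for Salaries from package "car" may fail with a "data set not found" warning. A minimal sketch of loading and describing the data, assuming carData is installed (it is installed together with car):

```r
# install.packages("car")    # run once; also installs carData
data("Salaries", package = "carData")

str(Salaries)                        # variable types: three factors, three numeric
summary(Salaries)                    # distribution of each variable
table(Salaries$rank, Salaries$sex)   # cell counts for the two factors
```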
 
—-
- Regress salary on the sex variable
 - Write the regression model
 - Discuss the difference
 
- Use rank variable for the same purpose
 
- Regress salary on yrs.service + rank + discipline + sex
 - How do you interpret the result?
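One way to read the coefficient of a dummy-coded factor: the intercept is the mean of the reference group, and each coefficient is the difference between that group's mean and the reference group's mean. The numbers below are copied from the summary(lm.sal.sex) and summary(lm.sal.rank) output shown on this page. The variable names are hypothetical.

```r
# salary ~ sex: Female is the reference level (there is no sexFemale row)
b0.sex <- 101002   # (Intercept): mean salary of Female
b.male <- 14088    # sexMale: Male mean minus Female mean

mean.female <- b0.sex
mean.male   <- b0.sex + b.male
mean.male          # 115090

# salary ~ rank: AsstProf is the reference level
b0.rank <- 80776   # (Intercept): mean salary of AsstProf
b.assoc <- 13100   # rankAssocProf
b.prof  <- 45996   # rankProf

rank.means <- c(AsstProf  = b0.rank,
                AssocProf = b0.rank + b.assoc,
                Prof      = b0.rank + b.prof)
rank.means         # 80776 93876 126772
```

In the model with several IVs the same reading applies, but holding the other IVs constant: there the sexMale estimate is 4771.25 with p = 0.219, so the sex difference is no longer significant once yrs.service, rank and discipline are in the model.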
 
—–
If the Salaries data set above cannot be loaded
- download salaries.csv from here
- use read.csv to import the data set into R.
Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv") 
 - for information about Salaries (it may not be loaded),
- use ??Salaries to describe the data set.
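When the data come from a CSV file, the categorical variables may need attention. Since R 4.0, read.csv keeps text columns as character unless stringsAsFactors = TRUE is given, and factor levels are ordered alphabetically, which would make AssocProf the reference level instead of AsstProf as in the output on this page. The sketch below uses inline CSV text (six rows copied from the head(Salaries) output) so that it runs offline. The column layout of the real salaries.csv is an assumption.

```r
# Inline stand-in for the CSV file (assumed layout; rows from head(Salaries))
csv.text <- "rank,discipline,yrs.since.phd,yrs.service,sex,salary
Prof,B,19,18,Male,139750
Prof,B,20,16,Male,173200
AsstProf,B,4,3,Male,79750
Prof,B,45,39,Male,115000
Prof,B,40,41,Male,141500
AssocProf,B,6,6,Male,97000"

Salaries <- read.csv(text = csv.text, stringsAsFactors = TRUE)
# For the real file, replace text = csv.text with the URL:
# Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv",
#                      stringsAsFactors = TRUE)

levels(Salaries$rank)   # alphabetical: AssocProf comes first

# Put AsstProf first so that it becomes the reference level in lm()
Salaries$rank <- factor(Salaries$rank,
                        levels = c("AsstProf", "AssocProf", "Prof"))
levels(Salaries$rank)
str(Salaries)
```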
 
—–
Please copy and paste the relevant R commands and their output into a txt file (use Notepad or some other text editor). You could use MS Word, but please make sure that you use a monospaced font such as “Courier New” so that the columns line up. The output below, as an example, includes the R command head(Salaries) and its output.
> head(Salaries)
       rank discipline yrs.since.phd yrs.service  sex salary
1      Prof          B            19          18 Male 139750
2      Prof          B            20          16 Male 173200
3  AsstProf          B             4           3 Male  79750
4      Prof          B            45          39 Male 115000
5      Prof          B            40          41 Male 141500
6 AssocProf          B             6           6 Male  97000
> lm.sal.sex <- lm(salary ~ sex, data=Salaries)
> summary(lm.sal.sex)
Call:
lm(formula = salary ~ sex, data = Salaries)
Residuals:
   Min     1Q Median     3Q    Max 
-57290 -23502  -6828  19710 116455 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   101002       4809  21.001  < 2e-16 ***
sexMale        14088       5065   2.782  0.00567 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 30030 on 395 degrees of freedom
Multiple R-squared:  0.01921,	Adjusted R-squared:  0.01673 
F-statistic: 7.738 on 1 and 395 DF,  p-value: 0.005667
> lm.sal.rank <- lm(salary ~ rank, data=Salaries)
> summary(lm.sal.rank)
Call:
lm(formula = salary ~ rank, data = Salaries)
Residuals:
   Min     1Q Median     3Q    Max 
-68972 -16376  -1580  11755 104773 
Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)      80776       2887  27.976  < 2e-16 ***
rankAssocProf    13100       4131   3.171  0.00164 ** 
rankProf         45996       3230  14.238  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23630 on 394 degrees of freedom
Multiple R-squared:  0.3943,	Adjusted R-squared:  0.3912 
F-statistic: 128.2 on 2 and 394 DF,  p-value: < 2.2e-16
> lm.sal.many <- lm(salary ~ yrs.service + rank + discipline + sex, data=Salaries)
> summary(lm.sal.many)
Call:
lm(formula = salary ~ yrs.service + rank + discipline + sex, 
    data = Salaries)
Residuals:
   Min     1Q Median     3Q    Max 
-64202 -14255  -1533  10571  99163 
Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   68351.67    4482.20  15.250  < 2e-16 ***
yrs.service     -88.78     111.64  -0.795 0.426958    
rankAssocProf 14560.40    4098.32   3.553 0.000428 ***
rankProf      49159.64    3834.49  12.820  < 2e-16 ***
disciplineB   13473.38    2315.50   5.819 1.24e-08 ***
sexMale        4771.25    3878.00   1.230 0.219311    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22650 on 391 degrees of freedom
Multiple R-squared:  0.4478,	Adjusted R-squared:  0.4407 
F-statistic: 63.41 on 5 and 391 DF,  p-value: < 2.2e-16
> 
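Instead of copying from the console by hand, the output can also be written to a text file from within R. This is a minimal sketch using capture.output(). The file name ex2_output.txt and the small data frame sal.demo (three rows copied from the head(Salaries) output above) are hypothetical and stand in for your own objects.

```r
# Small stand-in data frame, so the example runs without the car package
sal.demo <- data.frame(
  rank   = c("Prof", "Prof", "AsstProf"),
  sex    = c("Male", "Male", "Male"),
  salary = c(139750, 173200, 79750)
)

out.file <- file.path(tempdir(), "ex2_output.txt")   # hypothetical file name

# capture.output() saves only the printed result, so write the command line first
cat("> head(sal.demo)\n", file = out.file)
capture.output(head(sal.demo), file = out.file, append = TRUE)

# Append further commands and results to the same file
cat("> summary(sal.demo$salary)\n", file = out.file, append = TRUE)
capture.output(summary(sal.demo$salary), file = out.file, append = TRUE)

readLines(out.file)   # check what was written
```

A plain text file keeps the column alignment of the R output, which is the reason for asking for a monospaced font.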
Discussion
Common topics
- What affects students' GPA? Or what determines students' GPA?
 
Group topics
Making Questionnaire
Submit your questions at the ajoubb.
Then we will list the questions in Google Docs and build a Google survey.
c/ma/2019/multiple_regression_exercise.1636430276.txt.gz · Last modified:  by hkimscil
                
                