====== Partial and Semipartial Correlation ======
references
{{https://
[{{ :
or [[https://

A simple explanation of the procedures below:
  * Regress the Y residuals against the X1 residuals.

In the example below,
  * regress gpa against sat (and get residuals of gpa = a + b)
  * regress clep against sat (and get residuals of clep = b + c)
  * regress the gpa residuals against clep residuals
  * In this case, $r^{2} = \displaystyle \frac{b}{(a+b)}$ and $b$ is very small.

Take a close look at the right graph, especially the grey areas.

For more, see https://
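The residual-regression steps above can be sketched in a few lines of plain Python. The data vectors here are invented for illustration (hypothetical sat, clep, and gpa scores); only the three-step procedure itself mirrors the text.

```python
# Partial correlation the "residual" way: regress both variables on the
# control variable, then correlate the two residual series.
# The data vectors are made up for illustration, not the page's dataset.
from statistics import mean

def simple_residuals(y, x):
    """Residuals of the least-squares fit y = a + b*x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

sat  = [500, 550, 450, 400, 600, 650, 700, 550, 650, 550]
clep = [30, 31, 28, 25, 32, 36, 39, 38, 35, 30]
gpa  = [2.8, 3.0, 2.8, 2.2, 3.3, 3.3, 3.5, 3.7, 3.4, 2.9]

# 1) regress gpa against sat, 2) regress clep against sat,
# 3) correlate the residual series: the partial correlation of gpa and
#    clep, controlling for sat
partial_r = corr(simple_residuals(gpa, sat), simple_residuals(clep, sat))
print(round(partial_r, 4))
```

The same number also falls out of the closed-form first-order partial correlation formula, which is a useful sanity check on the residual route.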
linear model
10 550 2.9 3.13544 -0.23544
> round(cor(cor.gpa.sat), 3)
        sat   gpa  pred resid
sat   1.000 0.718 1.000 0.000
gpa   0.718 1.000 0.718 0.696
pred  1.000 0.718 1.000 0.000
resid 0.000 0.696 0.000 1.000
>
</code>
Note that
  * r (sat and gpa) = .718, which is the square root of the model's R-squared
  * r (sat and pred) = 1. In other words, the predicted values (y hats) are a linear function of the x (sat) values.
  * r (sat and resid) = 0. The residuals are orthogonal to the independent variable (sat) values.
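Both of these properties hold for any simple least-squares fit and are easy to verify numerically. A small Python sketch (the data vectors are stand-ins, assumed only for illustration) that correlates x with the fitted values and with the residuals:

```python
# Check two facts about simple regression: fitted values correlate
# perfectly with x (they are a linear function of x, here with b > 0),
# and residuals are uncorrelated with x.
from statistics import mean

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [500, 550, 450, 400, 600, 650, 700, 550, 650, 550]   # plays the role of sat
y = [3.0, 3.2, 2.8, 2.5, 3.2, 3.8, 3.9, 3.8, 3.5, 3.1]   # plays the role of gpa

mx, my = mean(x), mean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx
pred  = [intercept + slope * a for a in x]   # fitted values (y hats)
resid = [b - p for b, p in zip(y, pred)]     # residuals

print(round(corr(x, pred), 3))        # 1.0: pred is a linear function of x
print(round(abs(corr(x, resid)), 3))  # 0.0: residuals are orthogonal to x
```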
Residual standard error: 0.1637 on 8 degrees of freedom
Multiple R-squared:
F-statistic:
</code>
<code>
res.lm.gpa.clep <- lm.gpa.clep$residuals
</code>

{{lm.gpa.clep.png}}

<code>
# get cor between clep, gpa, pred, and resid from lm.gpa.clep
cor.gpa.clep <- as.data.frame(cbind(clep, gpa, lm.gpa.clep$fitted.values, lm.gpa.clep$residuals))
colnames(cor.gpa.clep) <- c("clep", "gpa", "pred", "resid")
cor(cor.gpa.clep)
</code>
<code>
> round(cor(cor.gpa.clep), 4)
        clep    gpa   pred  resid
clep  1.0000 0.8763 1.0000 0.0000
gpa   0.8763 1.0000 0.8763 0.4818
pred  1.0000 0.8763 1.0000 0.0000
resid 0.0000 0.4818 0.0000 1.0000
>
> # for comparison, the earlier matrix from the gpa ~ sat model
        sat   gpa  pred resid
sat   1.000 0.718 1.000 0.000
gpa   0.718 1.000 0.718 0.696
pred  1.000 0.718 1.000 0.000
resid 0.000 0.696 0.000 1.000
>
</code>
>
</code>

One other thing that we could do to help settle the argument pragmatically is to regress GPA on both SAT and CLEP at the same time and see what happens. If we do that, we find that the R-square for the model is .78, F = 12.25, p < .01. The intercept and the b weight for CLEP are both significant, but the b weight for SAT is not.

In this case, we would conclude that the significant unique predictor is CLEP. Although SAT is highly correlated with GPA, it adds nothing to the prediction equation once the CLEP score is entered. (These data are fictional and the sample size is much too small to run this analysis; it is here for illustration only.)

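The "regress GPA on both SAT and CLEP at once" comparison can be sketched as follows. This is a hypothetical Python illustration: it fits the two-predictor model by solving the normal equations and compares R-squared values. The data vectors are invented stand-ins (the page's fictional dataset is not fully reproduced here), so the printed numbers will not match .78.

```python
# Compare R-squared from each one-predictor model with R-squared from
# the two-predictor model, using centered normal equations.
from statistics import mean

def center(v):
    m = mean(v)
    return [vi - m for vi in v]

def r2_one(y, x):
    """R-squared of y = a + b*x (squared Pearson correlation)."""
    xc, yc = center(x), center(y)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    syy = sum(b * b for b in yc)
    return sxy * sxy / (sxx * syy)

def r2_two(y, x1, x2):
    """R-squared of y = a + b1*x1 + b2*x2, via the normal equations."""
    yc, u, v = center(y), center(x1), center(x2)
    s11 = sum(a * a for a in u)
    s22 = sum(a * a for a in v)
    s12 = sum(a * b for a, b in zip(u, v))
    s1y = sum(a * b for a, b in zip(u, yc))
    s2y = sum(a * b for a, b in zip(v, yc))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    ssr = b1 * s1y + b2 * s2y          # regression sum of squares
    syy = sum(b * b for b in yc)
    return ssr / syy

sat  = [500, 550, 450, 400, 600, 650, 700, 550, 650, 550]
clep = [30, 31, 28, 25, 32, 36, 39, 38, 35, 30]   # invented CLEP scores
gpa  = [2.8, 3.0, 2.8, 2.2, 3.3, 3.3, 3.5, 3.7, 3.4, 2.9]

print(round(r2_one(gpa, sat), 3))        # gpa on sat alone
print(round(r2_one(gpa, clep), 3))       # gpa on clep alone
print(round(r2_two(gpa, sat, clep), 3))  # gpa on both at once
```

The increment from the larger one-predictor R-squared to the two-predictor R-squared is the squared semipartial correlation of the added predictor, which is the quantity the argument above turns on.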
Now suppose we wanted to argue something a little different. Suppose we had a theory that said that all measures of math achievement share a common explanation,

===== checking partial cor 1 =====
<code>
OV = U + R                       # the Other Variable is U plus R
Y = R + rnorm(60, mean=0, sd=2)  # Y is R plus error
</code>
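The construction above (''OV = U + R'', ''Y = R + error'') can be mirrored in a short Python sketch; the parameters and names here are assumptions, not the page's exact script. Since Y and OV are related only through the shared component R, partialling R out of both should shrink their correlation toward zero.

```python
# Simulate OV = U + R and Y = R + error, then compare the zero-order
# correlation of Y and OV with their partial correlation controlling R.
# Parameters (n, sd, seed) are assumed for illustration.
import random
from statistics import mean

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simple_residuals(y, x):
    """Residuals of the least-squares fit y = a + b*x."""
    mx, my = mean(x), mean(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return [c - ((my - b * mx) + b * a) for a, c in zip(x, y)]

random.seed(1)
n = 60
R  = [random.gauss(0, 2) for _ in range(n)]   # shared component
U  = [random.gauss(0, 2) for _ in range(n)]   # unique component
OV = [u + r for u, r in zip(U, R)]            # the Other Variable is U plus R
Y  = [r + random.gauss(0, 2) for r in R]      # Y is R plus error

zero_order = corr(Y, OV)   # inflated by the shared component R
partial    = corr(simple_residuals(Y, R), simple_residuals(OV, R))
print(round(zero_order, 2), round(partial, 2))  # partial is typically near 0
```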
====== e.g. 4 Using ppcor.test ======
{{:

<code>
## install.packages("psych")
## install.packages("ppcor")

library(psych)
library(ppcor)

## pcor.test(v.dv, v.iv, v.ctrl): partial correlation between v.dv
## and v.iv, controlling for v.ctrl

options(digits = 4)
SATV <- c(500, 550, 450, 400, 600, 650, 700, 550, 650, 550)
HSGPA <- c(3.0, 3.2, 2.8, 2.5, 3.2, 3.8, 3.9, 3.8, 3.5, 3.1)
FGPA <- c(2.8, 3.0, 2.8, 2.2, 3.3, 3.3, 3.5, 3.7, 3.4, 2.9)
scholar <- data.frame(SATV, HSGPA, FGPA)
describe(scholar)   # provides descriptive information about each variable

corrs <- cor(scholar)   # find the correlations and save them in an object called 'corrs'
corrs                   # print corrs

pairs(scholar)

pcor.test(HSGPA, FGPA, SATV)   # partial cor of HSGPA and FGPA, controlling for SATV

reg1 <- lm(HSGPA ~ SATV)   # run linear regression
resid1 <- resid(reg1)      # save the residuals

reg2 <- lm(FGPA ~ SATV)    # second regression
resid2 <- resid(reg2)

cor(resid1, resid2)   # should equal the partial correlation above
</code>
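As a cross-check on the script above, the first-order partial correlation can also be computed from the closed-form formula $r_{xy.z} = \displaystyle \frac{r_{xy} - r_{xz}r_{yz}}{\sqrt{(1-r_{xz}^{2})(1-r_{yz}^{2})}}$ and compared with the correlation of the two residual series. A Python translation using the same data vectors (the helper function names are mine, not from the page):

```python
# Two routes to the partial correlation of HSGPA and FGPA controlling
# SATV: the closed-form formula and the residual-correlation method.
from statistics import mean

SATV  = [500, 550, 450, 400, 600, 650, 700, 550, 650, 550]
HSGPA = [3.0, 3.2, 2.8, 2.5, 3.2, 3.8, 3.9, 3.8, 3.5, 3.1]
FGPA  = [2.8, 3.0, 2.8, 2.2, 3.3, 3.3, 3.5, 3.7, 3.4, 2.9]

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simple_residuals(y, x):
    """Residuals of the least-squares fit y = a + b*x."""
    mx, my = mean(x), mean(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return [c - ((my - b * mx) + b * a) for a, c in zip(x, y)]

r_xy = corr(HSGPA, FGPA)   # x = HSGPA, y = FGPA, z = SATV
r_xz = corr(HSGPA, SATV)
r_yz = corr(FGPA, SATV)

by_formula = (r_xy - r_xz * r_yz) / \
             ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5
by_resid   = corr(simple_residuals(HSGPA, SATV),
                  simple_residuals(FGPA, SATV))

print(round(by_formula, 4), round(by_resid, 4))   # the two values agree
```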

<code>
options(digits = 4)

HSGPA <- c(3.0, 3.2, 2.8, 2.5, 3.2, 3.8, 3.9, 3.8, 3.5, 3.1)
SATV <- c(500, 550, 450, 400, 600, 650, 700, 550, 650, 550)
FGPA <- c(2.8, 3.0, 2.8, 2.2, 3.3, 3.3, 3.5, 3.7, 3.4, 2.9)
GREV <- c(600, 670, 540, 800, 750, 820, 830, 670, 690, 600)
## GREV <- c(510, 670, 440, 800, 750, 420, 830, 470, 690, 600)

scholar <- data.frame(HSGPA, SATV, FGPA, GREV)
describe(scholar)   # provides descriptive information about each variable

corrs <- cor(scholar)   # find the correlations and save them in an object called 'corrs'
corrs                   # print corrs

pairs(scholar)

pcor.test(HSGPA, FGPA, SATV)   # partial cor of HSGPA and FGPA, controlling for SATV

reg1 <- lm(HSGPA ~ SATV)   # run linear regression
resid1 <- resid(reg1)

reg2 <- lm(FGPA ~ SATV)    # second regression
resid2 <- resid(reg2)

cor(resid1, resid2)

reg12 <- lm(HSGPA ~ GREV)
resid12 <- resid(reg12)
</code>
partial_and_semipartial_correlation.txt · Last modified: 2024/06/12 08:01 by hkimscil