pre-assumptions_of_regression_analysis
Created 2016/04/27 08:00, last modified 2016/05/11 08:37 by hkimscil.
====== pre-assumptions in regression test ======
  * [[Linearity]] - the relationships between the predictors and the outcome variable should be linear
  * [[:Normality]] - the errors should be normally distributed. Technically, normality is necessary only for the t-tests to be valid; estimation of the coefficients requires only that the errors be identically and independently distributed.
  * [[:Homoscedasticity]] - the variance of the errors should be constant across levels of the predictors
  * Independence - the errors associated with one observation are not correlated with the errors of any other observation
  * [[Influence]] - individual observations that exert undue influence on the coefficients
  * [[Collinearity]] or [[Singularity]] - predictors that are highly collinear, i.e., linearly related, can cause problems in estimating the regression coefficients.
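Collinearity is commonly screened with the variance inflation factor (VIF): regress each predictor on the remaining predictors and compute 1 / (1 - R<sup>2</sup>); values far above 10 are a common warning sign. A minimal numpy sketch on made-up data (the variables below are assumptions for illustration, not the income/sales data used later on this page):

```python
import numpy as np

def vif(X):
    """VIF for each column of X (no intercept column in X).

    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on the remaining columns plus an intercept.
    """
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)                     # independent of x1
x3 = x1 + rng.normal(scale=0.05, size=100)    # nearly collinear with x1
X = np.column_stack([x1, x2, x3])
print(vif(X))    # VIFs for x1 and x3 are huge; x2 stays near 1
```

When a predictor is an exact linear combination of the others (singularity), R<sup>2</sup> for that regression is 1 and the VIF blows up, which is why the coefficients cannot be estimated at all in that case.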
===== Outliers =====
| **Model Summary(b)** |
| (table values truncated in the source; R<sup>2</sup> = .141) |
| a Predictors: (Constant), income |
| b Dependent Variable: sales |
| **Coefficients(a)** |
| (table values truncated in the source; B for income = 0.527406291) |
| a Dependent Variable: sales |
<WRAP clear />
Note that R<sup>2</sup> = .141: income explains only about 14% of the variance in sales. Further, the ANOVA test shows that the model is not significant. Since the F test failed, the t-test for B also failed.
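With a single predictor (k = 1), the overall F statistic follows directly from R<sup>2</sup> and the sample size: F = (R²/k) / ((1 - R²)/(n - k - 1)). The truncated output above does not report n, so n = 20 below is an assumed value, purely to illustrate why a small R² produces a failed F test:

```python
r2 = 0.141   # R^2 from the Model Summary above
k = 1        # one predictor: income
n = 20       # ASSUMED sample size (not shown in the truncated output)

# F statistic for the overall model, computed from R^2
F = (r2 / k) / ((1 - r2) / (n - k - 1))
print(round(F, 2))   # prints 2.95, well below the F(1, 18) critical value of about 4.41 at alpha = .05
```

With one predictor, F = t<sup>2</sup>, so a non-significant F here implies a non-significant t for B as well.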
But the result might be due to some outliers. So, check for outliers by examining:
  * a scatter plot of the standardized predicted values (x) against the standardized residuals (y)
  * [[Mahalanobis distance]]
  * Cook's distance
  * Leverage
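All of these diagnostics can be computed from the hat matrix H = X(XᵀX)⁻¹Xᵀ: the leverage values are its diagonal, Mahalanobis distance (of the predictors) is a rescaling of leverage, and Cook's distance combines leverage with the standardized residual. A self-contained numpy sketch on made-up income/sales-like data with one deliberately planted outlier (all numbers are assumptions for illustration, not this page's data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
income = rng.normal(50, 10, n)
sales = 1.5 * income + rng.normal(0, 5, n)
sales[9] -= 80                   # plant an outlier at index 9 ("case number 10")

X = np.column_stack([np.ones(n), income])   # design matrix with intercept
p = X.shape[1]
H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
leverage = np.diag(H)

beta = np.linalg.lstsq(X, sales, rcond=None)[0]
resid = sales - X @ beta
s2 = resid @ resid / (n - p)                # residual variance estimate

# standardized residuals, Cook's distance, Mahalanobis distance^2
std_resid = resid / np.sqrt(s2 * (1 - leverage))
cooks_d = std_resid**2 * leverage / (p * (1 - leverage))
mahal = (n - 1) * (leverage - 1 / n)        # MD^2 of the predictor values

print(int(np.argmax(cooks_d)) + 1)          # prints 10: the planted case is flagged
```

A common rule of thumb flags cases with Cook's distance above 4/n or leverage well above the average p/n; SPSS's Casewise Diagnostics table (shown below) flags cases by the size of their standardized residual instead.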
{{regression04-outlier.jpg}}
| - | + | | Casewise Diagnostics(a) | |
| - | | | + | |
| | Case Number | | Case Number | ||
| | 10 | | 10 | ||
| Line 57: | Line 53: | ||
Analysis after removing the two cases: r<sup>2</sup> increased from 14% to 70%, and the b value of the independent variable income increased from 0.527406291 to 1.618765817 (hence, the t value increased as well).
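The pattern described above, r<sup>2</sup> jumping sharply once the flagged cases are deleted, can be reproduced with synthetic data. The numbers below are made up for illustration and are not the original income/sales data:

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple linear fit of y on x."""
    b1, b0 = np.polyfit(x, y, 1)            # slope, intercept
    resid = y - (b0 + b1 * x)
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(7)
x = rng.normal(50, 10, 20)
y = 1.6 * x + rng.normal(0, 4, 20)
y[:2] -= 120                                 # two outlying cases

mask = np.ones(20, dtype=bool)
mask[:2] = False                             # drop the two flagged cases

# r^2 with the outliers is much lower than r^2 without them
print(round(r_squared(x, y), 2), round(r_squared(x[mask], y[mask]), 2))
```

The slope also steepens once the low-lying cases stop dragging the fitted line down, which mirrors the increase in b (and in its t value) reported above.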
| - | | | + | | Model Summary(b) |
| | Model | | Model | ||
| | 1 | | 1 | ||
pre-assumptions_of_regression_analysis.1461713426.txt.gz · Last modified: by hkimscil
