simple_regression_example
Data examination
Here we look at several variables at once, rather than a single independent variable (IV) and a single dependent variable (DV). A regression with more than one predictor is called multiple regression; we will discuss it later.
Download example file:
elemapi.sav
elemapi.sps
display labels.
Data label description: Variable Labels

| Variable | Position | Label |
|---|---|---|
| snum | 1 | school number |
| dnum | 2 | district number |
| api00 | 3 | api 2000 |
| api99 | 4 | api 1999 |
| growth | 5 | growth 1999 to 2000 |
| meals | 6 | pct free meals |
| ell | 7 | english language learners |
| yr_rnd | 8 | year round school |
| mobility | 9 | pct 1st year in school |
| acs_k3 | 10 | avg class size k-3 |
| acs_46 | 11 | avg class size 4-6 |
| not_hsg | 12 | parent not hsg |
| hsg | 13 | parent hsg |
| some_col | 14 | parent some college |
| col_grad | 15 | parent college grad |
| grad_sch | 16 | parent grad school |
| avg_ed | 17 | avg parent ed |
| full | 18 | pct full credential |
| emer | 19 | pct emer credential |
| enroll | 20 | number of students |
| mealcat | 21 | Percentage free meals in 3 categories |
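The variables we are interested in are:
academic performance in 2000 (api00)
percentage of students receiving free meals (meals)
percentage of teachers with full credentials (full)
average class size from kindergarten through grade 3 (acs_k3)
To look at part of these data first: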
list /variables api00 acs_k3 meals full /cases from 1 to 10.
api00 acs_k3 meals full
693 16 67 76.00
570 15 92 79.00
546 17 97 68.00
571 20 90 87.00
478 18 89 87.00
858 20 . 100.00
918 19 . 100.00
831 20 . 96.00
860 20 . 100.00
737 21 29 96.00
Number of cases read: 10 Number of cases listed: 10
descriptives /variables = all .
Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| school number | 400 | 58 | 6072 | 2866.81 | 1543.811 |
| district number | 400 | 41 | 796 | 457.74 | 184.823 |
| api 2000 | 400 | 369 | 940 | 647.62 | 142.249 |
| api 1999 | 400 | 333 | 917 | 610.21 | 147.136 |
| growth 1999 to 2000 | 400 | -69 | 134 | 37.41 | 25.247 |
| pct free meals | 315 | 6 | 100 | 71.99 | 24.386 |
| english language learners | 400 | 0 | 91 | 31.45 | 24.839 |
| year round school | 400 | 0 | 1 | .23 | .421 |
| pct 1st year in school | 399 | 2 | 47 | 18.25 | 7.485 |
| avg class size k-3 | 398 | -21 | 25 | 18.55 | 5.005 |
| avg class size 4-6 | 397 | 20 | 50 | 29.69 | 3.841 |
| parent not hsg | 400 | 0 | 100 | 21.25 | 20.676 |
| parent hsg | 400 | 0 | 100 | 26.02 | 16.333 |
| parent some college | 400 | 0 | 67 | 19.71 | 11.337 |
| parent college grad | 400 | 0 | 100 | 19.70 | 16.471 |
| parent grad school | 400 | 0 | 67 | 8.64 | 12.131 |
| avg parent ed | 381 | 1.00 | 4.62 | 2.6685 | .76379 |
| pct full credential | 400 | .42 | 100.00 | 66.0568 | 40.29793 |
| pct emer credential | 400 | 0 | 59 | 12.66 | 11.746 |
| number of students | 400 | 130 | 1570 | 483.47 | 226.448 |
| Percentage free meals in 3 categories | 400 | 1 | 3 | 2.02 | .819 |
examine /variables=acs_k3 /plot histogram stem boxplot .
Descriptives: avg class size k-3

| Statistic | Value | Std. Error |
|---|---|---|
| Mean | 18.55 | .251 |
| 95% Confidence Interval for Mean, Lower Bound | 18.05 | |
| 95% Confidence Interval for Mean, Upper Bound | 19.04 | |
| 5% Trimmed Mean | 19.13 | |
| Median | 19.00 | |
| Variance | 25.049 | |
| Std. Deviation | 5.005 | |
| Minimum | -21 | |
| Maximum | 25 | |
| Range | 46 | |
| Interquartile Range | 2 | |
| Skewness | -7.106 | .122 |
| Kurtosis | 53.014 | .244 |
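Two of the figures in the Descriptives output can be checked by hand. The following is a minimal Python sketch (not SPSS syntax), assuming the usual definitions: the standard error of the mean is the standard deviation divided by the square root of the number of valid cases, and the range is the maximum minus the minimum. The numbers are copied from the output above; the variable names are only illustrative.

```python
import math

# Values copied from the SPSS output for acs_k3.
n_valid = 398        # N reported for avg class size k-3
std_dev = 5.005      # Std. Deviation
minimum = -21        # Minimum (a suspicious negative class size)
maximum = 25         # Maximum

# Standard error of the mean: SD / sqrt(N).
se_mean = std_dev / math.sqrt(n_valid)

# Range: maximum minus minimum.
value_range = maximum - minimum

print(f"Std. Error of mean: {se_mean:.3f}")
print(f"Range: {value_range}")
```

The range of 46 is far wider than the interquartile range of 2, which is one sign that a few extreme values (here, the negative class sizes) dominate the spread.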
avg class size k-3 Stem-and-Leaf Plot
Frequency Stem & Leaf
8.00 Extremes (=<14.0)
1.00 15 . &
.00 15 .
14.00 16 . 0000000
.00 16 .
20.00 17 . 0000000000
.00 17 .
64.00 18 . 00000000000000000000000000000000
.00 18 .
143.00 19 . 00000000000000000000000000000000000000000000000000000000000000000000000
.00 19 .
97.00 20 . 000000000000000000000000000000000000000000000000
.00 20 .
40.00 21 . 00000000000000000000
.00 21 .
7.00 22 . 000
.00 22 .
3.00 23 . 0
1.00 Extremes (>=25.0)
Stem width: 1
Each leaf: 2 case(s)
& denotes fractional leaves.
frequencies /var acs_k3.
avg class size k-3

| | Value | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | -21 | 3 | .8 | .8 | .8 |
| | -20 | 2 | .5 | .5 | 1.3 |
| | -19 | 1 | .3 | .3 | 1.5 |
| | 14 | 2 | .5 | .5 | 2.0 |
| | 15 | 1 | .3 | .3 | 2.3 |
| | 16 | 14 | 3.5 | 3.5 | 5.8 |
| | 17 | 20 | 5.0 | 5.0 | 10.8 |
| | 18 | 64 | 16.0 | 16.1 | 26.9 |
| | 19 | 143 | 35.8 | 35.9 | 62.8 |
| | 20 | 97 | 24.3 | 24.4 | 87.2 |
| | 21 | 40 | 10.0 | 10.1 | 97.2 |
| | 22 | 7 | 1.8 | 1.8 | 99.0 |
| | 23 | 3 | .8 | .8 | 99.7 |
| | 25 | 1 | .3 | .3 | 100.0 |
| | Total | 398 | 99.5 | 100.0 | |
| Missing | System | 2 | .5 | | |
| Total | | 400 | 100.0 | | |
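The Percent and Valid Percent columns differ only in their denominator: Percent divides by all 400 cases, while Valid Percent divides by the 398 cases that are not missing. A small Python sketch of that arithmetic for the row with class size 18, using counts copied from the table (the variable names are illustrative):

```python
# Counts taken from the frequency table for acs_k3.
n_total = 400      # all cases in the file
n_missing = 2      # system-missing values of acs_k3
n_valid = n_total - n_missing

freq_18 = 64       # schools with an average class size of 18

percent = 100 * freq_18 / n_total        # base: all cases
valid_percent = 100 * freq_18 / n_valid  # base: non-missing cases only

print(f"Percent: {percent:.1f}")
print(f"Valid Percent: {valid_percent:.1f}")
```

This reproduces the 16.0 and 16.1 shown in the table for that row.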
compute filtvar = (acs_k3 < 0).
filter by filtvar.
list cases /var snum dnum acs_k3.
snum dnum acs_k3
600 140 -20
596 140 -19
611 140 -20
595 140 -21
592 140 -21
602 140 -21
Number of cases read: 6 Number of cases listed: 6
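All six negative values come from district 140, which suggests a sign error at data entry, so a corrected variable racs_k3 is created with the sign dropped:
filter off.
IF (acs_k3<0) racs_k3=ABS(acs_k3).
IF (acs_k3>=0) racs_k3=acs_k3.
EXECUTE.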
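The recode rule can also be read as a single function: negative class sizes lose their sign, non-negative ones are kept, and missing values stay missing. The sketch below is a Python illustration of that rule, not part of the SPSS job; the function name `recode_class_size` and the use of `None` for a system-missing value are assumptions made for the example.

```python
from typing import Optional


def recode_class_size(acs_k3: Optional[int]) -> Optional[int]:
    """Mirror the two IF statements: drop the sign, keep missing as missing."""
    if acs_k3 is None:
        # Neither IF condition is true for a missing value,
        # so the new variable is left missing as well.
        return None
    # For acs_k3 < 0 this is ABS(acs_k3); for acs_k3 >= 0 it is acs_k3 itself.
    return abs(acs_k3)


# A few values from the frequency table plus one missing case.
sample = [-21, -20, -19, 14, 25, None]
racs_k3 = [recode_class_size(v) for v in sample]
print(racs_k3)
```

Writing the result to a new variable rather than overwriting acs_k3 keeps the original values available for checking.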
frequencies variables=full /format=notable /histogram .
pct full credential

| | Value | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | 0.42 | 1 | .3 | .3 | .3 |
| | 0.45 | 1 | .3 | .3 | .5 |
| | 0.46 | 1 | .3 | .3 | .8 |
| | 0.47 | 1 | .3 | .3 | 1.0 |
| | 0.48 | 1 | .3 | .3 | 1.3 |
| | 0.5 | 3 | .8 | .8 | 2.0 |
| | 0.51 | 1 | .3 | .3 | 2.3 |
| | 0.52 | 1 | .3 | .3 | 2.5 |
| | 0.53 | 1 | .3 | .3 | 2.8 |
| | 0.54 | 1 | .3 | .3 | 3.0 |
| | 0.56 | 2 | .5 | .5 | 3.5 |
| | 0.57 | 2 | .5 | .5 | 4.0 |
| | 0.58 | 1 | .3 | .3 | 4.3 |
| | 0.59 | 3 | .8 | .8 | 5.0 |
| | 0.6 | 1 | .3 | .3 | 5.3 |
| | 0.61 | 4 | 1.0 | 1.0 | 6.3 |
| | 0.62 | 2 | .5 | .5 | 6.8 |
| | 0.63 | 1 | .3 | .3 | 7.0 |
| | 0.64 | 3 | .8 | .8 | 7.8 |
| | 0.65 | 3 | .8 | .8 | 8.5 |
| | 0.66 | 2 | .5 | .5 | 9.0 |
| | 0.67 | 6 | 1.5 | 1.5 | 10.5 |
| | 0.68 | 2 | .5 | .5 | 11.0 |
| | 0.69 | 3 | .8 | .8 | 11.8 |
| | 0.7 | 1 | .3 | .3 | 12.0 |
| | 0.71 | 1 | .3 | .3 | 12.3 |
| | 0.72 | 2 | .5 | .5 | 12.8 |
| | 0.73 | 6 | 1.5 | 1.5 | 14.3 |
| | 0.75 | 4 | 1.0 | 1.0 | 15.3 |
| | 0.76 | 2 | .5 | .5 | 15.8 |
| | 0.77 | 2 | .5 | .5 | 16.3 |
| | 0.79 | 3 | .8 | .8 | 17.0 |
| | 0.8 | 5 | 1.3 | 1.3 | 18.3 |
| | 0.81 | 8 | 2.0 | 2.0 | 20.3 |
| | 0.82 | 2 | .5 | .5 | 20.8 |
| | 0.83 | 2 | .5 | .5 | 21.3 |
| | 0.84 | 2 | .5 | .5 | 21.8 |
| | 0.85 | 3 | .8 | .8 | 22.5 |
| | 0.86 | 2 | .5 | .5 | 23.0 |
| | 0.9 | 3 | .8 | .8 | 23.8 |
| | 0.92 | 1 | .3 | .3 | 24.0 |
| | 0.93 | 1 | .3 | .3 | 24.3 |
| | 0.94 | 2 | .5 | .5 | 24.8 |
| | 0.95 | 2 | .5 | .5 | 25.3 |
| | 0.96 | 1 | .3 | .3 | 25.5 |
| | 1 | 2 | .5 | .5 | 26.0 |
| | 37 | 1 | .3 | .3 | 26.3 |
| | 41 | 1 | .3 | .3 | 26.5 |
| | 44 | 2 | .5 | .5 | 27.0 |
| | 45 | 2 | .5 | .5 | 27.5 |
| | 46 | 1 | .3 | .3 | 27.8 |
| | 48 | 1 | .3 | .3 | 28.0 |
| | 53 | 1 | .3 | .3 | 28.3 |
| | 57 | 1 | .3 | .3 | 28.5 |
| | 58 | 3 | .8 | .8 | 29.3 |
| | 59 | 1 | .3 | .3 | 29.5 |
| | 61 | 1 | .3 | .3 | 29.8 |
| | 63 | 2 | .5 | .5 | 30.3 |
| | 64 | 1 | .3 | .3 | 30.5 |
| | 65 | 1 | .3 | .3 | 30.8 |
| | 68 | 2 | .5 | .5 | 31.3 |
| | 69 | 3 | .8 | .8 | 32.0 |
| | 70 | 1 | .3 | .3 | 32.3 |
| | 71 | 3 | .8 | .8 | 33.0 |
| | 72 | 1 | .3 | .3 | 33.3 |
| | 73 | 2 | .5 | .5 | 33.8 |
| | 74 | 1 | .3 | .3 | 34.0 |
| | 75 | 4 | 1.0 | 1.0 | 35.0 |
| | 76 | 4 | 1.0 | 1.0 | 36.0 |
| | 77 | 2 | .5 | .5 | 36.5 |
| | 78 | 4 | 1.0 | 1.0 | 37.5 |
| | 79 | 3 | .8 | .8 | 38.3 |
| | 80 | 10 | 2.5 | 2.5 | 40.8 |
| | 81 | 4 | 1.0 | 1.0 | 41.8 |
| | 82 | 3 | .8 | .8 | 42.5 |
| | 83 | 9 | 2.3 | 2.3 | 44.8 |
| | 84 | 4 | 1.0 | 1.0 | 45.8 |
| | 85 | 8 | 2.0 | 2.0 | 47.8 |
| | 86 | 5 | 1.3 | 1.3 | 49.0 |
| | 87 | 12 | 3.0 | 3.0 | 52.0 |
| | 88 | 6 | 1.5 | 1.5 | 53.5 |
| | 89 | 5 | 1.3 | 1.3 | 54.8 |
| | 90 | 9 | 2.3 | 2.3 | 57.0 |
| | 91 | 8 | 2.0 | 2.0 | 59.0 |
| | 92 | 7 | 1.8 | 1.8 | 60.8 |
| | 93 | 12 | 3.0 | 3.0 | 63.8 |
| | 94 | 10 | 2.5 | 2.5 | 66.3 |
| | 95 | 17 | 4.3 | 4.3 | 70.5 |
| | 96 | 17 | 4.3 | 4.3 | 74.8 |
| | 97 | 11 | 2.8 | 2.8 | 77.5 |
| | 98 | 9 | 2.3 | 2.3 | 79.8 |
| | 100 | 81 | 20.3 | 20.3 | 100.0 |
| | Total | 400 | 100.0 | 100.0 | |
frequencies variables=full .
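The values from 0.42 to 1 (26.0% of the cases) appear to have been entered as proportions rather than percentages, so a corrected variable rfull is created with those values multiplied by 100:
IF (full <= 1) rfull=full * 100.
IF (full > 1) rfull=full.
EXECUTE.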
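The same rescaling rule, written as a Python sketch for illustration only: values of 1 or less are treated as proportions and multiplied by 100, while larger values are assumed to be percentages already. The function name `rescale_full`, the use of `None` for a missing value, and the rounding to two decimals (to hide floating-point noise) are assumptions of this example, not part of the SPSS syntax.

```python
from typing import Optional


def rescale_full(full: Optional[float]) -> Optional[float]:
    """Mirror the two IF statements that build rfull from full."""
    if full is None:
        # A missing value satisfies neither condition and stays missing.
        return None
    if full <= 1:
        # Treated as a proportion: convert to a percentage.
        # Rounding avoids results such as 57.99999999999999 for 0.58.
        return round(float(full) * 100, 2)
    # Already on the 0-100 percentage scale.
    return float(full)


# A few values from the frequency table of full, plus one missing case.
sample = [0.42, 0.5, 1, 37, 100.0, None]
rfull = [rescale_full(v) for v in sample]
print(rfull)
```

The cutoff of 1 is a judgment call based on the frequency table, where no value falls between 1 and 37; a school with a true credential rate at or below 1 percent would be misclassified by this rule.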
simple_regression_example.txt · Last modified: by hkimscil



