simple_regression_example

Data examination

Here we examine several variables at once, rather than a single independent variable (IV) and a single dependent variable (DV). A regression with more than one predictor is called multiple regression; we will discuss it later.

Download example file:

elemapi.sav

elemapi.sps

 display labels.
Variable Labels

Variable  Position   Label
snum             1   school number
dnum             2   district number
api00            3   api 2000
api99            4   api 1999
growth           5   growth 1999 to 2000
meals            6   pct free meals
ell              7   english language learners
yr_rnd           8   year round school
mobility         9   pct 1st year in school
acs_k3          10   avg class size k-3
acs_46          11   avg class size 4-6
not_hsg         12   parent not hsg
hsg             13   parent hsg
some_col        14   parent some college
col_grad        15   parent college grad
grad_sch        16   parent grad school
avg_ed          17   avg parent ed
full            18   pct full credential
emer            19   pct emer credential
enroll          20   number of students
mealcat         21   Percentage free meals in 3 categories

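The variables we are interested in are:
academic performance in 2000 (api00)
percentage of students receiving free meals (meals)
percentage of teachers with full credentials (full)
average class size in kindergarten through grade 3 (acs_k3)

To look at part of these data first, list the first 10 cases: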

list 
  /variables api00 acs_k3 meals full
  /cases from 1 to 10.
 api00 acs_k3 meals     full

   693    16     67    76.00
   570    15     92    79.00
   546    17     97    68.00
   571    20     90    87.00
   478    18     89    87.00
   858    20      .   100.00
   918    19      .   100.00
   831    20      .    96.00
   860    20      .   100.00
   737    21     29    96.00


Number of cases read:  10    Number of cases listed:  10
descriptives /var = all .
Descriptive Statistics

Variable                                  N  Minimum  Maximum      Mean  Std. Deviation
school number                           400       58     6072   2866.81        1543.811
district number                         400       41      796    457.74         184.823
api 2000                                400      369      940    647.62         142.249
api 1999                                400      333      917    610.21         147.136
growth 1999 to 2000                     400      -69      134     37.41          25.247
pct free meals                          315        6      100     71.99          24.386
english language learners               400        0       91     31.45          24.839
year round school                       400        0        1       .23            .421
pct 1st year in school                  399        2       47     18.25           7.485
avg class size k-3                      398      -21       25     18.55           5.005
avg class size 4-6                      397       20       50     29.69           3.841
parent not hsg                          400        0      100     21.25          20.676
parent hsg                              400        0      100     26.02          16.333
parent some college                     400        0       67     19.71          11.337
parent college grad                     400        0      100     19.70          16.471
parent grad school                      400        0       67      8.64          12.131
avg parent ed                           381     1.00     4.62    2.6685          .76379
pct full credential                     400      .42   100.00   66.0568        40.29793
pct emer credential                     400        0       59     12.66          11.746
number of students                      400      130     1570    483.47         226.448
Percentage free meals in 3 categories   400        1        3      2.02            .819
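The descriptive statistics already hint at data problems: some variables have fewer than 400 valid cases, and a class size cannot be negative. The following is a minimal Python sketch (outside the SPSS workflow) that screens a few rows copied from the table for these two warning signs. The names `summary` and `screen`, and the two rules themselves, are assumptions made for this illustration.

```python
# Rows copied from the Descriptive Statistics table: label -> (N, minimum, maximum)
summary = {
    "api 2000": (400, 369, 940),
    "pct free meals": (315, 6, 100),
    "avg class size k-3": (398, -21, 25),
    "avg class size 4-6": (397, 20, 50),
    "pct full credential": (400, 0.42, 100.0),
}

TOTAL_CASES = 400  # number of schools in the file


def screen(rows, total=TOTAL_CASES):
    """Return {label: [warnings]} for variables that deserve a closer look."""
    flagged = {}
    for label, (n, minimum, _maximum) in rows.items():
        notes = []
        if n < total:
            notes.append(f"{total - n} missing")
        if minimum < 0:
            notes.append("negative minimum")
        if notes:
            flagged[label] = notes
    return flagged


for label, notes in screen(summary).items():
    print(f"{label}: {', '.join(notes)}")
```

A screen like this only catches what the summary row reveals. pct full credential passes both rules here, so a summary table alone is not enough to clear a variable.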
examine
  /variables=acs_k3
  /plot histogram stem boxplot .
Descriptives: avg class size k-3

                                              Statistic  Std. Error
Mean                                              18.55        .251
95% Confidence Interval for Mean, Lower Bound     18.05
95% Confidence Interval for Mean, Upper Bound     19.04
5% Trimmed Mean                                   19.13
Median                                            19.00
Variance                                         25.049
Std. Deviation                                    5.005
Minimum                                             -21
Maximum                                              25
Range                                                46
Interquartile Range                                   2
Skewness                                         -7.106        .122
Kurtosis                                         53.014        .244
[Figure: histogram of acs_k3]
avg class size k-3 Stem-and-Leaf Plot

 Frequency    Stem &  Leaf

     8.00 Extremes    (=<14.0)
     1.00       15 .  &
      .00       15 .
    14.00       16 .  0000000
      .00       16 .
    20.00       17 .  0000000000
      .00       17 .
    64.00       18 .  00000000000000000000000000000000
      .00       18 .
   143.00       19 .  00000000000000000000000000000000000000000000000000000000000000000000000
      .00       19 .
    97.00       20 .  000000000000000000000000000000000000000000000000
      .00       20 .
    40.00       21 .  00000000000000000000
      .00       21 .
     7.00       22 .  000
      .00       22 .
     3.00       23 .  0
     1.00 Extremes    (>=25.0)

 Stem width:     1
 Each leaf:       2 case(s)

 & denotes fractional leaves.
[Figure: boxplot of acs_k3]
frequencies
  /var acs_k3.
avg class size k-3

          Value  Frequency  Percent  Valid Percent  Cumulative Percent
Valid       -21          3       .8             .8                  .8
            -20          2       .5             .5                 1.3
            -19          1       .3             .3                 1.5
             14          2       .5             .5                 2.0
             15          1       .3             .3                 2.3
             16         14      3.5            3.5                 5.8
             17         20      5.0            5.0                10.8
             18         64     16.0           16.1                26.9
             19        143     35.8           35.9                62.8
             20         97     24.3           24.4                87.2
             21         40     10.0           10.1                97.2
             22          7      1.8            1.8                99.0
             23          3       .8             .8                99.7
             25          1       .3             .3               100.0
          Total        398     99.5          100.0
Missing  System          2       .5
Total                  400    100.0
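The frequency table holds the complete distribution of acs_k3, so the summary figures can be rebuilt from it and the suspicious values counted. The following is a minimal Python sketch of that cross-check; the dictionary `freq` is copied by hand from the table, and the variable names are illustrative.

```python
import math

# Frequency table for acs_k3, copied from the output: value -> count
freq = {-21: 3, -20: 2, -19: 1, 14: 2, 15: 1, 16: 14, 17: 20, 18: 64,
        19: 143, 20: 97, 21: 40, 22: 7, 23: 3, 25: 1}

n = sum(freq.values())
mean = sum(value * count for value, count in freq.items()) / n
# Sample variance (divides by n - 1)
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std_dev = math.sqrt(variance)
# A class size cannot be negative, so count those cases separately
negatives = sum(count for value, count in freq.items() if value < 0)

print(f"N = {n}, mean = {mean:.2f}, sd = {std_dev:.3f}")
print(f"negative class sizes: {negatives}")
```

The rebuilt N, mean, and standard deviation agree with the Descriptives output for acs_k3, which suggests the table was copied correctly. The six negative values are the cases to investigate.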
compute filtvar = (acs_k3 < 0).
filter by filtvar.
list cases
  /var snum dnum acs_k3.
     snum    dnum acs_k3

      600     140   -20
      596     140   -19
      611     140   -20
      595     140   -21
      592     140   -21
      602     140   -21


Number of cases read:  6    Number of cases listed:  6
filter off.
IF (acs_k3<0) racs_k3=ABS(acs_k3).
IF (acs_k3>=0) racs_k3=acs_k3.
EXECUTE.
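The two IF statements treat the negative class sizes as sign errors and replace them with their absolute values. The following is a minimal Python sketch of the same rule applied to the acs_k3 frequency table, to show what the corrected variable should look like. That the negatives are sign errors is the assumption behind the recode (all six cases come from district 140); `freq` and `corrected` are illustrative names.

```python
# Frequency table for acs_k3 before the recode: value -> count
freq = {-21: 3, -20: 2, -19: 1, 14: 2, 15: 1, 16: 14, 17: 20, 18: 64,
        19: 143, 20: 97, 21: 40, 22: 7, 23: 3, 25: 1}

# Same rule as the two IF statements: negative values become their
# absolute value, all other values are carried over unchanged.
corrected = {}
for value, count in freq.items():
    fixed = abs(value) if value < 0 else value
    corrected[fixed] = corrected.get(fixed, 0) + count

n = sum(corrected.values())
mean = sum(value * count for value, count in corrected.items()) / n

print(f"N = {n}, minimum = {min(corrected)}, maximum = {max(corrected)}")
print(f"mean after recode = {mean:.2f}")
```

In SPSS, running frequencies on racs_k3 after the recode is the equivalent check: the minimum should no longer be negative and the number of valid cases should be unchanged.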
frequencies
  variables=full
  /format=notable
  /histogram .
[Figure: histogram for variable full]
frequencies
  variables=full  .
pct full credential

          Value  Frequency  Percent  Valid Percent  Cumulative Percent
Valid      0.42          1       .3             .3                  .3
           0.45          1       .3             .3                  .5
           0.46          1       .3             .3                  .8
           0.47          1       .3             .3                 1.0
           0.48          1       .3             .3                 1.3
            0.5          3       .8             .8                 2.0
           0.51          1       .3             .3                 2.3
           0.52          1       .3             .3                 2.5
           0.53          1       .3             .3                 2.8
           0.54          1       .3             .3                 3.0
           0.56          2       .5             .5                 3.5
           0.57          2       .5             .5                 4.0
           0.58          1       .3             .3                 4.3
           0.59          3       .8             .8                 5.0
            0.6          1       .3             .3                 5.3
           0.61          4      1.0            1.0                 6.3
           0.62          2       .5             .5                 6.8
           0.63          1       .3             .3                 7.0
           0.64          3       .8             .8                 7.8
           0.65          3       .8             .8                 8.5
           0.66          2       .5             .5                 9.0
           0.67          6      1.5            1.5                10.5
           0.68          2       .5             .5                11.0
           0.69          3       .8             .8                11.8
            0.7          1       .3             .3                12.0
           0.71          1       .3             .3                12.3
           0.72          2       .5             .5                12.8
           0.73          6      1.5            1.5                14.3
           0.75          4      1.0            1.0                15.3
           0.76          2       .5             .5                15.8
           0.77          2       .5             .5                16.3
           0.79          3       .8             .8                17.0
            0.8          5      1.3            1.3                18.3
           0.81          8      2.0            2.0                20.3
           0.82          2       .5             .5                20.8
           0.83          2       .5             .5                21.3
           0.84          2       .5             .5                21.8
           0.85          3       .8             .8                22.5
           0.86          2       .5             .5                23.0
            0.9          3       .8             .8                23.8
           0.92          1       .3             .3                24.0
           0.93          1       .3             .3                24.3
           0.94          2       .5             .5                24.8
           0.95          2       .5             .5                25.3
           0.96          1       .3             .3                25.5
              1          2       .5             .5                26.0
             37          1       .3             .3                26.3
             41          1       .3             .3                26.5
             44          2       .5             .5                27.0
             45          2       .5             .5                27.5
             46          1       .3             .3                27.8
             48          1       .3             .3                28.0
             53          1       .3             .3                28.3
             57          1       .3             .3                28.5
             58          3       .8             .8                29.3
             59          1       .3             .3                29.5
             61          1       .3             .3                29.8
             63          2       .5             .5                30.3
             64          1       .3             .3                30.5
             65          1       .3             .3                30.8
             68          2       .5             .5                31.3
             69          3       .8             .8                32.0
             70          1       .3             .3                32.3
             71          3       .8             .8                33.0
             72          1       .3             .3                33.3
             73          2       .5             .5                33.8
             74          1       .3             .3                34.0
             75          4      1.0            1.0                35.0
             76          4      1.0            1.0                36.0
             77          2       .5             .5                36.5
             78          4      1.0            1.0                37.5
             79          3       .8             .8                38.3
             80         10      2.5            2.5                40.8
             81          4      1.0            1.0                41.8
             82          3       .8             .8                42.5
             83          9      2.3            2.3                44.8
             84          4      1.0            1.0                45.8
             85          8      2.0            2.0                47.8
             86          5      1.3            1.3                49.0
             87         12      3.0            3.0                52.0
             88          6      1.5            1.5                53.5
             89          5      1.3            1.3                54.8
             90          9      2.3            2.3                57.0
             91          8      2.0            2.0                59.0
             92          7      1.8            1.8                60.8
             93         12      3.0            3.0                63.8
             94         10      2.5            2.5                66.3
             95         17      4.3            4.3                70.5
             96         17      4.3            4.3                74.8
             97         11      2.8            2.8                77.5
             98          9      2.3            2.3                79.8
            100         81     20.3           20.3               100.0
          Total        400    100.0          100.0
IF (full  <= 1) rfull=full  * 100.
IF (full > 1) rfull=full.
EXECUTE.
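The frequency table shows that about a quarter of the values of full lie between 0.42 and 1, while the rest lie between 37 and 100. This suggests that some schools were recorded as proportions and others as percentages. The two IF statements put everything on the percentage scale. The following is a minimal Python sketch of the same rule, applied to a few values taken from the output; the function name `rescale_full` and the rounding step are assumptions added for this illustration.

```python
def rescale_full(value):
    """Mirror the two IF statements: values <= 1 are treated as proportions."""
    if value <= 1:
        # Round to two decimals to hide floating-point noise (e.g. 0.57 * 100)
        return round(value * 100, 2)
    return value


# A few values of full taken from the frequency table and the case listing
sample = [0.42, 0.57, 0.96, 1, 37, 76.0, 100.0]
rescaled = [rescale_full(v) for v in sample]

for before, after in zip(sample, rescaled):
    print(f"{before} -> {after}")
```

Note that a recorded value of exactly 1 is treated as a proportion and becomes 100, as in the SPSS rule `full <= 1`. After the recode, every value of rfull should fall between 37 and 100.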
simple_regression_example.txt · Last modified: 2017/05/24 08:56 by hkimscil