See also: ANOVA, Factorial ANOVA, paired sample t-test
Repeated Measure ANOVA
Introduction
- one-way ANOVA for related, not-independent groups
- extension of the dependent t-test (one group t-test, repeated measure t-test)
- also called “within-subjects ANOVA” or “ANOVA for correlated samples”
- the simplest one is one-way repeated measures ANOVA
- which requires one independent and one dependent variable
- the independent variable is categorical (either nominal or ordinal)
- the dependent variable is continuous (interval or ratio)
Test Circumstances
- one subject with repeated measures across a time period (differences of mean scores across three or more time periods)
- participants being tested with headache drugs such as
- group A, B, C, placebo
- across the time periods j, k, l, m
- testing the effect of a three-month exercise training program on blood sugar level
- measure blood sugar level at 3 different points (pre-exercise, midway, post-exercise)
- one subject with repeated measures in different situations (treatments; differences of mean scores under three or more different conditions)
- e.g., participants (n=30) using and evaluating three web site UIs (naver, daum, and google)
- and rating their usefulness, usability and ease of use
- data should look as follows:
subject | pre-exercise “sugar level” | mid-term “sugar level” | post-exercise “sugar level” |
---|---|---|---|
a | 250 | 220 | 150 |
b | 300 | 170 | 120 |
c | 150 | 120 | 120 |
d | 230 | 170 | 160 |
e | 260 | 250 | 250 |
 | level 1 | level 2 | level 3 |
Levels = related groups of the independent variable “time”
subject | treatment condition “naver” | treatment condition “daum” | treatment condition “google” |
---|---|---|---|
a | 70 | 60 | 80 |
b | 50 | 70 | 50 |
c | 40 | 50 | 60 |
d | 30 | 40 | 60 |
e | 60 | 50 | 40 |
 | level 1 | level 2 | level 3 |
In general, the data should look like this (each column is a time point or condition, each row is one subject):
subject | T1 | T2 | T3 |
---|---|---|---|
s1 | s1 | s1 | s1 |
s2 | s2 | s2 | s2 |
s3 | s3 | s3 | s3 |
s4 | s4 | s4 | s4 |
s5 | s5 | s5 | s5 |
.. | .. | .. | .. |
sn | sn | sn | sn |
Distinguish the layout above from the ordinary (independent-groups) ANOVA situation, where each subject belongs to only one group:
subject | group (treatment) | score |
---|---|---|
a | 1 | 70 |
b | 1 | 50 |
c | 1 | 40 |
d | 1 | 30 |
e | 1 | 60 |
f | 2 | 60 |
g | 2 | 70 |
h | 2 | 50 |
i | 2 | 40 |
j | 2 | 50 |
k | 3 | 80 |
l | 3 | 50 |
m | 3 | 60 |
n | 3 | 60 |
o | 3 | 40 |
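The difference between the two layouts matters in practice because R's `aov()` expects one row per measurement (long format), with a subject column that repeats in the repeated measures case. A minimal sketch, assuming the naver/daum/google scores from the earlier table; the names `wide` and `long` are illustrative only:

```r
# Wide layout: one row per subject, one column per condition (scores from the table above).
wide <- data.frame(
  subject = c("a", "b", "c", "d", "e"),
  naver   = c(70, 50, 40, 30, 60),
  daum    = c(60, 70, 50, 40, 50),
  google  = c(80, 50, 60, 60, 40)
)

# Long layout: one row per measurement, so every subject appears once per condition.
long <- data.frame(
  subject = factor(rep(wide$subject, times = 3)),
  site    = factor(rep(c("naver", "daum", "google"), each = nrow(wide)),
                   levels = c("naver", "daum", "google")),
  score   = c(wide$naver, wide$daum, wide$google)
)

# Each subject contributes three rows; in an independent-groups design
# each subject id would appear only once.
table(long$subject)
```

In the independent-groups table, the same 15 scores come from 15 different subjects (a to o), so no subject id repeats.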
LOGICS
- $\text{independent ANOVA: } F = \displaystyle \frac{MS_{between}}{MS_{within}} = \frac{MS_{between}}{MS_{error}}$
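- $\text{rep measures ANOVA: } F = \displaystyle \frac{MS_{conditions}}{MS_{error}}$, where $MS_{conditions}$ takes the place of $MS_{between}$ and $MS_{error}$ comes from only part of the within variation
Note:
- The word “between” implies a comparison among independent groups, so for repeated measures the term “conditions” is used instead.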
- but, $\text{SS}_{\text{within}}$ can be partitioned into
- $\text{SS}_{\text{subjects}}$ and $\text{SS}_{\text{error}}$
- that is, some of the “within variation” comes from stable differences between individuals, which each subject carries across all conditions.
- Of the two, we can remove the first ($\text{SS}_{\text{subjects}}$) from SSwithin
- and use only the latter as SSerror
- This is to say:
- in $\text{independent ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{error}}$
- in $\text{rep measures ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{subjects}} + \text{SS}_{\text{error}}$
- This means that SSerror will be smaller than in an independent ANOVA on the same scores
- But the df for this SSerror also shrinks, from N - k to (n-1)(k-1), where n is the number of subjects and k is the number of conditions
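The partition described above can be written out in one place. This is a summary sketch in the notation of this page, assuming n subjects, k conditions, and N = nk scores in total:

```latex
\begin{aligned}
SS_{total}   &= SS_{conditions} + SS_{within} \\
SS_{within}  &= SS_{subjects} + SS_{error} \\
df_{total}   &= N - 1, \qquad df_{conditions} = k - 1, \qquad df_{within} = N - k \\
df_{subjects} &= n - 1, \qquad df_{error} = df_{within} - df_{subjects} = (n - 1)(k - 1) \\
F &= \frac{MS_{conditions}}{MS_{error}}
   = \frac{SS_{conditions} / (k - 1)}{SS_{error} / \big((n - 1)(k - 1)\big)}
\end{aligned}
```

Removing the subject variation makes the numerator of the error term smaller, while the smaller error df makes the critical F value larger, so the two effects work in opposite directions.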
subjects | Pre | 1 Month | 3 Month | Subject Means |
---|---|---|---|---|
1 | 45 | 50 | 55 | 50 |
2 | 42 | 42 | 45 | 43 |
3 | 36 | 41 | 43 | 40 |
4 | 39 | 35 | 40 | 38 |
5 | 51 | 55 | 59 | 55 |
6 | 44 | 49 | 56 | 49.7 |
Monthly mean | 42.8 | 45.3 | 49.7 | |
Grand mean: 45.9 |
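The same partition can be checked in R instead of a spreadsheet. The following is a minimal sketch using only base R and the six subjects from the table above; the variable names are illustrative, not part of the original notes:

```r
# Scores from the table above: rows are subjects, columns are time points.
scores <- matrix(
  c(45, 50, 55,
    42, 42, 45,
    36, 41, 43,
    39, 35, 40,
    51, 55, 59,
    44, 49, 56),
  ncol = 3, byrow = TRUE,
  dimnames = list(as.character(1:6), c("Pre", "1 Month", "3 Month"))
)

n <- nrow(scores)            # number of subjects
k <- ncol(scores)            # number of time points (levels)
grand_mean <- mean(scores)

# Variation of the time-point means around the grand mean (SS_conditions)
ss_conditions <- n * sum((colMeans(scores) - grand_mean)^2)
# Variation of the subject means around the grand mean (SS_subjects)
ss_subjects <- k * sum((rowMeans(scores) - grand_mean)^2)
# Variation of each score around its own time-point mean (SS_within)
ss_within <- sum(sweep(scores, 2, colMeans(scores))^2)
# What remains after the subject variation is removed (SS_error)
ss_error <- ss_within - ss_subjects

df_conditions <- k - 1
df_error <- (n - 1) * (k - 1)
f_value <- (ss_conditions / df_conditions) / (ss_error / df_error)
p_value <- pf(f_value, df_conditions, df_error, lower.tail = FALSE)

round(c(SS_conditions = ss_conditions, SS_subjects = ss_subjects,
        SS_error = ss_error, F = f_value, p = p_value), 3)
```

Using `pf()` for the p-value replaces the lookup in a printed F-distribution table.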
We do these calculations (for this and the following example) in an Excel spreadsheet.
We also need an F-distribution table to carry out the null hypothesis test.
Headache Analysis (base and treatment weeks)
ser | w1 | w2 | w3 | w4 | w5 | $\overline{X}_{part}$ = average per case (subject, participant) |
---|---|---|---|---|---|---|
1 | 21 | 22 | 8 | 6 | 6 | 12.6 |
2 | 20 | 19 | 10 | 4 | 9 | 12.4 |
3 | 7 | 5 | 5 | 4 | 5 | 5.2 |
4 | 25 | 30 | 13 | 12 | 4 | 16.8 |
5 | 30 | 33 | 10 | 8 | 6 | 17.4 |
6 | 19 | 27 | 8 | 7 | 4 | 13 |
7 | 26 | 16 | 5 | 2 | 5 | 10.8 |
8 | 13 | 4 | 8 | 1 | 5 | 6.2 |
9 | 26 | 24 | 14 | 8 | 17 | 17.8 |
average per week | 20.78 | 20.00 | 9.00 | 5.78 | 6.78 | $\overline{X}$ = 12.47 |
Stats | |
---|---|
Mean Total | 12.47 |
$\Sigma{X_i}$ | 561 |
$\Sigma{{X_i}^2}$ | 10483 |
# of week | 5 |
# of case (n) | 9 |
SStotal = $\Sigma{(X-\overline{X})^2} $ = 3489.2
SSbetween
= SSconditions
= SSweeks
= $n\Sigma{(\overline{X}_{week} - \overline{X})^2}$ = 1934.5
SSwithin
= $ \Sigma \Sigma{(X_{s_i.t_j} - \overline{X_{t_j}})^2}$
= $ \Sigma (411.6, 836.0, 78.0, 93.6, 135.6) $
= 1554.7
SSparticipants = $w\Sigma{(\overline{X}_{participants}-\overline{X})^2}$ = 833.6
SSresidual
= SSerror
= SSwithin - SSparticipants
= 1554.7 - 833.6
= 721.1
OR
SSresidual
= SSerror
= (SStotal - SSweeks(between)) - SSparticipants
= 3489.2 - 1934.5 - 833.6
= 721.1
dftotal = N - 1 = 45 - 1 = 44
dfweek = 5 - 1 = 4 = dfbetween
dfparticipants = 9 - 1 = 8 = dfsubjects
dferror = (n - 1)(k - 1) = 8 * 4 = 32, or equivalently dfwithin - dfparticipants = 40 - 8 = 32
dfwithin = N - k = 45 - 5 = 40
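The sums of squares and degrees of freedom above lead directly to the F ratio. Here is a hedged sketch in base R that recomputes them from the headache table and finishes the test; the 0.05 significance level is an assumption made for illustration, and the variable names are not from the original notes:

```r
# Headache scores: rows are the 9 participants, columns are weeks w1..w5.
headache <- matrix(
  c(21, 22,  8,  6,  6,
    20, 19, 10,  4,  9,
     7,  5,  5,  4,  5,
    25, 30, 13, 12,  4,
    30, 33, 10,  8,  6,
    19, 27,  8,  7,  4,
    26, 16,  5,  2,  5,
    13,  4,  8,  1,  5,
    26, 24, 14,  8, 17),
  ncol = 5, byrow = TRUE,
  dimnames = list(as.character(1:9), paste0("w", 1:5))
)

n <- nrow(headache)          # number of cases
k <- ncol(headache)          # number of weeks
grand_mean <- mean(headache)

ss_total        <- sum((headache - grand_mean)^2)
ss_weeks        <- n * sum((colMeans(headache) - grand_mean)^2)
ss_participants <- k * sum((rowMeans(headache) - grand_mean)^2)
ss_error        <- ss_total - ss_weeks - ss_participants

df_weeks <- k - 1
df_error <- (n - 1) * (k - 1)

ms_weeks <- ss_weeks / df_weeks
ms_error <- ss_error / df_error
f_value  <- ms_weeks / ms_error

# Critical value at an assumed alpha of 0.05, and the p-value of the observed F.
f_critical <- qf(0.95, df_weeks, df_error)
p_value    <- pf(f_value, df_weeks, df_error, lower.tail = FALSE)

round(c(SS_total = ss_total, SS_weeks = ss_weeks,
        SS_participants = ss_participants, SS_error = ss_error,
        F = f_value, F_critical = f_critical), 2)
```

With the values listed above, MSweeks = 1934.5 / 4 and MSerror = 721.1 / 32, which gives an F(4, 32) of roughly 21.5.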
ie
Visual perception score | |||
---|---|---|---|
Participant | No visual distraction | Visual distraction | Sound Distraction |
A | 47 | 22 | 41 |
B | 57 | 31 | 52 |
C | 38 | 18 | 40 |
D | 45 | 32 | 43 |
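As a sketch of how this small table could be analyzed, the scores can be entered in long format and passed to `aov()` with an `Error()` term for the participant. The data frame name `distraction` and the level labels `none`, `visual`, and `sound` are chosen here for illustration:

```r
# One row per measurement: 4 participants x 3 conditions.
distraction <- data.frame(
  participant = factor(rep(c("A", "B", "C", "D"), times = 3)),
  condition   = factor(rep(c("none", "visual", "sound"), each = 4),
                       levels = c("none", "visual", "sound")),
  score = c(47, 57, 38, 45,   # no visual distraction
            22, 31, 18, 32,   # visual distraction
            41, 52, 40, 43)   # sound distraction
)

# Error(participant) removes the between-participant variation from the error term.
fit <- aov(score ~ condition + Error(participant), data = distraction)
summary(fit)
```

The "Error: Within" stratum of the summary holds the condition effect and the residual (error) term used for the F ratio.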
in r
demo1
https://rcompanion.org/handbook/I_09.html
    demo1 <- read.csv("https://stats.idre.ucla.edu/stat/data/demo1.csv")
    demo1
    str(demo1)
    ## All variables are int (numeric), so they must be converted to factors
    ## Convert variables to factor
    demo1 <- within(demo1, {
      group <- factor(group)
      time <- factor(time)
      id <- factor(id)
    })
    ## Now every variable except pulse has been converted to a factor
    str(demo1)
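The demo1 data look as follows.
id | group | pulse | time |
---|---|---|---|
1 | 1 | 10 | 1 |
1 | 1 | 10 | 2 |
1 | 1 | 10 | 3 |
2 | 1 | 10 | 1 |
2 | 1 | 10 | 2 |
2 | 1 | 10 | 3 |
3 | 1 | 10 | 1 |
3 | 1 | 10 | 2 |
3 | 1 | 10 | 3 |
4 | 1 | 10 | 1 |
4 | 1 | 10 | 2 |
4 | 1 | 10 | 3 |
5 | 2 | 15 | 1 |
5 | 2 | 15 | 2 |
5 | 2 | 15 | 3 |
6 | 2 | 15 | 1 |
6 | 2 | 15 | 2 |
6 | 2 | 15 | 3 |
7 | 2 | 16 | 1 |
7 | 2 | 15 | 2 |
7 | 2 | 15 | 3 |
8 | 2 | 15 | 1 |
8 | 2 | 15 | 2 |
8 | 2 | 15 | 3 |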
Rearranged by time, this gives:
id | t1 | t2 | t3 | mean of the same person's measures |
---|---|---|---|---|
1 | 10 | 10 | 10 | 10 |
2 | 10 | 10 | 10 | 10 |
3 | 10 | 10 | 10 | 10 |
4 | 10 | 10 | 10 | 10 |
5 | 15 | 15 | 15 | 15 |
6 | 15 | 15 | 15 | 15 |
7 | 16 | 15 | 15 | 15.333 |
8 | 15 | 15 | 15 | 15 |
mean across the time | 12.625 | 12.5 | 12.5 | 12.542 |
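The wide table above is enough to work out the sums of squares by hand before running `aov()`. A minimal base R sketch, rebuilding the scores inline so that it does not depend on the download; the matrix name `pulse` is illustrative:

```r
# Wide layout of demo1: rows are ids 1..8, columns are time 1..3.
pulse <- matrix(c(rep(10, 12), rep(15, 12)), ncol = 3, byrow = TRUE,
                dimnames = list(as.character(1:8), c("t1", "t2", "t3")))
pulse["7", "t1"] <- 16       # id 7 at time 1 is the only score that differs

n <- nrow(pulse)             # number of subjects
k <- ncol(pulse)             # number of time points
grand_mean <- mean(pulse)

ss_total <- sum((pulse - grand_mean)^2)
ss_time  <- n * sum((colMeans(pulse) - grand_mean)^2)   # effect of time
ss_id    <- k * sum((rowMeans(pulse) - grand_mean)^2)   # between-subject variation
ss_error <- ss_total - ss_time - ss_id                  # residual within subjects

df_time  <- k - 1
df_error <- (n - 1) * (k - 1)
f_value  <- (ss_time / df_time) / (ss_error / df_error)
p_value  <- pf(f_value, df_time, df_error, lower.tail = FALSE)

round(c(SS_id = ss_id, SS_time = ss_time, SS_error = ss_error,
        F = f_value, p = p_value), 4)
```

Almost all of the variation sits between subjects (ids 1 to 4 versus ids 5 to 8), which is exactly the part that the repeated measures design removes from the error term.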
    demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
    summary(demo1.within.only.aov)

    > demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
    > summary(demo1.within.only.aov)

    Error: id
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals  7  155.3   22.18

    Error: Within
              Df Sum Sq Mean Sq F value Pr(>F)
    time       2 0.0833 0.04167       1  0.393
    Residuals 14 0.5833 0.04167
    >
demo 2
Two-way repeated measures ANOVA
reference
- Repeated measures one-way ANOVA by Akkelin
- http://rcompanion.org/handbook/I_09.html : This is an excellent example, but difficult to digest.