See also: [[ANOVA]], [[:Factorial Anova|Factorial ANOVA]], [[:t-test#동일집단_간의_차이에_대해서_알아볼_때|paired sample t-test]], [[:r:repeated_measures_anova]]
====== Repeated Measure ANOVA ======
Introduction
  * one-way ANOVA for //**related (not independent) groups**//
  * an extension of the dependent t-test (paired-samples t-test, repeated measures t-test) to three or more measurements
  * also called "within-subjects ANOVA" or "ANOVA for correlated samples"
  * the simplest form is the __one-way repeated measures ANOVA__,
  * which requires one independent and one dependent variable
  * the independent variable is categorical (nominal or ordinal)
  * the dependent variable is continuous (interval or ratio)

Test Circumstances 
  * the same subjects measured repeatedly over time (differences in mean scores across three or more time points)
    * participants being tested with headache drugs such as 
      * group A, B, C, placebo 
      * across the time periods j, k, l, m
    * testing the effect of a three-month exercise training program on blood sugar level
      * measure blood sugar level at 3 different points (pre-exercise, midway, post-exercise)
  * the same subjects measured under different situations (treatments; differences in mean scores under three or more conditions)
    * e.g., participants (n=30) each use and evaluate three web site UIs (naver, daum, and google)
    * and rate their usefulness, usability, and ease of use
  * data should look as follows:

^ ^ pre-exercise \\ "sugar level"   ^ mid-term \\ "sugar level"   ^ post-exercise  \\ "sugar level"  ^
|  a  | 250  | 220  | 150  |
|  b  | 300  | 170  | 120  |
|  c  | 150  | 120  | 120  |
|  d  | 230  | 170  | 160  |
|  e  | 260  | 250  | 250  |
|     | level 1  | level 2  | level 3  |

Levels = related groups of the independent variable "time"

^ ^ treatment \\ condition \\ "naver"   ^ treatment \\ condition \\ "daum"   ^ treatment \\ condition \\ "google"   ^
|  a  | 70  | 60  | 80  |
|  b  | 50  | 70  | 50  |
|  c  | 40  | 50  | 60  |
|  d  | 30  | 40  | 60  |
|  e  | 60  | 50  | 40  |
|     | level 1  | level 2  | level 3  |

In general, the data should look like this:
^ ^  time/condition  ^^^
| |  T1  |  T2  |  T3  |
|  s1  |  s1  |  s1  |  s1  |
|  s2  |  s2  |  s2  |  s2  |
|  s3  |  s3  |  s3  |  s3  |
|  s4  |  s4  |  s4  |  s4  |
|  s5  |  s5  |  s5  |  s5  |
|  ..  |  ..  |  ..  |  ..  |
|  sn  |  sn  |  sn  |  sn  |

Distinguish the layout above from the ordinary (independent-groups) ANOVA layout, in which each subject belongs to only one group:

^  ^  group  ^  treatment  ^
| a |  1  |  70  |
| b |  1  |  50  |
| c |  1  |  40  |
| d |  1  |  30  |
| e |  1  |  60  |
| f |  2  |  60  |
| g |  2  |  70  |
| h |  2  |  50  |
| i |  2  |  40  |
| j |  2  |  50  |
| k |  3  |  80  |
| l |  3  |  50  |
| m |  3  |  60  |
| n |  3  |  60  |
| o |  3  |  40  |
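
As a minimal sketch in R, the repeated-measures layout from the blood sugar table can be reshaped into the long format that ''aov()'' works with. The column names ''pre'', ''mid'', ''post'', ''sugar'' and the object names ''wide'' and ''long'' are illustrative assumptions, not part of the original data files.

<code>
# Wide layout: one row per subject, one column per level of "time"
# (values copied from the blood sugar table above)
wide <- data.frame(
  id   = c("a", "b", "c", "d", "e"),
  pre  = c(250, 300, 150, 230, 260),
  mid  = c(220, 170, 120, 170, 250),
  post = c(150, 120, 120, 160, 250)
)

# Long layout: one row per subject-by-time measurement
long <- reshape(wide, direction = "long",
                varying = c("pre", "mid", "post"), v.names = "sugar",
                timevar = "time", times = c("pre", "mid", "post"),
                idvar = "id")
long$id   <- factor(long$id)
long$time <- factor(long$time, levels = c("pre", "mid", "post"))

# Every subject appears once at every level: the groups are related
table(long$id, long$time)
</code>

In the independent-groups layout each subject would appear in exactly one group, so the same cross-tabulation would contain zeros.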

Logic
  * $\text{independent ANOVA: } F = \displaystyle \frac{MS_{between}}{MS_{within}} = \frac{MS_{between}}{MS_{error}}$

  * $\text{rep measures ANOVA: } F = \displaystyle \frac{MS_{between}}{MS_{within}} = \displaystyle \frac{MS_{conditions}}{MS_{error}}$

Note:
  * The word "between" implies a comparison **between** independent groups, so for repeated measures the term "conditions" is used instead.

{{:pasted:20240501-082722.png}}
----
{{:pasted:20240513-083858.png}}
----
  * However, $\text{SS}_{\text{within}}$ can be partitioned into
    * $\text{SS}_{\text{subjects}}$ and $\text{SS}_{\text{error}}$
    * that is, part of the "within" variation is carried by each individual (stable differences between subjects).
    * Of the two, we can remove the first from SS<sub>within</sub>
    * and use only the latter as SS<sub>error</sub>
    * That is:
      * in $\text{independent ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{error}}$
      * in $\text{rep measures ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{subjects}} + \text{SS}_{\text{error}}$
    * This means that SS<sub>error</sub> will be **__smaller__**
    * However, the df for this SS<sub>error</sub> also becomes smaller: (n-1)(k-1) instead of N-k, where n is the number of subjects, k the number of conditions, and N = nk the total number of observations

^  subjects  ^  Pre  ^  1 Month  ^  3 Month  ^  Subject \\ Means  ^
|  1  |  45  |  50  |  55  |  **50**  |
|  2  |  42  |  42  |  45  |  **43**  |
|  3  |  36  |  41  |  43  |  **40**  |
|  4  |  39  |  35  |  40  |  **38**  |
|  5  |  51  |  55  |  59  |  **55**  |
|  6  |  44  |  49  |  56  |  **49.7**  |
|  **Monthly mean**  |  **42.8**  |  **45.3**  |  **49.7**  |   |
|  **Grand mean: 45.9**      |||||
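
The partition described above can be checked by hand on the six-subject table. The following is a sketch in base R under the assumption that the scores are entered as a subjects-by-time matrix; the object names (''x'', ''ss_time'', ''f_rm'', ''f_ind'') are illustrative only.

<code>
# Scores from the table above: 6 subjects x 3 time points
x <- matrix(c(45, 50, 55,
              42, 42, 45,
              36, 41, 43,
              39, 35, 40,
              51, 55, 59,
              44, 49, 56),
            ncol = 3, byrow = TRUE,
            dimnames = list(1:6, c("pre", "month1", "month3")))
n <- nrow(x)   # number of subjects
k <- ncol(x)   # number of conditions (time points)
grand <- mean(x)

ss_time     <- n * sum((colMeans(x) - grand)^2)  # SS conditions
ss_within   <- sum(sweep(x, 2, colMeans(x))^2)   # variation inside each time point
ss_subjects <- k * sum((rowMeans(x) - grand)^2)  # part carried by individuals
ss_error    <- ss_within - ss_subjects           # what remains as error

# F with the reduced error term vs. F if the groups were treated as independent
f_rm  <- (ss_time / (k - 1)) / (ss_error  / ((n - 1) * (k - 1)))
f_ind <- (ss_time / (k - 1)) / (ss_within / (n * k - k))
round(c(ss_within = ss_within, ss_subjects = ss_subjects,
        ss_error = ss_error, F_repeated = f_rm, F_independent = f_ind), 3)
</code>

Because most of the within-condition variation here comes from stable differences between subjects, removing SS<sub>subjects</sub> makes the F ratio much larger than it would be with the independent-groups error term, even though the error df drops from N-k to (n-1)(k-1).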

This example and the one below are worked through in an Excel {{:r:repeated_measures_anova_eg.xlsx|spreadsheet, repeated_measures_anova_eg.xlsx}}. 
We also need an {{:ftable.pdf|F distribution table}} to look up the critical value for the null hypothesis test.

^  Headache Analysis  ^^^^^^^
| | base   treatment  ||||| average \\ per case \\ (subject, \\ participant)  |
|  ser  | w1  |  w2  |  w3  |  w4  |  w5  | $\overline{X}_{part}$ \\ = average \\ per case \\ (subject, \\ participant)  |
|  1  |  21  |  22  |  8  |  6  |  6  |  12.6  |
|  2  |  20  |  19  |  10  |  4  |  9  |  12.4  |
|  3  |  7  |  5  |  5  |  4  |  5  |  5.2  |
|  4  |  25  |  30  |  13  |  12  |  4  |  16.8  |
|  5  |  30  |  33  |  10  |  8  |  6  |  17.4  |
|  6  |  19  |  27  |  8  |  7  |  4  |  13  |
|  7  |  26  |  16  |  5  |  2  |  5  |  10.8  |
|  8  |  13  |  4  |  8  |  1  |  5  |  6.2  |
|  9  |  26  |  24  |  14  |  8  |  17  |  17.8  |
|  average \\ per week  |  20.78  |  20.00  |  9.00  |  5.78  |  6.78  |  $\overline{X}$ = 12.47  |

^  Stats  ^^
|  Mean Total | 12.47  |
|  $\Sigma{X_i}$ | 561  |
|  $\Sigma{{X_i}^2}$ | 10483  |
|  # of weeks (w) | 5  |
|  # of cases (n) | 9  |

SS<sub>total</sub> = $\Sigma{(X-\overline{X})^2} $ = 3489.2 \\

SS<sub>between</sub>
= SS<sub>conditions</sub> 
= SS<sub>weeks</sub> 
= $n\Sigma{(\overline{X}_{week} - \overline{X})^2}$ = 1934.5 \\

SS<sub>within</sub> 
= $ \Sigma \Sigma{(X_{s_i.t_j} - \overline{X_{t_j}})^2}$ 
= $ \Sigma (411.6, 836.0, 78.0, 93.6, 135.6) $ 
= 1554.7 
\\

SS<sub>participants</sub> = $w\Sigma{(\overline{X}_{participants}-\overline{X})^2}$ = 833.6 \\

SS<sub>residual</sub>
= SS<sub>error</sub> 
= SS<sub>within</sub> - SS<sub>participants</sub>
= 1554.7 - 833.6
= 721.1

OR \\
SS<sub>residual</sub>
= SS<sub>error</sub> 
= (SS<sub>total</sub> - SS<sub>weeks(between)</sub>) - SS<sub>participants</sub>  
= (3489.2 - 1934.5) - 833.6
= 721.1 \\
\\
df<sub>total</sub> = N - 1 = 45 - 1 = 44 \\
df<sub>week</sub> = 5 - 1 = 4 = df<sub>between</sub> \\
df<sub>participants</sub> = 9 - 1 = 8 = df<sub>subjects</sub> \\
df<sub>error</sub> = (n - 1)(k - 1) = 8 * 4 = 32 = df<sub>within</sub> - df<sub>participants</sub> = 40 - 8 \\
df<sub>within</sub> = N - k = 45 - 5 = 40
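
The sums of squares and degrees of freedom above can be reproduced, and the F test completed, with a short base R sketch. The matrix ''x'' and the other object names are illustrative assumptions; ''pf()'' is used here in place of the F table lookup.

<code>
# Headache scores from the table above: 9 participants x 5 weeks
x <- matrix(c(21, 22,  8,  6,  6,
              20, 19, 10,  4,  9,
               7,  5,  5,  4,  5,
              25, 30, 13, 12,  4,
              30, 33, 10,  8,  6,
              19, 27,  8,  7,  4,
              26, 16,  5,  2,  5,
              13,  4,  8,  1,  5,
              26, 24, 14,  8, 17),
            ncol = 5, byrow = TRUE)
n <- nrow(x)   # number of cases
w <- ncol(x)   # number of weeks
grand <- mean(x)

ss_total        <- sum((x - grand)^2)
ss_weeks        <- n * sum((colMeans(x) - grand)^2)
ss_within       <- sum(sweep(x, 2, colMeans(x))^2)
ss_participants <- w * sum((rowMeans(x) - grand)^2)
ss_error        <- ss_within - ss_participants

df_weeks <- w - 1
df_error <- (n - 1) * (w - 1)
f_value  <- (ss_weeks / df_weeks) / (ss_error / df_error)
p_value  <- pf(f_value, df_weeks, df_error, lower.tail = FALSE)

round(c(total = ss_total, weeks = ss_weeks, within = ss_within,
        participants = ss_participants, error = ss_error), 1)
c(F = f_value, p = p_value)
</code>

The F ratio is MS<sub>weeks</sub> / MS<sub>error</sub> with df = (4, 32); the resulting p value answers the same question as comparing F against the critical value in the F table.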

====== Example ======
^  Visual perception score  ^^^^
|Participant | No visual distraction | Visual distraction | Sound distraction |
|  A  |  47  |  22  |  41  |
|  B  |  57  |  31  |  52  |
|  C  |  38  |  18  |  40  |
|  D  |  45  |  32  |  43  |
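
A minimal sketch of how this small data set could be analyzed in R with ''aov()'' and an ''Error()'' term. The variable names ''id'', ''cond'', ''score'' and the condition labels ''none'', ''visual'', ''sound'' are assumptions chosen for illustration.

<code>
# Scores from the table above, in long format (one row per measurement)
d <- data.frame(
  id    = factor(rep(c("A", "B", "C", "D"), each = 3)),
  cond  = factor(rep(c("none", "visual", "sound"), times = 4),
                 levels = c("none", "visual", "sound")),
  score = c(47, 22, 41,
            57, 31, 52,
            38, 18, 40,
            45, 32, 43)
)

# Error(id) removes the between-participant variation from the error term
fit <- aov(score ~ cond + Error(id), data = d)
summary(fit)

# Pull the condition row out of the "Within" stratum
within_tab <- summary(fit)[["Error: Within"]][[1]]
f_cond <- within_tab[["F value"]][1]
f_cond
</code>

The "Error: id" stratum holds SS<sub>subjects</sub>, and the "Error: Within" stratum holds SS<sub>conditions</sub> and SS<sub>error</sub>, which matches the partition used in the hand calculation.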
====== in r ======
===== demo1 =====

[[https://rcompanion.org/handbook/I_09.html]] 
<WRAP box info>
Data files used in the examples:
{{:demo1.csv}}
{{:demo2.csv}}
{{:demo3.csv}}
{{:demo4.csv}}
{{:exer.csv}}
</WRAP>

<code>
demo1  <- read.csv("https://stats.idre.ucla.edu/stat/data/demo1.csv")
demo1 
str(demo1) ## every variable is int (numeric), so the grouping variables must be converted to factors

## Convert variables to factor
demo1 <- within(demo1, {
  group <- factor(group)
  time <- factor(time)
  id <- factor(id)
}) ## now every variable except pulse is a factor

str(demo1)
</code>

The demo1 data look like this:
<code>
id	group	pulse	time
1	1	10	1
1	1	10	2
1	1	10	3
2	1	10	1
2	1	10	2
2	1	10	3
3	1	10	1
3	1	10	2
3	1	10	3
4	1	10	1
4	1	10	2
4	1	10	3
5	2	15	1
5	2	15	2
5	2	15	3
6	2	15	1
6	2	15	2
6	2	15	3
7	2	16	1
7	2	15	2
7	2	15	3
8	2	15	1
8	2	15	2
8	2	15	3
</code>
Rearranged into a subject-by-time table:

^    ^  time  ^^^    ^
^  id  ^  t1  ^  t2  ^  t3  ^  mean \\ of the \\ same person's \\ measures  ^
|  1  |  10  |  10  |  10  |  10  |
|  2  |  10  |  10  |  10  |  10  |
|  3  |  10  |  10  |  10  |  10  |
|  4  |  10  |  10  |  10  |  10  |
|  5  |  15  |  15  |  15  |  15  |
|  6  |  15  |  15  |  15  |  15  |
|  7  |  16  |  15  |  15  |  15.333  |
|  8  |  15  |  15  |  15  |  15  |
|  mean \\ across \\ the time  |  12.625  |  12.5  |  12.5  |  12.542  |
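
The same subject-by-time summary can be produced in R with ''tapply()''. This sketch rebuilds the demo1 values inline from the listing above so that it runs without downloading the file; the object names ''cells'' and ''tab'' are illustrative.

<code>
# demo1 rebuilt inline from the listing above (no download needed)
demo1 <- data.frame(
  id    = factor(rep(1:8, each = 3)),
  group = factor(rep(1:2, each = 12)),
  time  = factor(rep(1:3, times = 8)),
  pulse = c(rep(10, 12), rep(15, 6), 16, rep(15, 5))
)

# One cell per subject and time point
cells <- with(demo1, tapply(pulse, list(id = id, time = time), mean))

# Add each person's mean as a column and each time point's mean as a row
tab <- cbind(cells, person_mean = rowMeans(cells))
tab <- rbind(tab, time_mean = colMeans(tab))
round(tab, 3)
</code>

The only variation over time comes from subject 7 at the first time point, which is why the time effect in the analysis of this data set is so small.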


<code>
demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
summary(demo1.within.only.aov)
</code>

<code>
> demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
> summary(demo1.within.only.aov)

Error: id
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  7  155.3   22.18               

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
time       2 0.0833 0.04167       1  0.393
Residuals 14 0.5833 0.04167               
> 
</code>

see {{:r:repeated_measures_anova_eg.xlsx}}
===== demo 2 =====
see [[:r:repeated measure anova]]
===== Twoway repeated measure anova=====
see [[:r:twoway repeated measure anova]]

====== reference ======
  * [[http://wwwstage.valpo.edu/other/dabook/ch12/c12-1.htm|Repeated measures one-way ANOVA]] by Akkelin
    * {{:ezdata.sav|ezdata: SPSS Data file}}
  * http://www.psych.utoronto.ca/courses/c1/chap14/chap14.html
  * https://statistics.laerd.com/statistical-guides/repeated-measures-anova-statistical-guide.php

  * http://rcompanion.org/handbook/I_09.html : an excellent example, though hard to digest.