Differences

This shows you the differences between two versions of the page.

--- repeated_measure_anova [2016/06/23 19:04] – hkimscil
+++ repeated_measure_anova [2024/05/08 08:31] (current) – hkimscil
@@ Line 1: / Line 1: @@
-See also, [[ANOVA]], [[:Factorial Anova|Factorial ANOVA]]
+See also [[ANOVA]], [[:Factorial Anova|Factorial ANOVA]], [[:t-test#동일집단_간의_차이에_대해서_알아볼_때|paired sample t-test]], [[:r:repeated_measures_anova]]
-The explanation here is insufficient, so be sure to study the [[:Repeated Measure Anova#s-3|reference sites below]] (reference).
+====== Repeated Measure ANOVA ======
+Introduction
+  * one-way ANOVA for //**related (not independent) groups**//
+  * an extension of the dependent t-test (the one-group, repeated measures t-test, also called the paired-samples t-test)
+  * also called "within-subjects ANOVA" or "ANOVA for correlated samples"
+  * the simplest form is the __one-way repeated measures ANOVA__,
+  * which requires one independent variable and one dependent variable
+  * the independent variable is categorical (nominal or ordinal)
+  * the dependent variable is continuous (interval or ratio)
+Test Circumstances
+  * the same subjects measured repeatedly over time (differences in mean scores across three or more time points)
+    * participants being tested with headache drugs such as
+      * group A, B, C, placebo
+      * across the time periods j, k, l, m
+    * testing the effect of a three-month exercise training program on blood sugar level
+      * measure blood sugar level at 3 different points (pre-exercise, midway, post-exercise)
+  * the same subjects measured repeatedly under different situations (treatments; differences in mean scores under three or more conditions)
+    * e.g., participants (n=30) use and evaluate three web site UIs (naver, daum, and google)
+    * and rate each site's usefulness, usability, and ease of use
+  * data should look as follows:
+^ ^ pre-exercise \\ "sugar level"   ^ mid-term \\ "sugar level"   ^ post-exercise  \\ "sugar level"  ^
+|  a  | 250  | 220  | 150  |
+|  b  | 300  | 170  | 120  |
+|  c  | 150  | 120  | 120  |
+|  d  | 230  | 170  | 160  |
+|  e  | 260  | 250  | 250  |
+|     | level 1  | level 2  | level 3  |
+Levels = related groups of the independent variable "time"
+^ ^ treatment \\ condition \\ "naver"   ^ treatment \\ condition \\ "daum"   ^ treatment \\ condition \\ "google"   ^
+|  a  | 70  | 60  | 80  |
+|  b  | 50  | 70  | 50  |
+|  c  | 40  | 50  | 60  |
+|  d  | 30  | 40  | 60  |
+|  e  | 60  | 50  | 40  |
+|     | level 1  | level 2  | level 3  |
+In general, the data should look like this:
+^ ^  time/condition  ^^^
+| |  T1  |  T2  |  T3  |
+|  s1  |  s1  |  s1  |  s1  |
+|  s2  |  s2  |  s2  |  s2  |
+|  s3  |  s3  |  s3  |  s3  |
+|  s4  |  s4  |  s4  |  s4  |
+|  s5  |  s5  |  s5  |  s5  |
+|  ..  |  ..  |  ..  |  ..  |
+|  sn  |  sn  |  sn  |  sn  |
+Distinguish the layout above from the ordinary (independent groups) ANOVA layout, in which each subject belongs to only one group:
+^  ^  group  ^  treatment  ^
+| a |  1  |  70  |
+| b |  1  |  50  |
+| c |  1  |  40  |
+| d |  1  |  30  |
+| e |  1  |  60  |
+| f |  2  |  60  |
+| g |  2  |  70  |
+| h |  2  |  50  |
+| i |  2  |  40  |
+| j |  2  |  50  |
+| k |  3  |  80  |
+| l |  3  |  50  |
+| m |  3  |  60  |
+| n |  3  |  60  |
+| o |  3  |  40  |
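+The contrast between the two layouts can be sketched in R. This is a minimal sketch, assuming the 15 scores from the tables above are read once as three independent groups of five people and once as five subjects (a to e) who each rate all three sites. The names ''scores'', ''site'', ''rating'', ''fit_indep'' and ''fit_rm'' are illustrative and do not come from the original page.

```r
# The 15 ratings from the tables above, in long format (one row per score).
scores <- data.frame(
  id     = factor(rep(c("a", "b", "c", "d", "e"), times = 3)),
  site   = factor(rep(c("naver", "daum", "google"), each = 5)),
  rating = c(70, 50, 40, 30, 60,   # naver  (group 1)
             60, 70, 50, 40, 50,   # daum   (group 2)
             80, 50, 60, 60, 40)   # google (group 3)
)

# Independent-groups reading: 15 different people, 5 per site (id is ignored).
fit_indep <- aov(rating ~ site, data = scores)
tab_indep <- summary(fit_indep)[[1]]

# Repeated measures reading: 5 people, each rating all 3 sites.
fit_rm <- aov(rating ~ site + Error(id), data = scores)
tab_rm <- summary(fit_rm)[["Error: Within"]][[1]]

# Row 1 is the site effect, row 2 is the residual (error) term.
ss_error_indep <- tab_indep[["Sum Sq"]][2]
ss_error_rm    <- tab_rm[["Sum Sq"]][2]
f_indep <- tab_indep[["F value"]][1]
f_rm    <- tab_rm[["F value"]][1]

print(c(ss_error_indep = ss_error_indep, ss_error_rm = ss_error_rm))
print(c(f_indep = f_indep, f_rm = f_rm))
```

+The sum of squares for ''site'' is the same in both fits. Only the error term differs, because the repeated measures fit sets the variation between subjects aside in its own stratum.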
+Logic
+  * $\text{independent ANOVA: } F = \displaystyle \frac{MS_{between}}{MS_{within}} = \frac{MS_{between}}{MS_{error}}$
+  * $\text{rep measures ANOVA: } F = \displaystyle \frac{MS_{conditions}}{MS_{error}}$, where $MS_{conditions}$ takes the place of $MS_{between}$ and $MS_{error}$ is based on only part of the within variation
+Note:
+  * The word "between" implies a comparison **between** independent groups, so for repeated measures the term "conditions" is used instead.
+-- Picture about here --
+{{:pasted:20240501-082722.png}}
+----
+{{:pasted:20240501-082738.png}}
+----
+  * but $\text{SS}_{\text{within}}$ can be partitioned into
+    * $\text{SS}_{\text{subjects}}$ and $\text{SS}_{\text{error}}$
+    * that is, part of the "within" variation comes from stable differences between individuals, which each subject carries across all conditions.
+    * Of the two, we can remove the first from SS<sub>within</sub>
+    * and use only the second as SS<sub>error</sub>
+    * That is:
+      * in $\text{independent ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{error}} $
+      * in $\text{rep measures ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{subjects}} + \text{SS}_{\text{error}}$
+    * This means that SS<sub>error</sub> will be **__smaller__** than in an independent ANOVA on the same scores
+    * However, the df for this SS<sub>error</sub> also drops, from k(n-1) to (n-1)(k-1), where n is the number of subjects and k is the number of conditions
+^  subjects  ^  Pre  ^  1 Month  ^  3 Month  ^  Subject \\ Means  ^
+|  1  |  45  |  50  |  55  |  **50**  |
+|  2  |  42  |  42  |  45  |  **43**  |
+|  3  |  36  |  41  |  43  |  **40**  |
+|  4  |  39  |  35  |  40  |  **38**  |
+|  5  |  51  |  55  |  59  |  **55**  |
+|  6  |  44  |  49  |  56  |  **49.7**  |
+|  **Monthly mean**  |  **42.8**  |  **45.3**  |  **49.7**  |   |
+|  **Grand mean: 45.9**      |||||
+This example and the one below are worked out in an Excel {{:r:repeated_measures_anova_eg.xlsx|spreadsheet}}.
+An {{:ftable.pdf|F distribution table}} is also needed to look up the critical value for the null hypothesis test.
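+The partition of SS<sub>within</sub> into SS<sub>subjects</sub> and SS<sub>error</sub> can also be checked directly in R. The following is a minimal sketch that assumes the six-subject table above is entered by hand as a matrix. The object names (''x'', ''ss_conditions'' and so on) are illustrative, and ''pf()'' is used in place of the printed F table.

```r
# Scores from the six-subject table (rows = subjects, columns = time points).
x <- matrix(c(45, 50, 55,
              42, 42, 45,
              36, 41, 43,
              39, 35, 40,
              51, 55, 59,
              44, 49, 56),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("s", 1:6), c("pre", "month1", "month3")))

n <- nrow(x)       # number of subjects
k <- ncol(x)       # number of conditions (time points)
grand <- mean(x)   # grand mean

ss_conditions <- n * sum((colMeans(x) - grand)^2)
ss_within     <- sum(sweep(x, 2, colMeans(x))^2)   # deviations from each column mean
ss_subjects   <- k * sum((rowMeans(x) - grand)^2)
ss_error      <- ss_within - ss_subjects

df_conditions <- k - 1
df_error      <- (n - 1) * (k - 1)

f_value <- (ss_conditions / df_conditions) / (ss_error / df_error)
p_value <- pf(f_value, df_conditions, df_error, lower.tail = FALSE)

print(round(c(SS_conditions = ss_conditions, SS_within = ss_within,
              SS_subjects = ss_subjects, SS_error = ss_error,
              F = f_value, p = p_value), 3))
```

+Because SS<sub>subjects</sub> is subtracted before the error term is formed, the F ratio is computed on (n-1)(k-1) error degrees of freedom.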
-====== Repeated Measures ANOVA ======
 ^  Headache Analysis  ^^^^^^^
-| ||  base  |||  treatment | average per case |
+| |  base  ||  treatment  ||| average \\ per case \\ (subject, \\ participant)  |
-|  ser  | w1  |  w2  |  w3  |  w4  |  w5  |  $\overline{X}_{participants}$ = average per case  |
+|  ser  | w1  |  w2  |  w3  |  w4  |  w5  | $\overline{X}_{part}$ \\ = average \\ per case \\ (subject, \\ participant)  |
 |  1  |  21  |  22  |  8  |  6  |  6  |  12.6  |
 |  2  |  20  |  19  |  10  |  4  |  9  |  12.4  |
@@ Line 15: / Line 119: @@
 |  8  |  13  |  4  |  8  |  1  |  5  |  6.2  |
 |  9  |  26  |  24  |  14  |  8  |  17  |  17.8  |
-|  average per week  |  20.78  |  20.00  |  9.00  |  5.78  |  6.78  |  $\overline{X}$ = 12.47  |
+|  average \\ per week  |  20.78  |  20.00  |  9.00  |  5.78  |  6.78  |  $\overline{X}$ = 12.47  |
 ^  Stats  ^^
@@ Line 24: / Line 128: @@
 |  # of case (n) | 9  |
-SS<sub>total</sub> = $\Sigma{(X-\overline{X})^2} $ = 3489.2
+SS<sub>total</sub> = $\Sigma{(X-\overline{X})^2} $ = 3489.2 \\
-SS<sub>participants</sub> = $w\Sigma{(\overline{X}_{participants}-\overline{X})}$ = 833.6
-SS<sub>weeks</sub> = $n\Sigma{(\overline{X}_{week} - \overline{X})}$ = 1934.5
+SS<sub>between</sub>
-SS<sub>residual</sub>
+= SS<sub>conditions</sub>
+= SS<sub>weeks</sub>
+= $n\Sigma{(\overline{X}_{week} - \overline{X})^2}$ = 1934.5 \\
+SS<sub>within</sub>
+= $ \Sigma \Sigma{(X_{s_i.t_j} - \overline{X}_{t_j})^2}$
+= $ 411.6 + 836.0 + 78.0 + 93.6 + 135.6 $ (one term per week)
+= 1554.7
+\\
+SS<sub>participants</sub> = $w\Sigma{(\overline{X}_{participants}-\overline{X})^2}$ = 833.6 \\
+SS<sub>residual</sub>
 = SS<sub>error</sub>
-= SS<sub>total</sub> - SS<sub>participants</sub> - SS<sub>weeks</sub>
+= SS<sub>within</sub> - SS<sub>participants</sub>
+= 1554.7 - 833.6
 = 721.1
-df<sub>total</sub> = N - 1 = 45 - 1 = 44
+OR
+SS<sub>residual</sub>
+= SS<sub>error</sub>
+= (SS<sub>total</sub> - SS<sub>weeks(between)</sub>) - SS<sub>participants</sub>
+= (3489.2 - 1934.5) - 833.6
+= 721.1 \\
+\\
+df<sub>total</sub> = N - 1 = 45 - 1 = 44 \\
 df<sub>week</sub> = 5 - 1 = 4 = df<sub>between</sub> \\
 df<sub>participants</sub> = 9 - 1 = 8 = df<sub>subjects</sub> \\
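+The arithmetic above can be reproduced in a few lines of R. This sketch assumes only the rounded sums of squares stated above and takes the error df as (n-1)(k-1). The variable names are illustrative, and the F ratio it prints is derived from those rounded values rather than quoted from the page.

```r
# Sums of squares as stated above for the headache data (rounded values).
ss_total        <- 3489.2
ss_weeks        <- 1934.5   # SS_between = SS_conditions
ss_within       <- 1554.7
ss_participants <- 833.6

# Two equivalent routes to the error term.
ss_error_a <- ss_within - ss_participants
ss_error_b <- (ss_total - ss_weeks) - ss_participants

n <- 9   # participants
k <- 5   # weeks
df_weeks <- k - 1
df_error <- (n - 1) * (k - 1)

# F = MS_conditions / MS_error
f_value <- (ss_weeks / df_weeks) / (ss_error_a / df_error)

print(round(c(SS_error_a = ss_error_a, SS_error_b = ss_error_b,
              df_error = df_error, F = f_value), 2))
```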
@@ Line 45: / Line 168: @@
 |  C  |  38  |  18  |  40  |
 |  D  |  45  |  32  |  43  |
+====== in R ======
+===== demo1 =====
+[[https://rcompanion.org/handbook/I_09.html]]
+<WRAP box info>
+Data files used in the examples:
+{{:demo1.csv}}
+{{:demo2.csv}}
+{{:demo3.csv}}
+{{:demo4.csv}}
+{{:exer.csv}}
+</WRAP>
+<code>
+demo1  <- read.csv("https://stats.idre.ucla.edu/stat/data/demo1.csv")
+demo1
+str(demo1) ## every variable is read as int (numeric), so id, group, and time must be converted to factors
+## Convert variables to factor
+demo1 <- within(demo1, {
+  group <- factor(group)
+  time <- factor(time)
+  id <- factor(id)
+}) ## now every variable except pulse is a factor
+str(demo1)
+</code>
+The demo1 data look like this:
+<code>
+id	group	pulse	time
+1	1	10	1
+1	1	10	2
+1	1	10	3
+2	1	10	1
+2	1	10	2
+2	1	10	3
+3	1	10	1
+3	1	10	2
+3	1	10	3
+4	1	10	1
+4	1	10	2
+4	1	10	3
+5	2	15	1
+5	2	15	2
+5	2	15	3
+6	2	15	1
+6	2	15	2
+6	2	15	3
+7	2	16	1
+7	2	15	2
+7	2	15	3
+8	2	15	1
+8	2	15	2
+8	2	15	3
+</code>
+Rearranged into wide format, this gives:
+^ ^  time  ^^^ ^
+^ ^  t1  ^  t2  ^  t3  ^  mean \\ of the \\ same person's \\ measures  ^
+|  1  |  10  |  10  |  10  |  10  |
+|  2  |  10  |  10  |  10  |  10  |
+|  3  |  10  |  10  |  10  |  10  |
+|  4  |  10  |  10  |  10  |  10  |
+|  5  |  15  |  15  |  15  |  15  |
+|  6  |  15  |  15  |  15  |  15  |
+|  7  |  16  |  15  |  15  |  15.333  |
+|  8  |  15  |  15  |  15  |  15  |
+|  mean \\ across \\ the time  |  12.625  |  12.5  |  12.5  |  12.542  |
+<code>
+demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
+summary(demo1.within.only.aov)
+</code>
+<code>
+> demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
+> summary(demo1.within.only.aov)
+Error: id
+          Df Sum Sq Mean Sq F value Pr(>F)
+Residuals  7  155.3   22.18
+Error: Within
+          Df Sum Sq Mean Sq F value Pr(>F)
+time       2 0.0833 0.04167       1  0.393
+Residuals 14 0.5833 0.04167
+>
+</code>
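+The ''aov'' output above can be cross-checked by hand. The sketch below assumes the demo1 values are rebuilt inline from the listing above (so no download is needed) and applies the same partition as in the logic section. ''demo1_local'' and the ''ss_*'' names are illustrative, not part of the original example.

```r
# demo1 rebuilt inline: 8 subjects, 3 time points each.
demo1_local <- data.frame(
  id    = factor(rep(1:8, each = 3)),
  group = factor(rep(1:2, each = 12)),
  time  = factor(rep(1:3, times = 8)),
  pulse = c(rep(10, 12), rep(15, 6), 16, 15, 15, rep(15, 3))
)

n <- nlevels(demo1_local$id)     # subjects
k <- nlevels(demo1_local$time)   # time points
grand      <- mean(demo1_local$pulse)
time_means <- tapply(demo1_local$pulse, demo1_local$time, mean)
id_means   <- tapply(demo1_local$pulse, demo1_local$id, mean)

# Hand partition: total = time (conditions) + subjects + error.
ss_total <- sum((demo1_local$pulse - grand)^2)
ss_time  <- n * sum((time_means - grand)^2)
ss_id    <- k * sum((id_means - grand)^2)
ss_error <- ss_total - ss_time - ss_id
f_value  <- (ss_time / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

# Same model as above, fitted on the inline copy.
fit     <- aov(pulse ~ time + Error(id), data = demo1_local)
aov_tab <- summary(fit)[["Error: Within"]][[1]]

print(round(c(SS_time = ss_time, SS_id = ss_id,
              SS_error = ss_error, F = f_value), 4))
print(aov_tab)
```

+''ss_id'' corresponds to the Residuals row of the ''Error: id'' stratum, while ''ss_time'' and ''ss_error'' correspond to the two rows of the ''Error: Within'' stratum.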
+see {{:r:repeated_measures_anova_eg.xlsx}}
+===== demo 2 =====
+see [[:r:repeated measure anova]]
+===== Twoway repeated measure anova =====
+see [[:r:twoway repeated measure anova]]
 ====== reference ======
@@ Line 51: / Line 270: @@
   * http://www.psych.utoronto.ca/courses/c1/chap14/chap14.html
   * https://statistics.laerd.com/statistical-guides/repeated-measures-anova-statistical-guide.php
+  * http://rcompanion.org/handbook/I_09.html : an excellent example, though difficult to digest.