Differences

This shows you the differences between two versions of the page.

--- repeated_measures_anova [2020/06/11 15:18] – [demo 3] hkimscil
+++ repeated_measures_anova [2022/05/10 10:29] (current) – removed hkimscil
@@ Line 1: / Line 1: @@
-See also [[ANOVA]], [[:Factorial Anova|Factorial ANOVA]], [[:t-test#동일집단_간의_차이에_대해서_알아볼_때|paired sample t-test]], [[:r:repeated_measure_anova]]
-====== Repeated Measure ANOVA ======
-Introduction
-  * one-way ANOVA for //**related, not-independent groups**//
-  * extension of the dependent t-test (paired-samples t-test, repeated measures t-test)
-  * also, it is called "within-subjects ANOVA" or "ANOVA for correlated samples"
-  * the simplest one is __one-way repeated measures ANOVA__
-  * which requires one independent and one dependent variable
-  * the independent variable is categorical (either nominal or ordinal)
-  * the dependent variable is continuous (interval or ratio)
-Test Circumstances
-  * the same subjects measured repeatedly across a time period (differences of mean scores across three or more time periods)
-    * participants being tested with headache drugs such as
-      * group A, B, C, placebo
-      * across the time periods j, k, l, m
-    * testing the effect of a three-month exercise training program on blood sugar level
-      * measure blood sugar level at 3 different points (pre-exercise, midway, post-exercise)
-  * the same subjects measured repeatedly in different situations (treatments; differences of mean scores under three or more different conditions)
-    * e.g., participants (n=30) using and evaluating three website UIs (naver, daum, and google)
-    * and rating each one's usefulness, usability and ease of use
-  * data should look as follows:
-^ ^ pre-exercise \\ "sugar level"   ^ mid-term \\ "sugar level"   ^ post-exercise  \\ "sugar level"  ^
-|  a  | 250  | 220  | 150  |
-|  b  | 300  | 170  | 120  |
-|  c  | 150  | 120  | 120  |
-|  d  | 230  | 170  | 160  |
-|  e  | 260  | 250  | 250  |
-|     | level 1  | level 2  | level 3  |
-Levels = related groups of the independent variable "time"
-^ ^ treatment \\ condition \\ "naver"   ^ treatment \\ condition \\ "daum"   ^ treatment \\ condition \\ "google"   ^
-|  a  | 70  | 60  | 80  |
-|  b  | 50  | 70  | 50  |
-|  c  | 40  | 50  | 60  |
-|  d  | 30  | 40  | 60  |
-|  e  | 60  | 50  | 40  |
-|     | level 1  | level 2  | level 3  |
-In general, the data should look as follows:
-^ ^  time/condition  ^^^
-| |  T1  |  T2  |  T3  |
-|  s1  |  s1  |  s1  |  s1  |
-|  s2  |  s2  |  s2  |  s2  |
-|  s3  |  s3  |  s3  |  s3  |
-|  s4  |  s4  |  s4  |  s4  |
-|  s5  |  s5  |  s5  |  s5  |
-|  ..  |  ..  |  ..  |  ..  |
-|  sn  |  sn  |  sn  |  sn  |
-Distinguish the layout above from the ordinary (independent-groups) ANOVA layout, in which each subject belongs to only one group:
-^  ^  group  ^  treatment  ^
-| a |  1  |  70  |
-| b |  1  |  50  |
-| c |  1  |  40  |
-| d |  1  |  30  |
-| e |  1  |  60  |
-| f |  2  |  60  |
-| g |  2  |  70  |
-| h |  2  |  50  |
-| i |  2  |  40  |
-| j |  2  |  50  |
-| k |  3  |  80  |
-| l |  3  |  50  |
-| m |  3  |  60  |
-| n |  3  |  60  |
-| o |  3  |  40  |
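The two layouts above can be sketched in R as follows. This is a minimal illustration using the blood sugar table from this page; the object names `wide` and `long` and the column name `sugar` are made up for the example. The wide form has one row per subject, while the long form (one row per measurement, with an `id` column marking the repeated subject) is the form that the `aov()` calls further down this page work with.

```r
# Wide layout: one row per subject, one column per measurement occasion
wide <- data.frame(
  id   = c("a", "b", "c", "d", "e"),
  pre  = c(250, 300, 150, 230, 260),
  mid  = c(220, 170, 120, 170, 250),
  post = c(150, 120, 120, 160, 250)
)

# Long layout: one row per subject-by-occasion measurement.
# The same id appears three times, which is what marks the
# measurements as repeated rather than independent.
long <- data.frame(
  id    = factor(rep(wide$id, times = 3)),
  time  = factor(rep(c("pre", "mid", "post"), each = nrow(wide)),
                 levels = c("pre", "mid", "post")),
  sugar = c(wide$pre, wide$mid, wide$post)
)

long
```

In the independent-groups layout there would be 15 different subjects instead of 5 subjects measured 3 times each.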
-Logic
-  * $\text{independent ANOVA: } F = \displaystyle \frac{MS_{between}}{MS_{within}} = \frac{MS_{between}}{MS_{error}}$
-  * $\text{rep measures ANOVA: } F = \displaystyle \frac{MS_{conditions}}{MS_{error}}$
-Note>
-  * The word "between" implies a comparison **between** independent groups, so for repeated measures the term "conditions" is used instead.
--- Picture about here --
-  * In repeated measures ANOVA, however, $\text{SS}_{\text{within}}$ can be partitioned into
-    * $\text{SS}_{\text{subjects}}$ and $\text{SS}_{\text{error}}$
-    * Of the two, SS<sub>subjects</sub> (the variation between the subjects' overall levels) is removed from SS<sub>within</sub>
-    * and only the remainder is used as SS<sub>error</sub>
-    * That is:
-      * in $\text{independent ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{error}} $
-      * in $\text{rep measures ANOVA: } \text{SS}_{\text{within}} = \text{SS}_{\text{subjects}} + \text{SS}_{\text{error}}$
-    * This means that the term SS<sub>error</sub> will be **__smaller__**
-    * But the df for this SS<sub>error</sub> is also smaller: (n-1)(k-1) instead of N-k
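Written out step by step, the partition leads to the F ratio as sketched below. The notation assumes n subjects, each measured under k conditions (so N = nk observations). These are the standard textbook definitions, restated here for convenience.

```latex
\begin{aligned}
SS_{error} &= SS_{within} - SS_{subjects} \\
df_{error} &= (N - k) - (n - 1) = (n - 1)(k - 1) \\
MS_{conditions} &= \frac{SS_{conditions}}{k - 1} \\
MS_{error} &= \frac{SS_{error}}{(n - 1)(k - 1)} \\
F &= \frac{MS_{conditions}}{MS_{error}}
\end{aligned}
```

Removing SS<sub>subjects</sub> shrinks the numerator of MS<sub>error</sub>, but the smaller df shrinks its denominator as well, so the test gains power only when subjects really differ in their overall level.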
-^  subjects  ^  Pre  ^  1 Month  ^  3 Month  ^  Subject \\ Means  ^
-|  1  |  45  |  50  |  55  |  **50**  |
-|  2  |  42  |  42  |  45  |  **43**  |
-|  3  |  36  |  41  |  43  |  **40**  |
-|  4  |  39  |  35  |  40  |  **38**  |
-|  5  |  51  |  55  |  59  |  **55**  |
-|  6  |  44  |  49  |  56  |  **49.7**  |
-|  **Monthly mean**  |  **42.8**  |  **45.3**  |  **49.7**  |   |
-|  **Grand mean: 45.9**      |||||
-We do this (and the example below) with an Excel {{:repeated_measures_anova_eg.xlsx|spreadsheet}}.
-We also need an {{:ftable.pdf|F distribution table}} to test the null hypothesis.
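As an alternative to the spreadsheet, the same partition can be sketched in R for the six-subject table above. The variable names (`x`, `ss_time`, and so on) are chosen for this illustration only; the arithmetic simply follows the SS partition described earlier.

```r
# Scores from the table above: 6 subjects (rows) x 3 occasions (columns)
x <- matrix(c(45, 50, 55,
              42, 42, 45,
              36, 41, 43,
              39, 35, 40,
              51, 55, 59,
              44, 49, 56),
            ncol = 3, byrow = TRUE,
            dimnames = list(1:6, c("pre", "month1", "month3")))

n <- nrow(x)           # number of subjects
k <- ncol(x)           # number of occasions (levels of "time")
grand_mean <- mean(x)  # mean of all n * k scores

ss_total    <- sum((x - grand_mean)^2)
ss_time     <- n * sum((colMeans(x) - grand_mean)^2)  # SS_conditions
ss_within   <- ss_total - ss_time                     # error term of an independent ANOVA
ss_subjects <- k * sum((rowMeans(x) - grand_mean)^2)
ss_error    <- ss_within - ss_subjects                # error term of the repeated measures ANOVA

ms_time  <- ss_time / (k - 1)
ms_error <- ss_error / ((n - 1) * (k - 1))
f_value  <- ms_time / ms_error

round(c(SS_time = ss_time, SS_subjects = ss_subjects,
        SS_error = ss_error, F = f_value), 2)
```

With these data the F value comes out at about 12.53 on (2, 10) degrees of freedom, which can then be compared against the F table.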
-^  Headache Analysis  ^^^^^^^
-| |  base  ||  treatment  ||| average \\ per case \\ (subject, \\ participant)  |
-|  ser  | w1  |  w2  |  w3  |  w4  |  w5  | $\overline{X}_{part}$ \\ = average \\ per case \\ (subject, \\ participant)  |
-|  1  |  21  |  22  |  8  |  6  |  6  |  12.6  |
-|  2  |  20  |  19  |  10  |  4  |  9  |  12.4  |
-|  3  |  7  |  5  |  5  |  4  |  5  |  5.2  |
-|  4  |  25  |  30  |  13  |  12  |  4  |  16.8  |
-|  5  |  30  |  33  |  10  |  8  |  6  |  17.4  |
-|  6  |  19  |  27  |  8  |  7  |  4  |  13  |
-|  7  |  26  |  16  |  5  |  2  |  5  |  10.8  |
-|  8  |  13  |  4  |  8  |  1  |  5  |  6.2  |
-|  9  |  26  |  24  |  14  |  8  |  17  |  17.8  |
-|  average \\ per week  |  20.78  |  20.00  |  9.00  |  5.78  |  6.78  |  $\overline{X}$ = 12.47  |
-^  Stats  ^^
-|  Mean Total | 12.47  |
-|  $\Sigma{X_i}$ | 561  |
-|  $\Sigma{{X_i}^2}$ | 10483  |
-|  # of week | 5  |
-|  # of case (n) | 9  |
-SS<sub>total</sub> = $\Sigma{(X-\overline{X})^2} $ = 3489.2 \\
-SS<sub>participants</sub> = $w\Sigma{(\overline{X}_{participants}-\overline{X})^2}$ = 833.6 \\
-SS<sub>weeks</sub> = $n\Sigma{(\overline{X}_{week} - \overline{X})^2}$ = 1934.5 \\
-SS<sub>residual</sub>  \\
-= SS<sub>error</sub> \\
-= SS<sub>total</sub> - SS<sub>participants</sub> - SS<sub>weeks</sub>  \\
-= 721.1 \\
-\\
-df<sub>total</sub> = N - 1 = 45 - 1 = 44 \\
-df<sub>week</sub> = 5 - 1 = 4 = df<sub>between</sub> \\
-df<sub>participants</sub> = 9 - 1 = 8 = df<sub>subjects</sub> \\
-df<sub>error</sub> = (n - 1)(k - 1) = 8 * 4 = 32, which equals df<sub>within</sub> - df<sub>participants</sub> = 40 - 8 = 32 \\
-df<sub>within</sub> = N - k = 45 - 5 = 40
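The sums of squares and degrees of freedom above can be checked, and carried through to the F test, with a short R sketch. The object names are invented for this example, and `pf()` stands in for the printed F table.

```r
# Headache data from the table above: 9 participants (rows) x 5 weeks (columns)
headache <- matrix(c(21, 22,  8,  6,  6,
                     20, 19, 10,  4,  9,
                      7,  5,  5,  4,  5,
                     25, 30, 13, 12,  4,
                     30, 33, 10,  8,  6,
                     19, 27,  8,  7,  4,
                     26, 16,  5,  2,  5,
                     13,  4,  8,  1,  5,
                     26, 24, 14,  8, 17),
                   ncol = 5, byrow = TRUE,
                   dimnames = list(1:9, paste0("w", 1:5)))

n <- nrow(headache)            # number of participants
k <- ncol(headache)            # number of weeks
grand_mean <- mean(headache)

ss_total        <- sum((headache - grand_mean)^2)
ss_participants <- k * sum((rowMeans(headache) - grand_mean)^2)
ss_weeks        <- n * sum((colMeans(headache) - grand_mean)^2)
ss_error        <- ss_total - ss_participants - ss_weeks

df_weeks <- k - 1
df_error <- (n - 1) * (k - 1)

ms_weeks <- ss_weeks / df_weeks
ms_error <- ss_error / df_error
f_value  <- ms_weeks / ms_error

# Upper-tail probability of F(df_weeks, df_error), i.e. the p value
p_value <- pf(f_value, df_weeks, df_error, lower.tail = FALSE)

round(c(SS_total = ss_total, SS_participants = ss_participants,
        SS_weeks = ss_weeks, SS_error = ss_error, F = f_value), 2)
```

The F value is about 21.46 on (4, 32) degrees of freedom, so the null hypothesis of equal weekly means is rejected at any conventional level.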
-====== ie ======
-^  Visual perception score  ^^^^
-|Participant | No visual distraction | Visual distraction | Sound Distraction |
-|  A  |  47  |  22  |  41  |
-|  B  |  57  |  31  |  52  |
-|  C  |  38  |  18  |  40  |
-|  D  |  45  |  32  |  43  |
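One way to analyze the table above in R is sketched here, using `aov()` with an `Error()` term as in the demos below. The data frame name `scores` and the condition labels are made up for the illustration.

```r
# Long-format version of the table above: 4 participants x 3 conditions
scores <- data.frame(
  id        = factor(rep(c("A", "B", "C", "D"), times = 3)),
  condition = factor(rep(c("none", "visual", "sound"), each = 4),
                     levels = c("none", "visual", "sound")),
  score     = c(47, 57, 38, 45,   # no visual distraction
                22, 31, 18, 32,   # visual distraction
                41, 52, 40, 43)   # sound distraction
)

# One-way repeated measures ANOVA: Error(id) moves the
# between-participant variation out of the error term
fit <- aov(score ~ condition + Error(id), data = scores)
summary(fit)

# F value for "condition" from the within-participant stratum
f_value <- summary(fit)[["Error: Within"]][[1]][["F value"]][1]
f_value
```

The condition effect is tested on (2, 6) degrees of freedom, since n = 4 and k = 3.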
-====== in r ======
-===== demo1 =====
-[[https://rcompanion.org/handbook/I_09.html]]
-<WRAP box info>
-Data files used in the examples:
-{{:demo1.csv}}
-{{:demo2.csv}}
-{{:demo3.csv}}
-{{:demo4.csv}}
-{{:exer.csv}}
-</WRAP>
-<code>
-demo1  <- read.csv("https://stats.idre.ucla.edu/stat/data/demo1.csv")
-demo1
-str(demo1) ## all variables are int (numeric), so they must be converted to factors
-## Convert variables to factor
-demo1 <- within(demo1, {
-  group <- factor(group)
-  time <- factor(time)
-  id <- factor(id)
-}) ## now every variable except pulse is a factor
-str(demo1)
-</code>
-The demo1 data look like this:
-<code>
-id	group	pulse	time
-1	1	10	1
-1	1	10	2
-1	1	10	3
-2	1	10	1
-2	1	10	2
-2	1	10	3
-3	1	10	1
-3	1	10	2
-3	1	10	3
-4	1	10	1
-4	1	10	2
-4	1	10	3
-5	2	15	1
-5	2	15	2
-5	2	15	3
-6	2	15	1
-6	2	15	2
-6	2	15	3
-7	2	16	1
-7	2	15	2
-7	2	15	3
-8	2	15	1
-8	2	15	2
-8	2	15	3
-</code>
-Arranged with one row per person (id), this becomes:
-^  ^  time  ^^^  ^
-|  id  |  t1  |  t2  |  t3  |  mean \\ of the \\ same person's \\ measures  |
-|  1  |  10  |  10  |  10  |  10  |
-|  2  |  10  |  10  |  10  |  10  |
-|  3  |  10  |  10  |  10  |  10  |
-|  4  |  10  |  10  |  10  |  10  |
-|  5  |  15  |  15  |  15  |  15  |
-|  6  |  15  |  15  |  15  |  15  |
-|  7  |  16  |  15  |  15  |  15.333  |
-|  8  |  15  |  15  |  15  |  15  |
-|  mean \\ across \\ the time  |  12.625  |  12.5  |  12.5  |  12.542  |
-<code>
-demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
-summary(demo1.within.only.aov)
-</code>
-<code>
-> demo1.within.only.aov <- aov(pulse ~ time + Error(id), data = demo1)
-> summary(demo1.within.only.aov)
-Error: id
-          Df Sum Sq Mean Sq F value Pr(>F)
-Residuals  7  155.3   22.18
-Error: Within
-          Df Sum Sq Mean Sq F value Pr(>F)
-time       2 0.0833 0.04167       1  0.393
-Residuals 14 0.5833 0.04167
->
-</code>
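The `Pr(>F)` column in the output above is the upper-tail probability of the F distribution. The sketch below reproduces it for the `time` row and looks up the critical value that a printed F table would give. The variable names are illustrative, and alpha = 0.05 is an assumption, as the page does not state a significance level.

```r
# Values read off the "Error: Within" stratum above
ms_time  <- 0.04167
ms_error <- 0.04167
df_time  <- 2
df_error <- 14

f_value <- ms_time / ms_error   # F = MS_time / MS_error

# p value: probability of an F at least this large under the null hypothesis
p_value <- pf(f_value, df_time, df_error, lower.tail = FALSE)

# Critical value at alpha = 0.05, i.e. the number an F table would list
f_crit <- qf(0.95, df_time, df_error)

round(c(F = f_value, p = p_value, F_crit = f_crit), 3)
```

Because the observed F is below the critical value (equivalently, p is about 0.393), the null hypothesis of no difference across time is not rejected for demo1.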
-<code>
-demo1  <- read.csv("https://stats.idre.ucla.edu/stat/data/demo1.csv")
-demo1
-str(demo1) ## all variables are int (numeric), so they must be converted to factors
-## Convert variables to factor
-demo1 <- within(demo1, {
-  group <- factor(group)
-  time <- factor(time)
-  id <- factor(id)
-}) ## now every variable except pulse is a factor
-str(demo1)
-par(cex = .6)
-with(demo1, interaction.plot(time, group, pulse,
-  ylim = c(5, 20), lty= c(1, 12), lwd = 3,
-  ylab = "mean of pulse", xlab = "time", trace.label = "group"))
-demo1.aov <- aov(pulse ~ group * time + Error(id), data = demo1)
-summary(demo1.aov)
-</code>
-<code>
-> summary(demo1.aov)
-Error: id
-          Df Sum Sq Mean Sq F value  Pr(>F)
-group      1 155.04  155.04    3721 1.3e-09 ***
-Residuals  6   0.25    0.04
----
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
-Error: Within
-           Df Sum Sq Mean Sq F value Pr(>F)
-time        2 0.0833 0.04167       1  0.397
-group:time  2 0.0833 0.04167       1  0.397
-Residuals  12 0.5000 0.04167
-</code>
-{{:pasted:20200611-142331.png?350}}
-===== demo2 =====
-<code>
-demo2 <- read.csv("https://stats.idre.ucla.edu/stat/data/demo2.csv")
-## Convert variables to factor
-demo2 <- within(demo2, {
-    group <- factor(group)
-    time <- factor(time)
-    id <- factor(id)
-})
-demo2
-with(demo2, interaction.plot(time, group, pulse,
- ylim = c(10, 40), lty = c(1, 12), lwd = 3,
- ylab = "mean of pulse", xlab = "time", trace.label = "group"))
-demo2.aov <- aov(pulse ~ group * time + Error(id), data = demo2)
-summary(demo2.aov)
-</code>
-{{:pasted:20200611-151520.png?350}}
-<code>
-> demo2 <- read.csv("https://stats.idre.ucla.edu/stat/data/demo2.csv")
-> ## Convert variables to factor
-> demo2 <- within(demo2, {
-+     group <- factor(group)
-+     time <- factor(time)
-+     id <- factor(id)
-+ })
-> demo2
-   id group pulse time
-   1     1    14    1
-   1     1    19    2
-   1     1    29    3
-   2     1    15    1
-   2     1    25    2
-   2     1    26    3
-   3     1    16    1
-   3     1    16    2
-   3     1    31    3
-   4     1    12    1
-   4     1    24    2
-   4     1    32    3
-   5     2    10    1
-   5     2    21    2
-   5     2    24    3
-   6     2    17    1
-   6     2    26    2
-   6     2    35    3
-   7     2    19    1
-   7     2    22    2
-   7     2    32    3
-   8     2    15    1
-   8     2    23    2
-   8     2    34    3
->
-> with(demo2, interaction.plot(time, group, pulse,
-+  ylim = c(10, 40), lty = c(1, 12), lwd = 3,
-+  ylab = "mean of pulse", xlab = "time", trace.label = "group"))
->
-> demo2.aov <- aov(pulse ~ group * time + Error(id), data = demo2)
-> summary(demo2.aov)
-Error: id
-          Df Sum Sq Mean Sq F value Pr(>F)
-group      1  15.04   15.04   0.836  0.396
-Residuals  6 107.92   17.99
-Error: Within
-           Df Sum Sq Mean Sq F value   Pr(>F)
-time        2  978.2   489.1  53.684 1.03e-06 ***
-group:time  2    1.1     0.5   0.059    0.943
-Residuals  12  109.3     9.1
----
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
->
-</code>
-===== demo 3 =====
-<code>
-demo3 <- read.csv("https://stats.idre.ucla.edu/stat/data/demo3.csv")
-## Convert variables to factor
-demo3 <- within(demo3, {
-  group <- factor(group)
-  time <- factor(time)
-  id <- factor(id)
-})
-with(demo3, interaction.plot(time, group, pulse,
-  ylim = c(10, 60), lty = c(1, 12), lwd = 3,
-  ylab = "mean of pulse", xlab = "time", trace.label = "group"))
-demo3.aov <- aov(pulse ~ group * time + Error(id), data = demo3)
-summary(demo3.aov)
-</code>
-{{:pasted:20200611-151755.png?350}}
-<code>
-> demo3 <- read.csv("https://stats.idre.ucla.edu/stat/data/demo3.csv")
-> ## Convert variables to factor
-> demo3 <- within(demo3, {
-+     group <- factor(group)
-+     time <- factor(time)
-+     id <- factor(id)
-+ })
->
-> with(demo3, interaction.plot(time, group, pulse,
-+  ylim = c(10, 60), lty = c(1, 12), lwd = 3,
-+  ylab = "mean of pulse", xlab = "time", trace.label = "group"))
->
-> demo3.aov <- aov(pulse ~ group * time + Error(id), data = demo3)
-> summary(demo3.aov)
-Error: id
-          Df Sum Sq Mean Sq F value  Pr(>F)
-group      1 2035.0  2035.0   343.1 1.6e-06 ***
-Residuals  6   35.6     5.9
----
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
-Error: Within
-           Df Sum Sq Mean Sq F value   Pr(>F)
-time        2 2830.3  1415.2   553.8 1.52e-12 ***
-group:time  2  200.3   100.2    39.2 5.47e-06 ***
-Residuals  12   30.7     2.6
----
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
->
->
-</code>
-====== reference ======
-  * [[http://wwwstage.valpo.edu/other/dabook/ch12/c12-1.htm|Repeated measures one-way ANOVA]] by Akkelin
-    * {{:ezdata.sav|ezdata: SPSS Data file}}
-  * http://www.psych.utoronto.ca/courses/c1/chap14/chap14.html
-  * https://statistics.laerd.com/statistical-guides/repeated-measures-anova-statistical-guide.php
-  * http://rcompanion.org/handbook/I_09.html : This is an excellent example, but difficult to digest.