Differences

This shows you the differences between two versions of the page.

--- c:ma:2019:schedule [2019/09/20 10:04] – [Week02 (Sep 11, 13)] hkimscil
+++ c:ma:2019:schedule [2019/12/13 13:21] – [Week16 (June 18, 20)] hkimscil
@@ Line 131: / Line 131: @@
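   * Read the paper "제3자 효과이론과 침묵의 나선이론 연계성" (on the link between third-person effect theory and spiral of silence theory) at [[:hypothesis#예_1]] in the Hypothesis document and state its hypotheses.
   * List the independent variables and dependent variables of each hypothesis.
-  * State and explain the theories used in this theory.
+  * State and explain the theories used in this paper.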
 </WRAP>
@@ Line 182: / Line 182: @@
   - Converting One Structured Data Type into Another
+[[:r:data transformations]]
 </WRAP>
 <WRAP half column>
 ===== Assignment =====
-  * Hypothesis exercise in ajoubb
+ga04.making.hypothesis: hypothesis exercise in ajoubb
-  *
+  * First, using the mtcars data that ships with R by default (use RStudio), write hypotheses that can be tested with a t-test and an ANOVA test, then run the analyses in R.
+    * For hypotheses, see the [[:hypothesis testing]] document.
+    * For the t-test, see [[:t-test]].
+      * Check which of the four kinds of t-test applies to the mtcars data.
+    * For ANOVA, see the [[:anova]] document.
+    * For the analysis in R, use the t.test and aov functions, respectively.
+  * Second, look into the margin of error reported with opinion-poll results in newspapers.
+    * Pick two newspaper articles that report opinion-poll results.
+      * [[http://www.realmeter.net/%ea%b3%a0%ec%9c%84%ec%a7%81-%ec%9e%90%eb%85%80-%ec%9e%85%ec%8b%9c%eb%b9%84%eb%a6%ac-%ec%a0%84%ec%88%98%ec%a1%b0%ec%82%ac-%ec%b0%ac%ec%84%b175vs%eb%b0%98%eb%8c%8018/|Example]]
+      * The standard error (se) is generally computed as follows.
+      * $ \displaystyle \sigma_{\hat{p}} = \sqrt{\frac{p*q}{n}} , \;\;\; q = (1 - p) $
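+      * In the example poll, $ p = .752 $ (75.2%)
+  * If you upload a file, name it as follows:
+    * Save it as ga04.making.hypothesis.groupname.ext and then upload it.
+    * Replace "groupname" and "ext" above according to your group and file type.
+      * Group 3, for example, uses 03 in place of "groupname".
+      * If you saved an MS Word file, the file extension will be "docx"; if you saved a text file, it will be "txt".
+    * So, following the example above, the assignment file name would be
+      * ga04.making.hypothesis.03.txt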
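The standard-error formula in the assignment above can be turned into a margin of error with a few lines of R. This is only a minimal sketch: p = .752 comes from the example, but the sample size n = 500 is a hypothetical value chosen for illustration (read the real n from the article you pick), and the 95% confidence level is likewise an assumption.

```r
# Margin of error for a poll proportion (sketch, not the article's own numbers).
p <- 0.752           # proportion from the example above
q <- 1 - p
n <- 500             # HYPOTHETICAL sample size; use the n reported in your article

se  <- sqrt(p * q / n)     # standard error of the sample proportion
z   <- qnorm(0.975)        # critical z for an assumed 95% confidence level
moe <- z * se              # margin of error

round(c(se = se, moe = moe), 4)
c(lower = p - moe, upper = p + moe)   # approximate 95% confidence interval for p
```

Comparing moe with the margin of error printed in the article is one way to check whether the poll used the same confidence level and formula.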
@@ Line 195: / Line 216: @@
 <WRAP half column>
 ===== ideas and concepts  =====
+[[:r:probability]]
+[[:r:General Statistics]]
+==== t.test: mtcars ====
+<code>
+> mdata <- split(mtcars$mpg, mtcars$am)
+> mdata
+$`0`
+ [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
+[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2
+$`1`
+ [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
+[13] 21.4
+> stack(mdata)
+   values ind
+1    21.4   0
+2    18.7   0
+3    18.1   0
+4    14.3   0
+5    24.4   0
+6    22.8   0
+7    19.2   0
+8    17.8   0
+9    16.4   0
+10   17.3   0
+11   15.2   0
+12   10.4   0
+13   10.4   0
+14   14.7   0
+15   21.5   0
+16   15.5   0
+17   15.2   0
+18   13.3   0
+19   19.2   0
+20   21.0   1
+21   21.0   1
+22   22.8   1
+23   32.4   1
+24   30.4   1
+25   33.9   1
+26   27.3   1
+27   26.0   1
+28   30.4   1
+29   15.8   1
+30   19.7   1
+31   15.0   1
+32   21.4   1
+> mdata
+$`0`
+ [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
+[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2
+$`1`
+ [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
+[13] 21.4
+> t.test(mpg~am, data=mtcars)
+	Welch Two Sample t-test
+data:  mpg by am
+t = -3.7671, df = 18.332, p-value = 0.001374
+alternative hypothesis: true difference in means is not equal to 0
+95 percent confidence interval:
+ -11.280194  -3.209684
+sample estimates:
+mean in group 0 mean in group 1
+       17.14737        24.39231
+> t.test(mpg~am, data=mtcars, var.equal=T)
+	Two Sample t-test
+data:  mpg by am
+t = -4.1061, df = 30, p-value = 0.000285
+alternative hypothesis: true difference in means is not equal to 0
+95 percent confidence interval:
+ -10.84837  -3.64151
+sample estimates:
+mean in group 0 mean in group 1
+       17.14737        24.39231
+> m1 <- mdata[[1]]
+> m2 <- mdata[[2]]
+> m1
+ [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
+[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2
+> m2
+ [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
+[13] 21.4
+> m1.var <- var(m1)
+> m2.var <- var(m2)
+> m1.n <- length(m1)
+> m2.n <- length(m2)
+> m1.df <- length(m1)-1
+> m2.df <- length(m2)-1
+> m1.ss <- m1.var*m1.df
+> m2.ss <- m2.var*m2.df
+> m1.ss
+[1] 264.5874
+> m2.ss
+[1] 456.3092
+> m12.ss <- m1.ss+m2.ss
+> m12.ss
+[1] 720.8966
+> m12.df <- m1.df+m2.df
+> pv <- m12.ss/m12.df
+> pv
+[1] 24.02989
+> pv/m1.n
+[1] 1.264731
+> pv/m2.n
+[1] 1.848453
+> m.se <- sqrt((pv/m1.n)+(pv/m2.n))
+> m.se
+[1] 1.764422
+> m1.m <- mean(m1)
+> m2.m <- mean(m2)
+> m.tvalue <- (m1.m-m2.m)/m.se
+> m.tvalue
+[1] -4.106127
+</code>
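The hand calculation above stops at the t value. As a sketch, the p-value and the 95% confidence interval that t.test reports with var.equal=T can be recovered from the same quantities using pt and qt. The variable names below (pooled.var, mean.diff, t.value, and so on) are illustrative and do not come from the original session.

```r
# Pooled-variance two-sample t-test by hand: t, p-value, and confidence interval.
m1 <- mtcars$mpg[mtcars$am == 0]   # automatic transmission
m2 <- mtcars$mpg[mtcars$am == 1]   # manual transmission

df.pooled  <- (length(m1) - 1) + (length(m2) - 1)
pooled.var <- (var(m1) * (length(m1) - 1) +
               var(m2) * (length(m2) - 1)) / df.pooled
se         <- sqrt(pooled.var / length(m1) + pooled.var / length(m2))
mean.diff  <- mean(m1) - mean(m2)
t.value    <- mean.diff / se

p.value <- 2 * pt(-abs(t.value), df.pooled)              # two-sided p-value
ci      <- mean.diff + c(-1, 1) * qt(0.975, df.pooled) * se   # 95% interval

t.value
p.value
ci
```

These values should agree with the t.test(mpg~am, data=mtcars, var.equal=T) output shown above (t = -4.1061, df = 30, p-value = 0.000285, interval from -10.84837 to -3.64151).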
+<code>
+> t.test(mpg~am, data=mtcars, var.equal=T)
+	Two Sample t-test
+data:  mpg by am
+t = -4.1061, df = 30, p-value = 0.000285
+alternative hypothesis: true difference in means is not equal to 0
+95 percent confidence interval:
+ -10.84837  -3.64151
+sample estimates:
+mean in group 0 mean in group 1
+       17.14737        24.39231
+</code>
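The session above reproduces only the pooled-variance (var.equal=T) test by hand. For comparison, here is a sketch of the same hand calculation for the default Welch test, which keeps the two variances separate and uses the Welch-Satterthwaite approximation for the degrees of freedom. The names welch.se, welch.t, welch.df, and welch.p are illustrative.

```r
# Welch two-sample t-test by hand (variances not assumed equal).
m1 <- mtcars$mpg[mtcars$am == 0]
m2 <- mtcars$mpg[mtcars$am == 1]

v1 <- var(m1) / length(m1)   # squared standard error of the first group mean
v2 <- var(m2) / length(m2)   # squared standard error of the second group mean

welch.se <- sqrt(v1 + v2)
welch.t  <- (mean(m1) - mean(m2)) / welch.se

# Welch-Satterthwaite approximation to the degrees of freedom
welch.df <- (v1 + v2)^2 /
  (v1^2 / (length(m1) - 1) + v2^2 / (length(m2) - 1))

welch.p <- 2 * pt(-abs(welch.t), welch.df)   # two-sided p-value

c(t = welch.t, df = welch.df, p = welch.p)
```

The result should match the Welch Two Sample t-test output earlier in this section (t = -3.7671, df = 18.332, p-value = 0.001374).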
+==== anova: mtcars ====
+<code>
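+# Per-group statistics: mean, variance, n, df, and sum of squares.
+# values: numeric outcome; groups: grouping factor.
+stats4each <- function(values, groups) {
+   meani <- tapply(values, groups, mean)
+   vari <- tapply(values, groups, var)
+   ni <- tapply(values, groups, length)
+   dfi <- ni - 1
+   ssi <- vari * dfi
+   out <- rbind(meani, vari, ni, dfi, ssi)
+   return(out)
+}
+# x is the grouping variable, y is the outcome.
+# Alternatives (to use one, put it in place of the three lines that follow):
+# tempd <- iris;   x <- tempd$Species; y <- tempd$Sepal.Width
+# tempd <- mtcars; x <- tempd$gear;    y <- tempd$mpg
+tempd <- mtcars
+x <- tempd$am
+y <- tempd$mpg
+x <- factor(x)
+dfbetween <- nlevels(x)-1
+stats <- stats4each(y, x)
+stats
+# Sums of squares: within = sum of per-group SS (row 5), between = total - within
+sswithin <- sum(stats[5,])
+sstotal <- var(y)*(length(y)-1)
+ssbetween <- sstotal-sswithin
+round(sswithin,2)
+round(ssbetween,2)
+round(sstotal,2)
+# Degrees of freedom: within = sum of per-group df (row 4)
+dfwithin <- sum(stats[4,])
+dftotal <- length(y)-1
+dfwithin
+dfbetween
+dftotal
+# Mean squares
+mswithin <- sswithin / dfwithin
+msbetween <- ssbetween / dfbetween
+mstotal <- sstotal / dftotal
+round(mswithin,2)
+round(msbetween,2)
+round(mstotal,2)
+# F value; round only for display so the p-value uses the unrounded F
+fval <- msbetween/mswithin
+round(fval,2)
+siglevel <- pf(q=fval, df1=dfbetween, df2=dfwithin, lower.tail=FALSE)
+siglevel
+# Check against aov (x and y are vectors in the workspace)
+mod <- aov(y~x)
+summary(mod)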
+</code>
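Because the last grouping variable used above (am) has only two levels, the one-way ANOVA and the pooled-variance t-test are the same test: the F value equals the square of the t value. The following sketch checks this. The object names tt, mod2, and f.value are illustrative.

```r
# With two groups, one-way ANOVA F equals the squared pooled-variance t.
tt   <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)
mod2 <- aov(mpg ~ factor(am), data = mtcars)

# summary() of an aov fit is a list whose first element is the ANOVA table
f.value <- summary(mod2)[[1]][["F value"]][1]
p.anova <- summary(mod2)[[1]][["Pr(>F)"]][1]

unname(tt$statistic)^2                           # t squared
f.value                                          # F from aov
all.equal(unname(tt$statistic)^2, f.value)       # same statistic
all.equal(tt$p.value, p.anova)                   # same p-value
```

This equivalence holds only for the equal-variance t-test; the Welch test uses a different standard error and degrees of freedom, so its t squared will not match the F value.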
+==== cor ====
+<code>
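+# Pearson correlation between mpg and hp, computed three ways.
+mpg <- mtcars$mpg
+hp <- mtcars$hp
+cor(mpg, hp)
+# 1) covariance divided by the product of the standard deviations
+mycor <- cov(mpg,hp)/(sd(mpg)*sd(hp))
+mycor
+# 2) sum of products divided by the square root of SSx * SSy
+n <- length(mpg)
+sp <- cov(mpg,hp)*(n-1)
+ssx <- var(mpg)*(n-1)
+ssy <- var(hp)*(n-1)
+mycor2 <- sp/sqrt(ssx*ssy)
+mycor2
+# Compare with all.equal; == can return FALSE because of floating-point rounding
+all.equal(mycor2, mycor)
+all.equal(mycor, cor(mpg,hp))
+all.equal(mycor2, cor(mpg,hp))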
 </WRAP>
 <WRAP half column>
@@ Line 203: / Line 454: @@
 <WRAP half column>
 ===== ideas and concepts  =====
+[[:correlation]]
+[[:regression]]
+[[:multiple regression]]
+  * [[:r:correlation|correlation in r]]
+  * [[:r:multiple regression|multiple regression in r]]
+[[:Partial and semipartial correlation]]
+[[:using dummy variables]]
+[[:Statistical Regression Methods]]
+[[:Sequential Regression]]
 </WRAP>
 <WRAP half column>
 ===== Assignment =====
+  - Public opinion in online environments ((refer to {{:public.opinion.theories.introduction.pdf}} ))
+    * [[:Spiral of Silence]]
+    * [[:Pluralistic Ignorance]]
+    * [[:The Third Person Effect]]
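+    * etc. Find and introduce sociological or social-psychological theories related to public opinion formation, for example the three above. Summarize a discussion of how a recent social phenomenon could best be explained. What is needed to gauge public opinion accurately in online environments?
+    * Or address a different problem (. . . depending on the group . . .)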
+  - Hypotheses
+    * Multiple regression hypotheses.
+    * Google Survey Questions
 </WRAP>
@@ Line 219: / Line 495: @@
 <WRAP half column>
 __**Mid-term period**__
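+===== Quiz 1 =====
+  * Lecture materials + textbook
+  * Textbook: R Cookbook. For the textbook, expect multiple-choice (four options) or short-answer questions about the expected output, the command needed to obtain a given output, and what the options used in a command (function) mean. There will be no questions that require exact recall, such as writing out a function and its options.
+    * Examples
+      * Write the command for running a one-sample t-test (x)
+      * In t.test(sample, mu=100), what does mu mean? (o)
+      * Which of the following is the right shape for the output of sapply? And so on.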
+    * [[:The r project for statistical computing]]
+    * [[:r:Getting started]]
+    * [[:r:Basics]]
+    * [[:r:Navigating]]
+    * [[:r:Input output]]
+    * [[:r:Data structures]]
+    * [[:r:Data transformations]]
+  * Lecture content
+    * [[:Hypothesis]]
+    * [[:Research question]]
+    * [[:Research methods lecture note#커뮤니케이션_연구문제_제기와_가설|Raising communication research questions and hypotheses]] (this section only)
+    * [[:Operationalization]]
+    * [[:Variables]]
+    * [[:Types of variables]]
+    * [[:Hypothesis testing]]
+    * [[:T-test]]
+      * You do not need to memorize the exact t-test formulas (they will be provided).
+      * You may be asked to carry out a simple t-test calculation.
+      * The same goes for ANOVA.
+    * [[:ANOVA]]
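To make the t.test(sample, mu=100) example in the quiz notes above concrete: mu is the population mean assumed under the null hypothesis. The sketch below uses mtcars$mpg and mu = 20 purely as illustrative assumptions; neither comes from the quiz itself.

```r
# One-sample t-test: mu is the hypothesized population mean (H0: mean == mu).
x   <- mtcars$mpg              # illustrative data, not the quiz data
res <- t.test(x, mu = 20)      # HYPOTHETICAL null value of 20

# The same t value by hand: (sample mean - mu) / (sd / sqrt(n))
t.by.hand <- (mean(x) - 20) / (sd(x) / sqrt(length(x)))

res$null.value                               # the mu that was tested
all.equal(unname(res$statistic), t.by.hand)  # hand calculation agrees

# sapply over columns with a scalar-valued function returns a named vector,
# one element per column
sapply(mtcars[, c("mpg", "hp")], mean)
```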
 </WRAP>
@@ Line 224: / Line 529: @@
 <WRAP half column>
 ===== ideas and concepts  =====
+[[:correlation]]
+[[:regression]]
+[[:multiple regression]]
+  * [[:r:correlation|correlation in r]]
+  * [[:r:multiple regression|multiple regression in r]]
+[[:Partial and semipartial correlation]]
+[[:using dummy variables]]
+[[:Statistical Regression Methods]]
+[[:Sequential Regression]]
+===== Activity =====
+[[c/ma/2019/Multiple Regression Exercise]]
 </WRAP>
 <WRAP half column>
@@ Line 232: / Line 552: @@
 <WRAP half column>
 ===== ideas and concepts  =====
+[[:factor analysis]]
 </WRAP>
 <WRAP half column>
@@ Line 251: / Line 572: @@
 <WRAP half column>
 ===== Assignment =====
+[[factor analysis assignment]]
 </WRAP>
@@ Line 256: / Line 579: @@
 <WRAP half column>
 ===== ideas and concepts  =====
+[[:social network analysis]]
+[[:r:social network analysis tutorial]]
+[[:r:social network analysis|sna in r]]
+[[:sna_eg_stanford|Stanford University egs.]]
 </WRAP>
 <WRAP half column>
+===== announcement  =====
+Quiz 2 (Friday, Dec. 6) covers:
+  * [[:correlation]]
+  * [[:regression]]
+  * [[:multiple regression]]
+    * [[:partial and semipartial correlation]]
+    * [[:using dummy variables]]
+  * [[:factor analysis]]
+Some questions will present R output and ask about the related concepts and ideas listed above.
 ===== Assignment =====
 </WRAP>
@@ Line 268: / Line 605: @@
 ====== Week15 (Dec 11, 13) ======
 <WRAP half column>
-Group Presentation
+[[./assignment week15]]
 </WRAP>
 ====== Week16 (June 18, 20) ======
 <WRAP half column>
-__**Final-term**__
+__**Final-term**__ covers:
+correlation
+regression
+multiple regression
+partial and semipartial correlation
+using dummy variables
+factor analysis
+social network analysis
+sna tutorial
+[[:r:social network analysis|sna in r]]
+[[:sna_eg_stanford:lab06|SNA e.g. lab 06]]
+Some questions will present R output and ask about the related concepts and ideas listed above.
 </WRAP>