[[./|Class page]]

multivariate statistics in R

network analysis in R
  * A User’s Guide to Network Analysis in R (Use R!)
  * Statistical Analysis of Network Data with R (Use R!) 2014 Edition

[[https://lagunita.stanford.edu]]

[[https://campus.datacamp.com/courses/network-analysis-in-r|Network Analysis in R]] using igraph package -- from DataCamp

[[https://campus.datacamp.com/courses/marketing-analytics-in-r-statistical-modeling/|Marketing analysis in r statistics]] from DataCamp

====== Week01 (Sep 4, 6) ======
===== ideas and concepts =====
Introduction to R and others
  - Downloading and Installing R
  - [[:the_r_project_for_statistical_computing]]
  - [[:r]], [[:r:getting started]]
  - Starting R
  - Entering Commands
  - Exiting from R
  - Interrupting R
  - Viewing the Supplied Documentation
  - Getting Help on a Function
  - Searching the Supplied Documentation
  - Getting Help on a Package
  - Searching the Web for Help
  - Finding Relevant Functions and Packages
  - Searching the Mailing Lists
  - Submitting Questions to the Mailing Lists

using ~~[[:theories]]~~ [[http://commres.net/wiki/research_methods_lecture_note#%EC%BB%A4%EB%AE%A4%EB%8B%88%EC%BC%80%EC%9D%B4%EC%85%98_%EC%97%B0%EA%B5%AC%EB%AC%B8%EC%A0%9C_%EC%A0%9C%EA%B8%B0%EC%99%80_%EA%B0%80%EC%84%A4|Research questions and hypotheses]] and making [[:hypothesis|hypotheses]]

Installing R

===== Assignment =====

====== Week02 (Sep 11, 13) ======
===== Concepts and ideas =====
Some [[:R:basics|basics]]
  - Introduction
  - Printing Something
  - Setting Variables
  - Listing Variables
  - Deleting Variables
  - Creating a Vector
  - Computing Basic Statistics
  - Creating Sequences
  - Comparing Vectors
  - Selecting Vector Elements
  - Performing Vector Arithmetic
  - Getting Operator Precedence Right
  - Defining a Function
  - Typing Less and Accomplishing More
  - Avoiding Some Common Mistakes

  * [[:Research Question]]
  * [[:Hypothesis]]
  * Educated guess (via theories)
  * Difference
  * Association
  * Variables (vs. ideas, concepts, and constructs)
  * [[:Operationalization]]
  * [[:Variables]], [[:Types of Variables]]
  * see [[http://chohongjoong.com/gnu4/bbs/board.php?bo_table=board02&wr_id=311&sfl=&stx=&sst=wr_datetime&sod=desc&sop=and&page=1|this blog]] written in Korean
  * [[:Independent Variable|IV]] (independent variable)
  * [[:Dependent Variable|DV]] (dependent variable)
  * Control variable
  * Mediating (Intervening) variable

===== Assignment =====

====== Week03 (Sep 18, 20) ======
===== Activities =====
  * Grouping. See [[./Group]] page
  * Group discussion on group work

===== Concepts and ideas =====
You __should be knowledgeable__ about [[:research question]] and [[:hypothesis]] building. We will still deal with the issue in class. Please read those two pages and [[:research_methods_lecture_note#커뮤니케이션_연구문제_제기와_가설]] individually. The materials will be on quizzes.

[[:r:navigating|Navigating]] software
  - Introduction
  - Getting and Setting the Working Directory
  - Saving Your Workspace
  - Viewing Your Command History
  - Saving the Result of the Previous Command
  - Displaying the Search Path
  - Accessing the Functions in a Package
  - Accessing Built-in Datasets
  - Viewing the List of Installed Packages
  - Installing Packages from CRAN
  - Setting a Default CRAN Mirror
  - Suppressing the Startup Message
  - Running a Script
  - Running a Batch Script
  - Getting and Setting Environment Variables
  - Locating the R Home Directory
  - Customizing R

[[:r:input_output|Input and output]]
  - Introduction
  - Entering Data from the Keyboard
  - Printing Fewer Digits (or More Digits)
  - Redirecting Output to a File
  - Listing Files
  - Dealing with “Cannot Open File” in Windows
  - Reading Fixed-Width Records
  - Reading Tabular Data Files
  - Reading from CSV Files
  - Writing to CSV Files
  - Reading Tabular or CSV Data from the Web
  - Reading Data from HTML Tables
  - Reading Files with a Complex Structure
  - Reading from MySQL Databases
  - Saving and Transporting Objects

===== Assignment =====
Assignment for all
  * Read [[:research_methods_lecture_note#커뮤니케이션_연구문제_제기와_가설]]
  * Read [[:research question]]
  * Read [[:hypothesis]]

Group assignment
  * Read the paper "제3자 효과이론과 침묵의 나선이론 연계성" (on linking third-person effect theory and spiral of silence theory), listed under [[:hypothesis#예_1]] in the Hypothesis page, and state its hypotheses.
  * For each hypothesis, list the independent variables, dependent variables, and so on.
  * State and explain the theories used in the paper.

====== Week04 (Sep 25, 27) ======
===== Class Activity =====
  * Practice writing hypotheses
  * No need to read [[:theories]]
  * the third person effect
  * [[:Spiral of Silence]]
  * [[:cognitive dissonance]]
  * Read [[:hypothesis]]
  * [[http://behavioralsciencewriting.blogspot.kr/2011/09/how-to-write-hypothesis.html|how to write hypothesis]] at behavioral science writing.
  * One sample hypothesis [[http://www.socialresearchmethods.net/kb/hypothes.php|Hypothesis]] at www.socialresearchmethods.net

===== Concepts and ideas =====
[[:r:Data Structures]]
  - Introduction
  - Appending Data to a Vector
  - Inserting Data into a Vector
  - Understanding the Recycling Rule
  - Creating a Factor (Categorical Variable)
  - Combining Multiple Vectors into One Vector and a Factor
  - Creating a List
  - Selecting List Elements by Position
  - Selecting List Elements by Name
  - Building a Name/Value Association List
  - Removing an Element from a List
  - Flatten a List into a Vector
  - Removing NULL Elements from a List
  - Removing List Elements Using a Condition
  - Initializing a Matrix
  - Performing Matrix Operations
  - Giving Descriptive Names to the Rows and Columns of a Matrix
  - Selecting One Row or Column from a Matrix
  - Initializing a Data Frame from Column Data
  - Initializing a Data Frame from Row Data
  - Appending Rows to a Data Frame
  - Preallocating a Data Frame
  - Selecting Data Frame Columns by Position
  - Selecting Data Frame Columns by Name
  - Selecting Rows and Columns More Easily
  - Changing the Names of Data Frame Columns
  - Editing a Data Frame
  - Removing NAs from a Data Frame
  - Excluding Columns by Name
  - Combining Two Data Frames
  - Merging Data Frames by Common Column
  - Accessing Data Frame Contents More Easily
  - Converting One Atomic Value into Another
  - Converting One Structured Data Type into Another

[[:r:data transformations]]

===== Assignment =====
ga04.making.hypothesis hypothesis practice ajoubb
  * First, using the mtcars data that ships with R (use RStudio), write hypotheses that can be tested with a t-test and with an ANOVA, then run the analyses in R.
    * For hypotheses, see the [[:hypothesis testing]] page.
    * For the t-test, see [[:t-test]].
    * Check which of the four kinds of t-test applies to the mtcars data.
    * For ANOVA, see the [[:anova]] page.
    * Use the t.test and aov functions, respectively, for the analyses in R.
  * Second, look into the margin of error reported with opinion poll results in newspapers.
    * Pick two newspaper articles that report opinion poll results.
    * [[http://www.realmeter.net/%ea%b3%a0%ec%9c%84%ec%a7%81-%ec%9e%90%eb%85%80-%ec%9e%85%ec%8b%9c%eb%b9%84%eb%a6%ac-%ec%a0%84%ec%88%98%ec%a1%b0%ec%82%ac-%ec%b0%ac%ec%84%b175vs%eb%b0%98%eb%8c%8018/|Example]]
    * The standard error (se) is generally computed as follows.
    * $ \displaystyle \sigma_{\hat{p}} = \sqrt{\frac{p*q}{n}} , \;\;\; q = (1 - p) $
    * $ p = .752 $ = 75.2%
  * If you upload a file, save it under a name of the form
    * ga04.making.hypothesis.groupname.ext
  * Replace "groupname" and "ext" above to match your group and file type.
    * Group 3, for example, uses 03 in place of "groupname".
    * A file saved from MS Word gets the extension "docx"; a file saved as plain text gets "txt".
    * Following the example above, the assignment file name would be
      * ga04.making.hypothesis.03.txt

====== Week05 (Oct 2, 4) ======
===== ideas and concepts =====
[[:r:probability]]

[[:r:General Statistics]]

==== t.test: mtcars ====


> mdata <- split(mtcars$mpg, mtcars$am)
> mdata
$`0`
 [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2

$`1`
 [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
[13] 21.4

> stack(mdata)
   values ind
1    21.4   0
2    18.7   0
3    18.1   0
4    14.3   0
5    24.4   0
6    22.8   0
7    19.2   0
8    17.8   0
9    16.4   0
10   17.3   0
11   15.2   0
12   10.4   0
13   10.4   0
14   14.7   0
15   21.5   0
16   15.5   0
17   15.2   0
18   13.3   0
19   19.2   0
20   21.0   1
21   21.0   1
22   22.8   1
23   32.4   1
24   30.4   1
25   33.9   1
26   27.3   1
27   26.0   1
28   30.4   1
29   15.8   1
30   19.7   1
31   15.0   1
32   21.4   1

> t.test(mpg~am, data=mtcars)

	Welch Two Sample t-test

data:  mpg by am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean in group 0 mean in group 1 
       17.14737        24.39231 

> t.test(mpg~am, data=mtcars, var.equal=TRUE)

	Two Sample t-test

data:  mpg by am
t = -4.1061, df = 30, p-value = 0.000285
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.84837  -3.64151
sample estimates:
mean in group 0 mean in group 1 
       17.14737        24.39231 

> m1 <- mdata[[1]]
> m2 <- mdata[[2]]
> m1
 [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2
> m2
 [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
[13] 21.4
> m1.var <- var(m1)
> m2.var <- var(m2)
> m1.n <- length(m1)
> m2.n <- length(m2)
> m1.df <- length(m1)-1
> m2.df <- length(m2)-1
> m1.ss <- m1.var*m1.df
> m2.ss <- m2.var*m2.df
> m1.ss
[1] 264.5874
> m2.ss
[1] 456.3092
> m12.ss <- m1.ss+m2.ss
> m12.ss
[1] 720.8966
> m12.df <- m1.df+m2.df
> pv <- m12.ss/m12.df
> pv
[1] 24.02989
> pv/m1.n
[1] 1.264731
> pv/m2.n
[1] 1.848453
> m.se <- sqrt((pv/m1.n)+(pv/m2.n))
> m.se
[1] 1.764422
> m1.m <- mean(m1)
> m2.m <- mean(m2)
> m.tvalue <- (m1.m-m2.m)/m.se
> m.tvalue
[1] -4.106127


> t.test(mpg~am, data=mtcars, var.equal=TRUE)

	Two Sample t-test

data:  mpg by am
t = -4.1061, df = 30, p-value = 0.000285
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.84837  -3.64151
sample estimates:
mean in group 0 mean in group 1 
       17.14737        24.39231
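
The hand calculation above reproduces the pooled-variance result (t = -4.1061, df = 30). A similar sketch, assuming the usual Welch formulas (a separate variance for each group and the Welch-Satterthwaite approximation for the degrees of freedom), should reproduce the default Welch output as well. The names `g0`, `g1`, `welch_se`, `welch_t`, `welch_df`, `welch_p`, and `ref` are illustrative and not part of the original session.

```r
# Split mpg by transmission type (am: 0 or 1), as in the session above
groups <- split(mtcars$mpg, mtcars$am)
g0 <- groups[["0"]]
g1 <- groups[["1"]]

# Variance of each group mean: s^2 / n (no pooling)
v0 <- var(g0) / length(g0)
v1 <- var(g1) / length(g1)

# Welch standard error and t statistic
welch_se <- sqrt(v0 + v1)
welch_t <- (mean(g0) - mean(g1)) / welch_se

# Welch-Satterthwaite approximation for the degrees of freedom
welch_df <- (v0 + v1)^2 /
  (v0^2 / (length(g0) - 1) + v1^2 / (length(g1) - 1))

# Two-sided p-value from the t distribution
welch_p <- 2 * pt(-abs(welch_t), df = welch_df)

round(c(t = welch_t, df = welch_df, p = welch_p), 4)

# Cross-check against t.test(), whose default is the Welch test
ref <- t.test(mpg ~ am, data = mtcars)
all.equal(welch_t, unname(ref$statistic))
all.equal(welch_df, unname(ref$parameter))
all.equal(welch_p, ref$p.value)
```

Because the two groups have unequal variances and unequal sizes, the Welch degrees of freedom fall below the pooled value of 30, which is why the two `t.test` calls report different p-values.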

==== anova: mtcars ====


# Per-group descriptives for values x split by grouping factor y:
# mean, variance, n, df (n - 1), and sum of squares (variance * df).
stats4each <- function(x,y) {
   meani <- tapply(x,y,mean)
   vari <- tapply(x,y,var)
   ni <- tapply(x,y,length)
   dfi <- ni-1
   ssi <- vari*dfi
   out <- rbind(meani,vari,ni,dfi,ssi)
   
   return(out)  
}

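# iris and mtcars ship with base R (package datasets); no extra library is needed.
# Three alternative examples follow. Each one overwrites x and y, so only the
# last one (mpg by am) is analyzed when the script is run from top to bottom.

# Example 1: sepal width by species (three groups)
tempd <- iris
x <- tempd$Species
y <- tempd$Sepal.Width

# Example 2: mpg by number of gears (three groups)
tempd <- mtcars
x <- tempd$gear
y <- tempd$mpg

# Example 3: mpg by transmission type (two groups)
tempd <- mtcars
x <- tempd$am
y <- tempd$mpg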


x <- factor(x)
dfbetween <- nlevels(x)-1

stats <- stats4each(y, x)
stats 

sswithin <- sum(stats[5,])
sstotal <- var(y)*(length(y)-1)
ssbetween <- sstotal-sswithin

round(sswithin,2)
round(ssbetween,2)
round(sstotal,2)

dfwithin <- sum(stats[4,])
dftotal <- length(y)-1

dfwithin
dfbetween
dftotal

mswithin <- sswithin / dfwithin
msbetween <- ssbetween / dfbetween
mstotal <- sstotal / dftotal

round(mswithin,2)
round(msbetween,2)
round(mstotal,2)

fval <- msbetween/mswithin
round(fval,2)
# pass the unrounded F value so the p-value matches summary(aov(...))
siglevel <- pf(q=fval, df1=dfbetween, df2=dfwithin, lower.tail=FALSE)
siglevel

mod <- aov(y~x, data=tempd)
summary(mod)
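
When the grouping variable has only two levels, as with `am`, a one-way ANOVA and the pooled-variance t-test are the same test: the F value equals the squared t value and the p-values agree. A minimal sketch to check this on `mtcars`; the object names (`fit`, `aov_tab`, `f_val`, `f_p`, `tt`, `t_val`) are illustrative assumptions.

```r
# One-way ANOVA of mpg on transmission type
fit <- aov(mpg ~ factor(am), data = mtcars)
aov_tab <- summary(fit)[[1]]          # ANOVA table as a data frame
f_val <- aov_tab[["F value"]][1]
f_p <- aov_tab[["Pr(>F)"]][1]

# Pooled-variance (Student) t-test on the same data
tt <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)
t_val <- unname(tt$statistic)

# F should equal t^2, and the two p-values should match
c(F = f_val, t_squared = t_val^2)
all.equal(f_val, t_val^2)
all.equal(f_p, tt$p.value)
```

The equivalence holds only for the pooled test; the Welch test does not assume equal variances and generally gives a different statistic.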

==== cor ====


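# take the two columns explicitly instead of attach(), which can mask other objects
mpg <- mtcars$mpg
hp <- mtcars$hp
n <- length(mpg)
cor(mpg, hp)

# r = covariance / (sd_x * sd_y)
mycor <- cov(mpg,hp)/(sd(mpg)*sd(hp))
mycor

# r = SP / sqrt(SSx * SSy)
sp <- cov(mpg,hp)*(n-1)
ssx <- var(mpg)*(n-1)
ssy <- var(hp)*(n-1)

mycor2 <- sp/sqrt(ssx*ssy)
mycor2

# compare with all.equal(); == can return FALSE because of floating-point rounding
all.equal(mycor2, mycor)
all.equal(mycor, cor(mpg,hp))
all.equal(mycor2, cor(mpg,hp))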
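
The coefficient alone does not say whether the association is statistically significant. A minimal sketch, assuming the standard test of a Pearson correlation against zero (t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom), computes the test by hand and compares it with `cor.test()`. The names `r`, `r_t`, `r_p`, and `ct` are illustrative.

```r
# Pearson correlation between mpg and hp
r <- cor(mtcars$mpg, mtcars$hp)
n <- nrow(mtcars)

# t statistic and two-sided p-value for H0: rho = 0
r_t <- r * sqrt(n - 2) / sqrt(1 - r^2)
r_p <- 2 * pt(-abs(r_t), df = n - 2)

round(c(r = r, t = r_t, df = n - 2), 4)
r_p

# Cross-check against the built-in test
ct <- cor.test(mtcars$mpg, mtcars$hp)
all.equal(r_t, unname(ct$statistic))
all.equal(r_p, ct$p.value)
```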

===== Assignment =====

====== Week06 (Oct 9, 11) ======
===== ideas and concepts =====
[[:correlation]]

[[:regression]]

[[:multiple regression]]
  * [[:r:correlation|correlation in r]]
  * [[:r:multiple regression|multiple regression in r]]

[[:Partial and semipartial correlation]]

[[:using dummy variables]]

[[:Statistical Regression Methods]]

[[:Sequential Regression]]

===== Assignment =====
  - Public opinion in online environments ((refer to {{:public.opinion.theories.introduction.pdf}} ))
    * [[:Spiral of Silence]]
    * [[:Pluralistic Ignorance]]
    * [[:The Third Person Effect]]
    * etc. Find and introduce sociological or social-psychological theories related to public opinion formation, such as the three above. Summarize a discussion of how a recent social phenomenon might be explained. What is needed to gauge public opinion accurately in online environments?
    * Or on another topic (. . . depending on the group . . .)
  - Hypotheses
    * Multiple regression hypotheses.
    * Google Survey Questions

====== Week07 (Oct 16, 18) ======
===== ideas and concepts =====
===== Assignment =====

====== Week08 (Oct 23, 25) ======
__**Mid-term period**__
===== Quiz the first one =====
  * Lecture materials + textbook
  * Textbook: R Cookbook. For the textbook, expect four-option multiple-choice or short-answer questions about the expected output, the command needed to obtain an output, and what the options of a command (function) mean. Questions that demand exact recall, such as precisely how to write a function's options, will not be asked.
    * Examples
      * Write the command for a one-sample t-test (x: will not be asked)
      * In t.test(sample, mu=100), what does mu mean? (o: may be asked)
      * Which of the following is the proper shape of sapply's output? And so on.
    * [[:The r project for statistical computing]]
    * [[:r:Getting started]]
    * [[:r:Basics]]
    * [[:r:Navigating]]
    * [[:r:Input output]]
    * [[:r:Data structures]]
    * [[:r:Data transformations]]
  * Lecture content
    * [[:Hypothesis]],
    * [[:Research question]],
    * [[:Research methods lecture note#커뮤니케이션_연구문제_제기와_가설|Raising communication research questions and hypotheses]] section only
    * [[:Operationalization]],
    * [[:Variables]],
    * [[:Types of variables]]
    * [[:Hypothesis testing]]
    * [[:T-test]]
      * You do not need to memorize the exact t-test formulas (they will be provided).
      * A simple t-test calculation may be required.
      * The same applies to ANOVA.
    * [[:ANOVA]]

====== Week09 (Oct 30, Nov 1) ======
===== ideas and concepts =====
[[:correlation]]

[[:regression]]

[[:multiple regression]]
  * [[:r:correlation|correlation in r]]
  * [[:r:multiple regression|multiple regression in r]]

[[:Partial and semipartial correlation]]

[[:using dummy variables]]

[[:Statistical Regression Methods]]

[[:Sequential Regression]]

===== Activity =====
[[c/ma/2019/Multiple Regression Exercise]]

===== Assignment =====

====== Week10 (Nov 6, 8) ======
===== ideas and concepts =====
[[:factor analysis]]

===== Assignment =====

====== Week11 (Nov 13, 15) ======
===== ideas and concepts =====
===== Assignment =====

====== Week12 (Nov 20, 22) ======
===== ideas and concepts =====
===== Assignment =====
[[factor analysis assignment]]

====== Week13 (Nov 27, 29) ======
===== ideas and concepts =====
[[:social network analysis]]

[[:r:social network analysis tutorial]]

[[:r:social network analysis|sna in r]]

[[:sna_eg_stanford|Stanford University egs.]]

===== announcement =====
Quiz 2 (on Friday, Dec. 6) covers:
  * [[:correlation]]
  * [[:regression]]
  * [[:multiple regression]]
  * [[:partial and semipartial correlation]]
  * [[:using dummy variables]]
  * [[:factor analysis]]

Some R outputs will be used in questions about the related concepts and ideas listed above.

===== Assignment =====

====== Week14 (Dec 4, 6) ======
Group Presentation

====== Week15 (Dec 11, 13) ======
[[./assignment week15]]

====== Week16 (Dec 18, 20) ======
__**Final-term**__ covers:
  * correlation
  * regression
  * multiple regression
  * partial and semipartial correlation
  * using dummy variables
  * factor analysis
  * [[:social network analysis]]
  * [[:r:social network analysis tutorial|sna tutorial]]
  * [[:r:social network analysis|sna in r]]
  * [[:sna_eg_stanford:lab06|SNA e.g. lab 06]]

Some R outputs will be used in questions about the related concepts and ideas listed above.