[[./|Class page]]

multivariate statistics in R
network analysis in R
  * A User’s Guide to Network Analysis in R (Use R!)
  * Statistical Analysis of Network Data with R (Use R!), 2014 edition

[[https://lagunita.stanford.edu]]
[[https://campus.datacamp.com/courses/network-analysis-in-r|Network Analysis in R]] using igraph package -- from Datacamp
[[https://campus.datacamp.com/courses/marketing-analytics-in-r-statistical-modeling/|Marketing Analytics in R: Statistical Modeling]] from Datacamp

====== Week01 (Sep 4, 6) ======
===== ideas and concepts =====
Introduction to R and others
  - Downloading and Installing R
  - [[:the_r_project_for_statistical_computing]]
  - [[:r]], [[:r:getting started]]
  - Starting R
  - Entering Commands
  - Exiting from R
  - Interrupting R
  - Viewing the Supplied Documentation
  - Getting Help on a Function
  - Searching the Supplied Documentation
  - Getting Help on a Package
  - Searching the Web for Help
  - Finding Relevant Functions and Packages
  - Searching the Mailing Lists
  - Submitting Questions to the Mailing Lists

Using [[:theories]] and making [[:hypothesis|hypotheses]]; see [[http://commres.net/wiki/research_methods_lecture_note#%EC%BB%A4%EB%AE%A4%EB%8B%88%EC%BC%80%EC%9D%B4%EC%85%98_%EC%97%B0%EA%B5%AC%EB%AC%B8%EC%A0%9C_%EC%A0%9C%EA%B8%B0%EC%99%80_%EA%B0%80%EC%84%A4|Research questions and hypotheses]]

Installing R
===== Assignment =====

====== Week02 (Sep 11, 13) ======
===== Concepts and ideas =====
Some [[:R:basics|basics]]
  - Introduction
  - Printing Something
  - Setting Variables
  - Listing Variables
  - Deleting Variables
  - Creating a Vector
  - Computing Basic Statistics
  - Creating Sequences
  - Comparing Vectors
  - Selecting Vector Elements
  - Performing Vector Arithmetic
  - Getting Operator Precedence Right
  - Defining a Function
  - Typing Less and Accomplishing More
  - Avoiding Some Common Mistakes

  * [[:Research Question]]
  * [[:Hypothesis]]
    * Educated guess (via theories)
    * Difference
    * Association
  * Variables (vs. ideas, concepts, and constructs)
  * [[:Operationalization]]
  * [[:Variables]], [[:Types of Variables]]
    * see [[http://chohongjoong.com/gnu4/bbs/board.php?bo_table=board02&wr_id=311&sfl=&stx=&sst=wr_datetime&sod=desc&sop=and&page=1|this blog]] written in Korean
    * [[:Independent Variable|IV]] (independent variable)
    * [[:Dependent Variable|DV]] (dependent variable)
    * Control variable
    * Mediating (intervening) variable
===== Assignment =====

====== Week03 (Sep 18, 20) ======
===== Activities =====
  * Grouping. See [[./Group]] page
  * Group discussion on group work
===== Concepts and ideas =====
You __should be knowledgeable__ about building a [[:research question]] and a [[:hypothesis]]. We will still deal with these issues in class. Please read both pages and [[:research_methods_lecture_note#커뮤니케이션_연구문제_제기와_가설]] on your own. The material will appear on the quizzes.

[[:r:navigating|Navigating]] software
  - Introduction
  - Getting and Setting the Working Directory
  - Saving Your Workspace
  - Viewing Your Command History
  - Saving the Result of the Previous Command
  - Displaying the Search Path
  - Accessing the Functions in a Package
  - Accessing Built-in Datasets
  - Viewing the List of Installed Packages
  - Installing Packages from CRAN
  - Setting a Default CRAN Mirror
  - Suppressing the Startup Message
  - Running a Script
  - Running a Batch Script
  - Getting and Setting Environment Variables
  - Locating the R Home Directory
  - Customizing R

[[:r:input_output|Input and output]]
  - Introduction
  - Entering Data from the Keyboard
  - Printing Fewer Digits (or More Digits)
  - Redirecting Output to a File
  - Listing Files
  - Dealing with “Cannot Open File” in Windows
  - Reading Fixed-Width Records
  - Reading Tabular Data Files
  - Reading from CSV Files
  - Writing to CSV Files
  - Reading Tabular or CSV Data from the Web
  - Reading Data from HTML Tables
  - Reading Files with a Complex Structure
  - Reading from MySQL Databases
  - Saving and Transporting Objects
===== Assignment =====
Assignment for all
  * Read [[:research_methods_lecture_note#커뮤니케이션_연구문제_제기와_가설]]
  * Read [[:research question]]
  * Read [[:hypothesis]]

Group assignment
  * Read the paper "제3자 효과이론과 침묵의 나선이론 연계성" (on the link between the third-person effect theory and the spiral of silence theory), listed under [[:hypothesis#예_1]] on the Hypothesis page, and state its hypotheses.
  * For each hypothesis, list the independent variables, dependent variables, and so on.
  * State and explain the theories used in the paper.

====== Week04 (Sep 25, 27) ======
===== Class Activity =====
  * Practice writing hypotheses
  * No need to read [[:theories]]
    * the third person effect
    * [[:Spiral of Silence]]
    * [[:cognitive dissonance]]
  * Read [[:hypothesis]]
    * [[http://behavioralsciencewriting.blogspot.kr/2011/09/how-to-write-hypothesis.html|how to write hypothesis]] at behavioral science writing.
    * One sample hypothesis [[http://www.socialresearchmethods.net/kb/hypothes.php|Hypothesis]] at www.socialresearchmethods.net
===== Concepts and ideas =====
[[:r:Data Structures]]
  - Introduction
  - Appending Data to a Vector
  - Inserting Data into a Vector
  - Understanding the Recycling Rule
  - Creating a Factor (Categorical Variable)
  - Combining Multiple Vectors into One Vector and a Factor
  - Creating a List
  - Selecting List Elements by Position
  - Selecting List Elements by Name
  - Building a Name/Value Association List
  - Removing an Element from a List
  - Flatten a List into a Vector
  - Removing NULL Elements from a List
  - Removing List Elements Using a Condition
  - Initializing a Matrix
  - Performing Matrix Operations
  - Giving Descriptive Names to the Rows and Columns of a Matrix
  - Selecting One Row or Column from a Matrix
  - Initializing a Data Frame from Column Data
  - Initializing a Data Frame from Row Data
  - Appending Rows to a Data Frame
  - Preallocating a Data Frame
  - Selecting Data Frame Columns by Position
  - Selecting Data Frame Columns by Name
  - Selecting Rows and Columns More Easily
  - Changing the Names of Data Frame Columns
  - Editing a Data Frame
  - Removing NAs from a Data Frame
  - Excluding Columns by Name
  - Combining Two Data Frames
  - Merging Data Frames by Common Column
  - Accessing Data Frame Contents More Easily
  - Converting One Atomic Value into Another
  - Converting One Structured Data Type into Another

[[:r:data transformations]]
===== Assignment =====
ga04.making.hypothesis: hypothesis practice (ajoubb)
  * First, using the mtcars data set that ships with R (use RStudio), write hypotheses that can be tested with a t-test and an ANOVA, and run the analyses in R.
    * For hypotheses, refer to the [[:hypothesis testing]] page.
    * For the t-test, refer to [[:t-test]].
      * Check which of the four types of t-test applies to the mtcars data.
    * For ANOVA, refer to the [[:anova]] page.
    * In R, use the t.test and aov functions, respectively.
  * Second, look into the margin of error reported with opinion poll results in newspapers.
    * Pick two newspaper articles that report opinion poll results.
      * [[http://www.realmeter.net/%ea%b3%a0%ec%9c%84%ec%a7%81-%ec%9e%90%eb%85%80-%ec%9e%85%ec%8b%9c%eb%b9%84%eb%a6%ac-%ec%a0%84%ec%88%98%ec%a1%b0%ec%82%ac-%ec%b0%ac%ec%84%b175vs%eb%b0%98%eb%8c%8018/|Example]]
    * The standard error is usually computed as follows.
      * $ \displaystyle \sigma_{\hat{p}} = \sqrt{\frac{p \cdot q}{n}} , \;\;\; q = (1 - p) $
      * $ p = .752 $ = 75.2%
  * If you upload a file, save it under a name of the form
    * ga04.making.hypothesis.groupname.ext
    * Replace "groupname" and "ext" according to your group and file type.
    * Group 3, for example, uses 03 in place of "groupname".
    * If you saved an MS Word file, the file extension will be "docx"; if you saved a text file, it will be "txt".
    * Following the example above, the assignment file name would therefore be
      * ga04.making.hypothesis.03.txt
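
The standard-error formula in the assignment above can be checked numerically in R. The following is only a minimal sketch: the value of p comes from the example, but the sample size n = 500 is a hypothetical figure chosen for illustration (the poll's actual n has to be taken from the article), and the 95% confidence level is assumed because that is the level polls conventionally report.

```r
# Standard error and margin of error for a sample proportion.
# p is taken from the example above; n is a HYPOTHETICAL sample size
# (an assumption for illustration; use the n reported in the article).
p <- 0.752
q <- 1 - p
n <- 500

se  <- sqrt(p * q / n)   # standard error of the sample proportion
z   <- qnorm(0.975)      # two-sided 95% critical value (about 1.96), assumed level
moe <- z * se            # margin of error

round(c(se = se, moe = moe), 4)
c(lower = p - moe, upper = p + moe)   # approximate 95% confidence interval
```

Comparing the computed margin of error with the one printed in each newspaper article is one way to see which n and which confidence level the pollster used.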
====== Week05 (Oct 2, 4) ======
===== ideas and concepts =====
[[:r:probability]]
[[:r:General Statistics]]

==== t.test: mtcars ====
<code>
> # split mpg by transmission type (am: 0 = automatic, 1 = manual)
> mdata <- split(mtcars$mpg, mtcars$am)
> mdata
$`0`
 [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2

$`1`
 [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
[13] 21.4

> stack(mdata)
   values ind
1    21.4   0
2    18.7   0
3    18.1   0
4    14.3   0
5    24.4   0
6    22.8   0
7    19.2   0
8    17.8   0
9    16.4   0
10   17.3   0
11   15.2   0
12   10.4   0
13   10.4   0
14   14.7   0
15   21.5   0
16   15.5   0
17   15.2   0
18   13.3   0
19   19.2   0
20   21.0   1
21   21.0   1
22   22.8   1
23   32.4   1
24   30.4   1
25   33.9   1
26   27.3   1
27   26.0   1
28   30.4   1
29   15.8   1
30   19.7   1
31   15.0   1
32   21.4   1
> t.test(mpg~am, data=mtcars)

        Welch Two Sample t-test

data:  mpg by am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean in group 0 mean in group 1
       17.14737        24.39231

> t.test(mpg~am, data=mtcars, var.equal=T)

        Two Sample t-test

data:  mpg by am
t = -4.1061, df = 30, p-value = 0.000285
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.84837  -3.64151
sample estimates:
mean in group 0 mean in group 1
       17.14737        24.39231

> # the same t value computed by hand with the pooled variance
> m1 <- mdata[[1]]
> m2 <- mdata[[2]]
> m1
 [1] 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[13] 10.4 14.7 21.5 15.5 15.2 13.3 19.2
> m2
 [1] 21.0 21.0 22.8 32.4 30.4 33.9 27.3 26.0 30.4 15.8 19.7 15.0
[13] 21.4
> m1.var <- var(m1)
> m2.var <- var(m2)
> m1.n <- length(m1)
> m2.n <- length(m2)
> m1.df <- length(m1)-1
> m2.df <- length(m2)-1
> m1.ss <- m1.var*m1.df
> m2.ss <- m2.var*m2.df
> m1.ss
[1] 264.5874
> m2.ss
[1] 456.3092
> m12.ss <- m1.ss+m2.ss
> m12.ss
[1] 720.8966
> m12.df <- m1.df+m2.df
> pv <- m12.ss/m12.df     # pooled variance
> pv
[1] 24.02989
> pv/m1.n
[1] 1.264731
> pv/m2.n
[1] 1.848453
> m.se <- sqrt((pv/m1.n)+(pv/m2.n))     # standard error of the mean difference
> m.se
[1] 1.764422
> m1.m <- mean(m1)
> m2.m <- mean(m2)
> m.tvalue <- (m1.m-m2.m)/m.se
> m.tvalue
[1] -4.106127
> t.test(mpg~am, data=mtcars, var.equal=T)

        Two Sample t-test

data:  mpg by am
t = -4.1061, df = 30, p-value = 0.000285
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.84837  -3.64151
sample estimates:
mean in group 0 mean in group 1
       17.14737        24.39231
</code>

==== anova: mtcars ====
<code>
# Per-group statistics: mean, variance, n, df, and sum of squares
stats4each <- function(x, y) {
  meani <- tapply(x, y, mean)
  vari  <- tapply(x, y, var)
  ni    <- tapply(x, y, length)
  dfi   <- ni - 1
  ssi   <- vari * dfi
  out   <- rbind(meani, vari, ni, dfi, ssi)
  return(out)
}

library(MASS)   # not required for iris or mtcars; both ship with base R

# Choose one data set: x is the grouping variable, y is the outcome.
# Only the last (uncommented) choice is used below.
# tempd <- iris;   x <- tempd$Species; y <- tempd$Sepal.Width
# tempd <- mtcars; x <- tempd$gear;    y <- tempd$mpg
tempd <- mtcars
x <- tempd$am
y <- tempd$mpg

x <- factor(x)
dfbetween <- nlevels(x) - 1

stats <- stats4each(y, x)
stats

# sums of squares (row 5 of stats holds the per-group SS)
sswithin  <- sum(stats[5, ])
sstotal   <- var(y) * (length(y) - 1)
ssbetween <- sstotal - sswithin
round(sswithin, 2)
round(ssbetween, 2)
round(sstotal, 2)

# degrees of freedom (row 4 of stats holds the per-group df)
dfwithin <- sum(stats[4, ])
dftotal  <- length(y) - 1
dfwithin
dfbetween
dftotal

# mean squares
mswithin  <- sswithin / dfwithin
msbetween <- ssbetween / dfbetween
mstotal   <- sstotal / dftotal
round(mswithin, 2)
round(msbetween, 2)
round(mstotal, 2)

# F value and its p-value (round only for display, not before pf())
fval <- msbetween / mswithin
round(fval, 2)
siglevel <- pf(q = fval, df1 = dfbetween, df2 = dfwithin, lower.tail = FALSE)
siglevel

# compare with the built-in analysis
mod <- aov(y ~ x, data = tempd)
summary(mod)
</code>

==== cor ====
<code>
attach(mtcars)
cor(mpg, hp)

# correlation as covariance divided by the product of standard deviations
mycor <- cov(mpg, hp) / (sd(mpg) * sd(hp))
mycor

# correlation as SP / sqrt(SSx * SSy)
sp  <- cov(mpg, hp) * (length(mtcars$hp) - 1)
ssx <- var(mpg) * (length(mtcars$mpg) - 1)
ssy <- var(hp) * (length(mtcars$hp) - 1)
mycor2 <- sp / sqrt(ssx * ssy)
mycor2

# compare with a tolerance; == on floating-point results can be FALSE
isTRUE(all.equal(mycor2, mycor))
isTRUE(all.equal(mycor, cor(mpg, hp)))
isTRUE(all.equal(mycor2, cor(mpg, hp)))
detach(mtcars)
</code>
===== Assignment =====

====== Week06 (Oct 9, 11) ======
===== ideas and concepts =====
[[:correlation]]
[[:regression]]
[[:multiple regression]]
  * [[:r:correlation|correlation in r]]
  * [[:r:multiple regression|multiple regression in r]]

[[:Partial and semipartial correlation]]
[[:using dummy variables]]
[[:Statistical Regression Methods]]
[[:Sequential Regression]]
===== Assignment =====
  - Public opinion in online environments ((refer to {{:public.opinion.theories.introduction.pdf}} ))
    * [[:Spiral of Silence]]
    * [[:Pluralistic Ignorance]]
    * [[:The Third Person Effect]]
    * etc. Find and introduce sociological or social-psychological theories related to public opinion formation, such as the three above. Summarize a discussion of how a recent social phenomenon could best be explained. What is needed to gauge public opinion accurately in online environments?
    * Or another topic (. . . depending on the group . . .)
  - Hypotheses
    * Multiple regression hypotheses.
    * Google Survey Questions

====== Week07 (Oct 16, 18) ======
===== ideas and concepts =====
===== Assignment =====

====== Week08 (Oct 23, 25) ======
__**Mid-term period**__
===== Quiz 1 =====
  * Lecture materials + textbook
  * Textbook: R Cookbook. For the textbook, expect multiple-choice (four options) or short-answer questions about the expected output, the command needed to obtain an output, and what the options of a command (function) mean. There will be no questions that require exact recall, such as precisely how to use a function's options.
    * Examples
      * Write the command for a one-sample t-test (x: will not be asked)
      * In t.test(sample, mu=100), what does mu mean? (o: may be asked)
      * Which of the following is a plausible shape for the output of sapply? And so on.
    * [[:The r project for statistical computing]]
    * [[:r:Getting started]]
    * [[:r:Basics]]
    * [[:r:Navigating]]
    * [[:r:Input output]]
    * [[:r:Data structures]]
    * [[:r:Data transformations]]
  * Lecture content
    * [[:Hypothesis]],
    * [[:Research question]],
    * [[:Research methods lecture note#커뮤니케이션_연구문제_제기와_가설|Raising communication research questions and hypotheses]] section only
    * [[:Operationalization]],
    * [[:Variables]],
    * [[:Types of variables]]
    * [[:Hypothesis testing]]
    * [[:T-test]]
      * You do not need to memorize the exact t-test formulas (they will be provided).
      * You may be asked to do a simple t-test calculation.
      * The same goes for ANOVA.
    * [[:ANOVA]]

====== Week09 (Oct 30, Nov 1) ======
===== ideas and concepts =====
[[:correlation]]
[[:regression]]
[[:multiple regression]]
  * [[:r:correlation|correlation in r]]
  * [[:r:multiple regression|multiple regression in r]]

[[:Partial and semipartial correlation]]
[[:using dummy variables]]
[[:Statistical Regression Methods]]
[[:Sequential Regression]]
===== Activity =====
[[c/ma/2019/Multiple Regression Exercise]]
===== Assignment =====

====== Week10 (Nov 6, 8) ======
===== ideas and concepts =====
[[:factor analysis]]
===== Assignment =====

====== Week11 (Nov 13, 15) ======
===== ideas and concepts =====
===== Assignment =====

====== Week12 (Nov 20, 22) ======
===== ideas and concepts =====
===== Assignment =====
[[factor analysis assignment]]

====== Week13 (Nov 27, 29) ======
===== ideas and concepts =====
[[:social network analysis]]
[[:r:social network analysis tutorial]]
[[:r:social network analysis|sna in r]]
[[:sna_eg_stanford|Stanford University egs.]]
===== announcement =====
Quiz 2 (on Friday, Dec. 6) covers:
  * [[:correlation]]
  * [[:regression]]
  * [[:multiple regression]]
  * [[:partial and semipartial correlation]]
  * [[:using dummy variables]]
  * [[:factor analysis]]

Some R outputs will be used to ask about the related concepts and ideas listed above.
===== Assignment =====

====== Week14 (Dec 4, 6) ======
Group Presentation

====== Week15 (Dec 11, 13) ======
[[./assignment week15]]

====== Week16 (Dec 18, 20) ======
__**Final-term**__ covers:
  * correlation
  * regression
  * multiple regression
  * partial and semipartial correlation
  * using dummy variables
  * factor analysis
  * [[:social network analysis]]
  * [[:r:social network analysis tutorial|sna tutorial]]
  * [[:r:social network analysis|sna in r]]
  * [[:sna_eg_stanford:lab06|SNA e.g. lab 06]]

Some R outputs will be used to ask about the related concepts and ideas listed above.
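
As an illustration of the kind of R output such questions may refer to, here is a minimal sketch using the built-in mtcars data. The choice of variables (mpg, hp, wt) is an assumption made for this example only and is not taken from the exam material.

```r
# Correlation, simple regression, and multiple regression on mtcars.
# The variables mpg, hp, and wt are chosen only for illustration.
r    <- cor(mtcars$mpg, mtcars$hp)         # Pearson correlation
fit1 <- lm(mpg ~ hp, data = mtcars)        # simple regression
fit2 <- lm(mpg ~ hp + wt, data = mtcars)   # multiple regression

r^2
summary(fit1)$r.squared   # in a simple regression this equals r^2
summary(fit2)             # coefficients, R-squared, and the F statistic
```

When reading the summary, the things to look for are the sign and significance of each coefficient, and how R-squared changes once a second predictor is added.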