c:ma:2018:schedule

Differences

This shows you the differences between two versions of the page.

c:ma:2018:schedule [2018/11/27 01:44] hkimscil
c:ma:2018:schedule [2018/12/17 10:01] (current) – [Week15 (Dec. 11, 14)] hkimscil
Line 435: Line 435:
 <WRAP half column> <WRAP half column>
 ===== Concepts and ideas ===== ===== Concepts and ideas =====
-[[:Regression]] [[:Multiple Regression]]
-[[:r:Linear Regression]] and [[:r:ANOVA]]
-  - Introduction
-  - Performing Simple Linear Regression
-  - Performing Multiple Linear Regression
-  - Getting Regression Statistics
-  - Understanding the Regression Summary
-  - Performing Linear Regression Without an Intercept
-  - Performing Linear Regression with Interaction Terms
-  - Selecting the Best Regression Variables
-  - Regressing on a Subset of Your Data
-  - Using an Expression Inside Regression Formula
-  - Regressing on Polynomial
-  - Regressing on Transformed Data
-  - Finding the Best Power Transformation (Box-Cox Procedure)
-  - Forming Confidence Intervals for Regression Coefficients
-  - Plotting Regression Residuals
-  - Diagnosing a Linear Regression
-  - Identifying Influential Observations
-  - Testing Residuals for Autocorrelation (Durbin-Watson Test)
-  - Predicting New Values
-  - Forming Prediction Intervals
+Do the following
+<code>S1 <- c(89, 85, 85, 86, 88, 89, 86, 82, 96, 85, 93, 91,
+        98, 87, 94, 77, 87, 98, 85, 89, 95, 85, 93, 93,
+        97, 71, 97, 93, 75, 68, 98, 95, 79, 94, 98, 95)
+S2 <- c(60, 98, 94, 95, 99, 97, 100, 73, 93, 91, 98,
+        86, 66, 83, 77, 97, 91, 93, 71, 91, 95, 100,
+        72, 96, 91, 76, 100, 97, 99, 95, 97, 77, 94,
+        99, 88, 100, 94, 93, 86)
+S3 <- c(95, 86, 90, 90, 75, 83, 96, 85, 83, 84, 81, 98,
+        77, 94, 84, 89, 93, 99, 91, 77, 95, 90, 91, 87,
+        85, 76, 99, 99, 97, 97, 97, 77, 93, 96, 90, 87,
+        97, 88)
+S4 <- c(67, 93, 63, 83, 87, 97, 96, 92, 93, 96, 87, 90,
+        94, 90, 82, 91, 85, 93, 83, 90, 87, 99, 94, 88,
+        90, 72, 81, 93, 93, 94, 97, 89, 96, 95, 82, 97)
+
+scores <- list(S1=S1,S2=S2,S3=S3,S4=S4)</code>
+
+  * find means for each element in "scores" in a list format
+  * find standard deviation for each element in "scores" in a data frame format
+  * find variance for each element in "scores" in a data frame format without using "var" function
+
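One possible way to approach the three "scores" tasks is sketched below. To keep the block self-contained it uses a small made-up list (hypothetical values, not the S1 to S4 data), and `manual_var` is an illustrative helper name rather than anything from the course material.

```r
# Toy stand-in for the "scores" list (hypothetical values).
scores <- list(S1 = c(2, 4, 6), S2 = c(1, 3, 5, 7))

# Means in list format: lapply() always returns a list.
score_means <- lapply(scores, mean)

# Standard deviations in data frame format: data.frame() turns a named
# list into one column per element.
score_sds <- data.frame(lapply(scores, sd))

# Variance without var(): sum of squared deviations divided by (n - 1).
manual_var <- function(x) sum((x - mean(x))^2) / (length(x) - 1)
score_vars <- data.frame(lapply(scores, manual_var))
```

The same three calls should work unchanged on the real "scores" list, since they only rely on it being a named list of numeric vectors.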
-  - Performing One-Way ANOVA
-  - Creating an Interaction Plot
-  - Finding Differences Between Means of Groups
-  - Performing Robust ANOVA (Kruskal-Wallis Test)
-  - Comparing Models by Using ANOVA
+<code>longdata<- c(-1.850152, -1.406571, -1.0104817, -3.7170704,
+           -0.2804896, 0.9496313, 1.346517, -0.1580926, 1.6272786,
+           -2.4483321, -0.5407272, -1.708678, -0.3480616, -0.2757667,
+           -1.2177024)</code>
+  * make "longdata" to a matrix whose size is 3 by 5
 +  * name columns "trial1, trial2, . . . . trial5" 
 +  * name rows "subject1, subject2, subject3" 
 +  * get means for each subject  
 +  * attach the above data to the matrix data and name it "longtemp." 
 +  * get standard deviation for each trial 
 +  * attach the above data to the matrix data, "longtemp." 
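The "longdata" steps could be worked through roughly as follows. This is a sketch under two assumptions that the exercise does not state: the matrix is filled column by column (R's default), and the standard-deviation row leaves the "mean" column as NA. The names `longmat`, `subject_mean`, and `trial_sd` are illustrative.

```r
longdata <- c(-1.850152, -1.406571, -1.0104817, -3.7170704,
              -0.2804896, 0.9496313, 1.346517, -0.1580926, 1.6272786,
              -2.4483321, -0.5407272, -1.708678, -0.3480616, -0.2757667,
              -1.2177024)

# 3 subjects (rows) by 5 trials (columns); matrix() fills column-wise by default.
longmat <- matrix(longdata, nrow = 3, ncol = 5)
colnames(longmat) <- paste0("trial", 1:5)
rownames(longmat) <- paste0("subject", 1:3)

# Mean per subject, attached as an extra column.
subject_mean <- rowMeans(longmat)
longtemp <- cbind(longmat, mean = subject_mean)

# Standard deviation per trial, attached as an extra row.
# The "mean" column has no trial sd, so it is padded with NA.
trial_sd <- apply(longmat, 2, sd)
longtemp <- rbind(longtemp, sd = c(trial_sd, NA))
```

If the data were meant to be read row by row instead, adding `byrow = TRUE` to the `matrix()` call would be the only change needed.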
 + 
 + 
 +<code>suburbs <- read.csv("http://commres.net/wiki/_export/code/r/data_transformations?codeblock=15", head=T, sep=" ")</code> 
 +  * get the suburbs data as shown above
 +  * get population means by each state (listed in the data, suburbs) 
 +    * use aggregate and refer to the below e.g. 
 +<code>attach(Cars93) 
 +aggregate(MPG.city ~ Origin, Cars93, mean)</code> 
 +  * get population sum by each county with  tapply function.  
 +  * tapply(number, byfactor, function) 
 +  * how many counties are there? 
 +  * Use Cars93 data, get MPG.city mean by Origin. 
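The last item in the list above can be sketched with both `aggregate` and `tapply`. The example sticks to Cars93 (from the MASS package) because it needs no download; the same pattern should carry over to the suburbs data once its column names are known. The variable names are illustrative.

```r
library(MASS)  # provides the Cars93 data set

# Formula interface: mean of MPG.city within each level of Origin.
# Passing data = Cars93 means attach() is not required.
mpg_by_origin <- aggregate(MPG.city ~ Origin, data = Cars93, FUN = mean)

# tapply(values, grouping factor, function) gives the same means
# as a named array instead of a data frame.
mpg_by_origin_t <- tapply(Cars93$MPG.city, Cars93$Origin, mean)

# Counting groups: the number of distinct values in the grouping column.
n_origins <- length(unique(Cars93$Origin))
```

Swapping `mean` for `sum` in the `tapply` call is one way to get group totals rather than group means.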
 + 
 +__Using pnorm, qnorm__ 
 +pnorm gives the proportion (cumulative probability) under a normal distribution with the given mean and sd
 +<code>pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)</code> 
 +  * What is the value of  the below? 
 +<code>pnorm(1)</code> 
 +  * How would you get 68, 95, and 99% from pnorm?
 +      * use ?pnorm and see the default option 
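The coverage question can be checked numerically. The sketch below relies on the defaults of `pnorm` (mean 0, sd 1, lower tail); `within_k_sd` is a hypothetical helper name.

```r
# Proportion of a standard normal distribution lying within k standard
# deviations of the mean: the lower-tail area at +k minus the area at -k.
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

coverage <- sapply(1:3, within_k_sd)
round(coverage, 4)  # 0.6827 0.9545 0.9973
```

Because the defaults are mean = 0 and sd = 1, `pnorm(1)` on its own is the area to the left of one standard deviation above the mean, not the area within one standard deviation.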
 + 
 +  * generate 10 random numbers with runif function 
 + 
 +<code>year <- c(1900:2016)     # years in vector year 
 +world.series <- data.frame(year)</code> 
 +  * get 10 year samples out of world.series data with "sample" command 
 +  * how would you get the same sample again later?
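One common way to make a random draw repeatable is to fix the random seed before sampling. In this sketch the seed value 42 is arbitrary and the variable names are illustrative.

```r
year <- c(1900:2016)
world.series <- data.frame(year)

# Fixing the seed makes the pseudo-random draw reproducible.
set.seed(42)
first_draw <- sample(world.series$year, 10)

# Resetting the same seed and repeating the call gives the same 10 years.
set.seed(42)
second_draw <- sample(world.series$year, 10)

same_sample <- identical(first_draw, second_draw)
```

Without the second `set.seed` call the two draws would generally differ.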
 + 
 +<code>pnorm(110, mean=100, sd=10)</code> 
 +  * What would be the result from the above? 
 + 
 +<code>library(MASS)       # load the MASS package  
 +tbl = table(survey$Smoke, survey$Exer)  
 +tbl                 # the contingency table</code>  
 + 
 +<code>summary(tbl) 
 +</code> 
 +  * read the above output and interpret 
 +  * what about the below one? 
 +<code>chisq.test(tbl)  
 +</code> 
 + 
 +see first [[:chi-square test]] 
 +see [[:r:chi-square test]] in r document space for more 
 + 
 +<code> library(MASS) 
 + cardata <- data.frame(Cars93$Origin, Cars93$Type) 
 + cardata 
 +</code> 
 +  * Can you say the types of cars are different by the Origins? 
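A chi-square test of independence is one way to approach the question about car type and origin. The sketch below builds the contingency table directly from the two Cars93 columns. R may warn that the chi-squared approximation could be inaccurate when some expected counts are small, so the result is best read with that caveat. The variable names are illustrative.

```r
library(MASS)  # provides the Cars93 data set

# Cross-tabulate Origin (2 levels) against Type (6 levels).
car_tbl <- table(Cars93$Origin, Cars93$Type)

# Pearson chi-square test of independence between the two factors.
car_test <- chisq.test(car_tbl)

car_test$statistic  # X-squared
car_test$parameter  # degrees of freedom: (2 - 1) * (6 - 1)
car_test$p.value
car_test$expected   # expected counts, useful for checking small cells
```

A small p-value would suggest that the distribution of car types differs by origin; inspecting `car_test$expected` shows which cells are too sparse for the approximation to be trusted.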
 + 
 +<code>dur <- faithful$eruptions 
 +dur</code> 
 +  * make the above data into z-score (zdur). 
 +  * get mean of the zdur 
 +  * get sd of the zdur 
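The z-score steps might look like this. `faithful` ships with base R, and the comparison with `scale()` is included only as a cross-check.

```r
dur <- faithful$eruptions

# z-score: subtract the mean, then divide by the standard deviation.
zdur <- (dur - mean(dur)) / sd(dur)

mean(zdur)  # zero up to floating-point error
sd(zdur)    # 1

# scale() performs the same centering and scaling but returns a
# one-column matrix, so it is flattened before comparing.
zdur_check <- as.numeric(scale(dur))
```

Any standardized variable has mean 0 and standard deviation 1 by construction, which is what the two summary calls confirm.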
 + 
 +<code> 
 +set.seed(1123) 
 +x <- rnorm(50, mean=100, sd=15) 
 +</code> 
 +  * test x against population  mean 95. 
 +  * test x against population  mean 99. 
 +  * are they different from each other? 
 +  * what would you do if you wanted to see a different result from the second one? 
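The two one-sample tests can be run side by side as sketched below. The exact numbers depend on the random draw, so no output is shown; `results` is just an illustrative summary table.

```r
set.seed(1123)
x <- rnorm(50, mean = 100, sd = 15)

# One-sample t-tests of the same sample against two hypothesized means.
test_95 <- t.test(x, mu = 95)
test_99 <- t.test(x, mu = 99)

# Collect the key numbers so the two tests are easy to compare.
results <- data.frame(
  mu      = c(95, 99),
  t       = unname(c(test_95$statistic, test_99$statistic)),
  p.value = c(test_95$p.value, test_99$p.value)
)
results
```

Both tests share the same sample mean and standard error, so only the distance between the sample mean and `mu` changes the t value.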
 + 
 +<code>a = c(65, 78, 88, 55, 48, 95, 66, 57, 79, 81) 
 + 
 +> t.test(a, mu=60) 
 + 
 + One Sample t-test 
 + 
 +data:  a 
 +t = 2.3079, df = 9, p-value = 0.0464 
 +alternative hypothesis: true mean is not equal to 60 
 +95 percent confidence interval: 
 + 60.22187 82.17813 
 +sample estimates: 
 +mean of x  
 +     71.2  
 +</code> 
 +  * find the t critical value with function qt.  
 +  * explain what happens in the next code 
 +  * read (or remind yourself) what pnorm and qnorm do. 
 +<code>> s <- sd(x) 
 +> m <- mean(x) 
 +> n <- length(x) 
 +> n 
 +[1] 50 
 +> m 
 +[1] 96.00386 
 +> s 
 +[1] 17.38321 
 +> SE <- s / sqrt(n) 
 +> SE 
 +[1] 2.458358 
 +> E <- qt(.975, df=n-1)*SE 
 +> E 
 +[1] 4.940254 
 +> m + c(-E, E) 
 +[1]  91.0636 100.9441 
 +> </code> 
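The same by-hand steps can be applied to the vector `a` from the earlier `t.test(a, mu=60)` output to find the critical t value with `qt` and rebuild the confidence interval. This is a sketch; `t_crit`, `se`, `ci`, and `t_stat` are illustrative names.

```r
a <- c(65, 78, 88, 55, 48, 95, 66, 57, 79, 81)
n <- length(a)

# Two-sided 95% critical value: the 0.975 quantile of t with n - 1 df.
t_crit <- qt(0.975, df = n - 1)

# Standard error of the mean and the resulting confidence interval.
se <- sd(a) / sqrt(n)
ci <- mean(a) + c(-1, 1) * t_crit * se

# Test statistic against mu = 60, computed by hand.
t_stat <- (mean(a) - 60) / se
```

Comparing `abs(t_stat)` with `t_crit` leads to the same decision as comparing the p-value with 0.05, and `ci` should match the interval that `t.test` reports.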
 + 
 + 
 +  * what's wrong with the below? 
 +<code>t.test(x)</code> 
 + 
 +<code>> mtcars</code> 
 +  * using aggregate, get the mean for each trans. (transmission) type. 
 +  * compare the difference of mileage between auto and manual cars. 
 +    * use t.test (two sample) 
 +    * use the "var.equal=T" option 
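For the mtcars tasks, a minimal sketch follows. It assumes "mileage" refers to the `mpg` column and uses the `am` column for transmission type (0 = automatic, 1 = manual, per the data set's help page).

```r
# Mean mpg for each transmission type.
mpg_by_am <- aggregate(mpg ~ am, data = mtcars, FUN = mean)

# Two-sample t-test assuming equal variances in the two groups.
am_test <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

mpg_by_am
am_test
```

With `var.equal = TRUE` the degrees of freedom are n1 + n2 - 2; leaving the option out gives the Welch test, which does not pool the variances.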
 + 
 +<code>a = c(175, 168, 168, 190, 156, 181, 182, 175, 174, 179) 
 +b = c(185, 169, 173, 173, 188, 186, 175, 174, 179, 180) 
 +</code> 
 +  * stack them into data c 
 +  * convert colnames into score and trans 
 +  * t.test score by trans with var.equal option true.  
 +  * aov test 
 +  * see  t.test t value, t = -0.9474 and F value,  F = ? 
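The steps above can be sketched as follows. The point of the last item is that, with two groups and pooled variance, the ANOVA F value equals the square of the t value. `c_data`, `tt`, `fit`, and `f_value` are illustrative names.

```r
a <- c(175, 168, 168, 190, 156, 181, 182, 175, 174, 179)
b <- c(185, 169, 173, 173, 188, 186, 175, 174, 179, 180)

# stack() puts the values in one column and the source name in another.
c_data <- stack(list(a = a, b = b))
colnames(c_data) <- c("score", "trans")

# Pooled-variance two-sample t-test.
tt <- t.test(score ~ trans, data = c_data, var.equal = TRUE)

# One-way ANOVA on the same data.
fit <- aov(score ~ trans, data = c_data)
f_value <- summary(fit)[[1]][["F value"]][1]

# With two groups, F is the square of the pooled-variance t.
all.equal(unname(tt$statistic)^2, f_value)
```

The p-values of the two tests agree as well, since they test the same hypothesis under the same assumptions.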
 </WRAP> </WRAP>
 <WRAP half column> <WRAP half column>
Line 472: Line 600:
 <WRAP half column> <WRAP half column>
 ===== Concepts and ideas ===== ===== Concepts and ideas =====
 +ANOVA
 +[[:r:oneway anova]]
 +[[:r:twoway anova]]
 +[[:r:linear regression]]
 +[[:r:multiple regression]]
 +[[:partial and semipartial correlation]]
 +
 +[[:statistical regression methods]]
 +[[:sequential_regression]]
 +
 + 
 +[[:factor analysis]]
 +
 Linear Regression and ANOVA Linear Regression and ANOVA
 http://commres.net/wiki/text_mining_example_with_korean_songs http://commres.net/wiki/text_mining_example_with_korean_songs
  
-[[:temp|quiz 3 answer]]
+
  
 </WRAP> </WRAP>
 <WRAP half column> <WRAP half column>
 ===== Assignment ===== ===== Assignment =====
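-  - Write hypotheses, suited to your own major and interests, for carrying out each of the tests below. 
-    - T-test 
-    - F-test 
-    - factorial f-test 
-    - Simple regression 
-    - Multiple regression 
-  - Identify the independent and dependent variables of each hypothesis and discuss how to measure them. 
-  - Find papers related to each hypothesis (at least one for each), explain what they found, and discuss how they relate to your own hypothesis. 
-  - Obtain the data needed for each hypothesis, then run the appropriate test (R input and output required). 
-  - Discuss the test results. 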
 </WRAP> </WRAP>
  
 ====== Week15 (Dec. 11, 14) ====== ====== Week15 (Dec. 11, 14) ======
 <WRAP half column> <WRAP half column>
-Group Presentation
+Final quiz
 +Part I (written exam): NO open book. 
 +  * [[:correlation]]   
 +  * [[:regression]] 
 +  * [[:multiple regression]] 
 +  * [[:chi-square test]] 
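 +  * [[:factor analysis]] - the parts that concern theoretical understanding 
 +  * R-related material that concerns understanding of statistics, for example 
 +    * understanding t-test, ANOVA, and Factorial ANOVA output 
 +    * understanding regression and multiple regression output, etc. 
 +Part II (R practical exam): only the textbook and R help are allowed 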
 +  * [[:r:getting started]] 
 +  * [[:r:basics]] 
 +  * [[:r:navigating]] 
 +  * [[:r:input output]] 
 +  * [[:r:data structures]] 
 +  * [[:r:data transformations]] 
 +  * [[:r:probability]] 
 +  * [[:r:general statistics]] 
 +  * [[:r:t-test]] 
 +  * [[:r:anova]] 
 +  * [[:r:linear regression]] 
 +  * [[:r:multiple regression]] 
 +    * [[:partial and semipartial correlation]] 
 +    * [[:statistical regression methods]]
 </WRAP> </WRAP>
 <WRAP half column> <WRAP half column>
 </WRAP> </WRAP>
-<WRAP half column> 
 ====== Week16 (Dec. 18, 21) ====== ====== Week16 (Dec. 18, 21) ======
-Group Presentation
+<WRAP half column>
 __**Final-term**__ __**Final-term**__
 </WRAP> </WRAP>
c/ma/2018/schedule.1543250660.txt.gz · Last modified: 2018/11/27 01:44 by hkimscil
