R Cookbook
Chapter 1 Getting Started and Getting Help
Chapter 2 Some Basics
Chapter 3 Navigating the Software
Chapter 4 Input and Output
Chapter 5 Data Structures
Chapter 6 Data Transformations
Chapter 7 Strings and Dates
Chapter 8 Probability
Chapter 9 General Statistics
Chapter 10 Graphics
Chapter 11 Linear Regression and ANOVA
Chapter 12 Useful Tricks
Chapter 13 Beyond Basic Numerics and Statistics
Chapter 14 Time Series Analysis

Week01 (Mar 16, 19)

ideas and concepts

https://youtu.be/6ExajWI_r2w
https://youtu.be/J8e5dEH8K_Q
https://youtu.be/W3DhUXI5cyQ
https://youtu.be/qCeTcvWBDNY
https://youtu.be/1hJm0O-RY4Q

Course Introduction -> syllabus

Introduction to R and others

Downloading and Installing R
1. the_r_project_for_statistical_computing
2. r, getting started
Starting R
Entering Commands
Exiting from R
Interrupting R
Viewing the Supplied Documentation
Getting Help on a Function
Searching the Supplied Documentation
Getting Help on a Package
Searching the Web for Help
Finding Relevant Functions and Packages
Searching the Mailing Lists
Submitting Questions to the Mailing Lists

Basic terms
Descriptive statistics
Inferential statistics
For the concepts below, read the sampling document first.

Population
Sample
Parameter
Statistic

sampling methods
- probability
- non-probability

Hypothesis

Difference and association

Variables

Assignment

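First-week homework.

class survey - https://bit.ly/3a96nd2
If you are also taking the Research Methods course this semester, you must complete the same survey once more in addition to this one (in that case you may skip the last question). This is needed to form the groups.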

Week02 (Mar. 23, 26)

Concepts and ideas

Some basics

Introduction
Printing Something
Setting Variables
Listing Variables
Deleting Variables
Creating a Vector
Computing Basic Statistics
Creating Sequences
Comparing Vectors
Selecting Vector Elements
Performing Vector Arithmetic
Getting Operator Precedence Right
Defining a Function
Typing Less and Accomplishing More
Avoiding Some Common Mistakes

from the previous lecture (research question and hypothesis)

Research Questions (or Problems)
- Two ideas guided by theories
- Questions on their relationships
- Conceptualization
Hypothesis
- Educated guess (via theories)
- Difference
- Association
- Variables (vs. ideas, concepts, and constructs)
  - Operationalization
  - Types of Variables
    - IV
    - DV
    - Control variable
    - Mediating (Intervening) variable

Assignment

Week03 (Mar 30, April 2)

Week 3 online lecture videos

MS the 3rd Week 011: Grouping 22:34
MS the 3rd Week 012: the Basic (R cookbook) 32:00
MS the 3rd Week 013: Navigating the R 12:31
MS the 3rd Week 014: Mean, Median, Mode (Howell, Ch. 4 Part) 16:17

In the rest of Howell, Ch. 4, variance and standard deviation are the foundation for understanding the statistical tests covered later, so be sure to master them.

Concepts and ideas

Navigating software

Introduction
Getting and Setting the Working Directory
Saving Your Workspace
Viewing Your Command History
Saving the Result of the Previous Command
Displaying the Search Path
Accessing the Functions in a Package
Accessing Built-in Datasets
Viewing the List of Installed Packages
Installing Packages from CRAN
Setting a Default CRAN Mirror
Suppressing the Startup Message
Running a Script
Running a Batch Script
Getting and Setting Environment Variables
Locating the R Home Directory
Customizing R

Mean
Mode
Median
Variance
Standard Deviation

±1 sd covers about 68% of a normal distribution
±2 sd covers about 95% (exactly 95% at ±1.96 sd)
±3 sd covers about 99.7%

Standard score (a score expressed in standard deviation units) = z score
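
The coverage figures and the z score above can be checked numerically. The course itself works in R; the sketch below uses Python's standard library purely as an illustration. The data set `scores` and the helper names `z_scores` and `coverage` are hypothetical, and the population standard deviation is used as a simplifying assumption.

```python
from statistics import NormalDist, mean, pstdev

def z_scores(values):
    """Convert raw scores to z scores: (x - mean) / standard deviation."""
    m = mean(values)
    sd = pstdev(values)  # population sd; an assumption made for this sketch
    return [(x - m) / sd for x in values]

def coverage(k):
    """Share of a normal distribution lying within +/- k sd of the mean."""
    std_normal = NormalDist()  # mean 0, sd 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5 and sd 2
print(z_scores(scores))
for k in (1, 1.96, 2, 3):
    print(k, round(coverage(k), 4))
```

A score of 9 in this made-up data set sits 2 standard deviations above the mean, so its z score is 2.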

Sampling distribution via random sampling
Central Limit Theorem
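
A small simulation can make the sampling distribution concrete. This is a minimal sketch in Python (the course uses R), assuming a hypothetical population of fair die faces and a hypothetical helper `sample_means`. It draws many random samples of size n and compares the spread of the sample means with the value the Central Limit Theorem suggests, sigma / sqrt(n).

```python
import random
from math import sqrt
from statistics import mean, pstdev

def sample_means(population, n, draws, seed=0):
    """Means of `draws` random samples of size n, drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [mean(rng.choices(population, k=n)) for _ in range(draws)]

population = [1, 2, 3, 4, 5, 6]   # hypothetical population: faces of a fair die
mu = mean(population)             # population mean
sigma = pstdev(population)        # population standard deviation

n = 30
means = sample_means(population, n, draws=2000)

print("mean of sample means:", round(mean(means), 3), "vs population mean", mu)
print("sd of sample means:  ", round(pstdev(means), 3),
      "vs sigma / sqrt(n) =", round(sigma / sqrt(n), 3))
```

The two printed pairs should be close but not identical, since the simulation uses a finite number of draws.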

Assignment

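Find two research articles that list their hypotheses (social science research articles would be a good choice). For each article:

Write down each hypothesis.
Explain which variables are the independent, dependent, and intervening (moderator) variables, and so on.
Explain how each variable was measured.
Explain which type each hypothesis is (a hypothesis of difference or of association).
Find and record which test was used to test each hypothesis.

Due date: complete by midnight next Wednesday (2018/09/26 11:59).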

Week04 (April 6, 9)

Class Activity

Lecture materials for this week

https://youtu.be/JvpOJPCBQkQ : R cookbook: data structure
https://youtu.be/_ynGzFFmm7U : Howell Ch 4. Variance 01: Introduction (DS, error, and SS)
https://youtu.be/HugtyhU7Im8 : Howell Ch. 4. Variance 02: Variance for sample and n-1

Concepts and ideas

Input and output

Introduction
Entering Data from the Keyboard
Printing Fewer Digits (or More Digits)
Redirecting Output to a File
Listing Files
Dealing with “Cannot Open File” in Windows
Reading Fixed-Width Records
Reading Tabular Data Files
Reading from CSV Files
Writing to CSV Files
Reading Tabular or CSV Data from the Web
Reading Data from HTML Tables
Reading Files with a Complex Structure
Reading from MySQL Databases
Saving and Transporting Objects

Assignment

Week05 (April 13, 16)

https://youtu.be/RE6DSk1DcJI : Why does the variance use n-1?
https://youtu.be/PrPoOCW3v1s : proof for n-1
https://youtu.be/Ssznnbdj5Lg : degrees of freedom
https://youtu.be/valhVpf-haY : standard deviation
https://youtu.be/Qaxj6LZ-iL0 : sampling distribution
https://youtu.be/AbeIQvJJ5Vw : sampling distribution e.g. in R
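
The n-1 idea covered in these videos can be illustrated by brute force. The sketch below is in Python (the course uses R) and assumes a tiny hypothetical population. It enumerates every possible sample of size 2 drawn with replacement, then averages the sum of squares (SS) divided by n and by n-1. Dividing by n underestimates the population variance on average, while dividing by n-1 recovers it.

```python
from itertools import product
from statistics import mean

population = [1, 2, 3, 4, 5]  # hypothetical population
mu = mean(population)
pop_var = mean([(x - mu) ** 2 for x in population])  # population variance

def sum_of_squares(sample):
    """SS: squared deviations from the sample's own mean, summed."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))  # all 25 ordered samples

avg_div_n = mean([sum_of_squares(s) / n for s in samples])
avg_div_n_minus_1 = mean([sum_of_squares(s) / (n - 1) for s in samples])

print("population variance:     ", pop_var)
print("average of SS / n:       ", avg_div_n)
print("average of SS / (n - 1): ", avg_div_n_minus_1)
```

Because every possible sample is enumerated, the comparison is exact for this population rather than a simulation estimate.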

Concepts and ideas

Data Structures

Introduction
Appending Data to a Vector
Inserting Data into a Vector
Understanding the Recycling Rule
Creating a Factor (Categorical Variable)
Combining Multiple Vectors into One Vector and a Factor
Creating a List
Selecting List Elements by Position
Selecting List Elements by Name
Building a Name/Value Association List
Removing an Element from a List
Flatten a List into a Vector
Removing NULL Elements from a List
Removing List Elements Using a Condition
Initializing a Matrix
Performing Matrix Operations
Giving Descriptive Names to the Rows and Columns of a Matrix
Selecting One Row or Column from a Matrix
Initializing a Data Frame from Column Data
Initializing a Data Frame from Row Data
Appending Rows to a Data Frame
Preallocating a Data Frame
Selecting Data Frame Columns by Position
Selecting Data Frame Columns by Name
Selecting Rows and Columns More Easily
Changing the Names of Data Frame Columns
Editing a Data Frame
Removing NAs from a Data Frame
Excluding Columns by Name
Combining Two Data Frames
Merging Data Frames by Common Column
Accessing Data Frame Contents More Easily
Converting One Atomic Value into Another
Converting One Structured Data Type into Another

Assignment

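Working together with your group members:

Find a social science paper that includes a literature review and hypotheses.
- Use dbpia or kyobo scholar.
Summarize the content of the literature review.
Introduce the hypotheses.
- Identify the independent and dependent variables, or any other types of variables, in each hypothesis.
- State how each variable was measured and its level of measurement.
Before choosing a paper, I recommend agreeing with your group members on a shared academic interest so that you find a paper you enjoy. For example, a student with a strong interest in design will be more drawn to papers on UI. Beyond that, if there were a readable social science paper on the UI of self-driving cars (or just cars), it would be interesting (although I suspect there is none . . . ).
The deadline is midnight next Tuesday.
I recommend holding group meetings through a KakaoTalk chat room or other technology.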

Week06 (April 20, 23)

Today's agenda (live online meeting)

Confirm groups
Announce next week's quiz
Explain the group assignment
Group meetings

Concepts and ideas

Data Transformations

Introduction
Splitting a Vector into Groups
Applying a Function to Each List Element
Applying a Function to Every Row
Applying a Function to Every Column
Applying a Function to Groups of Data
Applying a Function to Groups of Rows
Applying a Function to Parallel Vectors or Lists

Strings and Dates

Announcement

First quiz on Week 07, Tuesday class (Oct. 16)
- RANGE: Week 01 - 03 materials + lecture content + textbook
  - hypothesis, variables, types of variables, operationalization
  - z-test, mean . . . .
  - Textbook:
    - chapter 2, 3, 4, 5
- The NEXT quiz will be held on Oct. 23 during the midterm schedule.
- The 2nd quiz will cover 1st quiz + Week 05-07 materials.

Assignment

Week07 (April 27, 30)

Concepts and ideas

Assignment review -> groups

Hypothesis testing
z-test

In R, you need to understand the qnorm(proportion) and pnorm(z-score) functions
See z_score
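
The two directions of lookup used in a z-test (z score to proportion, and proportion to z score) can be sketched as follows. R's `pnorm` and `qnorm` are the course's tools; this illustration uses Python's `statistics.NormalDist` instead, whose `cdf` and `inv_cdf` methods play the corresponding roles for the standard normal distribution. The helper names and the test numbers (sample mean 105, population mean 100, sigma 15, n 36) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1

def p_lower(z):
    """Proportion of the distribution below z (the role pnorm plays in R)."""
    return std_normal.cdf(z)

def z_for(proportion):
    """z score with the given proportion below it (the role of qnorm in R)."""
    return std_normal.inv_cdf(proportion)

# Hypothetical one-sample z-test with a known population sigma
sample_mean, mu, sigma, n = 105, 100, 15, 36
z = (sample_mean - mu) / (sigma / sqrt(n))  # standard error is 15 / 6 = 2.5
p_two_tailed = 2 * (1 - p_lower(abs(z)))
critical = z_for(0.975)  # two-tailed critical value for alpha = 0.05

print("z =", z)
print("two-tailed p =", round(p_two_tailed, 4))
print("critical z =", round(critical, 2))
```

Here z exceeds the critical value, so this made-up result would be significant at the .05 level in a two-tailed test.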

types of error
t-test

In R, you need to understand the qt(proportion, df) and pt(t-score, df) functions
See probability

Probability calculation in R ← Probability in R cookbook (textbook)

. . . .
ANOVA
factorial anova
correlation
regression

Probability

Introduction
Counting the Number of Combinations
Generating Combinations
Generating Random Numbers
Generating Reproducible Random Numbers
Generating a Random Sample
Generating Random Sequences
Randomly Permuting a Vector
Calculating Probabilities for Discrete Distributions
Calculating Probabilities for Continuous Distributions
Converting Probabilities to Quantiles
Plotting a Density Function

Assignment

Practice writing hypotheses
- how to write hypothesis at behavioral science writing.
- One sample hypothesis Hypothesis at www.socialresearchmethods.net

Individual assignment

Week08 (May 4, 7)

Exam period
Make-up video lecture

Week09 (May 11, 14)

Concepts and ideas

General Statistics
t-test
ANOVA
Factorial ANOVA
repeated measures anova
correlation and regression and multiple regression

Before regression: SS is the sum of the squared errors made when guessing (estimating) each value.
Sum of squared errors = SS (the abbreviation is used without the word "error").
For this, carefully read the part on standard error and residual variance (standard error residual) in the Regression document.
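
The idea of SS as "sum of squared errors of a guess" can be made concrete with a small example. The sketch below is in Python (the course uses R) and uses made-up x and y values. It compares the SS of an arbitrary constant guess, the SS around the mean, and the residual SS left after fitting a least-squares line.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]  # hypothetical predictor values
y = [2, 4, 5, 4, 5]  # hypothetical outcome values

def ss(errors):
    """Sum of squared errors."""
    return sum(e ** 2 for e in errors)

# Guessing one constant for every case: the mean gives the smallest SS.
ss_guess_3 = ss([yi - 3 for yi in y])
y_bar = mean(y)
ss_total = ss([yi - y_bar for yi in y])

# Least-squares line: slope b = Sxy / Sxx, intercept a = y_bar - b * x_bar
x_bar = mean(x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b = sxy / sxx
a = y_bar - b * x_bar

# Residual SS: squared errors left over when guessing with the line
ss_resid = ss([yi - (a + b * xi) for xi, yi in zip(x, y)])

print("SS guessing 3 for everyone:", ss_guess_3)
print("SS around the mean (total):", ss_total)
print("SS around the regression line (residual):", round(ss_resid, 4))
print("SS explained by regression:", round(ss_total - ss_resid, 4))
```

With these numbers the total SS splits into a part explained by the line and a residual part, which is the decomposition regression output reports.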

Introduction
Summarizing Your Data
Calculating Relative Frequencies
Tabulating Factors and Creating Contingency Tables
Testing Categorical Variables for Independence
Calculating Quantiles (and Quartiles) of a Dataset
Inverting a Quantile
Converting Data to Z-Scores
Testing the Mean of a Sample (t Test)
Forming a Confidence Interval for a Mean
Forming a Confidence Interval for a Median
Testing a Sample Proportion
Forming a Confidence Interval for a Proportion
Testing for Normality
Testing for Runs
Comparing the Means of Two Samples
Comparing the Locations of Two Samples Nonparametrically
Testing a Correlation for Significance
Testing Groups for Equal Proportions
Performing Pairwise Comparisons Between Group Means
Testing Two Samples for the Same Distribution

Assignment

Week10 (May 18, 21)

Concepts and ideas

multiple regression continued.
using dummy variables

Assignment

Week11 (May 25, 28)

Concepts and ideas

getting started
basics
navigating in r
input output in r
data structures
data transformations

Graphics

Introduction
Creating a Scatter Plot
Adding a Title and Labels
Adding a Grid
Creating a Scatter Plot of Multiple Groups
Adding a Legend
Plotting the Regression Line of a Scatter Plot
Plotting All Variables Against All Other Variables
Creating One Scatter Plot for Each Factor Level
Creating a Bar Chart
Adding Confidence Intervals to a Bar Chart
Coloring a Bar Chart
Plotting a Line from x and y Points
Changing the Type, Width, or Color of a Line
Plotting Multiple Datasets
Adding Vertical or Horizontal Lines
Creating a Box Plot
Creating One Box Plot for Each Factor Level
Creating a Histogram
Adding a Density Estimate to a Histogram
Creating a Discrete Histogram
Creating a Normal Quantile-Quantile (Q-Q) Plot
Creating Other Quantile-Quantile Plots
Plotting a Variable in Multiple Colors
Graphing a Function
Pausing Between Plots
Displaying Several Figures on One Page
Opening Additional Graphics Windows
Writing Your Plot to a File
Changing Graphical Parameters

Assignment

Week12 (June 1, 4)

Announcement

Quiz 03: Nov. 23

Concepts and ideas

chi-square test
probability
general statistics

Graphics

Assignment

Week13 (June 8, 11)

Concepts and ideas

Assignment

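Group assignment: Formulate hypotheses for the independent t-test, repeated measures t-test, ANOVA, Factorial ANOVA, repeated measures ANOVA, regression, and multiple regression, and write survey items using Google Docs. Use the survey to collect data, then test the hypotheses. Discuss the test results in as much detail as possible. At a minimum, the assignment must do the following.

Base the hypotheses on common sense, social science theories you know, and the like.
- Each hypothesis must come with an explanation. Stating the hypothesis alone is not enough.
When building the survey with Google survey, include the following.
- Respondent's student ID, name, and email (for participation grading: participation + (in)sincere responses)
- Items that can test each hypothesis
Test the hypotheses using R.
Discuss the test results meaningfully.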

----
Submission

Introduction and explanation of the hypotheses
- independent t-test
- repeated measures t-test
- ANOVA
- Factorial ANOVA
- repeated measures ANOVA
- regression
- multiple regression
Survey items for each hypothesis, with identification of the IV and DV and an explanation of their levels of measurement
- independent t-test
- repeated measures t-test
- ANOVA
- Factorial ANOVA
- repeated measures ANOVA
- regression
- multiple regression
Analysis results and discussion for each hypothesis test
- independent t-test
- repeated measures t-test
- ANOVA
- Factorial ANOVA
- repeated measures ANOVA
- regression
- multiple regression

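Attachments to submit

List of survey participants (submit separately as an Excel file with a name like survey.participants.group.01.xlsx)
- The instructor will first distribute the class roster (as an Excel file).
- In the spreadsheet, write each participant's family name and given name together, without a space.
- Members of your own group also take part in your group's survey.
  - Full participation = 1
  - No participation = 0
  - Incomplete participation = 2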

Week14 (June 15, 18)

Concepts and ideas

ANOVA
Linear Regression and ANOVA
http://commres.net/wiki/text_mining_example_with_korean_songs

Assignment

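There will be a quiz this weekend (Saturday).

Multiple choice (four options) or short answer
You can choose your start time between 9:00 a.m. and 6:00 p.m. on Saturday
The time limit for the quiz is about 40 minutes

The quiz covers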
stats part

r part

Week15 (June 22, 25)

Final quiz
Part I (written exam): NO open book.

correlation
regression
multiple regression
chi-square test
factor analysis - the parts related to theoretical understanding
R-related content that concerns statistical understanding, for example
- understanding t-test, ANOVA, and Factorial ANOVA output
- understanding regression and multiple regression output, and so on

Part II (R practical exam): only the textbook and R help are allowed

Assignment

<color #ed1c24>Group assignment revision</color>
The group assignment is revised as follows.

Group assignment:

Formulate hypotheses for the independent t-test, repeated measures t-test, ANOVA, Factorial ANOVA, repeated measures ANOVA, regression, and multiple regression.
Write all the survey items corresponding to each hypothesis using MS Word.
Make up data for the hypotheses below, then test them.
- Independent t-test
- ANOVA
- Factorial ANOVA
- Multiple Regression
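Discuss the test results in detail.

At a minimum, the assignment must do the following.

Write it in MS Word (Hangul word processor files are not accepted)
Writing hypotheses:
- Base the hypotheses on common sense, social science theories you know, and the like.
  - Each hypothesis must come with an explanation. Stating the hypothesis alone is not enough.
  - The hypotheses may be related to one another, for example t-test and ANOVA hypotheses derived from the same theory.
  - You must write hypotheses for all of the statistical tests below.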
    - Independent t-test
    - Repeated measures t-test
    - ANOVA
    - Factorial ANOVA
    - Repeated measures ANOVA
    - Regression
    - Multiple Regression
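Writing survey items
- Write survey items corresponding to each hypothesis. The survey items include the following.
- Items that can test each hypothesis (7 in total: independent t-test, repeated measures t-test, ANOVA, etc.)
Data collection
- <color #ed1c24>Data are not collected with Google survey.</color>
- Each group <color #ed1c24>makes up</color> data for the hypotheses below and tests them.
- The number of (made-up) respondents must be at least 30. That is, make up data for at least 30 people.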
  - Independent t-test
  - ANOVA
  - Factorial ANOVA
  - Multiple Regression
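Testing the hypotheses
- Test the hypotheses using R.
- Record the R code and output in MS Word.
- In MS Word, use a fixed-width font for the R code and output (such as Courier or Courier New).
- The analysis only needs to cover the hypotheses for which you made up data.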
  - Independent t-test
  - ANOVA
  - Factorial ANOVA
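  - Multiple Regression (with two or more IVs)
Writing the conclusion
- Discuss the test results meaningfully.
- Draw meaningful conclusions by connecting the results to the theories or ideas used to form the hypotheses.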

Submission

Writing hypotheses
- Introduction and explanation of the hypotheses (based on theory or logical ideas)
- You do not need to use a new theory for each hypothesis.
  - independent t-test
  - repeated measures t-test
  - ANOVA
  - Factorial ANOVA
  - repeated measures ANOVA
  - regression
  - multiple regression
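Explanation of variables and the corresponding survey items
- Identify the IV and DV of each hypothesis and explain their levels of measurement
- Write survey items for the hypotheses
Making up the data
Analysis
- Code and output used to test each hypothesis
- Explanation of these
Conclusion
- Discussion of the test results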

Week16 (June 29, July 2)

Final-term

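The quiz will be held on Thursday, July 02.
The quiz window is 12:00 - 5:00. Quiz time is limited. No extensions or late submissions will be allowed. If the time limit is 60 minutes, the quiz is submitted automatically once that time has passed.
The quiz covers the following.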
- t-test
- ANOVA
- Repeated measure ANOVA
- Factorial ANOVA
- correlation
- Regression
- Multiple regression
  - Using dummy variable
  - Interpreting the roles of IVs
