Differences

This shows you the differences between two versions of the page.

--- r:data_structures [2018/11/19 23:23] – [Preallocating a Data Frame] hkimscil
+++ r:data_structures [2019/09/19 09:16] (current) – [Creating a Factor (Categorical Variable)] hkimscil
@@ Line 54: / Line 54: @@
 ^ Object  ^ Example  ^ Mode  ^
-| Number  | 3.1415  | numeric  |
+| Number  | ''%%3.1415%%''  | numeric  |
-| Vector of numbers  | c(2.7182, 3.1415)  | numeric  |
+| Vector of numbers  | ''%%c(2.7182, 3.1415)%%''  | numeric  |
-| Character string  | "Moe"  | character  |
+| Character string  | ''%%"Moe"%%''  | character  |
-| Vector of character strings  | c("Moe", "Larry", "Curly")  | character  |
+| Vector of character strings  | ''%%c("Moe", "Larry", "Curly")%%''  | character  |
-| Factor  | factor(c("NY", "CA", "IL"))  | numeric  |
+| Factor  | ''%%factor(c("NY", "CA", "IL"))%%''  | numeric  |
-| List  | list("Moe", "Larry", "Curly")  | list  |
+| List  | ''%%list("Moe", "Larry", "Curly")%%''  | list  |
-| Data frame  | data.frame(x=1:3, y=c("NY", "CA", "IL"))  | list  |
+| Data frame  | ''%%data.frame(x=1:3, y=c("NY", "CA", "IL"))%%''  | list  |
-| Function  | print  | function  |
+| Function  | ''%%print%%''  | function  |
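The modes listed in the table can be checked directly with `mode()`. The following is a minimal sketch; the list `objects` and its element names are hypothetical and exist only to loop over the table's examples.

```r
# Hypothetical container holding one example per table row
objects <- list(
  number     = 3.1415,
  num_vector = c(2.7182, 3.1415),
  string     = "Moe",
  str_vector = c("Moe", "Larry", "Curly"),
  fac        = factor(c("NY", "CA", "IL")),
  lst        = list("Moe", "Larry", "Curly"),
  df         = data.frame(x = 1:3, y = c("NY", "CA", "IL")),
  fun        = print
)

# mode() reports the storage mode; a factor is stored as integer codes,
# so its mode is "numeric", and a data frame is a list of columns
modes <- sapply(objects, mode)
print(modes)
```

Note that a factor reports "numeric" and a data frame reports "list", which is why the table distinguishes mode from class.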
 ===== Class =====
@@ Line 135: / Line 135: @@
 Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6.
+<code>> A <- c(1,2,2,3,3,4,4,4,4,2,1,2,3,3)
+> A
+ [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3
+> str(A)
+ num [1:14] 1 2 2 3 3 4 4 4 4 2 ...
+> fA <- factor(A)
+> fA
+ [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3
+Levels: 1 2 3 4
+> str(fA)
+ Factor w/ 4 levels "1","2","3","4": 1 2 2 3 3 4 4 4 4 2 ...
+>
+</code>
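As a small follow-up sketch to the session above, the grouping that a factor provides can be inspected with `levels()` and `table()`. The variable names `lv`, `counts`, and `codes` are illustrative and not part of the original page.

```r
A  <- c(1,2,2,3,3,4,4,4,4,2,1,2,3,3)
fA <- factor(A)

# Levels are stored as character labels, even for numeric input
lv <- levels(fA)

# table() counts how many items fall into each level (group)
counts <- table(fA)

# Internally each item is an integer code pointing into the levels
codes <- as.integer(fA)

print(lv)
print(counts)
```

Here the codes happen to equal the original values only because the levels are 1 to 4 in order; in general the codes are positions in `levels(fA)`, not the data values.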
 ===== Data Frames =====
@@ Line 227: / Line 241: @@
 [1] 11 12 13 14 15 16
 </code>
-<WRAP box help>The above code is very useful. But, sometimes the recycling rule is very annoying. How would I avoid it?
-</WRAP>
 ====== Creating a Factor (Categorical Variable) ======
@@ Line 252: / Line 263: @@
 <code>> f <- factor(wday, c("Mon","Tue","Wed","Thu","Fri")) # the second argument, c(...), gives the levels, not the data
-> f
+> f  # note that Fri is listed as a level but does not occur in the data below.
  [1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue
 Levels: Mon Tue Wed Thu Fri
@@ Line 818: / Line 829: @@
 ====== Selecting data frame columns by position ======
-<code>suburbs
+<code csv suburbs.csv>
-                city   county state     pop
+city	county	state	pop
-            Chicago     Cook    IL 2853114
+Chicago	Cook	IL	2853114
-            Kenosha  Kenosha    WI   90352
+Kenosha	Kenosha	WI	90352
-             Aurora     Kane    IL  171782
+Aurora	Kane	IL	171782
-              Elgin     Kane    IL   94487
+Elgin	Kane	IL	94487
-               Gary Lake(IN)    IN  102746
+Gary	Lake(IN)	IN	102746
-             Joliet  Kendall    IL  106221
+Joliet	Kendall	IL	106221
-         Naperville   DuPage    IL  147779
+Naperville	DuPage	IL	147779
-  Arlington Heights     Cook    IL   76031
+Arlington Heights	Cook	IL	76031
-        Bolingbrook     Will    IL   70834
+Bolingbrook	Will	IL	70834
-            Cicero     Cook    IL   72616
+Cicero	Cook	IL	72616
-          Evanston     Cook    IL   74239
+Evanston	Cook	IL	74239
-           Hammond Lake(IN)    IN   83048
+Hammond	Lake(IN)	IN	83048
-          Palatine     Cook    IL   67232
+Palatine	Cook	IL	67232
-        Schaumburg     Cook    IL   75386
+Schaumburg	Cook	IL	75386
-            Skokie     Cook    IL   63348
+Skokie	Cook	IL	63348
-          Waukegan Lake(IL)    IL   91452
+Waukegan	Lake(IL)	IL	91452
 </code>
-or {{:r:suburbs.csv|download file}}
+<code>suburbs <- read.csv("http://commres.net/wiki/_export/code/r/data_structures?codeblock=96", header=TRUE, sep="\t")</code>
 <code>> suburbs[[1]]
@@ Line 866: / Line 877: @@
 </code>
-<code>> suburbs[c(1,3)]
+<code>> suburbs[c(1,4)]
                 city     pop
             Chicago 2853114
@@ Line 994: / Line 1005: @@
 # then, close the edit window
 </code>
+<WRAP box help>Can you save it as "mat.csv" and then read it back into the R workspace?
+When you read the csv file back, how would you avoid output like the below, that is, how would you avoid the X column?
+<code>  X before treatment  after
+1 -0.818    -0.946 -0.611
+2 -0.667    -0.205 -2.155
+3 -0.494     0.385 -0.535
+4 -0.819     1.531 -0.316</code>
+Or even, how would I save the csv file without the X column in the first place?
+</WRAP>
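One possible answer to the question in the box, as a sketch under assumptions: the matrix `mat` below is a hypothetical stand-in built from the first two rows of the output shown, and the file is written to a temporary directory rather than the working directory. The X column appears because `write.csv` saves row names by default and `read.csv` then reads them back as an unnamed first column.

```r
# Hypothetical stand-in for the edited matrix (first two rows of the output above)
mat <- matrix(c(-0.818, -0.946, -0.611,
                -0.667, -0.205, -2.155),
              nrow = 2, byrow = TRUE,
              dimnames = list(NULL, c("before", "treatment", "after")))

# Default behavior: row names are written, and come back as a column named X
path_x <- file.path(tempdir(), "mat_with_rownames.csv")
write.csv(mat, path_x)
with_x <- read.csv(path_x)

# Fix on reading: treat the first column as row names
fixed <- read.csv(path_x, row.names = 1)

# Fix on writing: do not save the row names at all
path <- file.path(tempdir(), "mat.csv")
write.csv(mat, path, row.names = FALSE)
back <- read.csv(path)

print(names(with_x))
print(names(back))
```

Writing with `row.names = FALSE` is usually the simpler choice when the row names carry no information; `row.names = 1` on reading helps when the file already exists.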
 ====== Removing NAs from a Data Frame ======