r:data_structures
Differences
This shows you the differences between two versions of the page.
| r:data_structures [2018/11/19 23:23] – [Preallocating a Data Frame] hkimscil | r:data_structures [2019/09/19 09:16] (current) – [Creating a Factor (Categorical Variable)] hkimscil | ||
|---|---|---|---|
| Line 54: | Line 54: | ||
| ^ Object | ^ Object | ||
| - | | Number | + | | Number |
| - | | Vector of numbers | + | | Vector of numbers |
| - | | Character string | + | | Character string |
| - | | Vector of character strings | + | | Vector of character strings |
| - | | Factor | + | | Factor |
| - | | List | list(" | + | | List | '' |
| - | | Data frame | data.frame(x=1: | + | | Data frame | '' |
| - | | Function | + | | Function |
| ===== Class ===== | ===== Class ===== | ||
| Line 135: | Line 135: | ||
| Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6. | Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6. | ||
| + | |||
| + | < | ||
| + | > A | ||
| + | [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3 | ||
| + | > str(A) | ||
| + | num [1:14] 1 2 2 3 3 4 4 4 4 2 ... | ||
| + | > fA <- factor(A) | ||
| + | > fA | ||
| + | [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3 | ||
| + | Levels: 1 2 3 4 | ||
| + | > str(fA) | ||
| + | | ||
| + | > | ||
| + | </ | ||
| ===== Data Frames ===== | ===== Data Frames ===== | ||
| Line 227: | Line 241: | ||
| [1] 11 12 13 14 15 16 | [1] 11 12 13 14 15 16 | ||
| </ | </ | ||
| - | |||
| - | <WRAP box help>The above code is very useful. But, sometimes the recycling rule is very annoying. How would I avoid it? | ||
| - | </ | ||
| ====== Creating a Factor (Categorical Variable) ====== | ====== Creating a Factor (Categorical Variable) ====== | ||
| Line 252: | Line 263: | ||
| < | < | ||
| - | > f | + | > f # note: none of the values below is Fri, although Fri is still listed as a level | ||
| [1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue | [1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue | ||
| Levels: Mon Tue Wed Thu Fri | Levels: Mon Tue Wed Thu Fri | ||
| Line 818: | Line 829: | ||
| ====== Selecting data frame columns by position ====== | ====== Selecting data frame columns by position ====== | ||
| - | <code>suburbs | + | < |
| - | city | + | city county state pop |
| - | 1 | + | Chicago Cook IL 2853114 |
| - | 2 | + | Kenosha Kenosha WI 90352 |
| - | 3 Aurora | + | Aurora Kane IL 171782 |
| - | 4 | + | Elgin Kane IL 94487 |
| - | 5 Gary Lake(IN) | + | Gary Lake(IN) IN 102746 |
| - | 6 Joliet | + | Joliet Kendall IL 106221 |
| - | 7 Naperville | + | Naperville DuPage IL 147779 |
| - | 8 | + | Arlington Heights Cook IL 76031 |
| - | 9 | + | Bolingbrook Will IL 70834 |
| - | 10 | + | Cicero Cook IL 72616 |
| - | 11 | + | Evanston Cook IL 74239 |
| - | 12 Hammond Lake(IN) | + | Hammond Lake(IN) IN 83048 |
| - | 13 | + | Palatine Cook IL 67232 |
| - | 14 | + | Schaumburg Cook IL 75386 |
| - | 15 | + | Skokie Cook IL 63348 |
| - | 16 | + | Waukegan Lake(IL) IL 91452 |
| </ | </ | ||
| - | or {{:r:suburbs.csv|download file}} | + | < |
| < | < | ||
| Line 866: | Line 877: | ||
| </ | </ | ||
| - | < | + | < |
| city pop | city pop | ||
| 1 Chicago 2853114 | 1 Chicago 2853114 | ||
| Line 994: | Line 1005: | ||
| # then, close the edit window | # then, close the edit window | ||
| </ | </ | ||
| + | |||
| + | <WRAP box help>Can you save it as " | ||
| + | |||
| + | When you read the CSV file back in, how would you avoid output like the one below, i.e. the extra X column? | ||
| + | < | ||
| + | 1 1 -0.818 | ||
| + | 2 2 -0.667 | ||
| + | 3 3 -0.494 | ||
| + | 4 4 -0.819 | ||
| + | |||
| + | Or, better, how would you save the CSV file without the X column in the first place? | ||
| + | </ | ||
| ====== Removing NAs from a Data Frame ====== | ====== Removing NAs from a Data Frame ====== | ||
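
The help box added in this revision asks where the extra X column comes from after a CSV round trip and how to avoid it. The following is a minimal sketch of two common ways to handle it, not necessarily the answer the page author had in mind. The data frame `df`, its column name `score`, and the temporary file path are hypothetical, chosen only for illustration.

```r
# Hypothetical example data; the column name and values are illustrative only.
df <- data.frame(score = c(-0.818, -0.667, -0.494, -0.819))

# Use a temporary file so the sketch does not touch the working directory.
path <- tempfile(fileext = ".csv")

# Default behaviour: write.csv() stores the row names as an unnamed first
# column, which read.csv() then labels "X".
write.csv(df, path)
with_x <- read.csv(path)
names(with_x)

# Option 1: do not write the row names at all.
write.csv(df, path, row.names = FALSE)
without_x <- read.csv(path)
names(without_x)

# Option 2: keep the row names in the file, but tell read.csv() that
# the first column holds row names rather than data.
write.csv(df, path)
as_rownames <- read.csv(path, row.names = 1)
names(as_rownames)
```

Option 1 is usually the simpler choice when the row names are just the default 1, 2, 3, ...; Option 2 is useful when the row names carry information you want to keep.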
r/data_structures.1542669803.txt.gz · Last modified: by hkimscil
