Say you read a data frame from a file but you don’t like the column names. Here’s how to relabel them as you like. Start with a simple CSV file:
col1, col2, col3
"1,233", "$12.79", "$1,333,233.17"
"470", "$1,113.22", "$0.12"
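To follow along without creating the file by hand, one option is to write these three lines to a temporary file. This is only a minimal sketch: csv_lines and csv_path are names introduced here for illustration, and tempfile() stands in for whatever path you actually use.

```r
# The three lines of the example file, exactly as shown above
csv_lines <- c(
  'col1, col2, col3',
  '"1,233", "$12.79", "$1,333,233.17"',
  '"470", "$1,113.22", "$0.12"'
)

# Write them to a temporary file (a hypothetical stand-in for dirty.csv)
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# Confirm the file holds a header line plus two data rows
length(readLines(csv_path))
```

Passing csv_path as the file argument in the read.csv call should then work the same way as a hand-made file.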
Load it, and see what we get:
> data <- read.csv(file='~/stuff/blog/dirty.csv', header=TRUE, sep=',')
> data
col1 col2 col3
1 1,233 $12.79 $1,333,233.17
2 470 $1,113.22 $0.12
> str(data)
'data.frame': 2 obs. of 3 variables:
$ col1: Factor w/ 2 levels "1,233","470": 1 2
$ col2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
$ col3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
>
Now, let's examine the column names, and check how many rows and columns there are, using colnames, nrow, ncol, and dim:
> colnames( data )
[1] "col1" "col2" "col3"
> nrow(data)
[1] 2
> ncol(data)
[1] 3
> dim(data)
[1] 2 3
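How these functions relate to each other can be checked directly. The sketch below uses a small stand-in data frame built in memory (df is a name introduced here, not the file-backed data from above) and relies only on base R.

```r
# A stand-in data frame with the same shape as the example: 2 rows, 3 columns
df <- data.frame(
  col1 = c("1,233", "470"),
  col2 = c("$12.79", "$1,113.22"),
  col3 = c("$1,333,233.17", "$0.12")
)

# dim() returns the row count followed by the column count
identical(dim(df), c(nrow(df), ncol(df)))

# For a data frame, names() and colnames() return the same character vector
identical(names(df), colnames(df))

# length() of a data frame is its number of columns
length(df) == ncol(df)
```

Each of the three comparisons should evaluate to TRUE, so any of these functions can be used to count columns.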
R also lets us modify the column names of a data frame by assigning to colnames, either one element at a time or all at once:
> colnames(data)
[1] "col1" "col2" "col3"
>
> # set the name of column 2
> colnames(data)[2] <- 'column 2'
> colnames(data)
[1] "col1" "column 2" "col3"
>
> # you can assign all of the columns at once, if you wish
> colnames(data) <- c( 'col 1', 'col 2', 'col 3')
> colnames(data)
[1] "col 1" "col 2" "col 3"
> str(data)
'data.frame': 2 obs. of 3 variables:
$ col 1: Factor w/ 2 levels "1,233","470": 1 2
$ col 2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
$ col 3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
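Renaming by position can silently hit the wrong column if the column order ever changes, so it is sometimes safer to rename by matching the old name. The sketch below illustrates that on a stand-in data frame; df and the new name price are illustrative choices, not part of the original example. It also shows that a name containing a space has to be quoted with backticks or accessed with [[ ]].

```r
# Stand-in data frame with the original column names
df <- data.frame(
  col1 = c("1,233", "470"),
  col2 = c("$12.79", "$1,113.22"),
  col3 = c("$1,333,233.17", "$0.12")
)

# Rename by matching the old name instead of relying on its position
colnames(df)[colnames(df) == "col2"] <- "price"
colnames(df)

# Names with spaces are allowed when assigned this way...
colnames(df)[1] <- "col 1"

# ...but must then be quoted with backticks or used with [[ ]]
df$`col 1`
df[["col 1"]]
```

If you plan to refer to columns with $ often, names without spaces are usually more convenient.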