Stochastic Nonsense

Labeling Plots - Annotations, Legends, Etc -- Part 6 in a Series

This is post #06 in a running series about plotting in R.

You will often want to label parts of a plot, either to point out a particular feature or to answer a question your audience is likely to ask. Let’s see how to do this in R.

First, let’s collapse all the R source we need to get to the plot we had at the end of part 5 – axis labeling.

# load data and prep
yahoo <- read.csv(file='~/stuff/blog/YHOO stock prices [19960412, 20090702].csv', header=T, sep=',')
colnames(yahoo) <- tolower( colnames(yahoo) )

yahoo$date <- as.Date( as.character( yahoo$date ) )
yahoo <- yahoo[order(yahoo$date),]

# util functions
summary30 <- function( x, FUN, na.rm=F ){
  val <- rep( 0, length( x ) )
  for( j in 1:length( x ) ){
      val[ j ] <- FUN( x[ max( j - 29, 1 ):j ], na.rm=na.rm)
  }
  val
}

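# 30 day moving average of the close, using summary30 defined above
yahoo$close30 <- summary30(yahoo$close, FUN=mean)

# restrict to 2008 onward; yahoo2 is the data frame plotted below
yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'),]

# create our plot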
plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
  col='black', type='l',
  main='YHOO stock close', xlab='date', ylab='close ($)',
  xaxt='n')
points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)

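# put X axis labels on the first date present in each quarter
# tapply can drop the Date class, so convert the result back before matching
locs <- as.Date(as.vector(tapply(X=yahoo2$date, FUN=min, INDEX=format(yahoo2$date, '%Y%m'))), origin='1970-01-01')

# TRUE for rows that hold the first date present in their month ...
at <- yahoo2$date %in% locs

# ... and whose month starts a quarter
at <- at & format(yahoo2$date, '%m') %in% c('01', '04', '07', '10')
axis(side=1, at=yahoo2$date[ at ], labels=format(yahoo2$date[at], '%b-%y'))
abline(v=yahoo2$date[at], col='grey', lwd=0.5)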
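
The quarter-start logic is easier to follow on a small example. The sketch below assumes a made-up vector of 2007 weekdays (not the YHOO data) and uses duplicated instead of tapply to find the first date present in each month. That is a swapped-in approach, which should give the same dates as long as the vector is sorted.

```r
# hypothetical data: every weekday in 2007, standing in for trading days
days <- seq(as.Date('2007-01-01'), as.Date('2007-12-31'), by='day')
days <- days[ !as.POSIXlt(days)$wday %in% c(0, 6) ]   # drop Sunday (0) and Saturday (6)

# first date present in each month; relies on days being sorted ascending
first_of_month <- !duplicated( format(days, '%Y%m') )

# keep only the months that start a quarter
quarter_start <- first_of_month & format(days, '%m') %in% c('01', '04', '07', '10')

label_dates <- days[ quarter_start ]
print( format(label_dates, '%b-%y') )
```

April 1 and July 1, 2007 both fell on a Sunday, so the labels for those quarters land on the 2nd. This is why the code looks for the first date present rather than the first of the month.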

Start with the dramatic jump in the stock price on February 1, 2008, the day Microsoft announced its takeover bid for Yahoo. We can annotate the plot with that text and draw an arrow pointing at the series. Note that the x, y locations passed to these functions are in the same coordinates as the data you passed to plot: here x is a Date and y is a price in dollars.

text(x=as.Date('2008-03-01'), y=9, labels='MSFT offer', col='blue')

# length slightly shrinks the size of the arrow head; lwd makes the line bolder
arrows(x0=as.Date('2008-03-01'), y0=10, x1=as.Date('2008-02-01'), y1=20, col='blue', length=0.1, lwd=3)
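
To make the coordinate system concrete, here is a minimal sketch on made-up data (a flat series with one jump, not the real prices). It assumes a null pdf device so it runs without a display, and it reads par('usr') to show that the plot region is measured in the same units as the data, with dates stored as days since 1970-01-01.

```r
pdf(NULL)   # null device: draw without opening a window or writing a file

# hypothetical data: a flat series with a jump on 2008-02-01
dates <- seq(as.Date('2008-01-01'), as.Date('2008-04-30'), by='day')
price <- ifelse(dates < as.Date('2008-02-01'), 19, 28)

plot(x=dates, y=price, type='l', ylim=c(0, 35))

# par('usr') is c(xmin, xmax, ymin, ymax) of the plot region in data units
usr <- par('usr')

# a Date is a number underneath, so it can be passed straight to text and arrows
label_x <- as.Date('2008-03-01')
text(x=label_x, y=9, labels='jump', col='blue')
arrows(x0=label_x, y0=10, x1=as.Date('2008-02-01'), y1=20, col='blue', length=0.1)

invisible( dev.off() )
```

If an annotation does not show up, comparing its x and y against par('usr') is a quick way to check whether it fell outside the plot region.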

Next, more for demonstration than anything else, let’s plot the 30 day moving min and max of the close price. To distinguish these two series from the others, I’ll use dashed rather than solid lines and make them very thin. First, I’ll create a function that computes a rolling summary and takes the summary function (min or max) as an argument. The lty param to points controls the line type, here 2 for dashed, and an lwd below one gives a very narrow line.

# add some min / max info
summary30 <- function( x, FUN, na.rm=F ){
  val <- rep( 0, length( x ) )
  for( j in 1:length( x ) ){
      val[ j ] <- FUN( x[ max( j - 29, 1 ):j ], na.rm=na.rm)
  }
  val
}

# create the series and pass in the function we want to use
yahoo$minclose30 <- summary30(yahoo$close, FUN=min)
yahoo$maxclose30 <- summary30(yahoo$close, FUN=max)
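# re-subset so yahoo2 picks up the new min / max columns
yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'),]

# overlay both series as thin dashed blue lines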
points(x=yahoo2$date, y=yahoo2$minclose30, col='blue', type='l', lty=2, lwd=0.5)
points(x=yahoo2$date, y=yahoo2$maxclose30, col='blue', type='l', lty=2, lwd=0.5)
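
To see what the rolling summary actually returns, it helps to run it on a vector small enough to check by hand. The sketch below is a hypothetical generalization of summary30 (the name rolling_summary and the window argument are mine, not part of the series) so that a 3 element window can be used instead of 30.

```r
# hypothetical variant of summary30 with the window length as a parameter
rolling_summary <- function( x, FUN, window=30, ... ){
  val <- numeric( length( x ) )
  for( j in seq_along( x ) ){
    # the window is clipped at the start of the series
    val[ j ] <- FUN( x[ max( j - window + 1, 1 ):j ], ... )
  }
  val
}

x <- c(3, 1, 4, 1, 5)
rolling_min <- rolling_summary(x, FUN=min, window=3)   # 3 1 1 1 1
rolling_max <- rolling_summary(x, FUN=max, window=3)   # 3 3 4 4 5
```

Note that the first window - 1 values are computed over fewer points than the rest, so the start of each series is less smooth than the remainder.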

Last, we should add a legend to make clear what each line in the plot is. Note that lty lets you set the line type, solid or dashed, for each legend item. I also used png with width=720 and height=480 to stretch the plot out for better viewing.

legend(x=as.Date('2009-02-01'), y=30, 
  legend=c('daily close', '30 day MA', '30 day min/max'), 
  col=c('black', 'red', 'blue'), lwd=3, lty=c(1,1,2))
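
For completeness, here is a sketch of how the png call and the legend might fit together. The output path is a hypothetical temporary file and the series are made up so the example runs on its own. It also assumes your R build has png support, and it places the legend with the 'topleft' keyword rather than explicit x, y coordinates.

```r
# hypothetical output path; tempfile() avoids overwriting anything
out_file <- tempfile(fileext='.png')

# open the png device before plotting; width and height are in pixels
png(filename=out_file, width=720, height=480)

# made-up series, only here so the legend has something to describe
x <- 1:100
plot(x=x, y=sqrt(x), type='l', col='black', ylim=c(0, 12))
points(x=x, y=sqrt(x) + 1, type='l', col='red', lwd=2)
points(x=x, y=sqrt(x) - 1, type='l', col='blue', lty=2, lwd=0.5)

# col and lty are matched to the legend labels in order
legend('topleft',
  legend=c('series', 'series + 1', 'series - 1'),
  col=c('black', 'red', 'blue'), lwd=3, lty=c(1, 1, 2))

# closing the device finishes writing the file
invisible( dev.off() )
```

A keyword position is convenient when the axis range may change. Explicit coordinates, as in the legend call above, give finer control over where the box sits.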