Plotting Multiple Series in R -- Part 4 in a Series
This is post #04 in a running series about plotting in R.
Often you want to show several series on the same plot. Let’s plot daily observations together with a 30-day moving average.
First, the data needs cleaning. For convenience, I convert the column names to lower case with the tolower function. I also convert the dates, which read.csv loads as factors from text formatted as yyyy-mm-dd, into Date objects with as.Date:
yahoo <- read.csv(file='~/stuff/blog/YHOO stock prices [19960412, 20090702].csv', header=TRUE, sep=',')
str(yahoo)
# 'data.frame': 3329 obs. of 7 variables:
# $ Date : Factor w/ 3329 levels "1996-04-12","1996-04-15",..: 3329 3328 3327 3326 3325 3324 3323 3322 3321 3320 ...
# $ Open : num 15.2 15.5 15.8 15.9 15.6 ...
# $ High : num 15.3 15.7 15.9 16 15.8 ...
# $ Low : num 14.9 15.3 15.3 15.6 15.5 ...
# $ Close : num 15 15.4 15.7 15.9 15.7 ...
# $ Volume : int 16919900 12716100 16033900 12312100 26449100 19827800 30979700 15866300 26488700 20323100 ...
# $ Adj.Close: num 15 15.4 15.7 15.9 15.7 ...
colnames(yahoo) <- tolower( colnames(yahoo) )
yahoo$date <- as.Date( as.character( yahoo$date ) )
# sort yahoo chronologically, the order in which we want to display it
yahoo <- yahoo[ order(yahoo$date), ]
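The plotting call for this first attempt is not shown, so here is a minimal sketch of what it might look like. It assumes the close column is the series of interest, and it uses made-up stand-in data (a smooth sine wave) in place of the real CSV so that it runs on its own.

```r
# Stand-in data (hypothetical values, NOT the real Yahoo prices)
dates <- seq(as.Date("2007-10-01"), as.Date("2008-12-31"), by = "day")
yahoo <- data.frame(date  = dates,
                    close = 20 + 5 * sin(seq_along(dates) / 20))

# Plot every observation as a line, with dates on the x axis
# (assumption: close is the column being plotted)
plot(yahoo$date, yahoo$close, type = "l")
```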
That isn’t very pretty, not least because we’re displaying too much data to be useful. Let’s cut it down to data from January 1, 2008 onward:
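One way to do the cut is a logical index on the date column. This is only a sketch: the name recent is my own, and the data frame is again a made-up stand-in rather than the real prices.

```r
# Stand-in data (hypothetical values, NOT the real Yahoo prices)
dates <- seq(as.Date("2007-10-01"), as.Date("2008-12-31"), by = "day")
yahoo <- data.frame(date  = dates,
                    close = 20 + 5 * sin(seq_along(dates) / 20))

# Keep only rows dated January 1, 2008 or later
# ("recent" is a hypothetical name for the subset)
recent <- yahoo[yahoo$date >= as.Date("2008-01-01"), ]

plot(recent$date, recent$close, type = "l")
```

Comparing against as.Date("2008-01-01") rather than a bare string keeps the comparison between two Date objects.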
It’s worth pointing out that R’s plotting code will try to set the upper and lower y bounds to something reasonable based on the data you give it. Sometimes, though, particularly to get a sense of scale, you really want to see the full range. You can do this by setting the y axis limits explicitly with ylim. I also make the plot more presentable.
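As a rough illustration, assuming that "full range" means starting the y axis at zero, the call might look like this. The axis labels and title are my own guesses at "more presentable", and the data is a made-up stand-in.

```r
# Stand-in data (hypothetical values, NOT the real Yahoo prices)
dates  <- seq(as.Date("2008-01-01"), as.Date("2008-12-31"), by = "day")
recent <- data.frame(date  = dates,
                     close = 20 + 5 * sin(seq_along(dates) / 20))

# Assumption: show the axis from zero up to the largest observed value
y_limits <- c(0, max(recent$close))

plot(recent$date, recent$close, type = "l",
     ylim = y_limits,
     xlab = "Date", ylab = "Closing price",   # hypothetical labels
     main = "Daily closing price")            # hypothetical title
```

Without ylim, R would zoom in on the observed range (roughly 15 to 25 for this stand-in data), which exaggerates the swings.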
Next, I want to plot the moving average, so I create a function, ma30, to calculate it. I then add ma30 as a column, computing it over the whole data range so that the moving average is already correct at the start of our subset:
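The body of ma30 is not shown, so the version below is an assumed implementation, not necessarily the original: a trailing mean over the last 30 rows, built on stats::filter with sides = 1. Note that it averages 30 observations, which for trading data means 30 trading days rather than 30 calendar days. The data is a made-up stand-in.

```r
# Assumed implementation: trailing mean of the current value and the
# 29 before it. The first 29 results are NA because the window is incomplete.
ma30 <- function(x) {
  as.numeric(stats::filter(x, rep(1 / 30, 30), sides = 1))
}

# Stand-in data (hypothetical values, NOT the real Yahoo prices)
dates <- seq(as.Date("2007-10-01"), as.Date("2008-12-31"), by = "day")
yahoo <- data.frame(date  = dates,
                    close = 20 + 5 * sin(seq_along(dates) / 20))

# Compute on the full range first, then subset, so the first rows of
# the subset already have a complete 30-row window behind them
yahoo$ma30 <- ma30(yahoo$close)
recent <- yahoo[yahoo$date >= as.Date("2008-01-01"), ]
```

Had the moving average been computed after subsetting, the first 29 rows of 2008 would be NA and the line would start late.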
Finally, I replot the data, adding the moving average as a second series and drawing it slightly bolder (lwd=2) to emphasize it over the daily observations:
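Putting the pieces together, a sketch of the final plot could look like the following. Only lwd=2 comes from the text above. The moving-average calculation, the axis labels, the zero-based ylim, and the stand-in data are all assumptions.

```r
# Stand-in data (hypothetical values, NOT the real Yahoo prices)
dates <- seq(as.Date("2007-10-01"), as.Date("2008-12-31"), by = "day")
yahoo <- data.frame(date  = dates,
                    close = 20 + 5 * sin(seq_along(dates) / 20))

# Assumed moving average: trailing mean over 30 rows, on the full range
yahoo$ma30 <- as.numeric(stats::filter(yahoo$close, rep(1 / 30, 30), sides = 1))
recent <- yahoo[yahoo$date >= as.Date("2008-01-01"), ]

# First series: daily observations, drawn with the default line width
plot(recent$date, recent$close, type = "l",
     ylim = c(0, max(recent$close)),
     xlab = "Date", ylab = "Closing price")

# Second series: lines() adds to the existing plot; lwd = 2 makes it bolder
lines(recent$date, recent$ma30, lwd = 2)
```

The key point is that plot() starts a new figure while lines() draws on top of the current one, which is how the two series end up sharing the same axes.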