Area Plots with Intensity Coloring

2010-07-13 07:47:00 Allan Engelhardt wrote in CYBAEA Data and Analysis:

I am not sure apeescape’s ggplot2 area plot with intensity colouring is really the best way of presenting the information, but it had me intrigued enough to replicate it using base R graphics.

The key technique is to draw a gradient line which R does not support natively so we have to roll our own code for that. Unfortunately, lines(..., type="l") does not recycle the colour col= argument, so we end up with rather more loops than I thought would be necessary.

(The answer is not to use lines(..., type="h") which, confusingly, does recycle the colour col= argument. This one had me for a while, but the type=h lines always start from zero so you do not get the gradient feature.)

We also get a nice opportunity to use the under-appreciated read.fwf function.

##!/usr/bin/Rscript
## nino.R - another version of http://bit.ly/9P9Gh1
## Copyright © 2010 Allan Engelhardt (http://www.cybaea.net/)

## Get the data from the NOAA server
nino <- read.fwf("http://www.cpc.noaa.gov/data/indices/wksst.for",
                 widths=c(-1, 9, rep(c(-5, 4, 4), 4)),
                 skip=4,
                 col.names=c("Week",
                     paste(rep(c("Nino12","Nino3","Nino34","Nino4"), rep(2, 4)),
                           c("SST", "SSTA"), sep=".")))

## Make the date column something useful
nino$Week <- as.Date(nino$Week, format="%d%b%Y")

## Make colour gradients
ncol <- 50
grad.neg <- hsv(4/6, seq(0, 1, length.out=ncol), 1) # Blue gradient
grad.pos <- hsv(  0, seq(0, 1, length.out=ncol), 1) # Red gradient

## Make plot
plot(Nino34.SSTA ~ Week, data=nino, type="n",
     main="Nino34", xlab="Date", ylab="SSTA", axes=FALSE)
do.call(function (...) rect(..., col="gray85", border=NA),
        as.list(par("usr")[c(1, 3, 2, 4)]))

y <- nino$Nino34.SSTA                   # The values we will plot
x <- nino$Week

axis.Date(1, x=x, tck=1, col="white")
axis(2, tck=1, col="white")
box()

idx <- integer(NROW(nino))
idx[y >= 0] <- 1 + round( y[y >= 0] * (ncol - 1) / max( y[y >= 0]), 0)
idx[y <  0] <- 1 + round(-y[y <  0] * (ncol - 1) / max(-y[y <  0]), 0)

draw.gradient <- function(x, ys, cols) {
    xs <- rep(x, 2)
    for (i in seq(1, length(ys)-1))
        plot.xy(list(x=xs, y=c(ys[i], ys[i+1])), type="l", col=cols[i])
}

for (i in 1:length(x)) {
    ys <- seq(0, y[i], length.out=idx[i]+1)
    cols <- (if (y[i] >=0) grad.pos else grad.neg)
    draw.gradient(x[i], ys, cols)
}

The result is a decent gradient:

[Graphics output]

I deliberately omitted the scale legend on the right hand side following Allan’s First Law of Happy Graphics: Thou shall not present the same information twice.

For less dense information, you should increase the line width. That is left to the reader. (Hint: it is hard to get just right in base graphics, but lwd <- ceiling(par("pin")[1] / dev.size("in")[1] * dev.size("px")[1] / length(x)) could be a starting point for an approximation. We really need gradient-filled polygons in base R.)

Subscribe to CYBAEA Data and Analysis

Jump to comments.

You may also like these posts:

  1. [0.51] Employee productivity as function of number of workers revisited

    We have a mild obsession with employee productivity and how that declines as companies get bigger. We have previously found that when you treble the number of workers, you halve their individual productivity which is mildly scary. We revisit the analysis for the FTSE-100 constituent companies and find that the relation still holds four years later and across a continent.

  2. [0.48] R: Eliminating observed values with zero variance

    I needed a fast way of eliminating observed values with zero variance from large data sets using the R statistical computing and analysis platform . In other words, I want to find the columns in a data frame that has zero variance. And as fast as possible, because my data sets are large, many, and changing fast. The final result surprised me a little.

  3. [0.45] Feature selection: Using the caret package

    Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. In a previous post we looked at all-relevant feature selection using the Boruta package while in this post we consider the same (artificial, toy) examples using the caret package. Max Kuhn kindly listed me as a contributor for some performance enhancements I submitted, but the genius behind the package is all his.

  4. [0.41] Benchmarking feature selection with Boruta and caret

    Feature selection is the data mining process of selecting the variables from our data set that may have an impact on the outcome we are considering. For commercial data mining, which is often characterised by having too many variables for model building, …

  5. [0.40] Getting started with the Heritage Health Price competition

    The US$ 3 million Heritage Health Price competition is on so we take a look at how to get started using the R statistical computing and analysis platform .

  6. [0.39] SNA with R: Loading large networks using the igraph library

    We are interested in Social Network Analysis using the statistical analysis and computing platform R . The documentation for R is voluminous but typically not very good, so this entry is part of a series where we document what we learn as we explore the t…

  7. [0.35] Feature selection: All-relevant selection with the Boruta package

    Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. There are two main approaches to selecting the features (variables) we will use for the a…

  8. [0.35] R tips: Installing Rmpi on Fedora Linux

    Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform . Sinc…

  9. [0.34] Big data for R

Join the discussion

Do you agree or disagree? Have a question of want to make a point? Join the discussion:

There are no comments yet. Be the first to comment.