I like the “multicore” library for a particular task. I can easily write a combination of if(require("multicore",...)) so that my function automatically uses the parallel mclapply() instead of lapply() wherever it is available. That is grand 99% of the time, except when my function is itself called from mclapply() (or one of the lower-level functions), in which case much CPU thrashing and grinding of teeth will result.
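A minimal sketch of that fallback pattern, for illustration only. The helper name par_lapply and its allow_parallel switch are hypothetical, and the sketch assumes the parallel package that ships with current R (which absorbed multicore's mclapply()) rather than multicore itself. The switch is just one possible way for a caller that is already inside mclapply() to ask for the serial path.

```r
# Hypothetical helper: use mclapply() where it is available, lapply() otherwise.
# allow_parallel is an assumed escape hatch for callers that are already
# running inside mclapply() and do not want nested forking.
par_lapply <- function(X, FUN, ..., allow_parallel = TRUE) {
  # Forking is not supported on Windows, so fall back to lapply() there.
  can_fork <- .Platform$OS.type != "windows" &&
    requireNamespace("parallel", quietly = TRUE)
  if (allow_parallel && can_fork) {
    parallel::mclapply(X, FUN, ...)
  } else {
    lapply(X, FUN, ...)
  }
}

# Parallel where possible
squares <- par_lapply(1:4, function(i) i^2)

# Forced serial, e.g. when called from inside another mclapply()
squares_serial <- par_lapply(1:4, function(i) i^2, allow_parallel = FALSE)
```

On an old installation that still has multicore, the same shape should work with "multicore" substituted for "parallel", but I have not tested that here.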
Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions here.
O’Reilly has published Data Mashups in R as a $4.99 PDF download in their Short Cut series. In 27 pages it takes you through an example of how to combine foreclosure information with maps and geographical information to produce plots like the one below. This is all done with the R statistical computing and analysis platform.
Someone on the R-help mailing list had a data frame with a column containing IP addresses in quad-dot format (e.g. 18.104.22.168). He wanted to sort by this column, and I proposed a solution involving strsplit. But Peter Dalgaard came up with a much nicer method using read.table on a
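To make the two ideas concrete, here is a rough sketch with made-up addresses. The data frame df and all variable names are assumptions for illustration, and the read.table variant is my own guess at how such an approach could look (using the text argument), not a reproduction of the code posted on the list.

```r
# Made-up example data: a data frame with one column of quad-dot addresses.
df <- data.frame(
  ip = c("10.0.0.20", "192.168.1.1", "10.0.0.3", "9.255.0.1"),
  stringsAsFactors = FALSE
)

# Variant 1: strsplit. Split on literal dots and build a numeric matrix
# with one row per address and one column per octet.
octets <- do.call(rbind, lapply(strsplit(df$ip, ".", fixed = TRUE), as.numeric))
ord <- order(octets[, 1], octets[, 2], octets[, 3], octets[, 4])
sorted_split <- df[ord, , drop = FALSE]

# Variant 2: read.table. Treat each address as a line of dot-separated
# fields, so the four octets arrive as four integer columns.
parts <- read.table(text = df$ip, sep = ".")
sorted_rt <- df[do.call(order, unname(as.list(parts))), , drop = FALSE]
```

Both variants sort numerically octet by octet, so 10.0.0.3 lands before 10.0.0.20, which a plain character sort would get wrong.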
Harvard Business School has an interesting study titled Do Friends Influence Purchases in a Social Network?. I would like to get my hands on the raw data (which is from the Korean social site Cyworld), but the outline conclusions seem plausible: