We knew the potential existed already, of course. Mobile devices in the USA generate some 600 billion transactions per day, each tagged with the location and time. Jeff Jonas says,
Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate[…]. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not. That is some 7 million transactions per second, on average.
This is by far the best description of why traditional parallel databases (like Teradata, Greenplum et al.) are an evolutionary dead end. But much more than a theoretical discussion, they have built a solution which they call HadoopDB. It is based on Hadoop, PostgreSQL, and Hive and is completely open source. Alternative column-based backends to PostgreSQL are being implemented now. Read: Announcing release of HadoopDB.
The nice people at Velocity have released The B2B Content Marketing Workbook. It is behind a registration wall, which means we wouldn’t normally recommend it, but you can just type junk in the fields if you are not comfortable with giving your personal details to a marketing agency. (Think about it….) If you are relatively new in the B2B world, say having joined a professional services or consulting organization, you may find this one useful.
A story from antiquity involving a king of Rome and a Greek Sibyl has lovely marketing lessons.
I like the “multicore” library for a particular task. I can easily write a check along the lines of if(require("multicore",...)) so that my function will automatically use the parallel mclapply() instead of lapply() where it is available. Which is grand 99% of the time, except when my function is called from mclapply() (or one of the lower-level functions), in which case much CPU thrashing and grinding of teeth will result.
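As a rough illustration of the pattern described above, here is a minimal sketch in R. The function name apply_maybe_parallel and its parallel argument are hypothetical, invented for this example, and the arguments passed to require() are an assumption rather than the ones elided in the text. The opt-out flag is just one simple way to sidestep the nesting problem, not necessarily the approach the post goes on to take.

```r
# Sketch: use mclapply() when the multicore package loads, else plain lapply().
# `apply_maybe_parallel` and its `parallel` argument are hypothetical names.
apply_maybe_parallel <- function(X, FUN, ..., parallel = TRUE) {
  # require() returns FALSE (with a warning) when the package is missing,
  # so the function degrades to the serial lapply() on such systems.
  use_mc <- parallel &&
    suppressWarnings(require("multicore", quietly = TRUE))
  if (use_mc) {
    mclapply(X, FUN, ...)
  } else {
    lapply(X, FUN, ...)
  }
}

# Same result either way; only the execution strategy differs.
squares <- apply_maybe_parallel(1:4, function(i) i^2)

# A caller that already runs inside mclapply() can opt out explicitly,
# which avoids forking again from within forked workers.
nested <- apply_maybe_parallel(1:4, function(i) i^2, parallel = FALSE)
```

Passing parallel = FALSE from code that is itself running under mclapply() keeps the inner call serial, at the cost of making the caller responsible for knowing its own context.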
Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions here.
O’Reilly has published Data Mashups in R as a $4.99 PDF download in their Short Cut series. In 27 pages it takes you through an example of how to combine foreclosure information with maps and geographical information to produce plots like the one below. This is all done with the R statistical computing and analysis platform.