On 2009-08-17 09:18:00, Allan Engelhardt wrote in CYBAEA Journal:
We knew the potential existed already, of course. Mobile devices in the USA generate some 600 billion transactions per day, each tagged with the location and time. Jeff Jonas: Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate[...]. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not.
That is some 7 million transactions per second, on average.
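As a quick check of that average, the conversion from the daily figure to a per-second rate can be sketched as follows. The 600 billion figure is the one quoted above; the variable names are only illustrative.

```python
# Convert the quoted daily transaction volume to an average per-second rate.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

transactions_per_day = 600e9  # figure quoted in the post
per_second = transactions_per_day / SECONDS_PER_DAY

print(f"{per_second / 1e6:.1f} million transactions per second")
```

The result is a little under 7 million, which matches the rounded number in the text.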
The mobile operators have this data, of course. We all know this (especially here where we have been using some of it for social network analysis), at least in principle. No real surprises here, except perhaps in the volumes.
But did you know that the operators are sharing your data? What is new, at least to me, is that this data is being provided to third parties that are leveraging specially designed analytics to make sense of our space-time-travel data. With the data out and specialized analytics emerging, this infant industry is already doing some pretty amazing work. Your space-time-travel data makes where you live and where you work self-evident, and it reveals your most frequent, periodic, infrequent and rare destinations.
This is powerful stuff. I can't validate Jeff's claim that the network operators are sharing their detailed network data in any systematic way and for anything other than research purposes, but surely it is only a matter of time. This data is too good to ignore — super-food for analytics, as Jeff calls it.
Think about the possibilities. We obviously know where you live and where you work, but also who is in the house with you (your wife, hopefully) and which of your office mates join you for the Friday afternoon beer.
We know which department stores and malls you visit. We know where in the stores you spend the time. We know your hobbies, your interests, your friends, .... We know your mistress and your bank manager, your colleagues and the boys on the football team you coach. We know where you shop and when you shop, we know your commuting route and that you always stop at Starbucks for a coffee on the way in. We know it in real-time so we know you are going to be late in the office this morning. Would you like us to let them know that you will be there at 09:23 (or 09:12 if you do not stop for that coffee on the way)?
There are so many possibilities. Much good could be done with this data. Some bad. Which service would you like to see developed?
On 2010-07-13 07:47:00, Allan Engelhardt wrote in CYBAEA Data and Analysis:
I am not sure apeescape’s ggplot2 area plot with intensity colouring is really the best way of presenting the information, but it had me intrigued enough to replicate it using base R graphics.
The key technique is to draw a gradient line, which R does not support natively, so we have to roll our own code for that. Unfortunately, lines(..., type="l") does not recycle the colour col= argument, so we end up with rather more loops than I thought would be necessary.
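The general idea of a gradient line, splitting the line into short segments and giving each its own colour, can be sketched independently of the plotting system. The following is a minimal illustration in Python rather than R, and it is not the code from the post: the function names, the blue-to-red colour ramp and the choice to colour each segment by its mean y value are all assumptions made for the example.

```python
def lerp_colour(c0, c1, t):
    """Linearly interpolate between two RGB tuples for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def gradient_segments(xs, ys, low=(0, 0, 255), high=(255, 0, 0)):
    """Split a polyline into segments, each paired with a hex colour.

    The colour of a segment is taken from the mean y value of its two
    end points, mapped linearly onto the low-to-high colour ramp.
    """
    lo, hi = min(ys), max(ys)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat line
    segments = []
    for i in range(len(xs) - 1):
        mid = (ys[i] + ys[i + 1]) / 2
        r, g, b = lerp_colour(low, high, (mid - lo) / span)
        colour = f"#{r:02x}{g:02x}{b:02x}"
        segments.append(((xs[i], ys[i]), (xs[i + 1], ys[i + 1]), colour))
    return segments


for start, end, colour in gradient_segments([0, 1, 2], [0, 1, 2]):
    print(start, end, colour)
```

Each returned segment would then be drawn as its own short line in that colour, which is where the extra loops come from.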
We also get a nice opportunity to use the under-appreciated read.fwf function.
Read more (~535 words).
On 2010-06-22 11:45:00, Allan Engelhardt wrote in CYBAEA Journal:
We have a mild obsession with employee productivity and how that declines as companies get bigger. We have previously found that when you treble the number of workers, you halve their individual productivity, which is scary.
We now re-do the analysis four years later and, just because we can, we are using the leading companies of the London stock exchange instead of the largest American companies.
The results still hold. We called it the 3/2 rule: treble the number of workers and you halve their individual productivity. Large companies with ten times the number of employees are ¼ as productive as their smaller competitors.
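The step from "treble and halve" to the ¼ figure can be made explicit with a small calculation. This sketch assumes the rule holds as a power law at every scale, which is an interpretation for illustration and not necessarily the fitted model from the full analysis; the function name is hypothetical.

```python
import math

# Assumption: productivity per employee scales as size ** EXPONENT,
# with the exponent chosen so that 3x the employees gives 1/2 the productivity.
EXPONENT = -math.log(2) / math.log(3)  # roughly -0.63


def relative_productivity(size_ratio):
    """Productivity per employee relative to a company size_ratio times smaller."""
    return size_ratio ** EXPONENT


print(f"exponent: {EXPONENT:.2f}")
print(f"3x employees:  {relative_productivity(3):.2f}")
print(f"10x employees: {relative_productivity(10):.2f}")
```

Under that assumption, ten times the employees gives roughly 0.23 of the productivity per head, which is close to the ¼ quoted above.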
Employee productivity is a big issue. If all the FTSE-100 companies achieved their average profits per employee, then the index would generate almost £1 trn of additional net profits for the economy.
Read more (~245 words).
On 2010-06-22 11:20:00, Allan Engelhardt wrote in CYBAEA Data and Analysis:
We have a mild obsession with employee productivity and how that declines as companies get bigger. We have previously found that when you treble the number of workers, you halve their individual productivity, which is mildly scary.
We revisit the analysis for the FTSE-100 constituent companies and find that the relation still holds four years later and across a continent.
Read more (~763 words, 5 comments).
On 2010-06-17 09:05:00, Allan Engelhardt wrote in CYBAEA Data and Analysis:
Following on from my previous post about improving performance of R by linking with optimized linear algebra libraries, I thought it would be useful to try out the five benchmarks Revolution Analytics have on their Revolutionary Performance pages.
Read more (~300 words, 2 comments).
On 2010-06-15 10:21:00, Allan Engelhardt wrote in CYBAEA Data and Analysis:
Can we make our analysis using the R statistical computing and analysis platform run faster? Usually the answer is yes, and the best way is to improve your algorithm and variable selection.
But recently David Smith was suggesting that a big benefit of their (commercial) version of R was that it was linked to a better linear algebra library. So I decided to investigate.
The quick summary is that it only really makes a difference for fairly artificial benchmark tests. For “normal” work you are unlikely to see a difference most of the time.
Read more (~934 words, 1 comment).
Join the discussion
Too good to be true ...
Allan, I don't think I can fully agree with your last paragraph, though. As much as MNOs know from the above data, do you think that this kind of analysis will be allowed in Europe? Even now, tracking of travel needs to be anonymized if it can be used for marketing purposes at all.
Besides, you need a really big, beefy machine to analyze all this data, right? What you want to combine is SNA from CDRs with location and time data obtained by triangulation. Wow, it's an appealing idea, but as always: what is the financial benefit to justify such an investment and the resulting knowledge?
Where does your marketing campaign come in? OK, you might know which MSISDNs I buddy up with for a beer on a Friday regularly (well, maybe once a month), and you know that a couple of MSISDNs are in the same sports complex that I'm in twice a week, but how are you going to make some money out of this?
Just challenging you a little :-)
Andreas