On 2010-06-22 11:20:00, Allan Engelhardt wrote in CYBAEA Data and Analysis:
We have a mild obsession with employee productivity and how it declines as companies get bigger. We have previously found that when you treble the number of workers, you halve their individual productivity, which is mildly scary.
Let’s try the FTSE-100 index of leading UK companies to see if they are significantly different from the S&P 500 leading American companies that we analyzed four years ago.
We will of course use the R statistical computing and analysis platform for our analysis, and once again we are grateful to Yahoo Finance for providing the data.
The analysis script is available as ftse100.R and is really simple:
## ftse100.R - Display employee productivity for FTSE-100 constituents
## Copyright © 2010 Allan Engelhardt <http://www.cybaea.net/>
## All Rights Reserved.
## Get the index constituents.
ftse.100 <- read.csv(file = "http://uk.old.finance.yahoo.com/d/quotes.csv?s=@%5EFTSE&f=s&e=.csv", header = FALSE)
names(ftse.100) <- c("symbol")
data <- data.frame(symbol=NULL, employees=NULL, profit=NULL, sector=NULL)
## For each stock symbol, get employees, profit, and sector
for (symbol in ftse.100$symbol) {
    profile.url <- paste("http://uk.finance.yahoo.com/q/pr?s=", symbol, sep="")
    con <- url(profile.url, open = "r")
    text <- readChar(con, 2^24) # enough bytes
    close(con)
    x <- sub('.*Number of employees:</td><td.*?>[[:space:]]*([[:digit:],]+).*', "\\1", text, ignore.case = TRUE)
    x <- gsub(',', '', x)
    empl <- tryCatch(as.integer(x), warning = function(x) NA)
    x <- sub('.*Net Profit.*?</td><td.*?>[[:space:]]*([+-]?[[:digit:],]+).*', '\\1', text)
    x <- gsub(',', '', x)
    profit <- tryCatch(as.integer(x)*1e6, warning = function(x) NA)
    sector <- sub('.*Sector:</td><td.*?>(.*?)</td>.*', '\\1', text)
    if (any(c(empl, profit) <= 0, is.na(c(empl, profit)))) {
        cat("Error parsing symbol", symbol, "see", profile.url, "\n")
    } else {
        ## stringsAsFactors = TRUE keeps sector a factor (no longer the default since R 4.0.0); levels(data$sector) below needs it
        data <- rbind(data, data.frame(symbol=symbol, employees=empl, profit=profit, sector=sector, stringsAsFactors = TRUE))
    }
    Sys.sleep(1)
}
## Save the data so we don't have to hit Yahoo all the time.
save(data, file = "data.RData")
## Save plot to file:
#png(filename="ftse100.png", width=800, height=800, pointsize=14, bg="white", res=100)
opar <- par(cex.sub = sqrt(sqrt(2)), font.sub = 3, font.lab = 2)
## x and y coordinates of plot and plot limits
x <- with(data, employees)
y <- with(data, profit/employees)
xlim <- c(10^floor(log10(min(x))), 10^ceiling(log10(max(x))))
ylim <- c(10^floor(log10(min(y))), 10^ceiling(log10(max(y))))
## Set up to display different color and symbols
plot_col <- 1
plot_pch <- 1
markers <- 21:25
pchs <- rep(markers, ceiling(length(levels(data$sector))/length(markers)))
palette(rainbow(length(levels(data$sector)), start=3/6, end=6/6))
# Make empty plot:
plot.new()
plot(profit/employees ~ employees, data = data[FALSE, ],
     type = "p", pch = pchs[plot_pch], col = plot_col,
     log="xy", xaxp = c(xlim, 1), yaxp = c(ylim, 1), xlim = xlim, ylim = ylim,
     main = "Profit per employee (FTSE 100)", xlab = "Employees", ylab = "Profit per employee (GBP)")
## Plot each sector
for (sector in levels(data$sector)) {
    plot.xy(xy.coords(with(data[data$sector == sector,], employees),
                      with(data[data$sector == sector,], profit/employees),
                      log = "xy", xlab = "", ylab = ""),
            type = "p", pch = pchs[plot_pch], col = plot_col, bg = plot_col)
    plot_pch <- plot_pch + 1
    plot_col <- plot_col + 1
}
legend(x = "bottomleft", legend = levels(data$sector), title = "Industry Sectors",
       col = palette(), pt.bg = palette(), pch = pchs, cex = 2/3, pt.cex = 1, ncol = 2)
## Fit a linear model to the log-log data:
m <- lm(log10(y) ~ log10(x))
xl <- c(xlim[1]*5, xlim[2]/5)
yl <- 10^predict(m, data.frame(x = xl))
lines(xl, yl, col = "darkred", lty = "dashed", lwd = 2)
t <- sprintf("Power = %0.3g", m$coefficients[2])
text(xl[2], yl[2], t, adj = c(0.25, -1.5), col = "darkred", font = 2)
## All done.
par(opar)
dev.off()
Leave it to run and this is what you get:
The power law still broadly holds. In a large company, an individual employee is only ¼ as productive as one in a company with one-tenth the number of workers.
The analysis for the FTSE All-Share index is easy (ftse-all.R) and gives a slope of -0.7605541 for the 301 companies with the required information. This is worse, much worse (you only need about 6 times as many workers to reduce the productivity to ¼).
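The ratios quoted here follow directly from the power law: if productivity scales as employees^slope, then multiplying the headcount by a factor k multiplies productivity by k^slope. The following is a minimal sketch of that arithmetic, using the rounded slopes reported in this post and its comments (-0.60 for the FTSE-100, -0.68 for the S&P 500, -0.7605541 for the FTSE All-Share). The helper names `productivity_ratio` and `size_factor` are made up for illustration and are not part of ftse100.R.

```r
## Productivity multiplier when headcount grows by a factor k,
## assuming productivity ~ employees^slope (hypothetical helper).
productivity_ratio <- function(k, slope) k^slope

## Headcount multiplier needed to reach a given productivity ratio
## (inverse of the function above, also hypothetical).
size_factor <- function(ratio, slope) ratio^(1 / slope)

productivity_ratio(10, -0.60)   # FTSE-100: ten times the workers
productivity_ratio(3, -0.68)    # S&P 500: treble the workers
size_factor(1/4, -0.7605541)    # FTSE All-Share: growth needed to reach 1/4
```

The three results come out close to ¼, a little under ½, and roughly 6, matching the figures in the text.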
Join the discussion
statistically significant effects of sector?
I see you plot with different symbols per sector. Could you run a model with fixed effects by sector to see if you get statistically/substantially different slopes or intercepts by sector? That is, does (say) the petrochemical sector have a reduced drop-off, while the IT sector has a sharper decrease in productivity?
Effects per sector
@Harlan:
Thank you for your comment. I like the suggestion, but there are so few samples in many of the sectors that I doubt you will get a statistical result. But intuitively we would expect an effect: the Investment Companies sector has different productivity from Transports (median 10,000,000 versus 2,599 in the All-Share data set).
Do
sort( with(data, tapply(profit/employees, sector, median)) )
to see the productive industries.
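Harlan's suggestion could be sketched as a set of nested linear models: a pooled fit, one with sector-specific intercepts, and one where the slopes also vary by sector. The data below are simulated purely for illustration; the sector names, intercepts, and noise level are assumptions and not FTSE figures. With the real `data` frame the same three formulas would be fitted to `profit/employees`.

```r
## Simulated stand-in for the scraped data (NOT real FTSE figures).
set.seed(1)
n <- 120
sim <- data.frame(
  employees = round(10^runif(n, 2, 5)),
  sector    = factor(sample(c("Banks", "Mining", "Retail"), n, replace = TRUE))
)

## Assumed truth: common slope of -0.6, different intercept per sector.
offset <- c(Banks = 6.5, Mining = 6.0, Retail = 5.5)
sim$productivity <- 10^(unname(offset[as.character(sim$sector)]) -
                        0.6 * log10(sim$employees) + rnorm(n, sd = 0.3))

m0 <- lm(log10(productivity) ~ log10(employees), data = sim)           # pooled
m1 <- lm(log10(productivity) ~ log10(employees) + sector, data = sim)  # intercepts vary
m2 <- lm(log10(productivity) ~ log10(employees) * sector, data = sim)  # slopes vary too

## Nested F-tests: does each extra set of sector terms improve the fit?
tab <- anova(m0, m1, m2)
print(tab)
```

On the real data, sectors with only a handful of companies would give very wide confidence intervals for their slope terms, which is the caveat raised in the reply above.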
Explanation and mirages
Big companies are usually old: their profitability has decreased and they are waiting for bankruptcy.
Another thing is that because of subcontractors the employee and accounting information is partly unreliable/irrelevant. Subcontractors are deliberately used to cook the books and to hide what really happens in the business.
Selection bias
If productivity and number of employees were independent, the largest companies would be those at the extremes of productivity and employee numbers. The FTSE sample will only include a company with few employees if it is extremely productive, but will include companies that are not very productive if they have large numbers of employees.
It's obviously more complicated than that but I would guess that the relationship would be weaker in the FTSE 250.
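A small simulation can illustrate the selection effect described here. It is a sketch under strong assumptions: employee counts and productivity are drawn as independent log-normal variables (the means and standard deviations are arbitrary), and index membership is approximated by taking the 100 companies with the largest total profit, whereas the real index is selected on market capitalisation.

```r
set.seed(42)
n <- 5000
log_emp    <- rnorm(n, mean = 3, sd = 1)  # log10(employees), assumed distribution
log_prod   <- rnorm(n, mean = 4, sd = 1)  # log10(profit per employee), independent of size
log_profit <- log_emp + log_prod          # log10(total profit)

## Stand-in for index membership: the 100 companies with the largest profit.
top <- order(log_profit, decreasing = TRUE)[1:100]

## Log-log slope in the whole population and in the selected sample.
slope_all <- unname(coef(lm(log_prod ~ log_emp))[2])
slope_top <- unname(coef(lm(log_prod[top] ~ log_emp[top]))[2])
c(all = slope_all, top100 = slope_top)
```

Under these assumptions the slope in the full population is close to zero, while the selected sample shows a clearly negative slope even though no relationship was built in.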
Re: Selection bias
@Bunbury:
Thank you for your comment. You are of course completely right about the selection bias and that is a very good point indeed.
Which makes it even more surprising that the relationship is stronger when a broader index is used. We found -0.60 for the FTSE-100, but -0.68 for the S&P 500 index and -0.76 for the FTSE All-Share.
Again, our analysis is by necessity limited to companies where we have the profit and employee information (and where both are greater than zero), which introduces another selection bias. But it is not immediately obvious to me that this bias should make the effect stronger.