Research project: Increasing innovation velocity in the enterprise using text clustering

2004-07-21 14:07:00 Allan Engelhardt wrote in CYBAEA Journal:

I have an interest in social software in the enterprise: the use of tools like blogs, wikis, document management, RSS, communities, discussion boards, and so on within large organisations to foster “bottom up” knowledge management and collaboration.

I really became aware of the problem when I received a note from one of the guys at Amazon. He argued that as an organisation they prided themselves on hiring above-average people with above-average desire to create new things, but as a company they still found that ideas were lost and that starting new projects took too long.

We have worked with organisations to implement social software, looking at tools like Ecademy, Socialtext, Enable2, eGroupWare, and others. I argue that the creation of personal (blog) and collaborative (wiki plus document management) content, and its distribution (RSS, Atom), is a solved problem.

What is not a solved problem is how to connect teams or individuals within the organisation that are, unbeknown to each other, working with the same or similar ideas.

I wish to investigate the idea that auto-classification and automatic taxonomy generation may be useful to enable such teams to make contact. The basic idea is to be able to cluster groups of output (blogs, workspaces, etc.) that are discussing similar ideas.

The challenge in the enterprise is that there is simply too much content being generated for a human to follow it (one of my clients has 60,000 employees and some 2,500 active, funded projects).

Search is not the answer, because an individual team does not know that it needs to search.

I do not want to rely exclusively on existing (“top-down”) taxonomies. Chances are that if you have a taxonomy then you have a project, and if you have a project then there are existing processes that enable people to know about it and contribute to it.

I am interested in new projects and emerging ideas within the organisation, and how to bring together the team that can make them happen. This means that I am interested in “the taxonomy of tomorrow”, which is something you haven't formally built yet.

My theory is that automatic classification and taxonomy generation should be effective when applied within a single enterprise, as the vocabulary and topics will be fairly standard.

I do not wish to rely on authors creating their own categories. In my experience, people don't categorise. Getting anybody to document what they are doing is enough of a challenge without bringing up topics like “information architecture”.

That is why I am looking for automatic (unsupervised) text clustering. If I have somebody in London who has great ideas for my retail shops; somebody in Manchester who is experimenting with practical changes to my consumer stores; and a man in Glasgow who would like to promote change in our high-street outlets; how do I enable them to discover each other and work together?

I cannot use a standard text classifier on this because I do not have a training set.
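To make the idea concrete, here is a minimal sketch of unsupervised clustering, assuming a plain bag-of-words model with TF-IDF weights, cosine similarity, and single-link grouping. The example notes, the stopword list, and the 0.05 threshold are all hypothetical choices for illustration; a real deployment would need proper tuning and probably a dedicated library.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "and", "for", "in", "of", "our", "the", "to"}

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return [
        {t: count * math.log(n / df[t]) for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(docs, threshold=0.05):
    """Single-link clustering: join any two documents whose similarity
    reaches the threshold, then return groups of document indices."""
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical notes from London, Manchester, Glasgow, and two unrelated teams.
notes = [
    "Ideas for a new layout and faster checkout in our retail shops",
    "Trialling a new layout and faster checkout in two consumer stores",
    "Proposal to fund a new layout and checkout for high-street outlets",
    "Upgrade schedule for the payroll system of the finance team",
    "Payroll migration and upgrade plan for the finance team",
]
clusters = cluster(notes)
print(clusters)
```

In this toy data the three retail notes are linked only through the words around the topic (“layout”, “checkout”), not through the differing names for the shops themselves, which is exactly the kind of shared vocabulary the approach depends on.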

An alternative approach explored by people like Matt Mower of eVectors is to assume that 10-20% of people will classify and use that to automatically classify the rest. That is an interesting assumption and a well-understood problem (classify text based on examples) with well-documented solutions, from naive Bayes through neural networks and on to support vector machines and similar methods (listed here in roughly increasing order of performance).
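As an illustration of classifying from examples, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not Matt Mower's implementation; the function names, the labelled examples, and the category names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(labelled):
    """Build word and document counts from (text, category) pairs,
    i.e. from the minority of authors who do categorise."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in labelled:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary

def classify(text, model):
    """Return the category with the highest log posterior score."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in sorted(doc_counts):
        total_words = sum(word_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)  # log prior
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[category][word] + 1)
                    / (total_words + len(vocabulary))
                )
        if score > best_score:
            best, best_score = category, score
    return best

# Hypothetical labelled examples.
model = train([
    ("payroll upgrade for the finance team", "finance"),
    ("new checkout layout in the stores", "retail"),
])
print(classify("finance payroll migration", model))
print(classify("checkout queues in stores", model))
```

Note that a classifier of this kind can only ever return one of the categories present in its labelled examples.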

However, the issue is that you are always using “yesterday's taxonomy” to categorise. I am not very interested in this, because chances are that if you have a useful taxonomy then you have existing projects within the organisation dealing with the issues, and promoting existing (funded) projects within a company is a (largely) solved organisational problem.

I'm interested in “tomorrow's taxonomy” to bring together people around new innovative ideas. In the example above, assume that retail stores are a new idea and that the corporate terminology (“retail shops”, “consumer stores”, or “high-street outlets”) has not yet been embedded within the corporate culture. How can I bring together the idea-man in London with the guys in Manchester who can implement the ideas and the manager in Glasgow who can promote the change within the organisation and help it become a change project?

The assumption (which I want to test) is that there will be enough shared vocabulary and a sufficiently limited set of goals and topics that automatic text clustering can work within the single enterprise. The additional advantage is that it doesn't have to be perfect. I was working with an organisation that estimated that it took them about one year from ideas to funded projects. If we can change that average to 11 months, say by driving the time down to six months for the 17% of connections that we successfully make (which seems like an unambitious hope), then that organisation has a real and lasting advantage over its competitors.
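The arithmetic behind those figures can be checked with a short sketch. The function names are hypothetical; the model simply assumes that a fixed share of ideas takes the improved time and the rest take the baseline time.

```python
def blended_average(baseline_months, improved_months, share):
    """Average time to funding when `share` of ideas take the improved
    time and the remainder still take the baseline time."""
    return share * improved_months + (1 - share) * baseline_months

def required_share(baseline_months, improved_months, target_months):
    """Share of ideas that must reach the improved time for the overall
    average to hit the target."""
    return (baseline_months - target_months) / (baseline_months - improved_months)

# 17% of ideas at six months, the rest at twelve.
average = blended_average(12, 6, 0.17)
# Share needed to bring a 12-month average down to 11 months.
share = required_share(12, 6, 11)
print(round(average, 2), round(share, 3))
```

Under these assumptions, speeding up roughly one idea in six is enough to cut the average by a month.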

Ideas and suggestions would be most welcome.


