Enterprise CRM 2.0 - integrating Voice

We talk a lot about new forms of voice & messaging. We also talk quite a lot about context, and about Web service APIs. And we talk a great deal about the importance of the enterprise market. Now we're going to put them all together in a sort of Telco 2.0 'Turducken'. Our 2008 Voice & Messaging 2.0 strategy report concentrated on new communications services for consumers; we're now working on another report, whose working title has evolved from "Call Centre 2.0" to "Customer Care 2.0". It's all about how new voice technology can change the way organisations and their customers interact, for mutual pleasure and profit.

On the way, however, we've noticed an interesting technical problem.

The doughnut model - value is on each side of the call

First of all, let's recap. It's a fundamental Telco 2.0 principle that the price of voice, or at least the margin on it, is heading inexorably towards zero; whether through VoIP of various kinds, or just through ever-larger bundles at static or falling prices, the core product is in crisis. We argue that the scarcity, and therefore the value, amid this voice glut is to be found in the context of calls - their latent content, the social meaning of the call, which isn't necessarily captured in the actual speech transferred over the bearer channel.

If you know someone really well, this doesn't matter; but for all other calls, there's no user interface for the shadow of context. Context includes the reason for your call, the caller's and recipient's social graphs, their locations, their self-described status, their place in organisations... all the stuff that you spend telephone calls not talking about. This information is currently latent: invisible, hard to quantify or work with, and economically sterile. In the future, though, we think it will be the main source of economic value in telephony - which means thinking about calls in terms of what happens before and after the call.

But there's currently no interface for capturing this data. As Thomas Howe put it at November's Telco 2.0 event:

[Image: t-howe-1.png]

Your call is important to us, but not important enough to answer

It's especially bad in enterprise applications - every day, literally millions of people spend their time either reciting the same information to successive call-centre agents or asking successive customers the same questions, then typing it into a computer. Quite often, the caller reads the information they give to the agent off a computer screen themselves, a computer that is in all probability connected to the other one by worldwide internetworks; the immense technological ingenuity of the Silicon Age is being expended so that two civilised human beings can pretend to be a pair of $5 Ethernet cards and a length of Cat5 cable. Meanwhile, other hordes of workers are busy trying their best to call people who don't wish to receive their calls and are probably less likely to buy from them if they do receive them, while still more people endeavour to avoid them.

Yet others are constantly interrupted by content-free communication, because of the high social status of telephony compared to, say, e-mail; it simply requires more emotional and mental investment to deal with a telephone call than an instant message/e-mail/RSS update/whatever, but unfortunately we have the social custom that calls must be answered, sight unseen. Probably only voicemail, a bastard offshoot of telephony which is implemented very poorly almost everywhere in the world, contains less actual information per unit of human intelligence consumed than the great majority of telephone calls.

So it's urgent to provide ways to insert voice into new contexts - for example, by creating a Web services API that developers of communications-enabled business processes (CEBP) can use in other people's business processes. But this is necessary rather than sufficient: the end of the telco voice premium means that simply generating more minutes is futile.

In fact, because of some of the "features" of telephony we mentioned above - the invisibility of context, the hegemony of the caller, and the difficulty of processing speech compared to text - generating more minutes may even be harmful. It's precisely because existing telephony transfers all the content but none of the context that so many calls involving a contact centre are muda - useless work. And the last thing we want to do is encourage that.

What if telephone numbers had more options?

Think of a telephone number. As Thomas Howe said at the last Telco 2.0 event, it's essentially a little program, or rather an API provided by the telephone network, that takes a maximum of three arguments - the line number, area code, and international prefix - and does one thing: it initiates a telephone call to the number passed in. If the number is assigned to a GSM device, there is also the option of sending an SMS. The call is always initiated immediately; there is no provision for checking whether the other party wants to receive it or is able to receive it. The source of the call is now usually passed to the recipient, but this feature is still not very reliable, and anyway it only provides a line identifier, not a user identity. Beyond that, the number provides only the most limited location information for fixed lines and none for mobile. And it is stateless - there is no way of referring to a past call.
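To make the "little program" idea concrete, here is a rough sketch of that API in Python. It is purely illustrative: `TelephoneNumber` and `place_call` are names we have made up, not anything a real network exposes, and the number used is a fictional one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TelephoneNumber:
    """The (at most) three arguments the network accepts today.

    Field names are illustrative assumptions, not a real network API.
    """
    line_number: str
    area_code: str = ""
    international_prefix: str = ""

    def dialled_digits(self) -> str:
        # Concatenate prefix, area code and line number into one digit string.
        return f"{self.international_prefix}{self.area_code}{self.line_number}"


def place_call(number: TelephoneNumber, caller_line: Optional[str] = None) -> dict:
    """Hypothetical stand-in for 'initiate a call to the number passed in'.

    There is no availability check, no user identity, no location and no
    call reference: the only thing that can happen is that the phone rings.
    """
    return {
        "action": "ring",
        "to": number.dialled_digits(),
        "caller_line_id": caller_line,  # a line identifier, not a user identity
    }


# Example with a fictional UK number.
call = place_call(TelephoneNumber("79460000", "20", "44"), caller_line="442079460001")
```

Everything missing from the returned dictionary (who is calling, why, whether now is a good time, what was said last time) is exactly the context the rest of this post is about.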

Clearly, whatever replaces this will have to take more arguments; we'll need to be able to pass a user ID distinct from the line ID, to send and request a current status message, to do the same with location and social-graph information, and to create a unique reference for the call. This is vital to make the ideas in this post on Skype happen. Fortunately, SIP and XMPP provide quite a lot of scope for passing this sort of information; some of the message profiles we've just looked at already exist, and it's easy enough to create user-defined ones for the others. And, crucially, both are asynchronous, so you don't have to start a voice call in order to get context information.
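As a thought experiment, the richer set of arguments might look something like the Python sketch below. This is not a SIP or XMPP message format; `ContextRequest`, `answer_context_request` and all the field names are hypothetical, chosen only to mirror the list above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ContextRequest:
    """A hypothetical request for context, sent before (or instead of) a call."""
    requester_user_id: str                     # who is asking: a user, not a line
    target_user_id: str                        # who they want to reach
    line_id: Optional[str] = None              # the line is now optional detail
    wanted: Tuple[str, ...] = ("status", "location", "social_graph")
    offered: Dict[str, str] = field(default_factory=dict)  # requester's own context
    # Unique reference so later messages can refer back to this conversation.
    call_ref: str = field(default_factory=lambda: uuid.uuid4().hex)


def answer_context_request(request: ContextRequest,
                           known_context: Dict[str, str]) -> Dict[str, object]:
    """Return whichever requested items are known. No voice call is started."""
    context = {key: known_context[key]
               for key in request.wanted if key in known_context}
    return {"call_ref": request.call_ref, "context": context}


request = ContextRequest(
    requester_user_id="agent@bank.example",
    target_user_id="customer@mail.example",
    offered={"reason": "your card renewal"},
)
reply = answer_context_request(request, {"status": "in a meeting until 3pm"})
```

The point of the sketch is the shape, not the detail: the exchange carries a user ID, a reason and a reference, and it completes without anybody's phone ringing.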

Again, let's look back to Thomas Howe's presentation at the November Telco 2.0 event...

[Image: t-howe-2.png]

But there are some big unanswered questions here. For a start, a lot of the information involved falls under "plutonium" rather than "potatoes". Obviously, it's best if the user retains control of this, so the client application will have to be able to apply different rules to different requesters. Another question is how other applications and business processes interacting with these systems will keep track of conversations - the Web eventually standardised on the use of cookies, but this is widely recognised as an imperfect hack. Of course, migrating from plain E.164 telephone numbers to some sort of user ID which maps to a network location will help with this - a line doesn't necessarily equal a customer, after all.
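The "rules per requester" idea can be sketched very simply. The following Python is a minimal illustration under our own assumptions: the function name `release_context`, the rule table and the example identities are all invented, and a real client would need far richer policies.

```python
from typing import Dict, FrozenSet, Mapping

# Hypothetical rule table kept on the user's own client:
# requester ID -> the context fields that requester is allowed to see.
RULES: Dict[str, FrozenSet[str]] = {
    "partner@home.example": frozenset({"status", "location", "social_graph"}),
    "agent@bank.example": frozenset({"status"}),
}


def release_context(requester: str,
                    context: Mapping[str, str],
                    rules: Mapping[str, FrozenSet[str]] = RULES,
                    default: FrozenSet[str] = frozenset()) -> Dict[str, str]:
    """Return only the fields this requester may see.

    Unknown requesters get the default, which here is nothing at all:
    the "plutonium" stays locked up unless the user has said otherwise.
    """
    allowed = rules.get(requester, default)
    return {key: value for key, value in context.items() if key in allowed}


my_context = {"status": "busy", "location": "Nice", "social_graph": "friends list"}
for_bank = release_context("agent@bank.example", my_context)
for_stranger = release_context("coldcaller@spam.example", my_context)
```

Note that the decision is made on the user's side of the conversation, which is the design choice the paragraph above argues for.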

Can telephony be better than the Web?

Do we need a URL for conversations? A search engine? There's an obvious issue in that telephony is a very high-traffic application, and the namespace would be somewhat crowded; however, this could to some extent be dealt with by delegation, so that conversations would be referenced by the participants. And the opportunities would be spectacular, both on the organisation side in CRM applications and on the individual (or community) side in VRM ones. (Which raises the question - is it time to think about IRM, Integrated Relationship Management, the intersection between CRM and VRM?)
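One way to picture a delegated namespace is a reference made of a participant's ID plus a local reference that only needs to be unique for that participant. The Python below is a sketch under that assumption; the `conversation:` scheme, the function names and the format are all invented for illustration, and it assumes participant IDs never contain a "/".

```python
from typing import Tuple

SCHEME = "conversation"  # hypothetical scheme name, not a registered one


def make_conversation_ref(participant_id: str, local_ref: str) -> str:
    """Build a reference that is unique as long as each participant keeps
    their own local references unique: the namespace is delegated to them."""
    return f"{SCHEME}:{participant_id}/{local_ref}"


def parse_conversation_ref(ref: str) -> Tuple[str, str]:
    """Split a reference back into (participant_id, local_ref)."""
    scheme, _, rest = ref.partition(":")
    if scheme != SCHEME or "/" not in rest:
        raise ValueError(f"not a conversation reference: {ref!r}")
    participant_id, _, local_ref = rest.partition("/")
    return participant_id, local_ref


ref = make_conversation_ref("customer@mail.example", "2009-0042")
owner, local = parse_conversation_ref(ref)
```

A CRM system and a VRM tool could then both file notes against the same reference, which is roughly what an "IRM" would need.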

[Ed. - We'll be covering this topic in more depth over the next few months and sharing new analysis at the 6th Telco 2.0 'world' event on 6-7 May in Nice.]