Sunday, October 16, 2011

The iPhone, Siri, and the Turing Test


We can start counting the days until computers routinely win a Turing Test. It will happen for two reasons.

One reason, which is the basis for a post I wrote on the Turing Test earlier in the year, is that we are meeting the computers halfway. The more we become twittering, texting beings, the easier it is for a computer to mimic us, because we are stripped of much of our human context and behave more like computers.

The second reason is now readily apparent with the unveiling of Apple's iPhone 4S and Siri, the digital assistant. With iPhone users turning to Siri to find restaurants, make appointments, and ask trivia-level questions (and with more areas of interaction added down the road), Apple's servers are going to amass the queries of millions of people many times every day. And as Google has shown with Google Translate, if a computer has enough raw material, it can pretty much figure this sort of thing out.

So as this database grows by orders of magnitude and the logic is refined accordingly, imagine a Turing Test fashioned to distinguish a computer from a person in the day-to-day tasks of working with a personal assistant. In one room is hidden an iPhone, in another room a person. You interact with each as you would an executive assistant over the course of the day, and at the end of the day you choose which one you think is the person. It is only a matter of time before the iPhone becomes indistinguishable from the human. In fact, to keep it from standing out, the iPhone will have to be dumbed down.

In this respect, Apple's move toward a voice interface is brilliant. For one thing, no matter how well you do it, using a touchscreen on a phone is cumbersome. And although we have grown accustomed to it, as we have to the desktop mouse and laptop touch pad, this isn't really the way we do things in life. Furthermore, hardware need only go so far. It is not as if smartphone users are trying to model fluid dynamics. But while the hardware improvements at this point are marginal, for Siri it is open-field running. More and more sites can be added (Travelocity, Fandango, and whatnot), sites will be optimized for Siri, and new sites will pop up specifically for Siri. Logic and voice recognition will improve, and the move toward the iPhone as a conversational partner will accelerate.

There already is an annual Turing Test underway, the Loebner competition, where a set of judges spends a few minutes conversing (via keyboard) with computers and with people, and then has to decide which is which. It is not a great test, because it is a competition rather than a normal human environment. The judges are trying to weed out the computer through types of questions and cadence of conversation in ways they wouldn’t in real life. A more reasonable Turing Test would be to invite a computer into a round of dinner conversations where the human subjects are not made aware that this is occurring. (They would all have to be remote conversations, for obvious reasons.) After the fact, subjects are told that some of their companions might have been computers, and only then are they asked to rank the guests by “humanness.”

A Personal Assistant Turing Test will be something like a mid-term. Computers may get to the final exam, but they will still have a ways to go. Free-ranging dinner conversation puts the bar high, because it brings in context and give-and-take. The low bar, sort of the test for remedial work, is one-liner text, or invective-laden argument, where the objective is to rant while ignoring anything the other person is saying. I go through a classic and humorous example of this in my other post. On the continuum from context-rich, intelligent conversation toward the increasingly vacuous – e-mail exchanges, online chat, and finally twittering – the digital assistant leans toward the latter. Its conversation is close to stateless, because each command is unanchored from all but the last inquiry and the information provided up to that point. One rung up is something like cocktail party chit-chat of the “do you know so-and-so,” “have you ever been to wherever” variety. For that, I think the iPhone and Siri will be able to shine. Together they can know just about everyone and every place.

So if you love your iPhone now, just wait until you can chat with it over a couple of drinks.

3 comments:

  1. I have to think that the version of the Turing Test of serious interest to most people is the one in which we do our best to distinguish between AI and humans, not the one in which we pay attention to other stuff, drink some cocktails, and then rack our addled memories for details that might tip us off. The vodka-ed down version you propose has probably already been passed by the automated voice on Verizon, among many other Siri precursors.

    Google Translate is great, don't get me wrong. But use it 100 times for any given language, and you'll see its limitations. There's a wall there, and it's somewhere right near where you talk about cocktail chatter (of the 'do you know so-and-so' type, as you put it) being 'stateless' or 'context-free.' Do you always answer those questions in the negative? If you answer them positively--ah, yes, old so-and-so--is your following statement drawn only from statistical analysis of previously occurring strings?

    Chomsky's review of Verbal Behavior is worth taking a peek at, since it addresses almost exactly the same claims put forward by Skinner over 50 years ago. I don't hold it up as pre-ordained truth: we'll know whether you're right about Siri soon enough. But he's been right up until now in non-vodka-ed-down environments.

  2. And then what?

    Sing me,
    John Henry

  3. An essay in Communications of the ACM (Dec 2012) asks the question: Is it time to move beyond the test?

    http://cacm.acm.org/magazines/2012/12/157871-moving-beyond-the-turing-test/fulltext
