Watch the above video. Then watch it again, but close your eyes. Listen carefully to the voice making a restaurant reservation.
Duplex — Google’s artificially intelligent chat agent that can arrange appointments over the phone — has started rolling out to a “small group” of Google Pixel phone owners in select cities (Atlanta, New York City, Phoenix, and San Francisco). For now, the feature only works in English, with some restaurants, and can’t handle any other businesses that take appointments.
As news of the feature slowly becoming available has spread, a lot of debate has focused on whether it’s worth the effort. As many have pointed out, it seems faster to just call the restaurant yourself than to enter everything Google Assistant requires and wait for a confirmation. There are plenty of scenarios where this is useful, though: if you have a speech impediment or social anxiety when making phone calls, if you are somewhere you can’t place a call, if the restaurant is closed when you want to make the reservation, and so on.
I want to focus on the other hotly discussed part of the news: the Google Duplex voice. Many can’t get over just how humanlike it sounds, although I’ve watched the video so many times that I’ve convinced myself it doesn’t sound human.
If you listen very closely, you will notice “mistakes” in how the Duplex AI speaks. I put mistakes in quotes because I’m not entirely sure Google wants the technology to perfectly mimic how a human assistant would conduct the conversation.
What Duplex actually says sounds extremely believable, especially the multiple thank-yous and the “ba-bye” at the end. But you can tell that something is a bit off if you pay attention to the pauses. They are a little too long, especially at the very beginning and at the end. At the start, a human might fill a gap like that with an “umm” or an “uhh,” out of respect for the person on the other side. At the end, it’s clear Duplex isn’t going to hang up first (until it gets some sort of confirmation, anyway).
That’s what I’m calling “mistakes.” But I don’t know if Google is striving for perfection. And frankly, I don’t think it should be.
Getting a conversational AI’s voice to not sound robotic makes sense — it’s simply more pleasant and comfortable to talk to. But having it perfectly replicate what a human would do? That’s simply too much of a good thing.
Disclosure and transparency
In this Duplex ad from earlier this year, here is how the voice introduced itself:
“Hi! I’m the Google Assistant calling to make a reservation for a client. This automated call will be recorded.”
In the call we recorded, the wording has changed slightly, removing the part that makes it crystal clear this is not a human calling:
“Hi, I’m calling to make a reservation for a client. I’m calling from Google, so the call may be recorded.”
I’m sure Google is still iterating here — the wording will likely change a few more times. The team could in fact be A/B testing multiple versions.
But there’s a reason this disclosure is in here. You’ll remember that Google received a ton of criticism after its initial Duplex demo in May — many were not amused that Google Assistant mimicked a human so well. In June, the company promised that Google Assistant using Duplex would first introduce itself.
This is a double-edged sword. If Duplex gets things wrong and screws up the conversation, it makes Google look bad. If Duplex tries too hard to act human, it comes off as creepy and … makes Google look bad.
The trick is to strike a perfect balance: accurate and intelligent, but also transparent and honest.
While Duplex is a user-facing feature, currently exclusive to Pixel phones, it is ultimately businesses that interface with the conversational AI. That’s the part Google can’t screw up. The company has to walk that tightrope carefully or the whole experience will come crashing down.
More videos to come
We may have recorded the first video of Duplex in action, but I suspect this is going to birth a whole genre of new content.
Duplex is going to mess up, and it will be hilarious. Duplex is going to make serious mistakes, and it will be concerning. Duplex is going to get things too right, and it will be scary.
But hey, at least the internet will document it with plenty of videos.
ProBeat is a column in which Emil rants about whatever crosses him that week.