As a voice of countless telephone systems globally, and an Amazon Alexa upgrade voice, I’m actually thrilled about the bias:
Women are preferred over men for voicing automation platforms several times over. And while there are many men who expertly voice telephony platforms and AI personas, women are generally preferred as a “disembodied” voice.
But should I be happy with that preference?
There’s a line of thought that the idea of issuing commands to an obedient and compliant unseen female voice is perpetuating gender roles. Chandra Steele in PC Magazine wrote in 2018:
“…one might think that using an emotionless AI as a personal assistant would erase concerns about outdated gender stereotypes. But companies have repeatedly launched these products with female voices and, in some cases, names. But when we can only see a woman, even an artificial one, in that position, we enforce a harmful culture.”
The goal – in development of most AI/assistant voices – is capable, efficient tones. A speech scientist working on the Cortana project may have said it best: A Microsoft spokesperson said Cortana can technically be genderless, but the company did immerse itself in gender research when choosing a voice and weighed the benefits of a male and female voice. “However, for our objectives—building a helpful, supportive, trustworthy assistant—a female voice was the stronger choice.”
But it needs to be pointed out that the female voice prized by most applications is clearly *not* the stronger choice. She is docile, obedient, and has a distinct lack of agency. Watson? He’s take-charge: clipped, shorter sentences, more in command.
How responsible should we – those of us who work in the speech design space – be for perpetuating some unsavory stereotypes? The goal has always been to create a calm, reassuring, and pleasant essence in a voice…and if those qualities are embodied by females, where’s the harm?
The first voice we hear – nay, pick up on vibration – is our mother’s. In the primary school grades, we are typically taught by females. Friends’ mothers often become our ersatz surrogates when our own mothers aren’t around. And these females – who guide, teach, protect, and advise – embed in us a level of trust. The female voice, in its soothing tones and nurturing guidance, has guided us all practically from zygotes. The female voice is inherently trusted, familiar, and embraced.
It’s no wonder it’s the default choice when the objective is to create an AI essence with warm, reassuring, capable tones.
Attempts were made (and continue to be made) to build into automation voices a set of “defense” prompts, which can derail the inevitable intrusive, lewd, and emotion-venting prompts which get fired at them. (Siri’s sassy comebacks come to mind…)
These retorts give the automated voice a sense of “self”: she is quick with the comeback and can take care of herself. But when all is said and done, she is a servant, there to accommodate any and all requests without objection or opinion.
I think it’s essential that automation voices be present but not intrusive; that they are there when they’re needed (and invisible when they’re not), and that they’re quick to serve us efficiently and make an equally rapid exit when we don’t need them.
The preference for the female voice has set up a situation in which females are the receptacles of commands and orders. That is a slippery slope created by the automation industry, and the extent to which it gets perpetuated is a debate between utility and efficiency on one side and setting the tone for respect on the other.