Today’s voice transcription services are just the tip of the iceberg when it comes to potential applications for speech technology in the enterprise. Conversational artificial intelligence, hands-free mobile applications and immersive collaboration tools are among the emerging trends, according to Jon Arnold, principal of J Arnold & Associates.
“Artificial intelligence (AI) is driving all forms of technology now, and where AI goes, speech tech will follow,” said Arnold at an Enterprise Connect session, “Where Speech Technology is Today for Enterprises, and Where It’s Heading.” However, he cautioned that voice is a means, not an end, “so you can’t view speech tech in a vacuum.”
Currently, four core enterprise speech tech applications focus on collaboration and productivity, Arnold said: speech-to-text for meetings, automatic speech recognition (ASR) for virtual assistants, ASR for conversational analytics, and real-time translation.
When enterprises add AI to these applications, speech-to-text remains the dominant use case, according to Arnold. Text-to-speech and sentiment analysis are also popular, although their adoption rates are much lower. Additional use cases for speech tech include:
• Web conferencing transcriptions
• Customer experience analytics
• Subtitling and closed captioning
• Professional transcriptions
• Communications and media monitoring
• Voicemail and legal transcriptions
AI can add value to all these applications, said Arnold. However, there are many other ways to put AI-driven speech tech to work in the enterprise, particularly when supporting frontline and field-based workers. For example, ASR applications can give frontline workers real-time access to knowledge experts, or provide touchless access to applications and workflows.
AI can also play an important role in collaboration applications, such as using voice search for key phrases and ideas during a meeting, managing compliance requirements, implementing voice-identity biometrics, and providing multiple language translations to support global teams.
“Beyond collaboration, speech technologies can extract new value from conversations across the organization,” Arnold said. “Recognizing their importance, most vendor offerings now are building their solutions around vertical markets, such as legal, sales teams, healthcare, retail, education or government.”
Conversational artificial intelligence (CAI) is one of the hot trends of 2022, Arnold said. By combining natural language processing tools with software applications like chatbots and voice assistants, these speech tech solutions can develop the ability to interact with customers beyond a predetermined script through progressive training.
Arnold said the next stage for speech technology is immersive collaboration, and major players are already investing in these applications, such as Mesh for Microsoft Teams or Meta’s Horizon Workrooms. “This is all about the future of work and digital transformation,” he said. “Whatever form immersive collaboration takes, voice will be central to adoption.”
As enterprises roll out new applications for speech tech, Arnold said they need to be careful to address privacy issues, such as unintentional tracking of conversations. The applications also need to be designed to support worker productivity, for example by automating routine voice workflows.
Organizations also need to be careful to minimize biases in training AI systems, and incorporate security features, Arnold said. “Technology is neutral, but AI bias complicates matters,” he added. “With innovation, comes both good and bad actors, so enterprises need to be aware of security issues such as validating what’s real and authentic to avoid deep fakes. That will be one of the fundamental IT challenges for managing the road ahead.”