If you have been keeping up with Azure Speech Services, you will have noticed a few nice additions. Besides established features like Text-to-Speech and Speech-to-Text (tutorials for both can be found here), the service now offers several new features out of the box. Speech-to-Text and Speech Translation now work in real time, which helps in business scenarios such as call centre analysis and conversation transcription.

Take a look at the complete list of the updated Speech services here:

| Service | Feature | Description | SDK | REST |
|---|---|---|---|---|
| Speech-to-Text | Speech-to-text | Speech-to-text transcribes audio streams to text in real-time that your applications, tools, or devices can consume or display. Use speech-to-text with Language Understanding (LUIS) to derive user intents from transcribed speech and act on voice commands. | Yes | Yes |
| Speech-to-Text | Batch Transcription | Batch transcription enables asynchronous speech-to-text transcription of large volumes of data. This is a REST-based service, which uses the same endpoint as customization and model management. | No | Yes |
| Speech-to-Text | Conversation Transcription | Enables real-time speech recognition, speaker identification, and diarization. It’s perfect for transcribing in-person meetings with the ability to distinguish speakers. | Yes | No |
| Speech-to-Text | Create Custom Speech Models | If you are using speech-to-text for recognition and transcription in a unique environment, you can create and train custom acoustic, language, and pronunciation models to address ambient noise or industry-specific vocabulary. | No | Yes |
| Text-to-Speech | Text-to-speech | Text-to-speech converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Choose from standard voices and neural voices (see Language support). | Yes | Yes |
| Text-to-Speech | Create Custom Voices | Create custom voice fonts unique to your brand or product. | No | Yes |
| Speech Translation | Speech translation | Speech translation enables the real-time, multi-language translation of speech to your applications, tools, and devices. Use this service for speech-to-speech and speech-to-text translation. | Yes | No |
| Voice-first Virtual Assistants | Voice-first virtual assistants | Custom virtual assistants using Azure Speech Services empower developers to create natural, human-like conversational interfaces for their applications and experiences. The Bot Framework’s Direct Line Speech channel enhances these capabilities by providing a coordinated, orchestrated entry point to a compatible bot that enables voice in, voice out interaction with low latency and high reliability. | Yes | No |

Source: Microsoft Docs
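
The table mentions that Text-to-speech takes its input as Speech Synthesis Markup Language (SSML). To give a feel for what such a document looks like, here is a minimal sketch that builds one with Python's standard library only. The helper name `build_ssml` is hypothetical, and the default voice name and language are assumptions chosen for illustration; check the Language support page for the voices actually available to you. The sketch only produces the SSML string and does not call the speech service.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Return a minimal SSML document as a string.

    `voice` and `lang` are illustrative defaults (assumptions),
    not values taken from the article.
    """
    # Root <speak> element with the SSML namespace and language.
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    # <voice> selects which synthetic voice should read the text.
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    # ElementTree escapes characters such as & and < on serialization.
    voice_el.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello from the speech service!"))
```

Building the document with an XML library rather than string concatenation means that special characters in the input text are escaped automatically, so the resulting SSML stays well-formed.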

Find the documentation here for Conversation Transcription, Call Center Transcription, and Voice-first Virtual Assistants, and stay tuned for more tutorials!


