I am using Azure neural TTS voices via the Python Speech Services module and trying to get a custom lexicon applied. Yes, I've spent hours reading and trying things already.
I've read that the lexicon file must be stored in Azure Blob Storage or GitHub. I've created blob storage and ensured it is anonymously readable. I get audio output, but the phrase "BTW" in the SSML is pronounced as "By the way", which is the built-in default alias, and not the one I provided in my lexicon.
Publicly readable lexicon file:
<?xml version="1.0" encoding="utf-8"?>
<lexicon xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd" version="1.0" alphabet="ipa" xml:lang="en-US" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">
<lexeme>
<grapheme>BTW</grapheme>
<alias>By the flippin' way</alias>
</lexeme>
</lexicon>
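Before uploading, it may help to confirm that the lexicon file is well-formed and contains the expected entry. The following is a minimal sketch using only the Python standard library; the helper name `list_lexemes` and the `SAMPLE` constant are hypothetical and not part of any Azure SDK. It only checks the XML locally and says nothing about whether a given voice honors the lexicon.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <lexicon> root of a PLS document.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# The lexicon from the question, reproduced as a string (illustrative constant).
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<lexicon xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd" version="1.0" alphabet="ipa" xml:lang="en-US" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">
<lexeme>
<grapheme>BTW</grapheme>
<alias>By the flippin' way</alias>
</lexeme>
</lexicon>
"""


def list_lexemes(pls_xml: str) -> list[tuple[str, str]]:
    """Return (grapheme, replacement) pairs found in a PLS lexicon document."""
    # Parsing bytes lets the parser honor the encoding in the XML declaration.
    root = ET.fromstring(pls_xml.encode("utf-8"))
    if root.tag != f"{{{PLS_NS}}}lexicon":
        raise ValueError(f"unexpected root element: {root.tag}")

    replacement_tags = (f"{{{PLS_NS}}}alias", f"{{{PLS_NS}}}phoneme")
    pairs = []
    for lexeme in root.findall(f"{{{PLS_NS}}}lexeme"):
        graphemes = [g.text or "" for g in lexeme.findall(f"{{{PLS_NS}}}grapheme")]
        # A lexeme supplies its replacement as <alias> or <phoneme> children.
        replacements = [e.text or "" for e in lexeme if e.tag in replacement_tags]
        for grapheme in graphemes:
            for replacement in replacements:
                pairs.append((grapheme, replacement))
    return pairs


if __name__ == "__main__":
    for grapheme, replacement in list_lexemes(SAMPLE):
        print(f"{grapheme!r} -> {replacement!r}")
```

If the root element or namespace is wrong, the function raises `ValueError`; malformed XML raises `xml.etree.ElementTree.ParseError`.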
SSML:
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US">
<voice name="en-US-EmmaNeural">
<lexicon uri="https://<mynamespace>.blob.core.windows.net/ttsfiles/lexicon3.xml"/>
The phrase is: BTW
</voice></speak>
- The namespace is redacted.
- I increment the numeric suffix in the file name to get around the 15-minute caching rule.
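For reference, SSML like the above can be assembled and sent from Python roughly as follows. This is a sketch under assumptions: it presumes the `azure-cognitiveservices-speech` package, and the helper names `build_ssml` and `synthesize`, along with the key and region parameters, are illustrative rather than taken from the question. The SDK import is deferred so the SSML builder can be exercised without the SDK installed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(voice: str, lexicon_uri: str, text: str) -> str:
    """Build an SSML document that references a custom lexicon by URI."""
    ssml = (
        '<speak xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" '
        'version="1.0" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        f"<lexicon uri={quoteattr(lexicon_uri)}/>"
        f"{escape(text)}"
        "</voice></speak>"
    )
    # Raises ParseError early if escaping or nesting went wrong.
    ET.fromstring(ssml)
    return ssml


def synthesize(ssml: str, key: str, region: str) -> bytes:
    """Send SSML to the Speech service and return the audio bytes."""
    # Imported lazily; requires the azure-cognitiveservices-speech package.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    # audio_config=None keeps the result in memory instead of playing it.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=config, audio_config=None)
    result = synthesizer.speak_ssml_async(ssml).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        raise RuntimeError(f"synthesis canceled: {details.reason}: {details.error_details}")
    return result.audio_data
```

Changing the file name in `lexicon_uri` on each attempt, as described in the second note above, is then a one-argument change to `build_ssml`.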
2 Answers
After reading the docs more closely, I found that lexicons are not supported for the specific neural voices I was using. Speech Studio is helpful for debugging this.
According to the documentation, lexicon URLs can point to Azure Blob Storage.
AFAIK, there is no need to store the lexicon file in Azure Blob Storage for TTS.
You can use SSML in Azure Cognitive Speech with Azure Storage.
Below is a Python code example for integrating the audio synthesis result with Azure Storage using Azure Text-to-Speech (TTS) with SSML:
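A minimal sketch of the storage half, assuming the `azure-storage-blob` package and audio bytes already produced by the Speech SDK (for example the `audio_data` of a synthesis result). The helper names `upload_audio` and `make_container_client`, and any container or blob names passed to them, are hypothetical. The container client is passed in as a parameter so the upload logic can be tried without a real storage account.

```python
def make_container_client(connection_string: str, container: str):
    """Create a container client from a storage connection string."""
    # Imported lazily; requires the azure-storage-blob package.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_container_client(container)


def upload_audio(container_client, blob_name: str, audio: bytes) -> None:
    """Upload synthesized audio bytes to a blob, replacing any existing blob."""
    if not audio:
        # An empty result usually means synthesis failed upstream.
        raise ValueError("no audio data to upload")
    container_client.upload_blob(name=blob_name, data=audio, overwrite=True)
```

With real credentials, usage would look something like `upload_audio(make_container_client(conn_str, "ttsfiles"), "output.wav", audio_bytes)`, where `conn_str` and `audio_bytes` are supplied by the caller.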
For an alternative approach, refer to this document to set up storage for the Speech resource.