Azure - RuntimeError with Microsoft's Text-to-Speech on Google Cloud Run

Nori
July 21, 2023
129 views
1 vote
2 Answers

I am experiencing an issue with running Microsoft’s text-to-speech on Google Cloud Run. The problem arose suddenly last night and I’ve been getting the following error:

 Traceback (most recent call last):
  File "/code/app/./speech/backend.py", line 42, in save_text_to_speech
    speech_api.speech()
  File "/code/app/./speech/speech_api.py", line 266, in speech
    synthesizer = SpeechSynthesizer(speech_config=self.speech_config, audio_config=audio_config)
  File "/usr/local/lib/python3.9/site-packages/azure/cognitiveservices/speech/speech.py", line 1598, in __init__
    self._impl = self._get_impl(impl.SpeechSynthesizer, speech_config, audio_config,
  File "/usr/local/lib/python3.9/site-packages/azure/cognitiveservices/speech/speech.py", line 1703, in _get_impl
    _impl = synth_type._from_config(speech_config._impl, None if audio_config is None else audio_config._impl)
RuntimeError: Runtime error: Failed to initialize platform (azure-c-shared). Error: 2153

The error occurs when I try to execute synthesizer.speak_ssml(). Here is the related code:

audio_config = AudioOutputConfig(filename=file_name)
synthesizer = SpeechSynthesizer(speech_config=self.speech_config, audio_config=audio_config)
synthesizer.speak_ssml(self.input_data['text'])

Interestingly, this issue doesn’t occur in my local environment. Additionally, if I build the image locally and deploy it to Cloud Run, I don’t encounter this error.

My local environment is:

MacOS 11.6.2
Docker v20.10.10

However, when I build it with CloudBuild and deploy it to Cloud Run, I get the above error. I have tried the following to resolve it:

Clearing the kaniko cache
Switching from ‘kaniko’ to ‘gcr.io/cloud-builders/docker’

Neither of these attempts resolved the issue. Considering the circumstances under which the error occurs, I suspect there might be a problem with CloudBuild, but I can’t pinpoint the exact cause. If there are any other potential solutions I could try, I would greatly appreciate your advice.

Update 2023-07-18

FROM python:3.11

WORKDIR /app


RUN apt-get update && \
    apt-get install -y build-essential libssl-dev ca-certificates libasound2 wget && \
    wget -O - https://www.openssl.org/source/openssl-1.1.1u.tar.gz | tar zxf - && \
    cd openssl-1.1.1u && \
    ./config --prefix=/usr/local && \
    make -j $(nproc) && \
    make install_sw install_ssldirs && \
    ldconfig -v && \
    export SSL_CERT_DIR=/etc/ssl/certs && \
    cd ../ && \
    rm -rf openssl-1.1.1u && \
    pip install --no-cache-dir azure-cognitiveservices-speech==1.30.0
COPY . /app
CMD ["python3", "app.py"]


import os
import azure.cognitiveservices.speech as speechsdk

# This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
print('key:'+os.environ.get('SPEECH_KEY'))
print('region:'+os.environ.get('SPEECH_REGION'))
# The language of the voice that speaks.
speech_config.speech_synthesis_voice_name='en-US-JennyNeural'

speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Synthesize a fixed string to the default speaker.
text = "Hello world!"

speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized for text [{}]".format(text))
elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = speech_synthesis_result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
            print("Did you set the speech resource key and region values?")

The code above is from the Microsoft documentation.

Error

Speech synthesis canceled: CancellationReason.Error
Error details: Connection failed (no connection to the remote host). Internal error: 1. Error details: Failed with error: WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED
wss://southeastasia.tts.speech.microsoft.com/cognitiveservices/websocket/v1
X-ConnectionId: c4955d953f8e480c906061e6219eb8fd USP state: Sending. Received audio size: 0 bytes.
Did you set the speech resource key and region values?

I had set SPEECH_KEY and SPEECH_REGION, and both values were printed to the console, but I still got the error. Any help would be appreciated.

Answers

Chosen as BEST ANSWER
- Nori
- July 21, 2023 at 7:06 am
- 0 votes
I was able to resolve this issue by reaching out to Microsoft. I will share the solution here.

When I enabled logging for TTS, the following error appeared in the log:
```
[342629]: 104ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:691 error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed
```
This error means that certificate verification failed while establishing the TLS session. The operating system can be a factor: the error can occur when the location where the Speech SDK expects to find certificates does not match the location where the OS actually stores them.
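
To spot this failure quickly in a longer log, a small filter like the one below can help. It is only a sketch: the function name `find_cert_errors` is hypothetical, the pattern is taken from the single log line quoted above, and the second sample line is made-up filler for illustration.

```python
import re

# Pattern based on the log line quoted above (OpenSSL certificate verification failure).
CERT_ERROR = re.compile(r"certificate verify failed", re.IGNORECASE)


def find_cert_errors(log_text):
    """Return the log lines that report a TLS certificate verification failure."""
    return [line for line in log_text.splitlines() if CERT_ERROR.search(line)]


sample = (
    "[342629]: 104ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:691 "
    "error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed\n"
    # The next line is invented filler, not real SDK output.
    "[342629]: 105ms some unrelated log line\n"
)

for line in find_cert_errors(sample):
    print(line)
```

If the filter returns any lines, the connection problem is most likely a certificate lookup issue rather than a wrong key or region.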

The "python:3.9" base image used in the Dockerfile is built on Debian (bookworm), so certificate handling has to be adjusted for that OS layer.
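
Before changing anything, it may help to check which certificate directories actually exist inside the container. The sketch below is an assumption-laden diagnostic, not part of Microsoft's guidance: the helper names are hypothetical, and the candidate list contains only the two directories mentioned in this thread. Note that `ssl.get_default_verify_paths()` reports the defaults of the OpenSSL that Python itself links against, which may differ from the OpenSSL build the Speech SDK loads.

```python
import os
import ssl

# Directories mentioned in this thread; adjust for your own image.
CANDIDATE_DIRS = ["/usr/lib/ssl/certs", "/etc/ssl/certs"]


def count_cert_entries(directory):
    """Return the number of entries in `directory`, or None if it is not a directory."""
    if not os.path.isdir(directory):
        return None
    return len(os.listdir(directory))


def report_cert_locations(candidates=CANDIDATE_DIRS):
    """Map each candidate directory to its entry count (None if it is missing)."""
    return {directory: count_cert_entries(directory) for directory in candidates}


if __name__ == "__main__":
    print("SSL_CERT_DIR =", os.environ.get("SSL_CERT_DIR"))
    # Defaults of Python's own OpenSSL; the Speech SDK may use a different build.
    print("Python OpenSSL defaults:", ssl.get_default_verify_paths())
    for directory, count in report_cert_locations().items():
        print(directory, "->", "missing" if count is None else f"{count} entries")
```

Running this inside the image built by CloudBuild and inside the locally built image should show whether the two differ in where certificates are stored.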

Specifically, the problem can be resolved by setting the "SSL_CERT_DIR" environment variable as follows:

export SSL_CERT_DIR=/usr/lib/ssl/certs

With this setting, I confirmed that Text-to-Speech works inside the container when running python app.py.

Here is the Dockerfile:
```
FROM python:3.9

WORKDIR /app


RUN apt-get update && \
    apt-get install -y build-essential libssl-dev ca-certificates libasound2 wget && \
    wget -O - https://www.openssl.org/source/openssl-1.1.1u.tar.gz | tar zxf - && \
    cd openssl-1.1.1u && \
    ./config --prefix=/usr/local && \
    make -j $(nproc) && \
    make install_sw install_ssldirs && \
    ldconfig -v && \
    export SSL_CERT_DIR=/etc/ssl/certs && \
    cd ../ && \
    rm -rf openssl-1.1.1u && \
    pip install --no-cache-dir azure-cognitiveservices-speech==1.30.0
ENV SSL_CERT_DIR=/usr/lib/ssl/certs
COPY . /app
CMD ["python3", "app.py"]
```
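
If changing the image is not convenient, the same variable could in principle be set from Python before the synthesizer is created. This is an untested assumption on my part: the fix confirmed above is the `ENV` line in the Dockerfile, and I have not verified that the Speech SDK picks up a value set this way. The helper name `ensure_ssl_cert_dir` is hypothetical, and the candidate directories are the two mentioned in this thread.

```python
import os


def ensure_ssl_cert_dir(candidates=("/usr/lib/ssl/certs", "/etc/ssl/certs")):
    """Set SSL_CERT_DIR to the first existing candidate unless it is already set.

    Returns the value in effect afterwards, or None if no candidate exists.
    """
    current = os.environ.get("SSL_CERT_DIR")
    if current:
        # Respect a value that was already provided (e.g. via ENV in the Dockerfile).
        return current
    for directory in candidates:
        if os.path.isdir(directory):
            os.environ["SSL_CERT_DIR"] = directory
            return directory
    return None


if __name__ == "__main__":
    # Call this before constructing the SpeechSynthesizer.
    print("SSL_CERT_DIR in effect:", ensure_ssl_cert_dir())
```

Because the function leaves an existing value untouched, it should be harmless to keep alongside the `ENV` line.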


- DasariKamali
- June 27, 2023 at 2:42 pm
- 0 votes
I also got the same error with Python 3.10, using the Dockerfile below.

Dockerfile:
```
FROM python:3.10

WORKDIR /app

COPY . /app

RUN pip install --no-cache-dir azure-cognitiveservices-speech==1.29.0

CMD ["python3", "app.py"]
```

Then I changed the Python version to 3.9 and got the audio output for the input text.

Code (app.py):

I used the sample code below to generate audio from the input text.
```
import azure.cognitiveservices.speech as speechsdk

AZURE_SUBSCRIPTION_KEY = '<key>'
AZURE_REGION = '<region>'
voice_name = 'en-IN-NeerjaNeural'

def save_text_to_speech(text, file_name):
    speech_config = speechsdk.SpeechConfig(subscription=AZURE_SUBSCRIPTION_KEY, region=AZURE_REGION)
    audio_config = speechsdk.audio.AudioOutputConfig(filename=file_name)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    result = synthesizer.speak_text_async(text).get()

text_to_speak = "Hi Kamali, how are you?"
output_file = "output.wav"
save_text_to_speech(text_to_speak, output_file)
```
Dockerfile:
```
FROM python:3.9

WORKDIR /app

COPY . /app

RUN pip install --no-cache-dir azure-cognitiveservices-speech==1.18.0

CMD ["python3", "app.py"]
```

Below is the command to build a Docker image:
```
docker build -t <image_name> <build_context>
```
The Docker image built successfully.

Below is the command to run Docker image:
```
docker run <image_name> 
```
It runs without any errors.

Next, find the Docker container ID so that the generated output.wav file can be copied out of the container.

Command to check Docker Container ID:
```
docker ps -a
```
You can also find the container ID directly in Docker Desktop.

The command below copies the generated output.wav file out of the container:
```
docker cp <container_id>:/app/output.wav <path to output.wav file>
```
The output.wav file was copied successfully.
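
To confirm that the copied file really contains audio, it can be inspected with Python's standard `wave` module. This is a minimal sketch: the helper name `describe_wav` is hypothetical, and it assumes the SDK wrote an uncompressed PCM WAV file, which is the only kind the `wave` module reads.

```python
import os
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return wav_file.getnchannels(), rate, frames / float(rate)


if __name__ == "__main__":
    # "output.wav" is the file name used in app.py above.
    if os.path.exists("output.wav"):
        print(describe_wav("output.wav"))
    else:
        print("output.wav not found in the current directory")
```

A duration of zero seconds would suggest that synthesis failed even though the file was created.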
