My use case is to convert text to speech using Azure and then play it into a virtual microphone.
Option 1 – with an intermediate .wav file
I tried both steps manually in a Jupyter notebook.
The problem is that the .wav file written by Azure cannot be played from the same Python session. It fails with
"error: No file ‘file.wav’ found in working directory". After I restart the Python kernel, the audio can be played.
Text-to-speech:
import azure.cognitiveservices.speech as speechsdk
audio_config = speechsdk.audio.AudioOutputConfig(filename="file.wav")
...
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()
Audio playback:
from pygame import mixer
mixer.init(devicename='Line 1 (Virtual Audio Cable)')
mixer.music.load("file.wav")
mixer.music.play()
Option 2 – direct stream to audio device
I tried configuring the audio output device in the Azure SDK.
This worked for ordinary output devices, but when I pass the ID of the virtual microphone, no sound is played.
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=False, device_name="{0.0.0.00000000}.{9D30BDBF-1418-4AFC-A709-CD4C431833E2}")
2 Answers
I found a solution: change the output to a stream, save the stream to a file, and then play that file through pygame.
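The approach described above might be sketched roughly as follows. This is a minimal sketch, not the exact code from this answer: the function names (synthesize_to_wav, looks_like_wav, play_wav) are made up for illustration, and it assumes the azure-cognitiveservices-speech package and pygame 2 are installed and that speech_config is already built.

```python
import os
import time


def synthesize_to_wav(text, path, speech_config):
    """Synthesize text in memory, then save the result as a WAV file."""
    import azure.cognitiveservices.speech as speechsdk  # requires the Speech SDK

    # audio_config=None keeps the audio in the result instead of sending it
    # to a speaker or a file owned by the synthesizer.
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=None
    )
    result = synthesizer.speak_text_async(text).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Synthesis did not complete: {result.reason}")
    # Wrap the result in a stream and write it out.
    speechsdk.AudioDataStream(result).save_to_wav_file(path)
    return path


def looks_like_wav(path):
    """Return True if the file exists and starts with a RIFF/WAVE header."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as handle:
        header = handle.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def play_wav(path, device_name):
    """Play a WAV file on the named output device and wait until it ends."""
    from pygame import mixer  # requires pygame 2

    if not looks_like_wav(path):
        raise FileNotFoundError(f"{path} is missing or is not a WAV file")
    mixer.init(devicename=device_name)
    mixer.music.load(path)
    mixer.music.play()
    while mixer.music.get_busy():  # play() returns immediately
        time.sleep(0.1)
    mixer.quit()
```

A call could then look like play_wav(synthesize_to_wav(text, "file.wav", speech_config), 'Line 1 (Virtual Audio Cable)'). The header check is only a cheap guard against loading a missing or half-written file.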
I would also much appreciate any other method that doesn't need an intermediate audio file.
Create a Speech service resource and get its key and location (region).
Then store that key in an environment variable, set from a command prompt.
Then import the SDK:
import azure.cognitiveservices.speech as speechsdk
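With the key in the environment, building the speech configuration might look something like the sketch below. The variable names SPEECH_KEY and SPEECH_REGION and both helper functions are assumptions chosen for illustration; use whatever names you actually set.

```python
import os


def load_speech_settings(env=os.environ):
    """Read the key and region from the environment (hypothetical variable names)."""
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    missing = [
        name
        for name, value in (("SPEECH_KEY", key), ("SPEECH_REGION", region))
        if not value
    ]
    if missing:
        raise KeyError("Missing environment variable(s): " + ", ".join(missing))
    return key, region


def make_speech_config(env=os.environ):
    """Build a SpeechConfig from the environment settings."""
    import azure.cognitiveservices.speech as speechsdk  # requires the Speech SDK

    key, region = load_speech_settings(env)
    return speechsdk.SpeechConfig(subscription=key, region=region)
```

Reading the settings in one place means a missing key fails early with a clear message rather than later, during synthesis.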
After the conversion, get the virtual device.
Get the device's speaker information and set it as the output device.
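Passing the device to the SDK could be sketched as below. The ID pattern is only inferred from the single endpoint ID quoted in the question, so treat it as an assumption rather than a specification; is_endpoint_id and make_device_output are hypothetical helper names.

```python
import re

# Shape inferred from the one ID shown in the question: "{n.n.n.nnnnnnnn}.{GUID}".
# This is an assumption for illustration, not a documented format.
ENDPOINT_ID = re.compile(
    r"^\{\d+\.\d+\.\d+\.\d{8}\}\."
    r"\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$"
)


def is_endpoint_id(device_id):
    """Return True if the string has the endpoint-ID shape assumed above."""
    return bool(ENDPOINT_ID.match(device_id))


def make_device_output(device_id):
    """Build an AudioOutputConfig that targets a specific output device."""
    import azure.cognitiveservices.speech as speechsdk  # requires the Speech SDK

    if not is_endpoint_id(device_id):
        # A friendly name such as 'Line 1 (Virtual Audio Cable)' ends up here.
        raise ValueError(f"Not an endpoint ID: {device_id!r}")
    return speechsdk.audio.AudioOutputConfig(device_name=device_id)
```

The check only catches the easy mistake of passing a display name where an ID is expected; it does not confirm that the device exists or that it accepts audio output.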