Trying to transcribe audio from a Telegram voice message, but I get an “unable to transcode data stream audio/opus -> audio/x-float-array” error from Watson’s Speech to Text node.
I’m using Node-RED on a Raspberry Pi to transcribe audio from Telegram voice messages with node-red-contrib-telegrambot and node-red-node-watson.
With text messages, my flow works like a charm.
With voice messages, I get an “unable to transcode data stream audio/opus -> audio/x-float-array” error from Watson’s Speech to Text node.
Node-RED flow images: I don’t have enough reputation points to post images 🙁
JSON flow export
[
{
"id": "b4106ec1.63dd58",
"type": "tab",
"label": "Telegram",
"disabled": false,
"info": ""
},
{
"id": "d1198164.e38f68",
"type": "telegram receiver",
"z": "b4106ec1.63dd58",
"name": "FMWatsonBot",
"bot": "5f347711.7876d8",
"saveDataDir": "",
"x": 110,
"y": 100,
"wires": [
[
"f4b4ab25.5dde18",
"f5d126df.5b6928"
],
[]
]
},
{
"id": "c6ec445d.0840d8",
"type": "telegram sender",
"z": "b4106ec1.63dd58",
"name": "Send2Telegram",
"bot": "5f347711.7876d8",
"x": 780,
"y": 80,
"wires": [
[]
]
},
{
"id": "f4b4ab25.5dde18",
"type": "debug",
"z": "b4106ec1.63dd58",
"name": "",
"active": true,
"tosidebar": true,
"console": false,
"tostatus": false,
"complete": "false",
"x": 290,
"y": 60,
"wires": []
},
{
"id": "f5d126df.5b6928",
"type": "function",
"z": "b4106ec1.63dd58",
"name": "Save chat context",
"func": "msg.chatId = msg.payload.chatId;\nmsg.type = msg.payload.type;\nmsg.content = msg.payload.content;\nreturn msg;",
"outputs": 1,
"noerr": 0,
"x": 230,
"y": 160,
"wires": [
[
"c3d1a92d.227568"
]
]
},
{
"id": "276cfad7.cef62e",
"type": "function",
"z": "b4106ec1.63dd58",
"name": "Set Chat Context",
"func": "msg.payload = {\n chatId : msg.chatId,\n topic : msg.type,\n type : \"message\",\n content : msg.payload};\nreturn msg;\n",
"outputs": 1,
"noerr": 0,
"x": 730,
"y": 220,
"wires": [
[
"c6ec445d.0840d8"
]
]
},
{
"id": "c3d1a92d.227568",
"type": "switch",
"z": "b4106ec1.63dd58",
"name": "Check msg type",
"property": "type",
"propertyType": "msg",
"rules": [
{
"t": "eq",
"v": "message",
"vt": "str"
},
{
"t": "eq",
"v": "voice",
"vt": "str"
},
{
"t": "else"
}
],
"checkall": "true",
"repair": false,
"outputs": 3,
"x": 300,
"y": 240,
"wires": [
[
"e8452a44.f967c8"
],
[
"6aea6224.578d8c"
],
[]
]
},
{
"id": "e8452a44.f967c8",
"type": "function",
"z": "b4106ec1.63dd58",
"name": "Echo message",
"func": "msg.payload = {\n chatId : msg.chatId,\n topic : \"Text Echo\",\n type : msg.type,\n content : msg.content};\nreturn msg;",
"outputs": 1,
"noerr": 0,
"x": 540,
"y": 80,
"wires": [
[
"c6ec445d.0840d8"
]
]
},
{
"id": "6aea6224.578d8c",
"type": "change",
"z": "b4106ec1.63dd58",
"name": "Set voice URL",
"rules": [
{
"t": "set",
"p": "payload",
"pt": "msg",
"to": "payload.weblink",
"tot": "msg"
}
],
"action": "",
"property": "",
"from": "",
"to": "",
"reg": false,
"x": 440,
"y": 300,
"wires": [
[
"fc7b1590.557c"
]
]
},
{
"id": "493d1bac.216d3c",
"type": "change",
"z": "b4106ec1.63dd58",
"name": "Set transcription",
"rules": [
{
"t": "set",
"p": "payload",
"pt": "msg",
"to": "transcription",
"tot": "msg"
}
],
"action": "",
"property": "",
"from": "",
"to": "",
"reg": false,
"x": 680,
"y": 300,
"wires": [
[
"276cfad7.cef62e"
]
]
},
{
"id": "fc7b1590.557c",
"type": "watson-speech-to-text",
"z": "b4106ec1.63dd58",
"name": "S2T",
"alternatives": 1,
"speakerlabels": false,
"smartformatting": false,
"lang": "en-GB",
"langhidden": "en-GB",
"langcustom": "NoCustomisationSetting",
"langcustomhidden": "",
"custom-weight": "0.5",
"band": "BroadbandModel",
"bandhidden": "BroadbandModel",
"keywords": "",
"keywords-threshold": "0.5",
"word-confidence": false,
"password": "",
"apikey": "#########CHANGED VALUE TO POST###########",
"payload-response": false,
"streaming-mode": false,
"streaming-mute": true,
"auto-connect": false,
"discard-listening": false,
"disable-precheck": false,
"default-endpoint": true,
"service-endpoint": "https://stream.watsonplatform.net/speech-to-text/api",
"x": 530,
"y": 360,
"wires": [
[
"493d1bac.216d3c"
]
]
},
{
"id": "5f347711.7876d8",
"type": "telegram bot",
"z": "",
"botname": "FMWatsonBot",
"usernames": "",
"chatids": "",
"baseapiurl": "",
"updatemode": "polling",
"pollinterval": "300",
"bothost": "",
"localbotport": "8443",
"publicbotport": "8443",
"privatekey": "",
"certificate": "",
"verboselogging": false
}
]
Any hint?
Thanks in advance
Ferruccio
2 Answers
Update: The flow works perfectly with telegram node v4.4.0, but fails with the new version 5.1.5.
So the problem is not in the Speech to Text node.
It boils down to how you are fetching the audio from Telegram. Check the answer to this related question, which shows how to build the URL to send through to the Speech to Text node: https://developer.ibm.com/answers/questions/424777/help-how-do-i-use-speech-to-text-with-my-telegram/
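For reference, here is a minimal sketch of building such a URL. It does not reproduce the linked answer; it assumes the standard Telegram Bot API download endpoint (https://api.telegram.org/file/bot<token>/<file_path>), where file_path is the value returned by the Bot API getFile method. The helper name buildTelegramFileUrl and the token and path values are hypothetical placeholders.

```javascript
// Hypothetical helper (not part of any Node-RED node): builds the download
// URL for a file whose path was returned by the Telegram Bot API getFile call.
function buildTelegramFileUrl(botToken, filePath) {
  if (!botToken || !filePath) {
    throw new Error("botToken and filePath are both required");
  }
  // Encode each path segment but keep the "/" separators intact.
  const safePath = filePath.split("/").map(encodeURIComponent).join("/");
  return `https://api.telegram.org/file/bot${botToken}/${safePath}`;
}

// Placeholder values for illustration only (not a real token or file).
const voiceUrl = buildTelegramFileUrl("123456:ABC-DEF", "voice/file_0.oga");
console.log(voiceUrl);
```

In a function node placed before the Speech to Text node, the same logic could set msg.payload to the resulting URL. Where the token and file path come from in your flow depends on the telegram node version, so check what the receiver node actually puts in msg.payload with a debug node first.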