I am trying to receive a phone call and stream its audio into Cognitive Services. How can I do this?
What I tried:
I set up a phone number in Azure, created an Azure Event Grid subscription, and set up a webhook to send the events to my app.
However, I cannot seem to get an audio stream out of the object returned when I answer the call.
Code:
[AllowAnonymous]
[HttpPost]
public async Task<IActionResult> IncomingCall([FromBody] EventGridEvent[] events)
{
    int eventCount = 0;
    foreach (var eventGridEvent in events)
    {
        try
        {
            switch (eventGridEvent.EventType)
            {
                case SystemEventNames.EventGridSubscriptionValidation:
                    {
                        var eventData = eventGridEvent.Data.ToObjectFromJson<SubscriptionValidationEventData>();
                        var responseData = new SubscriptionValidationResponse
                        { ValidationResponse = eventData.ValidationCode };
                        if (responseData.ValidationResponse != null)
                        { return Ok(responseData); }
                    }
                    break;
                case "Microsoft.Communication.IncomingCall":
                    var _client = new CallAutomationClient("<ACS connection string>");
                    var eventData2 = eventGridEvent.Data.ToObjectFromJson<AcsIncomingCallEventData>();
                    string incomingCallContext = eventData2.IncomingCallContext;
                    string serverCallId = eventData2.ServerCallId;
                    var answerCallOptions = new AnswerCallOptions(incomingCallContext, new Uri("wss://... my url here}"));
                    answerCallOptions.OperationContext = "";
                    var call = await _client.AnswerCallAsync(answerCallOptions);
                    // How to get the stream from the call?
                    // then stream into Azure CLU
                    // some constants
                    var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
                    speechConfig.SpeechRecognitionLanguage = "en-US";
                    speechConfig.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");
                    using AudioConfig audioConfig = AudioConfig.FromStreamInput(audioStream);
                    using (var intentRecognizer = new IntentRecognizer(speechConfig, audioConfig))
Alternatively, I tried the following.
But then I only get some text back. How can I stream that text into CLU (IntentRecognizer)?
var callAutomationClient = new CallAutomationClient("<ACS connection string>");
var answerCallOptions = new AnswerCallOptions("<Incoming call context once call is connected>", new Uri("<https://sample-callback-uri>"))
{
    AzureCognitiveServicesEndpointUrl = new Uri("https://sample-cognitive-service-resource.cognitiveservices.azure.com/") // for Speech-To-Text (choices)
};
var answerCallResult = await callAutomationClient.AnswerCallAsync(answerCallOptions);
2 Answers
I found a solution here:
How to convert PCM S16LE audio to MU-LAW/8000 using .NET (Windows/Mac/Linux)
The C# code is here: https://github.com/chambersandpartners/twilio-transcription-poc/blob/master/TwilioMediaStreams/Services/MuLawEncoder.cs
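For readers who cannot open the link, here is a minimal sketch of the kind of conversion that title describes: encoding 16-bit little-endian PCM samples as 8-bit G.711 mu-law bytes. This is the textbook algorithm, not the code from the linked repository, and the names `MuLawSketch`, `EncodeSample`, and `EncodeBuffer` are made up for illustration. It does not resample, so getting to 8000 Hz from another sample rate would be a separate step.

```csharp
using System;

// Hypothetical helper: illustrative names, not from the linked repository.
public static class MuLawSketch
{
    private const int Bias = 0x84;   // 132, added before locating the segment
    private const int Clip = 32635;  // largest magnitude that still fits after adding Bias

    // Encode one signed 16-bit PCM sample as one mu-law byte.
    public static byte EncodeSample(short sample)
    {
        // Bit 15 of the sample becomes the sign bit (0x80) of the mu-law byte.
        int sign = (sample >> 8) & 0x80;

        // Work on the magnitude as an int so that -32768 does not overflow.
        int magnitude = sign != 0 ? -sample : sample;
        if (magnitude > Clip) magnitude = Clip;
        magnitude += Bias;

        // Find the segment (exponent): position of the highest set bit, from bit 14 down to bit 7.
        int exponent = 7;
        for (int mask = 0x4000; (magnitude & mask) == 0 && exponent > 0; mask >>= 1)
        {
            exponent--;
        }

        // The four bits just below the leading bit form the mantissa.
        int mantissa = (magnitude >> (exponent + 3)) & 0x0F;

        // Mu-law bytes are stored bit-inverted.
        return (byte)(~(sign | (exponent << 4) | mantissa) & 0xFF);
    }

    // Encode a buffer of 16-bit little-endian PCM; a trailing odd byte is ignored.
    public static byte[] EncodeBuffer(byte[] pcm16le)
    {
        int sampleCount = pcm16le.Length / 2;
        var encoded = new byte[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            // Combine low and high bytes into one signed sample.
            short sample = unchecked((short)(pcm16le[2 * i] | (pcm16le[2 * i + 1] << 8)));
            encoded[i] = EncodeSample(sample);
        }
        return encoded;
    }
}
```

As a quick sanity check, a silent sample (`0`) encodes to `0xFF`, the mu-law code for zero. Whether mu-law is the right target format depends on what the receiving service expects, so treat this only as an illustration of the encoding step.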
In your code, you have already set up the Event Grid subscription and are receiving events. To handle incoming call events, you can use the following code:
call.AudioStream
is a stream containing the audio from the phone call. I already ran a test via an email check, like below: