
I previously asked how to combine LINEAR16-encoded audio files here: Combine linear16 encoded audio files | PHP

I was advised to use ffmpeg, but I'm new to it. Any suggestions on how to combine LINEAR16-encoded audio files?

...
$client = new TextToSpeechClient();
$synthesisInputText = new SynthesisInput();
$audioConfig = new AudioConfig();
// $audioConfig->setAudioEncoding(AudioEncoding::MP3); works fine
$audioConfig->setAudioEncoding(AudioEncoding::LINEAR16);
$voice = new VoiceSelectionParams();

$voice->setLanguageCode('en-US');
$voice->setName('en-US-Neural2-A');

$media_total = '';
foreach($txt_array as $k => $txt) {

     $synthesisInputText->setText($txt);
     $response = $client->synthesizeSpeech($synthesisInputText, $voice, $audioConfig);
     $media = $response->getAudioContent();

     $media_total .= $media; // audio is encoded linear16, and combining in this way does not work
    
// how to use ffmpeg here, to combine $media pieces, change encoding

}

Thanks

UPDATE

Example of a generated audio file, saved with an .mp3 extension:
https://storage.googleapis.com/gspeech-audio-storage/gspeech_en-US:6:D:1:0_5c021edd23218122a7ce116cb297bc91_5be04f8b55b1b28025ad6eea55109e83.mp3

The audio looks like this in a hex dump. When the encoding is set to LINEAR16, the output appears to be in WAV format (the first bytes, 5249 4646 and 5741 5645, spell RIFF and WAVE):

5249 4646 240b 0400 5741 5645 666d 7420
1000 0000 0100 0100 c05d 0000 80bb 0000
0200 1000 6461 7461 000b 0400 ffff f2ff
d6ff bcff b7ff c9ff f5ff 2900 5800 7b00
8800 8b00 8c00 8d00 9000 9400 9100 8800
7a00 6600 5400 4500 3e00 3b00 3600 3200
2800 1b00 1000 0a00 0700 0200 f8ff f5ff
f3ff f8ff ffff 0400 0300 0e00 1b00 0c00
0800 0500 1000 1500 1e00 2800 2b00 2e00
2b00 2d00 2900 3000 3300 3200 2f00 2600
1800 0b00 0200 feff f6ff f7ff f5ff f3ff
f1ff edff eaff e6ff e1ff d5ff d0ff caff
c5ff c3ff c2ff c1ff cbff ccff cbff d0ff
d9ff d7ff ddff ddff e0ff e5ff e2ff f2ff
f4ff fdff 0200 0100 0600 0a00 1000 0e00
1300 0f00 0f00 1c00 1a00 2000 2400 2500
2500 2000 1d00 1800 1300 0f00 1200 0500
fdff f1ff ecff ddff d3ff cdff b7ff c1ff
c0ff b8ff b4ff b2ff aaff a8ff a5ff abff
a6ff a0ff 9bff 91ff 8fff 8aff 86ff 83ff
84ff 8aff 83ff 78ff 70ff 65ff 66ff 6eff
71ff 70ff 6fff 69ff 5aff 54ff 4fff 5bff
5eff 64ff 6eff 68ff 68ff 6aff 6aff 65ff
71ff 63ff 5aff 54ff 4aff 4eff 56ff 58ff
66ff 6bff 66ff 66ff 70ff 75ff 6bff 7bff
71ff 76ff 8bff 98ff 8eff 95ff 9fff 92ff
93ff 98ff a2ff afff bdff baff b5ff bdff
b6ff c1ff c0ff c9ff d9ff dfff e7ff f5ff
f8ff ffff 1000 1200 1200 0a00 0a00 1100
f7ff 0900 0a00 0300 1300 0e00 1300 1900
1e00 2600 0400 1700 2000 2600 3200 2f00
2b00 1c00 2500 1d00 2800 2f00 3200 3d00
2b00 3c00 3100 4300 4900 4300 5c00 6c00
6a00 5f00 5e00 4d00 4400 4800 4900 2e00
3300 3500 3900 3e00 4800 3a00 3600 3a00
3700 3900 2d00 3800 3800 2700 2500 2000
2c00 2800 2700 3100 0f00 2100 4200 2f00
2e00 2400 2c00 2600 1900 2a00 0100 1600
1d00 2900 4100 4100 6400 5c00 5700 4800
3e00 3800 1f00 2700 3600 2400 3d00 5200
6600 6400 6200 4600 4f00 4800 2200 2000
1f00 2800 0400 2300 1700 2b00 2700 1800
0000 0000 f0ff e3ff b9ff a6ff acff a3ff
a2ff 97ff a5ff 96ff 7aff 5cff 56ff 4eff
31ff 18ff 2aff 1eff 11ff 19ff 04ff 07ff
08ff e3fe f3fe 1eff e8fe f5fe f1fe cffe
03ff e2fe 17ff 2bff 11ff 34ff 4fff 4fff
2dff 4bff 43ff 2eff 60ff 49ff 6cff acff
e9ff 0b00 5600 3e00 5000 3d00 5b00 5500
4300 2100 3500 2d00 6200 6800 4b00 5900
9a00 3a00 5d00 6e00 3200 0400 1400 f8ff
2900 3200 0700 ebff 0700 f2ff 2800 1100
1c00 3700 d7ff ddff cdff e3ff f2ff 0e00
daff 2f00 6900 8f00 a900 7900 6e00 6b00
f000 e500 7e00 ab00 de00 c700 e000 4801
3b01 5c01 1901 0701 6c01 e300 f800 4a01
2901 5a01 5f01 6701 8801 5901 1701 1e01
4901 3f01 1f01 ff00 5601 cf00 5200 0401
2e01 7001 2801 4d01 a100 df00 df00 6801
6401 a100 fa00 ba01 e300 b000 2c01 5501
b3ff 7e01 5702 3700 f300 6401 f501 3f02
9dff 98ff 8201 2cff 24fd 6b01 6bff 48ff
7301 f201 9102 a0ff 3c02 9c00 c500 3901
5d02 a500 1200 1300 bdfe d8fe 25ff 10fe
04fd a400 0300 ad01 9e04 6105 1405 ee04
fe04 8a05 4606 2f05 5702 fa02 1901 b002
4cff 8b01 9902 6a03 0d04 2e05 f202 3801
9000 d4fe 0c00 2501 56ff 41ff 31ff 98fd
d0fd a7fb 3efe 45ff 91fc c9fe 3cfd 21fc
b5fd 2bfd 7bfb 0afd 14ff f1fe 95ff 18ff
0000 50fd 59fe a000 f9ff 7a00 8100 1100
f0ff 5300 af00 1f00 3a03 e001 7703 2e03
9503 5001 8fff b400 c9ff 1602 d201 3e00
2201 9501 88ff 71ff f901 5100 4eff 1b00
7e00 aefe e6fd 24fd 27fc d2fe 47fd d5fe
befe ddfd 22ff 03fe c7fc 5dfe c2fd f2fb
aefe 45fd 63fd 16fb 82fd ecfc 5ffc 66fd
3cfe fcfd 4cfd 21fe f2fd 3bfd 0afd f8fc
74fd bafe 36fd 6bff 65fe c2fd c0fd e8fe
cbff 1dff dcff 26ff 3afe 92fd 06fd 49fe
78fb 30fe 8bff dafe bcff 37ff 50fe a3fe
69fe 7dfe 5e00 06ff 1900 4afd 93fd 09ff
0eff d5fd a900 d702 0300 a301 2903 2fff
07ff d6fd 51fe 9aff 8dff 3200 4a01 e200
cdff 5201 d501 fd01 bd00 6001 ab00 c6fd
20ff ccff c0fc 6eff c400 44ff 8001 24ff
3bff 0800 93fd 9500 bffe 9dfe c8fe dafd
5ffe 76fe e6ff 8bfe 8500 6001 73ff a200
d2fe fdfe 1bff 60ff cb00 1800 1d01 7e00
f800 c600 ec00 9e01 c101 f601 8701 ff01
fb01 d400 bc00 b300 5c00 9100 eeff 6e01
6c02 3eff b0ff 1101 7bff d8ff 9c00 6ffe
1fff aafd a0fe d8fd b5fe 5cfd 36fc 31ff
24fe 71ff 23fe 60ff 65fd 20fd 5efd 9efb
7bfe 07fc 3ffc c1fe ecfc c1fb bcfb ebfc
6afd d1fd 9cff 8dfe 82fd cffd 4dfc 53fd
6cfd 82fe a1fd 52fd f8fe 6ffd 39fd e8fc
b7fb 23fd 2dfd aefc aefc bffc 75fc f0fa
f5fc e8fd 76fe 2dfd 9afc 80fd 1bfc a8fa
36fd 35fa d8f9 31fa d1fa f8fb a2fc e7fd
52fd d0fe d9ff 2eff 97ff 41ff 23fd 22fe
73fd 88fd 78ff 9aff e2fe 0802 e203 b203
0b04 6104 9303 1903 d701 4704 b905 0903
d103 5805 3006 0706 7707 4a07 9607 e907
d606 b906 1f08 ee04 1a04 1507 3d08 ff08
ce09 740a a609 5109 ff08 fd08 a708 e507
1e08 2e07 7208 4e08 9807 3009 6808 d408
c00a 250a f308 6807 4806 5c05 0904 2206
6e06 8707 c208 3c08 8e0a 7e09 fc07 7708
d406 d205 6405 6d04 7e04 6604 7704 7005
7707 2107 1a06 9b05 ab03 5901 47fe 75fd
28fc 84fa 87fb 21fc d4fb 77f9 41f9 98f8
57f7 2bf6 7af4 64f2 7cee 29ec 05ea 9ee8
90e9 0aea d2ea b6eb efeb 77e9 e0e6 85e6
59e4 5fe4 89e6 cee8 10ea 5eeb 5aee 66ef
49f2 4cf7 3ffa 91fb befd 75fd 8ffd 32fd
9eff 0202 8806 7a07 c90d ca15 b913 5614
7a16 7e16 a515 c617 2c17 c815 8b13 ef0f
9512 4f13 0212 5313 c413 0511 5810 cf0d
f907 8704 6b01 47fd 7afb 8bfb a9f8 25f8
0ef7 acf5 1cf5 83f3 fff2 53f0 c4ed 7ceb
76eb 14e9 eee8 f8e9 cfeb 1ded 89ef dff2
fcf3 caf4 bff4 0df6 b7f6 57f7 89f9 56fc
bafe 6702 c705 2609 ee0b 480e 5610 c512
9713 1614 9015 9d16 da16 e417 ad19 831a
cb1b 541d 421e 341e 5f1d 701b 1419 c117
e615 6b15 af15 b714 6e13 fe13 3112 1d11
be0f 130c 430a 4d08 9305 0d03 4402 9cff
4ffd 1efe 9dfd 73fc 8efa 49f9 76f7 e3f3
3bf1 19f0 fded ebeb daec edec c4eb bdea
ace9 b4e7 b7e5 73e4 f8e2 f3e1 d3e0 fadf
35df 77e0 c6e1 84e2 f3e2 14e2 04e1 03df
49dc 7bdc 61db a2dc dde4 99ed fef1 1bf9
93fc 76f9 11fa 3c01 df02 ce04 2c07 f106
8f09 9b07 470a 4d0f 5213 b817 6d20 5a23
...
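To check that reading of the dump, the first 44 bytes can be decoded as a canonical RIFF/WAVE header. This is only a minimal sketch: the hex string is copied from the start of the dump above, and the field names passed to unpack() are labels chosen here for illustration, not part of any API.

```php
<?php
// First 44 bytes of the dump above, written as one hex string.
$hex = '52494646240b040057415645666d7420'
     . '1000000001000100c05d000080bb0000'
     . '0200100064617461000b0400';
$header = hex2bin($hex);

// a4 = 4-byte string, V = unsigned 32-bit little-endian,
// v = unsigned 16-bit little-endian. Names after each code are labels.
$fields = unpack(
    'a4riff/VriffSize/a4wave/a4fmt/VfmtSize/vformat/vchannels/'
    . 'VsampleRate/VbyteRate/vblockAlign/vbitsPerSample/a4data/VdataSize',
    $header
);

printf(
    "%s/%s, format %d, %d channel(s), %d Hz, %d-bit, %d data bytes\n",
    $fields['riff'],
    $fields['wave'],
    $fields['format'],
    $fields['channels'],
    $fields['sampleRate'],
    $fields['bitsPerSample'],
    $fields['dataSize']
);
```

Decoded this way, the header describes PCM (format 1), mono, 24000 Hz, 16-bit audio with 264960 bytes of sample data. Every LINEAR16 response therefore starts with its own header, which would explain why plain string concatenation produces a broken file.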

And when the encoding is set to mp3, it looks like this:

fff3 84c4 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
...

It seems that, unlike the LINEAR16 output, the MP3 data can simply be repeated one chunk after another, and it works fine.

I tried converting the first audio format to the second in native PHP by analyzing the bytes, but the result was broken and unreadable.
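If the goal is only to join the LINEAR16 pieces (not to change the encoding), one possible native-PHP approach is to strip the header from each piece, concatenate the raw samples, and write a single new header. The sketch below assumes every piece is a RIFF/WAVE string with the same format as in the dump (PCM, mono, 24000 Hz, 16-bit); the function names wavPcmData, wavHeader and combineWavs are hypothetical helpers, not part of the Google client library.

```php
<?php
// Return the raw PCM payload of a WAV string by walking its RIFF chunks.
function wavPcmData(string $wav): string
{
    if (substr($wav, 0, 4) !== 'RIFF' || substr($wav, 8, 4) !== 'WAVE') {
        throw new InvalidArgumentException('Not a RIFF/WAVE string');
    }
    $pos = 12; // first chunk starts after "RIFF", the size field and "WAVE"
    $len = strlen($wav);
    while ($pos + 8 <= $len) {
        $id   = substr($wav, $pos, 4);
        $size = unpack('V', substr($wav, $pos + 4, 4))[1];
        if ($id === 'data') {
            return substr($wav, $pos + 8, $size);
        }
        $pos += 8 + $size + ($size % 2); // chunks are padded to an even length
    }
    throw new InvalidArgumentException('No data chunk found');
}

// Build a 44-byte PCM WAV header. The defaults are assumptions taken from the dump.
function wavHeader(int $dataSize, int $sampleRate = 24000, int $channels = 1, int $bits = 16): string
{
    $blockAlign = intdiv($channels * $bits, 8);
    return 'RIFF' . pack('V', 36 + $dataSize) . 'WAVE'
        . 'fmt ' . pack('VvvVVvv', 16, 1, $channels, $sampleRate,
                        $sampleRate * $blockAlign, $blockAlign, $bits)
        . 'data' . pack('V', $dataSize);
}

// Join several WAV strings that share one format into a single WAV string.
function combineWavs(array $wavs): string
{
    $pcm = '';
    foreach ($wavs as $wav) {
        $pcm .= wavPcmData($wav);
    }
    return wavHeader(strlen($pcm)) . $pcm;
}
```

In the loop from the question, this would mean collecting each $media in an array instead of appending it with `.=`, then calling combineWavs() on that array once the loop has finished. The result is still uncompressed WAV; converting it to MP3 would need a separate encoder such as ffmpeg.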

2 Answers


  1. I would do this:

    1. save each $media response as a file in a temp dir
    2. obtain the list of saved media files
    3. run: ffmpeg -i $media1 -i $media2 -i $media3 -filter_complex "[0:a][1:a][2:a]concat=n=3:v=0:a=1" output.mp3

    This example is for three files. I’ll try to explain:

    • the exact number of chunks to concatenate can be obtained with a counter in your foreach loop
    • the [0:a], [1:a], ... labels tell FFmpeg to take only the audio stream of each input, not the video (for video, the a would become v). There must be one label per input, matching your counter
    • the number in concat=n=3 means three files. You need to insert your counter there.
    • FFmpeg can detect the encoding of the input files automatically
    • FFmpeg can infer the output format from the extension given in output.xxx

    I think this is a good starting point

  2. I suggest generating the correct ffmpeg command with videoalchemy.
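The command from the first answer can be assembled in PHP for any number of files. The following is a rough sketch under some assumptions: the pieces have already been written to disk (for example with file_put_contents), and buildConcatCommand and the file names are hypothetical, introduced only for illustration.

```php
<?php
// Build the ffmpeg concat command from the first answer for N input files.
function buildConcatCommand(array $files, string $output): string
{
    $n = count($files);
    if ($n === 0) {
        throw new InvalidArgumentException('At least one input file is required');
    }

    $inputs = '';
    $labels = '';
    foreach (array_values($files) as $i => $file) {
        $inputs .= ' -i ' . escapeshellarg($file); // one -i per saved piece
        $labels .= '[' . $i . ':a]';               // audio stream of input $i
    }

    // n is the number of inputs; v=0:a=1 means no video, one audio output.
    $filter = $labels . 'concat=n=' . $n . ':v=0:a=1';

    return 'ffmpeg' . $inputs
        . ' -filter_complex ' . escapeshellarg($filter)
        . ' ' . escapeshellarg($output);
}

// Hypothetical file names; in practice these would be the saved $media pieces.
$cmd = buildConcatCommand(['piece0.wav', 'piece1.wav', 'piece2.wav'], 'output.mp3');
echo $cmd, "\n";
```

The resulting string can then be passed to exec($cmd, $lines, $status), checking that $status is 0 before using output.mp3. Passing every path through escapeshellarg() keeps unusual file names from breaking the command line.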
