
I am working on a problem that has had me stumped for the past couple of days. I am using Node.js with Express (v4.18.2) to eventually create a Firebase deployment that can take in a video URL and output an audio MP3 to Firebase Storage. I have made some progress, but am still unsuccessful in some areas.

I won't be able to rely on saving the file locally with fs in the final deployment, but for this example I have shown that it works with fs: I am successfully saving a local .mp3 file.

First a few functions I have:

const axios = require('axios');
const fs = require('fs');
const { PassThrough } = require('stream');
const ffmpeg = require('fluent-ffmpeg');
const admin = require('firebase-admin');
// admin.initializeApp(...) and the serviceAccount config are set up elsewhere in my project.

async function downloadVideo(videoUrl) {
  try {
    const response = await axios.get(videoUrl, {
      responseType: 'stream',
    });

    if (response.status === 200) {
      return response.data;
    } else {
      throw new Error('Failed to fetch the video');
    }
  } catch (error) {
    throw new Error('Error fetching the video: ' + error.message);
  }
}

async function extractAudioFromVideo(videoUrl) {
  try {
    const videoStream = await downloadVideo(videoUrl);

    // Create a PassThrough stream to pipe the video data
    const passThrough = new PassThrough();
    videoStream.pipe(passThrough);

    const outputFile = 'output.mp3';
    const outputStream = fs.createWriteStream(outputFile);

    return new Promise((resolve, reject) => {
      const audioBuffers = [];

      passThrough.on('data', chunk => {
        audioBuffers.push(chunk);
        outputStream.write(chunk); // Write chunks to a local file
      });

      passThrough.on('error', err => {
        reject(err);
      });

      ffmpeg()
        .input(passThrough)
        .output('/dev/null') // Null output as a placeholder
        .outputOptions('-vn') // Extract audio only
        .noVideo()
        .audioQuality(0)
        .audioCodec('libmp3lame') // Set audio codec
        .format('mp3')
        .on('end', () => {
          const audioBuffer = Buffer.concat(audioBuffers);
          if (audioBuffer.length > 0) {
            resolve(audioBuffer);
          } else {
            reject(new Error('Empty audio buffer'));
          }
        })
        .on('error', err => reject(err))
        .run();
    });
  } catch (error) {
    throw new Error('Error extracting audio: ' + error.message);
  }
}

async function saveAudioToFirebase(audioBuffer, fileName) {
  try {
    const storage = admin.storage();
    const storageRef = storage.bucket(serviceAccount.storage_bucket_content);
    const file = storageRef.file(fileName); // Specify the desired file name here

    const renamedFileName = fileName.replace(/\.[^/.]+$/, '.mp3'); // Change the file extension to .mp3

    await file.save(audioBuffer, {
      metadata: {
        contentType: 'audio/mpeg', // Adjust the content type as needed
      },
    });

    await file.setMetadata({
      contentType: 'audio/mpeg',
    });

    await file.move(renamedFileName); // Rename the file with the .mp3 extension

    console.log('Audio saved to Firebase Storage.');
  } catch (error) {
    console.error('Error saving audio to Firebase Storage:', error);
  }
}

What works:

  • Downloading the video via Axios
  • Saving to Firebase Storage (no initializing or reference issues with Firebase)
  • Outputting a local .mp3 file called "output.mp3"
  • I am able to log the result of extractAudioFromVideo and get a buffer logged in my terminal

What doesn’t work:

  • Saving a true .mp3 file to Firebase Storage. The uploaded file has ".mp3" in the URL and a content type of 'audio/mpeg', but it is in fact still an .mp4: it still contains the video track and plays as video in the browser window (see the ffprobe check below).

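A quick way to double-check what the stored file really contains is probing a downloaded copy with fluent-ffmpeg's ffprobe wrapper (a rough sketch; the local path is just a placeholder and ffprobe must be installed):

const ffmpeg = require('fluent-ffmpeg');

// 'downloaded-copy.mp3' stands in for a local copy of the file fetched back from Storage.
ffmpeg.ffprobe('downloaded-copy.mp3', (err, metadata) => {
  if (err) throw err;
  console.log(metadata.format.format_name); // container format, e.g. "mp3" vs "mov,mp4,m4a,..."
  metadata.streams.forEach(s => console.log(s.codec_type, s.codec_name));
  // a genuine MP3 should report only a single audio stream here
});
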
I am willing to use other libraries, like tmp, if suggested, as long as the solution works.

2 Answers


  1. const ffmpeg = require('fluent-ffmpeg');
    const { PassThrough } = require('stream');

    async function extractAudioFromVideo(videoUrl) {
      try {
        const videoStream = await downloadVideo(videoUrl);

        return new Promise((resolve, reject) => {
          const passThrough = new PassThrough();
          videoStream.pipe(passThrough);

          const audioBuffers = [];

          // The ffmpeg command object only emits events like 'start', 'progress',
          // 'end' and 'error'; the converted bytes come from the stream returned
          // by .pipe(), so the 'data' listener belongs on that output stream.
          const mp3Stream = ffmpeg(passThrough)
            .outputOptions('-vn')     // Extract audio only
            .audioCodec('libmp3lame') // Set audio codec to MP3
            .format('mp3')
            .on('error', err => reject(err))
            .pipe();                  // Starts the command and returns a PassThrough of MP3 data

          mp3Stream.on('data', chunk => audioBuffers.push(chunk));
          mp3Stream.on('error', err => reject(err));
          mp3Stream.on('end', () => {
            const audioBuffer = Buffer.concat(audioBuffers);
            if (audioBuffer.length > 0) {
              resolve(audioBuffer);
            } else {
              reject(new Error('Empty audio buffer'));
            }
          });
        });
      } catch (error) {
        throw new Error('Error extracting audio: ' + error.message);
      }
    }
    
  2. Preface

    Don’t expect answers like this from StackOverflow often. I just enjoyed working on the problem and got carried away.

    Note: the code below was written free-hand, so expect typos. Corrections welcome.


    The Problem

    Looking at your current approach, the raw video data is buffered into memory (as audioBuffers) and also written out to a file as output.mp3, in both cases before any audio conversion has taken place. This is caused by these lines (rearranged for clarity):

    const audioBuffers = [];
    
    const passThrough = new PassThrough()
      .on('data', chunk => {
        audioBuffers.push(chunk)
        outputStream.write(chunk); // Write chunks to a local file
      })
      .on('error', err => {
        reject(err);
      });
    
    videoStream.pipe(passThrough);
    

    Note that the above lines make no mention of MP3 conversion at all. This is why your uploaded file and your local file are both videos with an .mp3 file extension. Below those lines, you feed the passThrough stream into ffmpeg and then discard the converted result (by sending it to /dev/null).

    ffmpeg()
      .input(passThrough)
      .output('/dev/null')
      /* ... other config */
    

    Instead of interacting with the file system at all, it should be possible to extract the original video stream, transform the stream's content by removing the video track and converting the audio track as needed, then load the resulting audio stream straight into Google Cloud Storage. This is known as an ETL pipeline (Extract, Transform, Load) and helps minimise the resources needed to host this Cloud Function.
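
    Before diving into the full solution, here is a minimal sketch of the mechanism that matters: fluent-ffmpeg only hands you the converted bytes through whatever you pipe its output into, never through the input stream. (Local files are used here purely to keep the sketch runnable; input.mp4 and audio-only.mp3 are placeholder paths. In the solution below, the destination becomes a Cloud Storage upload stream instead.)

    import { createReadStream, createWriteStream } from "fs";
    import ffmpeg from "fluent-ffmpeg";

    const source = createReadStream("input.mp4");             // placeholder input
    const destination = createWriteStream("audio-only.mp3");  // placeholder output

    ffmpeg(source)
      .noVideo()                 // drop the video track
      .audioCodec("libmp3lame")  // encode the audio track as MP3
      .format("mp3")
      .on("error", (err) => console.error("ffmpeg failed:", err))
      .pipe(destination, { end: true }); // the converted bytes flow out of the command, not the input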


    Potential Solution

    In this first code block, I’ve combined the extractAudioFromVideo and saveAudioToFirebase helper methods into one streamAudioTrackToCloudStorage helper method. Bringing these components together helps to prevent passing streams around that don’t have appropriate listeners. It also helps with binding the current context to errors thrown during the transform and load steps. This is particularly important because a file that was being uploaded may be incomplete or empty if FFMPEG could not process the incoming stream properly. It is left up to the error handling code to dispose of an incomplete file.

    The streamAudioTrackToCloudStorage method accepts a downloadable boolean argument that can generate the Firebase Storage Download URL at upload time. This is useful for inserting the file’s record into Cloud Firestore or the Realtime Database if it is for public consumption.

    import { randomUUID } from "crypto"; // Requires node v14.17+
    import { PassThrough } from "stream";
    import ffmpeg from "fluent-ffmpeg";
    
    /**
     * Generates a token and download URL that could target the given GCS file
     * reference. However, it is up to the caller to upload this information to
     * file's metadata.
     *
     * @param {import('@google-cloud/storage').File} storageFile - The GCS File
     * object to generate a download token for.
     * @returns {[token: string, url: string]} - A tuple containing the generated
     * download token and an assembled Firebase Storage URL containing that token.
     */
    const generateDownloadURLParts = (storageFile) => {
      // this is random enough as it doesn't need to actually be unique
      const token = randomUUID(); 
      const url = "https://firebasestorage.googleapis.com/v0/b/" + storageFile.bucket.name +
        "/o/" + encodeURIComponent(storageFile.name) +
        "?alt=media&token=" + token;
      return [ token, url ];
    }
    
    /**
     * Uploads the audio track of the provided stream to the given Google Cloud Storage
     * file reference.
     *
     * @param {import('@google-cloud/storage').File} storageFile - The GCS File object
     * to write the stream to.
     * @param {ReadableStream} sourceStreamWithAudio - A stream that can be ingested
     * by FFMPEG to produce the uploaded audio track.
     * @param {boolean} [downloadable] - Determines whether a download token is
     * attached to the uploaded file.
     * @returns {Promise<[ file: import('@google-cloud/storage').File, bytesUploaded: number, downloadURL: string ]>} - A
     * promise that resolves to a tuple containing the reference to the uploaded
     * GCS file, its size in bytes and its download URL if available.
     */
    const streamAudioTrackToCloudStorage = (storageFile, sourceStreamWithAudio, downloadable = false) => {
      return new Promise((resolve, reject) => {
        let byteCount = 0;
    
        // Generate download token and URL if requested.
        // (using downloadURL as Firebase uses getDownloadURL in the client SDKs)
        const [firebaseStorageDownloadTokens, downloadURL] = downloadable
          ? generateDownloadURLParts(storageFile)
          : [null, null];
        
        // before calling reject, try to bind some metadata to the error for debugging
        const onErrorCb = (error, source) => {
          const context = { byteCount, file: storageFile, source, downloadURL };
    
          try {
            Object.assign(error, context); // add context to error object
          } catch (_ignored) {
            console.error("Failed to bind context to thrown error.", { error, context });
          }
    
          reject(error);
        }
    
        // define Google Cloud Storage upload stream
        const uploadStream = storageFile
          .createWriteStream({
            metadata: {
              contentType: 'audio/mpeg',
              ...(downloadable ? { metadata: { firebaseStorageDownloadTokens } } : {})
            },
            resumable: false,
          })
          // 'finish' fires once the upload has been confirmed by Cloud Storage
          .on('finish', () => resolve([storageFile, byteCount, downloadURL]))
          .on('error', err => onErrorCb(err, "storage"));

        // count the converted bytes on their way to Cloud Storage
        // (the ffmpeg command object does not emit 'data' events itself, so its
        // converted output is piped through this counter into the upload stream)
        const byteCounter = new PassThrough();
        byteCounter
          .on('data', chunk => (byteCount += chunk.length))
          .on('end', () => {
            if (byteCount === 0) {
              onErrorCb(new Error('Empty audio stream'), "convert");
            }
          });
        byteCounter.pipe(uploadStream);

        // define source to audio transform stream and start it by piping its output
        ffmpeg(sourceStreamWithAudio)
          .outputOptions('-vn') // Extract audio only
          .noVideo()
          .audioQuality(0)
          .audioCodec('libmp3lame') // Set audio codec
          .format('mp3')
          .on('error', err => onErrorCb(err, "convert"))
          .pipe(byteCounter, { end: true });
      });
    }
    

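    As a side note on the downloadable flag mentioned earlier: once the returned tuple resolves, persisting a record of the upload in Cloud Firestore could look something like this hypothetical sketch (the ingestedAudio collection and its field names are placeholders):

    import * as admin from "firebase-admin";

    // Hypothetical helper: record the upload result so clients can look it up later.
    const recordUploadedAudio = ([file, bytesUploaded, downloadURL]) =>
      admin.firestore()
        .collection("ingestedAudio") // placeholder collection name
        .add({
          bucket: file.bucket.name,
          path: file.name,
          bytesUploaded,
          downloadURL, // null unless `downloadable` was set to true
          createdAt: admin.firestore.FieldValue.serverTimestamp(),
        });
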
    Now that we have our transform and load streams, we need to obtain the extract stream.

    With the introduction of the native Node Fetch API in Node v18, I’ve dropped axios as it wasn’t really being used for anything useful to this step. You can modify the below script to add it back in if you were making use of interceptors for authentication or something similar.
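
    If you do still want axios for this request, a rough drop-in for the fetch() call below might look like the hypothetical helper sketched here; responseType: 'stream' makes response.data a Node Readable that can be handed to streamAudioTrackToCloudStorage:

    import axios from "axios";

    // Hypothetical axios-based replacement for the fetch() call used below.
    const downloadVideoStreamWithAxios = async (videoUrl, axiosOptions = {}) => {
      const response = await axios.get(videoUrl, {
        ...axiosOptions,
        responseType: "stream", // response.data becomes a Node Readable stream
      });
      return response.data;     // hand this stream to streamAudioTrackToCloudStorage
    };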

    In this code block, we define the storeAudioFromRemoteVideo helper method, which accepts the final upload path for the converted MP3 file along with the URL of the video to be converted. Unless another GCS bucket is provided as part of the options argument, the file will be uploaded to the bucket you specified in the code you shared. The remaining properties of the options argument are passed through to fetch() as its second argument, in case you need to specify things like Authorization headers, API keys or request bodies.

    import * as admin from "firebase-admin";
    
    /**
     * Attempts to ingest the body of the provided response as if it
     * were a JSON-encoded string, falling back to plain text on failure.
     *
     * This allows for simple handling of HTML, plain text and JSON-encoded
     * bodies as part of error handling.
     *
     * @param {Response} res - The response containing the body to consume.
     * @returns {Promise<unknown>} - A promise resolving to the parsed content of
     * this response.
     */
    const attemptToParseResponseBodyAsJSON = async (res) => {
      const text = await res.text(); // Response#text() returns a promise
      try {
        return JSON.parse(text);
      } catch (err) {
        return text;
      }
    }
    
    /**
     * Streams remote video's content through FFMPEG to extract the
     * audio track and immediately uploads it to Google Cloud Storage.
     *
     * This method assumes that the provided file path is safe to write content to.
     *
     * @param {string} storedFilePath - File path in Cloud Storage to upload the
     * stream to. Should include ".mp3" file extension.
     * @param {string} videoUrl - URL of the video to be converted.
     * @param {Object} [options] - Optional object to override target GCS bucket and
     * pass options through to `fetch()`.
     * @returns {Promise<[ file: import('@google-cloud/storage').File, bytesUploaded: number, downloadURL: string ]>} - A
     * promise that resolves to a tuple containing the reference to the uploaded
     * GCS file, its size in bytes and its download URL if available.
     */
    const storeAudioFromRemoteVideo = async (storedFilePath, videoUrl, { bucket, ...fetchOptions } = {}) => {
    
      // custom bucket not provided? use default
      if (!bucket) {
        bucket = admin.storage()
          .bucket(serviceAccount.storage_bucket_content);
      }
    
      const response = await fetch(videoUrl, fetchOptions);
      const responseContentType = response.headers.get('Content-Type');
    
      if (!response.ok || !/^(?:audio|video)\//.test(responseContentType)) {
        const err = new Error(`Unexpected HTTP ${response.status} response with [Content-Type]="${responseContentType}" from remote server.`);
        Object.assign(err, { // add context to error object
          response,
          status: response.status,
          body: await attemptToParseResponseBodyAsJSON(response),
          source: "remote-http"
        });
        throw err;
      }
    
      // The Content-Disposition header of the response may contain the
      // filename if you want to use it for the uploaded file.
      // It is assumed that the calling method has taken care to prevent
      // overwriting existing files.
      const file = bucket.file(storedFilePath);
    
      return streamAudioTrackToCloudStorage(file, response.body);
    }
    

    Usage

    Now that the above methods are defined, your Cloud Function code may look like:

    import * as functions from "firebase-functions";
    import { randomUUID } from "crypto"; // Requires node v14.17+
    
    const { HttpsError } = functions.https;
    
    export const ingestVideoFile = functions.https.onCall((data, context) => {
      if (!context.auth) {
        throw new HttpsError("unauthenticated", "You must be logged in to continue.");
      }
    
      // TODO: Check if user has permission to call this endpoint (e.g. admin/maintainer/creator/etc.)
    
      const targetUrl = data.source || data.videoUrl || data.targetUrl;
      if (!targetUrl) {
        throw new HttpsError("failed-precondition", "No source provided.");
      }
    
      // TODO: Implement API quota?
    
      // randomUUID() for demo only, use v4 from uuid or a Cloud Firestore
      // document ID for a more stable unique ID for production use
      const targetFilePath = "ingestedVideos/" + randomUUID() + ".mp3";
      const startTimeMS = Date.now();
    
      return storeAudioFromRemoteVideo(targetFilePath, targetUrl)
        .then(([file, size, downloadURL]) => {
          return {
            bucket: file.bucket.name,
            name: file.name,
            size,
            downloadURL,
            jobDurationMS: Date.now() - startTimeMS
          };
        })
        .catch(async (err) => {
          const { source, file } = err && typeof err === "object" ? err : {};
          console.error("Failed to ingest video file", err);

          if (file) {
            await file.delete({ ignoreNotFound: true })
              .catch(cleanupErr => console.error(
                "Failed to cleanup errored file. Manual cleanup required.",
                { bucket: file.bucket.name, name: file.name, cleanupErr }
              ));
          }

          throw new HttpsError(
            "internal",
            `Failed due to an error in the ${source || "unknown"} component`,
            {
              ...(file ? { bucket: file.bucket.name, name: file.name } : {}),
              source,
              jobDurationMS: Date.now() - startTimeMS
            }
          );
        });
    });
    

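    For completeness, invoking the deployed callable from a web client might look like the sketch below (assuming the Firebase JS v9 modular SDK and that the function is deployed as ingestVideoFile in the default region):

    import { getFunctions, httpsCallable } from "firebase/functions";

    const clientFunctions = getFunctions();
    const ingestVideoFile = httpsCallable(clientFunctions, "ingestVideoFile");

    async function convertRemoteVideo(videoUrl) {
      // `source` matches one of the properties the Cloud Function reads from `data`
      const { data } = await ingestVideoFile({ source: videoUrl });
      console.log(`Stored ${data.name} (${data.size} bytes) in ${data.jobDurationMS}ms`);
      return data.downloadURL; // null unless the upload was marked downloadable
    }
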
    Potential Next Steps:

    • Investigate antivirus/antimalware protections.
    • Implement a user-based API quota and/or rate-limit for calling this function.
    • Benchmark performance using the returned jobDurationMS values and project cost to execute this function.
    • Deploy Cloud Function as a 2nd Gen Cloud Function if processing times are expected to be longer than 9 minutes.
    • Decide how to handle incomplete/empty files that have been uploaded when the function errors out. Hold for investigation? Delete on error? etc.