
I’m trying to create a synthetic monitor that will serve as a kind of heartbeat.
The tricky part is that I want it to check authenticated pages.
So I have a script that logs in using an alerting account and verifies that the page is OK.

Following the documentation for synthetic monitors, we see:

A synthetic monitor periodically executes a single-purpose 2nd gen Cloud Function that is deployed on Cloud Run. When you create the synthetic monitor, you define the Cloud Function, which must be written in Node.js

Then the docs continue with some examples.

Taking the Puppeteer example, we see this:

const {
 instantiateAutoInstrumentation,
 runSyntheticHandler } = require('@google-cloud/synthetics-sdk-api');
// Run instantiateAutoInstrumentation before any other code runs, to get automatic logs and traces
instantiateAutoInstrumentation();
const functions = require('@google-cloud/functions-framework');
const axios = require('axios');
const assert = require('node:assert');
const puppeteer = require('puppeteer');


functions.http('CustomPuppeteerSynthetic', runSyntheticHandler(async ({logger, executionId}) => {
 // Launch a headless Chrome browser and open a new page
 const browser = await puppeteer.launch({ headless: 'new', timeout: 0});
 const page = await browser.newPage();

 // Navigate to the target URL
 const result = await page.goto('https://www.example.com', {waitUntil: 'load'});

 // Confirm successful navigation
 await assert.equal(result.status(), 200);

 // Print the page title to the console
 const title = await page.title();
 logger.info(`My Page title: ${title} ` + executionId);

 // Close the browser
 await browser.close();
}));

So, in theory, this should be enough to run Puppeteer inside a 2nd gen Cloud Function. But when testing, this error occurs:

PRODUCTION APP: Health check failed - Could not find Chromium (rev. 1108766). This can occur if either
 1. you did not perform an installation before running the script (e.g. npm install) or
 2. your cache path is incorrectly configured (which is: /root/.cache/puppeteer).
For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.

Checking Puppeteer docs, I see this:

The Node.js runtime of Google Cloud Functions comes with all system packages needed to run Headless Chrome.

To use puppeteer, specify the module as a dependency in your package.json and then override the puppeteer cache directory by including a file named .puppeteerrc.cjs at the root of your application with the contents:

const {join} = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  cacheDirectory: join(__dirname, 'node_modules', '.puppeteer_cache'),
};

[!NOTE] Google Cloud Functions caches your node_modules between builds. Specifying the puppeteer cache as subdirectory of node_modules mitigates an issue in which the puppeteer install process does not run when the cache is hit.

But I receive the same error, after adding the .puppeteerrc.cjs file.

But if we stop and think a bit, we remember that 2nd gen Cloud Functions are deployed on Cloud Run. So, following Puppeteer docs for Cloud Run, we see:

The default Node.js runtime of Google Cloud Run does not come with the system packages needed to run Headless Chrome. You will need to set up your own Dockerfile and include the missing dependencies.

So, according to Puppeteer docs for Linux we should create a Dockerfile and install the listed dependencies.

I did that, and here’s my Dockerfile as an example:

# Use the official Node.js 20 image as a parent image
FROM node:20-slim

# Set working directory
WORKDIR /app

# Install necessary tools and libraries for Puppeteer and Chromium
RUN apt-get update && apt-get install -y \
    ca-certificates \
    fonts-liberation \
    gnupg \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libc6 \
    libcairo2 \
    libcups2 \
    libdbus-1-3 \
    libexpat1 \
    libfontconfig1 \
    libgbm1 \
    libgcc1 \
    libglib2.0-0 \
    libgtk-3-0 \
    libnspr4 \
    libnss3 \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libstdc++6 \
    libx11-6 \
    libx11-xcb1 \
    libxcb1 \
    libxcomposite1 \
    libxcursor1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxi6 \
    libxrandr2 \
    libxrender1 \
    libxss1 \
    libxtst6 \
    lsb-release \
    procps \
    wget \
    xdg-utils \
    --no-install-recommends \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install Chromium
RUN apt-get update \
    && apt-get install -y chromium \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Copy package.json and package-lock.json (if available)
COPY package*.json ./

# Install npm dependencies
RUN npm install

# Copy the rest of the application's source code
COPY . .

# Run the app
CMD ["node", "index.js"]

But the error persists. I also tried to use other versions of Puppeteer without success.
I also tried to check where Chromium is being installed:

...
const { exec } = require('child_process');
...

exec('chromium --version', (err, stdout, stderr) => {
      if (err) {
        // If an error occurs, log it (this could indicate Chromium is not installed or path is incorrect)
        logger.error(`Error checking Chromium version: ${err.message}`);
        return;
      }
      // Log the Chromium version to console
      logger.info(`Chromium version: ${stdout}`);
    });

so I could force Puppeteer to use it this way:

const browser = await puppeteer.launch({ 
      executablePath: '/usr/bin/chromium',
      args: ['--no-sandbox', '--disable-setuid-sandbox'],
      headless: 'new',
      timeout: 0 
    });

But I get the error:

"Error checking Chromium version: Command failed: chromium --version\n/bin/sh: 1: chromium: not found"

I’ve tried many other things that I don’t remember but nothing gets the Chromium installed. Any idea how to make it work?

2 Answers


  1. Chosen as BEST ANSWER

    I discovered the following about Synthetic Monitor and related functions.

    1. Although the UI lets you select any runtime, the documentation states that the function must run in a Node.js environment.
    2. The associated Cloud Function is a 2nd gen Cloud Function, which implies a Cloud Run service (created automatically when the function is deployed).
    3. The file that contains your script should be named index.js.

    With that in mind, here are the specifics of this use case. To use Puppeteer in Cloud Functions, we need to add the .puppeteerrc.cjs file as described in the Puppeteer docs:

    const { join } = require('path');
    
    /**
     * @type {import("puppeteer").Configuration}
     */
    module.exports = {
      cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
    };
    

    We also need to add the magic script in package.json which is called during the function build. I believe the functions-framework is responsible for calling it.

    "scripts": {
      "gcp-build": "node node_modules/puppeteer/install.mjs"
    }
    

    So we can deploy a function that runs Puppeteer in Node20 with this package.json

    {
        "main": "index.js",
        "scripts": {
          "gcp-build": "node node_modules/puppeteer/install.mjs"
        },
        "dependencies": {
            "@google-cloud/functions-framework": "^3.1.2",
            "@google-cloud/synthetics-sdk-api": "^0.4.1",
            "puppeteer": "^21.3.6"
        }
    }
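
Since the setup depends on a few exact entries in package.json, a small check before deploying can catch omissions early. The sketch below is illustrative only: `checkPackageJson` and its messages are hypothetical, and it checks nothing beyond the three points discussed in this answer.

```javascript
// Hypothetical pre-deploy check; returns a list of human-readable problems.
function checkPackageJson(text) {
  let pkg;
  try {
    pkg = JSON.parse(text);
  } catch (err) {
    // Trailing or missing commas end up here.
    return [`package.json is not valid JSON: ${err.message}`];
  }
  if (pkg === null || typeof pkg !== 'object') {
    return ['package.json must contain a JSON object'];
  }

  const problems = [];
  const scripts = pkg.scripts || {};
  const dependencies = pkg.dependencies || {};

  if (!dependencies.puppeteer) {
    problems.push('puppeteer is missing from "dependencies"');
  }
  if (!scripts['gcp-build']) {
    problems.push('no "gcp-build" script is defined');
  }
  if (pkg.main && pkg.main !== 'index.js') {
    problems.push(`"main" is ${pkg.main}, expected index.js`);
  }
  return problems;
}
```

It could be run locally with `checkPackageJson(require('node:fs').readFileSync('package.json', 'utf8'))`; an empty array means none of these problems were found.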
    

    Lastly and MOST IMPORTANT, at this moment (27/02/2024) the testing UI simply DOES NOT WORK due to the missing dependencies as described in the question. Every time I tried to run the function it failed complaining about a dependency that is missing.

    But, if you just go forward and deploy the function, it just works as expected.


  2. You can refer to this example from Luba on how they run Puppeteer in a 2nd gen Cloud Function with Node.js 16:

    const puppeteer = require('puppeteer')
    
    
    let browserPromise = puppeteer.launch(
        {
        args: [
            '--no-sandbox'
        ]
    }
    );
    
    exports.productads = async (req, res) => {
      /* Your function goes here*/
    }

    That may require .puppeteerrc.cjs:

    const {join} = require('path');
    module.exports = {
      cacheDirectory: join(__dirname, '.cache', 'puppeteer')
    };

    And package.json similar to this:

    {
      "name": "puppeteer",
      "version": "1.0.0",
      "description": "",
      "main": "index.js",
      "scripts": {
        "gcp-build": "node node_modules/puppeteer/install.js"
      },
      "devDependencies": {
        "@google-cloud/functions-framework": "^3.1.2"
      },
      "dependencies": {
        "puppeteer": "^19.2.2"
      }
    }
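
The module-scope `browserPromise` in the snippet above launches the browser once per function instance, so warm invocations can reuse it. A slightly more defensive variant, sketched here with a hypothetical `createBrowserProvider` helper, launches lazily and retries after a failed launch. The `launch` argument is injected for illustration; in a real function it would be something like `() => puppeteer.launch({ args: ['--no-sandbox'] })`.

```javascript
// Caches the launch promise so that warm invocations share one browser.
function createBrowserProvider(launch) {
  let browserPromise = null;

  return function getBrowser() {
    if (!browserPromise) {
      browserPromise = launch().catch((err) => {
        // Reset on failure so the next invocation retries the launch.
        browserPromise = null;
        throw err;
      });
    }
    return browserPromise;
  };
}
```

A handler would then call `const browser = await getBrowser();` and open a new page per request, closing the page (not the browser) when it is done.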

    Including here documentation for reference
