"OSError: Chromium downloadable not found at" when using requests_html module in python

SatyaBharadwaj
November 4, 2024

I’m using the requests_html module in Python to render web pages dynamically. However, the Chromium download fails whenever the render method is called (see the code snippet below):

response = session.get(url)
response.html.render(timeout=20)

The error shown is: "OSError: Chromium downloadable not found at https://storage.googleapis.com/chromium-browser-snapshots/Win_x64/1181205/chrome-win.zip: Received NoSuchKey: The specified key does not exist. No such object: chromium-browser-snapshots/Win_x64/1181205/chrome-win.zip"

I’ve tried installing other versions of the requests_html module, without success.
I also looked for a way to specify a custom executable path, but I couldn’t find one.

Please let me know if there’s a solution to this.

Answers

- AlexKrentz
- November 4, 2024 at 2:20 pm
- 0 votes
The requests_html module uses pyppeteer to handle its browser automation. The default URL that pyppeteer downloads Chromium from no longer works: the NoSuchKey response means the snapshot for that revision no longer exists at that location in the storage bucket. Normally, you’d be able to specify an executable path for pyppeteer to use, but as you can see below, requests_html doesn’t give you any option to pass your own executablePath argument to pyppeteer.launch.

Part of the BaseSession class in requests_html
```
class BaseSession(requests.Session):
    @property
    async def browser(self):
        if not hasattr(self, "_browser"):
            self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)

        return self._browser
```
The best thing to do would be to open a new issue on the requests_html GitHub repository and ask the maintainers to let you pass your own executable path when creating a Session.
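In the meantime, one possible workaround follows from the property shown above: it only launches a browser when `_browser` is missing, so assigning that attribute beforehand would skip the default launch and the download. The sketch below demonstrates only this caching behaviour, using stand-in classes (`SketchSession` and `FakeBrowser` are hypothetical and not part of requests_html). With the real libraries you would presumably assign the result of `await pyppeteer.launch(executablePath=...)` instead, which is an assumption that relies on a private attribute and may break between versions.

```python
import asyncio


class FakeBrowser:
    """Hypothetical stand-in for a pyppeteer Browser object."""

    def __init__(self, source):
        self.source = source


class SketchSession:
    """Hypothetical stand-in copying only the caching logic of BaseSession.browser."""

    launch_calls = 0

    @property
    async def browser(self):
        if not hasattr(self, "_browser"):
            # The real class calls pyppeteer.launch(...) here,
            # which is what triggers the Chromium download.
            SketchSession.launch_calls += 1
            self._browser = FakeBrowser("default launch")
        return self._browser


async def main():
    session = SketchSession()
    # Pre-assign the private attribute so the property never launches anything.
    # Assumption: with the real libraries this would be the object returned by
    # `await pyppeteer.launch(executablePath=...)`.
    session._browser = FakeBrowser("custom executable")
    browser = await session.browser
    return browser.source, SketchSession.launch_calls


result = asyncio.run(main())
print(result)  # ('custom executable', 0)
```

The default launch branch is never reached, which is the behaviour this workaround depends on.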

Alternatively, you could switch to using pyppeteer directly rather than through requests_html. This will give you full control over the Chromium executable path and other configuration options.

Here’s a basic example of how to use pyppeteer itself to render a webpage:
```
import asyncio
from pyppeteer import launch

async def render_page(url):
    # Launch a local Chrome/Chromium binary instead of pyppeteer's downloaded copy
    browser = await launch(
        headless=True,
        executablePath='/path/to/your/chromium')
    try:
        page = await browser.newPage()
        await page.goto(url)
        # Serialized HTML of the page after its scripts have run
        return await page.content()
    finally:
        # Always close the browser so no Chromium process is left behind
        await browser.close()

# Run the coroutine and fetch the page content
rendered_html = asyncio.run(render_page("https://www.google.com"))
print(rendered_html)
```
In this case, you’d replace '/path/to/your/chromium' with the path to a locally installed Chrome or Chromium binary. You can download one from the Chromium website.
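If you are not sure where a local browser lives, a small helper can look for one on PATH. This is a minimal sketch using only the standard library. The function name `find_browser_executable` and the candidate names are assumptions for illustration, and on Windows Chrome is often not on PATH, so you may still need to supply the path by hand.

```python
import shutil


def find_browser_executable(
    candidates=("chromium", "chromium-browser", "google-chrome", "chrome"),
):
    """Return the first candidate found on PATH, or None if none is found."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


executable = find_browser_executable()
if executable is None:
    print("No browser found on PATH; set executablePath by hand.")
else:
    print(f"Pass executablePath={executable!r} to launch()")
```

The returned string can be passed as the executablePath argument in the example above.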

- Starmania
- November 4, 2024 at 2:24 pm
- 0 votes
Under the hood, requests_html uses pyppeteer. By following these steps, you can work around this error:
1. Download Chromium manually: Get version 1181217 from Chromium snapshots. It’s slightly more recent, but you can expect it to behave the same way.
2. Extract to the expected directory: Unzip it to:
```
%USERPROFILE%/AppData/Local/pyppeteer/pyppeteer/local-chromium/1181205/chrome-win/
```
  The revision folder must be named 1181205 (even though it actually contains version 1181217) to match what pyppeteer expects.
3. requests_html will detect the existing chrome.exe and skip the download.
This bypasses the download error by mimicking the path pyppeteer expects without changing any code.
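The steps above can be sketched in Python. This is only an illustration: the helper names are hypothetical, the URL pattern is copied from the error message in the question, and the code assumes the archive contains a top-level chrome-win/ folder. The demo at the end uses a dummy archive in a temporary directory, not a real Chromium build.

```python
import tempfile
import zipfile
from pathlib import Path

BASE_URL = "https://storage.googleapis.com/chromium-browser-snapshots"


def snapshot_url(revision, platform="Win_x64", archive="chrome-win.zip"):
    """Build a download URL using the pattern seen in the error message."""
    return f"{BASE_URL}/{platform}/{revision}/{archive}"


def expected_revision_dir(user_profile, revision="1181205"):
    """Directory pyppeteer looks in on Windows, per the path quoted above."""
    return (
        Path(user_profile) / "AppData" / "Local" / "pyppeteer" / "pyppeteer"
        / "local-chromium" / revision
    )


def install_snapshot(zip_path, user_profile, revision="1181205"):
    """Unpack a manually downloaded archive into the revision folder."""
    target = expected_revision_dir(user_profile, revision)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumes the archive has a top-level chrome-win/ folder.
        archive.extractall(target)
    return target / "chrome-win" / "chrome.exe"


# Step 1: the URL to download by hand (revision taken from the answer above).
print(snapshot_url("1181217"))

# Steps 2 and 3, demonstrated with a dummy archive in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    dummy_zip = Path(tmp) / "chrome-win.zip"
    with zipfile.ZipFile(dummy_zip, "w") as archive:
        archive.writestr("chrome-win/chrome.exe", b"")
    exe = install_snapshot(dummy_zip, user_profile=tmp)
    demo_ok = exe.is_file()
print(demo_ok)
```

On a real machine you would pass the value of %USERPROFILE% as `user_profile` and the downloaded chrome-win.zip as `zip_path`. pyppeteer’s downloader also appears to read a `PYPPETEER_CHROMIUM_REVISION` environment variable at import time; if your installed version supports it, setting it to the revision you downloaded before importing requests_html may avoid the need to reuse the 1181205 folder name.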

Furthermore, I recommend that you stop using requests_html and switch to something like Playwright, which is more recent.