I have a small application that uses FastAPI and Playwright to scrape data and send it back to the client.
The program is working properly when I’m running it locally, but when I try to run it as a Docker image it fails with the following error:
Looks like you launched a headed browser without having a XServer running.
Set either 'headless: true' or use 'xvfb-run <your-playwright-app>' before running Playwright.
Naturally, I then tried running it with headless=True, but the code fails with this error:
net::ERR_EMPTY_RESPONSE at https://book.flygofirst.com/Flight/Select?inl=0&CHD=0&s=True&o1=BOM&d1=BLR&ADT=1&dd1=2022-12-10&gl=0&glo=0&cc=INR&mon=true
Logs:
navigating to "https://book.flygofirst.com/Flight/Select?inl=0&CHD=0&s=True&o1=BOM&d1=BLR&ADT=1&dd1=2022-12-10&gl=0&glo=0&cc=INR&mon=true",
waiting until "load"
I also tried running it locally with headless=True, and it failed with a "Timeout 30000ms exceeded" error.
This is the function I’m using to return the page HTML:
    def extract_html(self):
        with sync_playwright() as p:
            browser = p.chromium.launch()
            page = browser.new_page()
            page.goto('https://book.flygofirst.com/Flight/Select?inl={}&CHD={}&s=True&o1={}&d1={}&ADT={}&dd1={}&gl=0&glo=0&cc=INR&mon=true'.format(self.infants, self.children, self.origin, self.destination, self.adults, self.date))
            html = page.inner_html('#sectionBody')
            return html
and this is my Dockerfile:
FROM python:3.9-slim
COPY ../../requirements/dev.txt ./
RUN python3 -m ensurepip
RUN pip install -r dev.txt
RUN playwright install
RUN playwright install-deps
ENV PYTHONPATH "${PYTHONPATH}:/app/"
WORKDIR /code/src
COPY ./src /app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
Hope someone could figure out what I’m doing wrong.
2 Answers
After investigating and trying several things, it looks like the problem is the browser's user agent in headless mode. For some reason that page does not accept the default headless user agent, so try setting a regular desktop user agent when you create the page.
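The snippet that accompanied this answer is not available, so here is a minimal sketch of the idea rather than the answerer's exact code. It assumes the `user_agent` option of Playwright's `browser.new_page()`; the `DESKTOP_UA` string and the `build_url` helper are hypothetical names introduced for illustration, and any current desktop Chrome user agent should serve equally well.

```python
# Hypothetical desktop user agent. The default headless Chromium UA contains
# "HeadlessChrome", which some sites appear to reject.
DESKTOP_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"
)

URL_TEMPLATE = (
    "https://book.flygofirst.com/Flight/Select?inl={infants}&CHD={children}"
    "&s=True&o1={origin}&d1={destination}&ADT={adults}&dd1={date}"
    "&gl=0&glo=0&cc=INR&mon=true"
)


def build_url(origin, destination, date, adults=1, children=0, infants=0):
    """Build the search URL from the question (helper name is an assumption)."""
    return URL_TEMPLATE.format(
        infants=infants,
        children=children,
        origin=origin,
        destination=destination,
        adults=adults,
        date=date,
    )


def extract_html(url, user_agent=DESKTOP_UA):
    """Fetch the page headlessly with a custom user agent and return #sectionBody."""
    # Imported lazily so the URL helper above can be used without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            # The user agent is applied to the context backing this page.
            page = browser.new_page(user_agent=user_agent)
            page.goto(url)
            return page.inner_html("#sectionBody")
        finally:
            browser.close()
```

For example, `extract_html(build_url("BOM", "BLR", "2022-12-10"))` would request the same URL as in the question. Whether the site accepts the request still depends on its own bot checks, so treat this as a starting point.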
Locally it works because the GUI components needed to open a browser are almost certainly already installed (which matters especially with headless=False), but when you move it into a Docker environment some additional setup is required. I resolved it this way:
Dockerfile:
docker-compose.yml
Hope this helps.
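As a complement to the container setup, a small guard on the Python side can avoid the "headed browser without having a XServer running" error. This is only a sketch under the assumption that a usable X server is signalled by the `DISPLAY` environment variable (which `xvfb-run` sets); the function names `should_run_headless` and `launch_browser` are hypothetical.

```python
import os


def should_run_headless(env=None):
    """Return True when no X display is advertised, e.g. in a plain container."""
    env = os.environ if env is None else env
    # An unset or empty DISPLAY means there is nowhere to draw a headed browser.
    return not env.get("DISPLAY")


def launch_browser(playwright):
    """Launch Chromium headed only if a display seems to be available."""
    return playwright.chromium.launch(headless=should_run_headless())
```

Inside `with sync_playwright() as p:` this would be called as `browser = launch_browser(p)`. Under `xvfb-run` the browser would start headed against the virtual display, and without it the launch would fall back to headless mode.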