
Hear me out: I'm quite a newbie with Python, so I may well have messed something up here.

Here’s the error message in full:

Traceback (most recent call last):
  File "webScrapingTool.py", line 1, in <module>
    from selenium import webdriver
ModuleNotFoundError: No module named 'selenium'

I wrote the code on Ubuntu 22.04, where the default Python version is 3.10.4. I have a dual-boot system. I hadn't realized that I apparently(?) need to build a Windows executable on Windows itself, so I moved the file over and tried there. I installed Python for Windows, version 3.12.2. As far as I understand, this is possibly part one of the problem.

Keep in mind that I have tried both ‘pyinstaller’ and ‘auto-py-to-exe’ on Ubuntu, and ‘pyinstaller’ on Windows too. When I create the executable on Windows and run it, it shows the error message above.

As mentioned, I am almost brand new to Python and this is a pretty basic project, but I really need to know what is keeping me from finally making my file executable/usable for the average person.
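For what it's worth, a ModuleNotFoundError like this usually means the interpreter that ran the script (or that the bundler packaged from) simply doesn't have selenium installed. A quick sketch to check which interpreter you're on and what it can see — the module names are just the ones this script imports:

```python
# Sketch: run this with the same Python that pyinstaller uses, to see
# whether the script's dependencies are installed for that interpreter.
import importlib.util
import sys

print("Interpreter:", sys.executable)
for mod in ("selenium", "bs4", "pandas", "requests"):
    found = importlib.util.find_spec(mod) is not None
    print(f"  {mod}: {'installed' if found else 'MISSING'}")
```

If any of them print MISSING on the Windows side, `pip install` them with that same interpreter before rebuilding the executable.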

This is my code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import time
import re
import requests
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import StaleElementReferenceException
from requests.exceptions import RequestException, Timeout, HTTPError, ConnectionError

filename = "data"
link = input("Please enter the Google Maps link for scraping: ")

browser = webdriver.Chrome()
record = []  # scraped rows: [name, phone, address, website, emails]
e = []       # names already seen, to avoid duplicate entries
le = 0       # consecutive scrolls that loaded no new results

def Selenium_extractor():
    global le  # 'le' is reassigned below, so it must be declared global
    action = ActionChains(browser)
    prev_length = 0
    a = browser.find_elements(By.CLASS_NAME, "hfpxzc")

    # Keep scrolling the results panel until no new entries load (or 1000 found).
    while a and len(a) < 1000:
        print(len(a))
        var = len(a)
        try:
            last_element = a[-1]
            action.move_to_element(last_element).perform()
            browser.execute_script("arguments[0].scrollIntoView();", last_element)
        except StaleElementReferenceException:
            # The last element went stale mid-scroll; re-query and retry.
            a = browser.find_elements(By.CLASS_NAME, "hfpxzc")
            continue
        time.sleep(2)
        a = browser.find_elements(By.CLASS_NAME, "hfpxzc")

        if len(a) == var:
            le += 1
            if le > 20 or len(a) == prev_length:
                break  # no new results after repeated scrolls; stop
        else:
            le = 0
        prev_length = len(a)


    names_processed = False  # Flag to indicate if names are processed

    for i in range(len(a)):
        if names_processed:
            break  # If names are processed, break out of the loop
        action.move_to_element(a[i]).perform()
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'html.parser')
        try:
            Item_Html = soup.find_all('div', {"class": "lI9IFe"})
            for item_html in Item_Html:
                Name_Html = item_html.find('div', {"class": "qBF1Pd fontHeadlineSmall"})
                if Name_Html is None:
                    continue  # skip result cards without a name element
                name = Name_Html.text.strip()
                if name not in e:
                    e.append(name)
                    divs = item_html.find_all('div', {"class": "W4Efsd"})

                    # Phone: take the first span that looks like an international number.
                    phone = "Not available"
                    for div in divs:
                        phone_span = div.find('span', {"class": "UsdlK"})
                        if phone_span and phone_span.text.strip().startswith("+"):
                            phone = phone_span.text.strip()
                            break

                    # Address: the third W4Efsd div holds "category · address".
                    address = "Not available"
                    if len(divs) > 2:
                        address_text = divs[2].get_text().split(' · ')
                        if len(address_text) > 1:
                            address = address_text[1].strip()

                    # Website and emails: fetch the site once and scan it for addresses.
                    website = "Not available"
                    emails = "Not available"
                    Website_Html = item_html.find('a', {"class": "lcr4fd S9kvJb"})
                    if Website_Html:
                        website = Website_Html.get('href')
                        try:
                            website_source = requests.get(website, timeout=10).text
                            found = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', website_source)
                            found = sorted({em for em in found if not em.endswith('.wixpress.com')})
                            if found:
                                emails = found
                        except (Timeout, ConnectionError) as ex:
                            print("Error scraping emails from website due to network issues:", ex)
                        except HTTPError as ex:
                            print("HTTP error occurred while accessing the website:", ex)
                        except RequestException as ex:
                            print("An error occurred while accessing the website:", ex)

                    print([name, phone, address, website, emails])
                    record.append([name, phone, address, website, emails])
            names_processed = True  # All names handled in one pass over the page source
        except Exception as ex:
            print("Error occurred:", ex)
            continue
            names_processed = True  # Set flag to indicate names are processed
        except Exception as ex:
            print("Error occurred:", ex)
            continue

    print(record)
    return record

browser.get(link)
time.sleep(10)  # give the results page time to load before scraping
Selenium_extractor()
browser.quit()

# Write the collected records to data.csv.
df = pd.DataFrame(record, columns=['Business Name', 'Phone', 'Street Address', 'Website', 'Email Addresses'])
df.to_csv(filename + '.csv', index=False, encoding='utf-8')
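One bug worth calling out on its own: the email regex in the script has lost its backslashes (`b...b` instead of `\b...\b`, plus an unescaped dot), which commonly happens when code is pasted into a forum. A corrected pattern, as a standalone sketch with a made-up sample string:

```python
import re

# Word-boundary anchors (\b) and an escaped dot (\.) restore the intended
# "find email addresses" behaviour; [A-Za-z]{2,} matches the TLD.
EMAIL_RE = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

sample = "Contact us at info@example.com or sales@shop.example.co.uk for details."
print(EMAIL_RE.findall(sample))  # both addresses should match
```

Without the backslashes, the pattern instead requires a literal `b` on each side of the address and matches almost nothing.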

When I try to do ‘pip install cx_freeze’ or to install the ‘requirements.txt’ with pip, I get an error message like this:

https://pastebin.com/KheA21nM

The lead I can give is that I'm almost certain it is related to where the program was written versus where the executable is being built (two different Python versions). I have seen a few pages that mention ‘hiddenimports’ in the .spec file, but no luck with what they suggested. Hopefully somebody knows exactly what I mean; while there are similar questions here, none of them match my situation exactly. Please let me know of anything that I can do to fix this. Thanks!
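On the ‘hiddenimports’ route: in the .spec file that pyinstaller generates next to the script, the Analysis block accepts a hiddenimports list. A minimal sketch, assuming the spec was generated for webScrapingTool.py (the module names are just this script's imports; only add the ones pyinstaller actually fails to detect):

```python
# webScrapingTool.spec (fragment) -- extra modules for PyInstaller's
# static analysis; other generated Analysis arguments left as-is.
a = Analysis(
    ['webScrapingTool.py'],
    hiddenimports=['selenium', 'bs4', 'pandas', 'requests'],
    # ... remaining generated arguments unchanged ...
)
```

After editing, rebuild with ‘pyinstaller webScrapingTool.spec’ rather than the .py file, otherwise pyinstaller regenerates the spec and discards the edits.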

2 Answers


  1. Description:

    Inside the error focus on this line:

    error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": [url here - Reddit doesn't like links, so I removed it]
    

    This indicates that you have to install (or upgrade) Microsoft Visual C++ Build Tools. pip needs the MSVC compiler to build some of these packages from source on Windows when no prebuilt wheel matches your Python version.

    Secondly, if you are familiar with manually declaring hidden imports in pyinstaller, do that for the libraries it fails to detect on its own.

  2. You have to include the path to your project's installed modules before converting it,

    like:

    pyinstaller.exe --onefile --paths=D:\env\Lib\site-packages .\foo.py
    