So I wrote the code, ran it, and got the .xlsx file, but the output is not in the same order as the URL list I put in the code.
#importing the libraries
import re
import lxml
import chardet
from os import truncate
import bs4
from bs4 import BeautifulSoup
import multiprocessing
import requests
import pandas as pd
from fake_useragent import UserAgent
import numpy as np
urls = list(('https://isabad.com/advanced-professional-email-templates-opencart-extension' ,
'https://isabad.com/seo-basic-pack-opencart-extension',
'https://isabad.com/x-shipping-pro',
'https://isabad.com/bot-blocker-opencart-extension',
'https://isabad.com/opencart-mobile-application'
))
dit = {}
user_agent = UserAgent()
for url in urls:
    data = requests.get(url, headers={"user-agent": user_agent.chrome})
    soup = bs4.BeautifulSoup(data.content, "lxml")
    dit[url] = soup.find_all("title")
ex = pd.DataFrame({"title": dit ,})
print(ex)
ex.to_excel('sasa.xlsx', index=False, engine='xlsxwriter')
How can I fix this problem?
2 Answers
You are using the set data structure to store the URLs, and a set in Python is an unordered data structure. To get the output in the same order, store the URLs in a list instead.
Cheers!
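A minimal sketch of that change, assuming the URLs were originally written as a set (curly braces). The names `urls` and `urls_as_set` are only illustrative. A list literal is iterated in exactly the order it was written, while a set makes no ordering promise:

```python
# Illustrative sketch: the same URLs stored as a list and as a set.
urls = [
    "https://isabad.com/advanced-professional-email-templates-opencart-extension",
    "https://isabad.com/seo-basic-pack-opencart-extension",
    "https://isabad.com/x-shipping-pro",
    "https://isabad.com/bot-blocker-opencart-extension",
    "https://isabad.com/opencart-mobile-application",
]

# Same elements, but a set does not remember the order they were added in.
# For strings, the iteration order can even change between interpreter runs.
urls_as_set = set(urls)

# Iterating the list always follows the order of definition.
for position, url in enumerate(urls, start=1):
    print(position, url)
```

Looping over `urls` instead of `urls_as_set` is what keeps the requests, and therefore the collected titles, in the order the URLs were typed.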
Use a list so the results will be in the same order that you defined.
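One way to keep every result tied to its URL is to build the rows in a list while looping, one entry per URL. This is only a sketch and not the asker's code: `fetch_title` is a hypothetical stand-in for the `requests` and `BeautifulSoup` lookup so the example runs without network access, and `collect_rows` is an illustrative helper name.

```python
def fetch_title(url):
    """Hypothetical stand-in for the real page fetch; returns a made-up title."""
    return "Title for " + url.rsplit("/", 1)[-1]


def collect_rows(urls, fetch=fetch_title):
    """Return one {"url", "title"} dict per URL, in the order of `urls`."""
    rows = []
    for url in urls:
        # Appending inside the loop keeps the rows in iteration order.
        rows.append({"url": url, "title": fetch(url)})
    return rows


urls = [
    "https://isabad.com/seo-basic-pack-opencart-extension",
    "https://isabad.com/x-shipping-pro",
    "https://isabad.com/bot-blocker-opencart-extension",
]

rows = collect_rows(urls)
for row in rows:
    print(row["url"], "->", row["title"])
```

Passing `rows` to `pd.DataFrame(rows)` should then give one row per URL in list order, which can be written out with `to_excel` as in the question.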