I am trying to scrape the text from a line of HTML using Python. I was able to get the class attribute from the same line, but not the text. I tried .text and .get_text(), and neither of them works.
What am I missing?
Here is my Python script to get the text from the line:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time
import datetime
import csv

class toy(object):
    browser = webdriver.Chrome(ChromeDriverManager().install())
    browser.get('https://continuumgames.com/product/16-tracer-racer-set/')
    time.sleep(2)
    try:
        test = browser.find_element_by_xpath('//*[@id="tab-additional_information"]/table/tbody/tr[3]/td').get_attribute('class')
    except:
        test = 'NA'
    try:
        upcode = browser.find_element_by_xpath('//*[@id="tab-additional_information"]/table/tbody/tr[3]/td').text
    except:
        upcode = 'NA'
    print(test)
    print(upcode)
    browser.close()
Here is the page’s HTML:
<div class="woocommerce-Tabs-panel woocommerce-Tabs-panel--additional_information panel entry-content wc-tab" id="tab-additional_information" role="tabpanel" aria-labelledby="tab-title-additional_information" style="display: none;">
  <table class="woocommerce-product-attributes shop_attributes">
    <tbody>
      <tr class="woocommerce-product-attributes-item woocommerce-product-attributes-item--weight">
        <th class="woocommerce-product-attributes-item__label">Weight</th>
        <td class="woocommerce-product-attributes-item__value">2.5 oz</td>
      </tr>
      <tr class="woocommerce-product-attributes-item woocommerce-product-attributes-item--dimensions">
        <th class="woocommerce-product-attributes-item__label">Dimensions</th>
        <td class="woocommerce-product-attributes-item__value">24 × 4 × 2 in</td>
      </tr>
      <tr class="woocommerce-product-attributes-item woocommerce-product-attributes-item--attribute_product_upc">
        <th class="woocommerce-product-attributes-item__label">UPC</th>
        <td class="woocommerce-product-attributes-item__value">605444972168</td>
      </tr>
    </tbody>
  </table>
</div>
Here is my run:
C:\Users\Carre\scrape>python test.py
[WDM] - Current google-chrome version is 83.0.4103
[WDM] - Get LATEST driver version for 83.0.4103
[WDM] - Driver [C:\Users\Carre\.wdm\drivers\chromedriver\win32\83.0.4103.39\chromedriver.exe] found in cache
DevTools listening on ws://127.0.0.1:56807/devtools/browser/03318f43-1d26-44c7-8d90-65233969f03b
woocommerce-product-attributes-item__value
2
Answers
Your selector is probably off. Try using XPath: right-click the tag, select Copy XPath, and then replace your code with this.
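One thing worth noting from the HTML in the question: the panel that holds the table has style="display: none;". Selenium's .text only returns text that is visible on the page, while get_attribute() still works on hidden elements, which would explain why the class comes back but the text does not. Here is a minimal sketch, assuming that is the cause. The helper names hidden_text and read_upc are made up for illustration, and browser is the driver from the question's script:

```python
# XPath copied from the question's script.
UPC_XPATH = '//*[@id="tab-additional_information"]/table/tbody/tr[3]/td'


def hidden_text(element):
    """Return an element's text even if the element is hidden by CSS.

    WebElement.text only returns rendered (visible) text, so it is empty
    for anything inside a display: none container. The textContent DOM
    property is not affected by visibility.
    """
    raw = element.get_attribute("textContent")
    return (raw or "").strip()


def read_upc(browser):
    """Look up the UPC cell with the question's XPath and read its text."""
    # "xpath" is the value of Selenium's By.XPATH constant.
    cell = browser.find_element("xpath", UPC_XPATH)
    return hidden_text(cell) or "NA"
```

In the original script this would replace the second try block with upcode = read_upc(browser). Clicking the "Additional information" tab first, so that the panel becomes visible, should in principle make .text work as well.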
I have a solution for you. This is my usual workaround when dealing with inconsistencies in Selenium: switch to beautifulsoup4.
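A rough sketch of what that could look like, assuming beautifulsoup4 is installed and the page source matches the HTML posted in the question. The function name upc_from_html is made up for illustration. BeautifulSoup parses the raw HTML, so it does not care that the tab is hidden:

```python
from bs4 import BeautifulSoup


def upc_from_html(html):
    """Return the UPC from the product attributes table, or 'NA'."""
    soup = BeautifulSoup(html, "html.parser")
    # The UPC row carries its own class, so no positional tr[3] is needed.
    cell = soup.select_one(
        "tr.woocommerce-product-attributes-item--attribute_product_upc td"
    )
    return cell.get_text(strip=True) if cell is not None else "NA"


# Cut-down copy of the HTML from the question, hidden panel included.
SAMPLE_HTML = """
<div id="tab-additional_information" style="display: none;">
  <table class="woocommerce-product-attributes shop_attributes">
    <tbody>
      <tr class="woocommerce-product-attributes-item woocommerce-product-attributes-item--weight">
        <th class="woocommerce-product-attributes-item__label">Weight</th>
        <td class="woocommerce-product-attributes-item__value">2.5 oz</td>
      </tr>
      <tr class="woocommerce-product-attributes-item woocommerce-product-attributes-item--attribute_product_upc">
        <th class="woocommerce-product-attributes-item__label">UPC</th>
        <td class="woocommerce-product-attributes-item__value">605444972168</td>
      </tr>
    </tbody>
  </table>
</div>
"""

print(upc_from_html(SAMPLE_HTML))  # prints 605444972168
```

If Selenium is still doing the page load, you would pass it the rendered source: upcode = upc_from_html(browser.page_source).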