I am trying to scrape data from a webpage that shows a limited number of records and requires the user to click a button to navigate to the next set of records. The webpage achieves that by sending GET requests to itself.
I tried to write Python code that sends a GET request to the page, hoping to get the next set of results, and a for loop to retrieve subsequent results, but I always get the initial set (apparently the website is ignoring my params).
This is the website I am targeting:
https://portaltransparencia.procempa.com.br/portalTransparencia/despesaLancamentoPesquisa.do?viaMenu=true&entidade=PROCEMPA
This is my code:
url = "https://portaltransparencia.procempa.com.br/portalTransparencia/despesaLancamentoPesquisa.do?viaMenu=true&entidade=PROCEMPA"
r_params = {
"perform": "view",
"actionForward": "success",
"validate": True,
"pesquisar": True,
"defaultSearch.pageSize":23,
"defaultSearch.currentPage": 2
}
page = requests.get(url, params=r_params)
I expected this to generate a response with data from the 2nd page, but it responds with data from the first page.
2 Answers
I just edited my previous answer with the code that effectively worked for this example (I only had to adjust the locator for the button).
The issue you’re facing is likely that the website you’re trying to scrape uses client-side JavaScript to handle pagination and retrieve the next set of records. Sending a GET request with query parameters may not be sufficient in this case because the website might not respond to those parameters the way you expect.
To scrape data from such websites that rely on JavaScript to load content dynamically, you would typically need to use a tool like Selenium, which can automate interactions with a web page, including clicking buttons and handling JavaScript events.
Here’s an example of how you might use Selenium to interact with the website and retrieve data:
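Below is a minimal sketch, assuming the results are rendered in an HTML table and that the pagination control can be located by its link text. The "Próxima" link text and the table lookup are assumptions, not details confirmed from the page, so check the real element ids, names, or link texts with your browser's developer tools before relying on it.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = ("https://portaltransparencia.procempa.com.br/portalTransparencia/"
       "despesaLancamentoPesquisa.do?viaMenu=true&entidade=PROCEMPA")

driver = webdriver.Chrome()  # start a Chrome session (Chrome must be installed)
driver.get(url)
wait = WebDriverWait(driver, 10)

all_rows = []
for _ in range(3):  # number of pages to collect; adjust as needed
    # wait until a results table is present, then read its rows
    table = wait.until(EC.presence_of_element_located((By.TAG_NAME, "table")))
    for row in table.find_elements(By.TAG_NAME, "tr"):
        cells = [td.text for td in row.find_elements(By.TAG_NAME, "td")]
        if cells:
            all_rows.append(cells)

    # hypothetical locator for the "next page" button -- inspect the page
    # to find the real id, name, or link text of the pagination control
    try:
        driver.find_element(By.LINK_TEXT, "Próxima").click()
    except Exception:
        break  # no further pages, or the locator needs adjusting
    time.sleep(2)  # crude wait for the next page; a staleness check would be more robust

driver.quit()
print(len(all_rows), "rows collected")

Once the button locator is adjusted to match the actual page, the same loop can feed the collected cells into whatever structure you need (a CSV file or a pandas DataFrame, for example).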