
We have many VM instances in Compute Engine that we use for scraping. They can get blocked by some sites, and when that happens we try to change the IP using NordVPN. We are trying to write a Python script that automates the IP change when we detect that we are blocked. Currently we are using a Python package we recently found, NordVPN-switcher, but we get the following error:

Connecting you to Denver ...
An unknown error occurred while connecting to a different server! 

An unknown error occurred while connecting to a different server! Retrying with a different server...

Traceback (most recent call last):
  File "demo.py", line 13, in <module>
    rotate_VPN(instructions)  # refer to the instructions variable here
  File "/home/eduardo_santos_housecallprosolut/.local/lib/python3.8/site-packages/nordvpn_switcher/nordvpn_switch.py", line 514, in rotate_VPN
    raise Exception("Unable to connect to a new server. Please check your internet connection.\n")
Exception: Unable to connect to a new server. Please check your internet connection.

Note: We have an internet connection.

The VM instances also have NordVPN installed. If we try manually, we can change the IP, but because we are connected to the instance over SSH, the connection is lost the moment the IP changes.

Then, the current problems are:

  1. How to dynamically change the IP of an instance properly?
  2. How to keep a connection after the change occurs.

Note: The scrapers and all the logic are dockerized, and the Python version is 3.9.

As I mentioned at the beginning, we have many machines used for scraping. We would like to keep a registry of the IPs used on each one so that we can assign them better, probably using a Redis DB or a small collection in MongoDB. What do you think about it? What is a good way to develop this?

Thank you so much.

2 Answers


  1. How to dynamically change the IP of an instance properly?

    There is no supported method. Any existing connections will break once the IP address changes. Software that communicates over IP needs to be written to handle connection failures and attempt to reconnect. This type of feature is common in cell phone applications but less so in the desktop/server world.
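    As a rough illustration of the reconnect handling described above, here is a minimal Python sketch. The names `call_with_reconnect`, `action`, and `reconnect` are hypothetical, not part of any library: `action` stands for whatever network call the scraper makes, and `reconnect` for whatever re-establishes the session after the IP has changed.

```python
import time


def call_with_reconnect(action, reconnect, attempts=5, base_delay=1.0,
                        sleep=time.sleep):
    """Run action(); on a connection failure, back off, reconnect, and retry.

    `action` and `reconnect` are caller-supplied callables (hypothetical
    placeholders for the scraper's request and its session setup).
    """
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # ConnectionError and TimeoutError are subclasses of OSError.
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # exponential backoff
            reconnect()
```

    The `sleep` parameter is only there so the delay can be swapped out in tests; the defaults for `attempts` and `base_delay` are arbitrary and would need tuning.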

    An important point with Google Cloud (and most of the cloud vendors) is that your VM does not have a public IP address assigned to a network interface. The public IP address is assigned to one side of a one-to-one NAT. This means IP address change notifications within the OS and applications will not happen.

    Google provides a CLI, SDKs and APIs that can be used to programmatically change the IP address assigned to an instance.
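    For example, with the CLI the external IP can be replaced by deleting the instance's access config and adding a new one. The sketch below only wraps those two `gcloud` commands from Python. It assumes the access config is named `external-nat` (instances created from the console may use `External NAT` instead), and the helper names `build_ip_swap_commands` and `swap_external_ip` are made up for illustration.

```python
import subprocess


def build_ip_swap_commands(instance, zone, new_address=None,
                           access_config_name="external-nat"):
    """Return the two gcloud commands that replace an instance's external IP.

    The access config name is an assumption; check the instance's actual
    value with `gcloud compute instances describe`.
    """
    delete_cmd = [
        "gcloud", "compute", "instances", "delete-access-config", instance,
        f"--zone={zone}",
        f"--access-config-name={access_config_name}",
    ]
    add_cmd = [
        "gcloud", "compute", "instances", "add-access-config", instance,
        f"--zone={zone}",
        f"--access-config-name={access_config_name}",
    ]
    if new_address:
        # Attach a specific reserved address; without this flag an
        # ephemeral address is assigned.
        add_cmd.append(f"--address={new_address}")
    return [delete_cmd, add_cmd]


def swap_external_ip(instance, zone, new_address=None, run=subprocess.run):
    """Run the delete and add commands in order, stopping on the first failure."""
    for cmd in build_ip_swap_commands(instance, zone, new_address):
        run(cmd, check=True)
```

    Run this from a machine other than the instance itself, or over a connection that does not depend on the address being replaced, since any session going through the old public IP will drop.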

    How to keep a connection after the change occurs.

    Two strategies:

    1. Add another network interface with a public IP address that does not change. Connect to the VM using that IP address.

    2. Create a pool of public IP addresses that you will use. Use a VPN such as WireGuard, which has excellent features for following connection address changes. Connect via the VPN using the VM's private IP address, which does not change when the public IP address is changed.

    I would use the first strategy, as it has less complexity and fewer potential problems. However, once you understand how WireGuard manages connections and identifies peers by their public keys instead of by IP addresses, there are numerous possibilities for connection management.

  2. I tried this tonight on a VM with a public IP, then removed the public IP, and it continued to work. It could be the solution!

    You can use IAP to connect to your VM. Do it in a terminal with gcloud like this:

    gcloud compute ssh --tunnel-through-iap --project=$project_name --zone=us-west2-a $instance_name
    

    Let me know. If it doesn't work for you, I will delete the answer.
