I am trying to build an AI agent to play the OpenAI Gym CarRacing environment, and I am having trouble loading saved models. I train them and they work, but after I save and reload them the car doesn’t move at all. I even tried downloading models from other people, and when loaded, the car still doesn’t move.

I am on Ubuntu 20.04 in VS Code in a Jupyter notebook using
gym==0.21.0, stable-baselines3==1.6.0, python==3.7.0

import gym 
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy
import os

I make the environment

environment_name = "CarRacing-v0"
env = gym.make(environment_name)

I create the PPO model and make it learn for a couple thousand timesteps. Now when I evaluate the policy, the car renders as moving.

log_path = os.path.join('Training', 'Logs')
model = PPO("CnnPolicy", env, verbose=1, tensorboard_log=log_path)
model.learn(total_timesteps=4000)
evaluate_policy(model, env, n_eval_episodes=1, render=True)

I save the model

ppo_path = os.path.join('Training', 'Saved Models', 'PPO_Car_Testing')
model.save(ppo_path)

Now I delete the model and load the saved one, and when I evaluate it the car just doesn’t move, as if it always received the "do nothing" action. I tried models trained for anywhere from 2k timesteps up to 2 million timesteps.

del model
model = PPO("CnnPolicy", env, verbose=1, tensorboard_log=log_path)
ppo_path_load = os.path.join('Training', 'Saved Models', 'PPO_2m_Driving_model')
model.load(ppo_path_load, env)
evaluate_policy(model, env, n_eval_episodes=1, render=True)

Any ideas why the models load incorrectly?

2 Answers


  1. Chosen as BEST ANSWER

    The issue can be solved by calling PPO.load() on the class instead of model.load() on an existing instance. I'm not sure why, though.

    Here is the correct code for loading the model:

    del model
    ppo_path_load = os.path.join('Training', 'Saved Models', 'PPO_2m_Driving_model')
    model = PPO.load(ppo_path_load)
    

  2. It seems your model didn’t load correctly.

    The loading code you posted is wrong:

    model = PPO("CnnPolicy", env, verbose=1, tensorboard_log=log_path)
    ppo_path_load = os.path.join('Training', 'Saved Models', 'PPO_2m_Driving_model')
    model.load(ppo_path_load, env)
    

    Change it to:

    model = PPO("CnnPolicy", env, verbose=1, tensorboard_log=log_path)
    ppo_path_load = os.path.join('Training', 'Saved Models', 'PPO_2m_Driving_model')
    model = PPO.load(ppo_path_load, env)
    

    More generally, replace RLALGORITHM with your RL algorithm class, such as PPO or A2C:

    model = RLALGORITHM.load(ppo_path_load, env)
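
Both answers come down to the same point: as far as I know, load() in stable-baselines3 is a classmethod that builds and returns a new model rather than filling in the one it is called on. So a bare model.load(path, env) throws the loaded model away and leaves the freshly constructed, untrained one in place. Here is a minimal sketch of that behaviour in plain Python. ToyModel and its weights are hypothetical stand-ins made up for illustration, not part of stable-baselines3.

```python
class ToyModel:
    """Hypothetical stand-in for an RL algorithm class (illustration only)."""

    def __init__(self, weights):
        self.weights = weights

    @classmethod
    def load(cls, saved_weights):
        # Builds and returns a NEW object; it never modifies an existing instance.
        return cls(saved_weights)


saved = [0.5, -1.2, 3.0]  # pretend these were read from a saved file

# Pattern from the question: call load() on an instance and ignore the result.
model = ToyModel(weights=[0.0, 0.0, 0.0])  # fresh, "untrained" model
model.load(saved)                           # return value is discarded
ignored_result = model.weights              # still the untrained weights

# Pattern from the answers: keep the object that load() returns.
model = ToyModel.load(saved)
loaded_result = model.weights               # now the saved weights

print(ignored_result, loaded_result)
```

With the first pattern the variable still points at the untrained object, which would fit the car sitting still no matter how long the saved model was trained.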
    