
I know that this question has been asked a lot, but none of the suggestions seem to work, probably since my setup is somewhat different:

Ubuntu          22.04
python          3.10.8
tensorflow      2.11.0
cudatoolkit     11.2.2
cudnn           8.1.0.77
nvidia-tensorrt 8.4.3.1
nvidia-pyindex  1.0.9

Having created a conda environment ‘tf’, in the directory /home/dan/anaconda3/envs/tf/lib/python3.10/site-packages/tensorrt I have

libnvinfer_builder_resource.so.8.4.3
libnvinfer_plugin.so.8
libnvinfer.so.8
libnvonnxparser.so.8
libnvparsers.so.8
tensorrt.so

When running python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))" I get

tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7';
dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory;
LD_LIBRARY_PATH: :/home/dan/anaconda3/envs/tf/lib

tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7';
dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory;
LD_LIBRARY_PATH: :/home/dan/anaconda3/envs/tf/lib

tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

I’m guessing I should downgrade nvidia-tensorrt, but nothing I’ve tried seems to work. Any advice would be much appreciated.

3 Answers


  1. Chosen as BEST ANSWER

    Solution: follow the steps listed here https://github.com/tensorflow/tensorflow/issues/57679#issuecomment-1249197802.

    Add the following to ~/.bashrc (for the conda envs as described in my scenario):

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/dan/anaconda3/lib/
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/dan/anaconda3/lib/python3.8/site-packages/tensorrt/
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/dan/anaconda3/envs/tf/lib
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/dan/anaconda3/envs/tf/lib/python3.10/site-packages/tensorrt/
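
    As a rough sketch (not taken from the linked issue), the export lines can be wrapped in a small helper that skips directories that do not exist and avoids adding the same entry twice. The name add_lib_dir is made up for this example, and the demo uses a temporary directory in place of the real tensorrt folder:

    ```shell
    # Hypothetical helper: append a directory to LD_LIBRARY_PATH only if it
    # exists and is not already listed.
    add_lib_dir() {
        dir="${1%/}"                      # drop one trailing slash for comparison
        [ -d "$dir" ] || return 0         # silently skip missing directories
        case ":${LD_LIBRARY_PATH:-}:" in
            *":$dir:"*) ;;                # already present, nothing to do
            *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
        esac
        export LD_LIBRARY_PATH
    }

    # Demo on a temporary directory; substitute your own tensorrt folder.
    demo_dir="$(mktemp -d)"
    add_lib_dir "$demo_dir"
    add_lib_dir "$demo_dir/"              # second call changes nothing
    echo "$LD_LIBRARY_PATH"
    ```

    Skipping missing directories makes a wrong Python version in the path easier to notice, because the entry never shows up in the echoed value.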
    

  2. For me, creating symbolic links from the libnvinfer version 7 names to the version 8 files worked:

    # the following path will be different for you - depending on your install method
    $ cd env/lib/python3.10/site-packages/tensorrt
    
    # create symbolic links
    $ ln -s libnvinfer_plugin.so.8 libnvinfer_plugin.so.7
    $ ln -s libnvinfer.so.8 libnvinfer.so.7
    
    # add tensorrt to library path
    $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/env/lib/python3.10/site-packages/tensorrt/
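
    The two ln commands above can also be written as a loop that only creates a link when the version 8 file exists and the version 7 name is still free, so it is safe to run more than once. This is a minimal sketch: trt_dir is a placeholder variable, and the demo creates empty stand-in files in a temporary directory rather than touching a real install:

    ```shell
    # Demo in a throwaway directory with empty stand-in files; in practice,
    # set trt_dir to your real tensorrt folder and drop the touch line.
    trt_dir="$(mktemp -d)"
    touch "$trt_dir/libnvinfer.so.8" "$trt_dir/libnvinfer_plugin.so.8"

    for lib in libnvinfer libnvinfer_plugin; do
        target="$lib.so.8"                # file that is actually installed
        link="$trt_dir/$lib.so.7"         # name from the "Could not load" message
        if [ -e "$trt_dir/$target" ] && [ ! -e "$link" ]; then
            ln -s "$target" "$link"       # relative link inside the same folder
        fi
        ls -l "$link"                     # shows where the link points
    done
    ```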
    
  3. This has a solution already, but maybe I can add more depth to the problem and a description for folks (like me) who had to install that stuff from scratch.

    Most likely (judging from these questions) you have come here because you are installing kohya_ss.

    In my case the machine did not have some of the required packages (tensorflow and tensorrt), and installing them brought in versions of certain components that are newer than what kohya_ss expects.

    When you see this error, you are probably missing tensorflow and tensorrt (or you already have them, but in a different version).

    # install pip, then TensorRT and TensorFlow
    sudo apt install python3-pip -y
    pip install tensorrt tensorflow
    

    Now try the following command to see whether the install above has already fixed the problem:

    python3 -c 'import tensorflow as tf; print(tf.__version__)' 
    

    If not (if the above command still complains about these missing files), understand that the above install command puts tensorflow and tensorrt inside a hidden folder named ~/.local/… This folder contains the libraries you need, but with different version numbers in their filenames.

    The following command, run from your home directory, will find this other (wrong) version and its path. The pattern is quoted so that the shell passes it to find unchanged:

    find . -name 'libnvinfer.so*' -print
    

    This will give you an output in the form

    .local/lib/python3.10/site-packages/tensorrt/libnvinfer.so.8
    

    Note the part before the actual filename and cd there.

    cd  ~/.local/lib/python3.10/site-packages/tensorrt/
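
    If you prefer not to copy the path by hand, the "part before the filename" can be taken from the find output with dirname. A small sketch, using a made-up directory tree under a temporary root so that it runs anywhere; on a real system you would search your home directory instead of $root:

    ```shell
    # Stand-in tree for the demo; the layout below is only an illustration.
    root="$(mktemp -d)"
    mkdir -p "$root/site-packages/tensorrt"
    touch "$root/site-packages/tensorrt/libnvinfer.so.8"

    # Take the first match and strip the filename to get the folder.
    found="$(find "$root" -name 'libnvinfer.so*' -print | head -n 1)"
    lib_dir="$(dirname "$found")"
    echo "$lib_dir"
    ```

    You can then cd "$lib_dir" and continue with the steps below.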
    

    Type the following command. It creates a symbolic link, so the existing library can also be found under the name that is missing. No data is copied.

    ln -s libnvinfer.so.8 libnvinfer.so.7
    

    Run a similar find command for the libnvinfer_plugin.so.* file:

    find . -name 'libnvinfer_plugin.so.*' -print
    

    With the result, repeat the same steps: find the file with the new name, go to that folder, and create a link with the missing name. The exact paths and filenames may differ for you. The example below applies the same steps to libcudart, another library that can be reported missing:

    cd ~/.local/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/
    ln -s libcudart.so.12 libcudart.so.11.0
    

    Add the two paths to the LD_LIBRARY_PATH like this (only the paths which you used in the cd commands, not the file names):

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.local/lib/python3.10/site-packages/tensorrt/
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.local/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/
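
    Before starting Python again, it can help to check that the missing names are now reachable through LD_LIBRARY_PATH. The helper below is only a sketch (find_in_ld_path is a name invented here): it walks the colon-separated entries and prints every place where the given file exists. The demo adds a temporary directory with an empty stand-in file:

    ```shell
    # Hypothetical helper: list LD_LIBRARY_PATH entries that contain a file.
    find_in_ld_path() {
        name="$1"
        printf '%s\n' "${LD_LIBRARY_PATH:-}" | tr ':' '\n' |
        while IFS= read -r dir; do
            # -e follows symlinks, so a dangling link is not reported.
            if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
                echo "$dir/$name"
            fi
        done
    }

    # Demo with a stand-in file; on a real system skip these three lines.
    demo="$(mktemp -d)"
    touch "$demo/libnvinfer.so.7"
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$demo"

    hits="$(find_in_ld_path libnvinfer.so.7)"
    echo "$hits"
    ```

    An empty result for libnvinfer.so.7 or libnvinfer_plugin.so.7 suggests that a link is missing or that the folder was not added to the path.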
    

    For good measure also add the /bin path that the installer complained about:

    export PATH=$PATH:/home/sdgui/.local/bin
    

    Verify that python3 can now use tensorflow without the error:

    python3 -c 'import tensorflow as tf; print(tf.__version__)' 
    

    Hopefully this prints the version number, with no more warnings about missing files.

    You can then add the two export commands to your .profile or .bashrc file, as pointed out in the answer from the original poster.
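
    One way to do that without ending up with duplicate lines is to append a line only when the file does not contain it yet. This is a sketch under a few assumptions: add_once is a made-up helper, rc_file is a temporary stand-in for ~/.bashrc, and the export line uses $HOME instead of ~ so that it does not depend on tilde expansion:

    ```shell
    # rc_file is a temporary stand-in; point it at ~/.bashrc or ~/.profile
    # once you are happy with the result.
    rc_file="$(mktemp)"
    line='export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/.local/lib/python3.10/site-packages/tensorrt/"'

    add_once() {
        # Append the line only if an identical line is not already in the file.
        grep -q -x -F -e "$1" "$2" || printf '%s\n' "$1" >> "$2"
    }

    add_once "$line" "$rc_file"
    add_once "$line" "$rc_file"           # second call writes nothing
    cat "$rc_file"
    ```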

    Also, if you had tensorrt already installed globally (via sudo), your find command will need to be:

    sudo find / -name 'libnvinfer.so*' -print
    

    The steps will be similar, just with global paths and requiring sudo to do it.

    Enjoy
