I know that this question has been asked a lot, but none of the suggestions seem to work, probably since my setup is somewhat different:
Ubuntu 22.04
python 3.10.8
tensorflow 2.11.0
cudatoolkit 11.2.2
cudnn 8.1.0.77
nvidia-tensorrt 8.4.3.1
nvidia-pyindex 1.0.9
Having created a conda environment ‘tf’, I have the following files in the directory /home/dan/anaconda3/envs/tf/lib/python3.10/site-packages/tensorrt:
libnvinfer_builder_resource.so.8.4.3
libnvinfer_plugin.so.8
libnvinfer.so.8
libnvonnxparser.so.8
libnvparsers.so.8
tensorrt.so
When running python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
I get
tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7';
dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory;
LD_LIBRARY_PATH: :/home/dan/anaconda3/envs/tf/lib
tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7';
dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory;
LD_LIBRARY_PATH: :/home/dan/anaconda3/envs/tf/lib
tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
I’m guessing I should downgrade nvidia-tensorrt, but nothing I’ve tried seems to work. Any advice would be much appreciated.
3 Answers
Solution: follow the steps listed here https://github.com/tensorflow/tensorflow/issues/57679#issuecomment-1249197802.
Add the following to ~/.bashrc (for the conda envs as described in my scenario):
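Roughly along these lines, as a minimal sketch rather than the exact lines: it assumes the fix is to put the tensorrt directory listed in the question on LD_LIBRARY_PATH, and TRT_DIR is just an illustrative variable name. The linked GitHub comment remains the authoritative description of the steps.

```shell
# Sketch of a ~/.bashrc addition; adjust the path to your own conda env.
# TRT_DIR is an illustrative name; the path is the tensorrt directory from the question.
TRT_DIR="/home/dan/anaconda3/envs/tf/lib/python3.10/site-packages/tensorrt"

# Append to LD_LIBRARY_PATH without leaving a stray leading colon when it is empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$TRT_DIR"
```

This only helps if that directory actually contains files named libnvinfer.so.7 and libnvinfer_plugin.so.7 (for example as symbolic links to the .so.8 files), since those are the names TensorFlow tries to load in the log above.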
For me, setting a symbolic link from libnvinfer version 7 to version 8 worked (that is, a link named libnvinfer.so.7 that points at the installed libnvinfer.so.8).
This has a solution already, but maybe I can add more depth to the problem and a description for folks (like me) who had to install that stuff from scratch.
Most likely (judging from these questions) you have ended up here because you are installing kohya_ss.
In my case the machine did not have some of the required packages (tensorflow and tensorrt), and installing them brought in versions of certain components that are newer than what kohya_ss expects.
When you see this error, you are probably missing tensorflow and tensorrt (or you already have them, but in a different version), so installing them is the first step.
Now rerun the check from the question (python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))") to see whether the install already fixed the problem.
If it still complains about the missing files, note that the install puts tensorflow and tensorrt inside a hidden folder named ~/.local/… That folder contains the files you need, but under different version numbers/filenames.
The following command will find this other (wrong) version and its paths:
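Something like this should do it, as a sketch: find_libs is a hypothetical helper name, and the search root assumes the per-user install under ~/.local mentioned above.

```shell
# Hypothetical helper: list files matching a library name pattern below a search root.
find_libs() {
    # $1 = directory to search, $2 = filename pattern (quoted so the shell does not expand it)
    find "$1" -name "$2" 2>/dev/null
    return 0   # a missing or unreadable directory just yields no output
}

# Search the per-user install location for any version of libnvinfer.
find_libs "$HOME/.local" 'libnvinfer.so.*'
```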
This will print the full path of each matching file.
Note the directory part before the actual filename and cd there.
Type the following command. This will create a file with the same content but under a different name (the name that is missing).
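A minimal sketch of that step, assuming a symbolic link is what you want and that the file you found is libnvinfer.so.8 while the error asks for libnvinfer.so.7 (as in the question). link_missing_lib is a hypothetical helper name, and the directory you pass in is whatever find reported for you.

```shell
# Hypothetical helper: expose an existing library under the filename that is missing.
link_missing_lib() {
    dir=$1       # directory reported by find
    existing=$2  # filename that is actually installed, e.g. libnvinfer.so.8
    missing=$3   # filename the error message asks for, e.g. libnvinfer.so.7

    # Only link if the installed file exists and the missing name is not taken yet.
    if [ -e "$dir/$existing" ] && [ ! -e "$dir/$missing" ] && [ ! -L "$dir/$missing" ]; then
        # Relative link: the new name points at the existing file in the same directory.
        ln -s "$existing" "$dir/$missing"
    fi
}

# Example call (replace the first argument with your own directory):
# link_missing_lib /path/from/find libnvinfer.so.8 libnvinfer.so.7
```

The same helper works for any other library name that shows up in the error message.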
Run a similar find command for the libnvinfer_plugin.so.* file.
With the result, repeat the same steps (the details may differ slightly for you, but the steps are the same as above: find the file with the new name, go to that folder, create a link with the missing name).
Add the two paths to LD_LIBRARY_PATH (only the directories which you used in the cd commands, not the file names).
For good measure, also add the /bin path that the installer complained about.
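The two additions could look roughly like this. It is only a sketch: NVINFER_DIR and PLUGIN_DIR are placeholder names for your own directories, and it assumes the installer's complaint was about ~/.local/bin not being on PATH.

```shell
# Placeholders: replace with the directories you used in the cd steps.
NVINFER_DIR="${NVINFER_DIR:-/path/to/dir/with/libnvinfer}"
PLUGIN_DIR="${PLUGIN_DIR:-/path/to/dir/with/libnvinfer_plugin}"

# Directories only, not file names; keep whatever was already on the path.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$NVINFER_DIR:$PLUGIN_DIR"

# Assumption: the installer warned that ~/.local/bin is not on PATH.
export PATH="$PATH:$HOME/.local/bin"
```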
Verify that python3 can now use tensorflow without the error, for example by rerunning the check from the question.
Hopefully you get a message, but no more warnings about missing files.
You can then add the two export commands to your .profile or .bashrc file, as pointed out in the answer from the original poster.
Also, if you had tensorrt already installed globally (via sudo), your find command will need to search the global paths (for example starting from / and run with sudo) rather than ~/.local.
The steps will be similar, just with global paths and requiring sudo to do it.
Enjoy