Ubuntu - Compiling TensorFlow 2.15.0 from source

TargettAdams
December 30, 2023

I am trying to build TensorFlow 2.15.0 with GPU support from source on Ubuntu 22.04. All of the documentation I have seen says that CUDA 12.2 should be used, but the build fails unless I have TensorRT installed. Fine, but TensorRT does not support CUDA 12.2 (I cannot even install TensorRT unless I have CUDA <= 12.1).

What am I missing here?

Details:

In order to compile from source I followed these steps:

Install CUDA 12.2 (as per documentation/release notes) using standard NVIDIA instructions.
Install cuDNN 8.8 (as per documentation/release notes) using standard NVIDIA instructions.
Install clang 17 (as per documentation/release notes).
Clone the tensorflow repository; checkout 2.15.0.
I run the configure script as follows:

    You have bazel 6.1.0 installed.
    Please specify the location of python. [Default is /home/christopher/Desktop/code/tf-source/venv/bin/python3]: 
    
    
    Found possible Python library paths:
      /home/christopher/Desktop/code/tf-source/venv/lib/python3.10/site-packages
    Please input the desired Python library path to use.  Default is [/home/christopher/Desktop/code/tf-source/venv/lib/python3.10/site-packages]
    
    Do you wish to build TensorFlow with ROCm support? [y/N]: 
    No ROCm support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with CUDA support? [y/N]: y
    CUDA support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with TensorRT support? [y/N]: 
    No TensorRT support will be enabled for TensorFlow.
    
    Found CUDA 12.2 in:
        /usr/local/cuda-12.2/targets/x86_64-linux/lib
        /usr/local/cuda-12.2/targets/x86_64-linux/include
    Found cuDNN 8 in:
        /usr/lib/x86_64-linux-gnu
        /usr/include
    
    
    Please specify a list of comma-separated CUDA compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Each capability can be specified as "x.y" or "compute_xy" to include both virtual and binary GPU code, or as "sm_xy" to only include the binary code.
    Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 8.9]: 8.0
    
    
    Do you want to use clang as CUDA compiler? [Y/n]: 
    Clang will be used as CUDA compiler.
    
    Please specify clang path that to be used as host compiler. [Default is /usr/lib/llvm-17/bin/clang]: 
    
    
    You have Clang 17.0.6 installed.
    
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: 
    
    
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: 
    Not configuring the WORKSPACE for Android builds.
    
    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
        --config=mkl            # Build with MKL support.
        --config=mkl_aarch64    # Build with oneDNN and Compute Library for the Arm Architecture (ACL).
        --config=monolithic     # Config for mostly static monolithic build.
        --config=numa           # Build with NUMA support.
        --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
        --config=v1             # Build with TensorFlow 1 API instead of TF 2 API.
    Preconfigured Bazel build configs to DISABLE default on features:
        --config=nogcp          # Disable GCP support.
        --config=nonccl         # Disable NVIDIA NCCL support.
    Configuration finished
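
The answers given in the session above can also be supplied ahead of time through environment variables, which makes the configuration repeatable. The following is only a sketch: it assumes the variable names read by TensorFlow's `configure.py` in this release (check them against the checked-out source), and the function name `tf215_configure_env` is made up for illustration. Prompts without a matching variable here, such as the Python paths, would still be asked interactively.

```shell
# Export answers to the ./configure prompts shown in the session above.
# ASSUMPTION: these names match the variables read by configure.py;
# verify against the checked-out source before relying on them.
tf215_configure_env() {
    export TF_NEED_ROCM=0                      # no ROCm support
    export TF_NEED_CUDA=1                      # CUDA support enabled
    export TF_NEED_TENSORRT=0                  # TensorRT support declined
    export TF_CUDA_COMPUTE_CAPABILITIES=8.0    # capability entered at the prompt
    export TF_CUDA_CLANG=1                     # clang as CUDA compiler
    export CLANG_CUDA_COMPILER_PATH=/usr/lib/llvm-17/bin/clang
    export CC_OPT_FLAGS=-Wno-sign-compare      # default optimization flags
    export TF_SET_ANDROID_WORKSPACE=0          # skip Android WORKSPACE setup
}

# Usage (from the TensorFlow source root):
#   tf215_configure_env && ./configure
```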

When I compile using:

bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package

I see errors like this:

ERROR: /home/christopher/Desktop/code/tf-source/tensorflow/WORKSPACE:84:14: fetching tensorrt_configure rule //external:local_config_tensorrt: Traceback (most recent call last):
    File "/home/christopher/Desktop/code/tf-source/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 300, column 38, in _tensorrt_configure_impl
        _create_local_tensorrt_repository(repository_ctx)
    File "/home/christopher/Desktop/code/tf-source/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 159, column 30, in _create_local_tensorrt_repository
        config = find_cuda_config(repository_ctx, ["cuda", "tensorrt"])
    File "/home/christopher/Desktop/code/tf-source/tensorflow/third_party/gpus/cuda_configure.bzl", line 649, column 26, in find_cuda_config
        exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_cuda_config] + cuda_libraries)
    File "/home/christopher/Desktop/code/tf-source/tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
        fail(
Error in fail: Repository command failed
Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
        'targets/x86_64-linux/include'
of:
        '/lib'
        '/lib/i386-linux-gnu'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/usr'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/lib32'
        '/usr/local/cuda'
        '/usr/local/cuda/targets/x86_64-linux/lib'

I believe the missing header belongs to TensorRT, so I tried to install TensorRT following NVIDIA’s documentation. But the most recent release does not support CUDA 12.2, only <= 12.1. I did try installing CUDA 12.1 instead, and with it I can get quite deep into the compilation; however, the official release is built with CUDA 12.2, so I’m stumped at the moment.
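
A quick way to confirm that the header really is absent is to search the same roots the error message lists. This is a minimal bash sketch, assuming GNU `find` (for `-maxdepth`); the function name `find_nvinfer_header` is hypothetical and is not part of TensorFlow's tooling.

```shell
# Look for NvInferVersion.h under each given root directory.
# Prints every match, one per line; returns non-zero if nothing is found.
find_nvinfer_header() {
    local root matches rc=1
    for root in "$@"; do
        [ -d "$root" ] || continue   # skip roots that do not exist
        matches=$(find "$root" -maxdepth 4 -name NvInferVersion.h 2>/dev/null)
        if [ -n "$matches" ]; then
            printf '%s\n' "$matches"
            rc=0
        fi
    done
    return "$rc"
}

# Usage, with two of the roots from the error message:
#   find_nvinfer_header /usr /usr/local/cuda || echo "TensorRT headers not found"
```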

Tags: compilation tensorflow

Answers

Chosen as BEST ANSWER

Two packages, libnvinfer-dev and libnvinfer-plugin-dev, must be installed. For me, this was as follows:

sudo apt-get install -y libnvinfer-dev=8.6.1.6-1+cuda12.0 libnvinfer-plugin-dev=8.6.1.6-1+cuda12.0

They are installed alongside TensorRT, but can be installed independently.
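
After installing the packages, it can be worth checking which TensorRT version the header now on disk reports. The sketch below assumes that `NvInferVersion.h` defines the macros `NV_TENSORRT_MAJOR`, `NV_TENSORRT_MINOR` and `NV_TENSORRT_PATCH`; the helper names are made up for illustration, and the header path in the usage comment is only a guess at where the Debian package places it.

```shell
# Print the value of one "#define NAME VALUE" macro from a header file.
header_macro() {
    awk -v name="$2" '$1 == "#define" && $2 == name { print $3; exit }' "$1"
}

# Print major.minor.patch as recorded in a NvInferVersion.h file.
# ASSUMPTION: the header defines the three NV_TENSORRT_* macros used below.
tensorrt_header_version() {
    local header=$1 major minor patch
    [ -r "$header" ] || return 1
    major=$(header_macro "$header" NV_TENSORRT_MAJOR)
    minor=$(header_macro "$header" NV_TENSORRT_MINOR)
    patch=$(header_macro "$header" NV_TENSORRT_PATCH)
    if [ -z "$major" ] || [ -z "$minor" ] || [ -z "$patch" ]; then
        return 1   # at least one macro is missing
    fi
    printf '%s.%s.%s\n' "$major" "$minor" "$patch"
}

# Usage (path is an assumption; locate the header first if unsure):
#   tensorrt_header_version /usr/include/x86_64-linux-gnu/NvInferVersion.h
```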

Here is a Dockerfile that sets up an environment capable of compiling 2.15 from source. Note the following:

The cuDNN .deb file must be downloaded manually and placed in the Docker build directory.
Once the image is built, cd into the build directory, pull the latest code and check out v2.15.0.
Run the configure script (do not use clang, as there is a known issue with building 2.15.0 with clang; use nvcc).

FROM ubuntu:22.04

ENV DEBIAN_FRONTEND=noninteractive

WORKDIR /downloads

RUN apt-get update && \
    apt-get install -y --no-install-recommends wget ca-certificates git lsb-release software-properties-common gnupg && \
    rm -rf /var/lib/apt/lists/*

# Install Bazelisk.
RUN wget https://github.com/bazelbuild/bazelisk/releases/download/v1.19.0/bazelisk-linux-amd64 -O /usr/local/bin/bazel && \
    chmod +x /usr/local/bin/bazel

# Install LLVM/Clang 16
RUN wget https://apt.llvm.org/llvm.sh && \
    chmod +x llvm.sh && \
    ./llvm.sh 16 && \
    rm llvm.sh

# Install CUDA Toolkit 12.2.
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
    mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
    wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb && \
    dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb && \
    cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/ && \
    apt-get update && \
    apt-get -y install cuda

# Install cuDNN. This .deb file is fetched manually from the NVIDIA archive (TODO - there is a way to get around the required authorization and use wget).
COPY cudnn-local-repo-ubuntu2204-8.8.1.3_1.0-1_amd64.deb /downloads/
RUN dpkg -i cudnn-local-repo-ubuntu2204-8.8.1.3_1.0-1_amd64.deb && \
    cp /var/cudnn-local-repo-ubuntu2204-8.8.1.3/cudnn-local-*-keyring.gpg /usr/share/keyrings/ && \
    apt-get update && \
    apt-get install -y libcudnn8=8.8.1.3-1+cuda12.0 && \
    apt-get install -y libcudnn8-dev=8.8.1.3-1+cuda12.0 && \
    apt-get install -y libcudnn8-samples=8.8.1.3-1+cuda12.0

# Fetch the tensorflow source code.
RUN git clone https://github.com/tensorflow/tensorflow.git

# Install nvinfer dependencies.
RUN apt-get install -y --no-install-recommends gnupg2 curl ca-certificates && \
    curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-archive-keyring.gpg -o /usr/share/keyrings/cuda-archive-keyring.gpg && \
    echo "deb [signed-by=/usr/share/keyrings/cuda-archive-keyring.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" > /etc/apt/sources.list.d/cuda.list && \
    apt-get purge --autoremove -y curl && \
    rm -rf /var/lib/apt/lists/* && \
    apt-get update && \
    apt-get install -y --no-install-recommends libnvinfer-dev=8.6.1.6-1+cuda12.0 libnvinfer-plugin-dev=8.6.1.6-1+cuda12.0 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN apt-get update && apt-get -y install libstdc++-12-dev 

RUN apt-get install -y python-is-python3 python3-pip python3-dev patchelf

CMD ["/bin/bash"]
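
Because the `COPY` step fails when the cuDNN package is missing from the build directory, a small check before `docker build` can save a long wait. This is a minimal bash sketch; the function name `check_build_context` and the image tag `tf215-build` are placeholders, while the .deb file name is the one the Dockerfile above expects.

```shell
# Check that the Dockerfile and the manually downloaded cuDNN package
# are both present in the build context directory (default: current dir).
check_build_context() {
    local context_dir=${1:-.}
    local cudnn_deb=cudnn-local-repo-ubuntu2204-8.8.1.3_1.0-1_amd64.deb
    if [ ! -f "$context_dir/Dockerfile" ]; then
        echo "missing Dockerfile in $context_dir" >&2
        return 1
    fi
    if [ ! -f "$context_dir/$cudnn_deb" ]; then
        echo "missing $cudnn_deb in $context_dir" >&2
        return 1
    fi
}

# Usage:
#   check_build_context . && docker build -t tf215-build .
```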


- JuxunyWu
- December 28, 2023 at 10:10 pm
You should set up the TensorRT install path, like this:
```
export TENSORRT_INSTALL_PATH=<Your/TensorRT/install/path>
```
By the way, you also need the cuDNN install path,
```
export CUDNN_INSTALL_PATH=<Your/cuDNN/install/path>
```
The Bazel build will then find the TensorRT header files through these environment variables.
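
Before starting a long build, it may help to confirm that each path actually contains the header the build looks for. A minimal bash sketch, assuming GNU `find` (for `-quit`) and assuming `cudnn_version.h` is the relevant cuDNN 8 header; the function name `check_install_path` is made up for illustration.

```shell
# Report whether an install path contains a given header file.
# $1: install path (for example "$TENSORRT_INSTALL_PATH"), $2: header file name.
check_install_path() {
    local path=$1 header=$2
    if [ -z "$path" ] || [ ! -d "$path" ]; then
        echo "not a directory: '$path'" >&2
        return 1
    fi
    # Stop at the first match; an empty result means the header is absent.
    if [ -z "$(find "$path" -name "$header" -print -quit 2>/dev/null)" ]; then
        echo "$header not found under $path" >&2
        return 1
    fi
    echo "$header found under $path"
}

# Usage:
#   check_install_path "$TENSORRT_INSTALL_PATH" NvInferVersion.h
#   check_install_path "$CUDNN_INSTALL_PATH" cudnn_version.h
```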