Add cuda12 variant of tensorflow-notebook (#2100)

ChristofKaufmann · mathbunnyru · web-flow · commit b9553a8e5d33 · 2024-03-26T01:12:59.000Z
* Add cuda12 variant for tensorflow-notebook

* Reduce size of CPU version of tensorflow-notebook

* Try to fix tests

* Update docs/using/selecting.md

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Update images/tensorflow-notebook/cuda12/Dockerfile

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Update tests/docker-stacks-foundation/test_packages.py

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

* Remove obsolete XLA_FLAGS env var

* Install CUDA and cuDNN using pip instead of mamba

* Fix pre-commit shell checks

* Change tensorflow variant name from cuda12 to cuda

* Update selecting.md

* Update selecting.md

---------

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>
diff --git a/.github/workflows/docker.yml b/.github/workflows/docker.yml
@@ -196,6 +196,17 @@ jobs:
     needs: [x86_64-scipy]
     if: ${{ !contains(github.event.pull_request.title, '[FAST_BUILD]') }}
 
+  x86_64-tensorflow-cuda:
+    uses: ./.github/workflows/docker-build-test-upload.yml
+    with:
+      parent-image: scipy-notebook
+      image: tensorflow-notebook
+      variant: cuda
+      platform: x86_64
+      runs-on: ubuntu-latest
+    needs: [x86_64-scipy]
+    if: ${{ !contains(github.event.pull_request.title, '[FAST_BUILD]') }}
+
   aarch64-pytorch:
     uses: ./.github/workflows/docker-build-test-upload.yml
     with:
@@ -378,6 +389,7 @@ jobs:
             { image: r-notebook, variant: default },
             { image: julia-notebook, variant: default },
             { image: tensorflow-notebook, variant: default },
+            { image: tensorflow-notebook, variant: cuda },
             { image: pytorch-notebook, variant: default },
             { image: pytorch-notebook, variant: cuda11 },
             { image: pytorch-notebook, variant: cuda12 },
@@ -439,6 +451,7 @@ jobs:
             { image: r-notebook, variant: default },
             { image: julia-notebook, variant: default },
             { image: tensorflow-notebook, variant: default },
+            { image: tensorflow-notebook, variant: cuda },
             { image: pytorch-notebook, variant: default },
             { image: pytorch-notebook, variant: cuda11 },
             { image: pytorch-notebook, variant: cuda12 },
diff --git a/docs/using/selecting.md b/docs/using/selecting.md
@@ -18,11 +18,12 @@ The following sections describe these images, including their contents, relation
 
 ## CUDA enabled variant
 
-We provide CUDA accelerated version of `pytorch-notebook` image.
-Prepend a CUDA version prefix (like `cuda12-`) to the image tag to allow PyTorch operations to use compatible NVIDIA GPUs for accelerated computation.
-We only build images for 2 last major versions of CUDA.
+We provide CUDA-accelerated versions of the `pytorch-notebook` and `tensorflow-notebook` images.
+Prepend a CUDA prefix (like `cuda12-` for `pytorch-notebook` or `cuda-` for `tensorflow-notebook`) to the image tag
+to allow PyTorch or TensorFlow operations to use compatible NVIDIA GPUs for accelerated computation.
+Note: we only build `pytorch-notebook` for the last two major versions of CUDA; the `tensorflow-notebook` image supports only the latest CUDA version listed in the [officially tested build configurations](https://www.tensorflow.org/install/source#gpu).
 
-For example, you can use an image `quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.8`
+For example, you can use an image `quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.8` or `quay.io/jupyter/tensorflow-notebook:cuda-latest`
 
 ### jupyter/docker-stacks-foundation
 
diff --git a/images/tensorflow-notebook/Dockerfile b/images/tensorflow-notebook/Dockerfile
@@ -11,7 +11,8 @@ LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
 # Fix: https://github.com/koalaman/shellcheck/wiki/SC3014
 SHELL ["/bin/bash", "-o", "pipefail", "-c"]
 
-# Install Tensorflow with pip
-RUN pip install --no-cache-dir tensorflow && \
+# Install TensorFlow with pip; on x86_64, use the smaller CPU-only tensorflow-cpu package
+RUN [[ $(uname -m) = x86_64 ]] && TF_POSTFIX="-cpu" || TF_POSTFIX="" && \
+    pip install --no-cache-dir "tensorflow${TF_POSTFIX}" && \
     fix-permissions "${CONDA_DIR}" && \
     fix-permissions "/home/${NB_USER}"
diff --git a/images/tensorflow-notebook/cuda/Dockerfile b/images/tensorflow-notebook/cuda/Dockerfile
@@ -0,0 +1,27 @@
+# Copyright (c) Jupyter Development Team.
+# Distributed under the terms of the Modified BSD License.
+ARG REGISTRY=quay.io
+ARG OWNER=jupyter
+ARG BASE_CONTAINER=$REGISTRY/$OWNER/scipy-notebook
+FROM $BASE_CONTAINER
+
+LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
+
+# Fix: https://github.com/hadolint/hadolint/wiki/DL4006
+# Fix: https://github.com/koalaman/shellcheck/wiki/SC3014
+SHELL ["/bin/bash", "-o", "pipefail", "-c"]
+
+# Install TensorFlow, CUDA and cuDNN with pip
+RUN pip install --no-cache-dir "tensorflow[and-cuda]" && \
+    fix-permissions "${CONDA_DIR}" && \
+    fix-permissions "/home/${NB_USER}"
+
+# workaround for https://github.com/tensorflow/tensorflow/issues/63362
+RUN mkdir -p "${CONDA_DIR}/etc/conda/activate.d/" && \
+    fix-permissions "${CONDA_DIR}"
+
+COPY --chown="${NB_UID}:${NB_GID}" nvidia-lib-dirs.sh "${CONDA_DIR}/etc/conda/activate.d/"
+
+# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/docker-specialized.html#dockerfiles
+ENV NVIDIA_VISIBLE_DEVICES="all" \
+    NVIDIA_DRIVER_CAPABILITIES="compute,utility"
diff --git a/images/tensorflow-notebook/cuda/nvidia-lib-dirs.sh b/images/tensorflow-notebook/cuda/nvidia-lib-dirs.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# Copyright (c) Jupyter Development Team.
+# Distributed under the terms of the Modified BSD License.
+
+# This adds the NVIDIA libraries to the LD_LIBRARY_PATH. Workaround for
+# https://github.com/tensorflow/tensorflow/issues/63362
+NVIDIA_DIR=$(dirname "$(python -c 'import nvidia;print(nvidia.__file__)')")
+LD_LIBRARY_PATH=$(echo "${NVIDIA_DIR}"/*/lib/ | sed -r 's/\s+/:/g')${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
+export LD_LIBRARY_PATH
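
For readers less familiar with the shell idioms in `nvidia-lib-dirs.sh`, the path construction can be sketched in Python. This is only an illustration, not part of the commit: the function name `build_ld_library_path` and its parameters are hypothetical, and the sketch drops the trailing slash that the shell glob `*/lib/` leaves on each entry.

```python
import glob
import os
import tempfile


def build_ld_library_path(nvidia_dir: str, existing: str = "") -> str:
    """Prepend every <nvidia_dir>/*/lib directory to an existing search path."""
    # Same pattern as the shell glob "${NVIDIA_DIR}"/*/lib/, keeping directories only.
    lib_dirs = sorted(
        path
        for path in glob.glob(os.path.join(nvidia_dir, "*", "lib"))
        if os.path.isdir(path)
    )
    # Mirrors ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}: the old value is
    # appended only when it is non-empty, so no stray ":" is produced.
    parts = lib_dirs + ([existing] if existing else [])
    return ":".join(parts)


if __name__ == "__main__":
    # Hypothetical layout standing in for the pip-installed "nvidia" package.
    with tempfile.TemporaryDirectory() as tmp:
        for pkg in ("cublas", "cudnn"):
            os.makedirs(os.path.join(tmp, pkg, "lib"))
        print(build_ld_library_path(tmp, os.environ.get("LD_LIBRARY_PATH", "")))
```

In the real script the directory comes from `nvidia.__file__`, and the result is exported each time the conda environment is activated.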
diff --git a/tagging/taggers.py b/tagging/taggers.py
@@ -98,7 +98,10 @@ def tag_value(container: Container) -> str:
 class TensorflowVersionTagger(TaggerInterface):
     @staticmethod
     def tag_value(container: Container) -> str:
-        return "tensorflow-" + _get_pip_package_version(container, "tensorflow")
+        try:
+            return "tensorflow-" + _get_pip_package_version(container, "tensorflow")
+        except AssertionError:
+            return "tensorflow-" + _get_pip_package_version(container, "tensorflow-cpu")
 
 
 class PytorchVersionTagger(TaggerInterface):
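
The fallback in `TensorflowVersionTagger` can be read as "try each candidate package name until one is installed". A minimal, self-contained sketch of that idea follows. It assumes, as the `except AssertionError` in the diff implies, that the version lookup raises `AssertionError` for a missing package. The helpers `first_available_version`, `tensorflow_tag` and `fake_lookup` are hypothetical stand-ins, not functions from `tagging/taggers.py`, and the version strings are made up.

```python
from typing import Callable, Sequence


def first_available_version(
    lookup: Callable[[str], str], candidates: Sequence[str]
) -> str:
    """Return the version of the first candidate package that lookup() finds."""
    for name in candidates:
        try:
            return lookup(name)
        except AssertionError:
            # Assumption: a missing package surfaces as AssertionError.
            continue
    raise AssertionError(f"none of these packages is installed: {list(candidates)}")


def tensorflow_tag(lookup: Callable[[str], str]) -> str:
    """Build the tag, preferring "tensorflow" and falling back to "tensorflow-cpu"."""
    return "tensorflow-" + first_available_version(
        lookup, ("tensorflow", "tensorflow-cpu")
    )


def fake_lookup(installed: dict[str, str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a lookup that inspects a running container."""

    def lookup(name: str) -> str:
        if name not in installed:
            raise AssertionError(f"{name} is not installed")
        return installed[name]

    return lookup


if __name__ == "__main__":
    # Made-up version number, for illustration only.
    print(tensorflow_tag(fake_lookup({"tensorflow-cpu": "2.16.1"})))
```

Either package name yields the same `tensorflow-<version>` tag, so image tags stay uniform across the CPU and CUDA variants.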