
Upgrade Base Image: colab_20250404-060113_RC00 #1484


Merged
merged 9 commits into from
May 8, 2025

Conversation

@calderjo (Contributor) commented May 2, 2025

This particular image had issues with UV installs; however, the linked issue does highlight a solution, and the fix will be included in the next image: googlecolab/colabtools#5237

The base image also removes Gensim because of SciPy 1.14.1. We included a fix to install both, since Gensim is a popular package (200 users per day).

Updated the mocks for GCS-related tests, since the latest version causes issues.

Added a few packages back into requirements.txt that had been removed because of issues that have since been resolved.

@calderjo changed the title from "Update base image" to "Update base image to colab_20250404-060113_RC00" on May 5, 2025
@calderjo changed the title from "Update base image to colab_20250404-060113_RC00" to "Upgrade Base Image: colab_20250404-060113_RC00" on May 5, 2025
@calderjo force-pushed the brave-new-world branch from 7a47787 to 2479f1f on May 7, 2025 03:55
@calderjo requested a review from djherbis on May 7, 2025 16:41
@djherbis (Contributor) left a comment

@jaesong-colab Any thoughts about this diff from the new image?

Dockerfile.tmpl Outdated

# b/408284435: Keras 3.6 broke test_keras.py > test_train > keras.datasets.mnist.load_data()
# See https://github.com/keras-team/keras/commit/dcefb139863505d166dd1325066f329b3033d45a
# Colab base is on Keras 3.8, we have to install the package separately
RUN uv pip install --system google-cloud-automl==1.0.1 google-cloud-aiplatform google-cloud-translate==3.12.1 \
google-cloud-videointelligence google-cloud-vision google-genai "keras<3.6"
RUN uv pip install --system "keras<3.6"
Contributor

Why are we stuck on old Keras?

Contributor Author

Same error as the original: load_data (keras.datasets.mnist.load_data()) doesn't allow you to install the dataset in a specific dir, only in the "keras cache dir".
I think we can change the test so that we can upgrade Keras, if we like. wdyt?

Contributor

Yeah I presume updating a test for a package upgrade is reasonable.

Is there a reason not to update the test? Is it validating something that shouldn't change and will break something in Kaggle Notebooks?

Dockerfile.tmpl Outdated
@@ -44,8 +47,10 @@ RUN uv pip install --no-build-isolation --system "git+https://github.com/Kaggle/
# b/408281617: Torch is adamant that it can not install cudnn 9.3.x, only 9.1.x, but Tensorflow can only support 9.3.x.
# This conflict causes a number of package downgrades, which are handled in this command
# b/302136621: Fix eli5 import for learntools
# b/416137032: cuda 12.9.0 breaks datashader 1.18.0
Contributor

What's datashader required for?

Contributor Author

It doesn't seem to be required for learntools, and usage isn't high, so we can remove it.

@@ -7,6 +7,10 @@ FROM gcr.io/kaggle-images/python-lightgbm-whl:${BASE_IMAGE_TAG}-${LIGHTGBM_VERSI
{{ end }}
FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG}

#b/415358342: UV reports missing requirements files https://github.com/googlecolab/colabtools/issues/5237
ENV UV_CONSTRAINT= \
UV_BUILD_CONSTRAINT=
Contributor

Thanks for adding this and including the issue!
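
For context on the override above, here is a minimal Python sketch (not part of this PR) that reports which files referenced by these two variables are missing. It assumes each value is a whitespace-separated list of local file paths, and `missing_constraint_files` is a hypothetical helper name.

```python
import os
from pathlib import Path

# Variable names taken from the ENV line in the diff above.
UV_CONSTRAINT_VARS = ("UV_CONSTRAINT", "UV_BUILD_CONSTRAINT")


def missing_constraint_files(env=None):
    """Return {variable: [referenced paths that do not exist]}."""
    env = os.environ if env is None else env
    missing = {}
    for name in UV_CONSTRAINT_VARS:
        # Assumption: the value is a whitespace-separated list of file paths.
        # An unset or empty value means "no constraint files".
        paths = env.get(name, "").split()
        absent = [p for p in paths if not Path(p).is_file()]
        if absent:
            missing[name] = absent
    return missing
```

Run inside the base image before the ENV override, a non-empty result would point at the files uv complains about. With both variables blanked as in the diff, the result is an empty dict.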

Dockerfile.tmpl Outdated
RUN uv pip install --system --force-reinstall --extra-index-url https://pypi.nvidia.com "cuml-cu12==25.2.1" \
"nvidia-cudnn-cu12==9.3.0.75" scipy tsfresh scikit-learn==1.2.2 category-encoders eli5
"nvidia-cudnn-cu12==9.3.0.75" cuda-bindings==12.8.0 cuda-python==12.8.0 \
Contributor

nvidia-cudnn-cu12 is already at 9.3.0.75. We do not install "cuda-bindings", "cuda-python", "tsfresh", "category-encoders", or "eli5". Perhaps they can be moved to requirements.txt.

Contributor Author

Yeah, these packages were problematic when we made this fix. It seems like they can be re-added to requirements.txt after testing it locally with learntools.
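
One way to check the point raised in this thread (whether a pin is already satisfied by the image, or the package is simply not there) is sketched below. This is an illustration only, not code from the PR: `classify_pins` is a hypothetical helper, and the exact string comparison of versions is a simplifying assumption.

```python
from importlib import metadata


def classify_pins(pins):
    """Classify each pinned package as 'redundant', 'differs (...)' or 'absent'."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not present in the current environment at all.
            report[name] = "absent"
            continue
        if installed == wanted:
            # The environment already provides exactly this version.
            report[name] = "redundant"
        else:
            report[name] = f"differs (installed {installed})"
    return report
```

Inside the base image, this could be called with the pins from the diff, for example `classify_pins({"nvidia-cudnn-cu12": "9.3.0.75", "cuda-bindings": "12.8.0", "cuda-python": "12.8.0"})`. The result depends entirely on the environment it runs in.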

# b/408284143: google-cloud-automl 2.0.0 introduced incompatible API changes, need to pin to 1.0.1
RUN uv pip install --system --force-reinstall --prerelease=allow kagglehub[pandas-datasets,hf-datasets,signing]>=0.3.12 \
google-cloud-automl==1.0.1
Contributor

We do not install "google-cloud-automl". Perhaps it can be moved to requirements.txt.

Contributor Author

Can't: it breaks the build due to conflicting protobuf version requirements. Added a comment to make that clear and ensure it is revisited.
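
When this is revisited, a quick way to see which installed packages declare a protobuf requirement is sketched below. It is a rough sketch using only the standard library, not the method used in this PR. `requirers_of` is a hypothetical helper, and it only reads declared metadata rather than resolving anything.

```python
import re
from importlib import metadata


def _normalize(name):
    # Normalize project names so "Foo_Bar" and "foo-bar" compare equal.
    return re.sub(r"[-_.]+", "-", name).lower()


def requirers_of(package):
    """Map each installed distribution that requires `package` to its requirement strings."""
    target = _normalize(package)
    found = {}
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # The leading token of a requirement string is the project name.
            match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
            if match and _normalize(match.group(1)) == target:
                found.setdefault(dist.metadata["Name"], []).append(req)
    return found
```

Calling `requirers_of("protobuf")` in the built image lists the declared protobuf ranges side by side, which may help spot which packages pull in incompatible directions.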

@calderjo merged commit 8a20862 into main on May 8, 2025
3 checks passed
@calderjo deleted the brave-new-world branch on May 8, 2025 00:05

3 participants