change: merge dockerfiles #235

Merged
merged 74 commits into from
Sep 27, 2019
Commits
c93d034
Scriptmode single machine training implementation (#78)
icywang86rui Sep 27, 2018
3763697
Add tox.ini and configure coverage and flake runs (#80)
icywang86rui Oct 2, 2018
99eaf6b
Add integration tests to run training jobs with sagemaker (#81)
icywang86rui Oct 5, 2018
1338820
Add Script Mode example (#83)
mvsusp Oct 9, 2018
a1916a8
Add benchmarking script (#86)
mvsusp Oct 23, 2018
7047101
Edited the tf script mode notebook (#90)
eslesar-aws Oct 27, 2018
032cf60
Add distributed training support (#98)
icywang86rui Nov 6, 2018
177773d
Add CI configuration files (#109)
mvsusp Nov 15, 2018
a897135
Set S3 environment variables (#112)
icywang86rui Nov 16, 2018
5913b17
GPU fix (#117)
mvsusp Nov 19, 2018
1fab499
Update sagemaker containers (#119)
mvsusp Nov 19, 2018
c4abcae
Set parameter process waiting to False (#120)
mvsusp Nov 20, 2018
378add5
Disable GPU for parameter process (#121)
icywang86rui Nov 21, 2018
534ffa7
Unset CUDA_VISIBLE_DEVICES for worker processes (#122)
icywang86rui Nov 21, 2018
e6bf988
Fix broken unit tests (#124)
icywang86rui Nov 23, 2018
49a0547
Add Keras support (#126)
mvsusp Nov 24, 2018
962f15b
Create parameter server in different thread (#129)
icywang86rui Nov 27, 2018
8e6c4f2
Fix Keras test (#132)
icywang86rui Dec 4, 2018
d2f9f48
Skip keras local mode test on gpu and use random port for serving in …
icywang86rui Dec 5, 2018
80aa735
Update script_mode_train_any_tf_script_in_sage_maker.ipynb (#110)
mvsusp Dec 21, 2018
441adb0
Add python-dev and build-essential to Dockerfiles (#141)
laurenyu Dec 21, 2018
a9e4359
Force parameter server to run on CPU (#143)
icywang86rui Jan 3, 2019
a4e6cfa
Deprecate get_marker. Use get_closest_marker instead (#146)
icywang86rui Jan 7, 2019
4f66042
TensorFlow 1.12 and Horovod support (#138)
mvsusp Jan 8, 2019
8be0efe
Skip horovod integration tests (#149)
icywang86rui Jan 8, 2019
070e5fb
Add Horovod tests (#151)
mvsusp Jan 10, 2019
658ec5a
Skip horovod local CPU test in GPU instances (#152)
mvsusp Jan 11, 2019
48507bb
Add S3 plugin tests (#155)
icywang86rui Jan 25, 2019
ec07c35
Fix broken test test_distributed_mnist_no_ps (#156)
icywang86rui Jan 28, 2019
f339949
Use the test argement framework_version in all tests (#158)
icywang86rui Jan 29, 2019
c4d6b85
Configure encoding to be utf-8 (#160)
yangaws Feb 11, 2019
a7c0aaf
Fix SageMaker Session handling in Horovod test (#165)
laurenyu Feb 15, 2019
269c9a1
Read framework version from Python SDK for integ test default (#167)
laurenyu Feb 15, 2019
e16a936
Fix instance_type fixture setup for tests (#168)
laurenyu Feb 18, 2019
b3cb548
Specify region when creating S3 resource in integ tests (#169)
laurenyu Feb 19, 2019
686ae25
Add model saving warning at end of training (#171)
icywang86rui Feb 28, 2019
c276dac
Skip the s3_plugin test before new binary released (#177)
yangaws Mar 26, 2019
c286f01
Tune test_s3_plugin test (#178)
icywang86rui Apr 3, 2019
00a7a0b
fix: change model_dir to training job name if it is for tuning. (#179)
chuyang-deng Apr 12, 2019
ce47c76
Fix model_dir adjustment for hyperparameter tuning jobs (#181)
laurenyu Apr 22, 2019
215179b
Add Horovod benchmark (#157)
mvsusp Apr 24, 2019
1b60209
Add SageMaker integ test for hyperparameter tuning model_dir logic (#…
laurenyu Apr 25, 2019
f40f010
Add mpi4py to pip installs (#185)
laurenyu Apr 30, 2019
c097ca1
Upgrade to TensorFlow 1.13.1 (#184)
icywang86rui May 8, 2019
85cded3
Update integ test for checking Python version (#189)
laurenyu May 13, 2019
2b7138d
Pull request to test codebuild trigger on TensorFlow script mode (#186)
yangaws May 14, 2019
fb1fbdf
Explicitly set lower-bound for botocore version (#187)
laurenyu May 15, 2019
4610af3
Add release build (#191)
icywang86rui May 20, 2019
d57e1ae
fix: use tar file name as framework_support_installable in build_all.…
icywang86rui May 21, 2019
5b81f42
fix: ignore coverage in release build tests (#193)
icywang86rui May 21, 2019
7bb4475
fix: remove setup file in release build gpu test (#194)
icywang86rui May 21, 2019
cfc4ac5
fix: add branch name to remote gpu test run command (#195)
icywang86rui May 21, 2019
8e97672
fix: add setup file back (#196)
icywang86rui May 21, 2019
c257cf1
fix: skip setup on second remote run (#197)
icywang86rui May 21, 2019
7b1faac
prepare release v0.1.0
May 22, 2019
29fd3c6
update development version to v0.1.1.dev0
May 22, 2019
92e5579
fix: skip gpu SageMaker test in regions with limited amount of p2/3 i…
icywang86rui May 23, 2019
0b0ddb7
fix: fix flake8 errors and add flake8 run in buildspec.yml (#200)
icywang86rui May 23, 2019
c93e7a4
Merge dockerfiles
icywang86rui May 24, 2019
8fd4b49
Update VERSION
icywang86rui May 25, 2019
0b937a9
Update CuDNN and NCCL2 versions. (#214)
Jun 18, 2019
6ef2afa
Pin Horovod version (#218)
laurenyu Jun 22, 2019
a7a61bf
Update the location of latest TF 1.13 builds (#219)
nikhil-sk Jun 25, 2019
e4cc41e
Add Dockerfile for TF 1.14 (#221)
Richardwan7 Jul 15, 2019
c25b68c
Update sagemaker-tensorflow to 1.14 (#223)
Richardwan7 Jul 19, 2019
43d0bc9
Update openmpi to 4.0.1 (#229)
Richardwan7 Jul 29, 2019
1ebee3c
change: add py2 support for TF 1.14 (#232)
chuyang-deng Aug 20, 2019
575f74c
remove unused args
Sep 25, 2019
1b9e28f
update dockerfiles
Sep 26, 2019
8c84a57
Merge branch 'master' into merge-dockerfile
chuyang-deng Sep 26, 2019
9e19483
assert exception message with error.value
Sep 26, 2019
9f6831c
remove extra conftest
Sep 26, 2019
eb4aa17
recover converagerc files
Sep 26, 2019
adf0f5f
recover gpu_device_placement script
Sep 26, 2019
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## v0.1.0 (2019-05-22)

### Bug fixes and other changes

## v2.0.7 (2019-08-15)

### Bug fixes and other changes
74 changes: 46 additions & 28 deletions docker/1.13.1/Dockerfile.cpu
@@ -9,7 +9,10 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
openssh-server \
ca-certificates \
curl \
&& add-apt-repository ppa:deadsnakes/ppa -y \
git \
wget \
vim \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*

# Install Open MPI
@@ -54,39 +57,54 @@ ENV KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=0

WORKDIR /

ARG py_version
ARG framework_installable
ARG framework_support_installable=sagemaker_tensorflow_container-2.0.0.tar.gz
ARG PYTHON=python3
ARG PYTHON_PIP=python3-pip
ARG PIP=pip3
ARG PYTHON_VERSION=3.6.6

RUN if [ $py_version -eq 3 ]; then PYTHON_VERSION=python3.6; else PYTHON_VERSION=python2.7; fi && \
apt-get update && apt-get install -y --no-install-recommends $PYTHON_VERSION-dev --allow-unauthenticated && \
ln -s -f /usr/bin/$PYTHON_VERSION /usr/bin/python && \
ln -s -f /usr/bin/$PYTHON_VERSION /usr/local/bin/python && \
rm -rf /var/lib/apt/lists/*
RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz && \
tar -xvf Python-$PYTHON_VERSION.tgz && cd Python-$PYTHON_VERSION && \
./configure && make && make install && \
apt-get update && apt-get install -y --no-install-recommends libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev && \
make && make install && rm -rf ../Python-$PYTHON_VERSION* && \
ln -s /usr/local/bin/pip3 /usr/bin/pip

ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1 PYTHONIOENCODING=UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py --disable-pip-version-check --no-cache-dir "pip==18.1" && \
rm get-pip.py

COPY $framework_installable tensorflow-1.13.1-py2.py3-none-any.whl
ARG framework_support_installable=sagemaker_tensorflow_container-2.0.0.tar.gz
COPY $framework_support_installable .

RUN pip install --no-cache-dir -U \
keras==2.2.4 \
mpi4py==3.0.1 \
"sagemaker-tensorflow>=1.13,<1.14" && \
ARG TF_URL="https://tensorflow-aws.s3-us-west-2.amazonaws.com/1.13/AmazonLinux/cpu/latest-patch-latest-patch/tensorflow-1.13.1-cp36-cp36m-linux_x86_64.whl"

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN ${PIP} install --no-cache-dir -U \
numpy==1.16.2 \
scipy==1.2.1 \
scikit-learn==0.20.3 \
pandas==0.24.2 \
Pillow==5.4.1 \
h5py==2.9.0 \
keras_applications==1.0.7 \
keras_preprocessing==1.0.9 \
keras==2.2.4 \
requests==2.21.0 \
awscli==1.16.130 \
mpi4py==3.0.1 \
"sagemaker-tensorflow>=1.13,<1.14" && \
# Let's install TensorFlow separately in the end to avoid
# the library version to be overwritten
pip install --force-reinstall --no-cache-dir -U \
tensorflow-1.13.1-py2.py3-none-any.whl \
horovod && \
pip install --no-cache-dir -U $framework_support_installable && \
rm -f tensorflow-1.13.1-py2.py3-none-any.whl && \
rm -f $framework_support_installable && \
pip uninstall -y --no-cache-dir \
markdown \
tensorboard
${PIP} install --force-reinstall --no-cache-dir -U \
${TF_URL} \
horovod==0.16.4 && \
${PIP} install --no-cache-dir -U $framework_support_installable && \
rm -f $framework_support_installable && \
${PIP} uninstall -y --no-cache-dir \
markdown \
tensorboard

ENV SAGEMAKER_TRAINING_MODULE sagemaker_tensorflow_container.training:main

CMD ["bin/bash"]
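The reworked CPU Dockerfile takes its TensorFlow wheel from the `TF_URL` build argument instead of a copied `framework_installable` file, so a different wheel can be selected at build time. A minimal dry-run sketch of that, assuming a hypothetical image tag (`tf-training:example`) and a placeholder wheel URL; it only prints the command rather than invoking Docker. Since the Dockerfile copies `$framework_support_installable`, that tarball would presumably need to sit in the build context.

```shell
#!/bin/sh
# Dry-run helper: prints the docker build command instead of running it.
# The image tag "tf-training:example" and the context "." are hypothetical.
print_build_cmd() {
    dockerfile="$1"   # e.g. docker/1.13.1/Dockerfile.cpu
    tf_url="$2"       # overrides the TF_URL ARG declared in the Dockerfile
    printf 'docker build -f %s --build-arg TF_URL=%s -t tf-training:example .\n' \
        "$dockerfile" "$tf_url"
}

# Placeholder URL, not a real wheel location.
print_build_cmd docker/1.13.1/Dockerfile.cpu https://example.com/tensorflow.whl
```

Omitting `--build-arg TF_URL=...` falls back to the default URL declared in the Dockerfile.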
83 changes: 52 additions & 31 deletions docker/1.13.1/Dockerfile.gpu
@@ -2,11 +2,6 @@ FROM nvidia/cuda:10.0-base-ubuntu16.04

LABEL maintainer="Amazon AI"

RUN apt-get update && apt-get install -y --no-install-recommends --allow-unauthenticated \
software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa -y && \
rm -rf /var/lib/apt/lists/*

RUN apt-get update && apt-get install -y --no-install-recommends --allow-unauthenticated \
ca-certificates \
cuda-command-line-tools-10-0 \
@@ -17,18 +12,22 @@ RUN apt-get update && apt-get install -y --no-install-recommends --allow-unauthe
cuda-cusolver-dev-10-0 \
cuda-cusparse-dev-10-0 \
curl \
libcudnn7=7.4.1.5-1+cuda10.0 \
libcudnn7=7.5.1.10-1+cuda10.0 \
# TensorFlow doesn't require libnccl anymore but Open MPI still depends on it
libnccl2 \
libnccl-dev \
libnccl2=2.4.7-1+cuda10.0 \
libgomp1 \
libnccl-dev=2.4.7-1+cuda10.0 \
libfreetype6-dev \
libhdf5-serial-dev \
libpng12-dev \
libzmq3-dev \
git \
wget \
vim \
build-essential \
openssh-client \
openssh-server \
build-essential && \
zlib1g-dev && \
# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda9.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
@@ -42,7 +41,8 @@ RUN apt-get update && apt-get install -y --no-install-recommends --allow-unauthe
rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* && \
rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* && \
rm /usr/lib/x86_64-linux-gnu/libnvparsers* && \
rm -rf /var/lib/apt/lists/*
rm -rf /var/lib/apt/lists/* && \
mkdir -p /var/run/sshd

###########################################################################
# Horovod & its dependencies
@@ -60,14 +60,17 @@ RUN mkdir /tmp/openmpi && \
ldconfig && \
rm -rf /tmp/openmpi

ARG py_version
ARG framework_installable
ARG framework_support_installable=sagemaker_tensorflow_container-2.0.0.tar.gz
ARG PYTHON=python3
ARG PYTHON_PIP=python3-pip
ARG PIP=pip3
ARG PYTHON_VERSION=3.6.6

RUN if [ $py_version -eq 3 ]; then PYTHON_VERSION=python3.6; else PYTHON_VERSION=python2.7; fi && \
apt-get update && apt-get install -y --no-install-recommends $PYTHON_VERSION-dev --allow-unauthenticated && \
ln -s -f /usr/bin/$PYTHON_VERSION /usr/bin/python && \
rm -rf /var/lib/apt/lists/*
RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz && \
tar -xvf Python-$PYTHON_VERSION.tgz && cd Python-$PYTHON_VERSION && \
./configure && make && make install && \
apt-get update && apt-get install -y --no-install-recommends libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev && \
make && make install && rm -rf ../Python-$PYTHON_VERSION* && \
ln -s /usr/local/bin/pip3 /usr/bin/pip

# Create a wrapper for OpenMPI to allow running as root by default
RUN mv /usr/local/bin/mpirun /usr/local/bin/mpirun.real && \
@@ -100,33 +103,51 @@ RUN mkdir -p /root/.ssh/ && \
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1 PYTHONIOENCODING=UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py --disable-pip-version-check --no-cache-dir "pip==18.1" && \
rm get-pip.py

WORKDIR /

COPY $framework_installable tensorflow-1.13.1-py2.py3-none-any.whl
ARG TF_URL="https://tensorflow-aws.s3-us-west-2.amazonaws.com/1.13/AmazonLinux/gpu/latest-patch-latest-patch/tensorflow-1.13.1-cp36-cp36m-linux_x86_64.whl"

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

ARG framework_support_installable=sagemaker_tensorflow_container-2.0.0.tar.gz
COPY $framework_support_installable .

RUN pip install --no-cache-dir -U \
RUN ${PIP} install --no-cache-dir -U \
numpy==1.16.2 \
scipy==1.2.1 \
scikit-learn==0.20.3 \
pandas==0.24.2 \
Pillow==5.4.1 \
h5py==2.9.0 \
keras_applications==1.0.7 \
keras_preprocessing==1.0.9 \
requests==2.21.0 \
keras==2.2.4 \
awscli==1.16.130 \
mpi4py==3.0.1 \
$framework_support_installable \
"sagemaker-tensorflow>=1.13,<1.14" \
# Let's install TensorFlow separately in the end to avoid
# the library version to be overwritten
&& pip install --force-reinstall --no-cache-dir -U tensorflow-1.13.1-py2.py3-none-any.whl \
\
&& rm -f tensorflow-1.13.1-py2.py3-none-any.whl \
&& rm -f $framework_support_installable \
&& pip uninstall -y --no-cache-dir \
&& ${PIP} install --force-reinstall --no-cache-dir -U ${TF_URL} \
&& ${PIP} install --no-cache-dir -U $framework_support_installable && \
rm -f $framework_support_installable \
&& ${PIP} uninstall -y --no-cache-dir \
markdown \
tensorboard

# Install Horovod, temporarily using CUDA stubs
RUN ldconfig /usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs && \
HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install --no-cache-dir horovod && \
HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_WITH_TENSORFLOW=1 ${PIP} install --no-cache-dir horovod==0.16.4 && \
ldconfig

ENV SAGEMAKER_TRAINING_MODULE sagemaker_tensorflow_container.training:main
# Allow OpenSSH to talk to containers without asking for confirmation
RUN cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
echo " StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config

ENV SAGEMAKER_TRAINING_MODULE sagemaker_tensorflow_container.training:main

CMD ["bin/bash"]
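The GPU Dockerfile now rewrites `/etc/ssh/ssh_config` so that OpenSSH stops prompting for host-key confirmation: it filters out any existing `StrictHostKeyChecking` line, appends `StrictHostKeyChecking no`, and moves the result into place. The same filter, append, and replace steps can be tried on a scratch file. In this sketch the sample file contents are an assumption for illustration, not the image's actual config.

```shell
#!/bin/sh
# Works on a scratch copy; the Dockerfile edits /etc/ssh/ssh_config itself.
workdir="$(mktemp -d)"
cfg="$workdir/ssh_config"

# Hypothetical starting config with a conflicting setting.
printf 'Host *\n    StrictHostKeyChecking ask\n    SendEnv LANG LC_*\n' > "$cfg"

# Drop any existing StrictHostKeyChecking line, then append the desired one.
grep -v StrictHostKeyChecking "$cfg" > "$cfg.new"
echo "    StrictHostKeyChecking no" >> "$cfg.new"
mv "$cfg.new" "$cfg"

cat "$cfg"
```

Because the new line is appended at the end of the file, it attaches to whichever `Host` block comes last, which is worth checking if the base image's config has more than one.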
127 changes: 127 additions & 0 deletions docker/1.14.0/py2/Dockerfile.cpu
@@ -0,0 +1,127 @@
FROM ubuntu:16.04

LABEL maintainer="Amazon AI"

RUN apt-get update && apt-get install -y --no-install-recommends \
software-properties-common \
build-essential \
openssh-client \
openssh-server \
ca-certificates \
curl \
git \
wget \
vim \
gcc-4.9 \
g++-4.9 \
gcc-4.9-base \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*

# Install Open MPI
RUN mkdir /tmp/openmpi && \
cd /tmp/openmpi && \
curl -fSsL -O https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz && \
tar zxf openmpi-4.0.1.tar.gz && \
cd openmpi-4.0.1 && \
./configure --enable-orterun-prefix-by-default && \
make -j $(nproc) all && \
make install && \
ldconfig && \
rm -rf /tmp/openmpi

# Create a wrapper for OpenMPI to allow running as root by default
RUN mv /usr/local/bin/mpirun /usr/local/bin/mpirun.real && \
echo '#!/bin/bash' > /usr/local/bin/mpirun && \
echo 'mpirun.real --allow-run-as-root "$@"' >> /usr/local/bin/mpirun && \
chmod a+x /usr/local/bin/mpirun

RUN echo "hwloc_base_binding_policy = none" >> /usr/local/etc/openmpi-mca-params.conf && \
echo "rmaps_base_mapping_policy = slot" >> /usr/local/etc/openmpi-mca-params.conf

ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH

ENV PATH /usr/local/openmpi/bin/:$PATH

# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd

# Create SSH key.
RUN mkdir -p /root/.ssh/ && \
mkdir -p /var/run/sshd && \
ssh-keygen -q -t rsa -N '' -f /root/.ssh/id_rsa && \
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys && \
printf "Host *\n StrictHostKeyChecking no\n" >> /root/.ssh/config

# Set environment variables for MKL
# For more about MKL with TensorFlow see:
# https://www.tensorflow.org/performance/performance_guide#tensorflow_with_intel%C2%AE_mkl_dnn
ENV KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=0

WORKDIR /

ARG PYTHON=python
ARG PYTHON_PIP=python-pip
ARG PIP=pip

RUN apt-get update && apt-get install -y \
${PYTHON} \
${PYTHON_PIP}

ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1 PYTHONIOENCODING=UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8

ARG framework_support_installable=sagemaker_tensorflow_container-2.0.0.tar.gz
ARG sagemaker_tensorflow_extensions=sagemaker_tensorflow-1.14.0.1.0.0-cp27-cp27mu-manylinux1_x86_64.whl
COPY $framework_support_installable .
COPY $sagemaker_tensorflow_extensions .
ARG TF_URL="https://tensorflow-aws.s3-us-west-2.amazonaws.com/1.14/AmazonLinux/cpu/final/tensorflow-1.14.0-cp27-cp27mu-linux_x86_64.whl"

# Pin GCC to 4.9 (priority 200) to compile correctly against TensorFlow, PyTorch, and MXNet with horovod
# Backup existing GCC installation as priority 100, so that it can be recovered later.
RUN update-alternatives --install /usr/bin/gcc gcc $(readlink -f $(which gcc)) 100 && \
update-alternatives --install /usr/bin/x86_64-linux-gnu-gcc x86_64-linux-gnu-gcc $(readlink -f $(which gcc)) 100 && \
update-alternatives --install /usr/bin/g++ g++ $(readlink -f $(which g++)) 100 && \
update-alternatives --install /usr/bin/x86_64-linux-gnu-g++ x86_64-linux-gnu-g++ $(readlink -f $(which g++)) 100
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 200 && \
update-alternatives --install /usr/bin/x86_64-linux-gnu-gcc x86_64-linux-gnu-gcc /usr/bin/gcc-4.9 200 && \
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 200 && \
update-alternatives --install /usr/bin/x86_64-linux-gnu-g++ x86_64-linux-gnu-g++ /usr/bin/g++-4.9 200

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN ${PIP} install --no-cache-dir -U \
numpy==1.16.4 \
scipy==1.2.2 \
scikit-learn==0.20.3 \
pandas==0.24.2 \
Pillow==6.1.0 \
h5py==2.9.0 \
keras_applications==1.0.8 \
keras_preprocessing==1.1.0 \
requests==2.22.0 \
keras==2.2.4 \
awscli==1.16.196 \
mpi4py==3.0.2 \
$sagemaker_tensorflow_extensions \
# Let's install TensorFlow separately in the end to avoid
# the library version to be overwritten
&& ${PIP} install --force-reinstall --no-cache-dir -U ${TF_URL} \
&& ${PIP} install --no-cache-dir -U $framework_support_installable && \
rm -f $framework_support_installable \
&& ${PIP} install --no-cache-dir -U horovod==0.16.4 \
&& ${PIP} uninstall -y --no-cache-dir \
markdown

# Remove GCC pinning
RUN update-alternatives --remove gcc /usr/bin/gcc-4.9 && \
update-alternatives --remove x86_64-linux-gnu-gcc /usr/bin/gcc-4.9 && \
update-alternatives --remove g++ /usr/bin/g++-4.9 && \
update-alternatives --remove x86_64-linux-gnu-g++ /usr/bin/g++-4.9


ENV SAGEMAKER_TRAINING_MODULE sagemaker_tensorflow_container.training:main

CMD ["bin/bash"]
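The "wrapper for OpenMPI" step in this Dockerfile renames `mpirun` to `mpirun.real` and installs a small script under the original name that always passes `--allow-run-as-root`. A minimal sketch of the same pattern in a scratch directory follows. The `mpirun.real` here is a stand-in that only echoes its arguments (it is not Open MPI), and the wrapper locates it via `dirname "$0"` rather than relying on `PATH` as the Dockerfile's version does.

```shell
#!/bin/sh
# Scratch-directory demo; "mpirun.real" below is a stand-in, not Open MPI.
bindir="$(mktemp -d)"

# Stand-in for the real launcher: prints each argument on its own line.
cat > "$bindir/mpirun.real" <<'EOF'
#!/bin/sh
for arg in "$@"; do printf '%s\n' "$arg"; done
EOF

# Wrapper under the original name that always injects the extra flag
# and forwards the caller's arguments unchanged.
cat > "$bindir/mpirun" <<'EOF'
#!/bin/sh
exec "$(dirname "$0")/mpirun.real" --allow-run-as-root "$@"
EOF

chmod a+x "$bindir/mpirun.real" "$bindir/mpirun"

# Arguments containing spaces survive because of the quoted "$@".
"$bindir/mpirun" -np 2 "hello world"
```

Quoting `"$@"` is what keeps arguments with spaces intact when they pass through the wrapper.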