Description
Describe the bug
When I create a Pipeline with two NotebookJobStep steps and pass the same dict as the environment_variables parameter to both, the first step runs with the second step's input notebook instead of its own.
To reproduce
import json

from sagemaker.workflow.notebook_job_step import NotebookJobStep
from sagemaker.workflow.pipeline import Pipeline

# The same dict object is passed to both steps below.
env_vars = {
    'test': 'test',
}
steps = [
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job1.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job2.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
]
pipeline = Pipeline(
    name="pipeline",
    steps=steps,
)
# `role` is the ARN of an IAM execution role, defined elsewhere.
pipeline.upsert(role_arn=role)
execution = pipeline.start()
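The problem seems to be in the environment variables generated for each step. Both steps end up with identical values, including the input and output notebook names of the second step: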
print(json.loads(pipeline.definition())["Steps"][0]["Arguments"]["Environment"])
{
    'test': 'test',
    'AWS_DEFAULT_REGION': 'us-east-1',
    'SM_JOB_DEF_VERSION': '1.0',
    'SM_ENV_NAME': 'sagemaker-default-env',
    'SM_SKIP_EFS_SIMULATION': 'true',
    'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
    'SM_KERNEL_NAME': 'python3',
    'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb',  <<==== wrong input
    'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
    'SM_INIT_SCRIPT': 'setup.sh'
}
print(json.loads(pipeline.definition())["Steps"][1]["Arguments"]["Environment"])
{
    'test': 'test',
    'AWS_DEFAULT_REGION': 'us-east-1',
    'SM_JOB_DEF_VERSION': '1.0',
    'SM_ENV_NAME': 'sagemaker-default-env',
    'SM_SKIP_EFS_SIMULATION': 'true',
    'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
    'SM_KERNEL_NAME': 'python3',
    'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb',
    'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
    'SM_INIT_SCRIPT': 'setup.sh'
}
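The identical output for both steps is consistent with the shared dict being modified in place, although I have not confirmed this in the SDK source. The sketch below only illustrates that aliasing pattern in plain Python. `FakeStep` and `build_steps` are hypothetical names made up for this example and are not part of the SageMaker SDK.

```python
# Hypothetical stand-in for a step class; NOT the SageMaker SDK implementation.
class FakeStep:
    def __init__(self, input_notebook, environment_variables):
        # Assumed bug pattern: keep a reference to the caller's dict
        # and write step-specific keys directly into it.
        self.environment = environment_variables
        self.environment["SM_INPUT_NOTEBOOK_NAME"] = input_notebook


def build_steps(env_vars, copy_env):
    """Build two steps, optionally giving each its own copy of env_vars."""
    notebooks = ["job1.ipynb", "job2.ipynb"]
    return [
        FakeStep(nb, dict(env_vars) if copy_env else env_vars)
        for nb in notebooks
    ]


# Same dict object shared by both steps: the last write wins for both.
shared = build_steps({"test": "test"}, copy_env=False)
# A shallow copy per step: each step keeps its own notebook name.
copied = build_steps({"test": "test"}, copy_env=True)

shared_names = [s.environment["SM_INPUT_NOTEBOOK_NAME"] for s in shared]
copied_names = [s.environment["SM_INPUT_NOTEBOOK_NAME"] for s in copied]
print(shared_names)  # ['job2.ipynb', 'job2.ipynb']
print(copied_names)  # ['job1.ipynb', 'job2.ipynb']
```

If this is indeed the cause, passing a separate copy to each step (for example `environment_variables=dict(env_vars)`) might work around the problem until it is fixed.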
Expected behavior
The first step runs job1.ipynb and the second step runs job2.ipynb.
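One way to check this without starting an execution is to read the notebook names back out of the pipeline definition. This is a minimal sketch that assumes the definition has the `Steps` / `Arguments` / `Environment` layout shown in the output above. `input_notebooks` is a hypothetical helper, and `sample_definition` is a trimmed, hand-written stand-in for `pipeline.definition()`.

```python
import json


def input_notebooks(definition_json):
    """Return SM_INPUT_NOTEBOOK_NAME for each step in a pipeline definition string."""
    definition = json.loads(definition_json)
    return [
        step["Arguments"]["Environment"]["SM_INPUT_NOTEBOOK_NAME"]
        for step in definition["Steps"]
    ]


# Hand-written sample with only the keys this check needs (illustrative only).
sample_definition = json.dumps(
    {
        "Steps": [
            {"Arguments": {"Environment": {"SM_INPUT_NOTEBOOK_NAME": "job1.ipynb"}}},
            {"Arguments": {"Environment": {"SM_INPUT_NOTEBOOK_NAME": "job2.ipynb"}}},
        ]
    }
)

names = input_notebooks(sample_definition)
# Expected behavior: every step reports its own input notebook.
all_distinct = len(set(names)) == len(names)
print(names, all_distinct)
```

With a real pipeline the call would be `input_notebooks(pipeline.definition())`. For the reproduction above it currently returns 'job2.ipynb' for both steps.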
Screenshots or logs
Screenshot of notebook jobs in Studio UI:
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.226.1
- Framework name (eg. PyTorch) or algorithm (eg. KMeans):
- Framework version:
- Python version: 3.8.18
- CPU or GPU: CPU
- Custom Docker image (Y/N): N