Make tests container-runtime agnostic #2396

Closed
vkarak opened this issue Jan 25, 2022 · 2 comments · Fixed by #2537

Comments

@vkarak
Contributor

vkarak commented Jan 25, 2022

Currently, the test has to define the container platform and the configuration simply states how that platform is made available. Ideally, we want the test to be able to run without changes on a container platform that is entirely configured at the system level.

@vkarak
Contributor Author

vkarak commented May 29, 2022

I think that the approach we have been investigating so far, i.e., extending the Environment configuration to use a container base image, is heading in the wrong direction. Generally, a container image tends to be rather application-specific, so the current approach, which binds it to the test, is more correct. The only problem with the current approach is the one described in this issue. Additionally, if the image is part of the test (as it is now), we can very easily parameterise the test over multiple images and test all of them at once. Therefore, I suggest that for this feature we improve the ContainerPlatform so that we don't need to specify its type in the test, and that we also reuse the test's executable and executable_opts, so that the test remains practically the same in its containerised and non-containerised forms.
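To make the suggestion concrete, here is a minimal sketch in plain Python (not ReFrame code) of how the same executable and executable_opts could drive both the native and the containerised run. The `launch_command` helper and the `RUN_PREFIX` table are hypothetical names invented for this illustration, and the prefixes cover only the most basic invocation of each runtime.

```python
# Hypothetical launch prefixes for illustration only; real runtimes
# accept many more options than shown here.
RUN_PREFIX = {
    'Sarus': ['sarus', 'run'],
    'Singularity': ['singularity', 'exec'],
    'Docker': ['docker', 'run', '--rm'],
}


def launch_command(executable, executable_opts, platform=None, image=None):
    """Compose the command line for a native or a containerised run.

    The test-side inputs (executable, executable_opts) are identical in
    both cases; only the optional image and platform differ.
    """
    cmd = [executable, *executable_opts]
    if image is None:
        # Native run: just the executable and its options.
        return ' '.join(cmd)

    # Containerised run: prepend the runtime prefix and the image.
    return ' '.join([*RUN_PREFIX[platform], image, *cmd])


# The same test "parameterised" over several images.
for image in ['ubuntu:20.04', 'ubuntu:22.04']:
    print(launch_command('cat', ['/etc/os-release'], 'Sarus', image))

# The non-containerised form of the same test.
print(launch_command('cat', ['/etc/os-release']))
```

In this sketch the platform name is still passed explicitly; the point of the proposal is that it would come from the system configuration instead of from the test.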

@vkarak
Contributor Author

vkarak commented Jun 13, 2022

After having tried several approaches to solving this issue, I think that the current solution is (almost) the best, and I will explain why here. The original idea of this issue is to allow a test to be transparently executed inside a container as-is and have ReFrame launch the container (through the current ContainerPlatform machinery). Although this seems reasonable for a normal container, it collapses when you try to apply it to HPC containers. According to the original idea, the user could set container_platform.image = 'image' and the test would magically run inside that image. The problem arises when you want to interface with HPC container runtimes, where you write something like srun sarus [options] run [executable]. This gives the image access to host-optimised resources, such as the MPI stack or accelerator device drivers and runtimes. However, in the original (uncontainerised) test, only the executable would be run with the parallel launcher. Where, then, should the other parts of the test go, such as prerun_cmds, postrun_cmds and any environment setup commands? If they are emitted before the parallel container launch, they run outside the container, which was not the intent of this feature. If they are squeezed into the container launch command together with the executable, then multiple instances of them would run (imagine a curl command in prerun_cmds), one on each node. I believe that this coincides with the rationale behind our original decision to have users explicitly specify the commands to run in the container by setting container_platform.command, rather than having the framework guess what should go where.
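The two placements described above can be sketched as follows. This is plain Python that only generates job script lines, not ReFrame's actual script generation; the `run_lines` helper, the image name, the URL, and the use of `bash -c` inside the image are all assumptions made for the illustration.

```python
import shlex


def run_lines(prerun_cmds, executable, image, squeeze):
    """Return the job script lines for one of the two placements.

    squeeze=False: prerun commands are emitted before the parallel
                   container launch, so they run on the host, outside
                   the container.
    squeeze=True:  prerun commands are folded into the container
                   command, so every instance started by the parallel
                   launcher repeats them.
    """
    container = f'srun sarus run {image}'
    if squeeze:
        # Assumes the image provides bash (an assumption of this sketch).
        inner = ' && '.join([*prerun_cmds, executable])
        return [f'{container} bash -c {shlex.quote(inner)}']

    return [*prerun_cmds, f'{container} {executable}']


prerun = ['curl -O https://example.com/input.dat']  # hypothetical URL

print('# prerun outside the container:')
print('\n'.join(run_lines(prerun, './app', 'my/image:latest', squeeze=False)))

print('# prerun squeezed into the container launch:')
print('\n'.join(run_lines(prerun, './app', 'my/image:latest', squeeze=True)))
```

In the first form the download happens once but on the host; in the second it happens inside the container but once per launched instance. Neither matches what a user would expect from a transparent switch, which is why the commands to run in the container are left for the user to state explicitly.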

In summary, I believe the current implementation of container runtime support is fine. The only fine-tuning we could do is to not have users specify the container platform themselves in the test, but rather have it picked automatically from the partition configuration or, later on, auto-detected.
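As a rough sketch of what picking the platform from the partition configuration could look like, consider the following plain Python. The dictionary layout is loosely modelled on a partition's list of container platforms, but the `select_platform` helper, the `default` key, and the fallback to the first entry are assumptions of this example, not a description of what ReFrame does.

```python
def select_platform(partition, requested=None):
    """Pick a container platform entry for a test on a given partition.

    If the test names a platform, that one must be configured. Otherwise
    the entry flagged as default is used, falling back to the first
    configured entry (both rules are assumptions of this sketch).
    """
    platforms = partition.get('container_platforms', [])
    if requested is not None:
        for p in platforms:
            if p['type'] == requested:
                return p
        raise ValueError(f'platform {requested!r} is not configured')

    if not platforms:
        raise ValueError('no container platform is configured')

    for p in platforms:
        if p.get('default', False):
            return p

    return platforms[0]


# Hypothetical partition configuration.
partition = {
    'name': 'gpu',
    'container_platforms': [
        {'type': 'Singularity'},
        {'type': 'Sarus', 'default': True},
    ],
}

print(select_platform(partition)['type'])
print(select_platform(partition, 'Singularity')['type'])
```

With a rule of this kind, the test would only need to set the image and the command, and the same test could run unchanged on systems that provide different container runtimes.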

Finally, for the build phase, I would argue that we possibly need a separate test variable, e.g., build_container_platform, similar to the build_job that we use for launching build jobs. But this should come as a separate feature request and PR.
