Skip to content

[UR] Enable hwloc in UMF linux build #15261

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 24, 2025

Conversation

PatKamin
Copy link
Contributor

@PatKamin PatKamin commented Sep 3, 2024

Update UR to statically link UMF with hwloc on Linux.
This will enable hwloc support in UMF on Linux.

Also, always use the v2.9.3 hwloc version for hwloc builds.

@PatKamin PatKamin marked this pull request as ready for review September 13, 2024 13:35
@PatKamin PatKamin requested a review from a team as a code owner September 13, 2024 13:35
@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from d8a76ff to 94c1a73 Compare December 6, 2024 08:47
@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from 7aa9690 to dcc6761 Compare December 9, 2024 12:11
@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from dcc6761 to 2891f1e Compare December 9, 2024 12:16
@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from 2891f1e to c647376 Compare December 9, 2024 12:28
@PatKamin PatKamin marked this pull request as draft December 9, 2024 13:58
@PatKamin
Copy link
Contributor Author

PatKamin commented Dec 9, 2024

Drafted until the UMF fix is pulled down to llvm via UMF update in UR.

@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from c647376 to c6e5f06 Compare December 12, 2024 09:01
@PatKamin PatKamin marked this pull request as ready for review December 12, 2024 09:01
@PatKamin
Copy link
Contributor Author

Drafted until the UMF fix is pulled down to llvm via UMF update in UR.

Fix is now pulled down, ready for review.

@igchor
Copy link
Member

igchor commented Dec 30, 2024

You should probably also update https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md (remove hwloc as a dependency that needs to be installed by the user)

@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from c6e5f06 to f1fa58b Compare December 31, 2024 11:36
@PatKamin PatKamin requested a review from a team as a code owner December 31, 2024 11:36
@PatKamin
Copy link
Contributor Author

You should probably also update https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md (remove hwloc as a dependency that needs to be installed by the user)

Right, now it will automatically be downloaded - deleted user requirement.

Comment on lines 47 to 48
* `hwloc` version 2.3 or later (Linux only)
* libhwloc-dev or hwloc-devel package on linux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update UR to statically link hwloc v2.9.3 both on linux and Windows.

If I get the change right, the dependency on hwloc still exists. The change removed the runtime dependency, but not the compile time dependency. Am I right?

This document describes the environment for DPC++ compiler developers, not DPC++ compiler users.

Your change seems to raise the version requirement. Please, update the hwloc version to 2.9.3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hwloc library will always be fetched from sources. Regardless of hwloc-dev packages user has installed, the fetched version will be linked with the UMF library. That's why I remove hwloc from this dependencies list, as the user-installed hwloc-dev package will not be used by UMF.

@bader
Copy link
Contributor

bader commented Dec 31, 2024

You should probably also update https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md (remove hwloc as a dependency that needs to be installed by the user)

Right, now it will automatically be downloaded - deleted user requirement.

If hwloc dependencies will be automatically resolved by CMake, please, remove installation of these dependencies from CI scripts. Specifically, from

and
sudo apt-get install -y graphviz ssh ninja-build libhwloc-dev
.

@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from f1fa58b to d54b9d2 Compare January 2, 2025 08:48
@PatKamin
Copy link
Contributor Author

In general, I'm not a fan of having those dependencies because they complicate our SW stack for not so apparent reason. Pardon my ignorance, but why hwloc and even UMF have to be used by default? At least for the amount of testing we perform at intel/llvm - are there any tests which can detect side effects of UMF and/or hwloc being present or absent? If not, then I would be strongly against bringing that dependency by default, because it means that its an overhead without a reason.

The UMF library is required for allocation handling by default in UR, the user cannot opt-out, UR can't be built without UMF. There are two ways that UR gets UMF now. UMF is either fetched by UR or the user can force CMake to search for the installed UMF with find_package using the UR_USE_EXTERNAL_UMF env var.
Although currently hwloc can be disabled and not used, the env var that lets the user to do this (UMF_DISABLE_HWLOC) is planned to be removed prior the stable 1.0 release of UMF. hwloc will be a required dependency of UMF.

Also, why do we need to build either from source at all? I understand reason for building UR from sources, because it is moves together with SYCL RT on a very fast pace and changes have to be done in sync, but is it really the case for UMF and hwloc?

We don't release UMF prebuilt libraries as a part of open-source releases and there are no plans to do so. Users using UMF have to fetch it.

As @bader suggested, why can't we just download a binary release of hwloc (and ideally UMF) during UMF (UR) build?

I suggest we fetch binaries instead of sources.
https://www.open-mpi.org/software/hwloc/v2.9/ has links to the binaries for different platforms.

I don't fetch hwloc binaries as we apply a patch to its sources.
I also wanted the intel/llvm hwloc handling to be the same as on the downstream where I wouldn't like to force having hwloc binaries installed in the system as a requirement.

On the second hand, currently on Linux hwloc is a requirement of intel/llvm and CI seems to be ready for enabling hwloc in UMF Linux builds, I could verify that. So I propose to stick with having hwloc as a requirement in intel/llvm and not fetch it.

@vinser52
Copy link
Contributor

AS @PatKamin said today UMF project does not distribute binaries. We can prioritize this discussion and estimate what should be done to distribute UMF in binary form. I agree that it might be a best option.

Regarding why UMF is not optional, @PatKamin said that UR uses UMF to support memory pooling. Another reason is the interop between runtimes. UMF provides observability for other runtimes. For example, MPI uses UMF to get the memory properties of the memory allocated by SYCL applications.

@AlexeySachkov
Copy link
Contributor

I don't fetch hwloc binaries as we apply a patch to its sources.

Does it mean that we need to redistribute our own version of hwloc then?

@PatKamin
Copy link
Contributor Author

Does it mean that we need to redistribute our own version of hwloc then?

No, it's just one of the options. The other is to require the user to have hwloc installed.

@PatKamin
Copy link
Contributor Author

Reverted back changes with fetching hwloc. Leaving hwloc as a required dependency. Failing CI jobs also fail on other PRs, looks like these failures are not caused by changes in this PR.

@AlexeySachkov
Copy link
Contributor

Thanks for educating me on the topic, @vinser52, @PatKamin.

I don't fetch hwloc binaries as we apply a patch to its sources.

Does it mean that we need to redistribute our own version of hwloc then?

No, it's just one of the options. The other is to require the user to have hwloc installed.

What is the point of patching hwloc if we link with it dynamically and don't redistribute the updated version? Shouldn't we link statically with hwloc then?

Do I understand correctly that we don't fetch hwloc on Windows either and it is also expected to be present on a build system?

@PatKamin
Copy link
Contributor Author

What is the point of patching hwloc if we link with it dynamically and don't redistribute the updated version? Shouldn't we link statically with hwloc then?

The patch is needed to satisfy Coverity requirements in case hwloc is fetched and built as a static library.

Do I understand correctly that we don't fetch hwloc on Windows either and it is also expected to be present on a build system?

In Windows we have to do the fetch and link statically with hwloc always: https://github.com/intel/llvm/blob/sycl/sycl/cmake/modules/FetchUnifiedRuntime.cmake#L129-L130

Please, note that I've minimized the changes in this PR recently. On current sycl branch, hwloc already is a required llvm dependency and is installed in intel/llvm CI environment. I've removed the fetching for Linux builds in favor of using the installed hwloc library to reduce the unnecessary overhead of hwloc fetch and build during llvm build.

This will, however, require changes in the downstream CI environment, so I put this PR on draft until it's ready there.

@PatKamin PatKamin marked this pull request as draft February 17, 2025 15:38
@PatKamin
Copy link
Contributor Author

Changes wait for #17551 to be merged.

@PatKamin PatKamin force-pushed the enable-hwloc-on-linux branch from b9e9269 to 4effcff Compare March 21, 2025 11:35
@PatKamin PatKamin marked this pull request as ready for review March 21, 2025 12:04
@PatKamin
Copy link
Contributor Author

@intel/unified-runtime-reviewers, please take a look - all tests passed. Changes are now minimal.

@PatKamin
Copy link
Contributor Author

@intel/llvm-gatekeepers, I think this PR is ready to merge

@ldrumm
Copy link
Contributor

ldrumm commented Mar 24, 2025

It doesn't look like Alexey's concerns have been addressed so I'm going to hold off merge until he's confirmed he's okay with this.

As a side note: the PR description doesn't do what it says

@martygrant martygrant merged commit 548a221 into intel:sycl Mar 24, 2025
21 checks passed
@maksimsab
Copy link
Contributor

@PatKamin

I am getting the following error locally. Can it be related to this change?

[1231/8769] Generating ../hwloc_targ-src/configure
FAILED: _deps/hwloc_targ-src/configure /localdisk2/msabiani/llvm/build/_deps/hwloc_targ-src/configure
cd /localdisk2/msabiani/llvm/build/_deps/hwloc_targ-src && ./autogen.sh
./autogen.sh: line 13: autoreconf: command not found

@PatKamin
Copy link
Contributor Author

@PatKamin

I am getting the following error locally. Can it be related to this change?

[1231/8769] Generating ../hwloc_targ-src/configure
FAILED: _deps/hwloc_targ-src/configure /localdisk2/msabiani/llvm/build/_deps/hwloc_targ-src/configure
cd /localdisk2/msabiani/llvm/build/_deps/hwloc_targ-src && ./autogen.sh
./autogen.sh: line 13: autoreconf: command not found

The hwloc is required on Linux (

* `hwloc` version 2.3 or later (Linux only)
* libhwloc-dev or hwloc-devel package on linux
), otherwise fetching triggers.

@PatKamin PatKamin deleted the enable-hwloc-on-linux branch March 27, 2025 15:56
sarnex pushed a commit that referenced this pull request Mar 27, 2025
This flag will be removed from UMF. Either `hwloc` installed in the
system is used or it's fetched if not found.

Ref.: #15261
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants