Mechanically port bulk of x86 builtins to TableGen #120831

Merged
merged 2 commits into from
Jan 4, 2025

Conversation

chandlerc
Member

@chandlerc chandlerc commented Dec 21, 2024

The goal is to make incremental (if small) progress towards fully TableGen'ed builtins, and to unblock #120534 by gaining access to more powerful TableGen-based representations.

The bulk .td file addition was generated with the help of a very rough Python script. That script made no attempt to be robust or reusable; it handled only the cases that appear in the X86 .def file.
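
The script itself is not part of this PR. As a rough illustration only, the first step of such a port (splitting one `TARGET_BUILTIN` line into its fields) might look like the sketch below. The names `BuiltinEntry` and `parse_target_builtin` are hypothetical, and this is not the author's script.

```python
import re
from typing import NamedTuple, Optional


class BuiltinEntry(NamedTuple):
    """One TARGET_BUILTIN line, split into its four fields (hypothetical type)."""
    name: str        # e.g. __builtin_ia32_sqrtps
    prototype: str   # encoded type string, e.g. "V4fV4f"
    attributes: str  # e.g. "ncV:128:"
    features: str    # e.g. "sse"; may be empty


# Matches: TARGET_BUILTIN(name, "type", "attrs", "features")
# Whitespace around the commas is optional, as it is in the .def file.
_TARGET_BUILTIN_RE = re.compile(
    r'^TARGET_BUILTIN\(\s*(\w+)\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"\s*,'
    r'\s*"([^"]*)"\s*\)\s*$'
)


def parse_target_builtin(line: str) -> Optional[BuiltinEntry]:
    """Return the parsed entry, or None for any line that is not TARGET_BUILTIN."""
    match = _TARGET_BUILTIN_RE.match(line.strip())
    if match is None:
        return None
    return BuiltinEntry(*match.groups())


entry = parse_target_builtin(
    'TARGET_BUILTIN(__builtin_ia32_sqrtps, "V4fV4f", "ncV:128:", "sse")'
)
print(entry)
```

Lines that use other macros, such as `TARGET_HEADER_BUILTIN` or plain `BUILTIN`, return `None` here, which mirrors the PR's note that some entries needed separate handling.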

Four entries from the .def file were not handled automatically as they used BUILTIN rather than TARGET_BUILTIN. These were ported by hand to an empty-feature TargetBuiltin entry, which seems like a better match.

For all the automatically ported entries, the results were compared by sorting and diffing the .def file and the generated .inc file. The only differences were:

  • Different horizontal whitespace

  • Additional entries that had already been ported to the .td file.

  • More systematically using Oi instead of LLi for the type long long int in the fully general __builtin_ia32_... builtins for OpenCL support. The .def file seems to have been only partially migrated to Oi, and the systematic migration updated a few builtins that had been missed.
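
The sort-and-diff check described above can be sketched roughly as follows. This is an assumption-laden illustration, not the procedure actually used: `normalize` and `compare_entries` are hypothetical names, and the builtin names in the usage example are made up.

```python
import re


def normalize(line: str) -> str:
    """Erase the two differences the comparison is expected to tolerate."""
    # Difference 1: horizontal whitespace.
    line = re.sub(r"[ \t]+", "", line)
    # Difference 2: LLi vs. Oi. A plain textual replace is a simplification;
    # it would also rewrite an LLLi (__int128) code, which a real check
    # would need to special-case.
    return line.replace("LLi", "Oi")


def compare_entries(def_lines, inc_lines):
    """Return (only in the .def file, only in the generated .inc file)."""
    def entries(lines):
        return {
            normalize(line)
            for line in lines
            if line.lstrip().startswith("TARGET_BUILTIN(")
        }

    old, new = entries(def_lines), entries(inc_lines)
    return sorted(old - new), sorted(new - old)


# Made-up entries, for illustration only.
old_def = [
    'TARGET_BUILTIN(__builtin_example_a, "V2LLiV2LLi", "ncV:128:", "sse2")',
    'TARGET_BUILTIN(__builtin_example_b, "v",  "n", "sse")',
]
new_inc = [
    'TARGET_BUILTIN(__builtin_example_a, "V2OiV2Oi", "ncV:128:", "sse2")',
    'TARGET_BUILTIN(__builtin_example_b, "v", "n", "sse")',
    'TARGET_BUILTIN(__builtin_example_c, "v", "n", "")',
]
missing, extra = compare_entries(old_def, new_inc)
print(missing, extra)
```

Under these assumptions, an empty first list means nothing was lost in the port, and the second list holds the entries that had already been ported to the .td file.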

@llvmbot added the clang, backend:X86, and clang:frontend labels Dec 21, 2024
@llvmbot
Member

llvmbot commented Dec 21, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-x86

Author: Chandler Carruth (chandlerc)

Changes

The goal is to make incremental (if small) progress towards fully TableGen'ed builtins, and to unblock #120534 by gaining access to more powerful TableGen-based representations.

The bulk .td file addition was generated with the help of a very rough Python script. That script made no attempt to be robust or reusable; it handled only the cases that appear in the X86 .def file.

Four entries from the .def file were not handled automatically as they used BUILTIN rather than TARGET_BUILTIN. These were ported by hand to an empty-feature TargetBuiltin entry, which seems like a better match.

For all the automatically ported entries, the results were compared by sorting and diffing the .def file and the generated .inc file. The only differences were:

  • Different horizontal whitespace

  • Additional entries that had already been ported to the .td file.

  • Systematically using Oi instead of LLi for the type long long int. The .def file uses a mixture of Oi and LLi. I chose the shorter encoding.

This gives me high confidence in the correctness of the change.


Patch is 504.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120831.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsBase.td (+8-3)
  • (removed) clang/include/clang/Basic/BuiltinsX86.def (-2225)
  • (modified) clang/include/clang/Basic/BuiltinsX86.td (+5387)
  • (modified) clang/include/clang/Basic/TargetBuiltins.h (-2)
  • (modified) clang/lib/Basic/Targets/X86.cpp (-8)
  • (modified) clang/utils/TableGen/ClangBuiltinsEmitter.cpp (+24)
diff --git a/clang/include/clang/Basic/BuiltinsBase.td b/clang/include/clang/Basic/BuiltinsBase.td
index cff182f3f282cb..afed3c815d3290 100644
--- a/clang/include/clang/Basic/BuiltinsBase.td
+++ b/clang/include/clang/Basic/BuiltinsBase.td
@@ -95,9 +95,6 @@ class CustomEntry {
 }
 
 class AtomicBuiltin : Builtin;
-class TargetBuiltin : Builtin {
-  string Features = "";
-}
 
 class LibBuiltin<string header, string languages = "ALL_LANGUAGES"> : Builtin {
   string Header = header;
@@ -122,6 +119,14 @@ class OCL_DSELangBuiltin : LangBuiltin<"OCL_DSE">;
 class OCL_GASLangBuiltin : LangBuiltin<"OCL_GAS">;
 class OCLLangBuiltin : LangBuiltin<"ALL_OCL_LANGUAGES">;
 
+class TargetBuiltin : Builtin {
+  string Features = "";
+}
+class TargetLibBuiltin : TargetBuiltin {
+  string Header;
+  string Languages = "ALL_LANGUAGES";
+}
+
 class Template<list<string> substitutions,
                list<string> affixes,
                bit as_prefix = 0> {
diff --git a/clang/include/clang/Basic/BuiltinsX86.def b/clang/include/clang/Basic/BuiltinsX86.def
deleted file mode 100644
index 352b3a9ec594a7..00000000000000
--- a/clang/include/clang/Basic/BuiltinsX86.def
+++ /dev/null
@@ -1,2225 +0,0 @@
-//===--- BuiltinsX86.def - X86 Builtin function database --------*- C++ -*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// This file defines the X86-specific builtin function database.  Users of
-// this file must define the BUILTIN macro to make use of this information.
-//
-//===----------------------------------------------------------------------===//
-
-// The format of this database matches clang/Basic/Builtins.def.
-
-// FIXME: Ideally we would be able to pull this information from what
-// LLVM already knows about X86 builtins. We need to match the LLVM
-// definition anyway, since code generation will lower to the
-// intrinsic if one exists.
-
-#if defined(BUILTIN) && !defined(TARGET_BUILTIN)
-#   define TARGET_BUILTIN(ID, TYPE, ATTRS, FEATURE) BUILTIN(ID, TYPE, ATTRS)
-#endif
-
-#if defined(BUILTIN) && !defined(TARGET_HEADER_BUILTIN)
-#  define TARGET_HEADER_BUILTIN(ID, TYPE, ATTRS, HEADER, LANG, FEATURE) BUILTIN(ID, TYPE, ATTRS)
-#endif
-
-// MMX
-//
-// All MMX instructions will be generated via builtins. Any MMX vector
-// types (<1 x i64>, <2 x i32>, etc.) that aren't used by these builtins will be
-// expanded by the back-end.
-// FIXME: _mm_prefetch must be a built-in because it takes a compile-time constant
-// argument and our prior approach of using a #define to the current built-in
-// doesn't work in the presence of re-declaration of _mm_prefetch for windows.
-TARGET_BUILTIN(_mm_prefetch, "vcC*i", "nc", "mmx")
-
-// SSE intrinsics.
-
-TARGET_BUILTIN(__builtin_ia32_ldmxcsr, "vUi", "n", "sse")
-TARGET_HEADER_BUILTIN(_mm_setcsr, "vUi", "nh",XMMINTRIN_H, ALL_LANGUAGES, "sse")
-TARGET_BUILTIN(__builtin_ia32_stmxcsr, "Ui", "n", "sse")
-TARGET_HEADER_BUILTIN(_mm_getcsr, "Ui", "nh", XMMINTRIN_H, ALL_LANGUAGES, "sse")
-TARGET_BUILTIN(__builtin_ia32_cvtss2si, "iV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_cvttss2si, "iV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_movmskps, "iV4f", "nV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_sfence, "v", "n", "sse")
-TARGET_HEADER_BUILTIN(_mm_sfence, "v", "nh", XMMINTRIN_H, ALL_LANGUAGES, "sse")
-TARGET_BUILTIN(__builtin_ia32_rcpps, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_rcpss, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_rsqrtps, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_rsqrtss, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_sqrtps, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_sqrtss, "V4fV4f", "ncV:128:", "sse")
-TARGET_BUILTIN(__builtin_ia32_shufps, "V4fV4fV4fIi", "ncV:128:", "sse")
-
-TARGET_BUILTIN(__builtin_ia32_maskmovdqu, "vV16cV16cc*", "nV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_movmskpd, "iV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pmovmskb128, "iV16c", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_movnti, "vi*i", "n", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pshufd, "V4iV4iIi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pshuflw, "V8sV8sIi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pshufhw, "V8sV8sIi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psadbw128, "V2OiV16cV16c", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_sqrtpd, "V2dV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_sqrtsd, "V2dV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_shufpd, "V2dV2dV2dIi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvtpd2dq, "V2OiV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvtpd2ps, "V4fV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvttpd2dq, "V4iV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvtsd2si, "iV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvttsd2si, "iV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvtsd2ss, "V4fV4fV2d", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvtps2dq, "V4iV4f", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cvttps2dq, "V4iV4f", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_clflush, "vvC*", "n", "sse2")
-TARGET_HEADER_BUILTIN(_mm_clflush, "vvC*", "nh", EMMINTRIN_H, ALL_LANGUAGES, "sse2")
-TARGET_BUILTIN(__builtin_ia32_lfence, "v", "n", "sse2")
-TARGET_HEADER_BUILTIN(_mm_lfence, "v", "nh", EMMINTRIN_H, ALL_LANGUAGES, "sse2")
-TARGET_BUILTIN(__builtin_ia32_mfence, "v", "n", "sse2")
-TARGET_HEADER_BUILTIN(_mm_mfence, "v", "nh", EMMINTRIN_H, ALL_LANGUAGES, "sse2")
-TARGET_BUILTIN(__builtin_ia32_pause, "v", "n", "")
-TARGET_HEADER_BUILTIN(_mm_pause, "v", "nh", EMMINTRIN_H, ALL_LANGUAGES, "")
-TARGET_BUILTIN(__builtin_ia32_pmuludq128, "V2OiV4iV4i", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psraw128, "V8sV8sV8s", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrad128, "V4iV4iV4i", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrlw128, "V8sV8sV8s", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrld128, "V4iV4iV4i", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrlq128, "V2OiV2OiV2Oi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psllw128, "V8sV8sV8s", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pslld128, "V4iV4iV4i", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psllq128, "V2OiV2OiV2Oi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psllwi128, "V8sV8si", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pslldi128, "V4iV4ii", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psllqi128, "V2OiV2Oii", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrlwi128, "V8sV8si", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrldi128, "V4iV4ii", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrlqi128, "V2OiV2Oii", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrawi128, "V8sV8si", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psradi128, "V4iV4ii", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pmaddwd128, "V4iV8sV8s", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_pslldqi128_byteshift, "V2OiV2OiIi", "ncV:128:", "sse2")
-TARGET_BUILTIN(__builtin_ia32_psrldqi128_byteshift, "V2OiV2OiIi", "ncV:128:", "sse2")
-
-TARGET_BUILTIN(__builtin_ia32_monitor, "vvC*UiUi", "n", "sse3")
-TARGET_BUILTIN(__builtin_ia32_mwait, "vUiUi", "n", "sse3")
-TARGET_BUILTIN(__builtin_ia32_lddqu, "V16ccC*", "nV:128:", "sse3")
-
-TARGET_BUILTIN(__builtin_ia32_palignr128, "V16cV16cV16cIi", "ncV:128:", "ssse3")
-
-TARGET_BUILTIN(__builtin_ia32_insertps128, "V4fV4fV4fIc", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_pblendvb128, "V16cV16cV16cV16c", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_pblendw128, "V8sV8sV8sIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_blendpd, "V2dV2dV2dIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_blendps, "V4fV4fV4fIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_blendvpd, "V2dV2dV2dV2d", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_blendvps, "V4fV4fV4fV4f", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_packusdw128, "V8sV4iV4i", "ncV:128:", "sse4.1")
-
-TARGET_BUILTIN(__builtin_ia32_pmuldq128, "V2OiV4iV4i", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_roundps, "V4fV4fIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_roundss, "V4fV4fV4fIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_roundsd, "V2dV2dV2dIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_roundpd, "V2dV2dIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_dpps, "V4fV4fV4fIc", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_dppd, "V2dV2dV2dIc", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_ptestz128, "iV2OiV2Oi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_ptestc128, "iV2OiV2Oi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_ptestnzc128, "iV2OiV2Oi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_mpsadbw128, "V16cV16cV16cIc", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_phminposuw128, "V8sV8s", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_vec_ext_v16qi, "cV16cIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_vec_set_v16qi, "V16cV16ccIi", "ncV:128:", "sse4.1")
-TARGET_BUILTIN(__builtin_ia32_vec_set_v4si, "V4iV4iiIi", "ncV:128:", "sse4.1")
-
-// SSE 4.2
-TARGET_BUILTIN(__builtin_ia32_pcmpistrm128, "V16cV16cV16cIc", "ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpistri128, "iV16cV16cIc", "ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestrm128, "V16cV16ciV16ciIc", "ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestri128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-
-TARGET_BUILTIN(__builtin_ia32_pcmpistria128, "iV16cV16cIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpistric128, "iV16cV16cIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpistrio128, "iV16cV16cIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpistris128, "iV16cV16cIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpistriz128, "iV16cV16cIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestria128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestric128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestrio128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestris128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-TARGET_BUILTIN(__builtin_ia32_pcmpestriz128, "iV16ciV16ciIc","ncV:128:", "sse4.2")
-
-TARGET_BUILTIN(__builtin_ia32_crc32qi, "UiUiUc", "nc", "crc32")
-TARGET_BUILTIN(__builtin_ia32_crc32hi, "UiUiUs", "nc", "crc32")
-TARGET_BUILTIN(__builtin_ia32_crc32si, "UiUiUi", "nc", "crc32")
-
-// SSE4a
-TARGET_BUILTIN(__builtin_ia32_extrqi, "V2OiV2OiIcIc", "ncV:128:", "sse4a")
-TARGET_BUILTIN(__builtin_ia32_extrq, "V2OiV2OiV16c", "ncV:128:", "sse4a")
-TARGET_BUILTIN(__builtin_ia32_insertqi, "V2OiV2OiV2OiIcIc", "ncV:128:", "sse4a")
-TARGET_BUILTIN(__builtin_ia32_insertq, "V2OiV2OiV2Oi", "ncV:128:", "sse4a")
-TARGET_BUILTIN(__builtin_ia32_movntsd, "vd*V2d", "nV:128:", "sse4a")
-TARGET_BUILTIN(__builtin_ia32_movntss, "vf*V4f", "nV:128:", "sse4a")
-
-// AES
-TARGET_BUILTIN(__builtin_ia32_aesenc128, "V2OiV2OiV2Oi", "ncV:128:", "aes")
-TARGET_BUILTIN(__builtin_ia32_aesenclast128, "V2OiV2OiV2Oi", "ncV:128:", "aes")
-TARGET_BUILTIN(__builtin_ia32_aesdec128, "V2OiV2OiV2Oi", "ncV:128:", "aes")
-TARGET_BUILTIN(__builtin_ia32_aesdeclast128, "V2OiV2OiV2Oi", "ncV:128:", "aes")
-TARGET_BUILTIN(__builtin_ia32_aesimc128, "V2OiV2Oi", "ncV:128:", "aes")
-TARGET_BUILTIN(__builtin_ia32_aeskeygenassist128, "V2OiV2OiIc", "ncV:128:", "aes")
-
-// VAES
-TARGET_BUILTIN(__builtin_ia32_aesenc256, "V4OiV4OiV4Oi", "ncV:256:", "vaes")
-TARGET_BUILTIN(__builtin_ia32_aesenc512, "V8OiV8OiV8Oi", "ncV:512:", "avx512f,evex512,vaes")
-TARGET_BUILTIN(__builtin_ia32_aesenclast256, "V4OiV4OiV4Oi", "ncV:256:", "vaes")
-TARGET_BUILTIN(__builtin_ia32_aesenclast512, "V8OiV8OiV8Oi", "ncV:512:", "avx512f,evex512,vaes")
-TARGET_BUILTIN(__builtin_ia32_aesdec256, "V4OiV4OiV4Oi", "ncV:256:", "vaes")
-TARGET_BUILTIN(__builtin_ia32_aesdec512, "V8OiV8OiV8Oi", "ncV:512:", "avx512f,evex512,vaes")
-TARGET_BUILTIN(__builtin_ia32_aesdeclast256, "V4OiV4OiV4Oi", "ncV:256:", "vaes")
-TARGET_BUILTIN(__builtin_ia32_aesdeclast512, "V8OiV8OiV8Oi", "ncV:512:", "avx512f,evex512,vaes")
-
-// GFNI
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineinvqb_v16qi, "V16cV16cV16cIc", "ncV:128:", "gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineinvqb_v32qi, "V32cV32cV32cIc", "ncV:256:", "avx,gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineinvqb_v64qi, "V64cV64cV64cIc", "ncV:512:", "avx512f,evex512,gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineqb_v16qi, "V16cV16cV16cIc", "ncV:128:", "gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineqb_v32qi, "V32cV32cV32cIc", "ncV:256:", "avx,gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8affineqb_v64qi, "V64cV64cV64cIc", "ncV:512:", "avx512f,evex512,gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8mulb_v16qi, "V16cV16cV16c", "ncV:128:", "gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8mulb_v32qi, "V32cV32cV32c", "ncV:256:", "avx,gfni")
-TARGET_BUILTIN(__builtin_ia32_vgf2p8mulb_v64qi, "V64cV64cV64c", "ncV:512:", "avx512f,evex512,gfni")
-
-// CLMUL
-TARGET_BUILTIN(__builtin_ia32_pclmulqdq128, "V2OiV2OiV2OiIc", "ncV:128:", "pclmul")
-
-// VPCLMULQDQ
-TARGET_BUILTIN(__builtin_ia32_pclmulqdq256, "V4OiV4OiV4OiIc", "ncV:256:", "vpclmulqdq")
-TARGET_BUILTIN(__builtin_ia32_pclmulqdq512, "V8OiV8OiV8OiIc", "ncV:512:", "avx512f,evex512,vpclmulqdq")
-
-// AVX
-TARGET_BUILTIN(__builtin_ia32_vpermilvarpd, "V2dV2dV2Oi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilvarps, "V4fV4fV4i", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilvarpd256, "V4dV4dV4Oi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilvarps256, "V8fV8fV8i", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_blendpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_blendps256, "V8fV8fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_blendvpd256, "V4dV4dV4dV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vextractf128_pd256, "V2dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vextractf128_ps256, "V4fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vextractf128_si256, "V4iV8iIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cvtpd2ps256, "V4fV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cvtps2dq256, "V8iV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cvttpd2dq256, "V4iV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cvtpd2dq256, "V4iV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cvttps2dq256, "V8iV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vperm2f128_pd256, "V4dV4dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vperm2f128_ps256, "V8fV8fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vperm2f128_si256, "V8iV8iV8iIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilpd, "V2dV2dIi", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilps, "V4fV4fIi", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilpd256, "V4dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vpermilps256, "V8fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vinsertf128_pd256, "V4dV4dV2dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vinsertf128_ps256, "V8fV8fV4fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vinsertf128_si256, "V8iV8iV4iIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_sqrtpd256, "V4dV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_sqrtps256, "V8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_rsqrtps256, "V8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_rcpps256, "V8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_roundpd256, "V4dV4dIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_roundps256, "V8fV8fIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestzpd, "iV2dV2d", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestcpd, "iV2dV2d", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestnzcpd, "iV2dV2d", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestzps, "iV4fV4f", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestcps, "iV4fV4f", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestnzcps, "iV4fV4f", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestzpd256, "iV4dV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestcpd256, "iV4dV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestnzcpd256, "iV4dV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestzps256, "iV8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestcps256, "iV8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vtestnzcps256, "iV8fV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_ptestz256, "iV4OiV4Oi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_ptestc256, "iV4OiV4Oi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_ptestnzc256, "iV4OiV4Oi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_movmskpd256, "iV4d", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_movmskps256, "iV8f", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vzeroall, "v", "n", "avx")
-TARGET_BUILTIN(__builtin_ia32_vzeroupper, "v", "n", "avx")
-TARGET_BUILTIN(__builtin_ia32_lddqu256, "V32ccC*", "nV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskloadpd, "V2dV2dC*V2Oi", "nV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskloadps, "V4fV4fC*V4i", "nV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskloadpd256, "V4dV4dC*V4Oi", "nV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskloadps256, "V8fV8fC*V8i", "nV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskstorepd, "vV2d*V2OiV2d", "nV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskstoreps, "vV4f*V4iV4f", "nV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskstorepd256, "vV4d*V4OiV4d", "nV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_maskstoreps256, "vV8f*V8iV8f", "nV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_ext_v32qi, "cV32cIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_ext_v16hi, "sV16sIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_ext_v8si, "iV8iIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_set_v32qi, "V32cV32ccIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_set_v16hi, "V16sV16ssIi", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_vec_set_v8si, "V8iV8iiIi", "ncV:256:", "avx")
-
-// AVX2
-TARGET_BUILTIN(__builtin_ia32_mpsadbw256, "V32cV32cV32cIc", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_packsswb256, "V32cV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_packssdw256, "V16sV8iV8i", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_packuswb256, "V32cV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_packusdw256, "V16sV8iV8i", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_palignr256, "V32cV32cV32cIi", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_pavgb256, "V32cV32cV32c", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_pavgw256, "V16sV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_pblendvb256, "V32cV32cV32cV32c", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_pblendw256, "V16sV16sV16sIi", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phaddw256, "V16sV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phaddd256, "V8iV8iV8i", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phaddsw256, "V16sV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phsubw256, "V16sV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phsubd256, "V8iV8iV8i", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_phsubsw256, "V16sV16sV16s", "ncV:256:", "avx2")
-TARGET_BUILTIN(__builtin_ia32_pmaddubsw256, "V16s...
[truncated]

@phoebewang
Contributor

  • Systematically using Oi instead of LLi for the type long long int. The .def file uses a mixture of Oi and LLi. I chose the shorter encoding.

The mixed use of Oi and LLi is a mess, but Oi has a different meaning for OpenCL targets. I think we should not change LLi to Oi. I think a lot of Oi uses can be replaced with LLi, but I cannot tell which are required from a quick look.

@chandlerc
Member Author

  • Systematically using Oi instead of LLi for the type long long int. The .def file uses a mixture of Oi and LLi. I chose the shorter encoding.

The mixed use of Oi and LLi is a mess, but Oi has a different meaning for OpenCL targets. I think we should not change LLi to Oi. I think a lot of Oi uses can be replaced with LLi, but I cannot tell which are required from a quick look.

I do understand that, but I'm actually a bit more confident that these changes are correct... Or at least not a regression.

Specifically, the x86 intrinsic builtins use Oi very consistently for "quad word" (64-bit) vector operations from SSE through AVX2. For example __builtin_ia32_psllqi256 uses V4Oi. This seems quite intentional, so I was very hesitant to reverse it.
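
To make the encoding concrete: a code such as V4Oi reads as "vector of 4 elements of type Oi". The sketch below computes the total width of such a vector under the assumption of x86 element sizes. `vector_bits` is a hypothetical helper that handles only the simple vector codes seen in this file; it is not part of Clang.

```python
import re

# Element widths in bits, assuming x86 sizes for the base type codes.
_ELEMENT_BITS = {"c": 8, "s": 16, "i": 32, "f": 32, "d": 64}

# V<count>, an optional O or LL width prefix, then a base type code.
_VECTOR_RE = re.compile(r"^V(\d+)(O|LL)?([csifd])$")


def vector_bits(type_code: str) -> int:
    """Total width in bits of a single vector type code such as 'V4Oi'."""
    match = _VECTOR_RE.match(type_code)
    if match is None:
        raise ValueError(f"not a simple vector type code: {type_code!r}")
    count, prefix, base = int(match.group(1)), match.group(2), match.group(3)
    if prefix and base != "i":
        # This sketch only models the 64-bit integer spellings Oi and LLi.
        raise ValueError(f"unsupported element type in {type_code!r}")
    element_bits = 64 if prefix else _ELEMENT_BITS[base]
    return count * element_bits


print(vector_bits("V4Oi"))  # four 64-bit elements
print(vector_bits("V16c"))  # sixteen 8-bit elements
```

Both Oi and LLi are treated as 64-bit elements here, which is why the two spellings are interchangeable as far as vector width is concerned.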

The places where LLi has crept in are:

  • 1 or 2 very isolated cases I'll list explicitly below
  • Some AVX-512 and SHA512 intrinsics. This seems like a mistake as they're also using it for "quad word" (64-bit) vector elements, the exact same element size that uses Oi elsewhere. And some actually do use Oi with the same feature (avx10.2-256), so I couldn't see any pattern that seemed to indicate a critical distinction...

I think the AVX-512 stuff was just added without realizing the historical use of Oi here?

The only examples outside of AVX-512 and SHA512 I could find:

  • __builtin_ia32_rdpru
  • __emul and __emulu (MS intrinsics)
  • __readfsqword and __readgsqword (also MS intrinsics)

I can understand that these don't make lots of sense in OpenCL, but they also seem very unlikely to show up or do the wrong thing here, when getting the same type that OpenCL uses for 64-bit vector elements on x86?

The reason I'd like to consolidate here is that preserving this distinction would add complexity to the tablegen code, and at least seems like it may be propagating more of a mistake than a careful choice in AVX-512 (especially given the lack of test coverage that errors with the change as is).

@chandlerc
Member Author

A long way from an expert on OpenCL, but it seems to not even have the concept of long long, and long is defined as a 64-bit type (and just optional for embedded stuff)?

https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/scalarDataTypes.html

@phoebewang
Contributor

A long way from an expert on OpenCL, but it seems to not even have the concept of long long, and long is defined as a 64-bit type (and just optional for embedded stuff)?

https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/scalarDataTypes.html

Thanks for the confirmation! Does OpenCL support Windows? IIRC, long is 32-bit on Windows.

@topperc
Collaborator

topperc commented Dec 22, 2024

When I was still involved in X86 my recollection was we primarily used LLi. It looks like there was a large replacement of LLi with Oi here fa8cd76.

@chandlerc
Member Author

A long way from an expert on OpenCL, but it seems to not even have the concept of long long, and long is defined as a 64-bit type (and just optional for embedded stuff)?

https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/scalarDataTypes.html

Thanks for the confirmation! Does OpenCL support Windows? IIRC, long is 32-bit on Windows.

Yeah, it's definitely different on Windows in C/C++. My impression is that OpenCL long is definitively 64-bit, and that's kind of why Oi exists -- to map that to long long outside of OpenCL where it would be ambiguous otherwise.
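
A minimal sketch of that mapping, assuming the rule stated in this thread (O means long in OpenCL and long long everywhere else). `spell_integer_type` is a hypothetical helper covering only a handful of integer codes; it is not Clang's actual decoder.

```python
def spell_integer_type(code: str, opencl: bool = False) -> str:
    """Spell out a small subset of integer type codes for a given language mode."""
    unsigned = code.startswith("U")
    body = code[1:] if unsigned else code

    if body == "i":
        spelled = "int"
    elif body == "Li":
        spelled = "long int"
    elif body == "LLi":
        spelled = "long long int"
    elif body == "Oi":
        # The language-dependent case: OpenCL's long is the 64-bit type,
        # so O picks long there and long long in other languages.
        spelled = "long int" if opencl else "long long int"
    else:
        raise ValueError(f"unsupported type code: {code!r}")

    return f"unsigned {spelled}" if unsigned else spelled


print(spell_integer_type("Oi"))               # C/C++ mode
print(spell_integer_type("Oi", opencl=True))  # OpenCL mode
print(spell_integer_type("LLi", opencl=True))
```

The last call shows why the distinction matters: LLi stays long long even in OpenCL mode, whereas Oi follows the language.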

@chandlerc
Member Author

When I was still involved in X86 my recollection was we primarily used LLi. It looks like there was a large replacement of LLi with Oi here fa8cd76.

Yeah, this patch makes me think the change to Oi here is ultimately correct, and focuses types on the correct 64-bit integer type in the different languages.

chandlerc added a commit to chandlerc/llvm-project that referenced this pull request Dec 24, 2024
This PR follows llvm#120831 (the PR contains both, only review the last
commit here as the other commit will be reviewed on the other PR).

Similar to that PR, this does a very mechanical port of X86 builtins to
TableGen. There is a *lot* of improvement available here to use TableGen
more effectively and collapse repeated structures. But those can now be
follow-up PRs that restructure *within* the `.td` file.

The current structure produces a file that exactly matches the original
X-macros except for the differences outlined in llvm#120831:

- Horizontal whitespace
- `long long` types now use `long long` outside of OpenCL, but switch to
  `long` in OpenCL (if relevant at all).

Otherwise, only the order of builtins changes, and no tests regress.
The goal is to make incremental (if small) progress towards fully
TableGen'ed builtins, and to unblock llvm#120534 by gaining access to more
powerful TableGen-based representations.

The bulk `.td` file addition was generated with the help of a very rough
Python script. That script made no attempt to be robust or reusable, it
specifically handled only the cases in the X86 `.def` file.

Four entries from the `.def` file were not handled automatically as they
used `BUILTIN` rather than `TARGET_BUILTIN`. These were ported by hand
to an empty-feature `TargetBuiltin` entry, which seems like a better
match.

For all the automatically ported entries, the results were compared by
sorting and diffing the `.def` file and the generated `.inc` file. The
only differences were:

- Different horizontal whitespace

- Additional entries that had already been ported to the `.td` file.

- Systematically using `Oi` instead of `LLi` for the type `long long
  int`. The `.def` file uses a mixture of `Oi` and `LLi`. I chose the
  shorter encoding.

This gives me high confidence in the correctness of the change.
chandlerc added a commit to chandlerc/llvm-project that referenced this pull request Jan 4, 2025
@chandlerc
Member Author

Ping, rebased to top-of-tree.

@phoebewang -- I think you're the most relevant reviewer here. If the O vs. LL thing is really a blocker despite the added information, I'd like to know so I can explore options to switch back. All of the ones I've come up with add complexity, so I'm hoping the current version is OK, but if it's a blocker, I'm happy to start discussing what would work here.

I think the latest question was around MSVC / Windows, where long in C and C++ is 32-bit, but as far as I can tell from my research in the standard, it is always 64 bits in OpenCL.

chandlerc added a commit to chandlerc/llvm-project that referenced this pull request Jan 4, 2025
@@ -108,9 +109,15 @@ class PrototypeParser {
     } else if (T.consume_back("&")) {
       ParseType(T);
       Type += "&";
     } else if (T.consume_front("long long")) {
Collaborator

It feels a bit surprising that "long long" wouldn't map to "LL". I get that we need to support OpenCL, but maybe we should have a keyword for OpenCL? Are all targets going to want "O" for "long long"?

Member Author

Not sure -- the vast majority of x86 builtins use O for this.

It's a no-op on targets without OpenCL support, but it seems harmless there.

Member Author

This does maybe point at something that doesn't add much complexity -- I can condition using O on a flag that only X86 builtins use so it doesn't impact any other targets? That should be quite simple, and then other targets can opt in if/when they wish?

Member Author

Ok, PR updated with an explicit opt-in for OpenCL long type support.

Somehow, I hadn't considered how easily this would address an unrelated part: the occurrence in intrinsic header builtins. That just fell out of this. Sorry for pushing back earlier, but all my ideas were much more complex than this ended up being.

With this tiny change to the .td file in the second commit here, the diff of things switching from LLi to Oi becomes very small and looks pretty compelling: https://gist.github.com/chandlerc/4395df8d838cd1a110ecc2170e67adc4
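
As a rough illustration of the opt-in described above (not the actual TableGen emitter code), the choice of encoding could be sketched in Python. The function and the flag name `use_opencl_long` are hypothetical. The `O` and `LL` prefixes follow Clang's builtin prototype encoding, where `O` means `long` in OpenCL and `long long` otherwise:

```python
def encode_long_long_int(use_opencl_long: bool, unsigned: bool = False) -> str:
    """Return the builtin prototype code for `long long int`.

    Hypothetical sketch: `use_opencl_long` stands in for the explicit
    per-target opt-in; without it, the plain `LL` encoding is kept.
    """
    sign = "U" if unsigned else ""
    width = "O" if use_opencl_long else "LL"
    return sign + width + "i"


# With the opt-in (as for X86 here) the type is encoded with `O`;
# targets that do not opt in keep `LL`.
x86_code = encode_long_long_int(use_opencl_long=True)
other_code = encode_long_long_int(use_opencl_long=False)
```

Keeping the switch explicit means targets that never opted in see no change in their generated prototypes.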

Contributor

How did you get this diff? I think it's useful for us to verify the correctness.

Member Author

Sure, I use the Fish shell and have a couple of command-line tools installed that help with this, `rg` and `sd`, both of which appear in the command below.

Combined, they let me write a diff command like:

diff -u (rg -I '^(TARGET|BUILTIN)' BuiltinsX86.def BuiltinsX86.inc | sd '^BUILTIN\((.*)\)' 'TARGET_BUILTIN($1, "")' | sd ' +' ' ' | sd ',([X"])' ', $1' | sort | psub)  (rg '^TARGET' dev/tools/clang/include/clang/Basic/BuiltinsX86.inc | sort | psub)

Here BuiltinsX86.def and BuiltinsX86.inc are copies of the .def and .inc from before this PR, and dev/tools/.../BuiltinsX86.inc is the .inc produced in my development build.

All the uses of sd on the original .def and .inc files are to process away weird whitespace artifacts, and the use of BUILTIN instead of TARGET_BUILTIN in four places.
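
For anyone without those tools, the same normalize-sort-diff idea can be sketched in plain Python. This is an approximation of the pipeline above, not the exact commands that were run. The function names and the sample entries in the usage note are made up for illustration:

```python
import difflib
import re


def normalize(lines):
    """Keep builtin entries and canonicalize them roughly as the pipeline does."""
    out = []
    for line in lines:
        line = line.strip()
        if not line.startswith(("TARGET_BUILTIN", "BUILTIN")):
            continue
        # Treat plain BUILTIN(...) entries as TARGET_BUILTIN with an empty feature string.
        line = re.sub(r'^BUILTIN\((.*)\)', r'TARGET_BUILTIN(\1, "")', line)
        # Collapse runs of spaces, then force a single space after commas.
        line = re.sub(r' +', ' ', line)
        line = re.sub(r',([X"])', r', \1', line)
        out.append(line)
    return sorted(out)


def diff_builtins(old_lines, new_lines):
    """Unified diff of the normalized, sorted entries from both inputs."""
    return list(difflib.unified_diff(
        normalize(old_lines), normalize(new_lines), "old", "new", lineterm=""))
```

Calling `diff_builtins` with the lines of the old `.def`/`.inc` files and the lines of the newly generated `.inc` should leave only the real differences, for example a hypothetical `__builtin_demo_b` entry whose prototype changed from `"V2LLi"` to `"V2Oi"`.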

Contributor

Thanks for the information!

X86 builtins.

This minimizes the delta from the non-TableGen and avoids unintended
consequences on other targets.
Contributor

@phoebewang phoebewang left a comment

LGTM.

@chandlerc
Member Author

Thanks, merging! I've put the script here for posterity: https://gist.github.com/chandlerc/de807ea073beac351f87c660e1d4b7a0

@chandlerc chandlerc merged commit 2529a8d into llvm:main Jan 4, 2025
8 checks passed
chandlerc added a commit to chandlerc/llvm-project that referenced this pull request Jan 4, 2025
This PR follows llvm#120831 (the PR contains both, only review the last
commit here as the other commit will be reviewed on the other PR).

Similar to that PR, this does a very mechanical port of X86 builtins to
TableGen. There is a *lot* of improvement available here to use TableGen
more effectively and collapse repeated structures. But those can now be
follow-up PRs that restructure *within* the `.td` file.

The current structure produces a file that exactly matches the original
X-macros except for the differences outlined in llvm#120831:

- Horizontal whitespace
- `long long` types now use `long long` outside of OpenCL, but switch to
  `long` in OpenCL for the core `__builtin_ia32_...` builtins.

Otherwise, only the order of builtins changes, and no tests regress.
@llvm-ci
Collaborator

llvm-ci commented Jan 4, 2025

LLVM Buildbot has detected a new failure on builder clang-debian-cpp20 running on clang-debian-cpp20 while building clang at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/108/builds/7722

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s' FAILED ********************
Exit Code: 134

Command Output (stderr):
--
RUN: at line 1: rm -rf /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp && mkdir -p /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp
+ rm -rf /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp
+ mkdir -p /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp
RUN: at line 2: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-mc -triple=riscv64 -position-independent -filetype=obj      -o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-mc -triple=riscv64 -position-independent -filetype=obj -o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s
RUN: at line 4: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec      -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096      -abs external_func=0x1 -abs external_data=0x2      -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096 -abs external_func=0x1 -abs external_data=0x2 -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o
RUN: at line 10: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-mc -triple=riscv64 -position-independent -filetype=obj      -mattr=+relax -o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-mc -triple=riscv64 -position-independent -filetype=obj -mattr=+relax -o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s
RUN: at line 12: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec      -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096      -abs external_func=0x1 -abs external_data=0x2      -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096 -abs external_func=0x1 -abs external_data=0x2 -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o
llvm-jitlink: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h:285: llvm::orc::SymbolStringPool::~SymbolStringPool(): Assertion `Pool.empty() && "Dangling references at pool destruction time"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096 -abs external_func=0x1 -abs external_data=0x2 -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o
 #0 0x0000597f47dc4b98 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0xeb5b98)
 #1 0x0000597f47dc268d llvm::sys::RunSignalHandlers() (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0xeb368d)
 #2 0x0000597f47dc5138 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007a5a62d4d510 (/lib/x86_64-linux-gnu/libc.so.6+0x3c510)
 #4 0x00007a5a62d9b0fc (/lib/x86_64-linux-gnu/libc.so.6+0x8a0fc)
 #5 0x00007a5a62d4d472 raise (/lib/x86_64-linux-gnu/libc.so.6+0x3c472)
 #6 0x00007a5a62d374b2 abort (/lib/x86_64-linux-gnu/libc.so.6+0x264b2)
 #7 0x00007a5a62d373d5 (/lib/x86_64-linux-gnu/libc.so.6+0x263d5)
 #8 0x00007a5a62d463a2 (/lib/x86_64-linux-gnu/libc.so.6+0x353a2)
 #9 0x0000597f4765df0c (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0x74ef0c)
#10 0x0000597f47ca7b37 llvm::orc::ExecutorProcessControl::~ExecutorProcessControl() (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0xd98b37)
#11 0x0000597f47ca927f llvm::orc::SelfExecutorProcessControl::~SelfExecutorProcessControl() (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0xd9a27f)
#12 0x0000597f47bdae88 llvm::orc::ExecutionSession::~ExecutionSession() (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0xccbe88)
#13 0x0000597f47636edd llvm::Session::~Session() (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0x727edd)
#14 0x0000597f47641664 main (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0x732664)
#15 0x00007a5a62d386ca (/lib/x86_64-linux-gnu/libc.so.6+0x276ca)
#16 0x00007a5a62d38785 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27785)
#17 0x0000597f4762f671 _start (/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink+0x720671)
/vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.script: line 5: 3606366 Aborted                 /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/llvm-jitlink -noexec -slab-allocate 100Kb -slab-address 0xfff00000 -slab-page-size 4096 -abs external_func=0x1 -abs external_data=0x2 -check /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/ExecutionEngine/JITLink/RISCV/ELF_riscv64_got_plt_reloc.s /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/test/ExecutionEngine/JITLink/RISCV/Output/ELF_riscv64_got_plt_reloc.s.tmp/elf_riscv64_got_plt_reloc.o

--

********************


@chandlerc
Member Author

This is an LLVM failure and so I can't see how it relates. Likely a flaky crash of a tool given the log.

chandlerc added a commit that referenced this pull request Jan 5, 2025
This PR follows #120831 for
x86-64.

Similar to that PR, this does a very mechanical port of X86 builtins to
TableGen. There is a *lot* of improvement available here to use TableGen
more effectively and collapse repeated structures. But those can now be
follow-up PRs that restructure *within* the `.td` file.

The current structure produces a file that exactly matches the original
X-macros except for the differences outlined in
#120831:

- Horizontal whitespace
- `long long` types now use `long long` outside of OpenCL, but switch to
  `long` in OpenCL where relevant.

Otherwise, only the order of builtins changes, and no tests regress.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
This PR follows llvm/llvm-project#120831 for
x86-64.

Similar to that PR, this does a very mechanical port of X86 builtins to
TableGen. There is a *lot* of improvement available here to use TableGen
more effectively and collapse repeated structures. But those can now be
follow-up PRs that restructure *within* the `.td` file.

The current structure produces a file that exactly matches the original
X-macros except for the differences outlined in
llvm/llvm-project#120831:

- Horizontal whitespace
- `long long` types now use `long long` outside of OpenCL, but switch to
  `long` in OpenCL where relevant.

Otherwise, only the order of builtins changes, and no tests regress.