added a script to update llvm-mc test file #107246
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-amdgpu
Author: Brox Chen (broxigarchen)
Changes
Added a script to update the test files generated by the llvm-mc binary. The script parses the test assembly and disassembly line by line and outputs check lines the same way as update_llc_test_checks. The script currently accepts .s and .txt files for asm and dasm. It assumes the test is always line-by-line and propagates the output correspondingly.
Full diff: https://github.com/llvm/llvm-project/pull/107246.diff
7 Files Affected:
diff --git a/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s
new file mode 100644
index 00000000000000..b21935e1d1a3ab
--- /dev/null
+++ b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s
@@ -0,0 +1,3 @@
+// RUN: llvm-mc -triple=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefixes=CHECK %s
+
+v_bfrev_b32 v5, v1
diff --git a/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s.expected b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s.expected
new file mode 100644
index 00000000000000..d29e1fc121e852
--- /dev/null
+++ b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_asm.s.expected
@@ -0,0 +1,5 @@
+; NOTE: Assertions have been autogenerated by utils/update_mc_test_check.py UTC_ARGS: --version 5
+// RUN: llvm-mc -triple=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefixes=CHECK %s
+
+// CHECK: v_bfrev_b32_e32 v5, v1 ; encoding: [0x01,0x71,0x0a,0x7e]
+v_bfrev_b32 v5, v1
diff --git a/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt
new file mode 100644
index 00000000000000..9f5fba6e50df25
--- /dev/null
+++ b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt
@@ -0,0 +1,5 @@
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -disassemble -show-encoding %s 2>&1 | FileCheck -check-prefixes=CHECK %s
+
+0x00,0x00,0x00,0x7e
+
+0xfd,0xb8,0x0a,0x7f
diff --git a/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt.expected b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt.expected
new file mode 100644
index 00000000000000..896d5beb12d575
--- /dev/null
+++ b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/amdgpu_dasm.txt.expected
@@ -0,0 +1,8 @@
+; NOTE: Assertions have been autogenerated by utils/update_mc_test_check.py UTC_ARGS: --version 5
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -disassemble -show-encoding %s 2>&1 | FileCheck -check-prefixes=CHECK %s
+
+# CHECK: v_nop ; encoding: [0x00,0x00,0x00,0x7e]
+0x00,0x00,0x00,0x7e
+
+# COM: CHECK: warning: invalid instruction encoding
+0xfd,0xb8,0x0a,0x7f
diff --git a/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/amdgpu-basic.test b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/amdgpu-basic.test
new file mode 100644
index 00000000000000..a74e0ae4e76f95
--- /dev/null
+++ b/llvm/test/tools/UpdateTestChecks/update_mc_test_checks/amdgpu-basic.test
@@ -0,0 +1,7 @@
+# REQUIRES: amdgpu-registered-target
+## Check that basic asm/dasm process is correct
+
+# RUN: cp -f %S/Inputs/amdgpu_asm.s %t.s && %update_mc_test_checks %t.s
+# RUN: diff -u %S/Inputs/amdgpu_asm.s.expected %t.s
+# RUN: cp -f %S/Inputs/amdgpu_dasm.txt %t.txt && %update_mc_test_checks %t.txt
+# RUN: diff -u %S/Inputs/amdgpu_dasm.txt.expected %t.txt
diff --git a/llvm/utils/UpdateTestChecks/common.py b/llvm/utils/UpdateTestChecks/common.py
index 9b9be69ee38448..b861bd010e2b25 100644
--- a/llvm/utils/UpdateTestChecks/common.py
+++ b/llvm/utils/UpdateTestChecks/common.py
@@ -573,7 +573,7 @@ def invoke_tool(exe, cmd_args, ir, preprocess_cmd=None, verbose=False):
IR_FUNCTION_RE = re.compile(r'^\s*define\s+(?:internal\s+)?[^@]*@"?([\w.$-]+)"?\s*\(')
TRIPLE_IR_RE = re.compile(r'^\s*target\s+triple\s*=\s*"([^"]+)"$')
-TRIPLE_ARG_RE = re.compile(r"-mtriple[= ]([^ ]+)")
+TRIPLE_ARG_RE = re.compile(r"-m?triple[= ]([^ ]+)")
MARCH_ARG_RE = re.compile(r"-march[= ]([^ ]+)")
DEBUG_ONLY_ARG_RE = re.compile(r"-debug-only[= ]([^ ]+)")
diff --git a/llvm/utils/update_mc_test_check.py b/llvm/utils/update_mc_test_check.py
new file mode 100755
index 00000000000000..ccaee25b3fa6ad
--- /dev/null
+++ b/llvm/utils/update_mc_test_check.py
@@ -0,0 +1,330 @@
+#!/usr/bin/env python3
+"""
+A test update script. This script is a utility to update LLVM 'llvm-mc' based test cases with new FileCheck patterns.
+"""
+
+from __future__ import print_function
+
+import argparse
+import os # Used to advertise this file's name ("autogenerated_note").
+
+from UpdateTestChecks import common
+
+import subprocess
+import re
+
+mc_LIKE_TOOLS = [
+ "llvm-mc",
+]
+
+ERROR_RE = re.compile(r"(warning|error): .*")
+ERROR_CHECK_RE = re.compile(r"# COM: .*")
+OUTPUT_SKIPPED_RE = re.compile(r"(.text)")
+COMMENT = {
+ "asm" : "//",
+ "dasm" : "#"
+ }
+
+
+def invoke_tool(exe, cmd_args, testline, verbose=False):
+ if isinstance(cmd_args, list):
+ args = [applySubstitutions(a, substitutions) for a in cmd_args]
+ else:
+ args = cmd_args
+
+ cmd = "echo \"" + testline + "\" | " + exe + " " + args
+ if verbose:
+ print("Command: ", cmd)
+ out = subprocess.check_output(cmd, shell=True)
+ # Fix line endings to unix CR style.
+ return out.decode().replace("\r\n", "\n")
+
+
+# create tests line-by-line, here we just filter out the check lines and comments
+# and treat all others as tests
+def isTestLine(input_line, mc_mode):
+ # Skip comment lines
+ if input_line.strip(' \t\r').startswith(COMMENT[mc_mode]):
+ return False
+ elif input_line.strip(' \t\r') == '':
+ return False
+ # skip any CHECK lines.
+ elif common.CHECK_RE.match(input_line):
+ return False
+ return True
+
+def hasErr(err):
+ if err is None or len(err) == 0:
+ return False
+ if ERROR_RE.search(err):
+ return True
+ return False
+
+def getErrString(err):
+ if err is None or len(err) == 0:
+ return ""
+
+ lines = err.split('\n')
+ # take the first match
+ for line in lines:
+ s = ERROR_RE.search(line)
+ if s:
+ return s.group(0)
+ return ""
+
+def getOutputString(out):
+ if out is None or len(out) == 0:
+ return ""
+ lines = out.split('\n')
+ output = ""
+
+ for line in lines:
+ if OUTPUT_SKIPPED_RE.search(line):
+ continue
+ if line.strip('\t ') == '':
+ continue
+ output += line.lstrip('\t ')
+ return output
+
+def should_add_line_to_output(input_line, prefix_set, mc_mode):
+ # special check line
+ if mc_mode == 'dasm' and ERROR_CHECK_RE.search(input_line):
+ return False
+ else:
+ return common.should_add_line_to_output(input_line, prefix_set, comment_marker=COMMENT[mc_mode])
+
+
+def getStdCheckLine(prefix, output, mc_mode):
+ lines = output.split('\n')
+ output = ""
+ for line in lines:
+ output += COMMENT[mc_mode] + ' ' + prefix + ": " + line + '\n'
+ return output
+
+def getErrCheckLine(prefix, output, mc_mode):
+ if mc_mode == 'asm':
+ return COMMENT[mc_mode] + ' ' + prefix + ": " + output + '\n'
+ elif mc_mode == 'dasm':
+ return COMMENT[mc_mode] + ' COM: ' + prefix + ": " + output + '\n'
+
+def main():
+ parser = argparse.ArgumentParser(description=__doc__)
+ parser.add_argument(
+ "--mc-binary",
+ default=None,
+ help='The "mc" binary to use to generate the test case',
+ )
+ parser.add_argument(
+ "--tool",
+ default=None,
+ help="Treat the given tool name as an mc-like tool for which check lines should be generated",
+ )
+ parser.add_argument(
+ "--default-march",
+ default=None,
+ help="Set a default -march for when neither triple nor arch are found in a RUN line",
+ )
+ parser.add_argument("tests", nargs="+")
+ initial_args = common.parse_commandline_args(parser)
+
+ script_name = os.path.basename(__file__)
+
+ for ti in common.itertests(
+ initial_args.tests, parser, script_name="utils/" + script_name
+ ):
+ if ti.path.endswith('.s'):
+ mc_mode = "asm"
+ elif ti.path.endswith('.txt'):
+ mc_mode = "dasm"
+ else:
+ common.warn("Expected .s and .txt, Skipping file : ", ti.path)
+ continue
+
+ triple_in_ir = None
+ for l in ti.input_lines:
+ m = common.TRIPLE_IR_RE.match(l)
+ if m:
+ triple_in_ir = m.groups()[0]
+ break
+
+ run_list = []
+ for l in ti.run_lines:
+ if "|" not in l:
+ common.warn("Skipping unparsable RUN line: " + l)
+ continue
+
+ commands = [cmd.strip() for cmd in l.split("|")]
+ assert len(commands) >= 2
+ mc_cmd = " | ".join(commands[:-1])
+ filecheck_cmd = commands[-1]
+ mc_tool = mc_cmd.split(" ")[0]
+
+ triple_in_cmd = None
+ m = common.TRIPLE_ARG_RE.search(mc_cmd)
+ if m:
+ triple_in_cmd = m.groups()[0]
+
+ march_in_cmd = ti.args.default_march
+ m = common.MARCH_ARG_RE.search(mc_cmd)
+ if m:
+ march_in_cmd = m.groups()[0]
+
+ common.verify_filecheck_prefixes(filecheck_cmd)
+
+ mc_like_tools = mc_LIKE_TOOLS[:]
+ if ti.args.tool:
+ mc_like_tools.append(ti.args.tool)
+ if mc_tool not in mc_like_tools:
+ common.warn("Skipping non-mc RUN line: " + l)
+ continue
+
+ if not filecheck_cmd.startswith("FileCheck "):
+ common.warn("Skipping non-FileChecked RUN line: " + l)
+ continue
+
+ mc_cmd_args = mc_cmd[len(mc_tool) :].strip()
+ mc_cmd_args = mc_cmd_args.replace("< %s", "").replace("%s", "").strip()
+ check_prefixes = common.get_check_prefixes(filecheck_cmd)
+
+ run_list.append(
+ (
+ check_prefixes,
+ mc_tool,
+ mc_cmd_args,
+ triple_in_cmd,
+ march_in_cmd,
+ )
+ )
+
+
+ # find all test line from input
+ testlines = [l for l in ti.input_lines if isTestLine(l, mc_mode)]
+ run_list_size = len(run_list)
+ testnum = len(testlines)
+
+ raw_output = []
+ raw_prefixes = []
+ for (
+ prefixes,
+ mc_tool,
+ mc_args,
+ triple_in_cmd,
+ march_in_cmd,
+ ) in run_list:
+ common.debug("Extracted mc cmd:", mc_tool, mc_args)
+ common.debug("Extracted FileCheck prefixes:", str(prefixes))
+ common.debug("Extracted triple :", str(triple_in_cmd))
+ common.debug("Extracted march:", str(march_in_cmd))
+
+ triple = triple_in_cmd or triple_in_ir
+ if not triple:
+ triple = common.get_triple_from_march(march_in_cmd)
+
+ raw_output.append([])
+ for line in testlines:
+ # get output for each testline
+ out = invoke_tool(
+ ti.args.mc_binary or mc_tool,
+ mc_args,
+ line,
+ verbose=ti.args.verbose,
+ )
+ raw_output[-1].append(out)
+
+ common.debug("Collect raw tool lines:", str(len(raw_output[-1])))
+
+ raw_prefixes.append(prefixes)
+
+ output_lines = []
+ generated_prefixes = []
+ used_prefixes = set()
+ prefix_set = set([prefix for p in run_list for prefix in p[0]])
+ common.debug("Rewriting FileCheck prefixes:", str(prefix_set))
+
+ for test_id in range(testnum):
+ input_line = testlines[test_id]
+
+ # a {prefix : output, [runid] } dict
+ # insert output to a prefix-key dict, and do a max sorting
+ # to select the most-used prefix which share the same output string
+ p_dict = {}
+ for run_id in range(run_list_size):
+ out = raw_output[run_id][test_id]
+
+ if hasErr(out):
+ o = getErrString(out)
+ else:
+ o = getOutputString(out)
+
+ prefixes = raw_prefixes[run_id]
+
+ for p in prefixes:
+ if p not in p_dict:
+ p_dict[p] = o, [run_id]
+ else:
+ if p_dict[p] == (None, []):
+ continue
+
+ prev_o, run_ids = p_dict[p]
+ if o == prev_o:
+ run_ids.append(run_id)
+ p_dict[p] = o, run_ids
+ else:
+ # conflict, discard
+ p_dict[p] = None, []
+
+ p_dict_sorted = dict(sorted(p_dict.items(), key=lambda item: -len(item[1][1])))
+
+ # prefix is selected and generated with most shared output lines
+ # each run_id can only be used once
+ gen_prefix = ""
+ used_runid = set()
+ for prefix, tup in p_dict_sorted.items():
+ o, run_ids = tup
+
+ if len(run_ids) == 0:
+ continue
+
+ skip = False
+ for i in run_ids:
+ if i in used_runid:
+ skip = True
+ else:
+ used_runid.add(i)
+ if not skip:
+ used_prefixes.add(prefix)
+
+ if hasErr(o):
+ gen_prefix += getErrCheckLine(prefix, o, mc_mode)
+ else:
+ gen_prefix += getStdCheckLine(prefix, o, mc_mode)
+
+ generated_prefixes.append(gen_prefix.rstrip('\n'))
+
+ # write output
+ prefix_id = 0
+ for input_info in ti.iterlines(output_lines):
+ input_line = input_info.line
+ if isTestLine(input_line, mc_mode):
+ output_lines.append(generated_prefixes[prefix_id])
+ output_lines.append(input_line)
+ prefix_id += 1
+
+ elif should_add_line_to_output(input_line, prefix_set, mc_mode):
+ output_lines.append(input_line)
+
+ elif input_line in ti.run_lines or input_line == "":
+ output_lines.append(input_line)
+
+ if ti.args.gen_unused_prefix_body:
+ output_lines.extend(
+ ti.get_checks_for_unused_prefixes(run_list, used_prefixes)
+ )
+
+ common.debug("Writing %d lines to %s..." % (len(output_lines), ti.path))
+ with open(ti.path, "wb") as f:
+ f.writelines(["{}\n".format(l).encode("utf-8") for l in output_lines])
+
+
+if __name__ == "__main__":
+ main()
Hi reviewers, I wrote this script while working on AMDGPU development. I am not sure how useful it is to others, but I am posting it for review to collect feedback.
✅ With the latest revision this PR passed the Python code formatter.
Nice idea.
Does this work for combined asm/disasm tests, e.g., gfx12_asm_vop1.s? Does this work for other backends' asm/disasm tests?
This looks really useful! I wonder if we could also make it work for error message checking?
E.g., generate check lines like the following
CHECK-ERR: [[#@LINE]]:<col>: error: ...
if the RUN: line contains something like not llvm-mc .... 2>&1 | FileCheck?
I tried gfx12_asm_vop1.s, but seems the script does not understand the Regarding to the other backend, I did a quick look, and it seems some tests are using -- I took a look and it seems it required some parsing on the lit.local.cfg file. The test infra seems not have something for this yet. I guess for now we can just try to replace the |
I was looking at it. It's currently not supporting since the |
Added |
LVGTM. Please also wait for Alexander's approval.
// RUN: not llvm-mc -triple=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefixes=CHECK %s

v_bfrev_b32 v5, v299
// CHECK: error: register index is out of range
Daydreaming: I guess this could even do the [[@LINE-1]] thing?
I think this can work. Let me try it
added
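For readers following the [[@LINE-1]] suggestion, here is a minimal sketch of how a diagnostic could be turned into a line-relative check placed directly after the failing instruction. It is not the code that landed in this PR: the helper name `err_check_line`, the `DIAG_RE` pattern, and the sample diagnostic (including its column number) are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: pull the column, severity and message out of a
# diagnostic shaped like "<file>:<line>:<col>: error: <message>".
DIAG_RE = re.compile(
    r":(?P<line>\d+):(?P<col>\d+): (?P<kind>warning|error): (?P<msg>.*)"
)


def err_check_line(diagnostic, comment="//", prefix="CHECK", offset=-1):
    """Build a check line that refers to the test line relative to itself.

    offset=-1 means the check sits on the line right after the instruction,
    so FileCheck's @LINE-1 resolves to the instruction's own line number.
    """
    m = DIAG_RE.search(diagnostic)
    if m is None:
        return None
    return (
        f"{comment} {prefix}: :[[@LINE{offset:+d}]]:{m['col']}: "
        f"{m['kind']}: {m['msg']}"
    )


# Illustrative input only; the column number is made up.
print(err_check_line("<stdin>:1:17: error: register index is out of range"))
```

The absolute line number from the tool output is deliberately discarded: a relative reference stays valid when lines are added or removed elsewhere in the test file.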
After some research on the Windows platform, I realized that all lit tests for the update scripts are disabled on Windows (either the binary check fails or the platform is not supported). On Windows, llc is named llc.exe, so the llc binary check os.path.isfile(llc) returns false and the tests do not run. I disabled the Windows test for this script as well.
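As an aside on the llc.exe problem described above, one possible way to make such a binary check platform-neutral is `shutil.which` from the Python standard library, which on Windows also tries the executable extensions listed in PATHEXT. This is only a sketch of the idea, not what the lit configuration does; the helper name `find_tool` is hypothetical.

```python
import shutil


def find_tool(name, search_dir=None):
    """Return the full path of an executable called `name`, or None.

    Unlike os.path.isfile(name), shutil.which() checks that the file is
    executable and, on Windows, also tries suffixes such as ".exe".
    `search_dir` restricts the lookup to one directory (for example a
    build tree's bin directory); by default PATH is searched.
    """
    return shutil.which(name, path=search_dir)


if __name__ == "__main__":
    # Prints a path when llc is on PATH, otherwise None.
    print(find_tool("llc"))
```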
A few comments inline - looking forward to being able to use this script.
Quick ping! This PR should be ready to go in.
Thanks, this LGTM.
A few minor simplification suggestions if you think this makes it better.
simplify the code Co-authored-by: Alexander Richardson <[email protected]>
Added a script to update the test files generated by the llvm-mc binary. The script accepts .s and .txt files for asm and dasm.
For the MC tests I am targeting, there is no function name that can be used as a key, and thus no clear mapping between input and output. The script assumes the tests are always line-by-line and updates the check lines for each test line by line.
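The line-by-line assumption can be illustrated with a small, self-contained sketch. It does not invoke llvm-mc: the per-line tool outputs are passed in as a list, and the helper names `is_test_line` and `interleave_checks` are hypothetical, chosen to mirror the idea rather than the script's actual functions.

```python
def is_test_line(line, comment):
    """A test line is anything that is neither blank nor a comment.

    RUN and CHECK lines start with the comment marker, so they are
    excluded automatically.
    """
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(comment)


def interleave_checks(lines, outputs, comment="//", prefix="CHECK"):
    """Pair the i-th test line with the i-th output, purely by position."""
    stale = f"{comment} {prefix}:"
    # Drop previously generated check lines so a rerun does not duplicate them.
    kept = [l for l in lines if not l.strip().startswith(stale)]
    tests = [l for l in kept if is_test_line(l, comment)]
    if len(tests) != len(outputs):
        raise ValueError("expected exactly one output per test line")

    pending = iter(outputs)
    result = []
    for line in kept:
        if is_test_line(line, comment):
            result.append(f"{stale} {next(pending)}")
        result.append(line)
    return result


source = [
    "// RUN: llvm-mc -triple=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefixes=CHECK %s",
    "",
    "v_bfrev_b32 v5, v1",
]
outputs = ["v_bfrev_b32_e32 v5, v1 ; encoding: [0x01,0x71,0x0a,0x7e]"]
print("\n".join(interleave_checks(source, outputs)))
```

Because there is no key such as a function name, position is the only link between an input line and its output, which is why the sketch refuses to proceed when the two counts differ.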