[lldb] Support custom LLVM formatting for variables #81196

Merged
9 changes: 9 additions & 0 deletions lldb/docs/use/variable.rst
@@ -460,6 +460,15 @@ summary strings, regardless of the format they have applied to their types. To
do that, you can use %format inside an expression path, as in ${var.x->x%u},
which would display the value of x as an unsigned integer.

Additionally, custom output can be achieved by using an LLVM format string,
commencing with the ``:`` marker. To illustrate, compare ``${var.byte%x}`` and
``${var.byte:x-}``. The former uses lldb's builtin hex formatting (``x``),
which unconditionally inserts a ``0x`` prefix, and also zero pads the value to
match the size of the type. The latter uses ``llvm::formatv`` formatting
(``:x-``), and will print only the hex value, with no ``0x`` prefix, and no
padding. This raw control is useful when composing multiple pieces into a
larger whole.
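
As a rough illustration of the difference described above, the two renderings can be emulated with `snprintf`. This is a sketch under stated assumptions, not lldb's or LLVM's implementation: the function names are hypothetical, and the only behavior modeled is what the paragraph states (a `0x` prefix with padding to the type's size for the builtin format, and bare hex digits with an optional minimum width for `x-`).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: emulates the documented behavior of lldb's builtin
// `%x` for a one-byte value, i.e. a "0x" prefix plus zero padding to the
// size of the type (two hex digits for uint8_t).
std::string builtin_style_hex(std::uint8_t value) {
  char buf[8];
  std::snprintf(buf, sizeof buf, "0x%02x", static_cast<unsigned>(value));
  return buf;
}

// Hypothetical helper: emulates the `x-` style, i.e. lowercase hex with no
// prefix, zero padded to at least `min_digits` digits (0 means no padding),
// as in `x-2`.
std::string raw_style_hex(std::uint64_t value, int min_digits = 0) {
  assert(min_digits >= 0 && min_digits <= 16);
  char buf[32];
  std::snprintf(buf, sizeof buf, "%0*llx", min_digits,
                static_cast<unsigned long long>(value));
  return buf;
}
```

For a byte holding `0x01`, `builtin_style_hex` yields `0x01` while `raw_style_hex` yields `1`. With the bytes used in this PR's test, `raw_style_hex(0x30, 2) + raw_style_hex(0x01, 2)` gives `3001`, which is the composed output the `${var.ubyte:x-2}${var.sbyte:x-2}` summary is expected to produce.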

You can also use some other special format markers, not available for formats
themselves, but which carry a special meaning when used in this context:

70 changes: 60 additions & 10 deletions lldb/source/Core/FormatEntity.cpp
@@ -57,6 +57,7 @@
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/Regex.h"
#include "llvm/TargetParser/Triple.h"

#include <cctype>
@@ -658,6 +659,38 @@ static char ConvertValueObjectStyleToChar(
return '\0';
}

static llvm::Regex LLVMFormatPattern{"x[-+]?\\d*|n|d", llvm::Regex::IgnoreCase};

static bool DumpValueWithLLVMFormat(Stream &s, llvm::StringRef options,
ValueObject &valobj) {
std::string formatted;
std::string llvm_format = ("{0:" + options + "}").str();
Collaborator

Is there any way we can make this string static, by switching over the supported options?
Or, to ask another way: what happens if options contains "}{1}"? Is this well-defined in llvm::formatv, because it knows the template arguments, and thus will not lead to corruption and crashes?
If the answer is yes, then this is okay.

Contributor Author

> by switching over the supported options?

There are many supported options, and they vary from type to type (int has different options than, say, strings). See FormatProviders.h for the builtins.

It would be possible to exhaustively iterate all the options llvm includes, but I'm not sure it would be worth it. Switching over them seems fragile to changes made to the source of truth in llvm.

I will check/test what llvm does for invalid options.


// Options supported by format_provider<T> for integral arithmetic types.
// See table in FormatProviders.h.

auto type_info = valobj.GetTypeInfo();
if (type_info & eTypeIsInteger && LLVMFormatPattern.match(options)) {
if (type_info & eTypeIsSigned) {
bool success = false;
int64_t integer = valobj.GetValueAsSigned(0, &success);
if (success)
formatted = llvm::formatv(llvm_format.data(), integer);
} else {
bool success = false;
uint64_t integer = valobj.GetValueAsUnsigned(0, &success);
if (success)
formatted = llvm::formatv(llvm_format.data(), integer);
}
}

if (formatted.empty())
return false;

s.Write(formatted.data(), formatted.size());
return true;
}
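
To make the review question about `"}{1}"` concrete, here is a minimal standalone sketch. It reuses the pattern text from `LLVMFormatPattern` but compiles it with `std::regex` rather than `llvm::Regex` (an assumption made so the example needs no LLVM headers), and the helper names are hypothetical. `"}{1}"` on its own is rejected because it contains none of `x`, `n`, or `d`, but an unanchored search still accepts input such as `"x-2}{1}"`. Requiring the whole string to match closes that gap before the options are spliced into the template.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same pattern text as LLVMFormatPattern in the diff, compiled with
// std::regex (an assumption; the PR itself uses llvm::Regex).
static const std::regex kIntegerStyle{"x[-+]?\\d*|n|d", std::regex::icase};

// Unanchored search: true if any substring of `options` matches.
bool matches_somewhere(const std::string &options) {
  return std::regex_search(options, kIntegerStyle);
}

// Whole-string match: true only if all of `options` is a valid style.
bool matches_entirely(const std::string &options) {
  return std::regex_match(options, kIntegerStyle);
}

// Hypothetical helper: builds the "{0:<options>}" template. The caller is
// expected to have validated `options` with matches_entirely() first.
std::string make_template(const std::string &options) {
  assert(matches_entirely(options) && "validate options before splicing");
  return "{0:" + options + "}";
}
```

If `llvm::Regex::match` behaves like a search, which I believe it does, a similar effect could be had there by anchoring the pattern, for example `^(x[-+]?\\d*|n|d)$`.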

static bool DumpValue(Stream &s, const SymbolContext *sc,
const ExecutionContext *exe_ctx,
const FormatEntity::Entry &entry, ValueObject *valobj) {
@@ -728,9 +761,12 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
return RunScriptFormatKeyword(s, sc, exe_ctx, valobj, entry.string.c_str());
}

-llvm::StringRef subpath(entry.string);
+auto split = llvm::StringRef(entry.string).split(':');
+auto subpath = split.first;
+auto llvm_format = split.second;

// simplest case ${var}, just print valobj's value
-if (entry.string.empty()) {
+if (subpath.empty()) {
if (entry.printf_format.empty() && entry.fmt == eFormatDefault &&
entry.number == ValueObject::eValueObjectRepresentationStyleValue)
was_plain_var = true;
@@ -739,22 +775,19 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
target = valobj;
} else // this is ${var.something} or multiple .something nested
{
-if (entry.string[0] == '[')
+if (subpath[0] == '[')
was_var_indexed = true;
ScanBracketedRange(subpath, close_bracket_index,
var_name_final_if_array_range, index_lower,
index_higher);

Status error;

-const std::string &expr_path = entry.string;
-
-LLDB_LOGF(log, "[Debugger::FormatPrompt] symbol to expand: %s",
-expr_path.c_str());
+LLDB_LOG(log, "[Debugger::FormatPrompt] symbol to expand: {0}", subpath);

target =
valobj
-->GetValueForExpressionPath(expr_path.c_str(), &reason_to_stop,
+->GetValueForExpressionPath(subpath, &reason_to_stop,
&final_value_type, options, &what_next)
.get();

@@ -883,8 +916,18 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
}

if (!is_array_range) {
-LLDB_LOGF(log,
-"[Debugger::FormatPrompt] dumping ordinary printable output");
+if (!llvm_format.empty()) {
+if (DumpValueWithLLVMFormat(s, llvm_format, *target)) {
+LLDB_LOGF(log, "dumping using llvm format");
+return true;
+} else {
+LLDB_LOG(
+log,
+"empty output using llvm format '{0}' - with type info flags {1}",
+entry.printf_format, target->GetTypeInfo());
+}
+}
+LLDB_LOGF(log, "dumping ordinary printable output");
return target->DumpPrintableRepresentation(s, val_obj_display,
custom_format);
} else {
@@ -2227,6 +2270,13 @@ static Status ParseInternal(llvm::StringRef &format, Entry &parent_entry,
if (error.Fail())
return error;

auto [_, llvm_format] = llvm::StringRef(entry.string).split(':');
if (!llvm_format.empty() && !LLVMFormatPattern.match(llvm_format)) {
error.SetErrorStringWithFormat("invalid llvm format: '%s'",
llvm_format.data());
return error;
}

if (verify_is_thread_id) {
if (entry.type != Entry::Type::ThreadID &&
entry.type != Entry::Type::ThreadProtocolID) {
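
The split at `:` used in `DumpValue` and `ParseInternal` can be sketched standalone. The helper below is hypothetical and uses `std::string_view` in place of `llvm::StringRef`. It mirrors the behavior of `StringRef::split`, which cuts at the first separator and returns an empty second half when the separator is absent. Since an entry such as `${var.x}` carries no `:`, its format half is empty, so a parse-time format check is only meaningful when that half is non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper mirroring llvm::StringRef::split(':'): splits at the
// first ':' and returns {path, format}. Without a ':' the whole input is the
// path and the format is empty.
std::pair<std::string_view, std::string_view>
split_path_and_format(std::string_view entry) {
  const std::size_t colon = entry.find(':');
  if (colon == std::string_view::npos)
    return {entry, std::string_view{}};
  return {entry.substr(0, colon), entry.substr(colon + 1)};
}

// Usage example covering the shapes an entry can take.
void split_examples() {
  assert(split_path_and_format("ubyte:x-2").first == "ubyte");
  assert(split_path_and_format("ubyte:x-2").second == "x-2");
  assert(split_path_and_format("ubyte").second.empty());  // no ':' at all
  assert(split_path_and_format("ubyte:").second.empty()); // nothing after ':'
}
```

Only the first `:` is treated as the separator, so anything after it, including further colons, is handed to the format validation as a single string.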
@@ -0,0 +1,2 @@
C_SOURCES := main.c
include Makefile.rules
@@ -0,0 +1,20 @@
import lldb
from lldbsuite.test.lldbtest import *
import lldbsuite.test.lldbutil as lldbutil


class TestCase(TestBase):
    def test_raw_bytes(self):
        self.build()
        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
        self.runCmd("type summary add -s '${var.ubyte:x-2}${var.sbyte:x-2}!' Bytes")
        self.expect("v bytes", substrs=[" = 3001!"])

    def test_bad_format(self):
        self.build()
        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
        self.expect(
            "type summary add -s '${var.ubyte:y}!' Bytes",
            error=True,
            substrs=["invalid llvm format"],
        )
@@ -0,0 +1,13 @@
#include <stdint.h>
#include <stdio.h>

struct Bytes {
  uint8_t ubyte;
  int8_t sbyte;
};

int main() {
  struct Bytes bytes = {0x30, 0x01};
  (void)bytes;
  printf("break here\n");
}