Skip to content

[llvm-readobj][ELF] Implement JSON output for --dynamic-table #95976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

feg208
Copy link
Contributor

@feg208 feg208 commented Jun 18, 2024

When printing JSON output with --dynamic-table I noticed that the output is invalid JSON. This patch overrides the printDynamicTable() function in the JSONELFDumper to return a list of dictionaries for the DynamicSection value.
Before the output was:

 {
    "FileSummary": {
      "File": "bin/llvm-readelf",
      "Format": "elf64-x86-64",
      "Arch": "x86_64",
      "AddressSize": "64bit",
      "LoadName": "<Not found>"
    }DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]

Now the output looks like:

 "DynamicSection": [
      {
        "Tag": 29,
        "Type": "RUNPATH",
        "Value": 6322,
        "Path": [
          "$ORIGIN/../lib"
        ]
      },
      {
        "Tag": 1,
        "Type": "NEEDED",
        "Value": 6109,
        "Library": "libm.so.6"
      },

@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Fred Grim (feg208)

Changes

When printing JSON output with --dynamic-table I noticed that the output is invalid JSON. This patch overrides the printDynamicTable() function in the JSONELFDumper to return a list of dictionaries for the DynamicSection value.


Full diff: https://github.com/llvm/llvm-project/pull/95976.diff

2 Files Affected:

  • (modified) llvm/test/tools/llvm-readobj/ELF/dynamic-tags.test (+313)
  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+17)
diff --git a/llvm/test/tools/llvm-readobj/ELF/dynamic-tags.test b/llvm/test/tools/llvm-readobj/ELF/dynamic-tags.test
index 5610ed872df5c..05ba9863bb717 100644
--- a/llvm/test/tools/llvm-readobj/ELF/dynamic-tags.test
+++ b/llvm/test/tools/llvm-readobj/ELF/dynamic-tags.test
@@ -8,6 +8,9 @@
 # RUN: llvm-readelf --dynamic-table %t1 \
 # RUN:   | FileCheck %s --strict-whitespace --match-full-lines --check-prefix=GNU64
 # RUN: llvm-readelf -d %t1 | FileCheck %s --check-prefix=GNU64
+# RUN: llvm-readelf --dynamic-table --pretty-print --elf-output-style=JSON %t1 \
+# RUN:   | FileCheck %s --check-prefix=JSON64
+# RUN: llvm-readelf -d --pretty-print --elf-output-style=JSON %t1 | FileCheck %s --check-prefix=JSON64
 
 #      LLVM64:DynamicSection [ (61 entries)
 # LLVM64-NEXT:  Tag                Type                 Name/Value
@@ -138,6 +141,316 @@
 # GNU64-NEXT:  0x0000000076543210 (<unknown:>0x76543210) 0x5555666677778888
 # GNU64-NEXT:  0x0000000000000000 (NULL)                 0x0
 
+#      JSON64:"DynamicSection": [
+# JSON64-NEXT:  {
+# JSON64-NEXT:      "Tag": 1,
+# JSON64-NEXT:      "Value": "Shared library: [D]",
+# JSON64-NEXT:      "Type": "NEEDED"
+# JSON64-NEXT:  },
+# JSON64-NEXT:  {
+# JSON64-NEXT:		"Tag": 2,
+# JSON64-NEXT:      "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "PLTRELSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 3,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "PLTGOT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 4,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "HASH"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 5,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "STRTAB"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 6,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "SYMTAB"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 7,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "RELA"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 8,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "RELASZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 9,
+# JSON64-NEXT:        "Value": "1929 (bytes)",
+# JSON64-NEXT:        "Type": "RELAENT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 10,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "STRSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 11,
+# JSON64-NEXT:        "Value": "2439 (bytes)",
+# JSON64-NEXT:        "Type": "SYMENT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 12,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "INIT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 13,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "FINI"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 14,
+# JSON64-NEXT:        "Value": "Library soname: [U]",
+# JSON64-NEXT:        "Type": "SONAME"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 15,
+# JSON64-NEXT:        "Value": "Library rpath: [f]",
+# JSON64-NEXT:        "Type": "RPATH"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 16,
+# JSON64-NEXT:        "Value": "0x1234567890ABCDEF",
+# JSON64-NEXT:        "Type": "SYMBOLIC"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 17,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "REL"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 18,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "RELSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 19,
+# JSON64-NEXT:        "Value": "291 (bytes)",
+# JSON64-NEXT:        "Type": "RELENT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 20,
+# JSON64-NEXT:        "Value": "RELA",
+# JSON64-NEXT:        "Type": "PLTREL"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 21,
+# JSON64-NEXT:        "Value": "0xFEDCBA0987654321",
+# JSON64-NEXT:        "Type": "DEBUG"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 22,
+# JSON64-NEXT:        "Value": "0x1122334455667788",
+# JSON64-NEXT:        "Type": "TEXTREL"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 23,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "JMPREL"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 24,
+# JSON64-NEXT:        "Value": "0x8877665544332211",
+# JSON64-NEXT:        "Type": "BIND_NOW"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 25,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "INIT_ARRAY"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 26,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "FINI_ARRAY"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 27,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "INIT_ARRAYSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 28,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "FINI_ARRAYSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 29,
+# JSON64-NEXT:        "Value": "Library runpath: [w]",
+# JSON64-NEXT:        "Type": "RUNPATH"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 30,
+# JSON64-NEXT:        "Value": "ORIGIN SYMBOLIC TEXTREL BIND_NOW STATIC_TLS ",
+# JSON64-NEXT:        "Type": "FLAGS"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 32,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "PREINIT_ARRAY"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 33,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "PREINIT_ARRAYSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 34,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "SYMTAB_SHNDX"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 35,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "RELRSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 36,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "RELR"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 37,
+# JSON64-NEXT:        "Value": "17185 (bytes)",
+# JSON64-NEXT:        "Type": "RELRENT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1610612751,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "ANDROID_REL"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1610612752,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "ANDROID_RELSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1610612753,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "ANDROID_RELA"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1610612754,
+# JSON64-NEXT:        "Value": "16 (bytes)",
+# JSON64-NEXT:        "Type": "ANDROID_RELASZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879040000,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "ANDROID_RELR"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879040001,
+# JSON64-NEXT:        "Value": "0x10",
+# JSON64-NEXT:        "Type": "ANDROID_RELRSZ"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879040003,
+# JSON64-NEXT:        "Value": "0x1234",
+# JSON64-NEXT:        "Type": "ANDROID_RELRENT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879047925,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "GNU_HASH"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879047926,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "TLSDESC_PLT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879047927,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "TLSDESC_GOT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048185,
+# JSON64-NEXT:        "Value": "0",
+# JSON64-NEXT:        "Type": "RELACOUNT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048186,
+# JSON64-NEXT:        "Value": "0",
+# JSON64-NEXT:        "Type": "RELCOUNT"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048187,
+# JSON64-NEXT:        "Value": "NOW GLOBAL GROUP NODELETE LOADFLTR INITFIRST NOOPEN ORIGIN DIRECT TRANS INTERPOSE NODEFLIB NODUMP CONFALT ENDFILTEE DISPRELDNE DISPRELPND NODIRECT IGNMULDEF NOKSYMS NOHDR EDITED NORELOC SYMINTPOSE GLOBAUDIT SINGLETON PIE ",
+# JSON64-NEXT:        "Type": "FLAGS_1"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048176,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "VERSYM"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048188,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "VERDEF"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048189,
+# JSON64-NEXT:        "Value": "0",
+# JSON64-NEXT:        "Type": "VERDEFNUM"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048190,
+# JSON64-NEXT:        "Value": "0x1000",
+# JSON64-NEXT:        "Type": "VERNEED"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1879048191,
+# JSON64-NEXT:        "Value": "0",
+# JSON64-NEXT:        "Type": "VERNEEDNUM"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 2147483645,
+# JSON64-NEXT:        "Value": "Auxiliary library: [D]",
+# JSON64-NEXT:        "Type": "AUXILIARY"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 2147483646,
+# JSON64-NEXT:        "Value": "Not needed object: [U]",
+# JSON64-NEXT:        "Type": "USED"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 2147483647,
+# JSON64-NEXT:        "Value": "Filter library: [U]",
+# JSON64-NEXT:        "Type": "FILTER"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 305419896,
+# JSON64-NEXT:        "Value": "0x8765432187654321",
+# JSON64-NEXT:        "Type": "<unknown:>0x12345678"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1790762736,
+# JSON64-NEXT:        "Value": "0x9988776655443322",
+# JSON64-NEXT:        "Type": "<unknown:>0x6abcdef0"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 1985229328,
+# JSON64-NEXT:        "Value": "0x5555666677778888",
+# JSON64-NEXT:        "Type": "<unknown:>0x76543210"
+# JSON64-NEXT:      },
+# JSON64-NEXT:      {
+# JSON64-NEXT:        "Tag": 0,
+# JSON64-NEXT:        "Value": "0x0",
+# JSON64-NEXT:        "Type": "NULL"
+# JSON64-NEXT:      }
+# JSON64-NEXT:    ]
+# JSON64-NEXT:  }
+# JSON64-NEXT:]
+
 --- !ELF
 FileHeader:
   Class:      ELFCLASS[[BITS=64]]
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index c696934a959b2..f98997ab1f23f 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -793,6 +793,8 @@ template <typename ELFT> class JSONELFDumper : public LLVMELFDumper<ELFT> {
 
   void printEmptyGroupMessage() const override;
 
+  void printDynamicTable() override;
+
 private:
   std::unique_ptr<DictScope> FileScope;
 };
@@ -7365,6 +7367,21 @@ template <class ELFT> void LLVMELFDumper<ELFT>::printDynamicTable() {
   W.startLine() << "]\n";
 }
 
+template <class ELFT> void JSONELFDumper<ELFT>::printDynamicTable() {
+  Elf_Dyn_Range Table = this->dynamic_table();
+  if (Table.empty())
+    return;
+  ListScope L(this->W, "DynamicSection");
+  for (const auto &Entry : Table) {
+    DictScope D(this->W);
+    uintX_t Tag = Entry.getTag();
+    std::string Value = this->getDynamicEntry(Tag, Entry.getVal());
+    this->W.printHex("Tag", Tag);
+    this->W.printString("Value", Value);
+    this->W.printString("Type", this->Obj.getDynamicTagAsString(Tag));
+  }
+}
+
 template <class ELFT> void LLVMELFDumper<ELFT>::printDynamicRelocations() {
   W.startLine() << "Dynamic Relocations {\n";
   W.indent();

@feg208 feg208 force-pushed the users/feg208/fix_json_output_for_readelf_dynamic_table branch from 8a29f77 to 1d73352 Compare June 18, 2024 19:38
@ilovepi
Copy link
Contributor

ilovepi commented Jun 18, 2024

Can you include a snippet in the commit message that shows how the output is changing? We've done that for patches like this in the past, and it helps make clear what the new behavior is.

@ilovepi
Copy link
Contributor

ilovepi commented Jun 18, 2024

I think this is basically LGTM from me, modulo the output diff. I'll defer to @jh7370 for final approval, though.

@feg208 feg208 force-pushed the users/feg208/fix_json_output_for_readelf_dynamic_table branch from 1d73352 to 15230eb Compare June 18, 2024 20:02
@feg208
Copy link
Contributor Author

feg208 commented Jun 18, 2024

Can you include a snippet in the commit message that shows how the output is changing? We've done that for patches like this in the past, and it helps make clear what the new behavior is.

Done. I added it here and in the commit message

Comment on lines 7377 to 7423
this->W.printHex("Tag", Tag);
this->W.printString("Value", Value);
this->W.printString("Type", this->Obj.getDynamicTagAsString(Tag));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to give this output a bit more consideration, given that this is JSON and therefore intended to be machine-readable (otherwise, we could just use LLVM-style output). In particular, I have the following two thoughts:

  1. Tag/Type fields are just different ways of representing the same information. Do we need both of them?
  2. The Value field is a string, yet this is almost never the "natural" representation of the value for a dynamic tag. It also is rarely the actual value, because it is an interpretation of that value. In my opinion, this potentially deserves splitting into two fields: a raw integer value (which is just the "bytes on disk" numeric value for the entry), and an interpreted value, in a format that is appropriate for the interpretation, e.g. a list of strings, or an integer (in the latter case, arguably the interpreted field may not be necessary).

Thoughts?

Copy link
Contributor Author

@feg208 feg208 Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I do like having the type string there but I can appreciate that it's duplicative.
  2. Yeah. what about a construct like
"Value": { "bytes": 16 },
"Type": "PLTRELSZ",
},
....
"Value": { "Shared library": "[D]" }
"Type": "NEEDED"
},
...
"Value": {"value": 4096 },
"Type": "HASH"
},

and so on? Maybe using "string" and "number" or something to disambiguate the cases where there is just a string or just a number

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to having the type field, if you think it's useful for readability of the output.

As I understand JSON, I'm not sure we need a nested object for the Value field, because the individual DynamicTable entries can have different types for that field. In other words, I think it could look something like this:

...
"Value": 16
"Type": "PLTRELSZ"
...
"Value": ["A", "B"]
"RawValue":4321
"Type": "NEEDED"
...
"Value": 4096
"Type": "HASH"

Note that I've added "RawValue" for the case where the value isn't just the raw number in the dynamic table. For some dynamic entries, the value is the actual meaning. In other cases, it's a pointer to some other information stored somewhere. The "bytes" bit of the size is unnecessary, since the size values are implicitly in bytes.

A possible variation on my above suggestion is the following, where every entry has a "Value" member, which is the raw value stored in the dynamic table, and in the cases where that value has some interpreted meaning, there's an additional entry. In this way, a generic parser doesn't need to understand the tag types to be able to record the raw values, since all the "Value" entries have the same type.

"Type": "PLTRELSZ",
"Value": 16
...
"Type": "NEEDED",
"Value": 1234,
"Libraries": ["A", "B", "C"]
...
"Type": "SONAME",
"Value": 100,
"Name": "my-library"

(For simplicity in both examples, I've not included separate Tag/Type fields, but these can be included as per the current behaviour)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with option 2. How does that strike you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we are good with the format I'll clean up the commit message so the output matches @ilovepi to keep everything tidy

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to respond to this: current format is basically fine now, thanks. I'll continue reviewing and make any more comments I find.

@feg208 feg208 force-pushed the users/feg208/fix_json_output_for_readelf_dynamic_table branch from 15230eb to b9b91c1 Compare June 19, 2024 16:56
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the LLVM PR guidelines, please avoid force pushes on your branch: they make it harder to review, because you can no longer see what's changed since the previous version. If rebasing is necessary, you can use a merge commit from the base branch. When you squash and merge upon submission, the merge commit (and any fixup commits) will disappear.

I also prefer it if you don't "Resolve Conversation" on threads I've started, because I use this to track comments that I'm happy have been satisfactorily resolved (you'd be surprised how often people click "Resolve Conversation" when they haven't made the requested change, or they've made it incorrectly. See https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 for more context.

Comment on lines 7377 to 7423
this->W.printHex("Tag", Tag);
this->W.printString("Value", Value);
this->W.printString("Type", this->Obj.getDynamicTagAsString(Tag));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to having the type field, if you think it's useful for readability of the output.

As I understand JSON, I'm not sure we need a nested object for the Value field, because the individual DynamicTable entries can have different types for that field. In other words, I think it could look something like this:

...
"Value": 16
"Type": "PLTRELSZ"
...
"Value": ["A", "B"]
"RawValue":4321
"Type": "NEEDED"
...
"Value": 4096
"Type": "HASH"

Note that I've added "RawValue" for the case where the value isn't just the raw number in the dynamic table. For some dynamic entries, the value is the actual meaning. In other cases, it's a pointer to some other information stored somewhere. The "bytes" bit of the size is unnecessary, since the size values are implicitly in bytes.

A possible variation on my above suggestion is the following, where every entry has a "Value" member, which is the raw value stored in the dynamic table, and in the cases where that value has some interpreted meaning, there's an additional entry. In this way, a generic parser doesn't need to understand the tag types to be able to record the raw values, since all the "Value" entries have the same type.

"Type": "PLTRELSZ",
"Value": 16
...
"Type": "NEEDED",
"Value": 1234,
"Libraries": ["A", "B", "C"]
...
"Type": "SONAME",
"Value": 100,
"Name": "my-library"

(For simplicity in both examples, I've not included separate Tag/Type fields, but these can be included as per the current behaviour)

# JSON64-NEXT: {
# JSON64-NEXT: "Tag": 30,
# JSON64-NEXT: "Value": 18446744073709551615,
# JSON64-NEXT: "Flags": "ORIGIN SYMBOLIC TEXTREL BIND_NOW STATIC_TLS ",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing space may look strange

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It seems like a bug in the dynamic string method. I'll find it and squash it. It doesn't show up in the LLVM formatter since there aren't quotes to delineate it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have thought that Flags should be a list anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. done

this->getDynamicEntry(Entry.getTag(), Entry.getVal()));
break;
default:
break;
Copy link
Member

@MaskRay MaskRay Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps return for the default clause.

Make these string tags shared one this->W.printString(this->getDynamicString(Entry.getVal())); after the switch table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super sure how to make this work? I tried doing a printString for the label and then a printString() below but that caused crashes. Could you give me a short example for what you are looking for here?

@MaskRay
Copy link
Member

MaskRay commented Jun 20, 2024

Fix broken JSON output

The subject can probably say Implement JSON output for . The JSON output supports a few dump options and -d is not implemented yet.

@MaskRay
Copy link
Member

MaskRay commented Jun 20, 2024

Per the LLVM PR guidelines, please avoid force pushes on your branch: they make it harder to review, because you can no longer see what's changed since the previous version. If rebasing is necessary, you can use a merge commit from the base branch. When you squash and merge upon submission, the merge commit (and any fixup commits) will disappear.

You can use getcord/spr if you prefer rebases and force pushes. The tool will propagate changes from your local branch to the PR branch and the PR branch only performs merges. https://maskray.me/blog/2023-09-09-reflections-on-llvm-switch-to-github-pull-requests#my-workflow

@feg208
Copy link
Contributor Author

feg208 commented Jun 20, 2024

You can use getcord/spr if you prefer rebases and force pushes. The tool will propagate changes from your local branch to the PR branch and the PR branch only performs merges. https://maskray.me/blog/2023-09-09-reflections-on-llvm-switch-to-github-pull-requests#my-workflow

Oh thanks a bunch. I'll check it out

@feg208 feg208 changed the title [llvm-readobj][ELF] Fix broken JSON output with --dynamic-table [llvm-readobj][ELF] Implement JSON output for --dynamic-table Jun 21, 2024
@feg208
Copy link
Contributor Author

feg208 commented Jun 21, 2024

The subject can probably say Implement JSON output for . The JSON output supports a few dump options and -d is not implemented yet.

Done. Hopefully I did not give offense with the prior phrasing. None was meant certainly.

# JSON64-NEXT: "Type": "NEEDED"
# JSON64-NEXT: },
# JSON64-NEXT: {
# JSON64-NEXT: "Tag": 2,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing has gone wrong again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blech. vim. fixed

this->W.printHex("Tag", Tag);
this->W.printHex("Value", Entry.getVal());
this->printAuxillaryDynamicTableEntryInfo(Entry);
this->W.printString("Type", this->Obj.getDynamicTagAsString(Tag));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: as they're directly related, I'd put the Type string line immediately after the Tag one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# JSON64-NEXT: {
# JSON64-NEXT: "Tag": 30,
# JSON64-NEXT: "Value": 18446744073709551615,
# JSON64-NEXT: "Flags": "ORIGIN SYMBOLIC TEXTREL BIND_NOW STATIC_TLS ",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have thought that Flags should be a list anyway?

# JSON64-NEXT: {
# JSON64-NEXT: "Tag": 1879048187,
# JSON64-NEXT: "Value": 18446744073709551615,
# JSON64-NEXT: "Flags": "NOW GLOBAL GROUP NODELETE LOADFLTR INITFIRST NOOPEN ORIGIN DIRECT TRANS INTERPOSE NODEFLIB NODUMP CONFALT ENDFILTEE DISPRELDNE DISPRELPND NODIRECT IGNMULDEF NOKSYMS NOHDR EDITED NORELOC SYMINTPOSE GLOBAUDIT SINGLETON PIE ",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as above: I feel like this would be better as a list, not a concatenated string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree. done

this->W.printString("Name", this->getDynamicString(Entry.getVal()));
break;
case DT_AUXILIARY:
LLVM_FALLTHROUGH;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure LLVM_FALLTHROUGH is needed where there's no body to the case statement. You should be able to just do:

case DT_AUXILIARY:
case DT_FILTER:
case DT_NEEDED:
  ListScope L(this->W, "Libraries");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

LLVM_FALLTHROUGH;
case DT_NEEDED: {
ListScope L(this->W, "Libraries");
this->W.printString(this->getDynamicString(Entry.getVal()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libraries is a list, so you should have 1 list element per library. You may want to enhance the test case too to have multiple elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this below but looking at the spec I am not clear how this can possibly have multiple elements. But there must be something I am missing

} break;
case DT_USED: {
ListScope L(this->W, "Objects");
this->W.printString(this->getDynamicString(Entry.getVal()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above, for here and all the other list types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so reading the elf docs I think only the Path is really multiply valued. All the others are to an offset in the string table with no documented separator like ':' or ';' or anything. I wonder why the original was a

[D]

to begin with. I must be missing something though. I am going off https://refspecs.linuxfoundation.org/elf/elf.pdf 2-12

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, with a couple of nits. Just don't forget to update the commit message before merging.

Would you be okay to do a (hopefully) small follow-up change to tidy up the slightly weird LLVM output, where it is using lists for single-item elements, please?

this->W.printString(front);
Value = back;
}
} break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd put the break inside the brace. Having it here feels wrong syntactically, even if it's not invalid.

StringRef Value = this->getDynamicString(Entry.getVal());
ListScope L(this->W, "Path");
while (!Value.empty()) {
auto [front, back] = Value.split(':');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
auto [front, back] = Value.split(':');
auto [Front, Back] = Value.split(':');

# JSON64-NEXT: "Type": "RPATH",
# JSON64-NEXT: "Value": 5,
# JSON64-NEXT: "Path": [
# JSON64-NEXT: "f"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and for RUNPATH, you should probably have multiple elements in the path string. That will impact the output of the other test cases in this file, so I'm okay if you want to defer it to the follow-up patch I mention in the out-of-line comment, if you prefer.

@feg208
Copy link
Contributor Author

feg208 commented Jun 24, 2024

LGTM now, with a couple of nits. Just don't forget to update the commit message before merging.

Will do.

Would you be okay to do a (hopefully) small follow-up change to tidy up the slightly weird LLVM output, where it is using lists for single-item elements, please?

Sure thing. I'll make it happen. I am assuming just in this place.

@jh7370
Copy link
Collaborator

jh7370 commented Jun 24, 2024

Sure thing. I'll make it happen. I am assuming just in this place.

As you've noticed yourself, it looks like the only case where there is a list is the RPATH/RUNPATH tags, so I guess take a look at ListScope occurrences in the relevant bit of code.

LGTM now, with a couple of nits. Just don't forget to update the commit message before merging.

Will do.

Would you be okay to do a (hopefully) small follow-up change to tidy up the slightly weird LLVM output, where it is using lists for single-item elements, please?

Sure thing. I'll make it happen. I am assuming just in this place.

I'm thinking

is the area that needs changing for the LLVM-style output.

@feg208
Copy link
Contributor Author

feg208 commented Jun 24, 2024

I'm thinking

is the area that needs changing for the LLVM-style output.

Do we want to just jettison the brackets? Or were you thinking of something else?

feg208 added 3 commits June 24, 2024 10:17
When printing JSON output with --dynamic-table I noticed that the
output is invalid JSON. This patch overrides the printDynamicTable()
function in the JSONELFDumper to return a list of dictionaries for the
DynamicSection value.

Before this commit the output was:
{
    "FileSummary": {
      "File": "bin/llvm-readelf",
      "Format": "elf64-x86-64",
      "Arch": "x86_64",
      "AddressSize": "64bit",
      "LoadName": "<Not found>"
    }DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]

Whereas now it is:

 "DynamicSection": [
      {
        "Tag": 29,
        "Type": "RUNPATH",
        "Value": 6322,
        "Path": [
          "$ORIGIN/../lib"
        ]
      },
      {
        "Tag": 1,
        "Type": "NEEDED",
        "Value": 6109,
        "Library": "libm.so.6"
      },
@feg208 feg208 force-pushed the users/feg208/fix_json_output_for_readelf_dynamic_table branch from 1f18a56 to 899ebf5 Compare June 24, 2024 17:18
@feg208 feg208 merged commit 0ab8198 into llvm:main Jun 24, 2024
5 of 6 checks passed
feg208 added a commit to feg208/llvm-project that referenced this pull request Jun 24, 2024
…ut for readobj to emphasize the single valued nature of NEEDED, SONAME, USED etc.

In the context of llvm#95976 it became clear that the output for readobj
implied multi valued entries in several cases in the elf headers that
the documentation only allowed for a single value. DT_NEEDED is the
example here where the value is an offset into the string table without
any sort of separator that could give you multiple entries. This patch
alters the LLVM output so that the single valued nature is emphasized.
For example the output was:
```
DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]
```
and is now
```
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: libm.so.6
  0x0000000000000001 NEEDED       Shared library: libz.so.1
  0x0000000000000001 NEEDED       Shared library: libzstd.so.1
```
This pr also tests that multi-valued rpaths are handled correctly in the
JSON case (i.e. they become proper lists) like:
```
{
    "Tag": 15,
    "Type": "RPATH",
    "Value": 9,
    "Path": [
        "x",
        "w",
        "U"
    ]
},
```
when separated by :
@jh7370
Copy link
Collaborator

jh7370 commented Jun 25, 2024

I'm thinking

is the area that needs changing for the LLVM-style output.

Do we want to just jettison the brackets? Or were you thinking of something else?

Just jettisoning the brackets. I suspect whoever implemented this option got themselves confused and tried to format things like a list for individual entries (note that it's certainly possible to have multiple DT_NEEDED etc tags, so a "summarised" version might have a list of needed libraries, but in the case we're talking about, we're only dealing with each tag individually, and therefore only a single entry).

feg208 added a commit that referenced this pull request Jun 26, 2024
…ut for readobj to emphasize the single valued nature of NEEDED, SONAME, USED etc. (#96562)

In the context of #95976 it became clear that the output for readobj
implied multi valued entries in several cases in the elf headers that
the documentation only allowed for a single value. DT_NEEDED is the
example here where the value is an offset into the string table without
any sort of separator that could give you multiple entries. This patch
alters the LLVM output so that the single valued nature is emphasized.
For example the output was:
```
DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]
```
and is now
```
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: libm.so.6
  0x0000000000000001 NEEDED       Shared library: libz.so.1
  0x0000000000000001 NEEDED       Shared library: libzstd.so.1
```
This pr also tests that multi-valued rpaths are handled correctly in the
JSON case (i.e. they become proper lists) like:
```
{
    "Tag": 15,
    "Type": "RPATH",
    "Value": 9,
    "Path": [
        "x",
        "w",
        "U"
    ]
},
```
when separated by :
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
…ut for readobj to emphasize the single valued nature of NEEDED, SONAME, USED etc. (llvm#96562)

In the context of llvm#95976 it became clear that the output for readobj
implied multi valued entries in several cases in the elf headers that
the documentation only allowed for a single value. DT_NEEDED is the
example here where the value is an offset into the string table without
any sort of separator that could give you multiple entries. This patch
alters the LLVM output so that the single valued nature is emphasized.
For example the output was:
```
DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]
```
and is now
```
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: libm.so.6
  0x0000000000000001 NEEDED       Shared library: libz.so.1
  0x0000000000000001 NEEDED       Shared library: libzstd.so.1
```
This pr also tests that multi-valued rpaths are handled correctly in the
JSON case (i.e. they become proper lists) like:
```
{
    "Tag": 15,
    "Type": "RPATH",
    "Value": 9,
    "Path": [
        "x",
        "w",
        "U"
    ]
},
```
when separated by :
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…5976)

When printing JSON output with --dynamic-table I noticed that the output
is invalid JSON. This patch overrides the printDynamicTable() function
in the JSONELFDumper to return a list of dictionaries for the
DynamicSection value.
Before the output was:
```
 {
    "FileSummary": {
      "File": "bin/llvm-readelf",
      "Format": "elf64-x86-64",
      "Arch": "x86_64",
      "AddressSize": "64bit",
      "LoadName": "<Not found>"
    }DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]
```
Now the output looks like:
```
 "DynamicSection": [
      {
        "Tag": 29,
        "Type": "RUNPATH",
        "Value": 6322,
        "Path": [
          "$ORIGIN/../lib"
        ]
      },
      {
        "Tag": 1,
        "Type": "NEEDED",
        "Value": 6109,
        "Library": "libm.so.6"
      },
```
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…ut for readobj to emphasize the single valued nature of NEEDED, SONAME, USED etc. (llvm#96562)

In the context of llvm#95976 it became clear that the output for readobj
implied multi valued entries in several cases in the elf headers that
the documentation only allowed for a single value. DT_NEEDED is the
example here where the value is an offset into the string table without
any sort of separator that could give you multiple entries. This patch
alters the LLVM output so that the single valued nature is emphasized.
For example the output was:
```
DynamicSection [ (35 entries)
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: [libm.so.6]
  0x0000000000000001 NEEDED       Shared library: [libz.so.1]
  0x0000000000000001 NEEDED       Shared library: [libzstd.so.1]
```
and is now
```
  Tag                Type         Name/Value
  0x000000000000001D RUNPATH      Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED       Shared library: libm.so.6
  0x0000000000000001 NEEDED       Shared library: libz.so.1
  0x0000000000000001 NEEDED       Shared library: libzstd.so.1
```
This pr also tests that multi-valued rpaths are handled correctly in the
JSON case (i.e. they become proper lists) like:
```
{
    "Tag": 15,
    "Type": "RPATH",
    "Value": 9,
    "Path": [
        "x",
        "w",
        "U"
    ]
},
```
when separated by :
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants