
Commit 6c2327a

[BPF] Add BTF generation for BPF target
BTF is the debug format for BPF, a kernel virtual machine that is widely used for tracing, networking, and security ([1]).

Currently only instruction streams are passed to the kernel, and the kernel verifier verifies them before execution. To give user space tools better visibility into bpf programs, the kernel needs some debug information, e.g., function names and debug line information, so that tools can annotate jited instructions for performance or other reasons.

DWARF is too complicated for the kernel and for BPF. Hence, BTF is designed to be the debug format for BPF ([2]). Right now, pahole supports BTF for types, which are generated based on dwarf sections in the ELF file.

In order to annotate performance metrics for jited bpf insns, it is necessary to pass debug line info to the kernel. Furthermore, we want to pass the actual code to the kernel for the following reasons:
  . A bpf program is typically small, so the storage overhead should be small.
  . In bpf land, it is entirely possible that an application loads the bpf program into the kernel and then quits, so having the user space application hold the debug info is not practical.
  . Having the source code kept directly by the kernel eases deployment, since the original source code does not need to ship to every host and the kernel-devel package does not need to be deployed even if kernel headers are used.

The only reliable time to get the source code is at compilation time. This results in both more accurate information and easier deployment, as stated above.

Another consideration is JIT. Projects like bcc use MCJIT to compile a C program into bpf insns and load them into the kernel ([3]). The generated BTF sections will be readily available for such cases as well.

This patch implements generation of BTF info in the llvm compiler. The BTF related sections are generated when both -target bpf and -g are specified.

Two sections are generated: .BTF contains all the type and string information, and .BTF.ext contains the func_info and line_info. The separation reflects how the two sections are used differently in a bpf loader, e.g., linux libbpf ([4]). The .BTF section can be loaded into the kernel directly, while .BTF.ext needs loader manipulation before it is loaded into the kernel.

The format of each section is roughly defined in llvm:include/llvm/MC/MCBTFContext.h and by the implementation in llvm:lib/MC/MCBTFContext.cpp. A later example also shows the contents of each section.

The type and func_info are gathered during CodeGen/AsmPrinter by traversing dwarf debug_info. The line_info is gathered in MCObjectStreamer before writing to the object file. After all the information is gathered, the two sections are emitted in MCObjectStreamer::finishImpl.

With cmake CMAKE_BUILD_TYPE=Debug, the compiler can dump all the tables except insn offset, which is resolved later through relocation records. The debug type "btf" is used for the BTFContext dump.

Dwarf tests the debug info generation by using llvm-dwarfdump to decode the binary sections and check whether the result is as expected. We do not have such a tool for BTF yet. We will implement btf dump functionality in bpftool ([5]), as bpftool is considered the recommended tool for bpf introspection. The implementation for type and func_info is tested with linux kernel test cases. The line_info is visually checked with a dump from linux kernel libbpf ([4]) and with readelf dumping the raw section data.

Note that the .BTF and .BTF.ext information is not emitted to assembly code, and there is no assembler support for BTF either.

Below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug, the contents of each table are shown for a simple C program.

  -bash-4.2$ cat -n test.c
       1  struct A {
       2    int a;
       3    char b;
       4  };
       5
       6  int test(struct A *t) {
       7    return t->a;
       8  }
  -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c
  Type Table:
  [1] FUNC name_off=1 info=0x0c000001 size/type=2
    param_type=3
  [2] INT name_off=12 info=0x01000000 size/type=4
    desc=0x01000020
  [3] PTR name_off=0 info=0x02000000 size/type=4
  [4] STRUCT name_off=16 info=0x04000002 size/type=8
    name_off=18 type=2 bit_offset=0
    name_off=20 type=5 bit_offset=32
  [5] INT name_off=22 info=0x01000000 size/type=1
    desc=0x02000008

  String Table:
  0 :
  1 : test
  6 : .text
  12 : int
  16 : A
  18 : a
  20 : b
  22 : char
  27 : test.c
  34 : int test(struct A *t) {
  58 : return t->a;

  FuncInfo Table:
  sec_name_off=6
    insn_offset=<Omitted> type_id=1

  LineInfo Table:
  sec_name_off=6
    insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0
    insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3
  -bash-4.2$ readelf -S test.o
  ......
    [12] .BTF              PROGBITS         0000000000000000  0000028d
         00000000000000c1  0000000000000000           0     0     1
    [13] .BTF.ext          PROGBITS         0000000000000000  0000034e
         0000000000000050  0000000000000000           0     0     1
    [14] .rel.BTF.ext      REL              0000000000000000  00000648
         0000000000000030  0000000000000010          16    13     8
  ......
  -bash-4.2$

The latest linux kernel ([6]) already supports .BTF with type information. [7] has the reference implementation on the linux kernel side to support .BTF.ext func_info. The .BTF.ext line_info support is not implemented yet. If you have difficulty accessing [7], you can manually do the following to access the code:

  git clone https://github.com/yonghong-song/bpf-next-linux.git
  cd bpf-next-linux
  git checkout btf

The change will be pushed to the linux kernel soon after this patch lands.

References:
[1]. https://www.kernel.org/doc/Documentation/networking/filter.txt
[2]. https://lwn.net/Articles/750695/
[3]. https://github.com/iovisor/bcc
[4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf
[5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool
[6]. https://github.com/torvalds/linux
[7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf

Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>

Differential Revision: https://reviews.llvm.org/D52950

llvm-svn: 344366
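
The `info` and `desc` words in the Type Table dump of the commit message can be decoded with the bit layouts that MCBTFContext.h defines. The following is a minimal sketch, not code from the patch: the helper names (`infoKind`, `recordSize`, and so on) are made up for illustration, and the per-kind record sizes assume that `struct btf_type` is 12 bytes, `struct btf_member` and `struct btf_array` are 12 bytes, `struct btf_enum` is 8 bytes, and `unsigned` is 4 bytes, which follows from the field lists in the header.

```cpp
#include <cassert>
#include <cstdint>

// Same bit layout as BTF_INFO_KIND / BTF_INFO_VLEN in MCBTFContext.h.
// The helper names are illustrative and do not appear in the patch.
constexpr uint32_t infoKind(uint32_t Info) { return (Info >> 24) & 0x0f; }
constexpr uint32_t infoVlen(uint32_t Info) { return Info & 0xffff; }

// Same bit layout as BTF_INT_ENCODING / BTF_INT_OFFSET / BTF_INT_BITS.
constexpr uint32_t intEncoding(uint32_t Desc) { return (Desc & 0x0f000000) >> 24; }
constexpr uint32_t intOffset(uint32_t Desc) { return (Desc & 0x00ff0000) >> 16; }
constexpr uint32_t intBits(uint32_t Desc) { return Desc & 0x000000ff; }

// Values copied from the dump: [1] FUNC, [4] STRUCT, and the two INT descs.
static_assert(infoKind(0x0c000001) == 12, "[1] is BTF_KIND_FUNC");
static_assert(infoVlen(0x0c000001) == 1, "test() has one parameter");
static_assert(infoKind(0x04000002) == 4, "[4] is BTF_KIND_STRUCT");
static_assert(infoVlen(0x04000002) == 2, "struct A has two members");
static_assert(intEncoding(0x01000020) == 1, "int is BTF_INT_SIGNED");
static_assert(intBits(0x01000020) == 32, "int is 32 bits wide");
static_assert(intEncoding(0x02000008) == 2, "char is BTF_INT_CHAR");
static_assert(intBits(0x02000008) == 8, "char is 8 bits wide");

// Bytes occupied by one type record, following the getSize() overrides
// declared in the header (sizes are assumptions stated in the lead-in).
inline uint32_t recordSize(uint32_t Info) {
  const uint32_t Base = 12; // struct btf_type: three 32-bit words
  switch (infoKind(Info)) {
  case 1:  return Base + 4;                   // INT: one extra u32
  case 3:  return Base + 12;                  // ARRAY: one btf_array
  case 4:
  case 5:  return Base + infoVlen(Info) * 12; // STRUCT/UNION: btf_member[]
  case 6:  return Base + infoVlen(Info) * 8;  // ENUM: btf_enum[]
  case 12:
  case 13: return Base + infoVlen(Info) * 4;  // FUNC/FUNC_PROTO: param ids
  default: return Base;                       // reference kinds
  }
}

// Sums the record sizes of the five types shown in the dump.
inline uint32_t dumpedTypeSectionSize() {
  const uint32_t Infos[] = {0x0c000001, 0x01000000, 0x02000000,
                            0x04000002, 0x01000000};
  uint32_t Total = 0;
  for (uint32_t Info : Infos) {
    assert(infoKind(Info) <= 13 && "kind must not exceed BTF_KIND_MAX");
    Total += recordSize(Info);
  }
  return Total;
}
```

Under these assumptions the five dumped records add up to 96 bytes of type data, which sits between the 24-byte `btf_header` and the string table inside the .BTF section.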
1 parent e67b68f commit 6c2327a

18 files changed: +1454 -1 lines changed

llvm/include/llvm/MC/MCBTFContext.h

Lines changed: 364 additions & 0 deletions
@@ -0,0 +1,364 @@
+//===- MCBTFContext.h ---------------------------------------- *- C++ --*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+// This header file contains two parts. The first part is the BTF ELF
+// specification in C format, and the second part is the various
+// C++ classes to manipulate the data structure in order to generate
+// the BTF related ELF sections.
+//===----------------------------------------------------------------------===//
+#ifndef LLVM_MC_MCBTFCONTEXT_H
+#define LLVM_MC_MCBTFCONTEXT_H
+
+#include <linux/types.h>
+
+#define BTF_MAGIC 0xeB9F
+#define BTF_VERSION 1
+
+struct btf_header {
+  __u16 magic;
+  __u8 version;
+  __u8 flags;
+  __u32 hdr_len;
+
+  /* All offsets are in bytes relative to the end of this header */
+  __u32 type_off; /* offset of type section */
+  __u32 type_len; /* length of type section */
+  __u32 str_off; /* offset of string section */
+  __u32 str_len; /* length of string section */
+};
+
+/* Max # of type identifier */
+#define BTF_MAX_TYPE 0x0000ffff
+/* Max offset into the string section */
+#define BTF_MAX_NAME_OFFSET 0x0000ffff
+/* Max # of struct/union/enum members or func args */
+#define BTF_MAX_VLEN 0xffff
+
+struct btf_type {
+  __u32 name_off;
+  /* "info" bits arrangement
+   * bits 0-15: vlen (e.g. # of struct's members)
+   * bits 16-23: unused
+   * bits 24-27: kind (e.g. int, ptr, array...etc)
+   * bits 28-31: unused
+   */
+  __u32 info;
+  /* "size" is used by INT, ENUM, STRUCT and UNION.
+   * "size" tells the size of the type it is describing.
+   *
+   * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
+   * FUNC and FUNC_PROTO.
+   * "type" is a type_id referring to another type.
+   */
+  union {
+    __u32 size;
+    __u32 type;
+  };
+};
+
+#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f)
+#define BTF_INFO_VLEN(info) ((info) & 0xffff)
+
+#define BTF_KIND_UNKN 0 /* Unknown */
+#define BTF_KIND_INT 1 /* Integer */
+#define BTF_KIND_PTR 2 /* Pointer */
+#define BTF_KIND_ARRAY 3 /* Array */
+#define BTF_KIND_STRUCT 4 /* Struct */
+#define BTF_KIND_UNION 5 /* Union */
+#define BTF_KIND_ENUM 6 /* Enumeration */
+#define BTF_KIND_FWD 7 /* Forward */
+#define BTF_KIND_TYPEDEF 8 /* Typedef */
+#define BTF_KIND_VOLATILE 9 /* Volatile */
+#define BTF_KIND_CONST 10 /* Const */
+#define BTF_KIND_RESTRICT 11 /* Restrict */
+#define BTF_KIND_FUNC 12 /* Function */
+#define BTF_KIND_FUNC_PROTO 13 /* Function Prototype */
+#define BTF_KIND_MAX 13
+#define NR_BTF_KINDS 14
+
+/* For some specific BTF_KIND, "struct btf_type" is immediately
+ * followed by extra data.
+ */
+
+/* BTF_KIND_INT is followed by a u32 and the following
+ * is the 32 bits arrangement:
+ */
+#define BTF_INT_ENCODING(VAL) (((VAL) & 0x0f000000) >> 24)
+#define BTF_INT_OFFSET(VAL) (((VAL & 0x00ff0000)) >> 16)
+#define BTF_INT_BITS(VAL) ((VAL) & 0x000000ff)
+
+/* Attributes stored in the BTF_INT_ENCODING */
+#define BTF_INT_SIGNED (1 << 0)
+#define BTF_INT_CHAR (1 << 1)
+#define BTF_INT_BOOL (1 << 2)
+
+/* BTF_KIND_ENUM is followed by multiple "struct btf_enum".
+ * The exact number of btf_enum is stored in the vlen (of the
+ * info in "struct btf_type").
+ */
+struct btf_enum {
+  __u32 name_off;
+  __s32 val;
+};
+
+/* BTF_KIND_ARRAY is followed by one "struct btf_array" */
+struct btf_array {
+  __u32 type;
+  __u32 index_type;
+  __u32 nelems;
+};
+
+/* BTF_KIND_STRUCT and BTF_KIND_UNION are followed
+ * by multiple "struct btf_member". The exact number
+ * of btf_member is stored in the vlen (of the info in
+ * "struct btf_type").
+ */
+struct btf_member {
+  __u32 name_off;
+  __u32 type;
+  __u32 offset; /* offset in bits */
+};
+
+/* .BTF.ext section contains func_info and line_info.
+ */
+struct btf_ext_header {
+  __u16 magic;
+  __u8 version;
+  __u8 flags;
+  __u32 hdr_len;
+
+  __u32 func_info_off;
+  __u32 func_info_len;
+  __u32 line_info_off;
+  __u32 line_info_len;
+};
+
+struct bpf_func_info {
+  __u32 insn_offset;
+  __u32 type_id;
+};
+
+struct btf_sec_func_info {
+  __u32 sec_name_off;
+  __u32 num_func_info;
+};
+
+struct bpf_line_info {
+  __u32 insn_offset;
+  __u32 file_name_off;
+  __u32 line_off;
+  __u32 line_col; /* line num: line_col >> 10, col num: line_col & 0x3ff */
+};
+
+struct btf_sec_line_info {
+  __u32 sec_name_off;
+  __u32 num_line_info;
+};
+
+namespace llvm {
+
+const char *const btf_kind_str[NR_BTF_KINDS] = {
+  [BTF_KIND_UNKN] = "UNKNOWN",
+  [BTF_KIND_INT] = "INT",
+  [BTF_KIND_PTR] = "PTR",
+  [BTF_KIND_ARRAY] = "ARRAY",
+  [BTF_KIND_STRUCT] = "STRUCT",
+  [BTF_KIND_UNION] = "UNION",
+  [BTF_KIND_ENUM] = "ENUM",
+  [BTF_KIND_FWD] = "FWD",
+  [BTF_KIND_TYPEDEF] = "TYPEDEF",
+  [BTF_KIND_VOLATILE] = "VOLATILE",
+  [BTF_KIND_CONST] = "CONST",
+  [BTF_KIND_RESTRICT] = "RESTRICT",
+  [BTF_KIND_FUNC] = "FUNC",
+  [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO",
+};
+
+#include "llvm/ADT/SmallVector.h"
+#include <map>
+
+class MCBTFContext;
+class MCObjectStreamer;
+
+// This is base class of all BTF KIND. It is also used directly
+// by the reference kinds:
+// BTF_KIND_CONST, BTF_KIND_PTR, BTF_KIND_VOLATILE,
+// BTF_KIND_TYPEDEF, BTF_KIND_RESTRICT, and BTF_KIND_FWD
+class BTFTypeEntry {
+protected:
+  size_t Id; /* type index in the BTF list, started from 1 */
+  struct btf_type BTFType;
+
+public:
+  BTFTypeEntry(size_t id, struct btf_type &type) :
+    Id(id), BTFType(type) {}
+  unsigned char getKind() { return BTF_INFO_KIND(BTFType.info); }
+  void setId(size_t Id) { this->Id = Id; }
+  size_t getId() { return Id; }
+  void setNameOff(unsigned NameOff) { BTFType.name_off = NameOff; }
+
+  unsigned getTypeIndex() { return BTFType.type; }
+  unsigned getNameOff() { return BTFType.name_off; }
+  virtual size_t getSize() { return sizeof(struct btf_type); }
+  virtual void print(raw_ostream &s, MCBTFContext& BTFContext);
+  virtual void emitData(MCObjectStreamer *MCOS);
+};
+
+// BTF_KIND_INT
+class BTFTypeEntryInt : public BTFTypeEntry {
+  unsigned IntVal; // encoding, offset, bits
+
+public:
+  BTFTypeEntryInt(size_t id, struct btf_type &type, unsigned intval) :
+    BTFTypeEntry(id, type), IntVal(intval) {}
+  size_t getSize() { return BTFTypeEntry::getSize() + sizeof(unsigned); }
+  void print(raw_ostream &s, MCBTFContext& BTFContext);
+  void emitData(MCObjectStreamer *MCOS);
+};
+
+// BTF_KIND_ENUM
+class BTFTypeEntryEnum : public BTFTypeEntry {
+  std::vector<struct btf_enum> EnumValues;
+
+public:
+  BTFTypeEntryEnum(size_t id, struct btf_type &type,
+                   std::vector<struct btf_enum> &values) :
+    BTFTypeEntry(id, type), EnumValues(values) {}
+  size_t getSize() {
+    return BTFTypeEntry::getSize() +
+      BTF_INFO_VLEN(BTFType.info) * sizeof(struct btf_enum);
+  }
+  void print(raw_ostream &s, MCBTFContext& BTFContext);
+  void emitData(MCObjectStreamer *MCOS);
+};
+
+// BTF_KIND_ARRAY
+class BTFTypeEntryArray : public BTFTypeEntry {
+  struct btf_array ArrayInfo;
+
+public:
+  BTFTypeEntryArray(size_t id, struct btf_type &type,
+                    struct btf_array &arrayinfo) :
+    BTFTypeEntry(id, type), ArrayInfo(arrayinfo) {}
+  size_t getSize() {
+    return BTFTypeEntry::getSize() + sizeof(struct btf_array);
+  }
+  void print(raw_ostream &s, MCBTFContext& BTFContext);
+  void emitData(MCObjectStreamer *MCOS);
+};
+
+// BTF_KIND_STRUCT and BTF_KIND_UNION
+class BTFTypeEntryStruct : public BTFTypeEntry {
+  std::vector<struct btf_member> Members;
+
+public:
+  BTFTypeEntryStruct(size_t id, struct btf_type &type,
+                     std::vector<struct btf_member> &members) :
+    BTFTypeEntry(id, type), Members(members) {}
+  size_t getSize() {
+    return BTFTypeEntry::getSize() +
+      BTF_INFO_VLEN(BTFType.info) * sizeof(struct btf_member);
+  }
+  void print(raw_ostream &s, MCBTFContext& BTFContext);
+  void emitData(MCObjectStreamer *MCOS);
+};
+
+// BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
+class BTFTypeEntryFunc : public BTFTypeEntry {
+  std::vector<unsigned> Parameters;
+
+public:
+  BTFTypeEntryFunc(size_t id, struct btf_type &type,
+                   std::vector<unsigned> &params) :
+    BTFTypeEntry(id, type), Parameters(params) {}
+  size_t getSize() {
+    return BTFTypeEntry::getSize() +
+      BTF_INFO_VLEN(BTFType.info) * sizeof(unsigned);
+  }
+  void print(raw_ostream &s, MCBTFContext& BTFContext);
+  void emitData(MCObjectStreamer *MCOS);
+};
+
+class BTFStringTable {
+  size_t Size; // total size in bytes
+  std::map<size_t, unsigned> OffsetToIdMap;
+  std::vector<std::string> Table;
+
+public:
+  BTFStringTable() : Size(0) {}
+  size_t getSize() { return Size; }
+  std::vector<std::string> &getTable() { return Table; }
+  size_t addString(std::string S) {
+    // check whether the string already exists
+    for (auto &OffsetM : OffsetToIdMap) {
+      if (Table[OffsetM.second] == S)
+        return OffsetM.first;
+    }
+    // not find, add to the string table
+    size_t Offset = Size;
+    OffsetToIdMap[Offset] = Table.size();
+    Table.push_back(S);
+    Size += S.size() + 1;
+    return Offset;
+  }
+  std::string &getStringAtOffset(size_t Offset) {
+    return Table[OffsetToIdMap[Offset]];
+  }
+  void showTable(raw_ostream &OS) {
+    for (auto OffsetM : OffsetToIdMap)
+      OS << OffsetM.first << " : " << Table[OffsetM.second]
+         << "\n";
+  }
+};
+
+struct BTFFuncInfo {
+  const MCSymbol *Label;
+  unsigned int TypeId;
+};
+
+struct BTFLineInfo {
+  MCSymbol *Label;
+  unsigned int FileNameOff;
+  unsigned int LineOff;
+  unsigned int LineNum;
+  unsigned int ColumnNum;
+};
+
+class MCBTFContext {
+  std::vector<std::unique_ptr<BTFTypeEntry>> TypeEntries;
+  BTFStringTable StringTable;
+  std::map<unsigned, std::vector<BTFFuncInfo>> FuncInfoTable;
+  std::map<unsigned, std::vector<BTFLineInfo>> LineInfoTable;
+
+  friend class BTFTypeEntry;
+  friend class BTFTypeEntryInt;
+  friend class BTFTypeEntryEnum;
+  friend class BTFTypeEntryArray;
+  friend class BTFTypeEntryStruct;
+  friend class BTFTypeEntryFunc;
+
+public:
+  void dump(raw_ostream& OS);
+  void emitAll(MCObjectStreamer *MCOS);
+  void emitCommonHeader(MCObjectStreamer *MCOS);
+  void emitBTFSection(MCObjectStreamer *MCOS);
+  void emitBTFExtSection(MCObjectStreamer *MCOS);
+
+  size_t addString(std::string S) {
+    return StringTable.addString(S);
+  }
+  void addTypeEntry(std::unique_ptr<BTFTypeEntry> Entry);
+  void addFuncInfo(unsigned SecNameOff, BTFFuncInfo Info) {
+    FuncInfoTable[SecNameOff].push_back(Info);
+  }
+  void addLineInfo(unsigned SecNameOff, BTFLineInfo Info) {
+    LineInfoTable[SecNameOff].push_back(Info);
+  }
+};
+
+}
+#endif
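
`BTFStringTable::addString` in the header above hands out byte offsets by accumulating `S.size() + 1` for each new string and returns the existing offset for a duplicate. The sketch below reproduces that offset arithmetic in a standalone class so it can be checked against the String Table dump in the commit message. `SketchStringTable` is a hypothetical stand-in, not the patch's class, and it swaps the header's linear scan for a `std::map` keyed on the string.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for BTFStringTable; not part of the patch.
class SketchStringTable {
  size_t Size = 0;                       // total bytes, NUL terminators included
  std::map<std::string, size_t> Offsets; // string -> byte offset in the table

public:
  // Returns the byte offset of S, adding it only if it is not present yet.
  size_t add(const std::string &S) {
    auto It = Offsets.find(S);
    if (It != Offsets.end())
      return It->second; // duplicate: reuse the earlier offset
    size_t Offset = Size;
    Offsets.emplace(S, Offset);
    Size += S.size() + 1; // +1 for the terminating NUL byte
    return Offset;
  }
  size_t size() const { return Size; }
};

// Replays the first entries of the String Table dump and checks the offsets.
inline void checkOffsetsAgainstDump() {
  SketchStringTable T;
  assert(T.add("") == 0);
  assert(T.add("test") == 1);
  assert(T.add(".text") == 6);
  assert(T.add("int") == 12);
  assert(T.add("A") == 16);
  assert(T.add("a") == 18);
  assert(T.add("b") == 20);
  assert(T.add("char") == 22);
  assert(T.add("test.c") == 27);
  assert(T.add("int test(struct A *t) {") == 34);
}
```

The empty string at offset 0 is what lets `name_off=0` stand for an anonymous type, as in the PTR entry of the dump. The map lookup is only a convenience for the sketch; the offsets it produces are the same as with a linear scan.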

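
The comment on `bpf_line_info::line_col` in MCBTFContext.h states that the line number is `line_col >> 10` and the column number is `line_col & 0x3ff`. A minimal sketch of the matching pack and unpack helpers follows. The function names are illustrative and do not come from the patch, and masking the column to 10 bits is an assumption about how an out-of-range column would be handled.

```cpp
#include <cassert>
#include <cstdint>

// Packs a line and column into one 32-bit word using the layout described in
// the bpf_line_info comment: upper 22 bits line, lower 10 bits column.
// The column is masked here so that it cannot spill into the line bits
// (an assumption of this sketch, not behavior confirmed by the patch).
constexpr uint32_t packLineCol(uint32_t Line, uint32_t Col) {
  return (Line << 10) | (Col & 0x3ff);
}

// Inverse operations, exactly as written in the header comment.
constexpr uint32_t unpackLine(uint32_t LineCol) { return LineCol >> 10; }
constexpr uint32_t unpackCol(uint32_t LineCol) { return LineCol & 0x3ff; }

// The two LineInfo rows from the commit message dump.
static_assert(packLineCol(6, 0) == 6144, "line 6, column 0");
static_assert(packLineCol(7, 3) == 7171, "line 7, column 3");

// Round-trips one (line, column) pair through the packed form.
inline bool roundTrips(uint32_t Line, uint32_t Col) {
  uint32_t Packed = packLineCol(Line, Col);
  return unpackLine(Packed) == Line && unpackCol(Packed) == Col;
}
```

With this layout a column above 1023 or a line at or above 2^22 cannot be represented, so `roundTrips` returns false for such inputs.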
llvm/include/llvm/MC/MCContext.h

Lines changed: 7 additions & 0 deletions
@@ -56,6 +56,7 @@ namespace llvm {
   class MCSymbolWasm;
   class SMLoc;
   class SourceMgr;
+  class MCBTFContext;
 
   /// Context object for machine code objects. This class owns all of the
   /// sections that it creates.
@@ -278,6 +279,9 @@ namespace llvm {
     /// Map of currently defined macros.
     StringMap<MCAsmMacro> MacroMap;
 
+    /// for BTF debug information
+    std::unique_ptr<MCBTFContext> BTFCtx;
+
   public:
     explicit MCContext(const MCAsmInfo *MAI, const MCRegisterInfo *MRI,
                        const MCObjectFileInfo *MOFI,
@@ -286,6 +290,9 @@ namespace llvm {
     MCContext &operator=(const MCContext &) = delete;
     ~MCContext();
 
+    void setBTFContext(std::unique_ptr<MCBTFContext> Ctx);
+    std::unique_ptr<MCBTFContext> &getBTFContext() { return BTFCtx; }
+
     const SourceMgr *getSourceManager() const { return SrcMgr; }
 
     void setInlineSourceManager(SourceMgr *SM) { InlineSrcMgr = SM; }
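
The MCContext.h change stores the BTF context in a `std::unique_ptr` member and exposes it by reference, so a caller can test whether BTF generation is active and use the context without taking ownership. The sketch below shows that ownership pattern with hypothetical stand-in types (`SketchBTFContext`, `SketchContext`, `recordString`). The body of `setBTFContext` is not part of this diff, so the `std::move` shown here is an assumption about how it would be written.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MCBTFContext; only holds strings for the demo.
struct SketchBTFContext {
  std::vector<std::string> Strings;
};

// Hypothetical stand-in for MCContext with the two accessors from the diff.
class SketchContext {
  std::unique_ptr<SketchBTFContext> BTFCtx; // null until BTF is enabled

public:
  // Assumed implementation: take ownership of the passed-in context.
  void setBTFContext(std::unique_ptr<SketchBTFContext> Ctx) {
    BTFCtx = std::move(Ctx);
  }
  // Same signature shape as in the diff: a reference to the owning pointer.
  std::unique_ptr<SketchBTFContext> &getBTFContext() { return BTFCtx; }
};

// Records a string only when a BTF context has been installed.
inline bool recordString(SketchContext &Ctx, const std::string &S) {
  std::unique_ptr<SketchBTFContext> &BTF = Ctx.getBTFContext();
  if (!BTF)
    return false; // BTF generation is not active
  BTF->Strings.push_back(S);
  return true;
}
```

Returning a reference to the `unique_ptr` keeps ownership inside the context object while still letting callers check for null in one step.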
