Commit 257367c

Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-01-06

We've added 41 non-merge commits during the last 2 day(s) which contain
a total of 36 files changed, 1214 insertions(+), 368 deletions(-).

The main changes are:

1) Various fixes in the verifier, from Kris and Daniel.

2) Fixes in sockmap, from John.

3) bpf_getsockopt fix, from Kuniyuki.

4) INET_POST_BIND fix, from Menglong.

5) arm64 JIT fix for bpf pseudo funcs, from Hou.

6) BPF ISA doc improvements, from Christoph.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
  bpf: selftests: Add bind retry for post_bind{4, 6}
  bpf: selftests: Use C99 initializers in test_sock.c
  net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()
  bpf/selftests: Test bpf_d_path on rdonly_mem.
  libbpf: Add documentation for bpf_map batch operations
  selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
  xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames
  xdp: Move conversion to xdp_frame out of map functions
  page_pool: Store the XDP mem id
  page_pool: Add callback to init pages when they are allocated
  xdp: Allow registering memory model without rxq reference
  samples/bpf: xdpsock: Add timestamp for Tx-only operation
  samples/bpf: xdpsock: Add time-out for cleaning Tx
  samples/bpf: xdpsock: Add sched policy and priority support
  samples/bpf: xdpsock: Add cyclic TX operation capability
  samples/bpf: xdpsock: Add clockid selection support
  samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation
  samples/bpf: xdpsock: Add VLAN support for Tx-only operation
  libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
  libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
  ...
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 parents 710ad98 + eff14fc commit 257367c

36 files changed, 1214 insertions(+), 368 deletions(-)

Documentation/bpf/instruction-set.rst

Lines changed: 91 additions & 65 deletions
@@ -19,23 +19,37 @@ The eBPF calling convention is defined as:
 R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
 necessary across calls.
 
+Instruction encoding
+====================
+
+eBPF uses 64-bit instructions with the following encoding:
+
+  =============  =======  ===============  ====================  ============
+  32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
+  =============  =======  ===============  ====================  ============
+  immediate      offset   source register  destination register  opcode
+  =============  =======  ===============  ====================  ============
+
+Note that most instructions do not use all of the fields.
+Unused fields shall be cleared to zero.
+
 Instruction classes
-===================
+-------------------
 
 The three LSB bits of the 'opcode' field store the instruction class:
 
-  =========  =====
-  class      value
-  =========  =====
-  BPF_LD     0x00
-  BPF_LDX    0x01
-  BPF_ST     0x02
-  BPF_STX    0x03
-  BPF_ALU    0x04
-  BPF_JMP    0x05
-  BPF_JMP32  0x06
-  BPF_ALU64  0x07
-  =========  =====
+  =========  =====  ===============================
+  class      value  description
+  =========  =====  ===============================
+  BPF_LD     0x00   non-standard load operations
+  BPF_LDX    0x01   load into register operations
+  BPF_ST     0x02   store from immediate operations
+  BPF_STX    0x03   store from register operations
+  BPF_ALU    0x04   32-bit arithmetic operations
+  BPF_JMP    0x05   64-bit jump operations
+  BPF_JMP32  0x06   32-bit jump operations
+  BPF_ALU64  0x07   64-bit arithmetic operations
+  =========  =====  ===============================
 
 Arithmetic and jump instructions
 ================================
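
The encoding and class tables added in this hunk can be read as bit positions inside one 64-bit word. The following is a minimal sketch, not kernel code: the struct `insn_fields` and the helpers `decode_insn()` and `insn_class()` are hypothetical names, and treating the offset and immediate as signed values is an assumption that the hunk itself does not state.

```c
#include <stdint.h>

/* Hypothetical container for the five fields named in the encoding table. */
struct insn_fields {
	uint8_t opcode;   /* 8 bits (LSB) */
	uint8_t dst_reg;  /* 4 bits, destination register */
	uint8_t src_reg;  /* 4 bits, source register */
	int16_t off;      /* 16 bits, assumed signed */
	int32_t imm;      /* 32 bits (MSB), assumed signed */
};

/* Split a 64-bit instruction word into its fields, LSB first. */
struct insn_fields decode_insn(uint64_t raw)
{
	struct insn_fields f;

	f.opcode  = (uint8_t)(raw & 0xff);
	f.dst_reg = (uint8_t)((raw >> 8) & 0xf);
	f.src_reg = (uint8_t)((raw >> 12) & 0xf);
	f.off     = (int16_t)(uint16_t)((raw >> 16) & 0xffff);
	f.imm     = (int32_t)(uint32_t)(raw >> 32);
	return f;
}

/* The three LSB bits of the opcode store the instruction class. */
unsigned int insn_class(uint8_t opcode)
{
	return opcode & 0x07;
}
```

For example, an opcode of 0x0f yields class 0x07, which the class table lists as BPF_ALU64.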
@@ -60,66 +74,78 @@ The 4th bit encodes the source operand:
 
 The four MSB bits store the operation code.
 
-For class BPF_ALU or BPF_ALU64:
 
-  ========  =====  =========================
+Arithmetic instructions
+-----------------------
+
+BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
+otherwise identical operations.
+The code field encodes the operation as below:
+
+  ========  =====  ==========================
   code      value  description
-  ========  =====  =========================
-  BPF_ADD   0x00
-  BPF_SUB   0x10
-  BPF_MUL   0x20
-  BPF_DIV   0x30
-  BPF_OR    0x40
-  BPF_AND   0x50
-  BPF_LSH   0x60
-  BPF_RSH   0x70
-  BPF_NEG   0x80
-  BPF_MOD   0x90
-  BPF_XOR   0xa0
-  BPF_MOV   0xb0   mov reg to reg
+  ========  =====  ==========================
+  BPF_ADD   0x00   dst += src
+  BPF_SUB   0x10   dst -= src
+  BPF_MUL   0x20   dst \*= src
+  BPF_DIV   0x30   dst /= src
+  BPF_OR    0x40   dst \|= src
+  BPF_AND   0x50   dst &= src
+  BPF_LSH   0x60   dst <<= src
+  BPF_RSH   0x70   dst >>= src
+  BPF_NEG   0x80   dst = ~src
+  BPF_MOD   0x90   dst %= src
+  BPF_XOR   0xa0   dst ^= src
+  BPF_MOV   0xb0   dst = src
   BPF_ARSH  0xc0   sign extending shift right
   BPF_END   0xd0   endianness conversion
-  ========  =====  =========================
+  ========  =====  ==========================
 
-For class BPF_JMP or BPF_JMP32:
+BPF_ADD | BPF_X | BPF_ALU means::
 
-  ========  =====  =========================
-  code      value  description
-  ========  =====  =========================
-  BPF_JA    0x00   BPF_JMP only
-  BPF_JEQ   0x10
-  BPF_JGT   0x20
-  BPF_JGE   0x30
-  BPF_JSET  0x40
-  BPF_JNE   0x50   jump '!='
-  BPF_JSGT  0x60   signed '>'
-  BPF_JSGE  0x70   signed '>='
-  BPF_CALL  0x80   function call
-  BPF_EXIT  0x90   function return
-  BPF_JLT   0xa0   unsigned '<'
-  BPF_JLE   0xb0   unsigned '<='
-  BPF_JSLT  0xc0   signed '<'
-  BPF_JSLE  0xd0   signed '<='
-  ========  =====  =========================
+  dst_reg = (u32) dst_reg + (u32) src_reg;
 
-So BPF_ADD | BPF_X | BPF_ALU means::
+BPF_ADD | BPF_X | BPF_ALU64 means::
 
-  dst_reg = (u32) dst_reg + (u32) src_reg;
+  dst_reg = dst_reg + src_reg
 
-Similarly, BPF_XOR | BPF_K | BPF_ALU means::
+BPF_XOR | BPF_K | BPF_ALU means::
 
   src_reg = (u32) src_reg ^ (u32) imm32
 
-eBPF is using BPF_MOV | BPF_X | BPF_ALU to represent A = B moves. BPF_ALU64
-is used to mean exactly the same operations as BPF_ALU, but with 64-bit wide
-operands instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.::
+BPF_XOR | BPF_K | BPF_ALU64 means::
 
-  dst_reg = dst_reg + src_reg
+  src_reg = src_reg ^ imm32
+
+
+Jump instructions
+-----------------
 
-BPF_JMP | BPF_EXIT means function exit only. The eBPF program needs to store
-the return value into register R0 before doing a BPF_EXIT. Class 6 is used as
-BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
-operands for the comparisons instead.
+BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
+otherwise identical operations.
+The code field encodes the operation as below:
+
+  ========  =====  =========================  ============
+  code      value  description                notes
+  ========  =====  =========================  ============
+  BPF_JA    0x00   PC += off                  BPF_JMP only
+  BPF_JEQ   0x10   PC += off if dst == src
+  BPF_JGT   0x20   PC += off if dst > src     unsigned
+  BPF_JGE   0x30   PC += off if dst >= src    unsigned
+  BPF_JSET  0x40   PC += off if dst & src
+  BPF_JNE   0x50   PC += off if dst != src
+  BPF_JSGT  0x60   PC += off if dst > src     signed
+  BPF_JSGE  0x70   PC += off if dst >= src    signed
+  BPF_CALL  0x80   function call
+  BPF_EXIT  0x90   function / program return  BPF_JMP only
+  BPF_JLT   0xa0   PC += off if dst < src     unsigned
+  BPF_JLE   0xb0   PC += off if dst <= src    unsigned
+  BPF_JSLT  0xc0   PC += off if dst < src     signed
+  BPF_JSLE  0xd0   PC += off if dst <= src    signed
+  ========  =====  =========================  ============
+
+The eBPF program needs to store the return value into register R0 before doing a
+BPF_EXIT.
 
 
 Load and store instructions
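
The operand-width and signedness distinctions in the rewritten tables can be sketched in plain C. The helper names below are hypothetical. Zero-extending the 32-bit result into the 64-bit register is not stated in this hunk; it is included as an assumption about how 32-bit ALU results are stored.

```c
#include <stdbool.h>
#include <stdint.h>

/* BPF_ADD | BPF_X | BPF_ALU: both operands are truncated to 32 bits.
 * Assumption: the 32-bit result is zero-extended into the 64-bit register. */
uint64_t alu32_add(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst + (uint32_t)src);
}

/* BPF_ADD | BPF_X | BPF_ALU64: the same operation on full 64-bit operands. */
uint64_t alu64_add(uint64_t dst, uint64_t src)
{
	return dst + src;
}

/* BPF_JGT: the table marks this comparison as unsigned. */
bool jgt_taken(uint64_t dst, uint64_t src)
{
	return dst > src;
}

/* BPF_JSGT: the same comparison, with both operands read as signed. */
bool jsgt_taken(uint64_t dst, uint64_t src)
{
	return (int64_t)dst > (int64_t)src;
}
```

With dst = 0xffffffff and src = 1, the 32-bit add wraps to 0 while the 64-bit add carries into bit 32. A register holding all ones compares greater than 1 under BPF_JGT but not under BPF_JSGT, where it reads as -1.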
@@ -147,15 +173,15 @@ The size modifier is one of:
 
 The mode modifier is one of:
 
-  =============  =====  =====================
+  =============  =====  ====================================
   mode modifier  value  description
-  =============  =====  =====================
+  =============  =====  ====================================
   BPF_IMM        0x00   used for 64-bit mov
-  BPF_ABS        0x20
-  BPF_IND        0x40
-  BPF_MEM        0x60
+  BPF_ABS        0x20   legacy BPF packet access
+  BPF_IND        0x40   legacy BPF packet access
+  BPF_MEM        0x60   all normal load and store operations
   BPF_ATOMIC     0xc0   atomic operations
-  =============  =====  =====================
+  =============  =====  ====================================
 
 BPF_MEM | <size> | BPF_STX means::
 
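
To show how the mode modifier table is used, here is a small sketch that extracts the mode from a load or store opcode. The values come from the table in this hunk; the mask 0xe0 (the three MSB bits of the opcode) is not shown in the excerpt and is an assumption based on the values all being multiples of 0x20. The names `ldst_mode`, `ldst_mode_of()` and `ldst_mode_name()` are hypothetical.

```c
#include <stdint.h>

/* Assumed mask: the mode modifier occupies the three MSB bits of the opcode. */
#define LDST_MODE_MASK 0xe0

/* Mode modifier values as listed in the table. */
enum ldst_mode {
	MODE_IMM    = 0x00, /* used for 64-bit mov */
	MODE_ABS    = 0x20, /* legacy BPF packet access */
	MODE_IND    = 0x40, /* legacy BPF packet access */
	MODE_MEM    = 0x60, /* all normal load and store operations */
	MODE_ATOMIC = 0xc0, /* atomic operations */
};

/* Return the mode modifier bits of a load/store opcode. */
unsigned int ldst_mode_of(uint8_t opcode)
{
	return opcode & LDST_MODE_MASK;
}

/* Map the mode modifier to the name used in the table. */
const char *ldst_mode_name(uint8_t opcode)
{
	switch (ldst_mode_of(opcode)) {
	case MODE_IMM:    return "BPF_IMM";
	case MODE_ABS:    return "BPF_ABS";
	case MODE_IND:    return "BPF_IND";
	case MODE_MEM:    return "BPF_MEM";
	case MODE_ATOMIC: return "BPF_ATOMIC";
	default:          return "unknown";
	}
}
```

An opcode of 0x7b, for instance, has mode bits 0x60 and class bits 0x03, which the tables list as BPF_MEM and BPF_STX.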

arch/arm64/net/bpf_jit_comp.c

Lines changed: 4 additions & 1 deletion
@@ -792,7 +792,10 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 		u64 imm64;
 
 		imm64 = (u64)insn1.imm << 32 | (u32)imm;
-		emit_a64_mov_i64(dst, imm64, ctx);
+		if (bpf_pseudo_func(insn))
+			emit_addr_mov_i64(dst, imm64, ctx);
+		else
+			emit_a64_mov_i64(dst, imm64, ctx);
 
 		return 1;
 	}
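
The context line `imm64 = (u64)insn1.imm << 32 | (u32)imm;` joins two signed 32-bit immediates into one 64-bit value. The sketch below isolates that expression in a standalone helper; `compose_imm64()` is a hypothetical name, and the choice of which emit function receives the result is left to the JIT as in the hunk above.

```c
#include <stdint.h>

/* lo is the immediate of the first instruction, hi the immediate of the
 * second one (insn1.imm in the hunk). Casting lo to uint32_t before the OR
 * keeps a negative low half from sign-extending into the upper 32 bits. */
uint64_t compose_imm64(int32_t lo, int32_t hi)
{
	return (uint64_t)hi << 32 | (uint32_t)lo;
}
```

With lo = -1 and hi = 0 the result is 0x00000000ffffffff; without the `(uint32_t)` cast the low half would be promoted with sign extension and set every bit.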

include/linux/bpf.h

Lines changed: 10 additions & 10 deletions
@@ -1669,17 +1669,17 @@ void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
 struct btf *bpf_get_btf_vmlinux(void);
 
 /* Map specifics */
-struct xdp_buff;
+struct xdp_frame;
 struct sk_buff;
 struct bpf_dtab_netdev;
 struct bpf_cpu_map_entry;
 
 void __dev_flush(void);
-int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
+int dev_xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx);
-int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
+int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx);
-int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
+int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx,
 			  struct bpf_map *map, bool exclude_ingress);
 int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
 			     struct bpf_prog *xdp_prog);
@@ -1688,7 +1688,7 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 			   bool exclude_ingress);
 
 void __cpu_map_flush(void);
-int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
+int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx);
 int cpu_map_generic_redirect(struct bpf_cpu_map_entry *rcpu,
 			     struct sk_buff *skb);
@@ -1866,26 +1866,26 @@ static inline void __dev_flush(void)
 {
 }
 
-struct xdp_buff;
+struct xdp_frame;
 struct bpf_dtab_netdev;
 struct bpf_cpu_map_entry;
 
 static inline
-int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
+int dev_xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx)
 {
 	return 0;
 }
 
 static inline
-int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
+int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx)
 {
 	return 0;
 }
 
 static inline
-int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
+int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx,
 			  struct bpf_map *map, bool exclude_ingress)
 {
 	return 0;
@@ -1913,7 +1913,7 @@ static inline void __cpu_map_flush(void)
 }
 
 static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu,
-				  struct xdp_buff *xdp,
+				  struct xdp_frame *xdpf,
 				  struct net_device *dev_rx)
 {
 	return 0;

include/linux/filter.h

Lines changed: 4 additions & 0 deletions
@@ -1019,6 +1019,10 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 int xdp_do_redirect(struct net_device *dev,
 		    struct xdp_buff *xdp,
 		    struct bpf_prog *prog);
+int xdp_do_redirect_frame(struct net_device *dev,
+			  struct xdp_buff *xdp,
+			  struct xdp_frame *xdpf,
+			  struct bpf_prog *prog);
 void xdp_do_flush(void);
 
 /* The xdp_do_flush_map() helper has been renamed to drop the _map suffix, as

include/net/page_pool.h

Lines changed: 9 additions & 2 deletions
@@ -80,6 +80,8 @@ struct page_pool_params {
 	enum dma_data_direction dma_dir; /* DMA mapping direction */
 	unsigned int	max_len; /* max DMA sync memory size */
 	unsigned int	offset;  /* DMA addr offset */
+	void (*init_callback)(struct page *page, void *arg);
+	void *init_arg;
 };
 
 struct page_pool {
@@ -94,6 +96,7 @@ struct page_pool {
 	unsigned int frag_offset;
 	struct page *frag_page;
 	long frag_users;
+	u32 xdp_mem_id;
 
 	/*
 	 * Data structure for allocation side
@@ -168,9 +171,12 @@ bool page_pool_return_skb_page(struct page *page);
 
 struct page_pool *page_pool_create(const struct page_pool_params *params);
 
+struct xdp_mem_info;
+
 #ifdef CONFIG_PAGE_POOL
 void page_pool_destroy(struct page_pool *pool);
-void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *));
+void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
+			   struct xdp_mem_info *mem);
 void page_pool_release_page(struct page_pool *pool, struct page *page);
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 			     int count);
@@ -180,7 +186,8 @@ static inline void page_pool_destroy(struct page_pool *pool)
 }
 
 static inline void page_pool_use_xdp_mem(struct page_pool *pool,
-					 void (*disconnect)(void *),
+					 void (*disconnect)(void *),
+					 struct xdp_mem_info *mem)
 {
 }
 static inline void page_pool_release_page(struct page_pool *pool,
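
The new `init_callback` and `init_arg` members follow the usual C pattern of a function pointer paired with an opaque argument. The sketch below imitates that pattern in userspace with mock types. It is not the page_pool implementation: `mock_page`, `mock_pool_params`, `mock_pool_init_page()` and `stamp_page()` are hypothetical, and the point at which the kernel invokes the callback is not shown in this excerpt, so the NULL check and call order here are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page. */
struct mock_page {
	uint32_t tag;
};

/* Mirrors only the two members added to struct page_pool_params. */
struct mock_pool_params {
	void (*init_callback)(struct mock_page *page, void *arg);
	void *init_arg;
};

/* Assumed allocation-side hook: reset the page, then let the owner of the
 * pool initialize it if a callback was registered. */
void mock_pool_init_page(const struct mock_pool_params *params,
			 struct mock_page *page)
{
	page->tag = 0;
	if (params->init_callback)
		params->init_callback(page, params->init_arg);
}

/* Example callback: stamp each page with an id passed through init_arg. */
void stamp_page(struct mock_page *page, void *arg)
{
	page->tag = *(const uint32_t *)arg;
}
```

Passing the argument as `void *` lets each pool owner carry its own state without the pool needing to know its type.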

include/net/sock.h

Lines changed: 1 addition & 0 deletions
@@ -1209,6 +1209,7 @@ struct proto {
 	void			(*unhash)(struct sock *sk);
 	void			(*rehash)(struct sock *sk);
 	int			(*get_port)(struct sock *sk, unsigned short snum);
+	void			(*put_port)(struct sock *sk);
 #ifdef CONFIG_BPF_SYSCALL
 	int			(*psock_update_sk_prot)(struct sock *sk,
 							struct sk_psock *psock,

include/net/xdp.h

Lines changed: 3 additions & 0 deletions
@@ -260,6 +260,9 @@ bool xdp_rxq_info_is_reg(struct xdp_rxq_info *xdp_rxq);
 int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 			       enum xdp_mem_type type, void *allocator);
 void xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq);
+int xdp_reg_mem_model(struct xdp_mem_info *mem,
+		      enum xdp_mem_type type, void *allocator);
+void xdp_unreg_mem_model(struct xdp_mem_info *mem);
 
 /* Drivers not supporting XDP metadata can use this helper, which
  * rejects any room expansion for metadata as a result.

kernel/bpf/cpumap.c

Lines changed: 1 addition & 7 deletions
@@ -746,15 +746,9 @@ static void bq_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf)
 		list_add(&bq->flush_node, flush_list);
 }
 
-int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
+int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf,
 		    struct net_device *dev_rx)
 {
-	struct xdp_frame *xdpf;
-
-	xdpf = xdp_convert_buff_to_frame(xdp);
-	if (unlikely(!xdpf))
-		return -EOVERFLOW;
-
 	/* Info needed when constructing SKB on remote CPU */
 	xdpf->dev_rx = dev_rx;
 
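
This hunk removes the buffer-to-frame conversion from `cpu_map_enqueue()`, so the function now expects a frame that was converted earlier. The sketch below models that split with mock types in userspace. Everything in it is hypothetical: `mock_convert()` stands in for `xdp_convert_buff_to_frame()`, and having the caller return -EOVERFLOW on a failed conversion is an assumption, since the caller side is not part of this excerpt.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct xdp_buff and struct xdp_frame. */
struct mock_buff {
	int len;
};

struct mock_frame {
	int len;
	int dev_rx;
};

/* Stand-in for the conversion step; returns NULL when it cannot convert. */
struct mock_frame *mock_convert(const struct mock_buff *buff,
				struct mock_frame *storage)
{
	if (buff->len <= 0)
		return NULL;
	storage->len = buff->len;
	storage->dev_rx = -1;
	return storage;
}

/* After the change: enqueue takes an already converted frame. */
int mock_enqueue(struct mock_frame *frame, int dev_rx)
{
	frame->dev_rx = dev_rx; /* info needed by the consumer */
	return 0;
}

/* Assumed caller: convert once, handle failure, then enqueue. */
int mock_redirect(const struct mock_buff *buff, struct mock_frame *storage,
		  int dev_rx)
{
	struct mock_frame *frame = mock_convert(buff, storage);

	if (!frame)
		return -EOVERFLOW;
	return mock_enqueue(frame, dev_rx);
}
```

Converting in the caller means the same frame can be handed to any of the enqueue functions whose signatures changed in include/linux/bpf.h.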
