Commit ac7ac43

author
Alexei Starovoitov
committed
Merge branch 'New nf_conntrack kfuncs for insertion, changing timeout, status'
Kumar Kartikeya Dwivedi says:

====================
Introduce the following new kfuncs:
 - bpf_{xdp,skb}_ct_alloc
 - bpf_ct_insert_entry
 - bpf_ct_{set,change}_timeout
 - bpf_ct_{set,change}_status

The setting of timeout and status on allocated or inserted/looked up CT
is same as the ctnetlink interface, hence code is refactored and shared
with the kfuncs. It is ensured allocated CT cannot be passed to kfuncs
that expected inserted CT, and vice versa. Please see individual patches
for details.

Changelog:
----------
v6 -> v7:
v6: https://lore.kernel.org/bpf/[email protected]
 * Use .long to encode flags (Alexei)
 * Fix description of KF_RET_NULL in documentation (Toke)

v5 -> v6:
v5: https://lore.kernel.org/bpf/[email protected]
 * Introduce kfunc flags, rework verifier to work with them
 * Add documentation for kfuncs
 * Add comment explaining TRUSTED_ARGS kfunc flag (Alexei)
 * Fix missing offset check for trusted arguments (Alexei)
 * Change nf_conntrack test minimum delta value to 8

v4 -> v5:
v4: https://lore.kernel.org/bpf/[email protected]
 * Drop read-only PTR_TO_BTF_ID approach, use struct nf_conn___init (Alexei)
 * Drop acquire release pair code that is no longer required (Alexei)
 * Disable writes into nf_conn, use dedicated helpers (Florian, Alexei)
 * Refactor and share ctnetlink code for setting timeout and status
 * Do strict type matching on finding __ref suffix on argument to
   prevent passing nf_conn___init as nf_conn (offset = 0, match on walk)
 * Remove bpf_ct_opts parameter from bpf_ct_insert_entry
 * Update selftests for new additions, add more negative tests

v3 -> v4:
v3: https://lore.kernel.org/bpf/[email protected]
 * split bpf_xdp_ct_add in bpf_xdp_ct_alloc/bpf_skb_ct_alloc and
   bpf_ct_insert_entry
 * add verifier code to properly populate/configure ct entry
 * improve selftests

v2 -> v3:
v2: https://lore.kernel.org/bpf/[email protected]
 * add bpf_xdp_ct_add and bpf_ct_refresh_timeout kfunc helpers
 * remove conntrack dependency from selftests
 * add support for forcing kfunc args to be referenced and related selftests

v1 -> v2:
v1: https://lore.kernel.org/bpf/1327f8f5696ff2bc60400e8f3b79047914ccc837.1651595019.git.lorenzo@kernel.org
 * add bpf_ct_refresh_timeout kfunc selftest

Kumar Kartikeya Dwivedi (10):
  bpf: Introduce 8-byte BTF set
  tools/resolve_btfids: Add support for 8-byte BTF sets
  bpf: Switch to new kfunc flags infrastructure
  bpf: Add support for forcing kfunc args to be trusted
  bpf: Add documentation for kfuncs
  net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup
  net: netfilter: Add kfuncs to set and change CT timeout
  selftests/bpf: Add verifier tests for trusted kfunc args
  selftests/bpf: Add negative tests for new nf_conntrack kfuncs
  selftests/bpf: Fix test_verifier failed test in unprivileged mode
====================

Signed-off-by: Alexei Starovoitov <[email protected]>
2 parents 5cb62b7 + e3fa473 commit ac7ac43

23 files changed: +1139 -349 lines changed

Documentation/bpf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ that goes into great technical depth about the BPF Architecture.
    faq
    syscall_api
    helpers
+   kfuncs
    programs
    maps
    bpf_prog_run

Documentation/bpf/kfuncs.rst

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+=============================
+BPF Kernel Functions (kfuncs)
+=============================
+
+1. Introduction
+===============
+
+BPF Kernel Functions, more commonly known as kfuncs, are functions in the Linux
+kernel which are exposed for use by BPF programs. Unlike normal BPF helpers,
+kfuncs do not have a stable interface and can change from one kernel release to
+another. Hence, BPF programs need to be updated in response to changes in the
+kernel.
+
+2. Defining a kfunc
+===================
+
+There are two ways to expose a kernel function to BPF programs, either make an
+existing function in the kernel visible, or add a new wrapper for BPF. In both
+cases, care must be taken that a BPF program can only call such a function in a
+valid context. To enforce this, visibility of a kfunc can be per program type.
+
+If you are not creating a BPF wrapper for an existing kernel function, skip ahead
+to :ref:`BPF_kfunc_nodef`.
+
+2.1 Creating a wrapper kfunc
+----------------------------
+
+When defining a wrapper kfunc, the wrapper function should have extern linkage.
+This prevents the compiler from optimizing away dead code, as this wrapper kfunc
+is not invoked anywhere in the kernel itself. It is not necessary to provide a
+prototype in a header for the wrapper kfunc.
+
+An example is given below::
+
+        /* Disables missing prototype warnings */
+        __diag_push();
+        __diag_ignore_all("-Wmissing-prototypes",
+                          "Global kfuncs as their definitions will be in BTF");
+
+        struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
+        {
+                return find_get_task_by_vpid(nr);
+        }
+
+        __diag_pop();
+
+A wrapper kfunc is often needed when we need to annotate parameters of the
+kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
+registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.
+
+2.2 Annotating kfunc parameters
+-------------------------------
+
+Similar to BPF helpers, there is sometimes a need for additional context required
+by the verifier to make the usage of kernel functions safer and more useful.
+Hence, we can annotate a parameter by suffixing the name of the argument of the
+kfunc with a __tag, where tag may be one of the supported annotations.
+
+2.2.1 __sz Annotation
+---------------------
+
+This annotation is used to indicate a memory and size pair in the argument list.
+An example is given below::
+
+        void bpf_memzero(void *mem, int mem__sz)
+        {
+        ...
+        }
+
+Here, the verifier will treat the first argument as a PTR_TO_MEM, and the second
+argument as its size. By default, without __sz annotation, the size of the type
+of the pointer is used. Without __sz annotation, a kfunc cannot accept a void
+pointer.
+
+.. _BPF_kfunc_nodef:
+
+2.3 Using an existing kernel function
+-------------------------------------
+
+When an existing function in the kernel is fit for consumption by BPF programs,
+it can be directly registered with the BPF subsystem. However, care must still
+be taken to review the context in which it will be invoked by the BPF program
+and whether it is safe to do so.
+
+2.4 Annotating kfuncs
+---------------------
+
+In addition to kfuncs' arguments, the verifier may need more information about the
+type of kfunc(s) being registered with the BPF subsystem. To do so, we define
+flags on a set of kfuncs as follows::
+
+        BTF_SET8_START(bpf_task_set)
+        BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+        BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+        BTF_SET8_END(bpf_task_set)
+
+This set encodes the BTF ID of each kfunc listed above, and encodes the flags
+along with it. Of course, it is also allowed to specify no flags.
+
+2.4.1 KF_ACQUIRE flag
+---------------------
+
+The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
+refcounted object. The verifier will then ensure that the pointer to the object
+is eventually released using a release kfunc, or transferred to a map using a
+referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier rejects the
+BPF program: loading succeeds only once no lingering references remain in any
+explored state of the program.
+
+2.4.2 KF_RET_NULL flag
+----------------------
+
+The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
+may be NULL. Hence, it forces the user to do a NULL check on the pointer
+returned from the kfunc before making use of it (dereferencing or passing to
+another helper). This flag is often paired with the KF_ACQUIRE flag, but
+both are orthogonal to each other.
+
+2.4.3 KF_RELEASE flag
+---------------------
+
+The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
+passed in to it. There can be only one referenced pointer that can be passed in.
+All copies of the pointer being released are invalidated as a result of invoking
+a kfunc with this flag.
+
+2.4.4 KF_KPTR_GET flag
+----------------------
+
+The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
+as a pointer to kptr, safely increments the refcount of the object it points to,
+and returns a reference to the user. The rest of the arguments may be normal
+arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
+KF_ACQUIRE and KF_RET_NULL flags.
+
+2.4.5 KF_TRUSTED_ARGS flag
+--------------------------
+
+The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
+indicates that all pointer arguments will always be refcounted, and have
+their offset set to 0. It can be used to enforce that a pointer to a refcounted
+object acquired from a kfunc or BPF helper is passed as an argument to this
+kfunc without any modifications (e.g. pointer arithmetic) such that it is
+trusted and points to the original object. This flag is often used for kfuncs
+that operate (change some property, perform some operation) on an object that
+was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to
+ensure the integrity of the operation being performed on the expected object.
+
+2.5 Registering the kfuncs
+--------------------------
+
+Once the kfunc is prepared for use, the final step to making it visible is
+registering it with the BPF subsystem. Registration is done per BPF program
+type. An example is shown below::
+
+        BTF_SET8_START(bpf_task_set)
+        BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+        BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+        BTF_SET8_END(bpf_task_set)
+
+        static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
+                .owner = THIS_MODULE,
+                .set   = &bpf_task_set,
+        };
+
+        static int init_subsystem(void)
+        {
+                return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
+        }
+        late_initcall(init_subsystem);
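
Reviewer note: the documentation says the set "encodes the BTF ID of each kfunc listed above, and encodes the flags along with it". The following is a minimal userspace sketch of that idea, assuming the set is a sorted array of (ID, flags) pairs searched by ID. The names toy_id_flags, toy_set8_lookup and toy_task_set, and the numeric IDs, are hypothetical and are not the kernel's definitions. Only the three KF_* values are copied from include/linux/btf.h in this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Flag values as defined in include/linux/btf.h by this commit. */
#define KF_ACQUIRE  (1 << 0)
#define KF_RELEASE  (1 << 1)
#define KF_RET_NULL (1 << 2)

/* Hypothetical stand-in for one 8-byte entry: a 4-byte ID plus 4-byte flags. */
struct toy_id_flags {
	uint32_t id;
	uint32_t flags;
};

/* Comparison callback for bsearch(): key is a u32 ID, elem is an entry. */
static int toy_cmp_id(const void *key, const void *elem)
{
	uint32_t id = *(const uint32_t *)key;
	const struct toy_id_flags *pair = elem;

	if (id < pair->id)
		return -1;
	if (id > pair->id)
		return 1;
	return 0;
}

/*
 * Return a pointer to the flags word of the matching entry, or NULL when
 * the ID is not in the set. A NULL result means "not registered", which
 * is distinct from "registered with no flags" (non-NULL, value 0).
 */
uint32_t *toy_set8_lookup(struct toy_id_flags *pairs, size_t cnt, uint32_t id)
{
	struct toy_id_flags *pair;

	pair = bsearch(&id, pairs, cnt, sizeof(*pairs), toy_cmp_id);
	return pair ? &pair->flags : NULL;
}

/* Made-up IDs, kept sorted so that bsearch() is valid. */
struct toy_id_flags toy_task_set[] = {
	{ 101, KF_ACQUIRE | KF_RET_NULL },	/* like bpf_get_task_pid */
	{ 205, KF_RELEASE },			/* like bpf_put_pid */
	{ 317, 0 },				/* registered, no flags */
};
```

Returning a pointer rather than a boolean lets one lookup answer both "is this kfunc registered?" and "which flags does it carry?", which appears to be the motivation for the u32 * return type of btf_kfunc_id_set_contains() in include/linux/btf.h.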

include/linux/bpf.h

Lines changed: 2 additions & 1 deletion
@@ -1924,7 +1924,8 @@ int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
 				struct bpf_reg_state *regs);
 int btf_check_kfunc_arg_match(struct bpf_verifier_env *env,
 			      const struct btf *btf, u32 func_id,
-			      struct bpf_reg_state *regs);
+			      struct bpf_reg_state *regs,
+			      u32 kfunc_flags);
 int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			  struct bpf_reg_state *reg);
 int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog,

include/linux/btf.h

Lines changed: 42 additions & 23 deletions
@@ -12,14 +12,43 @@
 #define BTF_TYPE_EMIT(type) ((void)(type *)0)
 #define BTF_TYPE_EMIT_ENUM(enum_val) ((void)enum_val)
 
-enum btf_kfunc_type {
-	BTF_KFUNC_TYPE_CHECK,
-	BTF_KFUNC_TYPE_ACQUIRE,
-	BTF_KFUNC_TYPE_RELEASE,
-	BTF_KFUNC_TYPE_RET_NULL,
-	BTF_KFUNC_TYPE_KPTR_ACQUIRE,
-	BTF_KFUNC_TYPE_MAX,
-};
+/* These need to be macros, as the expressions are used in assembler input */
+#define KF_ACQUIRE	(1 << 0) /* kfunc is an acquire function */
+#define KF_RELEASE	(1 << 1) /* kfunc is a release function */
+#define KF_RET_NULL	(1 << 2) /* kfunc returns a pointer that may be NULL */
+#define KF_KPTR_GET	(1 << 3) /* kfunc returns reference to a kptr */
+/* Trusted arguments are those which are meant to be referenced arguments with
+ * unchanged offset. It is used to enforce that pointers obtained from acquire
+ * kfuncs remain unmodified when being passed to helpers taking trusted args.
+ *
+ * Consider
+ *	struct foo {
+ *		int data;
+ *		struct foo *next;
+ *	};
+ *
+ *	struct bar {
+ *		int data;
+ *		struct foo f;
+ *	};
+ *
+ *	struct foo *f = alloc_foo(); // Acquire kfunc
+ *	struct bar *b = alloc_bar(); // Acquire kfunc
+ *
+ * If a kfunc set_foo_data() wants to operate only on the allocated object, it
+ * will set the KF_TRUSTED_ARGS flag, which will prevent unsafe usage like:
+ *
+ *	set_foo_data(f, 42);	   // Allowed
+ *	set_foo_data(f->next, 42); // Rejected, non-referenced pointer
+ *	set_foo_data(&f->next, 42);// Rejected, referenced, but wrong type
+ *	set_foo_data(&b->f, 42);   // Rejected, referenced, but bad offset
+ *
+ * In the final case, usually for the purposes of type matching, it is deduced
+ * by looking at the type of the member at the offset, but due to the
+ * requirement of trusted argument, this deduction will be strict and not done
+ * for this case.
+ */
+#define KF_TRUSTED_ARGS (1 << 4) /* kfunc only takes trusted pointer arguments */
 
 struct btf;
 struct btf_member;
@@ -30,16 +59,7 @@ struct btf_id_set;
 
 struct btf_kfunc_id_set {
 	struct module *owner;
-	union {
-		struct {
-			struct btf_id_set *check_set;
-			struct btf_id_set *acquire_set;
-			struct btf_id_set *release_set;
-			struct btf_id_set *ret_null_set;
-			struct btf_id_set *kptr_acquire_set;
-		};
-		struct btf_id_set *sets[BTF_KFUNC_TYPE_MAX];
-	};
+	struct btf_id_set8 *set;
 };
 
 struct btf_id_dtor_kfunc {
@@ -378,9 +398,9 @@ const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
 struct btf *btf_parse_vmlinux(void);
 struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog);
-bool btf_kfunc_id_set_contains(const struct btf *btf,
+u32 *btf_kfunc_id_set_contains(const struct btf *btf,
 			       enum bpf_prog_type prog_type,
-			       enum btf_kfunc_type type, u32 kfunc_btf_id);
+			       u32 kfunc_btf_id);
 int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
 			      const struct btf_kfunc_id_set *s);
 s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id);
@@ -397,12 +417,11 @@ static inline const char *btf_name_by_offset(const struct btf *btf,
 {
 	return NULL;
 }
-static inline bool btf_kfunc_id_set_contains(const struct btf *btf,
+static inline u32 *btf_kfunc_id_set_contains(const struct btf *btf,
 					     enum bpf_prog_type prog_type,
-					     enum btf_kfunc_type type,
 					     u32 kfunc_btf_id)
 {
-	return false;
+	return NULL;
 }
 static inline int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
 					    const struct btf_kfunc_id_set *s)
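
Reviewer note: the KF_TRUSTED_ARGS comment in this header lists four calls to set_foo_data() and the verdict for each. The toy predicate below restates those verdicts as a userspace sketch. It assumes an argument can be summarised by three facts: whether it is referenced, its offset from the start of the acquired object, and its type without member walking. struct toy_arg, enum toy_type and both functions are hypothetical illustrations, and the real verifier tracks far more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two structures from the comment in include/linux/btf.h. */
struct foo {
	int data;
	struct foo *next;
};

struct bar {
	int data;
	struct foo f;
};

/* Hypothetical type tags; the real verifier compares BTF type IDs. */
enum toy_type { TOY_FOO, TOY_FOO_PTR, TOY_BAR };

/* Hypothetical summary of what is known about one pointer argument. */
struct toy_arg {
	uint32_t ref_obj_id;	/* non-zero if it holds an acquired reference */
	size_t off;		/* byte offset from the acquired object's start */
	enum toy_type type;	/* pointee type, with no member walking */
};

/* Trusted means: referenced, unmodified offset, and exact type match. */
bool toy_trusted_arg_ok(const struct toy_arg *arg, enum toy_type expected)
{
	return arg->ref_obj_id != 0 && arg->off == 0 && arg->type == expected;
}

struct toy_case {
	struct toy_arg arg;
	bool expect_ok;
};

/* Replay the four calls from the comment; return how many verdicts differ. */
int toy_count_mismatches(void)
{
	const struct toy_case cases[] = {
		/* set_foo_data(f, 42): allowed */
		{ { 1, 0, TOY_FOO }, true },
		/* set_foo_data(f->next, 42): non-referenced pointer */
		{ { 0, 0, TOY_FOO }, false },
		/* set_foo_data(&f->next, 42): referenced, but wrong type */
		{ { 1, offsetof(struct foo, next), TOY_FOO_PTR }, false },
		/* set_foo_data(&b->f, 42): referenced, but bad offset */
		{ { 2, offsetof(struct bar, f), TOY_BAR }, false },
	};
	int bad = 0;

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		if (toy_trusted_arg_ok(&cases[i].arg, TOY_FOO) != cases[i].expect_ok)
			bad++;
	}
	return bad;
}
```

In this model the last case fails because the type is taken at offset 0 (struct bar) instead of being deduced from the member found at the offset, which is the strictness the comment describes.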
