@@ -810,6 +810,35 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
+ * u64 bpf_perf_event_read(struct bpf_map *map, u64 flags)
+ * Description
+ * Read the value of a perf event counter. This helper relies on a
+ * *map* of type **BPF_MAP_TYPE_PERF_EVENT_ARRAY**. The nature of
+ * the perf event counter is selected when *map* is updated with
+ * perf event file descriptors. The *map* is an array whose size
+ * is the number of available CPUs, and each cell contains a value
+ * relative to one CPU. The value to retrieve is indicated by
+ * *flags*, which contains the index of the CPU to look up, masked
+ * with **BPF_F_INDEX_MASK**. Alternatively, *flags* can be set to
+ * **BPF_F_CURRENT_CPU** to indicate that the value for the
+ * current CPU should be retrieved.
+ *
+ * Note that before Linux 4.13, only hardware perf events can be
+ * retrieved.
+ *
+ * Also, be aware that the newer helper
+ * **bpf_perf_event_read_value**\ () is recommended over
+ * **bpf_perf_event_read**\ () in general. The latter has some ABI
+ * quirks where error and counter value are used as a return code
+ * (which is wrong to do since ranges may overlap). This issue is
+ * fixed with **bpf_perf_event_read_value**\ (), which at the same
+ * time provides more features over the **bpf_perf_event_read**\ ()
+ * interface. Please refer to the description of
+ * **bpf_perf_event_read_value**\ () for details.
+ * Return
+ * The value of the perf event counter read from the map, or a
+ * negative error code in case of failure.
+ *
* int bpf_redirect(u32 ifindex, u64 flags)
* Description
* Redirect the packet to another net device of index *ifindex*.
@@ -1071,6 +1100,17 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
+ * int bpf_skb_under_cgroup(struct sk_buff *skb, struct bpf_map *map, u32 index)
+ * Description
+ * Check whether *skb* is a descendant of the cgroup2 held by
+ * *map* of type **BPF_MAP_TYPE_CGROUP_ARRAY**, at *index*.
+ * Return
+ * The return value depends on the result of the test, and can be:
+ *
+ * * 0, if the *skb* failed the cgroup2 descendant test.
+ * * 1, if the *skb* succeeded the cgroup2 descendant test.
+ * * A negative error code, if an error occurred.
+ *
* u32 bpf_get_hash_recalc(struct sk_buff *skb)
* Description
* Retrieve the hash of the packet, *skb*\ **->hash**. If it is
@@ -1091,6 +1131,37 @@ union bpf_attr {
* Return
* A pointer to the current task struct.
*
+ * int bpf_probe_write_user(void *dst, const void *src, u32 len)
+ * Description
+ * Attempt in a safe way to write *len* bytes from the buffer
+ * *src* to *dst* in memory. It only works for threads that are in
+ * user context, and *dst* must be a valid user space address.
+ *
+ * This helper should not be used to implement any kind of
+ * security mechanism because of TOC-TOU attacks, but rather to
+ * debug, divert, and manipulate execution of semi-cooperative
+ * processes.
+ *
+ * Keep in mind that this feature is meant for experiments, and it
+ * has a risk of crashing the system and running programs.
+ * Therefore, when an eBPF program using this helper is attached,
+ * a warning including PID and process name is printed to kernel
+ * logs.
+ * Return
+ * 0 on success, or a negative error in case of failure.
+ *
+ * int bpf_current_task_under_cgroup(struct bpf_map *map, u32 index)
+ * Description
+ * Check whether the probe is being run in the context of a given
+ * subset of the cgroup2 hierarchy. The cgroup2 to test is held by
+ * *map* of type **BPF_MAP_TYPE_CGROUP_ARRAY**, at *index*.
+ * Return
+ * The return value depends on the result of the test, and can be:
+ *
+ * * 1, if current task belongs to the cgroup2.
+ * * 0, if current task does not belong to the cgroup2.
+ * * A negative error code, if an error occurred.
+ *
* int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
* Description
* Resize (trim or grow) the packet associated to *skb* to the
@@ -1182,6 +1253,107 @@ union bpf_attr {
* Return
* The id of current NUMA node.
*
+ * int bpf_skb_change_head(struct sk_buff *skb, u32 len, u64 flags)
+ * Description
+ * Grows headroom of packet associated to *skb* and adjusts the
+ * offset of the MAC header accordingly, adding *len* bytes of
+ * space. It automatically extends and reallocates memory as
+ * required.
+ *
+ * This helper can be used on a layer 3 *skb* to push a MAC header
+ * for redirection into a layer 2 device.
+ *
+ * All values for *flags* are reserved for future usage, and must
+ * be left at zero.
+ *
+ * A call to this helper is susceptible to change the underlying
+ * packet buffer. Therefore, at load time, all checks on pointers
+ * previously done by the verifier are invalidated and must be
+ * performed again, if the helper is used in combination with
+ * direct packet access.
+ * Return
+ * 0 on success, or a negative error in case of failure.
+ *
+ * int bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
+ * Description
+ * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
+ * it is possible to use a negative value for *delta*. This helper
+ * can be used to prepare the packet for pushing or popping
+ * headers.
+ *
+ * A call to this helper is susceptible to change the underlying
+ * packet buffer. Therefore, at load time, all checks on pointers
+ * previously done by the verifier are invalidated and must be
+ * performed again, if the helper is used in combination with
+ * direct packet access.
+ * Return
+ * 0 on success, or a negative error in case of failure.
+ *
+ * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
+ * Description
+ * Copy a NUL terminated string from an unsafe address
+ * *unsafe_ptr* to *dst*. The *size* should include the
+ * terminating NUL byte. In case the string length is smaller than
+ * *size*, the target is not padded with further NUL bytes. If the
+ * string length is larger than *size*, just *size*-1 bytes are
+ * copied and the last byte is set to NUL.
+ *
+ * On success, the length of the copied string is returned. This
+ * makes this helper useful in tracing programs for reading
+ * strings, and more importantly to get their length at runtime.
+ * See the following snippet:
+ *
+ * ::
+ *
+ * SEC("kprobe/sys_open")
+ * void bpf_sys_open(struct pt_regs *ctx)
+ * {
+ * char buf[PATHLEN]; // PATHLEN is defined to 256
+ * int res = bpf_probe_read_str(buf, sizeof(buf),
+ * ctx->di);
+ *
+ * // Consume buf, for example push it to
+ * // userspace via bpf_perf_event_output(); we
+ * // can use res (the string length) as event
+ * // size, after checking its boundaries.
+ * }
+ *
+ * In comparison, using the **bpf_probe_read**\ () helper here
+ * instead to read the string would require estimating the length
+ * at compile time, and would often result in copying more memory
+ * than necessary.
+ *
+ * Another useful use case is when parsing individual process
+ * arguments or individual environment variables by navigating
+ * *current*\ **->mm->arg_start** and *current*\
+ * **->mm->env_start**: using this helper and the return value,
+ * one can quickly iterate at the right offset of the memory area.
+ * Return
+ * On success, the strictly positive length of the string,
+ * including the trailing NUL character. On error, a negative
+ * value.
+ *
+ * u64 bpf_get_socket_cookie(struct sk_buff *skb)
+ * Description
+ * If the **struct sk_buff** pointed to by *skb* has a known
+ * socket, retrieve the cookie (generated by the kernel) of this
+ * socket. If no cookie has been set yet, generate a new cookie.
+ * Once generated, the socket cookie remains stable for the life
+ * of the socket. This helper can be useful for monitoring per
+ * socket networking traffic statistics as it provides a unique
+ * socket identifier per namespace.
+ * Return
+ * An 8-byte long non-decreasing number on success, or 0 if the
+ * socket field is missing inside *skb*.
+ *
+ * u32 bpf_get_socket_uid(struct sk_buff *skb)
+ * Return
+ * The owner UID of the socket associated to *skb*. If the socket
+ * is **NULL**, or if it is not a full socket (i.e. if it is a
+ * time-wait or a request socket instead), **overflowuid** value
+ * is returned (note that **overflowuid** might also be the actual
+ * UID value for the socket).
+ *
* u32 bpf_set_hash(struct sk_buff *skb, u32 hash)
* Description
* Set the full hash for *skb* (set the field *skb*\ **->hash**)