Commit 9d2355b

Merge branch 'cmsg_timestamp'
Soheil Hassas Yeganeh says:

====================
add TX timestamping via cmsg

This patch series aims at enabling TX timestamping via cmsg.

Currently, to occasionally sample TX timestamping on a socket,
applications need to call setsockopt twice: first for enabling
timestamps and then for disabling them. This is an unnecessary overhead.
With cmsg, in contrast, applications can sample TX timestamps per
sendmsg().

This patch series adds the code for processing SO_TIMESTAMPING
for cmsg's of the SOL_SOCKET level, and adds the glue code for
TCP, UDP, and RAW for both IPv4 and IPv6. This implementation
supports overriding timestamp generation flags (i.e.,
SOF_TIMESTAMPING_TX_*) but not timestamp reporting flags.
Applications must still enable timestamp reporting via
setsockopt to receive timestamps.

This series does not change existing timestamping behavior for
applications that are using socket options.

I will follow up with another patch to enable timestamping for
active TFO (client-side TCP Fast Open) and also setting packet
mark via cmsgs.

Thanks!

Changes in v2:
- Replace u32 with __u32 in the documentation.

Changes in v3:
- Fix the broken build for L2TP (due to changes in IPv6).
====================

Signed-off-by: David S. Miller <[email protected]>
2 parents 833716e + fd91e12 commit 9d2355b

27 files changed, 231 additions and 78 deletions

Documentation/networking/timestamping.txt

Lines changed: 45 additions & 3 deletions
@@ -44,11 +44,17 @@ timeval of SO_TIMESTAMP (ms).
 Supports multiple types of timestamp requests. As a result, this
 socket option takes a bitmap of flags, not a boolean. In
 
-  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val, &val);
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
+                   sizeof(val));
 
 val is an integer with any of the following bits set. Setting other
 bit returns EINVAL and does not change the current state.
 
+The socket option configures timestamp generation for individual
+sk_buffs (1.3.1), timestamp reporting to the socket's error
+queue (1.3.2) and options (1.3.3). Timestamp generation can also
+be enabled for individual sendmsg calls using cmsg (1.3.4).
+
 
 1.3.1 Timestamp Generation
 
@@ -71,13 +77,16 @@ SOF_TIMESTAMPING_RX_SOFTWARE:
   kernel receive stack.
 
 SOF_TIMESTAMPING_TX_HARDWARE:
-  Request tx timestamps generated by the network adapter.
+  Request tx timestamps generated by the network adapter. This flag
+  can be enabled via both socket options and control messages.
 
 SOF_TIMESTAMPING_TX_SOFTWARE:
   Request tx timestamps when data leaves the kernel. These timestamps
   are generated in the device driver as close as possible, but always
   prior to, passing the packet to the network interface. Hence, they
   require driver support and may not be available for all devices.
+  This flag can be enabled via both socket options and control messages.
+
 
 SOF_TIMESTAMPING_TX_SCHED:
   Request tx timestamps prior to entering the packet scheduler. Kernel
@@ -90,7 +99,8 @@ SOF_TIMESTAMPING_TX_SCHED:
   machines with virtual devices where a transmitted packet travels
   through multiple devices and, hence, multiple packet schedulers,
   a timestamp is generated at each layer. This allows for fine
-  grained measurement of queuing delay.
+  grained measurement of queuing delay. This flag can be enabled
+  via both socket options and control messages.
 
 SOF_TIMESTAMPING_TX_ACK:
   Request tx timestamps when all data in the send buffer has been
@@ -99,6 +109,7 @@ SOF_TIMESTAMPING_TX_ACK:
   over-report measurement, because the timestamp is generated when all
   data up to and including the buffer at send() was acknowledged: the
   cumulative acknowledgment. The mechanism ignores SACK and FACK.
+  This flag can be enabled via both socket options and control messages.
 
 
 1.3.2 Timestamp Reporting
@@ -183,6 +194,37 @@ having access to the contents of the original packet, so cannot be
 combined with SOF_TIMESTAMPING_OPT_TSONLY.
 
 
+1.3.4. Enabling timestamps via control messages
+
+In addition to socket options, timestamp generation can be requested
+per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1).
+Using this feature, applications can sample timestamps per sendmsg()
+without paying the overhead of enabling and disabling timestamps via
+setsockopt:
+
+  struct msghdr *msg;
+  ...
+  cmsg = CMSG_FIRSTHDR(msg);
+  cmsg->cmsg_level = SOL_SOCKET;
+  cmsg->cmsg_type = SO_TIMESTAMPING;
+  cmsg->cmsg_len = CMSG_LEN(sizeof(__u32));
+  *((__u32 *) CMSG_DATA(cmsg)) = SOF_TIMESTAMPING_TX_SCHED |
+                                 SOF_TIMESTAMPING_TX_SOFTWARE |
+                                 SOF_TIMESTAMPING_TX_ACK;
+  err = sendmsg(fd, msg, 0);
+
+The SOF_TIMESTAMPING_TX_* flags set via cmsg will override
+the SOF_TIMESTAMPING_TX_* flags set via setsockopt.
+
+Moreover, applications must still enable timestamp reporting via
+setsockopt to receive timestamps:
+
+  __u32 val = SOF_TIMESTAMPING_SOFTWARE |
+              SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
+                   sizeof(val));
+
+
 1.4 Bytestream Timestamps
 
 The SO_TIMESTAMPING interface supports timestamping of bytes in a
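
The per-call flow described in section 1.3.4 of the documentation hunk can be sketched as a small user-space helper. This is a minimal sketch, assuming a Linux system whose kernel includes this series; the function names `enable_tstamp_reporting`, `set_tstamp_cmsg` and `send_sampled` are hypothetical and do not come from the commit. Note that the documentation snippet passes `(void *) val` to setsockopt, whereas a compilable call passes the address of the option value, as done here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/net_tstamp.h>

/* Enable timestamp *reporting* once per socket (hypothetical helper).
 * The option value is passed by address, with its size. */
int enable_tstamp_reporting(int fd)
{
	uint32_t val = SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID;

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
}

/* Attach one SOL_SOCKET/SO_TIMESTAMPING cmsg carrying tx_flags to msg.
 * ctrl must provide at least CMSG_SPACE(sizeof(uint32_t)) bytes and be
 * suitably aligned for struct cmsghdr. */
void set_tstamp_cmsg(struct msghdr *msg, void *ctrl, size_t ctrl_len,
		     uint32_t tx_flags)
{
	struct cmsghdr *cmsg;

	assert(ctrl_len >= CMSG_SPACE(sizeof(uint32_t)));
	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(uint32_t));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SO_TIMESTAMPING;
	cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t));
	memcpy(CMSG_DATA(cmsg), &tx_flags, sizeof(tx_flags));
}

/* Send one buffer and request TX timestamp generation for this call only. */
ssize_t send_sampled(int fd, const void *buf, size_t len, uint32_t tx_flags)
{
	union {
		char buf[CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} ctrl;
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

	set_tstamp_cmsg(&msg, ctrl.buf, sizeof(ctrl.buf), tx_flags);
	return sendmsg(fd, &msg, 0);
}
```

A caller would typically invoke `enable_tstamp_reporting(fd)` once after creating the socket and then use `send_sampled(fd, buf, len, SOF_TIMESTAMPING_TX_SOFTWARE)` only for the writes it wants to sample; all other writes go through plain send() and generate no timestamps unless generation flags were also set via setsockopt.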

drivers/net/tun.c

Lines changed: 2 additions & 1 deletion
@@ -861,7 +861,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto drop;
 
 	if (skb->sk && sk_fullsock(skb->sk)) {
-		sock_tx_timestamp(skb->sk, &skb_shinfo(skb)->tx_flags);
+		sock_tx_timestamp(skb->sk, skb->sk->sk_tsflags,
+				  &skb_shinfo(skb)->tx_flags);
 		sw_tx_timestamp(skb);
 	}
 

include/net/ip.h

Lines changed: 2 additions & 1 deletion
@@ -56,6 +56,7 @@ static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
 }
 
 struct ipcm_cookie {
+	struct sockcm_cookie sockc;
 	__be32 addr;
 	int oif;
 	struct ip_options_rcu *opt;
@@ -550,7 +551,7 @@ int ip_options_rcv_srr(struct sk_buff *skb);
 
 void ipv4_pktinfo_prepare(const struct sock *sk, struct sk_buff *skb);
 void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb, int offset);
-int ip_cmsg_send(struct net *net, struct msghdr *msg,
+int ip_cmsg_send(struct sock *sk, struct msghdr *msg,
 		 struct ipcm_cookie *ipc, bool allow_ipv6);
 int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval,
 		  unsigned int optlen);

include/net/ipv6.h

Lines changed: 4 additions & 2 deletions
@@ -867,7 +867,8 @@ int ip6_append_data(struct sock *sk,
 				int odd, struct sk_buff *skb),
 		    void *from, int length, int transhdrlen, int hlimit,
 		    int tclass, struct ipv6_txoptions *opt, struct flowi6 *fl6,
-		    struct rt6_info *rt, unsigned int flags, int dontfrag);
+		    struct rt6_info *rt, unsigned int flags, int dontfrag,
+		    const struct sockcm_cookie *sockc);
 
 int ip6_push_pending_frames(struct sock *sk);
 
@@ -884,7 +885,8 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
 			     void *from, int length, int transhdrlen,
 			     int hlimit, int tclass, struct ipv6_txoptions *opt,
 			     struct flowi6 *fl6, struct rt6_info *rt,
-			     unsigned int flags, int dontfrag);
+			     unsigned int flags, int dontfrag,
+			     const struct sockcm_cookie *sockc);
 
 static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
 {

include/net/sock.h

Lines changed: 9 additions & 4 deletions
@@ -1418,8 +1418,11 @@ void sk_send_sigurg(struct sock *sk);
 
 struct sockcm_cookie {
 	u32 mark;
+	u16 tsflags;
 };
 
+int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+		     struct sockcm_cookie *sockc);
 int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 		   struct sockcm_cookie *sockc);
 
@@ -2054,19 +2057,21 @@ static inline void sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
 		sk->sk_stamp = skb->tstamp;
 }
 
-void __sock_tx_timestamp(const struct sock *sk, __u8 *tx_flags);
+void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags);
 
 /**
  * sock_tx_timestamp - checks whether the outgoing packet is to be time stamped
  * @sk: socket sending this packet
+ * @tsflags: timestamping flags to use
  * @tx_flags: completed with instructions for time stamping
  *
  * Note : callers should take care of initial *tx_flags value (usually 0)
  */
-static inline void sock_tx_timestamp(const struct sock *sk, __u8 *tx_flags)
+static inline void sock_tx_timestamp(const struct sock *sk, __u16 tsflags,
+				     __u8 *tx_flags)
 {
-	if (unlikely(sk->sk_tsflags))
-		__sock_tx_timestamp(sk, tx_flags);
+	if (unlikely(tsflags))
+		__sock_tx_timestamp(tsflags, tx_flags);
 	if (unlikely(sock_flag(sk, SOCK_WIFI_STATUS)))
 		*tx_flags |= SKBTX_WIFI_STATUS;
 }

include/net/tcp.h

Lines changed: 2 additions & 1 deletion
@@ -754,7 +754,8 @@ struct tcp_skb_cb {
 				 TCPCB_REPAIRED)
 
 	__u8 ip_dsfield; /* IPv4 tos or IPv6 dsfield */
-	/* 1 byte hole */
+	__u8 txstamp_ack:1, /* Record TX timestamp for ack? */
+	     unused:7;
 	__u32 ack_seq; /* Sequence number ACK'd */
 	union {
 		struct inet_skb_parm h4;

include/net/transp_v6.h

Lines changed: 2 additions & 1 deletion
@@ -42,7 +42,8 @@ void ip6_datagram_recv_specific_ctl(struct sock *sk, struct msghdr *msg,
 
 int ip6_datagram_send_ctl(struct net *net, struct sock *sk, struct msghdr *msg,
 			  struct flowi6 *fl6, struct ipv6_txoptions *opt,
-			  int *hlimit, int *tclass, int *dontfrag);
+			  int *hlimit, int *tclass, int *dontfrag,
+			  struct sockcm_cookie *sockc);
 
 void ip6_dgram_sock_seq_show(struct seq_file *seq, struct sock *sp,
 			     __u16 srcp, __u16 destp, int bucket);

include/uapi/linux/net_tstamp.h

Lines changed: 10 additions & 0 deletions
@@ -31,6 +31,16 @@ enum {
 	SOF_TIMESTAMPING_LAST
 };
 
+/*
+ * SO_TIMESTAMPING flags are either for recording a packet timestamp or for
+ * reporting the timestamp to user space.
+ * Recording flags can be set both via socket options and control messages.
+ */
+#define SOF_TIMESTAMPING_TX_RECORD_MASK	(SOF_TIMESTAMPING_TX_HARDWARE | \
+					 SOF_TIMESTAMPING_TX_SOFTWARE | \
+					 SOF_TIMESTAMPING_TX_SCHED | \
+					 SOF_TIMESTAMPING_TX_ACK)
+
 /**
  * struct hwtstamp_config - %SIOCGHWTSTAMP and %SIOCSHWTSTAMP parameter
  *

net/can/raw.c

Lines changed: 1 addition & 1 deletion
@@ -755,7 +755,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	if (err < 0)
 		goto free_skb;
 
-	sock_tx_timestamp(sk, &skb_shinfo(skb)->tx_flags);
+	sock_tx_timestamp(sk, sk->sk_tsflags, &skb_shinfo(skb)->tx_flags);
 
 	skb->dev = dev;
 	skb->sk = sk;

net/core/sock.c

Lines changed: 37 additions & 12 deletions
@@ -832,7 +832,8 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		    !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID)) {
 			if (sk->sk_protocol == IPPROTO_TCP &&
 			    sk->sk_type == SOCK_STREAM) {
-				if (sk->sk_state != TCP_ESTABLISHED) {
+				if ((1 << sk->sk_state) &
+				    (TCPF_CLOSE | TCPF_LISTEN)) {
 					ret = -EINVAL;
 					break;
 				}
@@ -1866,27 +1867,51 @@ struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
 }
 EXPORT_SYMBOL(sock_alloc_send_skb);
 
+int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+		     struct sockcm_cookie *sockc)
+{
+	u32 tsflags;
+
+	switch (cmsg->cmsg_type) {
+	case SO_MARK:
+		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+			return -EPERM;
+		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
+			return -EINVAL;
+		sockc->mark = *(u32 *)CMSG_DATA(cmsg);
+		break;
+	case SO_TIMESTAMPING:
+		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
+			return -EINVAL;
+
+		tsflags = *(u32 *)CMSG_DATA(cmsg);
+		if (tsflags & ~SOF_TIMESTAMPING_TX_RECORD_MASK)
+			return -EINVAL;
+
+		sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK;
+		sockc->tsflags |= tsflags;
+		break;
+	default:
+		return -EINVAL;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(__sock_cmsg_send);
+
 int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 		   struct sockcm_cookie *sockc)
 {
 	struct cmsghdr *cmsg;
+	int ret;
 
 	for_each_cmsghdr(cmsg, msg) {
 		if (!CMSG_OK(msg, cmsg))
 			return -EINVAL;
 		if (cmsg->cmsg_level != SOL_SOCKET)
 			continue;
-		switch (cmsg->cmsg_type) {
-		case SO_MARK:
-			if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
-				return -EPERM;
-			if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
-				return -EINVAL;
-			sockc->mark = *(u32 *)CMSG_DATA(cmsg);
-			break;
-		default:
-			return -EINVAL;
-		}
+		ret = __sock_cmsg_send(sk, msg, cmsg, sockc);
+		if (ret)
+			return ret;
 	}
 	return 0;
 }
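
The SO_TIMESTAMPING case in `__sock_cmsg_send` rejects any bit outside the TX recording mask, then replaces only the recording bits of the cookie while leaving reporting and option bits untouched. The following is a user-space model of that step, written as a sketch for illustration only; `merge_cmsg_tsflags` and the local `TX_RECORD_MASK` macro are hypothetical names, and the mask is spelled out locally so the example does not depend on the installed header already defining SOF_TIMESTAMPING_TX_RECORD_MASK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/net_tstamp.h>

/* Local copy of the four TX recording flags that this series groups
 * under SOF_TIMESTAMPING_TX_RECORD_MASK. */
#define TX_RECORD_MASK (SOF_TIMESTAMPING_TX_HARDWARE | \
			SOF_TIMESTAMPING_TX_SOFTWARE | \
			SOF_TIMESTAMPING_TX_SCHED | \
			SOF_TIMESTAMPING_TX_ACK)

/* Hypothetical model of the cmsg override: *tsflags starts out as the
 * socket-wide flags and cmsg_flags is the per-call request.
 * Returns 0 on success, or -EINVAL (leaving *tsflags unchanged) when the
 * request contains anything other than TX recording flags. */
int merge_cmsg_tsflags(uint16_t *tsflags, uint32_t cmsg_flags)
{
	assert(tsflags != NULL);

	if (cmsg_flags & ~(uint32_t)TX_RECORD_MASK)
		return -EINVAL;

	/* Drop the socket's recording bits, keep everything else,
	 * then apply the per-call recording bits. */
	*tsflags = (uint16_t)((*tsflags & ~TX_RECORD_MASK) | cmsg_flags);
	return 0;
}
```

Under this model, a socket configured with SOF_TIMESTAMPING_SOFTWARE, SOF_TIMESTAMPING_OPT_ID and SOF_TIMESTAMPING_TX_SOFTWARE that sends with a cmsg of SOF_TIMESTAMPING_TX_SCHED ends up with TX_SCHED as its only recording flag for that call, while a cmsg that carries a reporting flag such as SOF_TIMESTAMPING_OPT_ID is refused.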

net/ipv4/ip_sockglue.c

Lines changed: 8 additions & 1 deletion
@@ -219,11 +219,12 @@ void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(ip_cmsg_recv_offset);
 
-int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc,
+int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 		 bool allow_ipv6)
 {
 	int err, val;
 	struct cmsghdr *cmsg;
+	struct net *net = sock_net(sk);
 
 	for_each_cmsghdr(cmsg, msg) {
 		if (!CMSG_OK(msg, cmsg))
@@ -244,6 +245,12 @@ int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc,
 			continue;
 		}
 #endif
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (__sock_cmsg_send(sk, msg, cmsg, &ipc->sockc))
+				return -EINVAL;
+			continue;
+		}
+
 		if (cmsg->cmsg_level != SOL_IP)
 			continue;
 		switch (cmsg->cmsg_type) {

net/ipv4/ping.c

Lines changed: 4 additions & 3 deletions
@@ -737,17 +737,16 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		/* no remote port */
 	}
 
+	ipc.sockc.tsflags = sk->sk_tsflags;
 	ipc.addr = inet->inet_saddr;
 	ipc.opt = NULL;
 	ipc.oif = sk->sk_bound_dev_if;
 	ipc.tx_flags = 0;
 	ipc.ttl = 0;
 	ipc.tos = -1;
 
-	sock_tx_timestamp(sk, &ipc.tx_flags);
-
 	if (msg->msg_controllen) {
-		err = ip_cmsg_send(sock_net(sk), msg, &ipc, false);
+		err = ip_cmsg_send(sk, msg, &ipc, false);
 		if (unlikely(err)) {
 			kfree(ipc.opt);
 			return err;
@@ -768,6 +767,8 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		rcu_read_unlock();
 	}
 
+	sock_tx_timestamp(sk, ipc.sockc.tsflags, &ipc.tx_flags);
+
 	saddr = ipc.addr;
 	ipc.addr = faddr = daddr;
 
