Commit d9962b0

dianders authored and davem330 committed
r8152: Block future register access if register access fails

Even though the functions to read/write registers can fail, most of the places in the r8152 driver that read/write register values don't check error codes. The lack of error code checking is problematic in at least two ways.

The first problem is that the r8152 driver often uses code patterns similar to this:

  x = read_register();
  x = x | SOME_BIT;
  write_register(x);

...with the above pattern, if the read_register() fails and returns garbage then we'll end up trying to write modified garbage back to the Realtek adapter. If the write_register() succeeds that's bad. Note that as of commit f53a7ad ("r8152: Set memory to all 0xFFs on failed reg reads") the "garbage" returned by read_register() will at least be consistent garbage, but it is still garbage.

It turns out that this problem is very serious. Writing garbage to some of the hardware registers on the Ethernet adapter can put the adapter in such a bad state that it needs to be power cycled (fully unplugged and plugged in again) before it can enumerate again.

The second problem is that the r8152 driver generally has functions that are long sequences of register writes. Assuming everything will be OK if a random register write fails in the middle isn't a great assumption.

One might wonder if the above two problems are real. You could ask if we would really have a successful write after a failed read. It turns out that the answer appears to be "yes, this can happen". In fact, we've seen at least two distinct failure modes where this happens.

On a sc7180-trogdor Chromebook if you drop into kdb for a while and then resume, you can see:
1. We get a "Tx timeout".
2. The "Tx timeout" queues up a USB reset.
3. In rtl8152_pre_reset() we try to reinit the hardware.
4. The first several (2-9) register accesses fail with a timeout, then things recover.

The above test case was actually fixed by the patch ("r8152: Increase USB control msg timeout to 5000ms as per spec") but at least shows that we really can see successful calls after failed ones.

On a different (AMD) based Chromebook with a particular adapter, we found that during reboot tests we'd also sometimes get a transitory failure. In this case we saw -EPIPE being returned sometimes. Retrying worked, but retrying is not always safe for all register accesses since reading/writing some registers might have side effects (like registers that clear on read).

Let's fully lock out all register access if a register access fails. When we do this, we'll try to queue up a USB reset and try to unlock register access after the reset. This is slightly trickier than it sounds since the r8152 driver has an optimized reset sequence that only works reliably after probe happens. In order to handle this, we avoid the optimized reset if probe didn't finish. Instead, we simply retry the probe routine in this case.

When locking out access, we'll use the existing infrastructure that the driver was using when it detected we were unplugged. This keeps us from getting stuck in delay loops in some parts of the driver.

Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Grant Grundler <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
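
For readers who want to see the lockout idea in isolation, here is a minimal user-space sketch. It is not driver code: struct fake_dev, fake_transfer(), fake_read(), fake_write(), set_bits_unchecked(), fake_post_reset() and MAX_RESETS are hypothetical names invented for illustration, and the "reset" is only a counter. The sketch assumes a single register and injected transient failures, and shows that once a read fails, the unchecked write that follows is refused instead of pushing modified garbage to the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RESETS 3	/* mirrors REGISTER_ACCESS_MAX_RESETS */

/* Hypothetical stand-in for the adapter: one register plus fault injection. */
struct fake_dev {
	uint32_t reg;			/* the "hardware" register */
	int failures_left;		/* upcoming transfers that fail transiently */
	bool inaccessible;		/* plays the role of RTL8152_INACCESSIBLE */
	unsigned int reset_count;	/* resets queued since the last success */
	unsigned int resets_queued;	/* total resets requested */
};

/* Common gate for every transfer, loosely modeled on r8152_control_msg(). */
static int fake_transfer(struct fake_dev *d)
{
	if (d->inaccessible)
		return -ENODEV;		/* locked out: never touch the hardware */

	if (d->failures_left == 0) {
		d->reset_count = 0;	/* a success clears the reset budget */
		return 0;
	}

	d->failures_left--;
	d->inaccessible = true;		/* block everything until the reset */
	if (d->reset_count < MAX_RESETS) {
		d->reset_count++;
		d->resets_queued++;	/* stands in for queueing a USB reset */
	}
	return -EPIPE;
}

static int fake_read(struct fake_dev *d, uint32_t *val)
{
	int ret = fake_transfer(d);

	*val = ret < 0 ? 0xffffffffu : d->reg;	/* consistent garbage on failure */
	return ret;
}

static int fake_write(struct fake_dev *d, uint32_t val)
{
	int ret = fake_transfer(d);

	if (ret == 0)
		d->reg = val;
	return ret;
}

/* The unchecked read-modify-write pattern from the commit message. */
static void set_bits_unchecked(struct fake_dev *d, uint32_t bits)
{
	uint32_t x;

	fake_read(d, &x);	/* return value deliberately ignored */
	x |= bits;
	fake_write(d, x);	/* refused if the read tripped the lockout */
}

/* Plays the role of the post-reset callback re-enabling access. */
static void fake_post_reset(struct fake_dev *d)
{
	d->inaccessible = false;
}

/* Walk through one transient failure; returns the final register value. */
static uint32_t lockout_demo(void)
{
	struct fake_dev d = { .reg = 0x10, .failures_left = 1 };

	set_bits_unchecked(&d, 0x1);	/* read fails, write is refused */
	assert(d.reg == 0x10 && d.inaccessible);

	fake_post_reset(&d);
	set_bits_unchecked(&d, 0x1);	/* both transfers succeed now */
	return d.reg;
}
```

Without the gate at the top of fake_transfer(), the first call to set_bits_unchecked() would have written 0xffffffff to the register. With it, the register keeps its old value until access is unlocked again.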
1 parent 715f67f commit d9962b0

1 file changed: +176 -31 lines changed


drivers/net/usb/r8152.c

Lines changed: 176 additions & 31 deletions
@@ -773,6 +773,9 @@ enum rtl8152_flags {
 	SCHEDULE_TASKLET,
 	GREEN_ETHERNET,
 	RX_EPROTO,
+	IN_PRE_RESET,
+	PROBED_WITH_NO_ERRORS,
+	PROBE_SHOULD_RETRY,
 };
 
 #define DEVICE_ID_LENOVO_USB_C_TRAVEL_HUB	0x721e
@@ -953,6 +956,8 @@ struct r8152 {
 	u8 version;
 	u8 duplex;
 	u8 autoneg;
+
+	unsigned int reg_access_reset_count;
 };
 
 /**
@@ -1200,6 +1205,96 @@ static unsigned int agg_buf_sz = 16384;
 
 #define RTL_LIMITED_TSO_SIZE	(size_to_mtu(agg_buf_sz) - sizeof(struct tx_desc))
 
+/* If register access fails then we block access and issue a reset. If this
+ * happens too many times in a row without a successful access then we stop
+ * trying to reset and just leave access blocked.
+ */
+#define REGISTER_ACCESS_MAX_RESETS	3
+
+static void rtl_set_inaccessible(struct r8152 *tp)
+{
+	set_bit(RTL8152_INACCESSIBLE, &tp->flags);
+	smp_mb__after_atomic();
+}
+
+static void rtl_set_accessible(struct r8152 *tp)
+{
+	clear_bit(RTL8152_INACCESSIBLE, &tp->flags);
+	smp_mb__after_atomic();
+}
+
+static
+int r8152_control_msg(struct r8152 *tp, unsigned int pipe, __u8 request,
+		      __u8 requesttype, __u16 value, __u16 index, void *data,
+		      __u16 size, const char *msg_tag)
+{
+	struct usb_device *udev = tp->udev;
+	int ret;
+
+	if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+		return -ENODEV;
+
+	ret = usb_control_msg(udev, pipe, request, requesttype,
+			      value, index, data, size,
+			      USB_CTRL_GET_TIMEOUT);
+
+	/* No need to issue a reset to report an error if the USB device got
+	 * unplugged; just return immediately.
+	 */
+	if (ret == -ENODEV)
+		return ret;
+
+	/* If the write was successful then we're done */
+	if (ret >= 0) {
+		tp->reg_access_reset_count = 0;
+		return ret;
+	}
+
+	dev_err(&udev->dev,
+		"Failed to %s %d bytes at %#06x/%#06x (%d)\n",
+		msg_tag, size, value, index, ret);
+
+	/* Block all future register access until we reset. Much of the code
+	 * in the driver doesn't check for errors. Notably, many parts of the
+	 * driver do a read/modify/write of a register value without
+	 * confirming that the read succeeded. Writing back modified garbage
+	 * like this can fully wedge the adapter, requiring a power cycle.
+	 */
+	rtl_set_inaccessible(tp);
+
+	/* If probe hasn't yet finished, then we'll request a retry of the
+	 * whole probe routine if we get any control transfer errors. We
+	 * never have to clear this bit since we free/reallocate the whole "tp"
+	 * structure if we retry probe.
+	 */
+	if (!test_bit(PROBED_WITH_NO_ERRORS, &tp->flags)) {
+		set_bit(PROBE_SHOULD_RETRY, &tp->flags);
+		return ret;
+	}
+
+	/* Failing to access registers in pre-reset is not surprising since we
+	 * wouldn't be resetting if things were behaving normally. The register
+	 * access we do in pre-reset isn't truly mandatory--we're just reusing
+	 * the disable() function and trying to be nice by powering the
+	 * adapter down before resetting it. Thus, if we're in pre-reset,
+	 * we'll return right away and not try to queue up yet another reset.
+	 * We know the post-reset is already coming.
+	 */
+	if (test_bit(IN_PRE_RESET, &tp->flags))
+		return ret;
+
+	if (tp->reg_access_reset_count < REGISTER_ACCESS_MAX_RESETS) {
+		usb_queue_reset_device(tp->intf);
+		tp->reg_access_reset_count++;
+	} else if (tp->reg_access_reset_count == REGISTER_ACCESS_MAX_RESETS) {
+		dev_err(&udev->dev,
+			"Tried to reset %d times; giving up.\n",
+			REGISTER_ACCESS_MAX_RESETS);
+	}
+
+	return ret;
+}
+
 static
 int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data)
 {
@@ -1210,9 +1305,10 @@ int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data)
 	if (!tmp)
 		return -ENOMEM;
 
-	ret = usb_control_msg(tp->udev, tp->pipe_ctrl_in,
-			      RTL8152_REQ_GET_REGS, RTL8152_REQT_READ,
-			      value, index, tmp, size, USB_CTRL_GET_TIMEOUT);
+	ret = r8152_control_msg(tp, tp->pipe_ctrl_in,
+				RTL8152_REQ_GET_REGS, RTL8152_REQT_READ,
+				value, index, tmp, size, "read");
+
 	if (ret < 0)
 		memset(data, 0xff, size);
 	else
@@ -1233,9 +1329,9 @@ int set_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data)
 	if (!tmp)
 		return -ENOMEM;
 
-	ret = usb_control_msg(tp->udev, tp->pipe_ctrl_out,
-			      RTL8152_REQ_SET_REGS, RTL8152_REQT_WRITE,
-			      value, index, tmp, size, USB_CTRL_SET_TIMEOUT);
+	ret = r8152_control_msg(tp, tp->pipe_ctrl_out,
+				RTL8152_REQ_SET_REGS, RTL8152_REQT_WRITE,
+				value, index, tmp, size, "write");
 
 	kfree(tmp);
 
@@ -1244,10 +1340,8 @@ int set_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data)
 
 static void rtl_set_unplug(struct r8152 *tp)
 {
-	if (tp->udev->state == USB_STATE_NOTATTACHED) {
-		set_bit(RTL8152_INACCESSIBLE, &tp->flags);
-		smp_mb__after_atomic();
-	}
+	if (tp->udev->state == USB_STATE_NOTATTACHED)
+		rtl_set_inaccessible(tp);
 }
 
 static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size,
@@ -8262,7 +8356,7 @@ static int rtl8152_pre_reset(struct usb_interface *intf)
 	struct r8152 *tp = usb_get_intfdata(intf);
 	struct net_device *netdev;
 
-	if (!tp)
+	if (!tp || !test_bit(PROBED_WITH_NO_ERRORS, &tp->flags))
 		return 0;
 
 	netdev = tp->netdev;
@@ -8277,7 +8371,9 @@ static int rtl8152_pre_reset(struct usb_interface *intf)
 	napi_disable(&tp->napi);
 	if (netif_carrier_ok(netdev)) {
 		mutex_lock(&tp->control);
+		set_bit(IN_PRE_RESET, &tp->flags);
 		tp->rtl_ops.disable(tp);
+		clear_bit(IN_PRE_RESET, &tp->flags);
 		mutex_unlock(&tp->control);
 	}
 
@@ -8290,9 +8386,11 @@ static int rtl8152_post_reset(struct usb_interface *intf)
 	struct net_device *netdev;
 	struct sockaddr sa;
 
-	if (!tp)
+	if (!tp || !test_bit(PROBED_WITH_NO_ERRORS, &tp->flags))
 		return 0;
 
+	rtl_set_accessible(tp);
+
 	/* reset the MAC address in case of policy change */
 	if (determine_ethernet_addr(tp, &sa) >= 0) {
 		rtnl_lock();
@@ -9494,17 +9592,29 @@ static u8 __rtl_get_hw_ver(struct usb_device *udev)
 	__le32 *tmp;
 	u8 version;
 	int ret;
+	int i;
 
 	tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
 	if (!tmp)
 		return 0;
 
-	ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
-			      RTL8152_REQ_GET_REGS, RTL8152_REQT_READ,
-			      PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp),
-			      USB_CTRL_GET_TIMEOUT);
-	if (ret > 0)
-		ocp_data = (__le32_to_cpu(*tmp) >> 16) & VERSION_MASK;
+	/* Retry up to 3 times in case there is a transitory error. We do this
+	 * since retrying a read of the version is always safe and this
+	 * function doesn't take advantage of r8152_control_msg().
+	 */
+	for (i = 0; i < 3; i++) {
+		ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+				      RTL8152_REQ_GET_REGS, RTL8152_REQT_READ,
+				      PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp),
+				      USB_CTRL_GET_TIMEOUT);
+		if (ret > 0) {
+			ocp_data = (__le32_to_cpu(*tmp) >> 16) & VERSION_MASK;
+			break;
+		}
+	}
+
+	if (i != 0 && ret > 0)
+		dev_warn(&udev->dev, "Needed %d retries to read version\n", i);
 
 	kfree(tmp);
 
@@ -9603,25 +9713,14 @@ static bool rtl8152_supports_lenovo_macpassthru(struct usb_device *udev)
 	return 0;
 }
 
-static int rtl8152_probe(struct usb_interface *intf,
-			 const struct usb_device_id *id)
+static int rtl8152_probe_once(struct usb_interface *intf,
+			      const struct usb_device_id *id, u8 version)
 {
 	struct usb_device *udev = interface_to_usbdev(intf);
 	struct r8152 *tp;
 	struct net_device *netdev;
-	u8 version;
 	int ret;
 
-	if (intf->cur_altsetting->desc.bInterfaceClass != USB_CLASS_VENDOR_SPEC)
-		return -ENODEV;
-
-	if (!rtl_check_vendor_ok(intf))
-		return -ENODEV;
-
-	version = rtl8152_get_version(intf);
-	if (version == RTL_VER_UNKNOWN)
-		return -ENODEV;
-
 	usb_reset_device(udev);
 	netdev = alloc_etherdev(sizeof(struct r8152));
 	if (!netdev) {
@@ -9784,10 +9883,20 @@ static int rtl8152_probe(struct usb_interface *intf,
 	else
 		device_set_wakeup_enable(&udev->dev, false);
 
+	/* If we saw a control transfer error while probing then we may
+	 * want to try probe() again. Consider this an error.
+	 */
+	if (test_bit(PROBE_SHOULD_RETRY, &tp->flags))
+		goto out2;
+
+	set_bit(PROBED_WITH_NO_ERRORS, &tp->flags);
 	netif_info(tp, probe, netdev, "%s\n", DRIVER_VERSION);
 
 	return 0;
 
+out2:
+	unregister_netdev(netdev);
+
 out1:
 	tasklet_kill(&tp->tx_tl);
 	cancel_delayed_work_sync(&tp->hw_phy_work);
@@ -9796,10 +9905,46 @@ static int rtl8152_probe(struct usb_interface *intf,
 	rtl8152_release_firmware(tp);
 	usb_set_intfdata(intf, NULL);
 out:
+	if (test_bit(PROBE_SHOULD_RETRY, &tp->flags))
+		ret = -EAGAIN;
+
 	free_netdev(netdev);
 	return ret;
 }
 
+#define RTL8152_PROBE_TRIES	3
+
+static int rtl8152_probe(struct usb_interface *intf,
+			 const struct usb_device_id *id)
+{
+	u8 version;
+	int ret;
+	int i;
+
+	if (intf->cur_altsetting->desc.bInterfaceClass != USB_CLASS_VENDOR_SPEC)
+		return -ENODEV;
+
+	if (!rtl_check_vendor_ok(intf))
+		return -ENODEV;
+
+	version = rtl8152_get_version(intf);
+	if (version == RTL_VER_UNKNOWN)
+		return -ENODEV;
+
+	for (i = 0; i < RTL8152_PROBE_TRIES; i++) {
+		ret = rtl8152_probe_once(intf, id, version);
+		if (ret != -EAGAIN)
+			break;
+	}
+	if (ret == -EAGAIN) {
+		dev_err(&intf->dev,
+			"r8152 failed probe after %d tries; giving up\n", i);
+		return -ENODEV;
+	}
+
+	return ret;
+}
+
 static void rtl8152_disconnect(struct usb_interface *intf)
 {
 	struct r8152 *tp = usb_get_intfdata(intf);
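
The probe split in the last hunks (a retryable rtl8152_probe_once() wrapped by a bounded loop in rtl8152_probe()) can be shown as a small user-space sketch. This is an illustration only: struct flaky_probe, flaky_probe_once(), probe_with_retries(), probe_retry_demo() and PROBE_TRIES are hypothetical names, and the probe body is reduced to a counter that reports -EAGAIN a configurable number of times.

```c
#include <assert.h>
#include <errno.h>

#define PROBE_TRIES 3	/* mirrors RTL8152_PROBE_TRIES */

/* Hypothetical probe body: fails transiently a configurable number of times. */
struct flaky_probe {
	int failures_left;	/* attempts that will still report -EAGAIN */
	int calls;		/* how many times the body has run */
};

static int flaky_probe_once(struct flaky_probe *p)
{
	p->calls++;
	if (p->failures_left > 0) {
		p->failures_left--;
		return -EAGAIN;	/* "a control transfer failed, try again" */
	}
	return 0;
}

/* Retry wrapper in the shape of the new rtl8152_probe(). */
static int probe_with_retries(struct flaky_probe *p)
{
	int ret = -EAGAIN;
	int i;

	for (i = 0; i < PROBE_TRIES; i++) {
		ret = flaky_probe_once(p);
		if (ret != -EAGAIN)
			break;		/* success or a non-retryable error */
	}

	/* -EAGAIN stays internal; callers see -ENODEV once retries run out. */
	return ret == -EAGAIN ? -ENODEV : ret;
}

/* Two transient failures are absorbed; PROBE_TRIES failures exhaust the budget. */
static void probe_retry_demo(void)
{
	struct flaky_probe ok = { .failures_left = 2 };
	struct flaky_probe bad = { .failures_left = PROBE_TRIES };

	assert(probe_with_retries(&ok) == 0);
	assert(probe_with_retries(&bad) == -ENODEV);
}
```

Keeping -EAGAIN as a private signal between the two functions means the USB core only ever sees a final result, either success or -ENODEV.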
