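Fixing frequent system hangs with USB 3.0 external hard drives on Linux
Problem
On a Rock Pi (running Ubuntu 20.04), reading data from large-capacity hard drives connected over USB 3.0 frequently hangs the system, and the only way to recover is to reboot the machine.
Some of the log messages observed: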
[ 529.728684] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 529.729554] xhci-hcd xhci-hcd.9.auto: @00000000db5ca2b0 00000000 00000000 1b000000 1a078001
[ 533.658284] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 533.659159] xhci-hcd xhci-hcd.9.auto: @00000000db5ca700 00000000 00000000 1b000000 18078001
[ 536.888700] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 536.889570] xhci-hcd xhci-hcd.9.auto: @00000000db5ca340 00000000 00000000 1b000000 1a078001
[ 547.392564] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 547.393418] xhci-hcd xhci-hcd.9.auto: @00000000db5ca990 00000000 00000000 1b000000 18078000
[ 551.313637] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 551.314494] xhci-hcd xhci-hcd.9.auto: @00000000db5caf50 00000000 00000000 1b000000 18078000
[ 567.382210] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 567.383086] xhci-hcd xhci-hcd.9.auto: @00000000db5cab30 00000000 00000000 1b000000 18078000
[ 581.136868] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 581.137739] xhci-hcd xhci-hcd.9.auto: @00000000db5cae50 00000000 00000000 1b000000 18078000
[ 585.063728] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 585.064582] xhci-hcd xhci-hcd.9.auto: @00000000db5ca410 00000000 00000000 1b000000 18078001
[ 598.768816] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 598.769687] xhci-hcd xhci-hcd.9.auto: @00000000db5caa20 00000000 00000000 1b000000 18078001
[ 602.693613] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 602.694491] xhci-hcd xhci-hcd.9.auto: @00000000db5cafc0 00000000 00000000 1b000000 18078001
[ 616.468609] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 616.469481] xhci-hcd xhci-hcd.9.auto: @00000000db5caf50 00000000 00000000 1b000000 18078001
[ 620.399073] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 620.399952] xhci-hcd xhci-hcd.9.auto: @00000000db5ca510 00000000 00000000 1b000000 18078000
[ 634.432399] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 634.433254] xhci-hcd xhci-hcd.9.auto: @00000000db5cab20 00000000 00000000 1b000000 18078000
[ 638.353145] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 638.353998] xhci-hcd xhci-hcd.9.auto: @00000000db5ca0e0 00000000 00000000 1b000000 18078001
[ 654.242082] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 654.242945] xhci-hcd xhci-hcd.9.auto: @00000000db5ca140 00000000 00000000 1b000000 18078000
[ 684.977605] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 684.978471] xhci-hcd xhci-hcd.9.auto: @00000000db5cad50 00000000 00000000 1b000000 18078000
[ 684.979676] xhci-hcd xhci-hcd.9.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 684.980530] xhci-hcd xhci-hcd.9.auto: @00000000db5cad70 00000000 00000000 1b000000 18058000
[ 690.469456] usb 6-1.1.1: device not accepting address 59, error -71
[ 762.191000] xhci-hcd xhci-hcd.8.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 762.191858] xhci-hcd xhci-hcd.8.auto: @00000000eeb957d0 00000000 00000000 1b000000 11078001
This eventually escalates to a fatal kernel panic:
[ 762.907283] blk_update_request: I/O error, dev sdl, sector 6511155816
[ 840.425331] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[ 840.425933] Not tainted 4.4.154-112-rockchip-gfdb18c8bab17 #1
[ 840.426499] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 840.427479] Kernel panic - not syncing: hung_task: blocked tasks
[ 840.428026] CPU: 4 PID: 39 Comm: khungtaskd Not tainted 4.4.154-112-rockchip-gfdb18c8bab17 #1
[ 840.428782] Hardware name: ROCK PI 4B (DT)
[ 840.429153] Call trace:
[ 840.429387] [<ffffff80080888d8>] dump_backtrace+0x0/0x220
[ 840.429872] [<ffffff8008088b1c>] show_stack+0x24/0x30
[ 840.430334] [<ffffff800856ebec>] dump_stack+0x98/0xc0
[ 840.430798] [<ffffff80081724bc>] panic+0xe8/0x23c
[ 840.431226] [<ffffff8008139e90>] proc_dohung_task_timeout_secs+0x0/0x7c
[ 840.431817] [<ffffff80080ba310>] kthread+0xe0/0xf0
[ 840.432252] [<ffffff8008082ef0>] ret_from_fork+0x10/0x20
[ 840.432748] CPU0: stopping
[ 840.433037] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.154-112-rockchip-gfdb18c8bab17 #1
[ 840.433784] Hardware name: ROCK PI 4B (DT)
[ 840.434157] Call trace:
[ 840.434413] [<ffffff80080888d8>] dump_backtrace+0x0/0x220
[ 840.434913] [<ffffff8008088b1c>] show_stack+0x24/0x30
[ 840.435377] [<ffffff800856ebec>] dump_stack+0x98/0xc0
[ 840.435841] [<ffffff800808db74>] handle_IPI+0x1d0/0x248
[ 840.436313] [<ffffff8008080f24>] gic_handle_irq+0x17c/0x180
[ 840.436818] Exception stack(0xffffff8009243d70 to 0xffffff8009243ea0)
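To gauge how often these symptoms occur, the kernel log can be summarized with a small shell helper. This is only a minimal sketch: the function name summarize_usb_symptoms is made up for illustration, and it matches just the two message patterns that appear in the logs above.

```shell
# Summarize the two symptoms shown above from a kernel log read on stdin.
# summarize_usb_symptoms is a hypothetical helper name, not an existing tool.
summarize_usb_symptoms() {
    awk '
        # xHCI controller complaining about a transfer event
        /ERROR Transfer event for disabled endpoint or incorrect stream ring/ { xhci++ }
        # hung-task detector firing ("blocked for more than 120 seconds")
        /blocked for more than [0-9]+ seconds/ { hung++ }
        END { printf "xhci_errors=%d hung_tasks=%d\n", xhci, hung }
    '
}
```

Typical use would be dmesg | summarize_usb_symptoms, run before and after applying a workaround to compare the counts.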
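Other users have run into similar problems: see here^1 and here^2. In the second thread, the author suspects that the USB-related drivers on the system are at fault.
The temporary workaround is to disable the UAS kernel driver, at the cost of lower read/write throughput.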
UAS
uas is the kernel driver for USB Attached SCSI (UAS). The protocol it implements is the USB Attached SCSI Protocol (UASP), part of the USB Mass Storage Class (MSC).
For background on the USB protocols involved, see ^3.
List the modules currently loaded by the kernel:
$ lsmod
Module Size Used by
xt_conntrack 16384 1
ipt_MASQUERADE 16384 1
nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE
nf_conntrack_netlink 36864 0
xt_addrtype 16384 2
iptable_filter 16384 1
iptable_nat 16384 1
nf_conntrack_ipv4 24576 2
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_nat_ipv4 16384 1 iptable_nat
nf_nat 20480 2 nf_nat_ipv4,nf_nat_masquerade_ipv4
nf_conntrack 126976 6 nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
overlay 45056 1
binfmt_misc 20480 1
uas 20480 0
usb_storage 61440 15 uas
bcmdhd 1183744 0
autofs4 40960 3
The uas module is loaded.
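Scanning the list by eye works, but the check can also be scripted. The sketch below assumes lsmod-style output on stdin; module_loaded is a hypothetical helper name. It compares only the first column, so the usb_storage line, which mentions uas in its "Used by" column, does not produce a false match.

```shell
# Succeed (exit status 0) if the named module appears in the first column
# of lsmod-style output read from stdin.
# module_loaded is a hypothetical helper name.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

For example: lsmod | module_loaded uas && echo "uas is loaded".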
List the system's USB devices:
$ lsusb -t
/: Bus 08.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
/: Bus 07.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
/: Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 5, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 2: Dev 7, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 3: Dev 10, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 4: Dev 13, If 0, Class=Mass Storage, Driver=uas, 5000M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 8, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 2: Dev 11, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 3: Dev 14, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 4: Dev 17, If 0, Class=Mass Storage, Driver=uas, 5000M
        |__ Port 3: Dev 6, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 22, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 2: Dev 12, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 3: Dev 16, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 4: Dev 23, If 0, Class=Mass Storage, Driver=uas, 5000M
        |__ Port 4: Dev 9, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 15, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 2: Dev 18, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 3: Dev 19, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 4: Dev 20, If 0, Class=Mass Storage, Driver=uas, 5000M
/: Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/4p, 480M
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ohci-platform/1p, 12M
/: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ohci-platform/1p, 12M
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=ehci-platform/1p, 480M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-platform/1p, 480M
The many entries reading Class=Mass Storage, Driver=uas, 5000M are the devices that use uas as their driver.
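With this many disks attached, counting the entries by hand is tedious. The following sketch tallies mass-storage entries per driver from lsusb -t output on stdin. The helper name tally_storage_drivers is an assumption for illustration, and the parsing relies on the "Class=Mass Storage, Driver=<name>," layout seen in the listing above.

```shell
# Tally "Class=Mass Storage" entries by driver from `lsusb -t` output on stdin.
# tally_storage_drivers is a hypothetical helper name.
tally_storage_drivers() {
    awk -F'Driver=' '
        /Class=Mass Storage/ {
            split($2, parts, ",")   # parts[1] holds the driver name
            count[parts[1]]++
        }
        END { for (d in count) printf "%s %d\n", d, count[d] }
    ' | sort                        # awk iteration order is unspecified
}
```

Usage: lsusb -t | tally_storage_drivers prints one line per driver followed by the number of mass-storage devices bound to it.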
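One option is to disable uas completely on the system:
$ sudo vim /etc/modprobe.d/blacklist.conf
Append the following line at the end of the file:
blacklist uas
Save the file and reboot the machine.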
However, testing showed that after this change the devices that previously used uas could not find any driver and stopped working.
Instead, these devices need to be explicitly assigned the more basic usb-storage module as their driver, with uas disabled for them.
First, collect the idVendor:idProduct pairs of the devices:
$ lsusb | awk '{print $6}' | sort -u
05e3:0610
05e3:0626
152d:0578
152d:9561
1d6b:0001
1d6b:0002
1d6b:0003
In the /etc/modprobe.d/ directory, create a file named disable-uas.conf (the name is arbitrary as long as it ends in .conf):
$ sudo vim /etc/modprobe.d/disable-uas.conf
Add:
options usb-storage quirks=05e3:0610:u,05e3:0626:u,152d:0578:u,152d:9561:u,1d6b:0001:u,1d6b:0002:u,1d6b:0003:u
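Typing a long quirks list by hand invites omissions, so the options line can also be generated from the ID list. Each entry has the form idVendor:idProduct:flags, where the u flag tells usb-storage to ignore UAS for that device. The sketch below is a hypothetical helper (build_quirks_line is not an existing tool) that expects one idVendor:idProduct pair per line on stdin.

```shell
# Turn idVendor:idProduct pairs (one per line on stdin) into a modprobe
# options line that sets the "u" (ignore UAS) quirk for each device.
# build_quirks_line is a hypothetical helper name.
build_quirks_line() {
    sort -u |                 # drop duplicate IDs
        sed 's/$/:u/' |       # append the "u" flag to every ID
        paste -sd, - |        # join all entries with commas
        sed 's/^/options usb-storage quirks=/'
}
```

For example, lsusb | awk '{print $6}' | sort -u | build_quirks_line prints a line that can be pasted into the .conf file. The list could be narrowed to the disk enclosures only, since hubs never bind to uas.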
Then rebuild the initramfs and reboot the machine:
$ sudo update-initramfs -u
$ sudo reboot
After the reboot, check the devices again with lsusb -t:
/: Bus 08.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
/: Bus 07.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
/: Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 5, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 2: Dev 7, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 3: Dev 10, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 4: Dev 13, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 8, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 2: Dev 11, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 3: Dev 14, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 4: Dev 17, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 3: Dev 6, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 22, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 2: Dev 12, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 3: Dev 16, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 4: Dev 23, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 4: Dev 9, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 15, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 2: Dev 18, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 3: Dev 19, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
            |__ Port 4: Dev 20, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/: Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/4p, 480M
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ohci-platform/1p, 12M
/: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ohci-platform/1p, 12M
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=ehci-platform/1p, 480M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-platform/1p, 480M
All devices that previously used the uas driver now use usb-storage.
The kernel log, filtered with dmesg | grep 'scsi host' -A5 -B5, shows the same thing:
[ 5.434458] usb 6-1.1.3: Manufacturer: BIAZE
[ 5.434464] usb 6-1.1.3: SerialNumber: 000000000799
[ 5.463418] usb 6-1.1.1: UAS is blacklisted for this device, using usb-storage instead
[ 5.463436] usb-storage 6-1.1.1:1.0: USB Mass Storage device detected
[ 5.465385] usb-storage 6-1.1.1:1.0: Quirks match for vid 152d pid 9561: 800000
[ 5.469635] scsi host0: usb-storage 6-1.1.1:1.0
[ 5.470394] usb 6-1.1.2: UAS is blacklisted for this device, using usb-storage instead
[ 5.470410] usb-storage 6-1.1.2:1.0: USB Mass Storage device detected
[ 5.473329] usb-storage 6-1.1.2:1.0: Quirks match for vid 152d pid 9561: 800000
...
...
UAS is disabled for these devices, and they fall back to usb-storage.
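Besides reading dmesg, the active quirks string can be inspected directly: on kernels that expose the quirks parameter of usb-storage through sysfs, it is readable at /sys/module/usb_storage/parameters/quirks. The sketch below checks whether a given device ID carries exactly the u flag there. The helper name quirk_active is made up for illustration, and the optional second argument exists only so the function can be pointed at another file.

```shell
# Succeed if the given vid:pid appears with exactly the "u" flag in the
# usb-storage quirks string. The second argument (file to read) defaults
# to the sysfs parameter file. quirk_active is a hypothetical helper name.
quirk_active() {
    id=$1
    file=${2:-/sys/module/usb_storage/parameters/quirks}
    # Split the comma-separated list into lines, then match a whole line.
    tr ',' '\n' < "$file" | grep -qxF "$id:u"
}
```

For example: quirk_active 152d:9561 && echo "UAS is ignored for 152d:9561". An entry that combines u with other flags would not match this strict comparison.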
This resolves the problem. On other systems, such as Linux on a Raspberry Pi, if the method above has no effect, try the following configuration instead:
$ sudo vim /boot/cmdline.txt
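Append the following to the end of the existing line (the file must remain a single line; note the dot between the module name and the parameter):
usb-storage.quirks=05e3:0610:u,05e3:0626:u,152d:0578:u,152d:9561:u,1d6b:0001:u,1d6b:0002:u,1d6b:0003:u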
Then reboot the machine.
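Because a kernel command line has to stay on one line, a stray newline added in an editor can break the setting. On the kernel command line a module parameter is written as module.parameter=value. The sketch below appends such a parameter to the first line of a file and does nothing if it is already present. The helper name append_cmdline_param is hypothetical, it assumes the parameter contains no | or & characters, and on a real system it would need root privileges and a backup of the file first.

```shell
# Append a kernel parameter to the first line of a command-line file,
# keeping everything on one line. Does nothing if the parameter is
# already present. append_cmdline_param is a hypothetical helper name.
append_cmdline_param() {
    file=$1
    param=$2
    if grep -qF -- "$param" "$file"; then
        return 0
    fi
    # Rewrite via a temporary file instead of relying on `sed -i`.
    sed "1s|\$| $param|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, run as root: append_cmdline_param /boot/cmdline.txt 'usb-storage.quirks=152d:9561:u'.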
References
1 https://forum.pine64.org/showthread.php?tid=5832
2 https://forum.pine64.org/showthread.php?tid=5137
3 https://www.crifan.com/files/doc/docbook/usb_disk_driver/release/html/usb_disk_driver.html?spm=a2c6h.12873639.0.0.4f412263nKhXtS#ch02_msc_basic
4 https://leo.leung.xyz/wiki/Disable_UAS
5 https://askubuntu.com/questions/1266804/blacklist-uas-drivers-in-kernel