There is a particular kind of homelab hubris that strikes at 1am, when everything is quiet, the pool is healthy, and you think: how hard could it be to just add one more disk?
This is the story of what happened next.
The Setup
Igor is an AOOSTAR WTR PRO, a compact mini-PC built around the AMD Ryzen 7 5825U, with four internal SATA bays, two M.2 NVMe slots, and a pair of 2.5GbE ports. It is a capable little machine, repurposed as an Ubuntu server running ZFS 2.4.1 (chosen specifically over Proxmox, which ships an older ZFS) and sitting inside the Ankh-Morpork segment of a Discworld-themed homelab network.
Storage comes from a TerraMaster DAS enclosure with six 6TB spinning disks connected over USB 3.2, forming a ZFS RAIDZ2 pool called storage. Eighteen terabytes usable, 55% full, holding disk images alongside volumes exported via iSCSI. It has been working fine for general NAS workloads: light reads, occasional writes, the usual.
The four internal SATA bays held a set of 5.5TB Hitachi drives haunted by the machine’s previous life: a pair of degraded 64-slot MD RAID arrays that had lost 60 of their members, a dead ZFS pool called zpool2 with a missing fifth member and a hostid of zero, and a freshly-created RAID1 array that had only existed for six days. Igor was clearly a machine with history.
After clearing all of that out and wiping the Hitachi drives clean, the obvious question was: what now? Four free 5.5TB SATA drives, a pool that could use more space. The answer seemed obvious.
The Plan
ZFS 2.3 introduced RAIDZ expansion, the ability to add disks to an existing RAIDZ vdev one at a time with the pool staying online throughout. ZFS 2.4.1 is running on igor. The maths looked good.
zpool attach storage raidz2-0 /dev/disk/by-id/wwn-0x5000cca271e37b70
One command. The reflow started immediately. 18.3TB to copy, initial speed 111MB/s, estimated completion just under 48 hours. Seemed fine.
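That initial estimate is easy to sanity-check with back-of-envelope arithmetic (a sketch; the 18.3TB and 111MB/s figures are the ones reported at the start of the reflow, in the decimal units ZFS uses):

```python
# Back-of-envelope reflow ETA: bytes to copy divided by observed copy rate.
def reflow_eta_hours(bytes_to_copy: float, rate_bytes_per_s: float) -> float:
    return bytes_to_copy / rate_bytes_per_s / 3600

# 18.3 TB at the initial 111 MB/s
hours = reflow_eta_hours(18.3e12, 111e6)
print(f"{hours:.1f} hours")  # ≈ 45.8 hours, i.e. just under 48
```

The catch, of course, is that the estimate assumes the rate holds for two days straight.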
It was not fine.
What Actually Happened
Within hours, the first disk dropped off the pool:
sd 6:0:0:0: Device offlined - not ready after error recovery
usb 2-2.3.1: USB disconnect, device number 6
xhci_hcd 0000:06:00.3: Timeout while waiting for setup device command
The xHCI controller, the AMD Renoir/Cezanne USB 3.1 controller at 0000:06:00.3, was timing out trying to re-enumerate the disk. The pool went DEGRADED. Then SUSPENDED. Then the fun really started.
Over the next several hours, the pattern repeated without variation. The reflow would accelerate to 150-180MB/s, the USB controller would saturate, a disk would drop, the pool would suspend, the controller would hang trying to re-enumerate, a power cycle would be required, and the disks would come back with different /dev/sdX names. Then repeat.
The pool was built using raw sdX device paths rather than by-id paths, so every reboot reshuffled the device names and ZFS could not reconnect automatically. This was fixed with an export and re-import using zpool import -d /dev/disk/by-id, which at least made the device naming stable going forward.
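The fix itself is a short export/re-import dance (a sketch, assuming the pool is named storage as here; ZFS records whatever path each vdev was found under, so after the by-id import the pool survives USB re-enumeration reshuffling the /dev/sdX names):

```shell
# Export the pool, then re-import it scanning stable by-id paths
zpool export storage
zpool import -d /dev/disk/by-id storage
```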
But the drops kept coming. At one point, five of seven vdev members went UNAVAIL simultaneously. The pool accumulated four data errors. The reflow estimate ballooned from 48 hours to over 10 days as the speed cratered with each recovery cycle.
Why It Happens
The AOOSTAR WTR PRO has two AMD USB controllers, both visible in lspci:
06:00.3 USB controller: AMD Renoir/Cezanne USB 3.1
06:00.4 USB controller: AMD Renoir/Cezanne USB 3.1
All six TerraMaster disks are on Bus 002, hanging off 06:00.3. Bus 004 (the second controller) has no physical ports exposed on this machine: the controller shows up in lspci, but there is nowhere to plug anything into it.
The topology inside Bus 002 is a cascade:
Bus 002 (single 10Gbps controller)
└── Hub
    ├── Disk
    └── Hub
        ├── Disk
        ├── Disk
        ├── Disk
        └── Disk
Four disks are behind two cascaded hubs before even reaching the controller. All six share a single USB root.
The problem is not USB speed. 10Gbps is theoretically plenty. The problem is the UAS (USB Attached SCSI) command queue. Each disk can have up to 32 commands in flight simultaneously under UAS. Multiply that by six disks, add the reflow’s parallel read-from-all-disks-write-to-all-disks workload, and the xHCI controller’s command queue saturates. It starts timing out. Disks fall off.
There is a deeper hardware quirk at play too. The ASMedia controller chip inside the TerraMaster enclosure is powered by the USB bus itself, not by the enclosure’s external PSU. The external PSU powers the drives; the controller chip relies on igor’s USB port for its own power. On a mini-PC with modest USB bus power output, sustained heavy IO can starve the ASMedia chip and cause it to reset.
This is a known issue. The TerraMaster forums have threads about it going back years, across the D4-320, D5-300C, D6-320, and D8 Hybrid. The recommended fixes range from switching to the usb-storage driver (BOT mode instead of UAS, slower but far more stable) to using a powered USB hub to supply stable bus power to the ASMedia chip independently of the host USB port.
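Forcing BOT mode is done with a usb-storage quirks entry rather than a runtime driver swap. A minimal sketch, assuming the enclosure’s ASMedia bridge shows up as 174c:55aa in lsusb (a placeholder here; check your own VID:PID, it varies by chip):

```
# /etc/modprobe.d/disable-uas.conf
# "u" = IGNORE_UAS: bind this VID:PID to usb-storage (BOT) instead of uas
options usb-storage quirks=174c:55aa:u
```

After regenerating the initramfs and rebooting, lsusb -t should show Driver=usb-storage rather than Driver=uas for the enclosure.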
The Undetachable Hitchhiker
The twist that turned a recoverable situation into a long-term commitment: RAIDZ expansion cannot be cancelled.
Once you attach a disk to a RAIDZ vdev with zpool attach, it is part of that vdev permanently; the reflow must run to completion. zpool detach only applies to mirror and replacing vdevs, never to RAIDZ members. The Hitachi drive is in the pool, the reflow must complete, and 18.3TB needs to be rewritten.
cannot detach wwn-0x5000cca271e37b70: only applicable to mirror and replacing vdevs
So the choice became: nurse the reflow through to completion over USB, or find a way to make the USB connection stable enough to last the duration.
The Attempted Fixes
ZFS tunables helped somewhat. Reducing the async IO queue depth stopped the most aggressive saturation:
echo 2 > /sys/module/zfs/parameters/zfs_vdev_async_read_max_active
echo 2 > /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
echo 8 > /sys/module/zfs/parameters/zfs_vdev_max_active
Speed dropped from 180MB/s to around 36MB/s at the conservative end, but the pool stayed up longer between drops. The estimate stretched to nearly a week.
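Those echoes do not survive a reboot, which matters in a saga involving repeated power cycles. A sketch of persisting the same values via module options (parameter names as above):

```
# /etc/modprobe.d/zfs.conf -- apply the throttled queue depths at module load
options zfs zfs_vdev_async_read_max_active=2
options zfs zfs_vdev_async_write_max_active=2
options zfs zfs_vdev_max_active=8
```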
Switching cables and ports made no meaningful difference. The bottleneck is the controller, not the physical connection. A Thunderbolt cable on a non-Thunderbolt port is just a cable. A USB-C port that routes to the same overloaded xHCI controller is still the same overloaded xHCI controller.
A powered USB hub is the current plan. The RSHTECH RSH-A107D, eight ports with six at 10Gbps and a 24W 12V/2A power adapter, is on order. By routing the TerraMaster through a hub with its own 12V supply, the ASMedia chip inside the enclosure should get stable bus power regardless of what igor’s USB port is doing under load. Several users on the TerraMaster forums report this resolving their disconnection issues, with sustained speeds of 200MB/s or better.
Switching to BOT mode via the usb-storage driver is the fallback if the hub does not help. It serialises commands and prevents queue saturation, at the cost of roughly a 10x drop in throughput: around 30-50MB/s sustained. Slow, but stable. At 30MB/s the reflow would take around seven days. Painful but predictable.
Lessons
USB is not a storage bus. It works for light NAS workloads. It does not work for multi-disk sustained parallel IO lasting days. RAIDZ expansion, resilvers, and scrubs all fall into the category of workloads you should not attempt over USB unless you have verified your specific controller can handle it.
RAIDZ expansion is a one-way door. Before attaching a disk, be sure the host can sustain the reflow workload. There is no going back.
By-id paths matter. Building a pool using /dev/sdX paths on a machine where USB devices re-enumerate with different names on every reboot is a problem waiting to happen. Always use /dev/disk/by-id.
Know your hardware. The AOOSTAR WTR PRO is a solid mini-PC for a homelab node, but it has no PCIe slots for expansion, no second accessible USB controller, and modest USB bus power. Understanding these constraints before committing to a storage architecture would have saved several hours of pain.
The right architecture for igor is primary storage on the internal SATA drives, over proper SATA with no USB involved, and the TerraMaster USB pool reserved for secondary bulk storage and backup targets: the workloads it was actually designed for.
The hub arrives in a few hours. The reflow continues. The pool is, for now, ONLINE.
igor is a node in the ankh-morpork.discworld.network cluster, named for the Igors, the much-stitched-together assistants of Terry Pratchett’s Discworld series. The irony of a machine called Igor requiring repeated resuscitation is not lost.