The RAID That Almost Ate Christmas
Date: December 29, 2025
Duration: Multiple sessions across 4 days
Issue: Synology NAS failure, data recovery needed
Root Cause: Btrfs corruption + USB transport instability
The Setup
Dad’s Synology NAS died. Four 6TB HGST drives in SHR (basically RAID5). Years of family photos, documents, and media. The NAS wouldn’t boot, and every attempt to access the volume failed.
I grabbed the drives and brought them to my Gentoo workstation. Time for some data archaeology.
The Layer Cake
Synology doesn’t just put files on a disk. Oh no. That would be too easy.
Physical disks (/dev/sdX)
└── Partition sdX5 (linux_raid_member)
└── mdadm RAID5 (/dev/md127)
└── LVM Physical Volume
└── Volume Group: vg1
└── Logical Volume: vg1-volume_1
└── Filesystem: Btrfs
Five layers. Each one had to work before I could see a single file.
Phase 1: Disk Discovery
Connected all four drives via USB docking station. The system saw them immediately.
lsblk -f
The sdX5 partitions showed as linux_raid_member with label Redcone:2. Good sign — the RAID metadata was intact.
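Before assembling anything, it's worth dumping each member's RAID superblock and checking that they all agree on array UUID, level, and event count. A minimal sketch, assuming the data partitions landed at sdb5 through sde5 on this machine (adjust to whatever lsblk shows):

```bash
# Read each Synology data partition's md superblock (read-only query).
for part in /dev/sd{b,c,d,e}5; do
    echo "=== $part ==="
    sudo mdadm --examine "$part" | grep -E 'UUID|Level|Devices|Events|State'
done
```

If the event counts diverge or one member reports a different array UUID, that's the moment to stop and think, not after assembly.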
Phase 2: RAID Assembly
First problem: /proc/mdstat returned “No such file or directory.”
The kernel module wasn’t loaded.
sudo modprobe md_mod
Now to assemble the RAID. The key word here is read-only. Never, ever touch source data during recovery.
sudo mdadm --assemble --readonly /dev/md127 \
/dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5
cat /proc/mdstat
# md127 : active (read-only) raid5 [UUUU]
All four drives present. Array intact.
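With the array up, mdadm --detail gives a fuller picture than /proc/mdstat: RAID level, overall state, and which partition sits in which slot.

```bash
# Confirm array health and membership before moving up the stack (read-only query).
sudo mdadm --detail /dev/md127
# Look for: Raid Level : raid5, Active Devices : 4, no failed or spare devices,
# and all four sdX5 partitions listed as members at the bottom.
```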
Phase 3: LVM Activation
sudo pvscan
sudo vgscan
sudo vgchange -ay
Got a warning about “old PV header” — typical with Synology’s modified LVM. Ignored it.
lsblk | grep vg1-volume_1
The logical volume appeared. Three layers down, two to go.
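A quick sanity check that LVM sees what it should, and where the device node ended up:

```bash
# List the logical volume the scans found, plus the device-mapper path
# the filesystem tools will use next.
sudo lvs -o vg_name,lv_name,lv_size,lv_path vg1
# Expect a single LV in vg1, exposed at /dev/mapper/vg1-volume_1.
```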
Phase 4: The Btrfs Problem
This is where things got ugly.
Tried mounting:
sudo mount -o ro /dev/mapper/vg1-volume_1 /mnt/recovery
Failed. Btrfs complained about corrupted metadata.
The temptation to run btrfs check --repair was real. But that command modifies data. On a recovery disk. With irreplaceable family photos.
Nope.
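Before reaching for commercial software, two non-destructive moves are worth a shot on a sick Btrfs: the rescue mount options, and btrfs restore, which pulls files out without mounting at all. A sketch, assuming a kernel new enough for the rescue= syntax; neither writes to the source, and neither is guaranteed to get past this particular corruption:

```bash
# 1. Retry the mount with backup roots and best-effort recovery, still read-only.
#    (Older kernels spell this -o ro,usebackuproot.)
sudo mount -o ro,rescue=usebackuproot /dev/mapper/vg1-volume_1 /mnt/recovery

# 2. If mounting still fails, btrfs restore reads the filesystem offline.
#    -D is a dry run: it reports what it would recover without writing anything,
#    so the destination argument is never touched.
sudo btrfs restore -D -v /dev/mapper/vg1-volume_1 /mnt/recovery
# Drop -D and point the last argument at a destination with enough free space
# to actually extract.
```

If both strike out, it's time for heavier tooling.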
Phase 5: The Destination Disaster
Borrowed a 24TB USB HDD for the recovery destination. Formatted ext4, mounted at /mnt/recovery_dest.
Started the copy. Watched it progress. Then:
ls /mnt/recovery_dest
# Input/output error
The destination drive had disconnected. Mid-copy. Without warning.
Checked dmesg:
error -71
USB disconnect
uas_eh_abort_handler
EXT4 journal aborted
Remounting filesystem read-only
USB docks choke under sustained write load. The cable was fine. The dock was fine. But together, under continuous data transfer, they gave up.
This wasn’t ext4’s fault. This was transport failure.
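If a dock is the only destination on hand, one known workaround is telling the kernel to skip the UAS driver for that specific device and fall back to plain usb-storage, which is slower but often steadier under load. A sketch; abcd:1234 is a placeholder for the dock's real vendor:product ID from lsusb:

```bash
# Find the dock's USB vendor:product ID.
lsusb
# e.g. "Bus 002 Device 003: ID abcd:1234 SomeVendor USB 3.0 Dock"

# The 'u' quirk means "don't bind this device to the uas driver".
echo "options usb-storage quirks=abcd:1234:u" | sudo tee /etc/modprobe.d/dock-quirk.conf
# Takes effect once the modules reload; if usb-storage is built into the kernel,
# pass usb-storage.quirks=abcd:1234:u on the kernel command line instead.
```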
Phase 6: UFS Explorer
Linux tools gave me block-level access. But for file-level recovery from corrupted Btrfs? I needed something stronger.
UFS Explorer understands the full stack: MD RAID → LVM → Btrfs. It can reconstruct directory trees without mounting.
Opened UFS Explorer, pointed it at the assembled RAID. It found the Btrfs volume. Started browsing directories.
Documents: visible. Photos: visible. All those family memories: right there.
Phase 7: The Copy (Take Two)
Switched to a USB NVMe enclosure with a Samsung 970 EVO Plus. Solid state, no moving parts, stable under sustained writes.
Started the priority copy:
- Documents folder first
- Family photos second
- Everything else after
This time, no disconnects. No errors. Just steady progress.
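It's worth keeping a second terminal on the kernel log during a multi-hour copy like this, watching for the same transport noise that killed the first attempt:

```bash
# Follow the kernel log and surface anything that smells like the earlier failure.
sudo dmesg --follow | grep -Ei 'usb|uas|i/o error|reset|read-only'

# Optional: confirm steady write throughput to the destination.
# (sdf is a placeholder for whatever node the NVMe enclosure got.)
iostat -dxm /dev/sdf 5
```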
The Lessons
Read-only is sacred. When recovering data, you are not allowed to write to source disks. Period. Use --readonly flags everywhere.
USB docks lie. They claim to support sustained data transfer. Under recovery workloads — hundreds of GB, continuous writes — they choke. Use NVMe or internal SATA when possible.
The layer cake matters. Synology’s storage stack is: disk → partition → mdadm → LVM → Btrfs. You must work through each layer in order. Skipping steps doesn’t work.
Don’t repair before extracting. btrfs check --repair sounds helpful. It’s not. It modifies metadata. On corrupted filesystems, that can destroy your last chance at recovery.
Professional tools exist for a reason. I spent a day trying to do this with native Linux utilities. UFS Explorer solved it in an hour.
The Status
| Component | Status |
|---|---|
| RAID assembly | Working |
| LVM activation | Working |
| Native Btrfs mount | Failed (corruption) |
| USB HDD destination | Failed (transport) |
| NVMe destination | Working |
| UFS Explorer recovery | Success |
Documents and photos recovered. Dad has his data back.
The drives are still sitting on my desk. I should probably ship them back. But first, maybe I’ll run a few more verification passes. Just to be safe.
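For those verification passes, the simplest thing that gives real confidence is a checksum manifest: hash everything once right after the copy, then re-read and compare before the drives go back. It proves the recovered tree is still readable and unchanged; it can't prove a match against the unmountable source. A sketch, with the destination mount point as an assumption:

```bash
# Build a checksum manifest of the recovered files (run once after the copy).
cd /mnt/recovery_dest
find . -type f -print0 | xargs -0 sha256sum > ~/recovery-manifest.sha256

# Any later verification pass: re-read every file and compare against the manifest.
cd /mnt/recovery_dest && sha256sum --check --quiet ~/recovery-manifest.sha256
```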