Most homelab storage problems are not actually storage problems.
They are one of these instead:
- the wrong box has the wrong job
- the mount options are wrong
- the container user does not match the mounted files
- the network is the bottleneck, not the disks
- RAID got confused for backup
I learned every one of those the hard way.
So if I were rebuilding my storage stack from scratch, this is the model I would use:
- give every storage system a clear role
- make ownership explicit instead of hoping CIFS will do the right thing
- tune mount options for the workload you actually have
- measure before buying cache or faster hardware
- assume recovery day will happen and prepare for it now
That is the difference between “my NAS feels weird sometimes” and “I know exactly where to look when something breaks.”
Start with roles, not brands
The first big improvement in my lab was stopping the “everything goes everywhere” mindset.
Different storage systems are good at different jobs:
| Role | What I want | What I actually use |
|---|---|---|
| Media and large bulk storage | Big capacity, easy drive expansion, approachable UI | Unraid |
| Family backups and photo archive | Friendly appliance UX, snapshot/backup workflows | Synology |
| Active projects and builds | Low latency, predictable performance | Local NVMe / workstation storage |
| Offsite resilience | Another physical location or encrypted cloud copy | Cross-site sync + cloud backup |
In practice, that means my setup is split by purpose instead of by impulse:
- Unraid handles the big media footprint
- Synology handles backup-heavy and family-friendly workflows
- local NVMe handles active work and anything latency-sensitive
- cross-site sync and encrypted cloud backup cover the “what if a whole box or whole site disappears?” question
This matters because the tuning is different for each role.
A media share mounted read-only for Plex wants different settings than a writable project share, and both want different settings than a snapshot destination.
If you skip the role question, every mount and every container becomes a compromise.
Permissions are the first thing to make explicit
The most repeatable storage problem in a homelab is still permissions.
You mount a NAS share. The files are obviously there. The container can see the mount. And then the app acts like the directory is empty, read-only, or cursed.
That is usually not magic. It is a UID/GID mismatch.
The Synology/Docker trap
On Synology, a lot of Docker pain starts with this assumption:
“If I set PUID and PGID, the container will fix ownership for me.”
Sometimes the image tries. Synology often refuses the chown. Then you get scary startup logs and a container that still kind of runs but behaves badly.
The safer pattern is to fix ownership before the container starts.
```bash
# Create the app directory first
mkdir -p /volume1/docker/myapp

# Set ownership to the user the container should run as
sudo chown -R 1026:100 /volume1/docker/myapp
sudo chmod -R 755 /volume1/docker/myapp
```
Then match the container process to that user:
```yaml
services:
  app:
    image: your-image
    environment:
      - PUID=1026
      - PGID=100
    volumes:
      - /volume1/docker/myapp:/config
```
The important part is not the exact numbers. It is that the mount and the process agree.
If the mounted files belong to one identity and the container runs as another, the app will fail in weird ways that look like application bugs.
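A quick way to confirm they agree, assuming the compose service above is named app and is running:

```bash
# Numeric owner of the mounted files, as the host sees them
ls -ln /volume1/docker/myapp

# Can that UID/GID actually read the config path inside the container?
docker compose exec -u 1026:100 app ls -ln /config
```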
The CIFS version of the same problem
When the storage is remote and mounted over CIFS/SMB, you have another translation layer in the way.
By default, CIFS often presents files as `root:root` on the host, which can turn into `nobody:nogroup` inside containers. Plex, Jellyfin, or any other service running as a normal UID will not love that.
For mounted media shares, I now make ownership explicit in the mount itself:
```
# /etc/fstab — one line per entry (fstab does not support line continuations)
//192.168.20.8/Media  /mnt/media  cifs  credentials=/etc/samba/credentials,uid=1000,gid=1000,file_mode=0755,dir_mode=0755  0  0
```
And then I make the container match:
```yaml
services:
  plex:
    image: linuxserver/plex
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - /mnt/media:/media
```
That pairing fixes an enormous number of “the files exist but the app sees nothing” problems.
The permission checklist I actually use now
Before I blame Docker, Plex, or the NAS, I check these first:
```bash
# On the host
ls -la /mnt/media

# Inside the container
docker exec -it plex /bin/bash
ls -la /media

# On a Synology host
id your_username
```
What I want to confirm:
- the files really exist on the host
- the container sees the same directories
- the visible UID/GID inside the mount matches the app process
- the mount is readable and traversable
If those four things line up, the rest of the debugging gets much smaller.
Tune CIFS for the workload you actually have
The next big lesson was that default CIFS settings are often correct for office file servers and wrong for homelab media workloads.
That showed up most clearly in a Plex setup where the mount technically worked, but playback had tiny stutters and intermittent weirdness.
The revealing command was:
```bash
mount | grep Content
```
The problem mount had options like:
```
cache=strict,actimeo=1
```
That is incredibly conservative. It means the kernel keeps revalidating metadata constantly.
For media files that almost never change, that is wasted round-tripping.
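If you want a cleaner view than grepping `mount` output, `findmnt` can show every CIFS mount and the options the kernel actually applied:

```bash
# Effective options for all CIFS mounts on this host
findmnt -t cifs -o TARGET,SOURCE,OPTIONS
```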
Better defaults for read-only media
For a media share that is mostly read-only, this is much closer to what I want:
```
# /etc/fstab — again a single line per entry
//nas-ip/Media  /mnt/media  cifs  credentials=/etc/samba/credentials,uid=1000,gid=1000,file_mode=0755,dir_mode=0755,vers=3.0,cache=loose,actimeo=30,ro,noatime,hard,_netdev  0  0
```
Here is why each one matters:
`cache=loose`
- Better for files that do not change during playback
- Great for movies, TV, audiobooks, and other static media
`actimeo=30`
- Cuts down constant metadata revalidation
- Reduces the “why is this stuttering every second?” class of problem
`ro`
- If Plex only needs to read media, mount it read-only
- Fewer chances to do something dumb accidentally
`hard`
- Better reliability when the NAS is slow or briefly unhappy
- For streaming media, I want the mount to wait, not fail fast and return nonsense
`_netdev`
- Makes boot ordering less stupid
- Important whenever the mount depends on the network being up first
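Before committing anything to /etc/fstab, I test the options on a throwaway mount point; a sketch reusing the same share (adjust names and credentials to your setup):

```bash
# Temporary mount to validate options before making them permanent
sudo mkdir -p /mnt/media-test
sudo mount -t cifs //nas-ip/Media /mnt/media-test \
  -o credentials=/etc/samba/credentials,uid=1000,gid=1000,vers=3.0,cache=loose,actimeo=30,ro

# If listing and playback behave, tear it down and edit fstab
sudo umount /mnt/media-test
```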
When not to use those settings
Do not cargo-cult the media mount template onto every share.
For writable project shares, user home directories, or collaborative editing directories, aggressive caching and relaxed metadata behavior may not be what you want.
That is why the role question matters.
I use different mount behavior for:
- read-only media
- writable app data
- backup destinations
- human-edited document shares
The homelab version of “it depends” is still true. It just depends on the workload, not on superstition.
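For instance, here is roughly how that contrast looks in fstab (share names hypothetical): the media share gets relaxed caching, while a human-edited document share keeps strict revalidation so people see each other's changes quickly.

```
# Relaxed: read-only media that rarely changes
//nas-ip/Media  /mnt/media  cifs  credentials=/etc/samba/credentials,uid=1000,gid=1000,ro,cache=loose,actimeo=30,_netdev  0  0

# Strict: documents edited by humans on several machines
//nas-ip/Docs   /mnt/docs   cifs  credentials=/etc/samba/credentials,uid=1000,gid=1000,rw,cache=strict,actimeo=1,_netdev  0  0
```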
Measure before you buy cache
I almost wasted money on NVMe cache because the NAS felt slow.
What actually fixed the decision was benchmarking.
First, I tested the network with iperf3:
```bash
# On NAS
docker run -d --name iperf3 -p 5201:5201 networkstatic/iperf3 -s

# On desktop
iperf3 -c 10.42.0.10
```
The result was about 940 Mbps on a Gigabit connection.
That told me something important immediately:
For large sequential transfers, the network was already close to the real limit: 940 Mbps works out to roughly 117 MB/s, in the same ballpark as a single spinning disk's sequential throughput.
So if the workload is mostly:
- Plex streaming
- big backups
- large file transfers
then NVMe cache is usually not the first fix.
Where NVMe cache helps
It helps when the workload is random I/O heavy:
- VMs running from NAS storage
- databases
- large multi-user file serving
- containers doing lots of small-file reads and writes
Where NVMe cache does not save you
It does not magically improve:
- sequential media streaming over 1GbE
- backup jobs that are already network-bound
- a bad CIFS mount configuration
- a permission mismatch
- a user buying hardware because the UI felt sluggish once
If your Plex library stutters because `actimeo=1` is hammering metadata checks, no amount of NVMe cache will teach your mount options to behave.
If your app cannot read a share because it sees everything as `nobody:nogroup`, the cache is not the problem either.
Benchmark first. Then optimize the real bottleneck.
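When I do need a disk-side number, a short fio run separates random from sequential behavior. A sketch, assuming fio is installed and /mnt/media/fio-test is a scratch file fio may create (drop --direct=1 if the filesystem refuses O_DIRECT):

```bash
# Random 4K reads: the access pattern NVMe cache actually accelerates
fio --name=randread --filename=/mnt/media/fio-test --size=1G \
    --rw=randread --bs=4k --ioengine=libaio --direct=1 \
    --runtime=30 --time_based

# Sequential 1M reads: the pattern a 1GbE link usually caps first
fio --name=seqread --filename=/mnt/media/fio-test --size=1G \
    --rw=read --bs=1M --ioengine=libaio --direct=1 \
    --runtime=30 --time_based
```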
Distributed storage needs a sync plan, not just mounts
Once storage spans more than one site, the problem changes again.
Now you are not just thinking about shares. You are thinking about resilience.
The setup that makes the most sense to me is:
- keep active work local and fast
- keep bulk media where it belongs
- sync critical documents and backups somewhere else
- encrypt anything that leaves your physical control
For cross-site or cloud sync, the pattern I trust is explicit and boring:
```bash
# Example pattern
rclone sync ~/Documents remote-backup:/backup/docs
rclone sync remote-backup:/backup cloud_crypt:/backup --fast-list --transfers 16
```
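The cloud_crypt remote in that example is an rclone crypt remote. Roughly, the rclone.conf side looks like this, assuming an underlying remote named cloud (secrets elided; generate them with rclone obscure or rclone config):

```
[cloud_crypt]
type = crypt
remote = cloud:/backup
password = <obscured password, not plain text>
```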
The exact remote names will vary, but the lesson is stable:
Do not assume “I have a NAS” means “I have a backup.”
A second box that receives replicated data is a backup plan. A cloud copy encrypted before upload is a backup plan. A runbook that tells you what gets restored first is a backup plan.
A single storage appliance full of important data is just a single point of failure with a prettier dashboard.
Recovery starts before the failure
The hardest storage lesson is still the simplest one:
RAID is not a backup.
Redundancy helps with uptime. It does not save you from corruption, user error, multiple failures, or the wrong repair click at the wrong time.
I learned that during a Synology array failure where one bad drive became a bigger recovery problem and the filesystem stack turned into exactly the kind of mess you do not want to learn under pressure.
The parts that mattered most were not clever.
They were discipline.
The recovery rules I trust now
1. Stop touching the failing system
Do not keep clicking repair because you are scared. Do not reboot just to feel like something happened. Do not let panic write to the only copy you have.
2. Work from images, not originals
If the data matters, image the drives first.
```bash
dd if=/dev/sdb of=/backup/disk1.img bs=4M status=progress conv=sync,noerror
```
That `conv=sync,noerror` pair matters: `noerror` keeps the imaging job moving past bad sectors instead of dying on the first read error, and `sync` pads the failed blocks so offsets in the image stay aligned.
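If the drive is actively throwing read errors, GNU ddrescue is usually a better imaging tool than plain dd: it logs what it could not read and retries the bad regions last. A minimal sketch, using the same source and destination as above:

```bash
# First pass: grab everything readable, tracking progress in a map file
sudo ddrescue -d /dev/sdb /backup/disk1.img /backup/disk1.map

# Follow-up pass: retry only the bad areas a few more times
sudo ddrescue -d -r3 /dev/sdb /backup/disk1.img /backup/disk1.map
```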
3. Understand the storage stack
On appliances like Synology, the layers matter:
- partition
- RAID
- LVM
- filesystem
- files
If you try to skip straight to mounting the wrong device, you waste time and sometimes make things worse.
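On a Linux recovery host, you can walk those layers from the bottom up before mounting anything. A read-only inspection sketch (the partition number is a guess; it varies by appliance and model):

```bash
# Block devices and partitions
lsblk

# md RAID state — Synology builds its volumes on md
cat /proc/mdstat
sudo mdadm --examine /dev/sdb5   # data partition number varies

# LVM sitting on top of the array
sudo pvs && sudo vgs && sudo lvs
```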
4. Use the right recovery tool for the situation
I like open tools. I also like getting my data back.
When the layered Synology stack was too damaged for the normal Linux mount path to be safe and useful, UFS Explorer was worth paying for. That is not ideology. That is pragmatism.
5. Prioritize recovery by replaceability
Not every terabyte deserves the same urgency.
My order now is always:
- irreplaceable personal files first
- difficult-to-recreate work second
- easily reacquired media last
That prevents panic from making you recover the wrong 8 TB first.
The maintenance checklist that matters more than heroics
If you want fewer storage emergencies, do more boring maintenance.
This is the checklist I would actually hand to another homelabber:
Monthly
```bash
btrfs scrub start /volume1
```
Why:
- catches checksum and read issues before you meet them on the worst possible day
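When it finishes, the status subcommand reports what the scrub actually found:

```bash
btrfs scrub status /volume1
```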
Weekly
- review SMART health
- check failed disks, degraded arrays, and storage notifications
- confirm backups and sync jobs are still running
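On a plain Linux host, that review can be one command per drive; a minimal sketch (device names vary, and Synology exposes the same data in Storage Manager):

```bash
# Overall health verdict plus error counters and reallocated sectors
sudo smartctl -a /dev/sda
```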
Before deploying or remounting anything important
- verify UID/GID on the host
- verify UID/GID inside the container
- verify the mount options match the workload
- verify the app can actually read the path it needs
Before spending money on “performance”
- run `iperf3`
- run a real disk benchmark if needed
- identify whether the bottleneck is network, random I/O, sequential I/O, or bad config
Before disaster strikes
- document the storage layout
- document what is backed up where
- document which data gets restored first
- keep credentials, share names, and mount points organized in a way that future-you can follow under stress
The storage model I would actually recommend
If you want the short version, here it is.
For media
- large-capacity NAS
- read-only CIFS mount for players and servers
- `cache=loose`, a longer `actimeo`, and explicit UID/GID
For Docker app data
- avoid magical ownership assumptions
- pre-create directories
- set ownership before the container starts
- match `PUID` and `PGID` to the mounted filesystem
For active work
- keep it local on fast storage
- sync out copies instead of editing directly over a slow or flaky remote share
For backups
- at least one other destination
- ideally another site or encrypted cloud copy
- test restores, not just backup jobs
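A restore test does not have to be heroic; even a spot check proves the pipeline end to end (paths hypothetical, mirroring the rclone example earlier):

```bash
# Pull a sample back out of the encrypted copy and open it
rclone copy cloud_crypt:/backup/docs /tmp/restore-test --max-depth 1

# Verify the synced tree still matches the source
rclone check ~/Documents remote-backup:/backup/docs
```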
For recovery
- image first
- work from copies
- understand the storage layers
- stop pretending redundancy is enough
The lesson underneath all of this
Homelab storage gets easier the moment you stop treating it like one giant bucket.
It is not one thing.
It is:
- bulk storage
- app storage
- media delivery
- backup storage
- sync transport
- recovery planning
Each of those deserves a specific design.
When you make the role explicit, the right mount options get easier. When you make ownership explicit, containers stop acting haunted. When you measure first, you stop buying placebo upgrades. When you prepare for recovery early, you make fewer desperate mistakes later.
That is what “storage done right” really means.
Not perfect hardware. Not the fanciest NAS.
Just a system where you already know what each piece is for, how it should be mounted, and what you will do when one of them fails.