Homelab Storage Done Right: NAS Permissions, CIFS Mounts, Sync, and Recovery

Most homelab storage problems are not actually storage problems.

They are one of these instead:

  • the wrong box has the wrong job
  • the mount options are wrong
  • the container user does not match the mounted files
  • the network is the bottleneck, not the disks
  • RAID got mistaken for backup

I learned every one of those the hard way.

So if I were rebuilding my storage stack from scratch, this is the model I would use:

  1. give every storage system a clear role
  2. make ownership explicit instead of hoping CIFS will do the right thing
  3. tune mount options for the workload you actually have
  4. measure before buying cache or faster hardware
  5. assume recovery day will happen and prepare for it now

That is the difference between “my NAS feels weird sometimes” and “I know exactly where to look when something breaks.”

Start with roles, not brands

The first big improvement in my lab was stopping the “everything goes everywhere” mindset.

Different storage systems are good at different jobs:

  • Media and large bulk storage: I want big capacity, easy drive expansion, and an approachable UI. I use Unraid.
  • Family backups and photo archive: I want a friendly appliance UX and snapshot/backup workflows. I use Synology.
  • Active projects and builds: I want low latency and predictable performance. I use local NVMe / workstation storage.
  • Offsite resilience: I want another physical location or an encrypted cloud copy. I use cross-site sync plus cloud backup.

In practice, that means my setup is split by purpose instead of by impulse:

  • Unraid handles the big media footprint
  • Synology handles backup-heavy and family-friendly workflows
  • local NVMe handles active work and anything latency-sensitive
  • cross-site sync and encrypted cloud backup cover the “what if a whole box or whole site disappears?” question

This matters because the tuning is different for each role.

A media share mounted read-only for Plex wants different settings than a writable project share, and both want different settings than a snapshot destination.

If you skip the role question, every mount and every container becomes a compromise.

Permissions are the first thing to make explicit

The most repeatable storage problem in a homelab is still permissions.

You mount a NAS share. The files are obviously there. The container can see the mount. And then the app acts like the directory is empty, read-only, or cursed.

That is usually not magic. It is a UID/GID mismatch.

The Synology/Docker trap

On Synology, a lot of Docker pain starts with this assumption:

“If I set PUID and PGID, the container will fix ownership for me.”

Sometimes the image tries. Synology often refuses the chown. Then you get scary startup logs and a container that still kind of runs but behaves badly.

The safer pattern is to fix ownership before the container starts.

# Create the app directory first
mkdir -p /volume1/docker/myapp

# Set ownership to the user the container should run as
sudo chown -R 1026:100 /volume1/docker/myapp
sudo chmod -R 755 /volume1/docker/myapp

Then match the container process to that user:

services:
  app:
    image: your-image
    environment:
      - PUID=1026
      - PGID=100
    volumes:
      - /volume1/docker/myapp:/config

The important part is not the exact numbers. It is that the mount and the process agree.

If the mounted files belong to one identity and the container runs as another, the app will fail in weird ways that look like application bugs.
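
A quick way to catch that mismatch before it bites, assuming a container named myapp (swap in your own name and path):

# What identity owns the files on the host?
stat -c '%u:%g %n' /volume1/docker/myapp

# What identity does the container actually run as?
docker exec myapp id

If the two answers disagree, fix ownership or PUID/PGID before debugging anything else.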

The CIFS version of the same problem

When the storage is remote and mounted over CIFS/SMB, you have another translation layer in the way.

By default, CIFS often presents files as root:root on the host, which can turn into nobody:nogroup inside containers. Plex, Jellyfin, or any other service running as a normal UID will not love that.

For mounted media shares, I now make ownership explicit in the mount itself:

# /etc/fstab (a single line; fstab entries cannot be split with backslashes)
//192.168.20.8/Media /mnt/media cifs credentials=/etc/samba/credentials,uid=1000,gid=1000,file_mode=0755,dir_mode=0755 0 0
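
The credentials= option points at a small file so the password stays out of /etc/fstab. The format mount.cifs expects is simple; the values here are placeholders:

# /etc/samba/credentials
username=media_user
password=your_password_here

# Lock it down so only root can read it
sudo chown root:root /etc/samba/credentials
sudo chmod 600 /etc/samba/credentials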

And then I make the container match:

services:
  plex:
    image: linuxserver/plex
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - /mnt/media:/media

That pairing fixes an enormous number of “the files exist but the app sees nothing” problems.

The permission checklist I actually use now

Before I blame Docker, Plex, or the NAS, I check these first:

# On the host
ls -la /mnt/media

# Inside the container
docker exec -it plex /bin/bash
ls -la /media

# On a Synology host
id your_username

What I want to confirm:

  • the files really exist on the host
  • the container sees the same directories
  • the visible UID/GID inside the mount matches the app process
  • the mount is readable and traversable

If those four things line up, the rest of the debugging gets much smaller.

Tune CIFS for the workload you actually have

The next big lesson was that default CIFS settings are often correct for office file servers and wrong for homelab media workloads.

That showed up most clearly in a Plex setup where the mount technically worked, but playback had tiny stutters and intermittent weirdness.

The revealing command was:

mount | grep Content

The problem mount had options like:

  • cache=strict
  • actimeo=1

That is incredibly conservative. cache=strict ties the page cache to the protocol's strict consistency rules, and actimeo=1 makes cached file attributes expire after one second, so the kernel goes back to the server to revalidate metadata almost constantly.

For media files that almost never change, that is wasted round-tripping.

Better defaults for read-only media

For a media share that is mostly read-only, this is much closer to what I want:

# /etc/fstab (again, one line per entry)
//nas-ip/Media /mnt/media cifs credentials=/etc/samba/credentials,uid=1000,gid=1000,file_mode=0755,dir_mode=0755,vers=3.0,cache=loose,actimeo=30,ro,noatime,hard,_netdev 0 0

Here is why each one matters:

cache=loose

  • Better for files that do not change during playback
  • Great for movies, TV, audiobooks, and other static media

actimeo=30

  • Cuts down constant metadata revalidation
  • Reduces the “why is this stuttering every second?” class of problem

ro

  • If Plex only needs to read media, mount it read-only
  • Fewer chances to do something dumb accidentally

hard

  • Better reliability when the NAS is slow or briefly unhappy
  • For streaming media, I want the mount to wait, not fail fast and return nonsense

_netdev

  • Makes boot ordering less stupid
  • Important whenever the mount depends on the network being up first
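
One optional refinement on systemd hosts, more a judgment call than a default: let systemd mount the share on first access instead of at boot, so a slow or absent NAS cannot stall startup. It is the same entry as above with two systemd options added:

//nas-ip/Media /mnt/media cifs credentials=/etc/samba/credentials,uid=1000,gid=1000,file_mode=0755,dir_mode=0755,vers=3.0,cache=loose,actimeo=30,ro,noatime,hard,_netdev,x-systemd.automount,x-systemd.idle-timeout=60 0 0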

When not to use those settings

Do not cargo-cult the media mount template onto every share.

For writable project shares, user home directories, or collaborative editing directories, aggressive caching and relaxed metadata behavior may not be what you want.

That is why the role question matters.

I use different mount behavior for:

  • read-only media
  • writable app data
  • backup destinations
  • human-edited document shares

The homelab version of “it depends” is still true. It just depends on the workload, not on superstition.

Measure before you buy cache

I almost wasted money on NVMe cache because the NAS felt slow.

What actually fixed the decision was benchmarking.

First, I tested the network with iperf3:

# On NAS
docker run -d --name iperf3 -p 5201:5201 networkstatic/iperf3 -s

# On desktop
iperf3 -c 10.42.0.10

The result was about 940 Mbps on a Gigabit connection.

That told me something important immediately:

For large sequential transfers, the network was already close to the real limit.

So if the workload is mostly:

  • Plex streaming
  • big backups
  • large file transfers

then NVMe cache is usually not the first fix.

Where NVMe cache helps

It helps when the workload is random I/O heavy:

  • VMs running from NAS storage
  • databases
  • large multi-user file serving
  • containers doing lots of small-file reads and writes

Where NVMe cache does not save you

It does not magically improve:

  • sequential media streaming over 1GbE
  • backup jobs that are already network-bound
  • a bad CIFS mount configuration
  • a permission mismatch
  • a user buying hardware because the UI felt sluggish once

If your Plex library stutters because actimeo=1 is hammering metadata checks, no amount of NVMe cache will teach your mount options to behave.

If your app cannot read a share because it sees everything as nobody:nogroup, the cache is not the problem either.

Benchmark first. Then optimize the real bottleneck.

Distributed storage needs a sync plan, not just mounts

Once storage spans more than one site, the problem changes again.

Now you are not just thinking about shares. You are thinking about resilience.

The setup that makes the most sense to me is:

  • keep active work local and fast
  • keep bulk media where it belongs
  • sync critical documents and backups somewhere else
  • encrypt anything that leaves your physical control

For cross-site or cloud sync, the pattern I trust is explicit and boring:

# Example pattern
rclone sync ~/Documents remote-backup:/backup/docs
rclone sync remote-backup:/backup cloud_crypt:/backup --fast-list --transfers 16
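
The cloud_crypt remote in that example is an rclone crypt remote wrapped around a plain cloud remote, so files are encrypted client-side before they leave the house. A minimal sketch of what rclone config produces, with the remote names assumed and the password stored obscured by rclone itself:

# Excerpt from ~/.config/rclone/rclone.conf
[cloud_crypt]
type = crypt
remote = cloud:backup
password = <obscured by rclone config>

And because a sync job that silently rots is worse than none, rclone check ~/Documents remote-backup:/backup/docs verifies that source and destination actually match, instead of trusting the cron log.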

The exact remote names will vary, but the lesson is stable:

Do not assume “I have a NAS” means “I have a backup.”

A second box that receives replicated data is a backup plan. A cloud copy encrypted before upload is a backup plan. A runbook that tells you what gets restored first is a backup plan.

A single storage appliance full of important data is just a single point of failure with a prettier dashboard.

Recovery starts before the failure

The hardest storage lesson is still the simplest one:

RAID is not a backup.

Redundancy helps with uptime. It does not save you from corruption, user error, multiple failures, or the wrong repair click at the wrong time.

I learned that during a Synology array failure where one bad drive became a bigger recovery problem and the filesystem stack turned into exactly the kind of mess you do not want to learn under pressure.

The parts that mattered most were not clever.

They were discipline.

The recovery rules I trust now

1. Stop touching the failing system

Do not keep clicking repair because you are scared. Do not reboot just to feel like something happened. Do not let panic make writes to the only copy.

2. Work from images, not originals

If the data matters, image the drives first.

dd if=/dev/sdb of=/backup/disk1.img bs=4M status=progress conv=sync,noerror

That conv=sync,noerror pair matters: noerror keeps the imaging job moving across bad sectors instead of dying on the first read error, and sync pads the failed reads with zeros so the rest of the image keeps its offsets.
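
If the drive is actively failing, GNU ddrescue is arguably the better tool for the same job: it records progress in a map file, skips past bad regions quickly, and can come back to retry them later instead of zero-padding whole 4M blocks:

ddrescue /dev/sdb /backup/disk1.img /backup/disk1.map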

3. Understand the storage stack

On appliances like Synology, the layers matter:

  • partition
  • RAID
  • LVM
  • filesystem
  • files

If you skip straight to mounting and pick the wrong device, you waste time and sometimes make things worse.
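
A hedged sketch of walking those layers on a Linux recovery host, read-only at every step; the device and volume names are examples (Synology data volumes commonly sit on an md array built from each disk's third partition, under a vg1000 volume group):

# Read-only walk down the stack
lsblk -f                                  # partitions, and what lives on them
mdadm --examine /dev/sdb3                 # RAID metadata on a member partition
mdadm --assemble --scan --readonly        # assemble arrays without writing to them
vgscan && lvs                             # find LVM volume groups and logical volumes
mkdir -p /mnt/recover
mount -o ro /dev/vg1000/lv /mnt/recover   # finally, mount the filesystem read-only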

4. Use the right recovery tool for the situation

I like open tools. I also like getting my data back.

When the layered Synology stack was too damaged for the normal Linux mount path to be safe and useful, UFS Explorer was worth paying for. That is not ideology. That is pragmatism.

5. Prioritize recovery by replaceability

Not every terabyte deserves the same urgency.

My order now is always:

  • irreplaceable personal files first
  • difficult-to-recreate work second
  • easily reacquired media last

That prevents panic from making you recover the wrong 8 TB first.

The maintenance checklist that matters more than heroics

If you want fewer storage emergencies, do more boring maintenance.

This is the checklist I would actually hand to another homelabber:

Monthly

btrfs scrub start /volume1

Why:

  • catches checksum and read issues before you meet them on the worst possible day
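
When the scrub finishes, check the result instead of assuming it passed:

btrfs scrub status /volume1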

Weekly

  • review SMART health
  • check failed disks, degraded arrays, and storage notifications
  • confirm backups and sync jobs are still running

Before deploying or remounting anything important

  • verify UID/GID on the host
  • verify UID/GID inside the container
  • verify the mount options match the workload
  • verify the app can actually read the path it needs
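
Those four checks collapse into a few commands. A hedged example, assuming the app runs as UID 1000 and reads /mnt/media:

# Ownership and mode on the host
stat -c '%u:%g %a %n' /mnt/media

# Can that UID actually read it? A throwaway container running as the same identity answers fast
docker run --rm -u 1000:1000 -v /mnt/media:/media alpine ls /media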

Before spending money on “performance”

  • run iperf3
  • run a real disk benchmark if needed
  • identify whether the bottleneck is network, random I/O, sequential I/O, or bad config
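
A “real disk benchmark” here means something that separates random from sequential I/O, not a file copy in the GUI. One hedged fio invocation for the random-read side; the path is a placeholder and must be on a writable share, and direct I/O is not honored on every CIFS mount:

fio --name=randread --filename=/mnt/scratch/fio.test --size=1G \
  --rw=randread --bs=4k --iodepth=16 --ioengine=libaio \
  --direct=1 --runtime=30 --time_based

Swap --rw=randread for --rw=read with --bs=1M to see the sequential side.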

Before disaster strikes

  • document the storage layout
  • document what is backed up where
  • document which data gets restored first
  • keep credentials, share names, and mount points organized in a way that future-you can follow under stress

The storage model I would actually recommend

If you want the short version, here it is.

For media

  • large-capacity NAS
  • read-only CIFS mount for players and servers
  • cache=loose, longer actimeo, and explicit UID/GID

For Docker app data

  • avoid magical ownership assumptions
  • pre-create directories
  • set ownership before the container starts
  • match PUID and PGID to the mounted filesystem

For active work

  • keep it local on fast storage
  • sync out copies instead of editing directly over a slow or flaky remote share

For backups

  • at least one other destination
  • ideally another site or encrypted cloud copy
  • test restores, not just backup jobs

For recovery

  • image first
  • work from copies
  • understand the storage layers
  • stop pretending redundancy is enough

The lesson underneath all of this

Homelab storage gets easier the moment you stop treating it like one giant bucket.

It is not one thing.

It is:

  • bulk storage
  • app storage
  • media delivery
  • backup storage
  • sync transport
  • recovery planning

Each of those deserves a specific design.

When you make the role explicit, the right mount options get easier. When you make ownership explicit, containers stop acting haunted. When you measure first, you stop buying placebo upgrades. When you prepare for recovery early, you make fewer desperate mistakes later.

That is what “storage done right” really means.

Not perfect hardware. Not the fanciest NAS.

Just a system where you already know what each piece is for, how it should be mounted, and what you will do when one of them fails.