I have a Synology NAS with about 27 TB of data on it — family photos, my Plex library, code, archived business files, and a growing pile of audio masters. The library is going to roughly double in the next year. Off-site backup has been an open problem for me for the better part of a year, and the path I ended up on is weird enough that I want to write it down end-to-end, with enough detail that you could rebuild it.
The short version: HyperBackup talks S3 to a 350-line Python proxy I wrote, which spreads writes across five Google Drive accounts using rclone. To the NAS it looks like AWS; to Google it looks like five separate humans. It costs me $0 in storage and can push up to ~3.5 TB/day of new data into the cloud.
Here's how I got there.
I have a 1 Gbps symmetric fiber line and 27 TB to ship. At line rate that's roughly 2.5 days. In practice I never came close.
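The line-rate figure is simple arithmetic, and it is worth having as a function because the same calculation shows how bad a throttled rate is. A minimal sketch, assuming decimal terabytes and a perfectly sustained link (the function name is mine, not part of any tool mentioned here):

```python
def transfer_days(size_tb: float, link_gbps: float) -> float:
    """Days needed to move size_tb (decimal TB) at a sustained link_gbps."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 86_400


if __name__ == "__main__":
    # 27 TB over a saturated 1 Gbps uplink.
    print(f"{transfer_days(27, 1.0):.1f} days at line rate")
    # The same 27 TB at 3 MB/s, which is 0.024 Gbps.
    print(f"{transfer_days(27, 0.024):.0f} days at 3 MB/s")
```

At a few MB/s the same 27 TB takes months rather than days, which is why the throttle matters far more than the uplink.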
I tried, in order:
backup@ Google account on my Workspace tenant. After about three weeks it was sitting on ~400,000 files perpetually "syncing." Throughput hovered between 1 and 3 MB/s. Restarting Cloud Sync would briefly burst, then settle back to that floor.
I kept blaming my network. It turned out to be Google's API.
Once I started tailing logs on the Synology and on a Proxmox host running rclone with --rc, the picture cleared up. All three tools were doing the same thing: uploading millions of tiny chunks one at a time against the Drive API. Each chunk is one or two HTTP requests. Each request counts against the per-user-per-100-seconds API quota.
The relevant Drive quotas, none of which Google publishes prominently:
The throttle I was hitting was the requests-per-100-seconds one. With chunks running 4–16 MB and a steady stream of metadata calls (list, mkdir, stat…), my effective ceiling sat right around 2–3 MB/s sustained — exactly what I was seeing. Pushing harder triggered 403s with userRateLimitExceeded or rateLimitExceeded, and the tools would back off into the floor.
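The "back off into the floor" behavior is ordinary exponential backoff on rate-limit errors. The sketch below illustrates the pattern; it is not what Cloud Sync, HyperBackup, or rclone literally run. `DriveApiError` and `call_with_backoff` are hypothetical names, and the only details taken from the text are the 403 status and the two error reasons.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# The two Drive error reasons named above.
RATE_LIMIT_REASONS = {"userRateLimitExceeded", "rateLimitExceeded"}


class DriveApiError(Exception):
    """Hypothetical error type carrying the HTTP status and Drive error reason."""

    def __init__(self, status: int, reason: str):
        super().__init__(f"{status} {reason}")
        self.status = status
        self.reason = reason


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn on rate-limit 403s, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except DriveApiError as err:
            throttled = err.status == 403 and err.reason in RATE_LIMIT_REASONS
            if not throttled or attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus up to 1s of jitter so clients do not sync up.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("max_attempts must be at least 1")
```

Every retry here spends wall-clock time without moving bytes, so a client that keeps hitting the quota converges on whatever rate the quota allows, no matter how fast the link is.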
There's a second multiplier I missed for weeks: rclone with its default config shares a global OAuth client (202264815644.apps.googleusercontent.com) with every other rclone user on the planet. That global client gets its own rate-limit pool, which is permanently saturated. The fix is to register your own OAuth client in a GCP project and pass it via --drive-client-id / --drive-client-secret. This alone roughly 5x'd my sustained rate, and is non-optional for anything serious.
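For concreteness, here is roughly what passing your own client looks like when rclone is driven from Python. The two flags are the ones named above; the helper name, remote name, paths, and credential values are placeholders I made up for illustration.

```python
import shlex
from typing import List


def rclone_copy_cmd(
    src: str, remote: str, dest_path: str, client_id: str, client_secret: str
) -> List[str]:
    """Build an rclone copyto command that uses a private OAuth client."""
    return [
        "rclone", "copyto", src, f"{remote}:{dest_path}",
        "--drive-client-id", client_id,
        "--drive-client-secret", client_secret,
    ]


if __name__ == "__main__":
    cmd = rclone_copy_cmd(
        "/tmp/example.bin",                       # placeholder local file
        "gdrive-example",                         # placeholder remote name
        "backups/example.bin",                    # placeholder destination
        "YOUR-ID.apps.googleusercontent.com",     # your own client ID
        "YOUR-SECRET",                            # your own client secret
    )
    print(shlex.join(cmd))
```

Building the command as a list and handing it to `subprocess.run` without a shell avoids quoting problems with odd file names.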
Once I understood the throttle, two things became obvious:
HyperBackup understands point 1 natively — it produces ~50 MB pack files by default. It does not understand point 2 at all; it knows about one Google account.
So either I patch HyperBackup (no), or I put something in front of Google Drive that looks like a single bucket to HyperBackup and silently fans the writes across N accounts.
My first attempt at "something in front of Google Drive" was a custom backup engine. I called it Cass Vault, started it on May 18, and killed it on May 20. About 2,000 lines of Python — FastAPI service, SQLite schema for accounts/chunks/packs/snapshots/jobs, an APScheduler-driven worker that walked the source tree, content-addressed chunks via blake3, packed them into 25 GB pack files with zstd-1 compression, encrypted with age, and uploaded via rclone to a pool of Drive accounts.
It worked. It was also a restic reimplementation, and I was writing the bug list as fast as I was writing the features. SQLite contention under five concurrent jobs. OOM on the LXC because three jobs × eight packs × 64 MB buffers blew through the container's memory limit. Edge cases around resumable uploads that I'd already shipped in production code at $work and didn't want to debug again at home.
The unlock came when I realized I was solving the wrong problem. HyperBackup already does the bundle-and-content-address-and-encrypt dance. It does it well. It produces files that look like Pegasus_1/Pool/4/12/327.index.2 — binary blobs, content-addressed, 5–50 MB each, that it knows how to talk S3 to. I didn't need a new backup engine. I needed a fake S3 endpoint.
The replacement is called cass-s3. It's a single FastAPI process that speaks the subset of the S3 API that HyperBackup actually uses, persists metadata in Postgres, stages object bodies on local disk, and uploads each finished object to one of five Google Drive accounts via rclone. To HyperBackup it's just an S3 bucket; to Google it's five unrelated humans doing modest backups.
                Synology (HyperBackup)
                          |
                          |  S3 PUT/GET/etc., HTTPS
                          v
        https://vault.cwfrazier.com (nginx, Let's Encrypt)
                          |
                          v
            cass-s3 (FastAPI on CT 200, :9000)
                 +--------+--------+
                 |                 |
             Postgres     local disk (staging)
          (object index)           |
                                   v
                   rclone (one process per account)
                      +----+----+----+----+----+
                      |    |    |    |    |
                   backup@1  ...  backup@5 (Google Drive)
The components, with enough detail to rebuild:
/volume1/*. HyperBackup is the only thing on it that knows about cass-s3.
HyperBackup uses very little of the S3 API. The full set cass-s3 had to implement to keep HyperBackup happy:

- ListBuckets, HeadBucket, CreateBucket, GetBucketLocation
- ListObjectsV2 (with prefix + delimiter)
- HeadObject, GetObject (including Range)
- PutObject (single-shot)
- CreateMultipartUpload, UploadPart, CompleteMultipartUpload, AbortMultipartUpload
- DeleteObject (HyperBackup uses this when pruning old generations)

That's it. No ACLs, no versioning, no policies, no tagging, no SSE-KMS. AWS Signature V4 verification is implemented but loose: the access key/secret are static and live in cass-s3's config and in HyperBackup. There is no IAM.
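The operation list above maps onto a small routing table, because S3 distinguishes operations by HTTP method, path depth, and a handful of query parameters. The sketch below classifies path-style requests using the standard S3 REST conventions. It illustrates the routing problem and is not cass-s3's actual code; `classify` is a name I made up.

```python
from urllib.parse import parse_qs


def classify(method: str, path: str, query: str = "") -> str:
    """Map a path-style S3 request to the operation name it represents."""
    # keep_blank_values matters: "?uploads" and "?location" carry no value.
    params = parse_qs(query, keep_blank_values=True)
    bucket, _, key = path.lstrip("/").partition("/")

    if not bucket:
        if method == "GET":
            return "ListBuckets"
    elif not key:
        if method == "HEAD":
            return "HeadBucket"
        if method == "PUT":
            return "CreateBucket"
        if method == "GET":
            if "location" in params:
                return "GetBucketLocation"
            if params.get("list-type") == ["2"]:
                return "ListObjectsV2"
    else:
        if method == "HEAD":
            return "HeadObject"
        if method == "GET":
            return "GetObject"
        if method == "PUT":
            if "partNumber" in params and "uploadId" in params:
                return "UploadPart"
            return "PutObject"
        if method == "POST":
            if "uploads" in params:
                return "CreateMultipartUpload"
            if "uploadId" in params:
                return "CompleteMultipartUpload"
        if method == "DELETE":
            return "AbortMultipartUpload" if "uploadId" in params else "DeleteObject"

    raise ValueError(f"unsupported request: {method} {path}?{query}")
```

Anything outside the list falls through to the `ValueError`, which a real endpoint would turn into an S3-style error response.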
Virtual-host-style addressing (e.g. https://pegasus-prod.vault.cwfrazier.com/) is supported via a Route 53 wildcard *.vault.cwfrazier.com pointed at the proxy and a DNS-01 wildcard cert from Let's Encrypt. Both path-style and virtual-host-style work; HyperBackup happens to use path-style.
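Supporting both addressing styles comes down to deciding where the bucket name lives: in the Host header or in the first path segment. A minimal sketch, assuming the base domain above; the function name is hypothetical and cass-s3 may do this differently.

```python
from typing import Tuple


def split_bucket_key(
    host: str, path: str, base_domain: str = "vault.cwfrazier.com"
) -> Tuple[str, str]:
    """Return (bucket, key) for either virtual-host-style or path-style requests."""
    hostname = host.split(":", 1)[0].lower()  # drop any :port suffix
    suffix = "." + base_domain

    if hostname.endswith(suffix):
        # Virtual-host style: the bucket is the leftmost label of the Host header.
        return hostname[: -len(suffix)], path.lstrip("/")

    # Path style: the bucket is the first path segment.
    bucket, _, key = path.lstrip("/").partition("/")
    return bucket, key


if __name__ == "__main__":
    print(split_bucket_key("pegasus-prod.vault.cwfrazier.com", "/Pool/4/12/327.index.2"))
    print(split_bucket_key("vault.cwfrazier.com", "/pegasus-prod/Pool/4/12/327.index.2"))
```

Both calls resolve to the same bucket and key, which is what lets one handler serve either style.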
One Postgres table per concept, all in one DB called cass_s3:

- buckets: name, created_at, region (always "us-east-1" because HyperBackup doesn't care).
- objects: (bucket, key) primary key, size, etag, content_type, account_id (which Drive account holds the final part), drive_file_id, uploaded_at, deleted_at (soft delete for the prune window).
- multipart_uploads: upload_id, bucket, key, started_at, completed_at, account_id (which account this upload's parts are staging toward).
- parts: (upload_id, part_number) PK, etag, size, local_path while staging.
- accounts: (id, email, oauth_tokens, daily_bytes_uploaded, daily_window_started_at, enabled). The scheduler reads this to decide where to send the next upload.

One important design choice: a multipart upload picks its account at CreateMultipartUpload time and sticks with it until CompleteMultipartUpload. You can't fan parts of the same object across accounts, because the only way to assemble them on the Drive side is to upload the final concatenated blob to one account.
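The sticky-account rule is easiest to see in code. This in-memory sketch covers the scheduling decision only: it uses the least-recently-used rule and the 700 GB daily threshold that the post gives for the upload path. The `last_used_seq` field and the `UploadRouter` class are hypothetical stand-ins for whatever cass-s3 keeps in Postgres.

```python
from dataclasses import dataclass
from typing import Dict, List

DAILY_LIMIT_BYTES = 700 * 10**9  # 750 GB hard cap minus a 50 GB margin


@dataclass
class Account:
    id: int
    email: str
    daily_bytes_uploaded: int
    enabled: bool = True
    last_used_seq: int = 0  # hypothetical LRU marker, not a column from the post


def pick_account(accounts: List[Account], limit: int = DAILY_LIMIT_BYTES) -> Account:
    """Least-recently-used account that is enabled and under its daily limit."""
    eligible = [a for a in accounts if a.enabled and a.daily_bytes_uploaded < limit]
    if not eligible:
        raise RuntimeError("every account is at its daily write limit")
    return min(eligible, key=lambda a: a.last_used_seq)


class UploadRouter:
    """Pins each multipart upload to the account chosen when it was created."""

    def __init__(self, accounts: List[Account]):
        self.accounts = accounts
        self._seq = 0
        self._sticky: Dict[str, int] = {}

    def create_multipart(self, upload_id: str) -> int:
        account = pick_account(self.accounts)
        self._seq += 1
        account.last_used_seq = self._seq
        self._sticky[upload_id] = account.id
        return account.id

    def account_for(self, upload_id: str) -> int:
        # UploadPart and CompleteMultipartUpload never re-pick.
        return self._sticky[upload_id]
```

Keeping the choice in one place means the quota check happens once per object rather than once per part.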
For a single-shot PutObject:

1. Stream the request body to /opt/cass-s3/data/parts/<uuid>.bin. (Originally this was on the LXC's root filesystem. That bit me; see the incident below.)
2. Pick an account from accounts: least-recently-used among accounts whose daily-bytes counter is below 700 GB. Drive's hard cap is 750 GB but I keep a 50 GB safety margin.
3. Run rclone copyto /opt/cass-s3/data/parts/<uuid>.bin gdrive-<account>:cass-s3/<bucket>/<sha256-of-key>.bin --drive-chunk-size=64M --transfers=1 --tpslimit=50.
4. Write the row to objects.
5. Return 200 with the etag (MD5 of the body) to HyperBackup.

For multipart:

1. CreateMultipartUpload allocates an upload_id (UUID), picks an account, returns the ID.
2. UploadPart streams to /opt/cass-s3/data/parts/<upload_id>/<part_number>.bin and records the row.
3. CompleteMultipartUpload concatenates the parts in order to a single file (cheap: it's a streaming cat on the same disk), rclones that single file to the chosen account, writes the objects row, deletes the staging directory.

Two things matter about this:
Concatenation, not stitching. S3 multipart lets the client concatenate parts on the server side. Google Drive has no equivalent. So I concatenate locally and upload one blob. This means a multipart upload temporarily needs 2 × object_size on disk: the parts plus the concatenated file. For HyperBackup, which produces ~50 MB pack files, that's nothing. For a giant ad-hoc aws s3 cp of a 50 GB file, it matters — budget your staging volume accordingly.
One account per object, not per part. If the upload to the chosen account fails, the entire CompleteMultipartUpload fails and HyperBackup retries from CreateMultipartUpload. There is no recover-by-switching-accounts. This was a deliberate trade for simplicity.
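The local concatenation step is small enough to sketch. This assumes the staging layout described above, with one `<part_number>.bin` file per part in a per-upload directory. The function name is mine, and the real code may stream differently.

```python
import shutil
import tempfile
from pathlib import Path


def concat_parts(staging_dir: Path, out_path: Path) -> int:
    """Concatenate <part_number>.bin files in numeric order; return bytes written."""
    # Sort numerically: a plain string sort would put 10.bin before 2.bin.
    part_files = sorted(staging_dir.glob("*.bin"), key=lambda p: int(p.stem))
    with out_path.open("wb") as out:
        for part in part_files:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out, length=1024 * 1024)
    return out_path.stat().st_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "upload"
        staging.mkdir()
        for number, body in [(1, b"first-"), (2, b"second-"), (10, b"tenth")]:
            (staging / f"{number}.bin").write_bytes(body)
        final = Path(tmp) / "object.bin"
        print(concat_parts(staging, final), final.read_bytes())
```

Until the staging directory is deleted, both the parts and the concatenated file exist on disk, which is where the 2 × object_size figure comes from.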
Symmetric. GetObject looks up objects, finds the account and Drive file ID, streams via rclone cat back to the HTTP response. Range requests use rclone cat --offset --count. There is no caching layer — reads are rare (HyperBackup mostly only restores during disaster recovery), so cold-cache latency is fine.
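Translating an HTTP Range header into `rclone cat --offset --count` is mostly off-by-one bookkeeping, since Range uses inclusive byte positions and rclone wants a start and a length. A sketch that handles single ranges only; the helper name and remote path are made up for illustration.

```python
import re
from typing import List, Optional


def rclone_cat_args(remote_path: str, range_header: Optional[str], size: int) -> List[str]:
    """Build rclone cat arguments for a whole-object or single-range read."""
    args = ["rclone", "cat", remote_path]
    if not range_header:
        return args

    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or match.groups() == ("", ""):
        raise ValueError(f"unsupported Range header: {range_header!r}")
    first, last = match.groups()

    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        count = min(int(last), size)
        offset = size - count
    else:
        offset = int(first)
        # "bytes=N-" means to the end; the last position is inclusive.
        end = min(int(last), size - 1) if last else size - 1
        count = end - offset + 1

    if count <= 0:
        raise ValueError("range not satisfiable")
    return args + ["--offset", str(offset), "--count", str(count)]
```

For example, `bytes=0-99` on a 1000-byte object becomes `--offset 0 --count 100`, and a real endpoint would answer the `ValueError` cases with a 416.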
Each Drive account gets its own OAuth grant, but they all share one OAuth client — a Web OAuth client I created in a dedicated GCP project (applied-pipe-496717-a8) with the redirect URI https://vault.cwfrazier.com/api/oauth/callback and the drive + email scopes. The shared client matters because it dodges the global rclone client's rate-limit pool entirely.
The dashboard at https://vault.cwfrazier.com/_dashboard has an "Add account" button. The flow:

1. The button hits /api/oauth/start, which redirects to Google's consent screen with access_type=offline and prompt=consent (to force a refresh token).
2. Google redirects back to /api/oauth/callback?code=....
3. The resulting tokens are stored in accounts.
4. A [gdrive-<email>] section gets written to /root/.config/rclone/rclone.conf with that account's refresh token and the shared client ID/secret.

I currently have five accounts authed. Adding accounts 6–10 is a UI click. Past about 10, my uplink saturates before I can take advantage of more quota.
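The rclone.conf section written at the end of that flow is plain INI, so it can be generated with the standard library. In this sketch the key names (`type`, `client_id`, `client_secret`, `scope`, `token`) are rclone's Drive backend settings. The empty access token with a past expiry is my assumption: it relies on rclone refreshing from the refresh token on first use. All values are placeholders.

```python
import configparser
import io
import json


def rclone_drive_section(
    email: str, refresh_token: str, client_id: str, client_secret: str
) -> str:
    """Render a [gdrive-<email>] section for rclone.conf as text."""
    token = {
        "access_token": "",                # assumption: rclone refreshes on first use
        "token_type": "Bearer",
        "refresh_token": refresh_token,
        "expiry": "2000-01-01T00:00:00Z",  # already expired, forcing a refresh
    }
    # interpolation=None so a '%' inside a token is never treated as syntax.
    config = configparser.ConfigParser(interpolation=None)
    config[f"gdrive-{email}"] = {
        "type": "drive",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "drive",
        "token": json.dumps(token),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rclone_drive_section("backup1@example.com", "REFRESH", "CLIENT-ID", "SECRET"))
```

Appending the rendered text to the config file is enough for rclone to see the new remote. Doing the append under a lock would avoid two concurrent "Add account" clicks interleaving their writes.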
A small Svelte SPA at vault.cwfrazier.com/_dashboard that polls the API. Browser requests (Accept: text/html) get redirected to the dashboard automatically; S3 clients get their normal XML responses. Panels:
The Synology end is straightforward once cass-s3 is up. Create a new HyperBackup task, type "S3 Storage," then:

- S3 server: Custom Server URL
- Server address: vault.cwfrazier.com (HTTPS, port 443)
- Signature version: v4
- Request style: Path
- Region: us-east-1 (ignored by cass-s3, but HyperBackup wants something)
- Access key: SeEd-PGotHTwJLpLMDykpQ; secret lives in the cass-s3 .env and Bitwarden.
- Bucket: pegasus-prod. HyperBackup will CreateBucket it for you on first run.

Tune the task's "Maximum number of concurrent backup tasks" up. I run four. Anything higher and the Synology's CPU becomes the bottleneck before the network does.
Fix: I symlinked /opt/cass-s3/data/parts/ to a dedicated 500 GB LVM volume and added a janitor that rms any staging file older than 24 hours (defensive — should never happen if the upload path is healthy). Lesson: staging volume sizing matters. The right formula is (max_concurrent_uploads × max_object_size × 2) with a generous safety margin. For HyperBackup's 50 MB pack files this is laughable; for ad-hoc large objects it's not.
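Both halves of that fix are easy to sketch: the sizing formula as a function, and a janitor that removes staging files older than 24 hours. The `safety_factor` parameter and both function names are my own, and the janitor assumes staging files end in `.bin` as in the paths above.

```python
import time
from pathlib import Path
from typing import List, Optional


def staging_bytes_needed(
    max_concurrent_uploads: int, max_object_size: int, safety_factor: float = 1.5
) -> int:
    """(max_concurrent_uploads x max_object_size x 2) scaled by a safety margin."""
    return int(max_concurrent_uploads * max_object_size * 2 * safety_factor)


def sweep_stale(
    staging_root: Path, max_age_s: float = 24 * 3600, now: Optional[float] = None
) -> List[Path]:
    """Delete staging files not modified within max_age_s; return what was removed."""
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so deletions do not disturb the directory walk.
    for path in list(staging_root.rglob("*.bin")):
        if path.is_file() and now - path.stat().st_mtime > max_age_s:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    gb = 10**9
    # Four concurrent 50 MB pack files versus four concurrent 50 GB ad-hoc objects.
    print(staging_bytes_needed(4, 50 * 10**6) / gb, "GB")
    print(staging_bytes_needed(4, 50 * gb) / gb, "GB")
```

With four concurrent uploads, 50 MB pack files need well under 1 GB of staging, while four 50 GB objects would need 400 GB before any margin.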
Each Drive account gets ~750 GB/day of writes. I leave 50 GB of headroom and round down, so I plan for 700 GB/day/account. With five accounts that's 3.5 TB/day of new data. Initial seed of 27 TB ≈ 8 days at the cap, which is roughly what I observed (it ran a bit slower while I was tuning).
Free tier per Google account is 15 GB. Workspace Business Standard accounts on my tenant get a pooled 2 TB by default, but in practice the cap that matters is the daily-write throttle, not the storage. I've stayed below 1.5 TB per account so far; if I bump up against pool limits I'll buy storage add-ons rather than restructure the design.
| Accounts | Daily write ceiling | Seed time for 50 TB |
|---|---|---|
| 1 | 0.7 TB/day | ~71 days |
| 5 (today) | 3.5 TB/day | ~14 days |
| 10 | 7.0 TB/day | ~7 days (uplink-bound at ~600 Mbps) |
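The table rows fall out of three one-line formulas, which makes it easy to re-run the plan for a different library size or account count. A sketch using the 700 GB/day/account planning figure from above, with decimal units throughout; the function names are mine.

```python
PLANNED_GB_PER_ACCOUNT_PER_DAY = 700  # 750 GB cap minus 50 GB of headroom


def daily_ceiling_tb(accounts: int) -> float:
    """Planned daily write ceiling across all accounts, in TB/day."""
    return accounts * PLANNED_GB_PER_ACCOUNT_PER_DAY / 1000


def seed_days(total_tb: float, accounts: int) -> float:
    """Days to upload total_tb when every account runs at its planned ceiling."""
    return total_tb / daily_ceiling_tb(accounts)


def required_mbps(tb_per_day: float) -> float:
    """Sustained uplink, in Mbps, needed to move tb_per_day."""
    return tb_per_day * 1e12 * 8 / 86_400 / 1e6


if __name__ == "__main__":
    for n in (1, 5, 10):
        ceiling = daily_ceiling_tb(n)
        print(
            f"{n:>2} accounts: {ceiling:.1f} TB/day, "
            f"50 TB in {seed_days(50, n):.0f} days, "
            f"needs {required_mbps(ceiling):.0f} Mbps sustained"
        )
```

The uplink column is the useful sanity check: ten accounts at 7 TB/day need roughly 650 Mbps around the clock, the same ballpark as the table's uplink-bound note.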
SQLITE_BUSY even with WAL mode and a 30s busy_timeout. Switching to Postgres took an afternoon and made the concurrency story trivial.
Storage: $0. The five Drive accounts are all on my existing Workspace tenant. The "extra" accounts are real users I would have provisioned anyway.
Compute: cass-s3 runs on hardware I already had (Proxmox host, ~50 MB RAM, single-digit % CPU at peak). nginx and Let's Encrypt are free. Postgres is shared with the rest of the homelab.
Compare that with Backblaze B2 at 27 TB × $6/TB/month = $162/month, or AWS Glacier Deep Archive at roughly $27/month for storage plus a $700+ retrieval bill the one time I'd ever need it. The Drive round-about pays for itself the day it's deployed.
DeleteObject, but the deleted Drive files currently go to that account's trash and need a periodic empty-trash sweep. Trivial cron, just haven't written it yet.
If you want to do this yourself, the moving parts are all small and the architecture diagram up top is the whole thing. The hardest bit was admitting that the answer was "stop trying to write a backup tool, write a fake S3 endpoint"; everything else followed.