Case study · Media and Music

From MP4 upload to processed catalog, no human in the loop.

5K-song catalog
Processed end-to-end, zero manual steps

At a glance

Industry
Media and Music
Build type
Pipeline
Tech stack
n8n · Docker · FFmpeg · SoX · ACRCloud · Google APIs
What it does
  • Scans Google Drive every 5 minutes for new MP4 uploads under each artist’s folder
  • Deduplicates against the artist’s own library via ACRCloud (not the global commercial catalog)
  • Detects AI-generated audio fingerprints via ACRCloud’s identify response
  • Applies a pitch-preserving tempo stretch to 5:00–5:10 with FFmpeg’s rubberband filter
  • Layers Freeverb-algorithm reverb (SoX) and outputs a clean WAV + extended MP4 + thumbnail
  • Logs every status transition to the artist’s tab in a shared Google Sheet (16-column schema)
The work

A music distribution platform, with about 5,000 songs across many independent artists, was running its upload processing across three tools: a Zapier job that copied each MP4 from a shared Google Drive folder into the right artist subfolder, an older n8n workflow that logged the file to a Google Sheet, and a manual studio step where the operator would dedup-check, run a pitch-preserving tempo stretch, and add reverb to the output. The chain worked, but it broke whenever any one step needed a fix, and it didn’t scale: every new artist meant more manual processing, more zaps, and more places to lose track of a duplicate or a bad upload.

We replaced the entire chain with a single self-hosted n8n workflow on a small Hetzner VPS: 58 nodes from ingest to logged catalog row. Every five minutes the pipeline scans the shared Drive root for new MP4s and runs each one through the same sequence. It deduplicates the file via ACRCloud against the artist’s own library (never the global commercial catalog, per the client’s hard requirement) and checks for AI-generated audio fingerprints. It then applies a pitch-preserving tempo stretch with FFmpeg’s rubberband filter to land between 5:00 and 5:10, layers a SoX Freeverb reverb on the output, extracts a thumbnail at a random frame, and packages a clean WAV. Finally it uploads the three artifacts back to Drive, writing a row to the artist’s own tab in a shared Google Sheet at every status transition. The dedup library is self-populating: every successfully processed song is uploaded to the artist’s ACRCloud custom bucket, so a later upload of the same track is matched automatically.

The platform now processes its 5,000-song catalog through one autonomous pipeline, with no manual studio step, no Zapier orchestration, and no operator dedup checks. New artists onboard by creating a Drive folder and a Sheet tab; the workflow handles the rest. The whole system runs on a $14/month VPS.

Engineering challenges
01

Pitch-preserving tempo stretch (not speed change)

The stretch uses FFmpeg’s rubberband filter, which changes a track’s duration without shifting its pitch. The pipeline computes the exact tempo factor for each input so the output lands in the 5:00–5:10 target window, and falls back to a chain of atempo filters on FFmpeg builds where rubberband isn’t compiled in.
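The tempo calculation can be sketched as follows. Output duration equals input duration divided by the tempo factor, so aiming at the middle of the window (5:05, an assumed target) gives factor = duration / 305. The function names are hypothetical, and the fallback keeps each atempo stage within 0.5–2.0, the range accepted by every FFmpeg version. The sketch covers only the audio filter string, not the matching video retiming.

```python
TARGET_MIN_S = 300.0  # 5:00
TARGET_MAX_S = 310.0  # 5:10

def tempo_factor(duration_s: float,
                 target_s: float = (TARGET_MIN_S + TARGET_MAX_S) / 2) -> float:
    """Tempo factor that stretches duration_s to target_s (below 1.0 slows down)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return duration_s / target_s

def atempo_chain(factor: float) -> str:
    """Split a factor into chained atempo stages, each within [0.5, 2.0]."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    stages = []
    remaining = factor
    while remaining < 0.5:
        stages.append(0.5)
        remaining /= 0.5
    while remaining > 2.0:
        stages.append(2.0)
        remaining /= 2.0
    stages.append(remaining)
    return ",".join(f"atempo={s:.6f}" for s in stages)

def audio_filter(duration_s: float, has_rubberband: bool) -> str:
    """Build the FFmpeg audio filter string for one input file."""
    factor = tempo_factor(duration_s)
    if has_rubberband:
        return f"rubberband=tempo={factor:.6f}"
    return atempo_chain(factor)
```

For a 3:20 input (200 seconds), the factor is about 0.656, which is within a single atempo stage. A 60-second clip needs a factor of about 0.197, so the fallback emits three chained stages.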

02

Dedup that respects the artist’s own library, not Spotify-style global match

Identification runs against ACRCloud’s custom bucket, not its global commercial catalog. The pipeline treats only metadata.custom_files[] matches as duplicates and explicitly ignores metadata.music[] (commercial catalog) matches. Every successfully processed song is uploaded to the artist’s bucket, making the dedup library self-populating.
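A minimal sketch of that parsing rule, assuming the identify response has already been decoded from JSON into a dict. Only the metadata.custom_files and metadata.music paths come from the description above. The function names and the sample payload fields (such as title and acrid) are illustrative assumptions.

```python
def custom_matches(response: dict) -> list[dict]:
    """Return matches from the artist's custom bucket only.

    Anything under metadata.music (the commercial catalog) is ignored on purpose.
    """
    metadata = response.get("metadata") or {}
    return list(metadata.get("custom_files") or [])

def is_duplicate(response: dict) -> bool:
    """A file counts as a duplicate only if the custom bucket matched it."""
    return len(custom_matches(response)) > 0

# Illustrative payloads (field values are made up for the example).
commercial_only = {"metadata": {"music": [{"title": "Some released track"}]}}
in_artist_library = {"metadata": {"custom_files": [{"acrid": "abc123"}],
                                  "music": [{"title": "Some released track"}]}}
```

With these payloads, `is_duplicate(commercial_only)` is false and `is_duplicate(in_artist_library)` is true, so a song that happens to resemble a commercial release is not rejected unless it is already in the artist’s own library.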

03

Heavy media processing on a small VPS

The workflow runs on a Hetzner CPX21 ($14/mo, 3 vCPU, 4GB RAM) with 2GB of swap as a safety net for FFmpeg spikes. SplitInBatches is set to 1 so the workflow processes one MP4 at a time: files never compete with each other for CPU, and a single oversized upload can’t cascade. The Docker image pins n8n 1.110.0 (the last release before n8n switched to Hardened Images, which strip apk and break the FFmpeg + rubberband + SoX install step).
