- cross-posted to:
- [email protected]
- [email protected]
- [email protected]
We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB), grouped by popularity.
This release includes the largest publicly available music metadata database with 256 million tracks and 186 million unique ISRCs.
It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.
After Meta scraped all their books they have the perfect defense now. All they have to say is “we’re training a music AI” and they’re apparently untouchable.
Well, they have to say “we’re training a music AI” while slipping several million dollars into the pockets of the right people. Rich people don’t win legal battles by actually proving what they did isn’t illegal; they do it by discreetly paying people to say it isn’t.
Often and increasingly they are not bothering with the discretion part anymore.
Are these the actual music files you can use to play as well?
The article claims that they are:
We backed up Spotify (metadata and music files).
Would be amazing if it was. I would love to just have Spotify’s music on my NAS.
If your NAS has 300TB spare, you could.
How much could it cost, $10,000?

This gif is going to completely lose its punch in a couple years.
A RAID6 of 24 * 20TB drives could contain that with both parity and hotswap, with room to spare. Let’s say $400 per refurb drive, $2500 rackmount SAS enclosure, $2000 SAS RAID card, $14,100 total. Assuming you already have the server and power and SAS cables.
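As a rough sanity check of the build above, here is a minimal sketch of the arithmetic. It assumes "hotswap" means one drive held back as a hot spare and that RAID6 costs two drives' worth of parity; the function and constant names are made up for illustration, and it ignores TB vs TiB and filesystem overhead.

```python
# Back-of-the-envelope check of the 24-drive RAID6 build quoted above.
# Assumption: one drive is reserved as a hot spare; RAID6 uses two drives of parity.

DRIVE_COUNT = 24
DRIVE_TB = 20
DRIVE_PRICE_USD = 400
ENCLOSURE_USD = 2500
RAID_CARD_USD = 2000
ARCHIVE_TB = 300

HOT_SPARES = 1           # assumption, not stated exactly in the comment
RAID6_PARITY_DRIVES = 2  # RAID6 tolerates two drive failures


def usable_tb(drive_count, drive_tb, hot_spares=HOT_SPARES):
    """Usable capacity of a single RAID6 group after spares and parity."""
    data_drives = drive_count - hot_spares - RAID6_PARITY_DRIVES
    return data_drives * drive_tb


def total_cost_usd(drive_count, drive_price, enclosure, raid_card):
    """Hardware cost, excluding the server, power, and cables."""
    return drive_count * drive_price + enclosure + raid_card


capacity = usable_tb(DRIVE_COUNT, DRIVE_TB)
cost = total_cost_usd(DRIVE_COUNT, DRIVE_PRICE_USD, ENCLOSURE_USD, RAID_CARD_USD)

print(f"usable: {capacity} TB, headroom over archive: {capacity - ARCHIVE_TB} TB")
print(f"total: ${cost:,}")
```

Under those assumptions the numbers in the comment hold up: the array comfortably exceeds 300TB and the parts add up to the quoted total.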
You could budget this way down. I run 10+2 12TB with Unraid. No reason for a RAID card if it’s for archive and personal use.
100% this. People who store easily replaceable media on RAID are just throwing away money (unless you have a need for faster read/write). If it’s your family photos, a copy of your in-progress thesis, or some other irreplaceable piece of info/content, go for it.
I have like a 40TB Unraid NAS and I get asked pretty much every time I talk to someone about it how I do backups. Easy: I back up my *arr stack databases, and in case of a failure I restore them and let it pull down everything over time. I have done that in the past when I wanted to upgrade quality; it was easier for me to scrub it all and start over than make upgrade profiles and such.
Or that’s what I would have done, now I mostly use DebridService du jour and Stremio :-)
Considering the price per TB is 10-11 US dollars, it’s gonna cost $3500 max
10 US dollars per TB?? 🤣🤣 More like 30-35€ per TB for a good-grade HDD!
Let’s not talk about SSDs or NVMe, which are more in the 120€/TB range.
I always hear people say that storage comes cheap nowadays… I’m still looking for that cheap HDD on Amazon… It has been 10 years 🤣🤣
$10/TB is a bit low, but not far off. Serverpartdeals has refurbed enterprise/NAS drives at about $15/TB right now, and that’s with the AI pressure driving up prices. I recall seeing 18TB drives around $12/TB a few months back.
The above is a well-loved vendor in the IT space. Much better place to buy from than Amazon, as they actually guarantee their inventory is legitimate.
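To put the per-TB figures quoted in this thread side by side, here is a minimal sketch that multiplies each one by 300TB. It only covers raw capacity (no parity, spares, shipping, or import tax), keeps USD and EUR separate instead of converting, and the labels and function name are just for illustration.

```python
# Raw-capacity cost of ~300TB at the per-TB prices quoted in the thread.
# Assumption: no redundancy, shipping, or tax; currencies are not converted.

ARCHIVE_TB = 300

quoted_prices_per_tb = [
    ("USD", 10, "low end quoted"),
    ("USD", 11, "high end of the 10-11 claim"),
    ("USD", 15, "refurb enterprise drives"),
    ("EUR", 30, "low end of the European estimate"),
    ("EUR", 35, "high end of the European estimate"),
]


def raw_cost(price_per_tb, capacity_tb=ARCHIVE_TB):
    """Cost of buying exactly capacity_tb of disk at a flat per-TB price."""
    return price_per_tb * capacity_tb


for currency, price, note in quoted_prices_per_tb:
    print(f"{currency} {price}/TB ({note}): {currency} {raw_cost(price):,}")
```

The spread is wide: the same 300TB lands anywhere from a few thousand dollars to around ten thousand euros depending on whose price you believe.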
The US of A often has way lower HDD prices compared to Europe.
Take the serverpartdeals price and add shipping and import tax.
Is it that cheap now?? I would kill for 10TB
Here you go mate. They don’t have 10TB in stock, but they do have 20TB refurbs at around $15/TB.

That’s not $10-$11 as suggested above, nor is it $15. And that’s not even a new drive.
It’s nowhere near that cheap.
I read it as 30TB in my head. 300TB is a bit more than I can manage
I’d wager 70% of what’s on Spotify is not worth preserving since it’s AI slop.
Yeah as with most of the internet, it’s only worth downloading anything uploaded before 2023.
So far, LLMs have done so much more harm than help.
I’m not convinced AI slop can compete with the back log of organic slop personally.
But yeah a fuckton is probably slop either way
AI slop is accelerating exponentially for the foreseeable future. It won’t take long for world data storage to be a limiting factor.
Interestingly enough, with the data they provide, figuring out how much of it is AI slop wouldn’t be that hard I think
They’ve released torrents of the metadata, and they plan to release the music files, but they haven’t yet. They intend to start by offering the downloads as bulk torrents, but they’re open to considering implementing the ability to download single songs in the future.
So in short, yes, but you can’t download them yet
Not yet, but that’s the end goal. The tricky part is that they’re only offering bulk downloads for now, which means downloading a single artist or album would be difficult/impossible. You’d need to download the entire compressed file of like 300GB of music, then extract the specific songs/artists/albums you wanted. The goal for now is preservation, meaning they want to make the bulk download as easy as possible, to make sure people can preserve it. Once they’ve got that in a pretty good spot, they may look into allowing more granular downloads.
Ok, how do we download this?
Step 1: Buy £6,000 worth of identical hard drives and a motherboard with 16 SATA ports. Or £12,000 worth and a RAID 1 server rig. Or £24,000 and RAID 6
You can get PCI cards to add more SATA ports; they don’t all need to be on the motherboard
Does this make Anna’s Archive the top music tracker now? Lol move over RED
Is that ALL of the songs?
“Honey, all I need is $10,000 for a server and we’ll never pay for Spotify again”
Omg if my girlfriend had $10k to give me for a server I could buy like a RAM or two!
300TB


DNS blocked in Germany. Fun.
Simply choose a private DNS server like Mullvad, Quad9, etc. and it should work…
https://www.privacyguides.org/en/dns/
From the mega thread here
Does DoH work? (Secure DNS in Firefox’s settings)
It’s probably blocked by your particular ISP, not every German ISP.
This is such important work o7
I cannot fathom the legal fees that will be incurred if they release 99.6% of Spotify to the public for free. Holy fucking shit.
Yeah they’re definitely gonna catch some legal action for this, even tho I’m all for them releasing the data
Maybe they could say they’re AI. Apparently copyright infringement’s legal then.
Have to find them first
Almost makes me want to get into torrenting again. But dab.yeet.su, squid.wtf, and doubledouble.top usually have me covered with ddl
Does anyone know if the Spotify metadata has BPM and keys?
Both. Per the SQL schema printed in the article, table track_audio_features has both fields tempo and key along with many other technical fields. Worth checking out, it’s near the bottom of the page.
Mashup artist detected
Would love lmao. Just bought a second hand VDJ and I’m starting to experiment with Mixxx, and I don’t know if it’s the style I like (latincore and adjacents) or if Mixxx’s BPM detection isn’t that good.
Good on you for starting that up! I wish you much success in your mixing and/or producing journey!
BPM yes, keys I’m not so sure.
Yes, and it hasn’t been easy to dig up until recently. There were a few ways to search the “hidden” metadata fields that Spotify uses internally. But it definitely hasn’t been easy or straightforward.
Those hidden fields are how Spotify recommends similar artists. You have a few bands on repeat with specific instruments, chord progressions, and singer vocal range? Gee, maybe you’ll enjoy other bands that are similar to that…
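For the BPM and key question above, here is a minimal sketch of what querying that table could look like. Only the table name track_audio_features and the fields tempo and key come from the comment about the article's schema; the track_id column, the SQLite storage, the sample rows, and the pitch-class encoding of key (0 = C, -1 = not detected, as in Spotify's public audio-features API) are all assumptions for illustration.

```python
import sqlite3

# Assumption: key is an integer pitch class (0 = C ... 11 = B, -1 = unknown).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key):
    """Map a pitch-class integer to a note name, or 'unknown' if out of range."""
    return PITCH_CLASSES[key] if 0 <= key <= 11 else "unknown"


def tracks_in_bpm_range(conn, low, high):
    """Return (track_id, tempo, key name) for tracks whose tempo is in [low, high]."""
    rows = conn.execute(
        'SELECT track_id, tempo, "key" FROM track_audio_features '
        "WHERE tempo BETWEEN ? AND ? ORDER BY tempo",
        (low, high),
    ).fetchall()
    return [(track_id, tempo, key_name(key)) for track_id, tempo, key in rows]


# Hypothetical in-memory stand-in for the real metadata dump.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE track_audio_features (track_id TEXT, tempo REAL, "key" INTEGER)'
)
conn.executemany(
    "INSERT INTO track_audio_features VALUES (?, ?, ?)",
    [("demo_a", 128.0, 9), ("demo_b", 174.5, 0), ("demo_c", 95.0, -1)],
)

for row in tracks_in_bpm_range(conn, 120, 180):
    print(row)
```

If the real dump ships in a different format or encodes key differently, the query shape should still carry over, but the column names and the key mapping would need checking against the published schema.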
Does anyone see the torrent links?
It says,
The data will be released in different stages on our Torrents page:
[X] Metadata (Dec 2025)
[ ] Music files (releasing in order of popularity)
[ ] Additional file metadata (torrent paths and checksums)
[ ] Album art
[ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)
Yeah, I saw that and assumed they’d be torrents of the metadata.
I didn’t find it in the blog post, but it’s listed on the website: https://annas-archive.org/torrents/spotify
Now make it streamable and make a stremio-like music client 🤞
Record labels themselves would march on foot to burn down the archives
Would be interesting if someone checked what % of that archive is slopified.
Honestly, this is the best time to snapshot it, because even with the slop already there, the exponential increase that’s about to happen will absolutely dwarf what’s there now.