
Anna’s Archive Scrapes Spotify, Creates Music “Preservation Archive”

At 256 million tracks, it represents the “largest publicly available music metadata database”

Anna’s Archive, the online search engine that indexes books, academic papers and other texts from a range of open and unauthorized digital libraries, has scraped Spotify with the aim of building “a music archive primarily aimed at preservation.”

Why it matters:

  • As per Anna’s Blog, this is “the world’s first ‘preservation archive’ for music which is fully open.”

A brief overview:

  • The group has archived metadata for 256 million Spotify tracks and audio files for 86 million songs, with the audio accounting for around 99.6% of listens.

  • In total, the files weigh in at around 300TB.

  • Music released after July 2025 may not be present.

  • The data will be released in different stages on the platform’s Torrents page.

Why they did it:

  • It’s part of the platform’s mission to preserve “humanity’s knowledge and culture.”

  • While acknowledging that music is already well preserved, thanks to enthusiasts digitizing their collections and sharing them through torrents “or other digital means,” the group has identified three problems with that set-up:

      • Over-focus on the most popular artists, ignoring the “long tail” of music.

      • Over-focus on the highest possible quality, with enormous file sizes making it difficult to keep a full archive “of all music that humanity has ever produced.”

      • No authoritative list of torrents representing all music ever produced.

Torrents-only:

  • The blog clarifies that for now this is a “torrents-only archive aimed at preservation.”

  • If the interest is there, however, it may add downloading of individual files to Anna’s Archive.

Can they do this legally?

  • In short, no.

  • Mass-scraping audio files and redistributing them via torrents violates copyright law in many countries, not to mention Spotify’s terms of service.

  • As per Android Authority, even if Anna’s Archive claims this is about preservation rather than piracy, “good intentions” aren’t generally grounds for a copyright exemption.

  • The publication suggests takedown requests and/or legal threats could be forthcoming from Spotify and the record companies.

👋 Disclosures & Transparency Block
  • This story was written with information from Anna’s Archive and Android Authority.

  • We covered it because it’s news of Spotify being scraped.


