Last.fm is a music scrobbling and discovery service with a comprehensive public API. We use it to collect listening history, enrich track metadata, and analyze music taste over time.

Overview

The Last.fm API provides access to scrobbles (listening history), artist/album/track metadata, user libraries, and community-generated tags. It’s free for non-commercial use and rate-limited to roughly 5 requests/second [1].

Primary use cases:

  • Collecting complete scrobble history for analysis
  • Enriching tracks with genre tags, durations, and metadata
  • Building music taste profiles and discovery recommendations
  • Tracking listening trends over time

This implementation follows the API Sync Pattern — OAuth refresh tokens, incremental fetching, individual JSON files per entity, and automated CI/CD syncing.

Architecture: Collection vs Enrichment

Last.fm’s API is best understood as two layers: collection endpoints and enrichment endpoints.

Collection Layer

These endpoints return your listening history — what you’ve scrobbled, when, and basic track info:

  • user.getRecentTracks — recent scrobbles (paginated)
  • user.getLovedTracks — tracks marked as loved
  • user.getTopArtists — artists by play count
  • user.getTopAlbums — albums by play count
  • user.getTopTracks — tracks by play count

Characteristics:

  • Return lists of things you’ve listened to
  • Include basic metadata (artist name, track title, album)
  • Paginated (typically 50-200 items per page)
  • Fast to fetch (1-2 seconds per page)

Collection strategy:

  • Fetch all pages for complete history
  • Store raw JSON responses (preserve everything)
  • Update incrementally (new scrobbles only)
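
A minimal sketch of the "fetch all pages" step. It assumes a caller-supplied `fetch_page(page)` function (a hypothetical name, not part of any library) that returns the parsed JSON for one page of user.getRecentTracks, and it assumes the usual response shape with a `recenttracks` object carrying `track` and `@attr.totalPages`. Verify both against a real response before relying on it.

```python
from typing import Callable, Iterator


def iter_scrobbles(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every track from a paginated user.getRecentTracks-style response.

    fetch_page is a hypothetical callable: page number in, parsed JSON out.
    """
    page = 1
    while True:
        body = fetch_page(page)["recenttracks"]
        tracks = body.get("track", [])
        # Precaution: a page holding a single item may arrive as an object, not a list.
        if isinstance(tracks, dict):
            tracks = [tracks]
        yield from tracks
        # Pagination counters are assumed to be strings in the JSON payload.
        total_pages = int(body["@attr"]["totalPages"])
        if page >= total_pages:
            break
        page += 1
```

For incremental updates, the same loop can be reused with a `fetch_page` that only requests scrobbles newer than the last stored timestamp.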

Enrichment Layer

These endpoints return detailed metadata about music — not your listening, but the music itself:

  • track.getInfo — duration, MBID, album, playcount, tags
  • artist.getInfo — biography, similar artists, stats
  • artist.getTopTags — genre tags (community folksonomy)
  • album.getInfo — release date, track list, cover art
  • tag.getTopArtists — artists associated with a genre tag

Characteristics:

  • Return metadata about a specific item
  • Much more detailed than collection endpoints
  • Not paginated (one item per request)
  • Slow to fetch at scale (5 req/sec = ~5.5 hours for 100k tracks)

Enrichment strategy:

  • Separate pass after collection completes
  • Fetch once per unique track/artist/album (not per scrobble)
  • Cache aggressively (metadata is mostly static)
  • Rate limit carefully (5 req/sec hard limit)
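
To illustrate "fetch once per unique artist, not per scrobble", here is a small sketch that reduces a scrobble list to distinct artist names. It assumes each scrobble carries the artist name under `artist["#text"]`, as in the default user.getRecentTracks JSON. The function name and the case-insensitive matching are illustrative choices.

```python
def unique_artists(scrobbles):
    """Return each distinct artist name once, in first-seen order.

    Assumes scrobble["artist"]["#text"] holds the artist name.
    Names are compared case-insensitively; the first spelling seen is kept.
    """
    seen = {}
    for scrobble in scrobbles:
        name = scrobble["artist"]["#text"]
        seen.setdefault(name.casefold(), name)
    return list(seen.values())


scrobbles = [
    {"name": "Idioteque", "artist": {"#text": "Radiohead"}},
    {"name": "Archangel", "artist": {"#text": "Burial"}},
    {"name": "Reckoner", "artist": {"#text": "radiohead"}},
]
# Three scrobbles, but only two enrichment requests are needed.
artists_to_enrich = unique_artists(scrobbles)
```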

The Genre Tagging Approach

One of the most valuable enrichment use cases is genre classification via artist.getTopTags:

  1. For each unique artist in your library, fetch artist.getTopTags
  2. Get community folksonomy tags (e.g., “indie rock”, “electronic”, “post-punk”)
  3. Weight by tag count (more votes = more reliable)
  4. Every scrobble inherits its artist’s tags
  5. Result: full genre classification across entire listening history

Example:

# artist.getTopTags response for "Radiohead"
{
  "toptags": {
    "tag": [
      {"name": "alternative rock", "count": 100},
      {"name": "indie", "count": 85},
      {"name": "experimental", "count": 72},
      {"name": "electronic", "count": 45}
    ]
  }
}

Now every Radiohead scrobble can be tagged with these genres, weighted by confidence.
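
One way to turn the response above into per-artist weights is sketched below. The `min_count` cutoff and the normalization to shares that sum to 1 are assumptions made for illustration, not something the API prescribes.

```python
def tag_weights(toptags_response, min_count=10):
    """Convert an artist.getTopTags-style response into normalized tag weights.

    Tags with a count below min_count (an illustrative threshold) are dropped;
    the rest are scaled so the weights sum to 1.
    """
    tags = toptags_response["toptags"]["tag"]
    kept = {t["name"]: int(t["count"]) for t in tags if int(t["count"]) >= min_count}
    total = sum(kept.values())
    return {name: count / total for name, count in kept.items()} if total else {}


radiohead = {
    "toptags": {
        "tag": [
            {"name": "alternative rock", "count": 100},
            {"name": "indie", "count": 85},
            {"name": "experimental", "count": 72},
            {"name": "electronic", "count": 45},
        ]
    }
}
weights = tag_weights(radiohead)
# Every Radiohead scrobble can now carry this same weights dict.
```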

Endpoint Coverage

What We’re Collecting

Endpoints currently used in the scrobble scraper:

  • user.getRecentTracks — complete scrobble history
  • user.getInfo — profile stats, registration date
  • user.getTopArtists — play count by artist
  • user.getTopAlbums — play count by album
  • user.getTopTracks — play count by track
  • user.getWeeklyChartList — historical weekly charts
  • user.getWeeklyArtistChart — artist plays per week
  • user.getWeeklyAlbumChart — album plays per week
  • user.getWeeklyTrackChart — track plays per week
  • user.getFriends — social connections
  • user.getLovedTracks — favorited tracks
  • user.getPersonalTags — user’s own categorization

Critical Missing Endpoints

Enrichment layer currently not implemented:

  • track.getInfo — most important: adds duration, MBID, accurate album, tags
  • artist.getInfo — biography, similar artists, full stats
  • artist.getTopTags — genre classification (folksonomy tags)
  • artist.getTopAlbums — discography discovery
  • artist.getTopTracks — signature songs
  • album.getInfo — release dates, track lists, cover art URLs

User preference signals:

  • user.getBannedTracks — negative preference data (what you explicitly dislike)
  • user.getArtistTracks — per-artist listening history (subset of all scrobbles)

Discovery and recommendation:

  • track.getSimilar — similar tracks (for recommendations)
  • artist.getSimilar — similar artists
  • tag.getTopArtists — genre exploration
  • tag.getTopTracks — genre discovery

Rate Limiting

Practical limit: about 5 requests/second (community-cited; see footnote 1)
Consequences of exceeding it: IP ban (temporary or permanent)

At scale:

  • 100,000 unique tracks = ~5.5 hours of enrichment
  • 10,000 unique artists = ~33 minutes of enrichment
  • 1,000 albums = ~3.3 minutes of enrichment
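
The estimates above are item count divided by request rate. A short sketch of that arithmetic, assuming one request per item and no retries:

```python
def enrichment_seconds(n_items: int, requests_per_second: float = 5.0) -> float:
    """Seconds needed to make one request per item at a fixed request rate."""
    return n_items / requests_per_second


print(round(enrichment_seconds(100_000) / 3600, 2))       # 5.56 (hours)
print(round(enrichment_seconds(10_000) / 60, 1))          # 33.3 (minutes)
print(round(enrichment_seconds(1_000) / 60, 1))           # 3.3 (minutes)
print(round(enrichment_seconds(100_000, 4.0) / 3600, 2))  # 6.94 (hours at 4 req/sec)
```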

Best practice:

  • Add 250ms delay between requests (4 req/sec, safe buffer)
  • Use exponential backoff on errors
  • Cache everything (metadata rarely changes)
  • Enrich in batches overnight, not on-demand
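
A minimal sketch of the delay-plus-backoff practice. The function name, parameter names, and the choice to retry only on `OSError` (which covers most network failures raised by the standard library) are assumptions for illustration. A real client would probably also inspect API error codes.

```python
import time


def call_with_backoff(request, *, min_interval=0.25, max_retries=5,
                      base_delay=1.0, retry_on=(OSError,), sleep=time.sleep):
    """Call request(), retrying failures with exponential backoff.

    After a successful call, pause min_interval seconds so back-to-back calls
    stay at or below 4 requests/second. All names here are illustrative.
    """
    for attempt in range(max_retries + 1):
        try:
            result = request()
        except retry_on:
            if attempt == max_retries:
                raise  # give up after the final retry
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        else:
            sleep(min_interval)  # spacing before the next request
            return result
```

Passing `sleep` as a parameter keeps the function testable without real waiting.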

API Authentication

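Last.fm supports two access modes: a plain API key for read-only calls, and an authenticated session (a session key obtained through Last.fm’s web authentication flow, with signed requests) for write access such as scrobbling.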

For data collection (read-only):

API_KEY=$(rbw get --field API_KEY "LastFM API Key")
curl "https://ws.audioscrobbler.com/2.0/?method=user.getRecentTracks&user=USERNAME&api_key=${API_KEY}&format=json"

Store the API key in Vaultwarden as “LastFM API Key”.
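
The same read-only request can be built from Python. This is a sketch using only the standard library: `build_url` and `fetch` are hypothetical helper names, and the API key is assumed to be passed in by the caller after being retrieved from the vault.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_url(method, api_key, **params):
    """Build a read-only Last.fm API URL that requests JSON output."""
    query = {"method": method, "api_key": api_key, "format": "json", **params}
    return f"{API_ROOT}?{urlencode(query)}"


def fetch(method, api_key, **params):
    """Perform the GET request and return the parsed JSON body."""
    with urlopen(build_url(method, api_key, **params), timeout=30) as resp:
        return json.load(resp)


# Equivalent to the curl command above (USERNAME and KEY are placeholders):
url = build_url("user.getRecentTracks", "KEY", user="USERNAME")
```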

Data Storage Strategy

Recommended approach:

  1. Raw JSON archive — store complete API responses

    • Preserves everything (even fields you don’t use yet)
    • Enables retroactive analysis without re-fetching
    • Example: data/raw/user.getRecentTracks_page_001.json
  2. Normalized database — extract fields you need

    • SQLite for querying, filtering, aggregation
    • Indexed for fast lookups
    • Regenerate from raw JSON if schema changes
  3. Enrichment cache — separate from collection

    • data/enrichment/track_{mbid}.json
    • data/enrichment/artist_{mbid}.json
    • Check cache before API call (avoid re-fetching)
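
The "check cache before API call" step might look like the sketch below. `cached_fetch` is a hypothetical helper, and `fetch` stands in for whatever function performs the real request. The file naming follows the `data/enrichment/{kind}_{key}.json` pattern above. How to key items that have no MBID is left open here.

```python
import json
from pathlib import Path


def cached_fetch(kind, key, fetch, cache_dir="data/enrichment"):
    """Return cached JSON for (kind, key) if present; otherwise fetch and store it.

    kind is e.g. "track" or "artist"; key is e.g. an MBID.
    fetch is a zero-argument callable that performs the real API request.
    """
    path = Path(cache_dir) / f"{kind}_{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
    return data
```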

Use Cases

Music Taste Profiling

Combine collection + enrichment to build genre distribution:

  1. Fetch all scrobbles (user.getRecentTracks)
  2. Extract unique artists
  3. Enrich with artist.getTopTags (genre tags)
  4. Aggregate scrobbles by genre
  5. Visualize genre evolution over time
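
Step 4 can be sketched as a weighted count. The inputs are assumed shapes chosen for illustration: a list of artist names (one per scrobble) and a mapping from artist to normalized tag weights. Neither is a format returned directly by the API.

```python
from collections import Counter


def genre_distribution(scrobble_artists, artist_tags):
    """Aggregate scrobbles into genre shares that sum to 1.

    scrobble_artists: one artist name per scrobble (repeats expected).
    artist_tags: {artist: {tag: weight}}; artists without tags are skipped.
    """
    totals = Counter()
    for artist in scrobble_artists:
        for tag, weight in artist_tags.get(artist, {}).items():
            totals[tag] += weight
    grand = sum(totals.values())
    if not grand:
        return {}
    return {tag: value / grand for tag, value in totals.most_common()}


plays = ["Radiohead", "Radiohead", "Burial"]
tags = {"Radiohead": {"rock": 1.0}, "Burial": {"electronic": 0.5, "rock": 0.5}}
distribution = genre_distribution(plays, tags)
```

Running the same function over the scrobbles of each month or year gives the series needed for step 5.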

Discovery Recommendations

Find new music similar to what you already love:

  1. Identify top artists (user.getTopArtists)
  2. For each, fetch artist.getSimilar
  3. Filter out artists you already have
  4. Rank by similarity score + genre match
  5. Present as “You might like…”
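
A sketch of steps 3 and 4, assuming artist.getSimilar results have already been collected into a dict keyed by seed artist, with each candidate carrying `name` and a `match` score (a string in the JSON). Summing scores across seeds is an illustrative ranking choice, and the genre-match component is left out for brevity.

```python
def recommend(similar_by_seed, known_artists, limit=10):
    """Rank unseen artists by summed similarity across all seed artists.

    similar_by_seed: {seed_artist: [{"name": ..., "match": "0.87"}, ...]}
    known_artists: names already in the library (filtered out, case-insensitive).
    """
    known = {name.casefold() for name in known_artists}
    scores = {}
    for candidates in similar_by_seed.values():
        for candidate in candidates:
            name = candidate["name"]
            if name.casefold() in known:
                continue
            scores[name] = scores.get(name, 0.0) + float(candidate["match"])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:limit]


similar = {
    "Radiohead": [{"name": "Thom Yorke", "match": "0.9"},
                  {"name": "Portishead", "match": "0.5"}],
    "Massive Attack": [{"name": "Portishead", "match": "0.8"},
                       {"name": "Radiohead", "match": "0.4"}],
}
suggestions = recommend(similar, ["Radiohead", "Massive Attack"])
```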

Listening Trends Over Time

Track how your taste changes:

  1. Fetch weekly charts (user.getWeeklyArtistChart)
  2. Calculate genre distribution per week
  3. Plot over time to see taste evolution
  4. Identify discovery events (new genres entering rotation)
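
Step 4 can be sketched as a first-appearance scan over the weekly genre shares. The input shape (a chronological list of `(week, {genre: share})` pairs) and the 5% `threshold` are assumptions for illustration.

```python
def discovery_events(weekly_genres, threshold=0.05):
    """Return (week, genre) pairs for the first week each genre reaches threshold.

    weekly_genres: chronological list of (week_label, {genre: share}) pairs.
    """
    seen = set()
    events = []
    for week, shares in weekly_genres:
        for genre, share in shares.items():
            if share >= threshold and genre not in seen:
                seen.add(genre)
                events.append((week, genre))
    return events


weeks = [
    ("2024-01-01", {"rock": 0.9, "jazz": 0.02}),
    ("2024-01-08", {"rock": 0.7, "jazz": 0.3}),
    ("2024-01-15", {"rock": 0.5, "ambient": 0.5}),
]
events = discovery_events(weeks)
```

Every genre in the first week counts as a discovery, so in practice the earliest weeks are better treated as a baseline.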

Footnotes

  1. The 5 req/sec limit is widely cited in community documentation and is consistent with observed behavior, though Last.fm’s official API terms do not publish a specific numeric rate limit. The practical safety buffer is 4 req/sec (250ms delay). See Last.fm API ToS.