Last.fm is a music scrobbling and discovery service with a comprehensive public API. We use it to collect listening history, enrich track metadata, and analyze music taste over time.
Overview
The Last.fm API provides access to scrobbles (listening history), artist/album/track metadata, user libraries, and community-generated tags. It’s free for non-commercial use, with a commonly cited rate limit of 5 requests/second.[1]
Primary use cases:
- Collecting complete scrobble history for analysis
- Enriching tracks with genre tags, durations, and metadata
- Building music taste profiles and discovery recommendations
- Tracking listening trends over time
This implementation follows the API Sync Pattern — OAuth refresh tokens, incremental fetching, individual JSON files per entity, and automated CI/CD syncing.
Architecture: Collection vs Enrichment
Last.fm’s API is best understood as two layers: collection endpoints and enrichment endpoints.
Collection Layer
These endpoints return your listening history — what you’ve scrobbled, when, and basic track info:
- user.getRecentTracks: recent scrobbles (paginated)
- user.getLovedTracks: tracks marked as loved
- user.getTopArtists: artists by play count
- user.getTopAlbums: albums by play count
- user.getTopTracks: tracks by play count
Characteristics:
- Return lists of things you’ve listened to
- Include basic metadata (artist name, track title, album)
- Paginated (typically 50-200 items per page)
- Fast to fetch (1-2 seconds per page)
Collection strategy:
- Fetch all pages for complete history
- Store raw JSON responses (preserve everything)
- Update incrementally (new scrobbles only)
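A minimal sketch of the fetch-all-pages loop with an incremental cutoff. It assumes the user.getRecentTracks JSON shape (tracks under `recenttracks.track`, page count under `recenttracks["@attr"].totalPages`, newest scrobbles first). `fetch_page` and `since_uts` are hypothetical names: the first is any callable that performs the HTTP request for a page number, the second is the timestamp of the newest scrobble already stored.

```python
from typing import Callable, Iterator


def iter_scrobbles(fetch_page: Callable[[int], dict], since_uts: int = 0) -> Iterator[dict]:
    """Yield scrobbles newer than since_uts, walking pages until exhausted."""
    page = 1
    while True:
        body = fetch_page(page)["recenttracks"]
        tracks = body.get("track", [])
        if isinstance(tracks, dict):
            # Defensive assumption: a lone item may arrive as an object, not a list.
            tracks = [tracks]
        for track in tracks:
            date = track.get("date")
            if date is None:
                # A "now playing" entry has no timestamp yet; skip it.
                continue
            if int(date["uts"]) <= since_uts:
                # Reached scrobbles we already have (assumes newest-first order).
                return
            yield track
        total_pages = int(body.get("@attr", {}).get("totalPages", page))
        if page >= total_pages:
            return
        page += 1
```

Passing the fetcher in as a callable keeps the paging logic separate from HTTP, rate limiting, and raw-JSON archiving, so each can be swapped or tested on its own.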
Enrichment Layer
These endpoints return detailed metadata about music — not your listening, but the music itself:
- track.getInfo: duration, MBID, album, playcount, tags
- artist.getInfo: biography, similar artists, stats
- artist.getTopTags: genre tags (community folksonomy)
- album.getInfo: release date, track list, cover art
- tag.getTopArtists: artists associated with a genre tag
Characteristics:
- Return metadata about a specific item
- Much more detailed than collection endpoints
- Not paginated (one item per request)
- Slow to fetch at scale (5 req/sec = ~5.5 hours for 100k tracks)
Enrichment strategy:
- Separate pass after collection completes
- Fetch once per unique track/artist/album (not per scrobble)
- Cache aggressively (metadata is mostly static)
- Rate limit carefully (5 req/sec hard limit)
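To illustrate "once per unique track/artist/album, not per scrobble", here is a small sketch that reduces a scrobble list to the set of entities that actually need enriching. It assumes each scrobble looks like a user.getRecentTracks item, with the artist name under `artist["#text"]`; `unique_entities` is a hypothetical helper name.

```python
def unique_entities(scrobbles):
    """Return (artists, tracks) sets so each entity is enriched only once."""
    artists, tracks = set(), set()
    for scrobble in scrobbles:
        artist = scrobble["artist"]["#text"]
        artists.add(artist)
        # A track is identified by (artist, title); titles alone collide across artists.
        tracks.add((artist, scrobble["name"]))
    return artists, tracks
```

The size of these sets, not the scrobble count, is what drives enrichment time.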
The Genre Tagging Approach
One of the most valuable enrichment use cases is genre classification via artist.getTopTags:
- For each unique artist in your library, fetch artist.getTopTags
- Get community folksonomy tags (e.g., “indie rock”, “electronic”, “post-punk”)
- Weight by tag count (more votes = more reliable)
- Every scrobble inherits its artist’s tags
- Result: full genre classification across entire listening history
Example:
# artist.getTopTags response for "Radiohead"
{
  "toptags": {
    "tag": [
      {"name": "alternative rock", "count": 100},
      {"name": "indie", "count": 85},
      {"name": "experimental", "count": 72},
      {"name": "electronic", "count": 45}
    ]
  }
}

Now every Radiohead scrobble can be tagged with these genres, weighted by confidence.
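One way to turn a response like this into per-artist genre weights is sketched below. The normalization (each tag's share of the kept counts) and the `min_count` cutoff are illustrative choices, not something Last.fm prescribes; `genre_weights` is a hypothetical helper.

```python
def genre_weights(toptags_response, min_count=10):
    """Map tag name -> share of total count, dropping low-count (noisy) tags."""
    tags = toptags_response["toptags"]["tag"]
    kept = {
        tag["name"].lower(): int(tag["count"])
        for tag in tags
        if int(tag["count"]) >= min_count
    }
    total = sum(kept.values())
    # Normalize so an artist's weights sum to 1; empty dict if nothing survives.
    return {name: count / total for name, count in kept.items()} if total else {}
```

Each scrobble of the artist can then contribute these fractional weights to its genres instead of a full play to every tag.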
Endpoint Coverage
What We’re Collecting
Current endpoints used in the scrobble scraper:
- user.getRecentTracks: complete scrobble history
- user.getInfo: profile stats, registration date
- user.getTopArtists: play count by artist
- user.getTopAlbums: play count by album
- user.getTopTracks: play count by track
- user.getWeeklyChartList: historical weekly charts
- user.getWeeklyArtistChart: artist plays per week
- user.getWeeklyAlbumChart: album plays per week
- user.getWeeklyTrackChart: track plays per week
- user.getFriends: social connections
- user.getLovedTracks: favorited tracks
- user.getPersonalTags: user’s own categorization
Critical Missing Endpoints
Enrichment layer currently not implemented:
- track.getInfo: the most important one; adds duration, MBID, accurate album, tags
- artist.getInfo: biography, similar artists, full stats
- artist.getTopTags: genre classification (folksonomy tags)
- artist.getTopAlbums: discography discovery
- artist.getTopTracks: signature songs
- album.getInfo: release dates, track lists, cover art URLs
User preference signals:
- user.getBannedTracks: negative preference data (what you explicitly dislike)
- user.getArtistTracks: per-artist listening history (subset of all scrobbles)
Discovery and recommendation:
- track.getSimilar: similar tracks (for recommendations)
- artist.getSimilar: similar artists
- tag.getTopArtists: genre exploration
- tag.getTopTracks: genre discovery
Rate Limiting
Hard limit: 5 requests/second
Consequences: IP ban (temporary or permanent)
At scale:
- 100,000 unique tracks = ~5.5 hours of enrichment
- 10,000 unique artists = ~33 minutes of enrichment
- 1,000 albums = ~3.3 minutes of enrichment
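The figures above are plain division: one request per unique item, divided by the request rate. A short sketch to reproduce them, and to see the cost of running slightly below the limit (`enrichment_seconds` is a hypothetical helper name):

```python
def enrichment_seconds(unique_items: int, requests_per_second: float = 5.0) -> float:
    """Time to enrich N items at one request each, ignoring latency and retries."""
    return unique_items / requests_per_second


for label, n in [("tracks", 100_000), ("artists", 10_000), ("albums", 1_000)]:
    at_limit = enrichment_seconds(n) / 60
    buffered = enrichment_seconds(n, requests_per_second=4.0) / 60
    print(f"{n:>7,} {label}: {at_limit:.1f} min at 5 req/s, {buffered:.1f} min at 4 req/s")
```

At 4 requests/second the 100,000-track pass grows from roughly 5.6 hours to roughly 6.9 hours, which still fits an overnight batch.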
Best practice:
- Add 250ms delay between requests (4 req/sec, safe buffer)
- Use exponential backoff on errors
- Cache everything (metadata rarely changes)
- Enrich in batches overnight, not on-demand
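The delay and backoff rules can be combined in one small wrapper. This is a sketch under simplifying assumptions: `request` is any zero-argument callable that performs one API call, every exception is treated as retryable (a real client would likely distinguish error types), and `sleep` is injectable only so the timing can be tested.

```python
import time
from typing import Callable


def call_with_backoff(
    request: Callable[[], dict],
    min_interval: float = 0.25,   # 250ms spacing, i.e. about 4 req/sec
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run one request, retrying with exponentially growing waits on failure."""
    for attempt in range(max_retries + 1):
        try:
            result = request()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        sleep(min_interval)  # pause after success so the next call is spaced out
        return result
```

Sleeping after each successful call is the simplest way to stay under the limit in a single-threaded batch job; concurrent workers would need a shared limiter instead.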
API Authentication
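Last.fm supports API-key-only access (read-only) and authenticated sessions with signed requests (required for write operations such as scrobbling). The write flow is Last.fm’s own token and session-key scheme rather than standard OAuth.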
For data collection (read-only):
API_KEY=$(rbw get --field API_KEY "LastFM API Key")
curl "https://ws.audioscrobbler.com/2.0/?method=user.getRecentTracks&user=USERNAME&api_key=${API_KEY}&format=json"

Store the API key in Vaultwarden as “LastFM API Key”.
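For scripted collection, the same read-only request URL can be assembled in Python. This sketch only builds the URL (no network call); `build_url` is a hypothetical helper, and the parameters mirror the curl example above.

```python
from urllib.parse import urlencode

API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_url(method: str, api_key: str, **params: str) -> str:
    """Build a read-only Last.fm request URL with properly escaped parameters."""
    query = {"method": method, "api_key": api_key, "format": "json", **params}
    # urlencode percent-escapes values, so names like "AC/DC" are safe to pass.
    return f"{API_ROOT}?{urlencode(query)}"
```

For example, `build_url("user.getRecentTracks", api_key, user="USERNAME", page="2")` yields the URL for the second page of scrobbles.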
Data Storage Strategy
Recommended approach:
- Raw JSON archive: store complete API responses
  - Preserves everything (even fields you don’t use yet)
  - Enables retroactive analysis without re-fetching
  - Example: data/raw/user.getRecentTracks_page_001.json
- Normalized database: extract fields you need
  - SQLite for querying, filtering, aggregation
  - Indexed for fast lookups
  - Regenerate from raw JSON if schema changes
- Enrichment cache: separate from collection
  - data/enrichment/track_{mbid}.json
  - data/enrichment/artist_{mbid}.json
  - Check cache before API call (avoid re-fetching)
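The "check cache before API call" rule can be sketched as a read-through file cache that follows the `{kind}_{key}.json` naming above. `cached_fetch` and its parameters are hypothetical; `fetch` stands in for whatever performs the real API request. Entities without an MBID would need some fallback key, which this sketch leaves to the caller.

```python
import json
from pathlib import Path
from typing import Callable


def cached_fetch(cache_dir: Path, kind: str, key: str, fetch: Callable[[], dict]) -> dict:
    """Return cached JSON for (kind, key) if present; otherwise fetch and store it."""
    path = cache_dir / f"{kind}_{key}.json"
    if path.exists():
        # Cache hit: no API request is made.
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return data
```

Because enrichment metadata is mostly static, a cache with no expiry is a reasonable starting point; deleting a file is enough to force a refresh.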
Use Cases
Music Taste Profiling
Combine collection + enrichment to build genre distribution:
- Fetch all scrobbles (user.getRecentTracks)
- Extract unique artists
- Enrich with artist.getTopTags (genre tags)
- Aggregate scrobbles by genre
- Visualize genre evolution over time
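The aggregation step might look like the sketch below. It assumes scrobbles have already been reduced to `(artist, unix_timestamp)` pairs and that each artist has a `{genre: weight}` mapping from the enrichment pass; bucketing by calendar month (UTC) is an arbitrary choice for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def genre_plays_by_month(scrobbles, artist_tags):
    """scrobbles: iterable of (artist, uts); artist_tags: artist -> {genre: weight}."""
    months = defaultdict(Counter)
    for artist, uts in scrobbles:
        month = datetime.fromtimestamp(uts, tz=timezone.utc).strftime("%Y-%m")
        # Each scrobble inherits its artist's genres; untagged artists add nothing.
        for genre, weight in artist_tags.get(artist, {}).items():
            months[month][genre] += weight
    return {month: dict(counts) for month, counts in months.items()}
```

The resulting month-by-genre table is what a stacked area chart of genre evolution would be drawn from.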
Discovery Recommendations
Find new music similar to what you already love:
- Identify top artists (user.getTopArtists)
- For each, fetch artist.getSimilar
- Filter out artists you already have
- Rank by similarity score + genre match
- Present as “You might like…”
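A simplified sketch of the filter-and-rank steps follows. It assumes the artist.getSimilar results have already been parsed into `(name, match)` pairs per seed artist, and it ranks by summed similarity only; the genre-match component is left out to keep the example short. `recommend` is a hypothetical helper.

```python
from collections import defaultdict


def recommend(similar_by_seed, known_artists, limit=10):
    """similar_by_seed: seed artist -> list of (candidate name, match score)."""
    known = {name.lower() for name in known_artists}
    scores = defaultdict(float)
    for candidates in similar_by_seed.values():
        for name, match in candidates:
            if name.lower() in known:
                continue  # already in the library: not a discovery
            # Candidates similar to several seed artists accumulate score.
            scores[name] += float(match)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]
```

Summing across seeds favors artists that sit near several of your favorites over ones that are very close to just one.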
Listening Trends
Track how your taste changes:
- Fetch weekly charts (user.getWeeklyArtistChart)
- Calculate genre distribution per week
- Plot over time to see taste evolution
- Identify discovery events (new genres entering rotation)
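One possible definition of a "discovery event" is the first week a genre reaches a minimum share of that week's plays. The sketch below implements that definition; the 5% threshold and the input shape (chronological list of `(week_label, {genre: plays})`) are assumptions for illustration.

```python
def discovery_events(weekly_genres, min_share=0.05):
    """Return (week, genre) for the first week each genre reaches min_share of plays."""
    seen = set()
    events = []
    for week, plays in weekly_genres:
        total = sum(plays.values())
        for genre, count in sorted(plays.items()):
            if genre not in seen and total and count / total >= min_share:
                events.append((week, genre))
                seen.add(genre)  # only count a genre once it has cleared the bar
    return events
```

The threshold keeps a single stray play from registering as a new genre entering rotation.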
Related Projects
- Morning Briefing — uses Last.fm API for Sunday listening digest
- Concert Radar — cross-references Last.fm artists with Bay Area shows
- Training Metrics and Automated Coaching — similar enrichment pattern for fitness data
See Also
- Last.fm API Documentation
- Credential Management — secure API key storage
- Agent Skills — includes skills that call the Last.fm API
- Quantified-Self Health Analytics — uses Last.fm scrobble timestamps as a behavioral proxy for late-night activity and sleep delay
Footnotes
1. The 5 req/sec limit is widely cited in community documentation and is consistent with observed behavior, though Last.fm’s official API terms do not publish a specific numeric rate limit. A practical safety buffer is 4 req/sec (250ms delay). See Last.fm API ToS.