Spotify "Shuffle": What's Going On, and Does it Matter?
Shane has been noticing some strange patterns with Spotify "shuffle"
Spotify shuffle isn’t actually true shuffle.
I’ve known this for a while, and there’s a good amount of content out there explaining how it works. That quick blog post is a good starting place.
I’m not so sure that’s all there is to it.
Here, check this out:
This is pulled from a web application called last.fm, which records and archives my listening habits. When I first registered in 2005 it was pulling directly from my iTunes library, and now it pulls from my Spotify library.
These four songs played consecutively in a “shuffle” of my Master Library playlist, which consists of 8,165 songs as of publication of this blog. Way back in 2014, Spotify explained how its dithering-model shuffle worked, using the example of algorithmically spreading an artist’s songs evenly across a playlist.
In the above image provided by Spotify, the red dots indicate songs on a playlist by The White Stripes. Instead of a true shuffle, they are spread out evenly across the playlist.
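For the curious, here is a minimal Python sketch of that kind of artist spreading. It is my own reconstruction from the description above, not Spotify’s code: the function name spread_shuffle, the (artist, title) tuples, and the 10% jitter are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def spread_shuffle(tracks, rng=None):
    """Return a new order in which each artist's songs are spread out.

    tracks: list of (artist, title) tuples.
    rng: optional random.Random instance, for reproducible results.
    """
    if rng is None:
        rng = random.Random()

    # Group the titles by artist.
    by_artist = defaultdict(list)
    for artist, title in tracks:
        by_artist[artist].append(title)

    positioned = []
    for artist, titles in by_artist.items():
        rng.shuffle(titles)  # random order within one artist
        n = len(titles)
        # Place the artist's songs at roughly even intervals on a 0..1 line,
        # starting from a random offset so no artist always comes first.
        offset = rng.uniform(0, 1 / n)
        for i, title in enumerate(titles):
            # Small jitter (assumed 10% of the spacing) so it feels less rigid.
            jitter = rng.uniform(-0.1 / n, 0.1 / n)
            positioned.append((offset + i / n + jitter, artist, title))

    # Merge all artists by their position on the line.
    positioned.sort(key=lambda item: item[0])
    return [(artist, title) for _, artist, title in positioned]
```

The point of the sketch is that every song still gets played exactly once, but the order is no longer a uniformly random one: songs by the same artist are deliberately pushed apart.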
But let’s again look at the image of the four songs from my master library. The song at the bottom was the one played first, as you can tell by the timestamps. Do you see any connections here? Keep in mind, this is a playlist of over 8,000 songs.
First, “Fall in Love” by Phantogram segues into “Real Love Baby” by Father John Misty. Pretty easy to see the “love connection” there. Nothing crazy, though. Lots of songs have the word “love” in the title.
Now let’s take a look at song three: “Phantoms” by Young Galaxy. Song one was by Phantogram, and there aren’t too many “Phanto-” songs or artists out there. Then track four is “Babylon” by David Gray, just two tracks after “Real Love Baby,” another title with “Baby” in it.
So, as you can see, all four of these songs, which played consecutively in a playlist of over 8,000 songs, are connected in some way. I like to call this “sticky shuffle.” If you have a friend who sucks at shuffling cards, you’ll note that groups of a few cards tend to stick together in the shuffling of the deck. Same concept here. These songs seemed to stick to each other based on keywords and likely other variables, too. You can also tell by the timestamps that the songs are roughly the same length (though this isn’t odd at all; most “pop” songs clock in at between three and four minutes). If you listen, you’ll find the tempo (BPM) of the songs to be quite similar. This is more interesting to me than the song lengths.
Regardless, it is clear to me that something is going on here. In a playlist the size of my Master Library, statistical improbabilities like the one highlighted above make it increasingly obvious when the algorithm is algorithm-ing.
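One way to put a number on “sticky” is to count keyword links between nearby tracks and compare that with what a plain random shuffle of the same playlist produces. Below is a rough Python sketch of that comparison. The linking rule (two tracks are linked if any of their words share the first four letters), the two-track window, and the function names are all my own assumptions; nothing here reflects how Spotify actually works.

```python
import random


def words(text):
    """Lower-cased words of a string, with surrounding punctuation stripped."""
    return {w.strip(".,!?()\"'").lower() for w in text.split()}


def linked(a, b, k=4):
    """True if any word in a and any word in b share their first k letters."""
    prefixes_a = {w[:k] for w in words(a) if len(w) >= k}
    prefixes_b = {w[:k] for w in words(b) if len(w) >= k}
    return bool(prefixes_a & prefixes_b)


def linked_pairs(order, window=2, k=4):
    """Count linked pairs of tracks that sit at most `window` plays apart."""
    count = 0
    for i in range(len(order)):
        for j in range(i + 1, min(i + window + 1, len(order))):
            count += linked(order[i], order[j], k)
    return count


def baseline(tracks, trials=1000, seed=0, window=2, k=4):
    """Average linked_pairs over `trials` uniformly random shuffles."""
    rng = random.Random(seed)
    pool = list(tracks)
    total = 0
    for _ in range(trials):
        rng.shuffle(pool)
        total += linked_pairs(pool, window, k)
    return total / trials


# The four plays from the screenshot, as "artist title" strings.
observed = [
    "Phantogram Fall in Love",
    "Father John Misty Real Love Baby",
    "Young Galaxy Phantoms",
    "David Gray Babylon",
]
print(linked_pairs(observed))  # 3 linked pairs under this rule
```

To use it for real, you would feed baseline() the full playlist, scale the result down to a four-song stretch, and compare that with the observed count. A single suspicious run proves little on its own, so the comparison only becomes meaningful over a long listening history.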
So why does this even matter?
Let’s take some quotes from a 2019 Big Think article titled “How Spotify manipulates your emotions and sells your data,” which highlights the book Spotify Teardown: Inside the Black Box of Streaming Music, written by a group of five researchers and published by MIT Press:
Music is only the layer you hear above “a cacophony of other data.” Using browser plugin Ghostery and network data capture tool Fiddler, the authors worked with a programmer to discover no less than 22 mostly advertising-related companies in that cacophony, tracking listening habits and providing real-time analytics. This data is packaged and resold.
It’s clear that selling ad space and enabling advertising in general is a huge part of the Spotify business model, as it is for the vast majority of major corporations. But what is going on with Spotify user data before advertisements reach the user, and how much of a factor are playlist algorithms in that process?
Spotify wants you to spend a lot of time on its product. I pay for the premium service, because I do indeed spend a lot of time using its product and I do not want to be inundated with advertisements every few songs. I am aware my data is being collected regardless, though in my eyes the convenience of the Spotify product outweighs the cost of that data collection.
It does not make me any less curious about what Spotify is testing with its algorithms. How much of our emotional state is influenced by playlists and algorithms? What exactly is going on with the algorithm for shuffle? Clearly dithering and the Fisher-Yates shuffle do not paint the full picture, and 2014 was a long time ago.
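For reference, this is what a “true shuffle” looks like. The Fisher-Yates shuffle walks the list from the end and swaps each position with a randomly chosen position at or before it, which gives every possible ordering the same chance. The Python below is a textbook sketch with names of my own choosing, not anything taken from Spotify.

```python
import random


def fisher_yates(items, rng=None):
    """Return a uniformly shuffled copy of items (Fisher-Yates shuffle)."""
    if rng is None:
        rng = random.Random()
    result = list(items)
    # Walk backwards; swap each slot with a random slot at or before it.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)  # randint includes both endpoints
        result[i], result[j] = result[j], result[i]
    return result
```

Under this kind of shuffle, a song added in 2013 has exactly the same chance of landing in any given slot as one added last week.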
Shuffle’s behavior shows up in other ways, too. Songs I’ve been listening to recently usually find their way to the top of my shuffle, and songs I added to the playlist back in 2013 seldom resurface in a regular listening session. I would imagine past listening habits are also tracked to see which songs kept you listening, which songs ended a listening session, et cetera. I’m sure they have quite a lot of data, seeing as I’ve played around 70,000 songs on Spotify since 2013.
Does the good outweigh the bad? Does Spotify cater to me by playing what I want to hear, or am I robbed of hearing songs that normally had an equal shot of surfacing in a “true shuffle”? How much does Spotify know about my moods and emotional states? How valuable is listening data, really? To whom is this listening data sold? What do they know? Am I losing a sense of agency and room for the serendipitous by constantly listening to algorithmically-driven playlists?
There’s plenty to unpack here, but the first step is awareness. It’s clear that there’s more than meets the eye when it comes to Spotify and its business practices, and shuffle ain’t really shuffle.