Music Preference Dispersion Through the Last.fm Network: Data Collection Approach

First, an update on my attempt to visualize more of the data I collected last week. Despite my best efforts, I just can’t get any visualization program to handle all of the data I have (now over 1,000,000 friendships). Once I go past about 25,000-50,000 users, the processing time goes way beyond anything my system can handle. Too bad. If someone has a supercomputer somewhere running social network visualization software, let me know and I’ll send you my data.

However, I’m far from ready to call it quits. Instead, I’m taking a new approach. Here’s the plan:

1. Rather than collecting data on all users, I will start with one or two “seed” users and perform a snowball data collection process out to 2 degrees of separation (friends and friends of friends). This should yield about 5,000 friendships per seed (in fact I did this for one seed already and it came to 4,844 friendships).
2. I will also collect data on users who are not part of that particular social network to use as a control.
3. Then I will collect the music listening history for each user. This, from my experience, will take a while.
4. Then I will attempt to map the diffusion of newly introduced songs through this network.
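
Step 1 can be sketched as a breadth-first crawl that stops two hops from the seed. This is only a rough sketch, not my actual collection script: `get_friends` is a hypothetical stand-in for whatever call returns a user's friend list, and the toy dictionary below replaces real Last.fm data.

```python
from collections import deque

def snowball(seed, get_friends, max_depth=2):
    """Breadth-first crawl out to max_depth hops from the seed user.

    get_friends is a hypothetical callable: user -> iterable of friend names.
    Returns (depth_by_user, friendships); each friendship is an unordered pair.
    """
    depth = {seed: 0}
    friendships = set()
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        if depth[user] >= max_depth:
            continue  # users on the outer boundary are recorded but not expanded
        for friend in get_friends(user):
            friendships.add(frozenset((user, friend)))
            if friend not in depth:
                depth[friend] = depth[user] + 1
                queue.append(friend)
    return depth, friendships

# Toy friend lists standing in for real data (an assumption for illustration).
toy = {
    "seed": ["a", "b"],
    "a": ["seed", "c"],
    "b": ["seed"],
    "c": ["a", "d"],
    "d": ["c"],
}
depth, friendships = snowball("seed", lambda user: toy.get(user, []))
```

In the toy data, user "d" is three hops from the seed, so the crawl never reaches it. Storing each friendship as an unordered pair keeps a mutual friendship from being counted twice.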

The idea is that as a new song emerges, it has to start somewhere. If social networks help spread new songs/artists, then I would expect to see preferences spread through the network. In other words, the probability that a “friend” (or a friend of a friend) of an early listener chooses to listen to a song should be greater than the probability that a non-friend does.
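
To make that comparison concrete, here is a minimal sketch of the calculation, assuming I already have the set of users who listened to a given song. The user names and the helper `adoption_rate` are made up for illustration.

```python
def adoption_rate(users, listeners):
    """Fraction of `users` who appear in `listeners` (0.0 for an empty group)."""
    users = set(users)
    if not users:
        return 0.0
    return len(users & set(listeners)) / len(users)

# Hypothetical groups: users within two hops of the seed vs. the control group.
network_users = {"a", "b", "c", "d"}
control_users = {"w", "x", "y", "z"}
listeners = {"a", "b", "c", "x"}  # everyone who played the song

rate_network = adoption_rate(network_users, listeners)
rate_control = adoption_rate(control_users, listeners)
```

If the network really does help spread a song, `rate_network` should come out noticeably higher than `rate_control` across many songs. One toy example like this proves nothing on its own.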

This should let me do some interesting statistical analyses as well as produce some cool visualization movies. Imagine a mesh with each node representing a user. Because I have a time series of listening history, I can have each node “light up” when that user listens to a song. If there is no network effect, then the lighting up should be random. However, if there is, then I should observe “lights” spreading through the network.
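
One way to test “random vs. spreading” is a permutation test: count the friendships where both users listened to the song, then see how often a random reshuffling of the same number of listeners produces a count at least that high. The sketch below is an assumption about how I might do it, not a finished method. It ignores the timing of listens, and the 10-user chain is toy data.

```python
import random

def adopter_edges(edges, adopters):
    """Number of friendships whose two endpoints both listened to the song."""
    return sum(1 for u, v in edges if u in adopters and v in adopters)

def permutation_p_value(nodes, edges, adopters, trials=1000, seed=0):
    """Share of random listener assignments that cluster at least as much."""
    rng = random.Random(seed)
    nodes = list(nodes)
    observed = adopter_edges(edges, adopters)
    at_least = 0
    for _ in range(trials):
        shuffled = set(rng.sample(nodes, len(adopters)))
        if adopter_edges(edges, shuffled) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (trials + 1)

# Toy network: ten users in a chain, with listeners bunched at one end.
nodes = range(10)
edges = [(i, i + 1) for i in range(9)]
adopters = {0, 1, 2, 3}
observed, p_value = permutation_p_value(nodes, edges, adopters)
```

A small `p_value` means the listeners sit closer together in the network than chance would suggest. That is consistent with a network effect, though it could also just mean that friends have similar taste.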

I’m leaving for a conference tomorrow, so my blogging rate will slow down a bit (back next Monday), but I’ll have my data-collecting programs running in the meantime. Hopefully, when I return I’ll have more data than I’ll know what to do with!

