Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
Data is valuable (Score:5, Interesting)
Re: (Score:3, Insightful)
Re: (Score:1, Interesting)
I don't necessarily even use the data for anything, I just like how it's there and I can play around with it and search through it. I just go to a web service, write a script to harvest the valuable data from it, save it to a db, and let scripts periodically check if there's new data, either through my own scripts or RSS.
Back in the Audioscrobbler days Last.FM used to provide
Re: (Score:1, Interesting)
And when searching for youtube videos from my irc bot it gives a warm feeling knowing it comes from my local db instead of youtube's, no matter how useless that i
Re:Data is valuable (Score:4, Insightful)
Sounds like a slight variation on those people who have TB's of movies/music/videos/TV episodes/etc that they will never have the time to watch/listen to.
Re: (Score:2, Offtopic)
I'm sort of half and half. I collect mountains of ebooks from alt.binaries.ebooks.technical (currently close to 84GB). I probably have fifty thousand dollars worth of stuff. Ninety nine percent of it I will never read, because who has the time? BUT it's there as a reference,too. If I need to know something in more depth than I can get with a quick Google, I've got my huge library. And I have a reading list of the most important stuff that I do need to read. I also capture tons of Web pages with information
Re: (Score:3, Interesting)
and tell me mr Coward what have you deduced from your pile of information
So what if he has never done one useful thing with it? People like that provide a public service; it's people like that who enabled DejaNews and now Google Groups to reconstruct much of the historical usenet. If his hobby is data hoarding, then let him hoard. It doesn't cost you a dime, but one day it might possibly be of great benefit.
Re: (Score:3, Insightful)
Good points. You had me until you said "entity" (do you know what that means? I doubt it) in the place of, I assume, "commodity".
Oh and the repeat after me bit is silly. The "information" you have is worthless on its own. It only becomes valuable when it's coupled with lots of other similar "informations" from other people. By retaining this information you're only preventing someone from making money, without any benefit for yourself, which is arguably dickish. Oh and saying that "information is more valua
Re: (Score:2)
Re:Data is valuable (Score:5, Funny)
Re: (Score:2, Funny)
I disagree. When I was a senior in HS, we had a smoking hot student teacher. I would have paid to get molested by her.
LK
Comment removed (Score:5, Funny)
Re: (Score:3, Funny)
unique order of songs (Score:5, Interesting)
what i find most interesting is the order in which certain songs "go together", like listening to a song from Slayer, followed by, say, "someday i suppose" from the bosstones. when composing songlists, i appreciate how similar songs and moods can flow, but also how the contrast of dissimilar songs can SOMETIMES complement each other.
a large database could ferret out such instances that might occur frequently in multiple playlists.
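A rough sketch of how such instances might be ferreted out: count how often one song is immediately followed by another across many playlists and keep the pairs that recur. The playlist contents, the `min_count` cutoff and the function name are made up for illustration; nothing here is Last.fm's actual method.

```python
from collections import Counter

def frequent_transitions(playlists, min_count=2):
    """Return (song, next_song) pairs that occur in at least min_count places."""
    counts = Counter()
    for playlist in playlists:
        # zip pairs each track with the one played right after it
        counts.update(zip(playlist, playlist[1:]))
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]

# Hypothetical playlists; "slayer song" stands in for any Slayer track.
playlists = [
    ["slayer song", "someday i suppose", "track a"],
    ["track b", "slayer song", "someday i suppose"],
    ["track a", "track b"],
]

for (first, second), n in frequent_transitions(playlists):
    print(f"{first} -> {second}: seen {n} times")
```

With real listening data the cutoff would have to be much higher, and very popular songs would need some normalisation so they don't dominate every pairing.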
Re: (Score:3, Funny)
To get you into the right mood, think of the impact it could have on mind manipulation
Re: (Score:2, Interesting)
What about "songs are mostly played in alphabetical order"? :)
Re: (Score:3, Insightful)
So your contribution, then, is noise.
But this noise does not affect the signal, which is still there. It's just harder to find.
Nobody ever said mining a mountain of data like this would be a trivial task.
Now What... (Score:2, Interesting)
I have a similar site that I wrote (pre-audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 tb than hundreds of tb. The question is, how do you monetize the data?
I just don't see how this data is "worth" 200 million bucks. I have some amazing algorithms to do similar cleaning, caching, and recommendations, but still what is that worth?
This is a fairly legit question. If you can figure it out, I can explain to my wife why I have 3 servers in my closet.
Re:Now What... (Score:5, Insightful)
If you could (accurately) answer that question, then you'd act upon the answer...
Why do you think Google ads are Google's bread and butter as far as cashflow goes? The reason is that Google has a treasure trove of user data, probably more than anyone else, so they can really make contextual ads work. Anyone can write an ad engine, but not everyone has access to mountains and mountains of user data.
You might be surprised at how important context is when you're trying to promote something. Say you're trying to promote an online RPG like Game! [wittyrpg.com], if you took a random collection of people, probably less than 5% of them would be interested in playing, but if you can target gamers specifically, that number might jump to 50%. If you're paying for every impression, that makes a world of difference.
So not only do you need to understand your audience, you also need to effectively target them. Now, how do you do that? Data mining of course, and the more data the better.
Pretty much all data has value, figuring out how to turn that data into money is extremely subjective and might involve some black magic, and definitely requires luck too.
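To put numbers on the 5% versus 50% example above, here is a back-of-the-envelope sketch. The price per impression is a hypothetical figure chosen only to make the arithmetic concrete; the two interest rates are the ones from the example.

```python
def cost_per_interested_viewer(cost_per_impression, interest_rate):
    """Average ad spend needed to reach one interested viewer."""
    if not 0 < interest_rate <= 1:
        raise ValueError("interest_rate must be in (0, 1]")
    return cost_per_impression / interest_rate

COST_PER_IMPRESSION = 0.01  # hypothetical price in dollars, not a real quote

untargeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.05)
targeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.50)

print(f"untargeted: ${untargeted:.2f} per interested viewer")
print(f"targeted:   ${targeted:.2f} per interested viewer")
```

Whatever the actual price is, the ratio stays the same: going from 5% to 50% interest makes each interested viewer ten times cheaper to reach.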
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
Only reason I bothered replying is that, for now Last.fm is probably my favorite online listening outlet. I prefer it over downloading (legal or illegal) simply because I don't have to download or store anything. This also keeps me open to new and similar artists.
As a musician I really like this feature.
I'd be heartbroken if somehow CBS turned this into a monthly fee type radio function like Sirius or something. I think there is a lot more potential for mobile
Re: (Score:2)
http://slacker.com/ [slacker.com] does something similar. You choose artists, and based on that it chooses similar artists to play. They offer a portable that does a good job of recreating that experience on the go.
Re: (Score:3, Interesting)
Re: (Score:3, Funny)
figuring out how to turn that data into money ... might involve some black magic, and definitely requires luck too.
So what you are saying is:
1. Data
2. ???
3. Profit!
:~)
Re: (Score:2)
You're confusing context-sensitive advertising with behavioral targeting.
All you need to do context sensitive advertising is a bot to crawl your publishers pages (Google uses MediaBot) and the ability to cache the HTML and do data-structure and keyword-density analysis on it.
This isn't easy, but having user data is unnecessary.
Of course, the best systems will pair context sensitivity with behavioral sensitivity and produce truly valuable ads.
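For the curious, the keyword-density part can be sketched in a few lines. This assumes the page text has already been extracted from the cached HTML, and the stop-word list is a tiny hypothetical one; a real crawler would do far more (markup weighting, stemming, and so on).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

def keyword_density(text, top_n=3):
    """Return the top_n words with their share of all non-stop words."""
    words = [w for w in re.findall(r"[a-z0-9']+", text.lower())
             if w not in STOP_WORDS]
    if not words:
        return []
    counts = Counter(words)
    return [(word, count / len(words)) for word, count in counts.most_common(top_n)]

page_text = "BMW dealers sell the BMW"
print(keyword_density(page_text, top_n=1))
```

The densest keywords are then matched against advertiser keywords; no per-user data is involved at this stage.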
I'm a developer at a large CPA advertising startup. We're building
Re: (Score:2)
Actually, I think with data like Google has, you can also make context-sensitive things better. For example, say someone searches for BMW, a naive context-sensitive engine would display ads about BMWs, but you can go beyond that, a BMW is a car, so you can display ads about cars too, and if your engine is really good, it'll know that BMWs are high end luxury cars, and not to show ads about beat up used Fords, for example. That informat
Re: (Score:2)
True.
The first example is all about so-called "semantic web" technology. And the thing is, Google's index does contain the data you'd need to build semantic context about, as in your example, what a "BMW" is.
But that info Google makes available via its API to anybody willing to pay a couple cents.
The SECRET data Google has available is what makes your second example possible, and smaller ad networks simply don't have the breadth of publishers needed to gather a dataset that rich for each user.
The technique,
Re: (Score:2)
Hmm, interesting. I didn't know that they made that information available. I've made a mental note for future reference.
Re: (Score:1)
Re:Now What... (Score:5, Funny)
Information wants to be free.
Information wants to be a ballerina.
Re:Now What... (Score:5, Funny)
Information wants to be free.
Information wants to be a ballerina.
Then information needs to get her fat ass on a diet or she's never going to fit into that tutu and make Mommy proud!
Re: (Score:2)
That kind of parenting made information a heroine addicted stripper, now come over here and rub your data against me for a dollar.
Re: (Score:2)
That kind of parenting made information a heroine addicted stripper
It made Miss Information into a hero, and also an addicted stripper?
b
It's so popular... (Score:5, Funny)
The summary wasn't insulting enough, so I think I'll just add a bit extra.
Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite.
Apparently I'm not one of the cool kids. I'm sad now, and my feelings are hurt.
Re: (Score:3, Funny)
Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite... a step away from churning your own butter.
Sorry, had to add my own.
Re: (Score:2)
a step away from churning your own butter
I do churn my own butter, you insensitive clod....
Re: (Score:1)
Well, I've never been to a barn raising, and I don't churn butter. I do make home-made ice cream, though, does that count? We also make and can our own spaghetti sauce, and applesauce. I don't have a cell phone because it's bad enough we have a landline; one of my life goals is to someday live in a house with no phones whatsoever. And I prefer the 80-key XT keyboard over the newer 101-key layout. And I still use Perl
Re: (Score:2)
all this data yet so much gets missed (Score:2, Insightful)
Re: (Score:2)
I missed a Metric show that I wouldn't have had they, who know I'm a Metric fan, warned me.
They know what I like, and they have info about albums and shows; how hard is it to fire off an actually interesting newsletter once in a while?
Re: (Score:2)
They can notify you when a band you might be interested in is playing in the area, just subscribe to their recommended events calendar or RSS feed.
Re: (Score:1)
Well, have you tried keeping tabs on what dozens and dozens of bands are up to? Some bands do slip through the cracks. Though MusicBrainz now offers this with their collections service, telling you when a band has a new album out.
No revolution (Score:5, Interesting)
CBS certainly thinks so -- they bought the company for £140 (~$200) million last year.
Which is why whatever comes of them, at best it will be evolutionary. CBS is part of the old guard RIAA corps, they are just one of the faces of Viacom - all controlled by Sumner Redstone. They may have brought some money to the table, but they brought a whole ton of baggage with them too. Enough baggage to make this privacy freak decide they couldn't be trusted with all that data they've been collecting (for example, if they can track down a stolen laptop, they can track down someone playing an MP3 from an illegally leaked pre-release album).
Re: (Score:1)
Re: (Score:2)
Oligopoly means minimal competition. You assume that CBS has figured out that the game has changed enough that the RIAA membership is no longer an effective monopoly. Given the goose-egg of evidence to support that theory, I sincerely doubt they have.
Re: (Score:2)
OTOH they did buy a place infested with people used both to p2p downloading and to new forms of promotion/legit distribution channels, and whose musical taste doesn't reflect current radio charts at all.
One would have thought they knew what they were buying...so who knows.
So... I've been living on Mars? (Score:3, Insightful)
Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.
You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.
I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.
Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!
>>goes back to guarding lawn with a shotgun from an old rocking chair...
Re: (Score:2)
Re: (Score:3, Insightful)
I've never heard of them either and the bit about living on Mars also irks. And for all the arrogance in that, the summary makes it sound like the internet radio outfit needs fancy algorithms to tell what music they're playing. WTF don't they just program the correct name when they add a new song to their database? I'd read the article, but my shuttle back to Mars is leaving...
Re: (Score:2)
They use the algorithm to determine what you're playing, not what they're playing. It sounds like they are saying they can figure out what song you are scrobbling without looking at the song's tag info.
Re: (Score:2)
GreatGPPost:DutchGun(3RWX)@/.#1a6Kp-8CR0ded:
> > Last.FM, for those who have been living on
> > Mars for the last two years, is the largest
> > online radio outlet, with millions of
> > listeners per day.
>
> You know, I'm not exactly what you'd call a
> Luddite, yet I've never heard of Last.FM.
> Am I the only one? I kind of doubt it.
>
> I have a general gripe about anyone who
> writes "for those who have been living on
> Mars" anytime they reference some
> moderately
Re: (Score:2)
Wow, how high were you when you wrote that? I only skimmed it so I can't respond to the whole thing but you are however right that the lyric is quite dark. However that's fitting really since it's about Romeo and Juliet carrying out their suicides together. That being the case, the everlasting peace isn't really of the desirable sort either.
The inclusion of the lyric is not meant as an endorsement of either mutual suicide or wishing death on the Montague and Capulet clans, I am in fact a rather strict pacif
Re: (Score:2)
Now I'm completely lost. How would YOU not know what song YOU are playing? If there is song tag info, wouldn't your player display that for you? Why on Earth would anyone need to connect to some service for this info?
Re: (Score:2)
I don't know really, I'm just trying to interpret what the article is saying. All I do know is that the algorithm is being used to ID your music, not theirs so that's why just tagging their music correctly is not an option.
Re: (Score:2)
Er, how about if you're trying to listen to something other than music?
Why would a song have any tag other than "song", to discriminate it from items that have tags "interesting", "worth paying attention to", etc. (BTW - by "player", do you mean one of those things for playing audiobooks on.)
Speaking as a Martian, I had actually heard of Last.fm ; I hadn't actuall
Re: (Score:3, Informative)
Well, Amarok has a config menu entry with a big old icon with the label "last.fm" on it. Everyone who has ever used Amarok had to pass their cursor over the label "last.fm", which has been there for a few years, mind you. Other media players also support last.fm, whether through a plugin or even built in. So you may not have been living under a rock but you sure were quite a bit distracted. For at least the last 6 years or so.
On a side note, I've made a point of turning on the last.fm plugin for a simple reason
Re: (Score:1)
Hilarious! Last.fm is widely known yet no one uses Amarok nor Linux. This is the proof that Linux geeks are so outside of the latest Internet services.
"But now that I've learnt that last.fm is not only tracking down contributors but also is owned by one of RIAA's record companies... Well, let's just say that the plugin is off and will never be turned on again."
On the contrary. I always have it turned on so that the dumb record companies will know that i don't listen to the crappy music they sell.
Re: (Score:1)
Hilarious! Last.fm is widely known yet no one uses Amarok nor Linux.
Hilarious is your ignorant observation.
The amarok team received a lot of requests for a Windows port. With the introduction of QT4 the port should be pretty usable by now.
And why is that people are asking for amarok to run on Windows?
This is the proof that Linux geeks are so outside of the latest Internet services.
I think you have not been living on Mars but rather on Pluto.
* Did you miss the news about a new OS stack called Android?
* What about Microsoft (not limited to) developing their .NET framework and Silverlight to GNU/Linux?
* What about Adobe releasing the first and unique 64bit
Re: (Score:1, Informative)
Maybe the sudden appearance of trash like kanye west and britney spears on the top of last.fm's charts has something to do with it.
The current top 10 in last.fm's artist charts:
1. Coldplay
2. Radiohead
3. The Beatles
4. The Killers
5. Metallica
6. Red Hot Chili Peppers
7. Muse
8. Linkin Park
9. Nirvana
10. Pink Floyd
Sure, it's not just free jazz - there are a number of "well-known" names in there, and arguably, it's all mainstream. But Britney Spears and Kanye West it's not.
(And Pink Floyd, which you specifically mention as having "been there" in the charts in the past, with the implication that these days are gone, still are. FWIW, Queen also st
Re: (Score:2)
Or maybe... just maybe... that sort of music actually is popular. And now that the service is getting to be more mainstream and less the private playground of geeks the charts are starting to reflect more (current) mainstream artists.
A lot of people actually like that crap. Sad but true.
Re: (Score:2)
This is something similar to what I was thinking, I've never heard of them but maybe it's because I listen to my music from other parts of the world. Read that as Japan, France, Germany, S.Korea and UK not in any particular order either but it's mostly the DJ's and/or the individual mixers I'm listening to these days.
I suppose it's the option to having a mass of indie choices that I can happily give a middle finger to anyone who decides to sell out along the way.
Re: (Score:1)
Re: (Score:2)
Even by the standard of press releases it seemed to be a particularly rubbish and arrogant press release (and I'm someone who actually uses last.fm).
I'm not sure what it was doing here. What do the editors think this is - the BBC technology pages or something?
Re: (Score:2, Insightful)
"IF Y'AINT SEEN THIS THEN Y'AINT SEEN NOTHIN!"
Which is pressed and kneaded as needed to "you have to have been living (under a rock | on mars | in a laundry hamper) for the last (year | ten years | few decades | all your life) if you haven'
Re: (Score:2)
Last.FM is pretty OK, but I would much rather do business with a company which doesn't have a co-founder who calls it "fun" to play with my personal data.
So you'd rather be lied to?
Re: (Score:1, Offtopic)
Re: (Score:3, Interesting)
I read it differently (Score:2)
I always see that from the writer's viewpoint, as if he's saying "Look, I know this isn't news, and I'm just getting around to writing about it a few years later, but I really do have something interesting to say about it! So I will acknowledge its apparent staleness with a jokey aside before I get to the point."
Good thing writing isn't some sort of Rorschach test where we can each imbue it with our own insecurities, eh?
Re: (Score:2)
Last.FM has been covered on Slashdot before. What other reason, other than living on Mars, does one have for not keeping on top of Slashdot news?
Re: (Score:2)
Last.fm is (among many uses) for finding 'new' music you will probably like.
If you're an older demographic (like me, 38) you're much more likely to keep listening to the same ACDC and Metallica crap that all the (mostly Clear Channel) towers spew. New music usually requires a time and an emotional investment, scarce resources as you get older.
Re: (Score:2)
I completely agree with everything you said and thought the same thing while reading the summary (I've also never heard of these guys). I think anyone that seems to make these ridiculous statements about mars or rocks is simply out of touch themselves.
Re: (Score:2)
I, for one, would gladly exchange my 5-year old familiarity with Last.fm/audioscrobbler for your mars base.
Surpassing Pandora (Score:5, Insightful)
The company surpassed Pandora and others largely due to its unique datamining features
I would think that being available outside of the USA may have helped quite a bit as well.
Re: (Score:2)
Re: (Score:2)
Which also massively helps their already better (IMHO) approach to categorising music (Pandora has a manual one where trained monkeys describe properties of an artist/track; Last.fm takes notice of partially overlapping user libraries, etc.): the whole world is there to build the database (plus one doesn't have to actually listen to the radio to build it). Which in turn makes it even better, and...
The real danger (Score:5, Interesting)
Seriously though, I have found using the site to be pretty enjoyable. And the advertisements are actually worth keeping AdBlock turned off for. I found a few new artists, some unsigned, that way. I like all the various widgets and things that can crunch my data. Songbird has a last.fm plugin/addon that makes for very easy integration. It's just really useful. I've also found concerts on the site.
I rarely use the social side of it, except with friends I already know. But that's me.
Re: (Score:3, Interesting)
Haha, if it gives you any comfort, I'm the same way. With how iTunes/iPod work - incrementing the count when the track finishes - I'm constantly waiting for songs to end before picking another one, or leaving tracks that have silence at the end to finish completely. Really wish it incremented at 75% complete or something.
Re: (Score:2)
Re: (Score:2)
It does properly count repeated playings of one song at least for 4 years (I often listen to something like that...if some "new" (to me) song grabs me totally)
BTW, you are aware that by artificially inflating playcount/library you're defeating the purpose of Last.fm? Recommendations both for you and on the whole site (if a lot of people would do that) suffer...for some totally unimportant number in your profile.
Re: (Score:2)
Re: (Score:2)
Oh, I just understood "...any time I need to even up my numbers..." as having some playcount target as a goal/etc.
But...it also does show repeatedly played songs in Currently listening...at least for me, and for the past 4 years... (which I actually don't like...I would prefer if it group them)
Re: (Score:2)
Re: (Score:2)
Yes, it does sound like that a bit ;P
And I can still stand by what I've said - you're defeating the purpose of Last.fm by gaming the stats like that; it's not "what my Last.fm library should look like", but "what I listen to". The idea is to reflect also how your musical habits change over time (which DOES influence recommendations/etc.). And if a band releases a single earlier...well, it seems like the purpose in that is actually popularising this particular song, isn't it? What's next? Inflating numbers
Re: (Score:2)
> I'm constantly waiting for songs to end before
> picking another one, or leaving tracks that have
> silence at the end to finish completely. Really
> wish it incremented at 75% complete or something.
Amarok submits to Last.fm after playing about half of the track. Yet another reason to use Amarok...
Re: (Score:3, Informative)
Yet another reason to use Amarok...
On his iPod?
GP was talking about the iTunes play counts, not the Last.fm play counts. Every app/plugin I've tried (including the official Last.fm app) either scrobbles at 50% or allows the user to configure the percentage. Yet another reason to be free to use whichever media player one prefers...
Re: (Score:3, Informative)
Re: (Score:2)
Last.fm counts a song as played after you listen to half of it.
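In code, the "half of it" rule described in this thread boils down to a single comparison. This is only a sketch of what the comments here say (50% by default, configurable in some clients); the function name and parameters are hypothetical and not taken from any official Last.fm client.

```python
def should_scrobble(played_seconds, track_seconds, threshold=0.5):
    """True once the listened fraction of the track reaches the threshold."""
    if track_seconds <= 0:
        return False  # unknown or invalid track length: never count it
    return played_seconds / track_seconds >= threshold

# A 4-minute track (240 seconds):
print(should_scrobble(120, 240))        # listened to exactly half
print(should_scrobble(100, 240))        # skipped before the halfway point
print(should_scrobble(180, 240, 0.75))  # the 75% rule wished for earlier
```

A play count that only increments when the track finishes corresponds to `threshold=1.0`, which is why skipping near the end loses the play.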
Re: (Score:1)
I first found last.fm when trying amarok. I loved the idea of automatically submitting the name of listened tracks to a database which is used to build statistics among other listeners.
After some time using the service, I found some artists in a genre similar to what I usually listen to. This was great since I don't listen to commercial or popular music, so the music I like is difficult to find.
Since last.fm also works as social network (it is the single and only one I use as a matter of fact) I receive
Is their T.O.U. even legal? Would you agree? (Score:1)
Anyway, here is a quote from their Terms of Use agreement.
"It is important for you to refer to these Terms of Use from time to time to make sure that you are aware of any additions, revisions, or modifications that we may have made to these Terms of Use. Your continued use of the Website constitutes your acceptance of the new Terms of Use."
Re: (Score:2)
> ...are they just making that up?
They are making pretty much all of it up.
Re: (Score:2)
You are responsible for who uses your computer to access websites in your name? How is that yuck? That would be true regardless of whether they said it.
company's biggest asset is my privacy .. (Score:2)
'Without privacy, there cannot be freedom. And without freedom, there cannot be personal or social growth'
Re: (Score:2)
Not half as creepy as those websites you've been visiting; and let's not get started about your taste in music...
surpassed Pandora ... (Score:3, Informative)
The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,'
I'd say they surpassed Pandora only because Pandora locked out all non-US users a while back. For people who just wanted to listen to music and find out about new artists, Pandora was so much better IMO, last.fm has a clunky, overloaded UI and is too much like myspace ...
Re: (Score:2)
It certainly is not like MySpace, only one block on the right is fully customizable by the user and the ads are smaller and not so intrusive. I think the layout is pretty clear: your music stats in the middle with a shoutbox at the bottom, radio and site stats on the right. Actually, I love last.fm as a website, it's easy to navigate and interact with, everything feels very intuitive, it's a really well-made site; I often wish they had made Facebook...
What I'm more critical about is the way they handle data
Re: (Score:2)
Last.fm is pretty much incapable of recognizing identical tracks if they don't have exactly the same name.
And that's what the guy was saying in TFA, that that was their biggest problem.
What I would recommend is a registered user editing capability, with visible audit records of track title/group/etc. data changed retaining previous data, and a notify flag for users seeing bad edits made so the offender can be blocked if enough alerts come in.
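As a first, purely automatic step before any user edits, tracks whose names differ only in case, accents or punctuation could be grouped under a normalised key. This is a minimal sketch of that idea, not what Last.fm does; the `match_key` helper and its rules are assumptions for illustration.

```python
import re
import unicodedata

def _normalise(text):
    # Split accented letters into base letter + accent, then drop the accents
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower().replace("&", " and ")
    # Anything that is not a letter or digit becomes a space
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())

def match_key(artist, title):
    """Key under which trivially different spellings of a track collide."""
    return (_normalise(artist), _normalise(title))

print(match_key("Bosstones", "Someday I Suppose") ==
      match_key("bosstones", "someday i suppose..."))
```

This only catches trivial variations. Misspellings, swapped artist/title fields and remix suffixes would still need fuzzy matching or the audited user edits suggested above.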
Recognition could com
I've heard of Last.FM! (Score:2)
I've heard of Last.FM and I have been living on Mars, you insensitive clod!
Oddly enough, even here on Mars, just as in the US, they have a 3-listen limit on any track thanks to the RIAA.
So, three shall be the number of the counting. Thanks CBS, thanks Last.FM and thanks RIAA. In fact, I've said thank you back by turning off the autoscrobbler and reducing the data that you can use to make money off of me.
Speaking for my fellow Martians - you're welcome, Last.FM!
small correction (Score:1)
Why would someone use current exchange rates? Should be £140 (~$280) million.
If Last.FM Is So Smart... (Score:3, Funny)
Then why the hell is it that when I run the "Recommendations [www.last.fm]" stream the algorithm occasionally freaks out and starts pushing one unlistenable noise attack after another at me with tags like brutal death metal [www.last.fm], cybergrind [www.last.fm], czech [www.last.fm], death metal [www.last.fm], deathgrind [www.last.fm], goregrind [www.last.fm], grind [www.last.fm], grindcore [www.last.fm], noisecore [www.last.fm], porngrind [www.last.fm], pornogrind [www.last.fm], etc. No matter how many times I click the "Do Not Want" button the stuff just keeps coming. It's like a neighbour from hell. And then there's the days when I get nothing but lesbian deathcore vegan grind [www.last.fm].
The Last.FM brainfarts seem to persist no matter how many times you try to train the recommendation engine using the like/ban buttons and the only way to get them to "reset" to something vaguely approximating normality is to log out, log back in, and run the Library [www.last.fm] stream for a while.
Still, even with this weirdness it's still better than Pandora at finding new music I actually like.