Flutterby™! : Data quality issues


Data quality issues

2009-03-18 13:28:08.025905+00 by Dan Lyke 3 comments

I just discovered how fast my new Windows machine rips CDs, so I'm feeding it discs occasionally. I wish the Windows Media Player MP3 compressor were better than it is, but at 384k/sec it sounds okay on the hardware I've got (at the defaults it was atrocious, even on my little 4" high speakers).

What I'm totally amazed at is the lack of data quality on the CD recognition. I mean, it's one thing to miss Kitten on the Keys - Kitty Muffins or The Incredibles wrap party soundtrack CD, but it's quite another to be missing disc one of Woodstock Two altogether, or the track names on the Slatkin / St. Louis Symphony recording of La Mer.

I'm not sure I've got any particular thesis here; it's just weird to find both holes in public data and places where my tastes are so niche that I'm the first person to go there.

[ related topics: Music Microsoft Invention and Design Journalism and Media Burlesque ]

comments in ascending chronological order:

#Comment Re: made: 2009-03-18 15:16:25.006846+00 by: markd

I got the wife an ipod for her birthday many years ago, and decided to fill it up first with her music collection.

Then I discovered just how many obscure oboe CDs she had. uhnnnngg.

#Comment Re: made: 2009-03-19 12:57:35.190293+00 by: other_todd

CD recognition, I learned years ago, is best crowdsourced. The ripper I use, Audiograbber, a program so good that it's no longer made or maintained by its creator [insert stock rant about how crap inevitably drives out good stuff here], uses FreeDB to look up CDs. I think it is safe to say I have some real obscura among my discs, a lot of small- and private-label stuff, yet I have managed to stump FreeDB exactly once. You do get a certain amount of nerd-boy oneupmanship (sometimes you'll get back three or four versions of the disc info because someone felt that someone else was DOING IT WRONG and had to put in their competing version instead of making a correction), but on the whole it works like a charm.

Media Player's CD lookup is known to be fail, and I would not describe what it uses as "public data."

#Comment Re: made: 2009-03-19 13:50:29.651633+00 by: Dan Lyke

Yeah, my experience last time I did ripping was that FreeDB was quite a bit better at the obscure and older stuff than CDDB, while CDDB won on the new releases of popular things (of which I've very few to none). And that was either while I was working with Gracenote or shortly after my stint there.