Flutterby™! : Weblogs.com weirdness


Weblogs.com weirdness

2003-04-15 09:03:17.659455-07 by Dan Lyke 2 comments

I finally figured it out! Weblogs.com blocks wget. The error message is entirely non-intuitive:

Your crawler is hitting our servers too hard. Please slow down, it's hurting the service we provide to our customers. Thanks.

I'd been trying to figure out why I was getting the error sometimes and not others, and from different IP addresses. Sigh. Oh well, lacking a clue bat, I guess I'll have to code up something simple in Perl that sends a personalised client name.

[ related topics: Software Engineering Perl Weblogs Open Source ]

Comments in ascending chronological order:

#Comment made: 2003-04-15 09:36:51.456214-07 by: Mark A. Hershberger

Thanks! I was getting this from doc.weblogs.com

Shouldn't you be able to cloak with "-U"?

#Comment made: 2003-04-15 10:06:11.091444-07 by: Dan Lyke

It was simple enough to use LWP::UserAgent, and that's probably the right way to do things anyway, so I didn't bother making my wget command line any more complex. Oddly, a plain LWP::UserAgent GET with the default agent string worked just fine, but now I've got a unique user agent string and I don't hit the server more often than once every 2 hours.
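
For anyone who wants the same behaviour outside Perl, here is a minimal sketch of the idea in Python, using only the standard library. It is not the script described above: the agent string, the contact URL in it, and the helper names (build_request, may_fetch, polite_fetch) are all made up for illustration. It shows the two pieces mentioned in the comment: a request that carries its own User-Agent, and a check that refuses to fetch again until two hours have passed.

```python
import time
import urllib.request

# Hypothetical values: the post gives neither the real agent string nor the URL.
USER_AGENT = "example-fetcher/0.1 (+http://www.example.com/contact)"
MIN_INTERVAL_SECONDS = 2 * 60 * 60  # the two-hour floor mentioned in the comment


def build_request(url, user_agent=USER_AGENT):
    """Return a GET request that identifies itself with a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def may_fetch(last_fetch, now, min_interval=MIN_INTERVAL_SECONDS):
    """True if nothing has been fetched yet or the minimum interval has elapsed."""
    return last_fetch is None or now - last_fetch >= min_interval


def polite_fetch(url, last_fetch, now=None):
    """Fetch url if allowed; return (body_or_None, timestamp_of_last_fetch)."""
    now = time.time() if now is None else now
    if not may_fetch(last_fetch, now):
        # Too soon: skip the network entirely and keep the old timestamp.
        return None, last_fetch
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return response.read(), now
```

The caller is expected to keep the returned timestamp somewhere (a file is enough for a cron job) and pass it back in as last_fetch on the next run.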



Flutterby™ is a trademark claimed by Dan Lyke for the web publications at www.flutterby.com and www.flutterby.net.