2002-05-07 17:10:40-07 by TC 23 comments
Whoa... a good article about how backlinking is effecting blogging. Talk about a Google flare.
comments in ascending chronological order:
aren't the two synonymous?
ef-fect n. Something brought about by a cause or agent; a result. The power to produce an outcome or achieve a result; influence: The drug had an immediate effect on the pain. The government's action had no effect on the trade imbalance.
Interesting (not the effecting/affecting debate - I solved that yesterday when I had to look it up to decide which one to use ;-). But the article lost me in the first paragraph and never really got me back. I thought Disenchanted had a much better/clearer description of the technique.
Any bets on how long it will take somebody to take advantage of this to direct more traffic to their site?
The usage note under Dictionary.com's definition of "affect" is pretty explanatory, and I'm not sure Todd was wrong on this one. Essentially, if you use "effect" in this context you're putting the sense of it in terms of future performance, whereas "affect" would mean ongoing changes.
Okay, we now have backlinking on Flutterby, updated every half hour. First few things to make it in don't have very specific links, and I'm definitely not happy with how it fits in the design (below the comments, above the comment box), but the Flutterby design overall sucks right now, so deal with it.
<linkslut>If someone wanted to test this from an external site, I'd appreciate it</linkslut>
General rule of thumb: Effect is the noun, affect is the verb.
Well... glad to see some of you are interested in this new pattern of data oscillation and its effect/affect on search engine indexing. Shawn: I'd take that bet, but I believe it has already happened. Dan: I didn't mean to imply Flutterby needed this now, but I do think it's tres cool.
<silliness>Effecting/Affecting debate: I apologize to the "Affect" enthusiasts in the Flutterby community. I shall endeavour to post in the near future in such a way as to garner "Affect" some equal time.</silliness>
Dan; I'm already linking to the front page. Does that count? ;-P
I already see that I need to build some filters to cut out duplicates and the various syndication systems running on localhost: I parsed all the logs I've got, and entries 4300, 4364 and 4383 have the most inbound links right now.
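The filtering described above (drop duplicates and localhost syndication hits, then rank entries by inbound links) might look roughly like this. It is only a sketch in Python, not the code running on Flutterby: the `(referer, requested URL)` pair format, the `LOCAL_HOSTS` set, and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Assumption: referers from these hosts are local syndication noise.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}

def count_inbound(hits):
    """hits: iterable of (referer_url, requested_url) pairs taken from a log.

    Returns {entry_id: number of distinct non-local referers}.
    """
    referers = defaultdict(set)
    for referer, requested in hits:
        host = urlparse(referer).hostname
        if not host or host in LOCAL_HOSTS:
            continue  # blank ("-") or local referer
        ids = parse_qs(urlparse(requested).query).get("id")
        if not ids:
            continue  # request was not for a single entry
        referers[ids[0]].add(referer)  # the set drops exact duplicates
    return {entry: len(refs) for entry, refs in referers.items()}
```

Sorting the returned dictionary by value would give the "most inbound links" list. Near-duplicates, such as several views of the same referring page, would still need a separate cleanup step.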
Shawn, I haven't counted links to the front page because history shows that such things get spammed fairly heavily. Maybe it's time to say "so what" and put inbound links on another page. Probably only after I've figured out how to clean up the links, though; many aren't externally accessible.
How do you -- or do you even bother to -- recognize when an entry in the referer log is from a backlink? I haven't quite wrapped my brain around this yet, but it seems that backlinks contain a different quality of information than regular links, and should be treated differently. Especially, there seems to be real potential for some odd feedback loops here.
I'm not yet looking for circular links; making the link back from here by definition creates a loop of some sort. But long before I get to dealing with that, I'm still trying to figure out how to deal with the 8 different Metafilter links to this entry, not to mention the gazillion 192.168.*.* and 127.0.0.1 and other inaccessible-to-the-outside-world addresses.
Yeah, I was going to send you an email, but since you opened up development concerns publicly (btw, that's a hint for you clever geeks to propose a better solution should one cross your mind): I was going to suggest a DNS lookup (just dig it) and not creating an entry for anything that doesn't resolve. Or perhaps a better idea: use the LWP package and don't add the link if HTTP kicks back an error. For the multiple domains, simple pattern matching, yes? Circular links? Ah well, I guess I should leave something for someone else to make a suggestion on.
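The DNS half of that suggestion can be sketched as below. This is Python rather than the Perl tooling mentioned in the comment, and the function name and the injectable `lookup` parameter are assumptions added so the check can be exercised without a network.

```python
import socket
from urllib.parse import urlparse

def resolves(url, lookup=socket.getaddrinfo):
    """Return True if the host part of url resolves, False otherwise."""
    host = urlparse(url).hostname
    if not host:
        return False  # no usable host name in the referer
    try:
        lookup(host, None)
    except (socket.gaierror, UnicodeError):
        return False  # name did not resolve, or is not a valid name
    return True
```

A literal address such as 192.168.1.5 "resolves" trivially, so this check alone would not weed out private addresses; it only catches host names that don't exist in DNS.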
I thought about a daemon that used HTTP checks, but I'm having trouble figuring a good way to schedule those (don't want to just do "SELECT id,url FROM urls" and hit 20k URLs, some of which are probably hosted on the same server, all at once...), and that doesn't catch things like submission forms where a Flutterby entry has been entered as a submission for a story elsewhere (which I've seen a couple of times). At the very least, though, I should be grabbing "title" tags off the new links. I'll see what I can do.
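One possible answer to the scheduling worry, offered as a sketch and not as what Flutterby does: group the URLs by host and take one from each host in turn, so that requests to the same server are spread through the run. The function name is made up, and a real checker would presumably also sleep between requests.

```python
from collections import defaultdict, deque
from urllib.parse import urlparse

def interleave_by_host(urls):
    """Order URLs so that checks against the same host are spread out."""
    queues = defaultdict(deque)
    for url in urls:
        queues[urlparse(url).hostname].append(url)

    ordered = []
    pending = deque(queues.values())
    while pending:
        queue = pending.popleft()
        ordered.append(queue.popleft())
        if queue:
            pending.append(queue)  # this host goes to the back of the line
    return ordered
```

When one host has far more URLs than the others, its leftovers end up back to back at the end of the list, so a per-host delay would still be needed for that tail.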
For right now, Flutterby main page posters will have a "remove this from the list" button after every URL. This is a global operation, so please only do it if the URL is clearly wrong, or if it's a main page and there's an archive page link there too. For instance, I removed a bunch of the different MetaFilter views from which the link will scroll in a few days.
Dan: thankfully the inaccessible-to-the-outside addresses are defined in three particular ranges, so it's not as hard to filter them as it might first appear.
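For reference, the three private ranges are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 (RFC 1918). A minimal Python sketch of a referer filter built on them follows; the function and constant names are made up, and loopback and "localhost" are added as an assumption because they show up in the logs discussed here.

```python
import ipaddress
from urllib.parse import urlparse

UNREACHABLE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),  # loopback, not part of RFC 1918
]

def is_private_referer(url):
    """True if the referer points at a private or loopback address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a host name, not an IP literal
    return any(addr in net for net in UNREACHABLE_NETS)
```

Host names that only resolve on somebody's internal network slip through this check, since it looks only at literal addresses.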
Actually, it's not quite as simple as RFC 1918, as I'm also running into machines which have valid, externally accessible IP addresses where port 80 (or the port that the referrer is coming from) is firewalled. I think it's time to sit down and do the thinking necessary to make a link-checking daemon that works in a non-destructive fashion. I'll also need to get some robots.txt samples so I can see if I can reasonably exclude what people exclude from search engines.
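Honoring robots.txt could be sketched with Python's standard `urllib.robotparser`, as below. The sample rules and the "backlink-checker" agent name are invented for illustration, and the robots.txt text is passed in as a string so the sketch does not depend on how the daemon would fetch it.

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt, url, agent="backlink-checker"):
    """True if robots_txt permits the given agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Made-up sample: everything under /private/ is off limits to all robots.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""
```

With these sample rules, a referer under /private/ would be left out of the backlink list and anything else on that host would be kept.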
Well, if you have a daemon tracking the backlinks as they come in it might not be so bad, although it makes Flutterby more open to DDoS attacks.
Have you considered using the Google APIs to search for pages that link to one of your pages? I've been pondering that in addition to the type of solutions Dan seems to have been thinking about. The nice thing about using Google to do the heavy lifting is that they already know how to deal with the robot exclusion protocols.
Hmm. I can't post a new comment, but I can edit this one, so I'll point out that I meant searching using a link: search, rather than a url: search. I'm not sure what the exact differences are, but the former seems to work a lot better for me.
I've tried a couple of "url:..." style searches, and it doesn't seem to be the right magic words. At this point I think the next step is to start looking at RSS feeds and manually splitting entries to try to find entries with matching links.
Try "link:..." style searches instead. Searching for link:http://davespicks.com/writing/programming/mackeys.html gives me about what I expected.
This entry doesn't show up with link:http://www.flutterby.com/archives/viewentry.cgi?id=4962 or link:www.flutterby.com/archives/viewentry.cgi?id=4962. I'll have to poke further, see if some of my old rants show up. I also need to get mod_redirect working so that I can give these things real URLs.
Well, there's a complete explanation of the advanced operators for Google at http://www.google.com/help/operators.html
But I suspect the fact that your URLs don't "look real" may be causing Google to not bother with 'em. I used _mod_rewrite_ (http://httpd.apache.org/docs/mod/mod_rewrite.html) to pretty up my URLs. How to Succeed With URLs got me most of the way there.
Don't you have Perl embedded in Postgres? I seem to recall this is the case. You can use pglogd to keep your logs (I'm testing this on my sites now) and use Perl with URI->canonical (or what-have-you) to clean up the referers on the INSERTs.
No idea if that scales or not. I don't think the sticking point would be pglogd or postgres -- it'd probably be the cleanup function, if anything.
Just a thought.
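To give a feel for what such a cleanup function might do, here is a rough Python stand-in. It is not the Perl URI->canonical method and does not claim to match its behavior; the function name and the particular normalizations (lowercase host, drop default port, drop fragment and any user:password part, empty path becomes "/") are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def canonical_referer(url):
    """Normalize a referer so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme          # urlsplit already lowercases the scheme
    host = parts.hostname or ""    # .hostname is lowercased, brackets stripped
    if ":" in host:
        host = f"[{host}]"         # put brackets back on IPv6 literals
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment never reaches the server, so it is dropped here.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

Run on each referer before it is stored, something like this would collapse variants that differ only in host case, an explicit default port, or a fragment into a single row.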
Questions, comments, flames: contact Dan Lyke
Flutterby™ is a trademark claimed by Dan Lyke for the web publications at www.flutterby.com and www.flutterby.net.