Flutterby™! : Backlinking


Backlinking

2002-05-08 00:10:40+00 by TC 23 comments

Whoa... a good article about how backlinking is effecting blogging. Talk about a Google flare.

[ related topics: Flutterby Meta Cool Technology ]

Comments in ascending chronological order:

#Comment made: 2002-05-08 12:32:54+00 by: anser

affecting

#Comment made: 2002-05-08 15:38:26+00 by: TC

Aren't the two synonymous?
ef-fect, n.: Something brought about by a cause or agent; a result. The power to produce an outcome or achieve a result; influence: "The drug had an immediate effect on the pain." "The government's action had no effect on the trade imbalance."

#Comment made: 2002-05-08 15:51:56+00 by: Shawn

Interesting (not the effecting/affecting debate - I solved that yesterday when I had to look it up to decide which one to use ;-). But the article lost me in the first paragraph and never really got me back. I thought Disenchanted had a much better/clearer description of the technique.

Any bets on how long it will take somebody to take advantage of this to direct more traffic to their site?

#Comment made: 2002-05-08 15:51:59+00 by: Dan Lyke

The usage note under Dictionary.com's definition of "affect" is pretty explanatory, and I'm not sure Todd was wrong on this one. Essentially, if you use "effect" in this context you're putting the sense of it in terms of future performance; "affect" would mean ongoing changes.

#Comment made: 2002-05-08 18:03:20+00 by: Dan Lyke

Okay, we now have backlinking on Flutterby, updated every half hour. First few things to make it in don't have very specific links, and I'm definitely not happy with how it fits in the design (below the comments, above the comment box), but the Flutterby design overall sucks right now, so deal with it.

<linkslut>If someone wanted to test this from an external site, I'd appreciate it</linkslut>

#Comment made: 2002-05-08 18:47:03+00 by: Jette

General rule of thumb: Effect is the noun, affect is the verb.

#Comment made: 2002-05-08 21:39:11+00 by: TC

Well... glad to see some of you are interested in this new pattern of data oscillation and its effect/affect on search engine indexing. Shawn: I'd take that bet, but I believe it has already happened. Dan: I didn't mean to imply Flutterby needed this now, but I do think it's tres cool.

<silliness>Effecting/Affecting debate: I apologize to the "Affect" enthusiasts in the Flutterby community. I shall endeavour to post in the near future in such a way as to garner "Affect" some equal time.</silliness>

#Comment made: 2002-05-08 23:18:11+00 by: Shawn

Dan; I'm already linking to the front page. Does that count? ;-P

#Comment made: 2002-05-08 23:25:48+00 by: Dan Lyke

I already see that I need to build some filters to cut out duplicates and the various syndication systems running on localhost: I parsed all the logs I've got, and entries 4300, 4364 and 4383 have the most inbound links right now.
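A rough Python sketch of the kind of log pass described above: count distinct referrers per entry, so that exact duplicates collapse. It assumes Apache "combined" format logs and the `viewentry.cgi?id=...` URL shape that appears later in this thread; the names `inbound_links` and `top_entries` are made up for illustration and are not Flutterby's actual code.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache "combined" log format. Only the request path and the
# referer field are pulled out of each line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)

def inbound_links(log_lines):
    """Map entry id -> set of distinct referring URLs."""
    links = defaultdict(set)
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        request = urlsplit(match.group("path"))
        if not request.path.endswith("/viewentry.cgi"):
            continue  # front page and other hits are not counted
        entry_ids = parse_qs(request.query).get("id")
        referer = match.group("referer")
        if not entry_ids or referer in ("", "-"):
            continue
        links[entry_ids[0]].add(referer)  # the set drops exact duplicates
    return links

def top_entries(links, n=3):
    """Entry ids with the most distinct referrers, most-linked first."""
    return sorted(links, key=lambda entry: len(links[entry]), reverse=True)[:n]
```

This only removes byte-for-byte duplicate referrers; several views of the same referring page under different URLs would still count separately.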

#Comment made: 2002-05-09 00:05:08+00 by: Dan Lyke

Shawn, I haven't counted links to the front page because history shows that such things get spammed fairly heavily. Maybe it's time to say "so what" and put inbound links on another page. Probably only after I've figured out how to clean up the links, though; many aren't externally accessible.

#Comment made: 2002-05-09 04:04:35+00 by: dexev

How do you -- or do you even bother to -- recognize when an entry in the referer log is from a backlink? I haven't quite wrapped my brain around this yet, but it seems that backlinks contain a different quality of information than regular links, and should be treated differently. Especially, there seems to be real potential for some odd feedback loops here.

#Comment made: 2002-05-09 15:56:23+00 by: Dan Lyke

I'm not yet looking for circular links; making the link back from here by definition creates a loop of some sort. But long before I get to dealing with that, I'm still trying to figure out how to deal with the 8 different Metafilter links to this entry, not to mention the gazillion 192.168.*.* and 127.0.0.1 and other inaccessible-to-the-outside-world addresses.

#Comment made: 2002-05-09 17:13:47+00 by: TC

Yeah, I was going to send you an email, but since you opened up development concerns publicly (btw, that's a hint for you clever geeks to propose a better solution should one cross your mind), I was going to suggest a DNS lookup (just dig it) and not creating an entry for anything that doesn't resolve. Or perhaps a better idea: use the LWP package and don't add the link if HTTP kicks back an error. For the multiple domains, simple pattern matching, yes? Circular links? Ah well, I guess I should leave something for someone else to make a suggestion on.
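The DNS-lookup idea can be sketched in a few lines of Python. This is only an illustration of the suggestion, not anything Flutterby runs: `resolves` is a made-up name, and the `lookup` parameter exists only so the function can be exercised without touching the network.

```python
import socket
from urllib.parse import urlsplit

def resolves(url, lookup=socket.getaddrinfo):
    """Return True if the host part of `url` can be looked up."""
    host = urlsplit(url).hostname
    if not host:
        return False  # no host at all, so nothing to link back to
    try:
        lookup(host, None)
    except (socket.gaierror, UnicodeError):
        return False  # lookup failed or the host name is malformed
    return True
```

Note that a referrer given as a bare IP address (192.168.x.x, 127.0.0.1) passes this check trivially, so a lookup alone would not weed those out.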

#Comment made: 2002-05-09 17:30:22+00 by: Dan Lyke

I thought about a daemon that used HTTP checks, but I'm having trouble figuring a good way to schedule those (don't want to just do "SELECT id,url FROM urls" and hit 20k URLs, some of which are probably hosted on the same server, all at once...), and that doesn't catch things like submission forms where a Flutterby entry has been entered as a submission for a story elsewhere (which I've seen a couple of times). At the very least, though, I should be grabbing "title" tags off the new links. I'll see what I can do.
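One possible way around the "all at once, same server" problem is to interleave the URLs by host before checking them, so that back-to-back requests go to different servers. A minimal Python sketch, with `interleave_by_host` as a hypothetical helper name:

```python
from collections import deque
from urllib.parse import urlsplit

def interleave_by_host(urls):
    """Reorder URLs round-robin by host, one URL per host per pass."""
    queues = {}  # host -> queue of its URLs, in first-seen order
    for url in urls:
        host = urlsplit(url).hostname or ""
        queues.setdefault(host, deque()).append(url)
    ordered = []
    while queues:
        for host in list(queues):
            ordered.append(queues[host].popleft())
            if not queues[host]:
                del queues[host]  # this host has no URLs left
    return ordered
```

A checker could then work through the reordered list a small batch at a time with a pause between requests. Once a single host dominates what is left, its URLs still end up adjacent, so a per-host delay would be needed on top of this.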

For right now, Flutterby main page posters will have a "remove this from the list" button after every URL. This is a global operation, so please only do it if the URL is clearly wrong, or if it's a main page and there's an archive page link there too. For instance, I removed a bunch of the different MetaFilter views from which the link will scroll in a few days.

#Comment made: 2002-05-09 17:46:26+00 by: Mars Saxman

Dan: thankfully the inaccessible-to-the-outside addresses are defined in three particular ranges, so it's not as hard to filter them as it might first appear.

-Mars
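For reference, the three private ranges are the RFC 1918 blocks 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. A small Python sketch of a filter over referrer URLs follows; treating loopback (127.0.0.0/8 and the name "localhost") as unreachable too is an addition here, prompted by the 127.0.0.1 referrers mentioned above, and `is_unreachable` is a made-up name.

```python
import ipaddress
from urllib.parse import urlsplit

# The three RFC 1918 private ranges, plus loopback (an addition for this sketch).
UNREACHABLE_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]

def is_unreachable(url):
    """True if the referrer's host is a private or loopback address."""
    host = urlsplit(url).hostname
    if host is None or host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a host name, not an IP literal; this filter can't judge it
    return any(addr in net for net in UNREACHABLE_NETS)
```

Host names that happen to resolve to private addresses slip through, since no lookup is done here.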

#Comment made: 2002-05-09 17:54:09+00 by: Dan Lyke

Actually it's not quite as simple as RFC 1918, as I'm also running into machines which have valid externally accessible IP addresses where port 80 (or the port that the referrer is coming from) is firewalled. I think it's time to sit down and do the thinking necessary to make a link-checking daemon that works in a non-destructive fashion. I'll also need to get some robots.txt samples so I can see if I can reasonably exclude what people exclude from search engines.
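Honouring robots.txt when deciding which referrers to list could look roughly like this in Python, using the standard library's `urllib.robotparser`. The sketch takes the robots.txt text as a string so that fetching it stays a separate step; `allowed_by_robots` and the default agent `"*"` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt, url, agent="*"):
    """True if `robots_txt` lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse already-fetched text
    return parser.can_fetch(agent, url)
```

A referring page that its owner excludes from search engines would then be left off the backlink list as well.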

#Comment made: 2002-05-09 22:18:17+00 by: TC

Well, if you have a daemon tracking the backlinks as they come in, it might not be so bad, although it makes Flutterby more open to DDoS attacks.
http://www.google.com/robots.txt

#Comment made: 2002-05-13 17:34:10+00 by: DaveP

Have you considered using the Google APIs to search for pages that link to one of your pages? I've been pondering that in addition to the type of solutions Dan seems to have been thinking about. The nice thing about using Google to do the heavy lifting is that they already know how to deal with the robot exclusion protocols.

Hmm. I can't post a new comment, but I can edit this one, so I'll point out that I meant searching using a link: search, rather than a url: search. I'm not sure what the exact differences are, but the former seems to work a lot better for me.

#Comment made: 2002-05-13 21:36:31+00 by: Dan Lyke

I've tried a couple of "url:..." style searches, and it doesn't seem to be the right magic words. At this point I think the next step is to start looking at RSS feeds and manually splitting entries to try to find entries with matching links.

#Comment made: 2002-05-16 00:02:03+00 by: DaveP

Try "link:..." style searches instead. Searching for link:http://davespicks.com/writing/programming/mackeys.html gives me about what I expected.

#Comment made: 2002-05-16 00:17:48+00 by: Dan Lyke

This entry doesn't show up with link:http://www.flutterby.com/archives/viewentry.cgi?id=4962 or link:www.flutterby.com/archives/viewentry.cgi?id=4962. I'll have to poke further, see if some of my old rants show up. I also need to get mod_redirect working so that I can give these things real URLs.

#Comment made: 2002-05-16 03:59:24+00 by: DaveP

Well, there's a complete explanation of the advanced operators for Google at http://www.google.com/help/operators.html

But I suspect the fact that your URLs don't "look real" may be causing Google to not bother with 'em. I used _mod_rewrite_ (http://httpd.apache.org/docs/mod/mod_rewrite.html) to pretty up my URLs. How to Succeed With URLs got me most of the way there.

#Comment made: 2002-05-17 17:08:21+00 by: Mark A. Hershberger

Dan,

Don't you have Perl embedded in Postgres? I seem to recall this is the case. You can use pglogd to keep your logs (I'm testing this on my sites now) and use Perl with URI->canonical (or what-have-you) to clean up the referers on the INSERTs.

No idea if that scales or not. I don't think the sticking point would be pglogd or postgres -- it'd probably be the cleanup function, if anything.

Just a thought.
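As a rough idea of what such a cleanup function does, here is a Python sketch in the spirit of URI->canonical. It is not a port of the Perl module: it lowercases the scheme and host, drops a default port, and turns an empty path into "/". Dropping the fragment and any userinfo is an extra choice made here, and `canonical_referer` is a made-up name.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def canonical_referer(url):
    """Normalize a referer URL so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port in (None, DEFAULT_PORTS.get(scheme)):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Fragment is dropped: it never reaches the server and adds no information here.
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))
```

Run at insert time, this would make `HTTP://WWW.Example.COM:80/links.html#top` and `http://www.example.com/links.html` land in the same row.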