Dan rants: Newwwsboy2: Newwws Harder?

This is about the evolution of a software package, but it's also about how my vision shaped the software that shaped Flutterby, and how each is evolving and changing the others.

Almost a year and a half ago I caught the web logging bug. I was a regular reader of Scripting News, but wanted a resource that was less about the personalities of the Sili Valley and more about the cool technologies and edgy social issues that underlie real change. Since the resource I wanted didn't exist, I set about creating it (although had I known a few other 'blogs existed I might not have).

I'd tried a couple of other regularly updated web pages, a few friends and I had tried a 'zine ages ago, and I'd tried writing for my own pages. I'd always failed. I wrote huge long screeds for mailing lists and forwarded cool URLs to friends, so the information was there; I just wasn't using it. Obviously I needed a computer to take over the boring parts of the job for me.

Newwwsboy was written to take advantage of all of those messages I was turning out and never doing anything useful with.

The idea was simple: set up a couple of special-case e-mail addresses in my domain that I could CC on the human-readable messages I sent to mailing lists or friends. The software behind those addresses would convert the messages to HTML, insert links into the appropriate index files, and handle archiving and such automatically.
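As a rough illustration of that routing step, here is a minimal Python sketch (not the original Perl). The addresses, the `ROUTES` table, and the `route_message` helper are all hypothetical names made up for the example; the real Newwwsboy addresses and index files may have looked nothing like this.

```python
import email
from email import policy
from email.utils import getaddresses

# Hypothetical mapping from special-case addresses to index files.
ROUTES = {
    "newwws@example.com": "index.html",
    "rants@example.com": "rants.html",
}

def route_message(raw):
    """Return (index files, subject, plain-text body) for one raw message."""
    msg = email.message_from_string(raw, policy=policy.default)
    # Collect every To: and Cc: address on the message.
    headers = [str(h) for h in msg.get_all("To", []) + msg.get_all("Cc", [])]
    recipients = getaddresses(headers)
    # Keep only the addresses we treat as publishing targets.
    targets = [ROUTES[addr.lower()] for _name, addr in recipients
               if addr.lower() in ROUTES]
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    return targets, str(msg["Subject"]), text
```

The returned text would then be handed to whatever formatter turns the message into HTML.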

I built some simple rules for how to convert my e-mail to HTML based on how I was already writing (for things like showing emphasis, inserting links, quoting other people, inserting preformatted blocks of code), coded them up in Perl, and a software package was born.
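To give a feel for what such rules can look like, here is a small Python sketch. The specific conventions are assumptions picked for illustration (asterisks for emphasis, bare URLs for links, lines starting with ">" for quotes, four-space indents for preformatted code); they are not necessarily the rules Newwwsboy used.

```python
import html
import re

def inline(text):
    """Apply the assumed inline rules: bare URLs become links, *word* becomes <em>."""
    text = html.escape(text, quote=False)
    text = re.sub(r"(https?://[^\s<]+)", r'<a href="\1">\1</a>', text)
    text = re.sub(r"\*([^*\s][^*]*)\*", r"<em>\1</em>", text)
    return text

def email_to_html(text):
    """Convert blank-line separated paragraphs of plain text to HTML."""
    out = []
    for para in re.split(r"\n\s*\n", text.strip("\n")):
        lines = para.split("\n")
        if all(line.startswith("    ") for line in lines):
            # Assumed rule: a fully indented paragraph is preformatted code.
            body = "\n".join(html.escape(line[4:]) for line in lines)
            out.append("<pre>" + body + "</pre>")
        elif all(line.startswith(">") for line in lines):
            # Assumed rule: ">" prefixes mark text quoted from someone else.
            body = " ".join(line[1:].strip() for line in lines)
            out.append("<blockquote>" + inline(body) + "</blockquote>")
        else:
            body = " ".join(line.strip() for line in lines)
            out.append("<p>" + inline(body) + "</p>")
    return "\n".join(out)
```

The point of rules like these is that the author keeps writing e-mail the way he already does, and the markup falls out of habits that were there anyway.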

I made the software freely available and have had a couple of other users. In the course of the year and a half I ran across other 'blogs (we called them "micro portals" for a while, then "web logs", then Peter Merholz shortened that to "'blog"), and learned a lot. At first I thought that I'd want to make the forum collaborative, but I watched Memepool and learned that without a strong editorial voice I really didn't care, and that filtering is better taken care of by just checking the links I want to read.

More recently, my users reinforced this decision when I mentioned that I was going to stop duplicating links from other sites. People wanted my take on links, and why I thought they were important.

I thought initially that having a glossary that would try to match quoted text to a database of previous links would be interesting, but when I implemented it I could never remember what I had and hadn't already linked, so after a few mistakes I ended up specifying the links every time anyway.

Relatedly, I discovered that not being able to make corrections until I get home at night is frustrating, and that other people don't run their domains the way I do (big surprise there), so setting up strange e-mail addresses was difficult for them.

I also found a couple of little glitches and bad assumptions that I'd made, since Newwwsboy was my first Perl app.

As I experimented with a search engine I kept getting header and footer information in my keywords, giving me lots of extraneous hits, so I went back to my archeologic... errr... chronological filing method.
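One way the header and footer noise might be kept out of the keywords is to index only the text between two marker comments. This is a Python sketch of that idea, not something Newwwsboy did; the `<!-- content -->` markers and the `index_terms` helper are hypothetical.

```python
import re

# Hypothetical marker comments wrapped around the entry text by the templates.
START, END = "<!-- content -->", "<!-- /content -->"

def index_terms(page):
    """Return the sorted unique words of a page, ignoring header and footer."""
    begin = page.find(START)
    end = page.find(END)
    if begin != -1 and end != -1 and begin < end:
        # Keep only what sits between the markers.
        page = page[begin + len(START):end]
    text = re.sub(r"<[^>]+>", " ", page)  # crude tag stripping
    return sorted(set(re.findall(r"[a-z0-9']+", text.lower())))
```

Pages without the markers fall back to being indexed whole, which is the behavior that caused the extraneous hits in the first place.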

Recently, when I went to add a few guest columnists to Flutterby I discovered that my templating system was quite error prone.

Perhaps most importantly, I discovered that the formatting engine I wrote is very useful, and I want to revamp all my web pages to take advantage of it.

So itching in the back of my mind waiting to get out is a new application. And over the weekend Jorn Barger wrote an anti-XML screed which made me think "wait a minute, Newwwsboy already does most of that!" So now I'm making notes for the next step.

Here's what I've come up with:

Rather than rendering straight into HTML, it has to keep the messages in their native format and be able to rerender the HTML whenever something that affects the output changes. HTML doesn't have all of the information that my initial messages had. I could insert more tags and comments into the HTML to keep that information, but that would be needlessly obfuscating the HTML when I've already got the original information.
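The shape of that design, sketched in Python under some loud assumptions: the `Archive` class, its in-memory dictionaries, and the `$body` placeholder are all invented for the example, and a real version would keep the sources on disk or in a database.

```python
import html
from string import Template

class Archive:
    """Keep every entry's original text so the HTML can be regenerated."""

    def __init__(self, template):
        self.template = Template(template)
        self.sources = {}    # entry id -> original message text
        self.rendered = {}   # entry id -> HTML derived from the source

    def add(self, entry_id, text):
        self.sources[entry_id] = text
        self.rendered[entry_id] = self._render(text)

    def set_template(self, template):
        # A template change invalidates every page, so rerender them all
        # from the stored sources rather than patching old HTML.
        self.template = Template(template)
        self.rendered = {k: self._render(v) for k, v in self.sources.items()}

    def _render(self, text):
        return self.template.substitute(body=html.escape(text))
```

With the originals kept around, a change of header, footer, or markup becomes a rerender instead of a regular expression run over old HTML.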

Furthermore, HTML changes, or at least my use of it does. Adding the color selection code took just a Perl regular expression, but as I get more and more redundant header and footer information those regular expressions get pretty bizarre, and I really should make room for CSS. I also need to go back through my pages and add description meta tags so I don't get my stock header as the description in every search engine result. That's not a quick hack any more.

I need my formatting engine to be smart enough to extract more information out of my standard e-mail format. It should be able to figure out lists. It should see things that look like poetry and not blow away all of the line wrapping. Jorn points to No-Tags Markup which seems to have some good ideas similar to mine.
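Here is one guess at what "figure out lists" and "things that look like poetry" could mean in practice, as a Python sketch. The heuristics are assumptions of mine: a paragraph is a list if every line starts with a bullet or number, and verse if it has at least two lines and all of them are short (the 45-character cutoff is arbitrary).

```python
import html
import re

BULLET = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")
VERSE_WIDTH = 45  # arbitrary cutoff; wrapped prose lines usually run longer

def classify(paragraph):
    """Guess whether a paragraph is a list, verse, or ordinary prose."""
    lines = [l for l in paragraph.split("\n") if l.strip()]
    if lines and all(BULLET.match(l) for l in lines):
        return "list"
    if len(lines) >= 2 and all(len(l.strip()) <= VERSE_WIDTH for l in lines):
        return "verse"
    return "prose"

def render(paragraph):
    """Render one paragraph according to its guessed kind."""
    kind = classify(paragraph)
    lines = [l for l in paragraph.split("\n") if l.strip()]
    if kind == "list":
        items = "".join("<li>%s</li>" % html.escape(BULLET.sub("", l))
                        for l in lines)
        return "<ul>" + items + "</ul>"
    if kind == "verse":
        # Preserve the author's line breaks instead of reflowing.
        return "<p>" + "<br>\n".join(html.escape(l.strip()) for l in lines) + "</p>"
    return "<p>" + html.escape(" ".join(l.strip() for l in lines)) + "</p>"
```

Heuristics like these will guess wrong sometimes, which is part of why a tool that can say "here's what I think you meant" is appealing.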

It should go beyond e-mail. E-mail is handy, it's great to be able to CC a message and have it appear, but it's difficult for people with cheap domains to set up. Also, many people use other editors as their primary writing tools. And it might even be nice to occasionally have a dialog with the tool, let it say "I don't understand this" and "here's what I think you meant", correct it where it's wrong and see the results immediately. So it should run, at least partially, as CGI.

Unfortunately, having a reasonable database means that your web hosting company either has to be running something SQL or has to keep around a stable version of gdbm, or another hash manager that can be bound to Perl.
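For a sense of how little a hash manager asks of the host, here is a Python sketch using the standard `dbm` module, which picks whichever backend (gdbm, ndbm, or a pure-Python fallback) happens to be installed. The `remember_link` and `lookup_link` helpers and the phrase-to-URL layout are hypothetical, and the Perl equivalent would tie a hash to a DBM file instead.

```python
import dbm

def remember_link(db_path, phrase, url):
    """Store a phrase -> URL pair in an on-disk hash database."""
    with dbm.open(db_path, "c") as db:  # "c" creates the file if needed
        db[phrase.lower().encode("utf-8")] = url.encode("utf-8")

def lookup_link(db_path, phrase):
    """Return the URL stored for a phrase, or None if it was never linked."""
    key = phrase.lower().encode("utf-8")
    with dbm.open(db_path, "c") as db:
        if key in db:
            return db[key].decode("utf-8")
    return None
```

The catch described above still applies: the file format depends on which backend the host has, so a database built on one machine may not open on another.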

It should also have the ability to suck in large external blocks of data for revamping my existing web pages.

It should have hooks for calls to external databases, possibly via XML-RPC but also via some sort of HTML filters, so I can check author names and book titles against Amazon's or whatever other databases.

And right now I maintain copies of my web pages on my server at home and upload the changed ones to the server on the big pipe. A better synchronization system must be worked out.
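One possible starting point, sketched in Python: keep a manifest of content hashes from the last upload and compare against it to find what changed. The `changed_files` helper and the JSON manifest are assumptions for the example, the manifest is assumed to live outside the page tree, and the actual upload step is left out.

```python
import hashlib
import json
import os

def changed_files(root, manifest_path):
    """List files under root that are new or changed since the last run."""
    try:
        with open(manifest_path) as f:
            old = json.load(f)
    except FileNotFoundError:
        old = {}  # first run: everything counts as changed
    new = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                new[rel] = hashlib.sha1(f.read()).hexdigest()
    changed = sorted(p for p, digest in new.items() if old.get(p) != digest)
    # Record the current state so the next run compares against it.
    with open(manifest_path, "w") as f:
        json.dump(new, f)
    return changed
```

This only finds the changed pages; it says nothing about deletions or about edits made directly on the remote server, which a real synchronization system would have to handle.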

I welcome discussions and additions.

Monday, July 12th, 1999 danlyke@flutterby.com