Flutterby™! : XHTML and user text

XHTML and user text

2002-06-13 15:48:00+00 by Dan Lyke 10 comments

So yesterday's XHTML efforts bring up a few user interface questions. A big problem with validating XHTML is getting the tag nesting right; in the simplest case, a <blockquote> can't be inside a <p>aragraph. I jump through a lot of hoops to make sure that messages entered in "Text" mode are okay, but how should I indicate that user-entered HTML is invalid? Suggestions appreciated.
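To make the nesting problem concrete, here is a rough sketch in Python of the kind of check involved. It is not the code this site runs; the `BLOCK_TAGS` set and the `check_nesting` name are made up for illustration, and the set is a deliberately incomplete list of elements that can't sit inside a <p>.

```python
from html.parser import HTMLParser

# Assumption: a small, incomplete sample of block-level elements that
# XHTML does not allow inside <p>. A real checker would follow the DTD.
BLOCK_TAGS = {"blockquote", "p", "div", "ul", "ol", "table", "pre",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class NestingChecker(HTMLParser):
    """Collects nesting problems instead of trying to repair them."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS and "p" in self.stack:
            self.errors.append(f"<{tag}> not allowed inside <p>")
        self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed form such as <br />: nothing to open or close.
        pass

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check_nesting(fragment):
    """Return a list of nesting problems found in an HTML fragment."""
    checker = NestingChecker()
    checker.feed(fragment)
    checker.close()
    errors = list(checker.errors)
    errors.extend(f"<{tag}> never closed" for tag in checker.stack)
    return errors
```

Calling `check_nesting("<p>Hi <blockquote>quote</blockquote></p>")` reports the blockquote, while the reverse nesting comes back clean. The open question is what to do with that list of errors once you have it.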

[ related topics: Web development User Interface Flutterby Meta ]

Comments in ascending chronological order:

#Comment made: 2002-06-13 19:37:02+00 by: markpasc

Could it be run through HTML Tidy?

#Comment made: 2002-06-13 21:22:15+00 by: Dan Lyke

Thanks for the pointer. The name rang a bell, but I hadn't read the page before. I'm already doing a bunch of the same sorts of things HTML Tidy does, so I'll have to take a look at how it handles weird tag nestings and then work that into the interface somehow. I think I also need a "preview" button to go along with this new enforcement.

#Comment made: 2002-06-13 21:39:03+00 by: meuon

What... you don't want to assume we all practice good hand-coded HTML techniques?

HTML Tidy looks neat, I'll be playing with it soon myself. I need it.

#Comment made: 2002-06-13 22:55:46+00 by: Shawn

HTML Tidy is very cool. I haven't used it in a while, but I found it an absolute necessity when I was taking over other people's code - especially stuff that was done in... <shudder>... FrontPage.

I used a version embedded/linked to in HTML-Kit instead of the command-line version though.

#Comment made: 2002-06-14 13:47:48+00 by: DaveP

My solution at http://davespicks.com/addpick.php tries to make the distinction between block-level formatting (which is controlled by the larger template for each day's entries) vs. inline formatting, which I allow in entries.

It ain't perfect, but restricting the tags that can be used (pretty easy to do using PHP's strip_tags function) is making it easier for me to keep the XHTML clean. Of course there are four years of old data that are going to need to be cleaned up, but that's not a critical task. At least new entries are coming in clean.
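For anyone not on PHP, the same allow-list idea can be sketched in Python. This is only an illustration, not the code behind the page above: the `ALLOWED_INLINE` set and the `keep_inline` name are assumptions, and unlike `strip_tags` this version also throws away every attribute (so links lose their `href`) to keep the example short.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: an illustrative allow-list of inline tags; pick your own.
ALLOWED_INLINE = {"a", "b", "i", "em", "strong", "code", "cite"}


class InlineOnly(HTMLParser):
    """Keeps allow-listed inline tags and all text, drops other tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped on purpose in this sketch.
        if tag in ALLOWED_INLINE:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_INLINE:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives with entities decoded, so re-escape it on the way out.
        self.out.append(escape(data, quote=False))


def keep_inline(fragment):
    """Return the fragment with everything but allow-listed tags removed."""
    parser = InlineOnly()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

Note that this only filters which tags survive. It does not balance them, so an unclosed <em> still comes through unclosed.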

#Comment made: 2002-06-14 16:31:38+00 by: Anita Rowland

Tidy is also good for getting rid of Word HTML/XML cruft when converting from a doc. I also used it as a plug-in for HTML-Kit. Tidy's got a ton of options.

#Comment made: 2002-06-14 17:13:27+00 by: mkelley

HTML Tidy works wonders. I've been using a Tidy interface for BBEdit, and I like using it to clean up Dreamweaver 4's code for my work sites.

#Comment made: 2002-06-14 18:23:41+00 by: Dan Lyke

Dave, I've thought about just allowing the font, phrase and special tags, but lists and tables (used for tabular data) are important enough to me that I really want to allow most of the block content too, and think some control on the headings would be apropos as well. Some of this is that I'm trying to build a content system that's useful beyond just comments. So I think your solution is fine, but I'm not willing to stop there.

For the rest of you: suppose the system makes a best guess at where the tags in an entry should go, and where it can't figure out how to make a tag legal it quotes that tag into the page (so you actually see the <caption> tag, perhaps in a different color). Is that a reasonable interface for invalid tag placement?
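Here is a minimal sketch of just the quoting half of that idea, in Python. It makes no attempt at the best-guess repair step. The `REQUIRED_PARENT` table, the `badtag` class name, and `quote_illegal_tags` are all hypothetical, and the parent rules are a tiny sample rather than the real XHTML content model (no <tbody>, for instance).

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a tiny sample of "this tag needs that direct parent" rules.
REQUIRED_PARENT = {
    "caption": {"table"},
    "tr": {"table"},
    "td": {"tr"},
    "li": {"ul", "ol"},
}


def quoted(raw):
    """Show a tag literally; the 'badtag' class is a hypothetical styling hook."""
    return f'<span class="badtag">{escape(raw)}</span>'


class TagQuoter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []  # legally placed open tags, innermost last
        self.out = []

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        parents = REQUIRED_PARENT.get(tag)
        parent = self.stack[-1] if self.stack else None
        if parents is None or parent in parents:
            self.stack.append(tag)
            self.out.append(raw)
        else:
            self.out.append(quoted(raw))  # misplaced: display it as text

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br /> pass through untouched.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            self.out.append(f"</{tag}>")
        else:
            self.out.append(quoted(f"</{tag}>"))  # no matching legal open tag

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def quote_illegal_tags(fragment):
    """Return the fragment with misplaced tags escaped so they show up as text."""
    parser = TagQuoter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

With this, a <caption> inside a <table> is left alone, while a <caption> dropped into a <p> is escaped and wrapped in a span that a stylesheet could color differently.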

#Comment made: 2002-06-14 23:25:50+00 by: Shawn

Sounds reasonable to me. I'm not sure I follow this as being a big problem though. The current edit system seems to work fine for me. If something doesn't come through the way I intended, I just edit it.

#Comment made: 2002-06-15 18:45:17+00 by: DaveP

Dan, I think the difference between where we're each going is that you're building a content system, while I'm trying to build a system that helps me generate HTML.

After deciding to move away from Frontier, I had a fairly tough time getting all the data out of Frontier's storage. It turned out to be easiest to just take the HTML pages and massage them with a set of scripts. Since then, I've decided that the simplest solution for me is to store most of my data as HTML files (with some additions that fit within XML). I have scripts that convert the XML tags into HTML, and I have scripts that help me generate the files in the first place. But the data storage is still relatively plain-text, which makes life simpler for me if (or when) I decide to change everything about how I do my website(s).