Flutterby™! : UTF8 redux


UTF8 redux

2013-10-29 17:14:17.531665+00 by Dan Lyke 5 comments

Argh. Okay, the tableflip emoji doesn't work, but


should now work everywhere. Gonna rewrite everything in C so it's got sane Unicode handling. Grrr.

[ related topics: Work, productivity and environment ]

comments in ascending chronological order:

#Comment Re: made: 2013-10-30 03:15:42.094424+00 by: spc476


#Comment Re: made: 2013-10-30 03:15:51.486175+00 by: spc476


#Comment Re: made: 2013-10-30 06:50:09.672524+00 by: ebradway

Gonna rewrite everything in C

I honestly cannot tell if you are being sarcastic. Nor can I guess which is worse: rewriting what is essentially a bunch of string parsing routines in C... or dealing with Unicode in a higher-level language.

I can't tell you how many times I've allowed myself to be fooled by "use this module and unicode just works". Yeah, it "just works" until it doesn't and then you are right back where you started wrapping anything and everything in conversion calls.

#Comment Re: made: 2013-10-30 10:37:47.014469+00 by: meuon

"use this module and unicode just works" Crying, because I hear that all the time about almost everything, not just unicode.

#Comment Re: made: 2013-10-30 13:27:49.336289+00 by: Dan Lyke

Yeah, that's why I'm talking about C again. I'm only half-serious, but I've got a good portion of the Flutterby.net code rewritten in C++, and knowing what and how the conversions are happening, and how to reference them in the code, is a big deal.

It shouldn't be hard: I should be able to say "Everything in is UTF-8, everything out is UTF-8, in the middle regular expressions should work on wide characters". It isn't that simple. But worse, it fails in ways that look like someone half-implemented UTF-8 in Perl and then dropped the ball.
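
For what it's worth, here is a minimal sketch of that "UTF-8 in, wide characters in the middle, UTF-8 out" shape in C++. It is not the Flutterby.net code: the helper names (decode_utf8, encode_utf8, regex_replace_utf8) are made up for illustration, it assumes a 32-bit wchar_t (true on Linux and macOS, not on Windows), and the decoder skips overlong-sequence and surrogate checks that a production version would want.

```cpp
#include <cstddef>
#include <cstdint>
#include <regex>
#include <stdexcept>
#include <string>

// Assumption: one wchar_t can hold any Unicode code point.
static_assert(sizeof(wchar_t) >= 4, "this sketch assumes a 32-bit wchar_t");

// Hypothetical helper: decode UTF-8 bytes into one wchar_t per code point.
// Sketch only: does not reject overlong encodings or surrogate code points.
std::wstring decode_utf8(const std::string& in) {
    std::wstring out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)              { cp = lead;        extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (extra >= in.size() - i)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(static_cast<wchar_t>(cp));
        i += extra + 1;
    }
    return out;
}

// Hypothetical helper: encode one-code-point-per-wchar_t text back to UTF-8.
std::string encode_utf8(const std::wstring& in) {
    std::string out;
    for (wchar_t wc : in) {
        std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: the conversions happen exactly twice, at the edges;
// the regular expression in the middle only ever sees whole characters.
std::string regex_replace_utf8(const std::string& utf8_in,
                               const std::wstring& pattern,
                               const std::wstring& replacement) {
    std::wstring wide = decode_utf8(utf8_in);
    std::wregex re(pattern);
    return encode_utf8(std::regex_replace(wide, re, replacement));
}
```

With this, regex_replace_utf8("caf\xC3\xA9", L".", L"x") sees four characters rather than five bytes, because the "." is matched against decoded code points. The point of doing it by hand is the one made above: every conversion is visible in the code, so there is no guessing about where a string stopped being bytes and started being characters. Locale-dependent classes such as \w are a separate problem that this sketch does not address.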