Re: [AD] unicode proposal


On Thu, 2009-01-22 at 17:34 +1100, Peter Wang wrote:
> I basically agree with your function lists.

I'll try to make a patch. Hard part will be changing all the code using

> Later, I think I'd like to introduce explicit iterator types rather than
> bare pointers for ugetx (and give it a better name).  Also if sticking
> with UTF-8 (or even UTF-16) there's no reason not to introduce backwards
> traversal.

Iterator types?
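
To make the idea concrete, here is a rough sketch of what an explicit
iterator with backwards traversal could look like for UTF-8. The type
utf8_iter and the functions utf8_iter_next/utf8_iter_prev are made-up
names for illustration, not existing API, and the code assumes the
string is well-formed, NUL-terminated UTF-8 (no validation is done).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator type, for illustration only. */
typedef struct {
    const unsigned char *start; /* first byte of the string */
    const unsigned char *pos;   /* current byte position */
} utf8_iter;

/* Decode the code point at the current position and advance past it.
 * Returns -1 at the terminating NUL. Assumes well-formed UTF-8. */
static long utf8_iter_next(utf8_iter *it)
{
    const unsigned char *p;
    long c;
    int extra;

    assert(it != NULL);
    p = it->pos;
    if (*p == 0)
        return -1;
    if (*p < 0x80)                { c = *p;        extra = 0; }
    else if ((*p & 0xE0) == 0xC0) { c = *p & 0x1F; extra = 1; }
    else if ((*p & 0xF0) == 0xE0) { c = *p & 0x0F; extra = 2; }
    else                          { c = *p & 0x07; extra = 3; }
    p++;
    while (extra-- > 0)
        c = (c << 6) | (*p++ & 0x3F);
    it->pos = p;
    return c;
}

/* Move back to the start of the previous code point and return it.
 * Returns -1 if already at the start of the string. */
static long utf8_iter_prev(utf8_iter *it)
{
    const unsigned char *p;
    utf8_iter tmp;

    assert(it != NULL);
    p = it->pos;
    if (p == it->start)
        return -1;
    /* Continuation bytes look like 10xxxxxx; skip back over them. */
    do {
        p--;
    } while (p > it->start && (*p & 0xC0) == 0x80);
    it->pos = p;
    tmp = *it;  /* decode without moving the real iterator forward */
    return utf8_iter_next(&tmp);
}
```

Stepping backwards works because UTF-8 continuation bytes are
distinguishable from lead bytes, so finding the previous code point
never needs a scan from the start of the string.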

> The API is prone to buffer overflows and arbitrarily truncated strings
> due to the use of preallocated buffers.  Probably the solution is
> dynamic allocation.

Or do we go even further and use something akin to ?
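
For reference, a minimal sketch of the dynamic allocation approach,
assuming a simple growable byte string. The dyn_str type and its
functions are hypothetical names, not a proposal for the final API;
the point is only that the caller never has to guess a buffer size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string, for illustration only. */
typedef struct {
    char *data;  /* NUL-terminated once anything has been appended */
    size_t len;  /* bytes used, not counting the NUL */
    size_t cap;  /* bytes allocated */
} dyn_str;

static void dyn_str_init(dyn_str *s)
{
    s->data = NULL;
    s->len = 0;
    s->cap = 0;
}

static void dyn_str_free(dyn_str *s)
{
    free(s->data);
    dyn_str_init(s);
}

/* Append n bytes, growing the buffer as needed.
 * Returns 1 on success, 0 if allocation fails (string left unchanged). */
static int dyn_str_append(dyn_str *s, const char *bytes, size_t n)
{
    assert(s != NULL && (bytes != NULL || n == 0));
    if (s->len + n + 1 > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        char *p;
        while (new_cap < s->len + n + 1)
            new_cap *= 2;  /* doubling keeps appends amortized O(1) */
        p = (char *)realloc(s->data, new_cap);
        if (p == NULL)
            return 0;
        s->data = p;
        s->cap = new_cap;
    }
    if (n > 0)
        memcpy(s->data + s->len, bytes, n);
    s->len += n;
    s->data[s->len] = '\0';
    return 1;
}
```

Since the lengths are in bytes, UTF-8 text can be appended as-is
without ever being cut in the middle of a character.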

> An ISO-8859-1 converter could stay as it's purely a re-encoding.

Well, converting UTF-8 to Latin-1 will simply fail most of the time. I
guess we could replace missing letters with '?'. No idea what our current
conversion functions do. Also, the 7-bit ASCII and UTF-16 conversions
can probably stay. We just should stop somewhere: if someone needs
things like Latin-2 or Big5, they really should use an additional library
which can do a much better job (or go the easy way and keep everything
in UTF-8, even if that is slightly less efficient).
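
Something like the following is what I mean by replacing missing
letters. This is only a sketch: utf8_to_latin1 is a made-up name, not
one of our current conversion functions, and it assumes well-formed,
NUL-terminated UTF-8 input. Output that does not fit is truncated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Writes a NUL-terminated Latin-1 copy of src into dst (dst_size bytes,
 * including the NUL). Code points above U+00FF become '?'.
 * Returns the number of characters that had to be replaced. */
static size_t utf8_to_latin1(const char *src, char *dst, size_t dst_size)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t out = 0;
    size_t replaced = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);
    while (*p != 0 && out + 1 < dst_size) {
        unsigned long c;
        int extra;

        if (*p < 0x80)                { c = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { c = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { c = *p & 0x0F; extra = 2; }
        else                          { c = *p & 0x07; extra = 3; }
        p++;
        /* Stop early at a non-continuation byte so we never run past NUL. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            c = (c << 6) | (*p++ & 0x3F);

        if (c > 0xFF) {  /* not representable in Latin-1 */
            c = '?';
            replaced++;
        }
        dst[out++] = (char)(unsigned char)c;
    }
    dst[out] = '\0';
    return replaced;
}
```

Returning the replacement count lets the caller tell a lossless
conversion from a lossy one.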

Elias Pschernig <elias@xxxxxxxxxx>
