Re: [AD] Unicode again

To: "Vincent Penquerc'h" <vincent@xxxxxxxxxx>
Subject: Re: [AD] Unicode again
From: Shawn Hargreaves <shawn@xxxxxxxxxx>
Date: Sat, 16 Jun 2001 17:13:18 +0100
Cc: "Conductors@xxxxxxxxxx. Com" <conductors@xxxxxxxxxx>

Vincent Penquerc'h <vincent@xxxxxxxxxx> writes:
> > And ustrncpy() doesn't behave the same as the ANSI strncpy(): the 
> > former always appends a terminating null character whereas, according 
> > to my docs, the latter doesn't. I don't think we should change that 
> > now, but we must update the docs.
> 
> I think the u* versions should have the same behavior as the libc 
> versions, since they're designed to be (almost) replacements for anyone 
> who wants to use unicode in a program. I'd call the discrepancy a bug 
> and fix it.

A tough one. I was intending the Unicode stuff to be an exact copy of 
libc, but made that one different because the libc version just seems 
totally broken to me (and when I originally had it with standard libc 
behaviour, it was a nightmare trying to write actual code using it and 
having to keep manually adding zero terminators on the end of 
everything, which isn't trivial with a variable width character set 
since you can't always tell exactly where to put them).

The strn functions are supposed to be for avoiding buffer overflows, so 
what is the point of one that can sometimes leave a string unterminated? 
Seems like a recipe for bugs to me, especially since code that leaves 
out a manual termination will work fine in normal conditions, only going 
wrong when the buffer size is exceeded. The Allegro behaviour is pretty 
much compatible with code written for the libc version (I can't think of 
any standard examples where it will break), and much safer and more 
convenient IMHO.
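
To make the difference concrete, here is a minimal sketch in plain C using ordinary byte strings. It is not Allegro's ustrncpy() code; the helper names (is_terminated, copy_libc_style, copy_always_terminated) are hypothetical and exist only for illustration. The first copy shows the libc behaviour being criticised, the second shows the "always terminate" behaviour, assuming the caller passes the full buffer size in bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[0..size-1] contains a '\0'. */
static int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* libc behaviour: copies at most size bytes and writes NO terminator
 * when src has size or more bytes before its own '\0'. */
static void copy_libc_style(char *dest, const char *src, size_t size)
{
    strncpy(dest, src, size);
}

/* "Always terminate" behaviour (illustrative only): reserve the last
 * byte of the buffer for the terminator, truncating src if needed. */
static void copy_always_terminated(char *dest, const char *src, size_t size)
{
    assert(size > 0);
    strncpy(dest, src, size - 1);
    dest[size - 1] = '\0';
}
```

With a 4-byte buffer and the source "abcdef", the first helper leaves the buffer without a terminator, while the second yields "abc". When the source is shorter than the buffer, both produce the same string, which is why code that forgets the manual terminator tends to pass casual testing.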

On the other hand, that's quite an arrogant opinion to have, and I can 
see a benefit to being totally libc compatible even if that design is 
broken. The real problem, though, is if strncpy can't be relied upon to 
always add the trailing zero, how do you do that by hand? How do you know 
where to put it given that UTF-8 characters aren't always the same width? 
The only correct way is to scan forwards from the start of the string 
testing the size of each character, then back up one after you overflow, 
which is ridiculous if you have to do that after every single copy call...
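
The forward scan described above can be sketched roughly as follows, in plain C. This is an illustration only, not Allegro code: the function names (utf8_char_width, utf8_fit, utf8_copy_terminated) are hypothetical, sizes are counted in bytes, and the character width is taken from the UTF-8 lead byte without further validation. Rather than overshooting and backing up one character, this version checks whether the next character fits before advancing, which finds the same cut point.

```c
#include <assert.h>
#include <string.h>

/* Width in bytes of the UTF-8 character starting at s (s[0] != '\0').
 * Clamped so that a sequence cut short by the terminator is never
 * read past; invalid lead bytes are treated as one-byte characters. */
static size_t utf8_char_width(const char *s)
{
    unsigned char lead = (unsigned char)s[0];
    size_t want, have;

    if (lead < 0x80)                want = 1;
    else if ((lead & 0xE0) == 0xC0) want = 2;
    else if ((lead & 0xF0) == 0xE0) want = 3;
    else if ((lead & 0xF8) == 0xF0) want = 4;
    else                            want = 1;

    for (have = 1; have < want && s[have] != '\0'; have++)
        ;
    return have;
}

/* Number of bytes of src, in whole characters, that fit in a buffer
 * of `size` bytes while leaving one byte for the terminator. */
static size_t utf8_fit(const char *src, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return 0;
    while (src[used] != '\0') {
        size_t w = utf8_char_width(src + used);
        if (used + w > size - 1)
            break;              /* next character would overflow */
        used += w;
    }
    return used;
}

/* Copy whole characters only and always terminate. Returns the number
 * of bytes copied, not counting the terminator. */
static size_t utf8_copy_terminated(char *dest, const char *src, size_t size)
{
    size_t n;

    assert(size > 0);
    n = utf8_fit(src, size);
    memcpy(dest, src, n);
    dest[n] = '\0';
    return n;
}
```

For example, copying "h\xC3\xA9llo" (where the second character occupies two bytes) into a 3-byte buffer keeps only "h", because the two-byte character plus the terminator would not fit. The scan has to start from the beginning of the string every time, which is the cost being complained about.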

-- 
Shawn Hargreaves - shawn@xxxxxxxxxx - http://www.talula.demon.co.uk/
