Re: [AD] Proposal to kill non-UTF-8 support



On Fri, Jan 04, 2002 at 11:32:17PM +1100, Peter Wang wrote:
> I'm proposing to remove all Unicode and codepage support so that the
> only strings Allegro understands are UTF-8 strings.

Big one... it would make set_uformat go away and let the user change
encodings on the fly without affecting the lib, since conversion would
live in an addon. I agree, but there's still a problem, which also
happens with 4.0.

> - No conversion to/from ASCII source, thus easier to get right.  This
> is more important for addons than Allegro itself.
>   
> - Unicode conversion routines are available in separate libraries, so
> they shouldn't be duplicated in Allegro.  And I don't think this is
> really Allegro's domain.

Then please make the file I/O routines unaware of Unicode. Otherwise you
get the following situation with current filesystems like FAT or ext2,
which (AFAIK) are just 8-bit ASCII:

 - the user gets a pure 8-bit filename, typed manually or taken from a
   directory listing
 - the user converts the 8-bit filename to UTF-8 and calls al_pack_fopen
 - al_pack_fopen converts the UTF-8 back to pure 8-bit ASCII in order to
   open the file
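
The round trip in the steps above can be sketched in plain C. This is only
an illustration and not Allegro code: latin1_to_utf8 and utf8_to_latin1 are
hypothetical helper names, and the sketch assumes the 8-bit filename really
is Latin-1, which is exactly the assumption that fails for other codepages.

```c
#include <stddef.h>

/* Hypothetical helper (not an Allegro function): convert a NUL-terminated
 * Latin-1 string to UTF-8. Returns the number of bytes written, not
 * counting the NUL, or (size_t)-1 if dst is too small. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
   const unsigned char *s = (const unsigned char *)src;
   size_t n = 0;

   for (; *s; s++) {
      if (*s < 0x80) {
         /* plain ASCII is copied unchanged */
         if (n + 1 >= dst_size) return (size_t)-1;
         dst[n++] = (char)*s;
      }
      else {
         /* U+0080..U+00FF become a two-byte UTF-8 sequence */
         if (n + 2 >= dst_size) return (size_t)-1;
         dst[n++] = (char)(0xC0 | (*s >> 6));
         dst[n++] = (char)(0x80 | (*s & 0x3F));
      }
   }
   if (n >= dst_size) return (size_t)-1;
   dst[n] = '\0';
   return n;
}

/* Hypothetical inverse: convert UTF-8 back to Latin-1. Fails with
 * (size_t)-1 on anything that does not fit in one Latin-1 byte. */
size_t utf8_to_latin1(const char *src, char *dst, size_t dst_size)
{
   const unsigned char *s = (const unsigned char *)src;
   size_t n = 0;

   while (*s) {
      unsigned char c;
      if (*s < 0x80) {
         c = *s++;
      }
      else if ((*s == 0xC2 || *s == 0xC3) && (s[1] & 0xC0) == 0x80) {
         /* only lead bytes C2 and C3 map into U+0080..U+00FF */
         c = (unsigned char)(((*s & 0x03) << 6) | (s[1] & 0x3F));
         s += 2;
      }
      else {
         return (size_t)-1;
      }
      if (n + 1 >= dst_size) return (size_t)-1;
      dst[n++] = (char)c;
   }
   if (n >= dst_size) return (size_t)-1;
   dst[n] = '\0';
   return n;
}
```

For a name that really is Latin-1 the two conversions cancel out, so the
work is wasted. For any other codepage the first step already guesses wrong.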

The following example illustrates the problem:

#include <allegro.h>
#include <stdio.h>

void callback(const char *filename, int attrib, int param)
{
   printf("Filename '%s'\n", filename);
}

int main(int argc, char *argv[])
{
   allegro_init();
   for_each_file("*.*", 0, callback, 0);
   return 0;
}
END_OF_MAIN();

Running this in a directory with these files...
-rw-rw-r--    1 gregorio gregorio        5 ene  4 18:49 Estoyñ'áíó ásé...
-rwxrwxr-x    1 gregorio gregorio   140160 ene  4 18:48 test
-rw-rw-r--    1 gregorio gregorio      250 ene  4 18:48 test.c

I get:
Filename 'test.c'
Filename 'test'
Filename 'Estoyñ'áíó ásé...'

Letting Allegro convert filenames to UTF-8 is not a great idea. I am
using a Latin-1 codepage, but what happens with somebody using another
codepage for filenames, like EUC-KR? Allegro could convert those 8-bit
characters to the wrong UTF-8 if no codepage is specified, and since you
are proposing to drop codepages, functions like al_pack_fopen would be
a pain to deal with unless you can hook codepage conversions into them,
which I guess is what you want to get rid of.
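
To make the "wrong UTF-8" point concrete, here is a small sketch of what a
codepage hook could look like. Everything in it is hypothetical: the
codepage_decoder type, bytes_to_utf8 and the two decoders are made-up names,
and decode_latin2_partial handles only the single byte this example needs.
It also sticks to single-byte codepages for brevity. A multi-byte encoding
like EUC-KR would need a stateful decoder.

```c
#include <stddef.h>

/* Hypothetical hook type: maps one byte of a single-byte codepage to a
 * Unicode code point. */
typedef unsigned long (*codepage_decoder)(unsigned char byte);

/* Latin-1 bytes are numerically equal to their code points. */
unsigned long decode_latin1(unsigned char b)
{
   return b;
}

/* ISO-8859-2, sketched for ONE byte only: 0xF1 is U+0144 there.
 * Every other byte is passed through, which is not correct in general. */
unsigned long decode_latin2_partial(unsigned char b)
{
   if (b == 0xF1) return 0x0144;
   return b;
}

/* Encode a code point below U+10000 as UTF-8, returning the byte count. */
static size_t put_utf8(unsigned long cp, char *out)
{
   if (cp < 0x80) {
      out[0] = (char)cp;
      return 1;
   }
   if (cp < 0x800) {
      out[0] = (char)(0xC0 | (cp >> 6));
      out[1] = (char)(0x80 | (cp & 0x3F));
      return 2;
   }
   out[0] = (char)(0xE0 | (cp >> 12));
   out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
   out[2] = (char)(0x80 | (cp & 0x3F));
   return 3;
}

/* Convert an 8-bit string to UTF-8 using the given decoder hook.
 * Returns bytes written (without the NUL) or (size_t)-1 on overflow. */
size_t bytes_to_utf8(const char *src, codepage_decoder decode,
                     char *dst, size_t dst_size)
{
   const unsigned char *s = (const unsigned char *)src;
   size_t n = 0;

   for (; *s; s++) {
      char tmp[3];
      size_t len = put_utf8(decode(*s), tmp);
      size_t i;
      if (n + len >= dst_size) return (size_t)-1;
      for (i = 0; i < len; i++)
         dst[n++] = tmp[i];
   }
   if (n >= dst_size) return (size_t)-1;
   dst[n] = '\0';
   return n;
}
```

The same filename byte 0xF1 comes out as C3 B1 when decoded as Latin-1 and
as C5 84 when decoded as ISO-8859-2, so a library that silently assumes one
codepage produces a different character for users of the other.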

If you make al_pack_fopen unaware of Unicode, the problem goes away
from the lib and can be dealt with in the user program. A program that
is aware of internationalization issues should let the user choose the
filename encoding (maybe through the allegro.cfg file or a Linux-like
LANG environment variable), so that names are converted and displayed
correctly by Allegro's text output functions.
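
As a rough sketch of the LANG idea, a user program could pull the codeset
out of a POSIX-style locale name of the form
language[_territory][.codeset][@modifier]. The function names
locale_codeset and guess_filename_encoding are hypothetical, and the
fallback value is an assumption, not anything Allegro defines.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the codeset part of a locale name such as
 * "es_ES.ISO-8859-1@euro" into out. Returns 1 on success, 0 if the name
 * has no codeset or out is too small. */
int locale_codeset(const char *locale, char *out, size_t out_size)
{
   const char *start, *end;
   size_t len;

   if (!locale) return 0;
   start = strchr(locale, '.');
   if (!start) return 0;
   start++;                        /* skip the dot */
   end = strchr(start, '@');       /* an optional modifier ends the codeset */
   len = end ? (size_t)(end - start) : strlen(start);
   if (len == 0 || len >= out_size) return 0;
   memcpy(out, start, len);
   out[len] = '\0';
   return 1;
}

/* Hypothetical helper: guess the filename encoding from LANG, falling
 * back to a caller-supplied default when LANG gives no codeset. */
const char *guess_filename_encoding(char *buf, size_t buf_size,
                                    const char *fallback)
{
   if (locale_codeset(getenv("LANG"), buf, buf_size))
      return buf;
   return fallback;
}
```

A program would call guess_filename_encoding once at startup, for example
with "ISO-8859-1" as the fallback, and use the result to pick the
conversion applied to filenames before passing them to text output.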

Oh wait, how would you do that with, say, file_selector? Yeah, rip out
the GUI too, let's SDLize everything...

PS: I haven't dealt with this situation myself, it's hypothetical. It
    would be much better to ask somebody actually using another encoding
    for filenames, or some Unicode guru, what really happens. Maybe they
    are forced to use Latin-1/UTF-8?

-- 
 Grzegorz Adam Hankiewicz   gradha@xxxxxxxxxx   http://gradha.infierno.org/


