Re: [hatari-devel] Character conversion for filenames in GEMDOS HD emulation
- To: "hatari-devel@xxxxxxxxxxxxxxxxxxx" <hatari-devel@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [hatari-devel] Character conversion for filenames in GEMDOS HD emulation
- From: Max Böhm <mboehm3@xxxxxxxxx>
- Date: Wed, 16 Jul 2014 23:00:16 +0200
Hi Thomas,
> Max, I now had a closer look at your patch, and I think it's basically
> a good approach, but there are some things that I'd like to discuss:
>
> 1) I really dislike this part in gemdos.c:
> #ifdef WIN32
> Str_AtariToWindows(pszFileName, pszFileNameHost, INVALID_CHAR);
> #else
> Str_AtariToUtf8(pszFileName, pszFileNameHost);
> #endif
> In the end, there is no need to export both functions to other files,
> so I think it would be better to have a "Str_AtariToHost" and a
> "Str_HostToAtari" where the implementation in str.c is taking care of
> the differences instead.
>
> 2) The extra step with mapWindowsToUnicode looks cumbersome ... why
> don't you add a proper mapAtariToWindows table directly instead?
>
> 3) Str_AtariToUtf8 can create a destination string that is "longer" than
> the source, since UTF8 characters can take multiple bytes, right? There
> seems to be at least one hunk in your patch where you don't take this
> into account so the destination buffer could overflow.
>
> 4) What if the (Linux) host system does not use a UTF-8 locale? I think
> there might still be some people around who use some latin-x locale
> instead.
>
> Thomas
Thanks for your comments, which make sense to me. I'm currently travelling. I'll provide an updated patch when I'm back on Wednesday next week.
The reason for using Unicode tables as the source and deriving the Windows <-> Atari mapping from them is that Unicode mapping tables exist for basically all character sets. This makes it easier to add support for a new character set. Internally, a Windows <-> Atari table is created by lazy initialization when the mapping is first used.
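To make the lazy initialization concrete, here is a rough sketch of the idea. It is not the code from the patch: the CharMap type, the function names AtariToWindowsChar/WindowsToAtariChar, the '+' placeholder and the four-entry table excerpts are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_CHAR '+'   /* placeholder for unmappable characters (assumed value) */
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* One entry of a "character set -> Unicode" table (hypothetical type). */
typedef struct { uint8_t ch; uint16_t ucs; } CharMap;

/* Tiny excerpts only; real tables would cover the whole range 0x80..0xFF. */
static const CharMap atariToUnicode[] = {
    { 0x81, 0x00FC }, { 0x82, 0x00E9 }, { 0x84, 0x00E4 }, { 0x94, 0x00F6 },
};
static const CharMap windowsToUnicode[] = {   /* cp1252 */
    { 0xE4, 0x00E4 }, { 0xE9, 0x00E9 }, { 0xF6, 0x00F6 }, { 0xFC, 0x00FC },
};

static uint8_t mapAtariToWindows[256];
static uint8_t mapWindowsToAtari[256];
static bool mapsReady = false;

/* Derive both direct tables by matching entries on their Unicode code point. */
static void BuildMaps(void)
{
    size_t a, w;
    int c;

    /* ASCII maps to itself, everything else is invalid until matched below. */
    for (c = 0; c < 256; c++)
        mapAtariToWindows[c] = mapWindowsToAtari[c] =
            (c < 0x80) ? (uint8_t)c : (uint8_t)INVALID_CHAR;

    for (a = 0; a < COUNT(atariToUnicode); a++) {
        /* The tables are expected to list only the non-ASCII half. */
        assert(atariToUnicode[a].ch >= 0x80);
        for (w = 0; w < COUNT(windowsToUnicode); w++) {
            if (atariToUnicode[a].ucs == windowsToUnicode[w].ucs) {
                mapAtariToWindows[atariToUnicode[a].ch] = windowsToUnicode[w].ch;
                mapWindowsToAtari[windowsToUnicode[w].ch] = atariToUnicode[a].ch;
            }
        }
    }
    mapsReady = true;
}

/* The direct tables are built on first use (lazy initialization). */
uint8_t AtariToWindowsChar(uint8_t c)
{
    if (!mapsReady)
        BuildMaps();
    return mapAtariToWindows[c];
}

uint8_t WindowsToAtariChar(uint8_t c)
{
    if (!mapsReady)
        BuildMaps();
    return mapWindowsToAtari[c];
}
```

With this layout, supporting another 8-bit character set would only mean adding one more "to Unicode" table; the direct Atari table for it falls out of the same matching loop.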
Currently only the UTF-8 and CP1252 character encodings are implemented. I can add support for reading a mapping table from a file and/or add tables for additional character sets.
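Regarding points 1 and 3, a single entry point with a bounded destination might look roughly like the sketch below. This is only an illustration under assumptions, not the updated patch: the extra size parameter, the bool return value, the helper names and the tiny lookup are all made up here, and only the UTF-8 path is shown (a WIN32 variant would sit behind the same function inside str.c).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INVALID_CHAR '+'   /* placeholder for unmappable characters (assumed value) */

/* Hypothetical lookup: Atari byte -> Unicode code point, 0 if unmapped.
 * Only a few entries are filled in for illustration. */
static uint16_t AtariCharToUnicode(uint8_t c)
{
    switch (c) {
    case 0x81: return 0x00FC;   /* u with diaeresis */
    case 0x84: return 0x00E4;   /* a with diaeresis */
    case 0x94: return 0x00F6;   /* o with diaeresis */
    default:   return (c < 0x80) ? c : 0;
    }
}

/* Encode one BMP code point as UTF-8; 'out' needs room for 3 bytes.
 * Returns the number of bytes written (1 to 3). */
static size_t Utf8Encode(uint16_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert an Atari string to the host encoding (UTF-8 in this sketch).
 * Never writes more than dstSize bytes, always NUL-terminates, and
 * returns false if the result had to be truncated. */
bool Str_AtariToHost(const char *src, char *dst, size_t dstSize)
{
    size_t used = 0;

    assert(dstSize > 0);
    for (; *src != '\0'; src++) {
        char buf[3];
        size_t n;
        uint16_t cp = AtariCharToUnicode((uint8_t)*src);

        if (cp == 0)
            cp = INVALID_CHAR;
        n = Utf8Encode(cp, buf);
        /* Need room for n bytes plus the terminating NUL. */
        if (used + n >= dstSize) {
            dst[used] = '\0';
            return false;
        }
        memcpy(dst + used, buf, n);
        used += n;
    }
    dst[used] = '\0';
    return true;
}
```

Since each Atari character takes at most 3 bytes in UTF-8, a caller that wants to rule out truncation could size the destination as 3 * strlen(src) + 1.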
Max