Re: [AD] al_font_textprintf() and uvszprintf()
On 2008-10-29, Thomas Fjellstrom <tfjellstrom@xxxxxxxxxx> wrote:
> On Wednesday 29 October 2008, Peter Wang wrote:
> > On 2008-10-29, Thomas Fjellstrom <tfjellstrom@xxxxxxxxxx> wrote:
> > > whatever is the current... It currently should handle unicode incoming
> > > strings. If it doesn't, sorry, my-bad.
> > >
> > > What should happen is the fshook api should send in "native" format
> > > strings, which could be ASCII, some codepage, 16bit unicode or UTF8
> > > depending on the platform and "locale". And fshook should be able to take
> > > in whatever format the user gives it that allegro supports.
> >
> > Sorry, I didn't find that very clear.
> >
> > Let's take an example. Say the filesystem encoding is UTF-8, but the
> > user decided to make his application use ISO-8859-1 (Latin-1) just because.
> > What happens with the path API?
>
> I would think the path api could do whatever it wants (use the current
> encoding); it's fshook and the current driver that need to care.
>
> This way everything is in the format the user expects, and the only place that
> needs a conversion is inside the fshook driver. Which does need fixing: the
> current stdio code doesn't bother to detect what the OS actually wants, and it
> really should. All that would take is a uconvert(U_CURRENT, ...,
> os_wants_this_mode, ...); wherever a path/string gets passed to the OS.
So if the current encoding is not a superset of the filesystem encoding,
it just doesn't work?
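
To make the superset question concrete, here is a minimal sketch in plain C
rather than Allegro's uconvert(). The function names latin1_to_utf8() and
utf8_to_latin1() are made up for this illustration. Going from Latin-1 to
UTF-8 always succeeds, but the reverse direction has to fail for any
character outside Latin-1:

```c
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated Latin-1 string to UTF-8.
 * Returns the number of bytes written (excluding the NUL), or (size_t)-1
 * if dst is too small. Every Latin-1 byte has a UTF-8 encoding, so this
 * direction cannot fail on the input itself. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;
    const unsigned char *p = (const unsigned char *)src;
    for (; *p; p++) {
        if (*p < 0x80) {
            if (n + 1 >= dst_size) return (size_t)-1;
            dst[n++] = (char)*p;
        } else {
            /* U+0080..U+00FF become a two-byte sequence. */
            if (n + 2 >= dst_size) return (size_t)-1;
            dst[n++] = (char)(0xC0 | (*p >> 6));
            dst[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    if (n >= dst_size) return (size_t)-1;
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: convert UTF-8 to Latin-1. Returns the number of
 * bytes written, or (size_t)-1 if the input holds a code point above
 * U+00FF, is malformed, or dst is too small. */
size_t utf8_to_latin1(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;
    const unsigned char *p = (const unsigned char *)src;
    while (*p) {
        unsigned char c;
        if (*p < 0x80) {
            c = *p++;
        } else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80) {
            /* Lead bytes C2 and C3 cover exactly U+0080..U+00FF. */
            c = (unsigned char)(((*p & 0x03) << 6) | (p[1] & 0x3F));
            p += 2;
        } else {
            return (size_t)-1;  /* not representable in Latin-1 */
        }
        if (n + 1 >= dst_size) return (size_t)-1;
        dst[n++] = (char)c;
    }
    if (n >= dst_size) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

With these, a file name containing, say, the euro sign (U+20AC) cannot be
handed back to a Latin-1 application at all, which is the failure case I am
asking about.
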
> > I would expect the path API to work with UTF-8 and not Latin-1,
> > otherwise there may be files on disk which are not accessible!
> > Of course, if the font addon works with Latin-1 then there may
> > be valid paths that won't display on the screen.
> >
> > And then there are Unicode normalisations to consider.
>
> Hm?
http://en.wikipedia.org/wiki/Unicode_normalization
As an example, consider if the user generates a path in Latin-1 but the
filesystem uses UTF-8. When we convert, there's a question of whether
accented characters should be precomposed or decomposed. Unices
traditionally don't interpret path characters apart from '/' and NUL
so if we convert wrong we won't be able to open the file.
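
A small sketch of why this matters, using a made-up file name. The two
strings below display identically but differ byte for byte, so a filesystem
that compares names as raw bytes treats them as different files. The helper
same_path_bytes() is hypothetical and only mimics that byte-wise comparison:

```c
#include <string.h>

/* Precomposed (NFC): U+00E9, encoded in UTF-8 as C3 A9. */
static const char cafe_nfc[] = "caf\xC3\xA9.txt";

/* Decomposed (NFD): 'e' followed by U+0301 COMBINING ACUTE ACCENT,
 * encoded in UTF-8 as 65 CC 81. */
static const char cafe_nfd[] = "cafe\xCC\x81.txt";

/* Hypothetical helper: compare two names the way a byte-oriented
 * filesystem would, with no normalisation. Returns 1 if identical. */
int same_path_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Here same_path_bytes(cafe_nfc, cafe_nfd) returns 0: the NFC name is 9 bytes
long and the NFD name is 10.
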
From what I can tell, generally people input strings in NFC (precomposed)
form so that's what gets stored on disk -- except on Mac OS X, which
uses NFD (decomposed). But it's okay to pass NFC strings to Mac OS X as
somewhere along the line it normalises to NFD.
However: what are the file names we get back from Mac OS X, say if we
are scanning a directory with readdir()? Will they come back in NFD,
and will that trip up users?
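
One rough way to find out would be to scan a directory and flag names that
contain combining marks. The sketch below assumes POSIX opendir()/readdir()
and UTF-8 file names; has_combining_mark() and report_decomposed_names() are
names invented for this example. It is only a heuristic: it looks for the
U+0300..U+036F block, and an NFC string can still contain a combining mark
when no precomposed character exists.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if the UTF-8 string contains a code point
 * in U+0300..U+036F (Combining Diacritical Marks), else 0.
 * In UTF-8 that range is CC 80..CC BF and CD 80..CD AF. */
int has_combining_mark(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    for (; *p; p++) {
        if (p[0] == 0xCC && p[1] >= 0x80 && p[1] <= 0xBF) return 1;
        if (p[0] == 0xCD && p[1] >= 0x80 && p[1] <= 0xAF) return 1;
    }
    return 0;
}

/* Hypothetical helper: print every entry in dir_path whose name contains
 * a combining mark. Returns the number of such names, or -1 if the
 * directory cannot be opened. */
int report_decomposed_names(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    int count = 0;

    if (!dir) return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (has_combining_mark(entry->d_name)) {
            printf("possibly decomposed: %s\n", entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Creating a file with an NFC name and then running report_decomposed_names()
on its directory should show whether the name comes back changed.
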
Peter