	* setup2.sgml (setup-locale): Mention three character codes per
	ISO 639-3.

	* setup2.sgml (setup-locale): Adapt description to the C using ASCII
	change in 1.7.2.
Corinna Vinschen 2010-01-17 14:55:57 +00:00
parent d24015235c
commit 0b8e38dd8b
2 changed files with 38 additions and 15 deletions

@@ -1,3 +1,14 @@
2010-01-17 Corinna Vinschen <corinna@vinschen.de>

	* setup2.sgml (setup-locale): Mention three character codes per
	ISO 639-3.

2010-01-17 Corinna Vinschen <corinna@vinschen.de>
	Andy Koppe <andy.koppe@gmail.com>

	* setup2.sgml (setup-locale): Adapt description to the C using ASCII
	change in 1.7.2.

2010-01-16 Christopher Faylor <me+cygwin@cgf.cx>

	* setup-net.sgml: Remove obsolete assertion.

@@ -183,8 +183,11 @@ specifier is</para>
language[[_TERRITORY][.charset][@modifier]]
</screen>
<para>"language" is a lowercase two character string per ISO 639-1,
"TERRITORY" is an uppercase two character string per ISO 3166, charset is
<para>"language" is a lowercase two character string per ISO 639-1, or,
if there is no ISO 639-1 code for the language (for instance, "Lower Sorbian"),
a three character string per ISO 639-3.</para>
<para>"TERRITORY" is an uppercase two character string per ISO 3166, charset is
one of a list of supported character sets, and the modifier doesn't matter
here (though it might for some applications). If you're interested in the
exact description, you can find it in the online publication of the POSIX
@@ -197,21 +200,23 @@ manual pages on the homepage of the
"de_CH" language = German, territory = Switzerland, default charset
"fr_FR.UTF-8" language = french, territory = France, charset = UTF-8
"ko_KR.eucKR" language = korean, territory = South Korea, charset = eucKR
"syr_SY" language = Syriac, territory = Syria, default charset
</screen>
<para>
At application startup, the application's locale is set to the default
"C" or "POSIX" locale. Under Cygwin, this locale defaults to the UTF-8
character set. If you want to stick to the "C" locale and only change to
another charset, you can define this by setting one of the locale environment
variables to "C.charset". For instance</para>
"C" or "POSIX" locale. Under Cygwin 1.7.2 and later, this locale defaults
to the ASCII character set on the application level. If you want to stick
to the "C" locale and only change to another charset, you can define this
by setting one of the locale environment variables to "C.charset". For
instance</para>
<screen>
"C.ISO-8859-1"
</screen>
<para>The default locale in the absence of the aforementioned locale
environment variables is "C.UTF-8".</para>
<note><para>The default locale in the absence of the aforementioned locale
environment variables is "C.UTF-8".</para></note>
<para>Windows uses the UTF-16 charset exclusively to store the names
of any object used by the Operating System. This is especially important
@@ -232,8 +237,8 @@ process.</para>
However, even if one of the locale environment variables is set to
some other value than "C", this does <emphasis>only</emphasis> affect
how Cygwin itself converts filenames. As the POSIX standard requires,
it's the applications responsibility to activate that locale for its
own purpose, typically by using the call</para>
it's the application's responsibility to activate that locale for its
own purposes, typically by using the call</para>
<screen>
setlocale (LC_ALL, "");
@@ -244,6 +249,18 @@ lost: If the application calls setlocale as above, and there is none
of the important locale variables set in the environment, the locale
is set to the default locale, which is "C.UTF-8".</para>
<para>But what about applications which are not locale-aware? Per POSIX,
they are running in the "C" or "POSIX" locale, which implies the ASCII
charset. The Cygwin DLL itself, however, will nevertheless use the locale
set in the environment (or the "C.UTF-8" default locale) for converting
filenames etc.</para>
<para>When the locale set in the environment specifies an ASCII charset,
for example "C" or "en_US.ASCII", Cygwin will still use UTF-8
under the hood to translate filenames. This allows for easier
interoperability with applications running in the default "C.UTF-8" locale.
</para>
<para>
Right now the language and territory, as well as the modifier, are not
important to Cygwin, except to fix a single problem. There's a class of
@@ -274,11 +291,6 @@ How does that work?</para>
<itemizedlist mark="bullet">
<listitem><para>
The default locale is the "C" or "POSIX" locale. Under Cygwin this locale
defaults to the UTF-8 character set.</para>
</listitem>
<listitem><para>
Assume that you've set one of the aforementioned environment variables to some
valid POSIX locale value, other than "C" and "POSIX". Assume further that