* cygwinenv.sgml: Move "codepage:xxx" to the removed options section.
Change text accordingly.
* new-features.sgml: Try to explain new way to define character sets.
Corinna Vinschen 2009-03-24 12:37:02 +00:00
parent 161211d186
commit 1c6743b74d
3 changed files with 39 additions and 28 deletions

@ -1,3 +1,9 @@
2009-03-24 Corinna Vinschen <corinna@vinschen.de>
* cygwinenv.sgml: Move "codepage:xxx" to the removed options section.
Change text accordingly.
* new-features.sgml: Try to explain new way to define character sets.
2009-03-18 Corinna Vinschen <corinna@vinschen.de>
* cygwin-ug-net.in.sgml: Update date.

@ -11,29 +11,6 @@ by prefixing with <literal>no</literal>.</para>
<itemizedlist mark="bullet">
<listitem>
<para><envar>codepage:[ansi|oem|utf8]</envar> - This option controls
which single- or multibyte character set is used for file and console
operations. Windows uses UTF-16 characters internally, and this
option specifies how 8-bit character sets are converted to UTF-16 and
vice versa. The default setting is <envar>ansi</envar>, which means that
conversion is based on the current ANSI codepage, typically 1252 in
many Western language versions of Windows. The name originates from the
ANSI Latin1 (ISO 8859-1) standard, used in Windows 1.0, though the
character sets have since diverged from any standard. The second
setting selects an older, DOS-based character set, containing various
line drawing and special characters. It is called <envar>oem</envar>
since it was originally encoded in the firmware of IBM PCs by original
equipment manufacturers (OEMs).</para>
<para>If you find that some characters (especially non-US or 'graphical' ones)
do not display correctly in Cygwin, you can use this option to select an
appropriate codepage. Finally, <envar>utf8</envar> treats all file names
and console characters as UTF-8 chars. Please note that, for correct
operation, you have to set the environment variable LANG or LC_ALL to
something like "en_US.UTF-8", otherwise many applications will not be
able to recognize UTF-8 strings correctly.</para>
</listitem>
<listitem>
<para><envar>(no)dosfilewarning</envar> - If set, Cygwin will warn the
first time a user uses an "MS-DOS" style path name rather than a POSIX-style
@ -194,6 +171,16 @@ information, read the documentation in <xref linkend="mount-table"></xref> and
<xref linkend="pathnames-casesensitive"></xref>.</para>
</listitem>
<listitem>
<para><envar>codepage:[ansi|oem]</envar> - This option controlled
which character set is used for file and console operations. Cygwin
now does all character conversion itself, based on the application's
call to the <function>setlocale()</function> function and, in
turn, on the setting of the environment variables <envar>$LANG</envar>,
<envar>$LC_ALL</envar>, or <envar>$LC_CTYPE</envar>, so this setting
has become obsolete.</para>
</listitem>
<listitem>
<para><envar>(no)ntea</envar> - This option has been removed since it
only fakes security which is considered dangerous and useless. It also

@ -17,13 +17,18 @@
are only local to the current session and disappear when the last
Cygwin process in the session exits.
- If a character in a filename cannot be represented in the current
character set, it is converted to the sequence Ctrl-N followed by the UTF-8
representation of the character. This makes it possible to access all files,
even those whose names have no valid representation in the current character
set (codepage). To always have a valid string, use the UTF-8 charset
by setting the environment variable $LANG, $LC_ALL, or $LC_CTYPE to a
valid POSIX value, for instance in Cygwin.bat like this:
set LC_CTYPE=en_US.UTF-8
- PATH_MAX is now 4096. Internally, path names can be as long as the
underlying OS can handle (32K).
- UTF-8 filenames are supported now. So far, this requires setting
the environment variable CYGWIN to contain "codepage:utf8", but this
will likely disappear at some point. The setting of $LANG or $LC_CTYPE
will be used instead.
- struct dirent now supports d_type, filled out with DT_REG or DT_DIR.
All other file types return as DT_UNKNOWN for performance reasons.
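The Ctrl-N escape described in the new filename entry above can be illustrated with a short sketch. This is not Cygwin's code (the real conversion happens in C inside the Cygwin DLL and may differ in detail); the function name `encode_filename` and the constant `SO` are hypothetical, and the sketch only shows the byte layout the text describes: characters the target charset can represent are encoded normally, and any other character becomes Ctrl-N (0x0E) followed by its UTF-8 bytes.

```python
# Illustrative only: mimics the byte layout described in the text,
# not the actual Cygwin implementation.
SO = b"\x0e"  # Ctrl-N


def encode_filename(name: str, charset: str) -> bytes:
    """Encode `name` in `charset`; escape unrepresentable characters
    as Ctrl-N followed by the character's UTF-8 encoding."""
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # The character has no representation in the target charset.
            out += SO + ch.encode("utf-8")
    return bytes(out)
```

With a UTF-8 target charset the escape branch is never taken, which matches the advice to set $LANG, $LC_ALL, or $LC_CTYPE to a UTF-8 locale.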
@ -176,6 +181,19 @@
<sect2 id="ov-new1.7-posix"><title>Other POSIX related changes</title>
<screen>
- A lot of character sets are supported now via a call to setlocale().
The setting of the environment variables $LANG, $LC_ALL or $LC_CTYPE will
be used. For instance, setting $LANG to "de_DE.ISO-8859-15" before
starting a Cygwin session will use the ISO-8859-15 character set in
the entire session. UTF-8 is supported as well, as in "en_US.UTF-8".
The full list of supported character sets: "ASCII", "ISO-8859-x" with x
in 1-16, except 12, "UTF-8", Windows codepages "CPxxx", with xxx in
(437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 1125,
1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258), "JIS", "SJIS",
"eucJP", "Big5". The leading language and territory part (en_US) is not
used by Cygwin yet, but is required for POSIX compatibility.
- Allow multiple concurrent read locks per thread for pthread_rwlock_t.
- Implement pthread_kill(thread, 0) as per POSIX.
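As a rough illustration of the character set entry in this hunk, the sketch below picks the charset part out of a locale value such as "de_DE.ISO-8859-15". It assumes the usual POSIX precedence ($LC_ALL, then $LC_CTYPE, then $LANG), which the text above does not spell out, and the helper name `charset_from_env` is hypothetical. What Cygwin falls back to when no charset is named is not covered by the text, so the sketch returns None in that case rather than guessing.

```python
from typing import Mapping, Optional


def charset_from_env(env: Mapping[str, str]) -> Optional[str]:
    """Return the charset named by the locale variables, or None.

    Assumes POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Empty values are treated as unset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            break
    else:
        return None  # none of the variables is set
    # Drop an optional "@modifier" suffix, e.g. "de_DE.ISO-8859-15@euro".
    value = value.split("@", 1)[0]
    # The charset, if present, follows the first dot.
    if "." in value:
        return value.split(".", 1)[1]
    return None
```

For example, `charset_from_env({"LANG": "de_DE.ISO-8859-15"})` yields "ISO-8859-15", and adding `"LC_ALL": "en_US.UTF-8"` to the mapping changes the result to "UTF-8".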