938 lines
41 KiB
Plaintext
938 lines
41 KiB
Plaintext
<sect1 id="using-pathnames"><title>Mapping path names</title>
|
|
|
|
<sect2 id="pathnames-intro"><title>Introduction</title>
|
|
|
|
<para>Cygwin supports both Win32- and POSIX-style paths. Directory
|
|
delimiters may be either forward slashes or backslashes. Paths using
|
|
backslashes are always handled as Win32 paths. POSIX paths must only
|
|
use forward slashes as delimiter, otherwise they are treated as Win32
|
|
paths and file access might fail in surprising ways. UNC pathnames
|
|
(starting with two slashes and a network name) are also supported.</para>
|
|
|
|
<note><para>The usage of Win32 paths, though possible, is deprecated,
|
|
since it circumvents important internal path handling mechanisms.
|
|
See <xref linkend="pathnames-win32"></xref> and
|
|
<xref linkend="pathnames-win32-api"></xref> for more information.
|
|
</para></note>
|
|
|
|
<para>POSIX operating systems (such as Linux) do not have the concept
|
|
of drive letters. Instead, all absolute paths begin with a
|
|
slash (instead of a drive letter such as "c:") and all file systems
|
|
appear as subdirectories (for example, you might buy a new disk and
|
|
make it be the <filename>/disk2</filename> directory).</para>
|
|
|
|
<para>Because many programs written to run on UNIX systems assume
|
|
the existance of a single unified POSIX file system structure, Cygwin
|
|
maintains a special internal POSIX view of the Win32 file system
|
|
that allows these programs to successfully run under Windows. Cygwin
|
|
uses this mapping to translate from POSIX to Win32 paths as
|
|
necessary.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="mount-table"><title>The Cygwin Mount Table</title>
|
|
|
|
<para>The <filename>/etc/fstab</filename> file is used to map Win32
|
|
drives and network shares into Cygwin's internal POSIX directory tree.
|
|
This is a similar concept to the typical UNIX fstab file. The mount
|
|
points stored in <filename>/etc/fstab</filename> are globally set for
|
|
all users. Sometimes there's a requirement to have user specific
|
|
mount points. The Cygwin DLL supports user specific fstab files.
|
|
These are stored in the directory <filename>/etc/fstab.d</filename>
|
|
and the name of the file is the Cygwin username of the user, as it's
|
|
stored in the <filename>/etc/passwd</filename> file. The structure of the
|
|
user specific file is identical to the system-wide
|
|
<filename>fstab</filename> file.</para>
|
|
|
|
<para>The file fstab contains descriptive information about the various file
|
|
systems. fstab is only read by programs, and not written; it is the
|
|
duty of the system administrator to properly create and maintain this
|
|
file. Each filesystem is described on a separate line; fields on each
|
|
line are separated by tabs or spaces. Lines starting with '#' are
|
|
comments.</para>
|
|
|
|
<para>The first field describes the block special device or
|
|
remote filesystem to be mounted. On Cygwin, this is the native Windows
|
|
path which the mount point links in. As path separator you MUST use a
|
|
slash. Usage of a backslash might lead to unexpected results. UNC
|
|
paths (using slashes, not backslashes) are allowed. If the path
|
|
contains spaces these can be escaped as <literal>'\040'</literal>.</para>
|
|
|
|
<para>The second field describes the mount point for the filesystem.
|
|
If the name of the mount point contains spaces these can be
|
|
escaped as '\040'.</para>
|
|
|
|
<para>The third field describes the type of the filesystem.
|
|
Cygwin supports any string here, since the file system type is usually
|
|
not evaluated. The notable exception is the file system type
|
|
cygdrive. This type is used to set the cygdrive prefix.</para>
|
|
|
|
<para>The fourth field describes the mount options associated
|
|
with the filesystem. It is formatted as a comma separated list of
|
|
options. It contains at least the type of mount (binary or text) plus
|
|
any additional options appropriate to the filesystem type. Recognized
|
|
options are binary, text, nouser, user, exec, notexec, cygexec, nosuid,
|
|
posix=[0|1]. The meaning of the options is as follows.</para>
|
|
|
|
<screen>
|
|
acl - Cygwin uses the filesystem's access control lists (ACLs) to
|
|
implement real POSIX permissions (default). This flag only
|
|
affects filesystems supporting ACLs (NTFS) and is ignored
|
|
otherwise.
|
|
auto - Ignored.
|
|
binary - Files default to binary mode (default).
|
|
bind - Allows to remount part of the file hierarchy somewhere else.
|
|
In contrast to other entries, the first field in the fstab
|
|
line specifies an absolute POSIX path. This path is remounted
|
|
to the POSIX path specified as the second path. The conversion
|
|
to a Win32 path is done on the fly. Only the root path and
|
|
paths preceding the bind entry in the fstab file are used to
|
|
convert the POSIX path in the first field to an absolute Win32
|
|
path. Note that symlinks are ignored while performing this path
|
|
conversion.
|
|
cygexec - Treat all files below mount point as cygwin executables.
|
|
dos - Always convert leading spaces and trailing dots and spaces to
|
|
characters in the UNICODE private use area. This allows to use
|
|
broken filesystems which only allow DOS filenames, even if they
|
|
are not recognized as such by Cygwin.
|
|
exec - Treat all files below mount point as executable.
|
|
ihash - Always fake inode numbers rather than using the ones returned
|
|
by the filesystem. This allows to use broken filesystems which
|
|
don't return unambiguous inode numbers, even if they are not
|
|
recognized as such by Cygwin.
|
|
noacl - Cygwin ignores filesystem ACLs and only fakes a subset of
|
|
permission bits based on the DOS readonly attribute. This
|
|
behaviour is the default on FAT and FAT32. The flag is
|
|
ignored on NFS filesystems.
|
|
nosuid - No suid files are allowed (currently unimplemented).
|
|
notexec - Treat all files below mount point as not executable.
|
|
nouser - Mount is a system-wide mount.
|
|
override - Force the override of an immutable mount point (currently "/").
|
|
posix=0 - Switch off case sensitivity for paths under this mount point
|
|
(default for the cygdrive prefix).
|
|
posix=1 - Switch on case sensitivity for paths under this mount point
|
|
(default for all other mount points).
|
|
text - Files default to CRLF text mode line endings.
|
|
user - Mount is a user mount.
|
|
</screen>
|
|
|
|
<para>While normally the execute permission bits are used to evaluate
|
|
executability, this is not possible on filesystems which don't support
|
|
permissions at all (like FAT/FAT32), or if ACLs are ignored on filesystems
|
|
supporting them (see the aforementioned <literal>acl</literal> mount option).
|
|
In these cases, the following heuristic is used to evaluate if a file is
|
|
executable: Files ending in certain extensions (.exe, .com, .bat, .btm,
|
|
.cmd) are assumed to be executable. Files whose first two characters begin
|
|
with '#!' are also considered to be executable.
|
|
The <literal>exec</literal> option is used to instruct Cygwin that the
|
|
mounted file is "executable". If the <literal>exec</literal> option is used
|
|
with a directory then all files in the directory are executable.
|
|
This option allows other files to be marked as executable and avoids the
|
|
overhead of opening each file to check for a '#!'. The
|
|
<literal>cygexec</literal> option is very similar to <literal>exec</literal>,
|
|
but also prevents Cygwin from setting up commands and environment variables
|
|
for a normal Windows program, adding another small performance gain. The
|
|
opposite of these options is the <literal>notexec</literal> option, which
|
|
means that no files should be marked as executable under that mount point.</para>
|
|
<para>A correct root directory is quite essential to the operation of
|
|
Cygwin. A default root directory is evaluated at startup so a
|
|
<filename>fstab</filename> entry for the root directory is not necessary.
|
|
If it's wrong, nothing will work as expected. Therefore, the root directory
|
|
evaluated by Cygwin itself is treated as an immutable mount point and can't
|
|
be overridden in /etc/fstab... unless you think you really know what you're
|
|
doing. In this case, use the <literal>override</literal> flag in the options
|
|
field in the <filename>/etc/fstab</filename> file. Since this is a dangerous
|
|
thing to do, do so at your own risk.</para>
|
|
|
|
<para><filename>/usr/bin</filename> and <filename>/usr/lib</filename> are
|
|
by default also automatic mount points generated by the Cygwin DLL similar
|
|
to the way the root directory is evaluated. <filename>/usr/bin</filename>
|
|
points to the directory the Cygwin DLL is installed in,
|
|
<filename>/usr/lib</filename> is supposed to point to the
|
|
<filename>/lib</filename> directory. This choice is safe and usually
|
|
shouldn't be changed. An fstab entry for them is not required.</para>
|
|
|
|
<para><literal>nouser</literal> mount points are not overridable by a later
|
|
call to <command>mount</command>.
|
|
Mount points given in <filename>/etc/fstab</filename> are by default
|
|
<literal>nouser</literal> mount points, unless you specify the option
|
|
<literal>user</literal>. This allows the administrator to set certain
|
|
paths so that they are not overridable by users. In contrast, all mount
|
|
points in the user specific fstab file are <literal>user</literal> mount
|
|
points.</para>
|
|
|
|
<para>The fifth and sixth field are ignored. They are
|
|
so far only specified to keep a Linux-like fstab file layout.</para>
|
|
|
|
<para>Note that you don't have to specify an fstab entry for the root dir,
|
|
unless you want to have the root dir pointing to somewhere entirely
|
|
different (hopefully you know what you're doing), or if you want to
|
|
mount the root dir with special options (for instance, as text mount).</para>
|
|
|
|
<para>Example entries:</para>
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>Just a normal mount point:</para>
|
|
<screen>c:/foo /bar fat32 binary 0 0</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>A mount point for a textmode mount with case sensitivity switched off:</para>
|
|
<screen>C:/foo /bar/baz ntfs text,posix=0 0 0</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>A mount point for a Windows directory with spaces in it:</para>
|
|
<screen>C:/Documents\040and\040Settings /docs ext3 binary 0 0</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>A mount point for a remote directory without ACL support:</para>
|
|
<screen>//server/share/subdir /srv/subdir smbfs binary,noacl 0 0</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>This is just a comment:</para>
|
|
<screen># This is just a comment</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Set the cygdrive prefix to /mnt:</para>
|
|
<screen>none /mnt cygdrive binary 0 0</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Remount /var to /usr/var:</para>
|
|
<screen>/var /usr/var none bind</screen>
|
|
<para>Assuming <filename>/var</filename> points to
|
|
<filename>C:/cygwin/var</filename>, <filename>/usr/var</filename> now
|
|
also points to <filename>C:/cygwin/var</filename>. This is equivalent
|
|
to the Linux <literal>bind</literal> option available since
|
|
Linux 2.4.0.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Whenever Cygwin generates a Win32 path from a POSIX one, it uses
|
|
the longest matching prefix in the mount table. Thus, if
|
|
<filename>C:</filename> is mounted as <filename>/c</filename> and also
|
|
as <filename>/</filename>, then Cygwin would translate
|
|
<filename>C:/foo/bar</filename> to <filename>/c/foo/bar</filename>.
|
|
This translation is normally only used when trying to derive the
|
|
POSIX equivalent current directory. Otherwise, the handling of MS-DOS
|
|
filenames bypasses the mount table.
|
|
</para>
|
|
|
|
<para>If you want to see the current set of mount points valid in your
|
|
session, you can invoking the Cygwin tool <command>mount</command> without
|
|
arguments:</para>
|
|
|
|
<example id="pathnames-mount-ex">
|
|
<title>Displaying the current set of mount points</title>
|
|
<screen>
|
|
<prompt>bash$</prompt> <userinput>mount</userinput>
|
|
f:/cygwin/bin on /usr/bin type system (binary,auto)
|
|
f:/cygwin/lib on /usr/lib type system (binary,auto)
|
|
f:/cygwin on / type system (binary,auto)
|
|
e:/src on /usr/src type system (binary)
|
|
c: on /cygdrive/c type user (binary,posix=0,user,noumount,auto)
|
|
e: on /cygdrive/e type user (binary,posix=0,user,noumount,auto)
|
|
</screen>
|
|
</example>
|
|
|
|
<para>You can also use the <command>mount</command> command to add
|
|
new mount points, and the <command>umount</command> to delete
|
|
them. However, since they are only noted in memory, these mount
|
|
points will disappear as soon as your last Cygwin process ends.
|
|
See <xref linkend="mount"></xref> and <xref linkend="umount"></xref> for more
|
|
information.</para>
|
|
|
|
<note><para>
|
|
When you upgrade an existing older Cygwin installation to Cygwin 1.7,
|
|
your old system mount points (stored in the HKEY_LOCAL_MACHINE branch
|
|
of your registry) are read by a script and the <filename>/etc/fstab</filename>
|
|
file is generated from these entries. Note that entries for
|
|
<filename>/</filename>, <filename>/usr/bin</filename>, and
|
|
<filename>/usr/lib</filename> are <emphasis>never</emphasis> generated.
|
|
</para>
|
|
|
|
<para>
|
|
The old user mount points in your HKEY_CURRENT_USER branch of the registry
|
|
are not used to generate <filename>/etc/fstab</filename>. If you want
|
|
to create a user specific <filename>/etc/fstab.d/${USER}</filename> file
|
|
from your old entries, there's a script available which does exactly
|
|
that for you, <filename>/bin/copy-user-registry-fstab</filename>. Just
|
|
start the script and it will create your user specific fstab file. Stop
|
|
all your Cygwin processes and restart them, and you can simply use your
|
|
old user mount points as before.
|
|
</para></note>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="cygdrive"><title>The cygdrive path prefix</title>
|
|
|
|
<para>As already outlined in <xref linkend="ov-hi-files"></xref>, you can
|
|
access arbitary drives on your system by using the cygdrive path prefix.
|
|
The default value for this prefix is <literal>/cygdrive</literal>, and
|
|
a path to any drive can be constructed by using the cygdrive prefix and
|
|
appending the drive letter as subdirectory, like this:</para>
|
|
|
|
<screen>
|
|
bash$ ls -l /cygdrive/f/somedir
|
|
</screen>
|
|
|
|
<para>This lists the content of the directory F:\somedir.</para>
|
|
|
|
<para>The cygdrive prefix is a virtual directory under which all drives
|
|
on a system are subsumed. The mount options of the cygdrive prefix is
|
|
used for all file access through the cygdrive prefixed drives. For instance,
|
|
assuming the cygdrive mount options are <literal>binary,posix=0</literal>,
|
|
then any file <literal>/cygdrive/x/file</literal> will be opened in
|
|
binary mode by default (mount option <literal>binary</literal>, and the case
|
|
of the filename doesn't matter (mount option <literal>posix=0</literal>.
|
|
</para>
|
|
|
|
<para>The cygdrive prefix flags are also used for all UNC paths starting with
|
|
two slashes, unless they are accessed through a mount point. For instance,
|
|
consider these <filename>/etc/fstab</filename> entries:</para>
|
|
|
|
<screen>
|
|
//server/share /mysrv ntfs posix=1,acl 0 0
|
|
none /cygdrive cygdrive posix=0,noacl 0 0
|
|
</screen>
|
|
|
|
<para>Assume there's a file <filename>\\server\share\foo</filename> on the
|
|
share. When accessing it as <filename>/mysrv/foo</filename>, then the flags
|
|
<literal>posix=1,acl</literal> of the /mysrv mount point are used. When
|
|
accessing it as <filename>//server/share/foo</filename>, then the flags
|
|
for the cygdrive prefix, <literal>posix=0,noacl</literal> are used.</para>
|
|
|
|
<note><para>This only applies to UNC paths using forward slashes. When
|
|
using backslashes the flags for native paths are used. See
|
|
<xref linkend="pathnames-win32"></xref>.</para></note>
|
|
|
|
<para>The cygdrive prefix may be changed in the fstab file as outlined above.
|
|
Please note that you must not use the cygdrive prefix for any other mount
|
|
point. For instance this:</para>
|
|
|
|
<screen>
|
|
none /cygdrive cygdrive binary 0 0
|
|
D: /cygdrive/d somefs text 0 0
|
|
</screen>
|
|
|
|
<para>will not make file access using the /mnt/d path prefix suddenly using
|
|
textmode. If you want to mount any drive explicitly in another mode than
|
|
the cygdrive prefix, use a distinct path prefix:</para>
|
|
|
|
<screen>
|
|
none /cygdrive cygdrive binary 0 0
|
|
D: /mnt/d somefs text 0 0
|
|
</screen>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-win32"><title>Using native Win32 paths</title>
|
|
|
|
<para>Using native Win32 paths in Cygwin, while possible, is generally
|
|
inadvisable. Those paths circumvent all internal integrity checking and
|
|
bypass the information given in the Cygwin mount table.</para>
|
|
|
|
<para>The following paths are treated as native Win32 paths in Cygwin:</para>
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>All paths starting with a drive specifier</para>
|
|
<screen>
|
|
C:\foo
|
|
C:/foo
|
|
</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>All paths containing at least one backslash as path component</para>
|
|
<screen>
|
|
C:/foo/bar<emphasis role='bold'>\</emphasis>baz/...
|
|
</screen>
|
|
</listitem>
|
|
<listitem>
|
|
<para>UNC paths using backslashes</para>
|
|
<screen>
|
|
\\server\share\...
|
|
</screen>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>When accessing files using native Win32 paths as above, Cygwin uses a
|
|
default setting for the mount flags. All paths using DOS notation will be
|
|
treated as caseinsensitive, and permissions are just faked as if the
|
|
underlying drive is a FAT drive. This also applies to NTFS and other
|
|
filesystems which usually are capable of casesensitivity and storing
|
|
permissions.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-win32-api"><title>Using the Win32 file API in Cygwin applications</title>
|
|
|
|
<para>Special care must be taken when your application uses Win32 file API
|
|
functions like <function>CreateFile</function> to access files using relative
|
|
pathnames.</para>
|
|
|
|
<para>When a Cygwin application is started, the Win32 idea of the current
|
|
working directory (CWD) is set to an invalid directory. This works around
|
|
the problem that the Win32 CWD is locked in a way which restricts POSIX
|
|
functionality. However, the side effect is that a call to, say,
|
|
<function>CreateFile ("foo", ...);</function> will fail, since the Win32
|
|
notion of the CWD is <emphasis>not</emphasis> the same as the Cygwin notion
|
|
of the CWD.</para>
|
|
|
|
<para>So, in general, don't use the Win32 file API in Cygwin applications.
|
|
If you <emphasis>really</emphasis> need to access files using the Win32 API,
|
|
and if you <emphasis>really</emphasis> have to use relative pathnames,
|
|
you have two choices.</para>
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>Either you call <function>SetCurrentDirectory</function> before
|
|
calling <function>CreateFile</function>.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Or you compile your application as native Win32 (mingw) executable,
|
|
rather than as Cygwin executable.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-additional"><title>Additional Path-related Information</title>
|
|
|
|
<para>The <command>cygpath</command> program provides the ability to
|
|
translate between Win32 and POSIX pathnames in shell scripts. See
|
|
<xref linkend="cygpath"></xref> for the details.</para>
|
|
|
|
<para>The <envar>HOME</envar>, <envar>PATH</envar>, and
|
|
<envar>LD_LIBRARY_PATH</envar> environment variables are automatically
|
|
converted from Win32 format to POSIX format (e.g. from
|
|
<filename>c:/cygwin\bin</filename> to <filename>/bin</filename>, if
|
|
there was a mount from that Win32 path to that POSIX path) when a Cygwin
|
|
process first starts.</para>
|
|
|
|
<para>Symbolic links can also be used to map Win32 pathnames to POSIX.
|
|
For example, the command
|
|
<command>ln -s //pollux/home/joe/data /data</command> would have about
|
|
the same effect as creating a mount point from
|
|
<filename>//pollux/home/joe/data</filename> to <filename>/data</filename>
|
|
using <command>mount</command>, except that symbolic links cannot set
|
|
the default file access mode. Other differences are that the mapping is
|
|
distributed throughout the file system and proceeds by iteratively
|
|
walking the directory tree instead of matching the longest prefix in a
|
|
kernel table. Note that symbolic links will only work on network
|
|
drives that are properly configured to support the "system" file
|
|
attribute. Many do not do so by default (the Unix Samba server does
|
|
not by default, for example).</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="using-specialnames"><title>Special filenames</title>
|
|
|
|
<sect2 id="pathnames-etc"><title>Special files in /etc</title>
|
|
|
|
<para>Certain files in Cygwin's <filename>/etc</filename> directory are
|
|
read by Cygwin before the mount table has been established. The list
|
|
of files is</para>
|
|
|
|
<screen>
|
|
/etc/fstab
|
|
/etc/fstab.d/$USER
|
|
/etc/passwd
|
|
/etc/group
|
|
</screen>
|
|
|
|
<para>These file are read using native Windows NT functions which have
|
|
no notion of Cygwin symlinks or POSIX paths. For that reason
|
|
there are a few requirements as far as <filename>/etc</filename> is
|
|
concerned.</para>
|
|
|
|
<para>To access these files, the Cygwin DLL evaluates it's own full
|
|
Windows path, strips off the innermost directory component and adds
|
|
"\etc". Let's assume the Cygwin DLL is installed as
|
|
<filename>C:\cygwin\bin\cygwin1.dll</filename>. First the DLL name as
|
|
well as the innermost directory (<filename>bin</filename>) is stripped
|
|
off: <filename>C:\cygwin\</filename>. Then "etc" and the filename to
|
|
look for is attached: <filename>C:\cygwin\etc\fstab</filename>. So the
|
|
/etc directory must be parallel to the directory in which the cygwin1.dll
|
|
exists and <filename>/etc</filename> must not be a Cygwin symlink
|
|
pointing to another directory. Consequentially none of the files from
|
|
the above list, including the directory <filename>/etc/fstab.d</filename>
|
|
is allowed to be a Cygwin symlink either.</para>
|
|
|
|
<para>However, native NTFS symlinks and reparse points are transparent
|
|
when accessing the above files so all these files as well as
|
|
<filename>/etc</filename> itself may be NTFS symlinks or reparse
|
|
points.</para>
|
|
|
|
<para>Last but not least, make sure that these files are world-readable.
|
|
Every process of any user account has to read these files potentially,
|
|
so world-readability is essential. The only exception are the user
|
|
specific files <filename>/etc/fstab.d/$USER</filename>, which only have
|
|
to be readable by the $USER user account itself.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-dosdevices"><title>Invalid filenames</title>
|
|
|
|
<para>Filenames invalid under Win32 are not necessarily invalid
|
|
under Cygwin since release 1.7.0. There are a few rules which
|
|
apply to Windows filenames. Most notably, DOS device names like
|
|
<filename>AUX</filename>, <filename>COM1</filename>,
|
|
<filename>LPT1</filename> or <filename>PRN</filename> (to name a few)
|
|
cannot be used as filename or extension in a native Win32 application.
|
|
So filenames like <filename>prn.txt</filename> or <filename>foo.aux</filename>
|
|
are invalid filenames for native Win32 applications.</para>
|
|
|
|
<para>This restriction doesn't apply to Cygwin applications. Cygwin
|
|
can create and access files with such names just fine. Just don't try
|
|
to use these files with native Win32 applications.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-specialchars">
|
|
<title>Forbidden characters in filenames</title>
|
|
|
|
<para>Some characters are disallowed in filenames on Windows filesystems.
|
|
These forbidden characters are the ASCII control characters from ASCII
|
|
value 1 to 31, plus the following characters which have a special meaning
|
|
in the Win32 API:</para>
|
|
|
|
<screen>
|
|
" * : < > ? | \
|
|
</screen>
|
|
|
|
<para>Cygwin can't fix this, but it has a method to workaround this
|
|
restriction. All of the above characters, except for the backslash,
|
|
are converted to special UNICODE characters in the range 0xf000 to 0xf0ff
|
|
(the "Private use area") when creating or accessing files.</para>
|
|
|
|
<para>The backslash has to be exempt from this conversion, because Cygwin
|
|
accepts Win32 filenames including backslashes as path separators on input.
|
|
Converting backslashes using the above method would make this impossible.</para>
|
|
|
|
<para>Additionally Win32 filenames can't contain trailing dots and spaces
|
|
for DOS backward compatibility. When trying to create files with trailing
|
|
dots or spaces, all of them are removed before the file is created. This
|
|
restriction only affects native Win32 applications. Cygwin applications
|
|
can create and access files with trailing dots and spaces without problems.
|
|
</para>
|
|
|
|
<para>An exception from this rule are some network filesystems (NetApp,
|
|
NWFS) which choke on these filenames. They return with an error like
|
|
"No such file or directory" when trying to create such files. Starting
|
|
with Cygwin 1.7.6, Cygwin recognizes these filesystems and works around
|
|
this problem by applying the same rule as for the other forbidden characters.
|
|
Leading spaces and trailing dots and spaces will be converted to UNICODE
|
|
characters in the private use area. This behaviour can be switched on
|
|
explicitely for a filesystem or a directory tree by using the mount option
|
|
<literal>dos</literal>.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-unusual">
|
|
<title>Filenames with unusual (foreign) characters</title>
|
|
|
|
<para> Windows filesystems use Unicode encoded as UTF-16
|
|
to store filename information. If you don't use the UTF-8
|
|
character set (see <xref linkend="setup-locale"></xref>) then there's a
|
|
chance that a filename is using one or more characters which have no
|
|
representation in the character set you're using.</para>
|
|
|
|
<note><para>In the default "C" locale, Cygwin creates filenames using
|
|
the UTF-8 charset. This will always result in some valid filename by
|
|
default, but again might impose problems when switching to a non-"C"
|
|
or non-"UTF-8" charset.</para></note>
|
|
|
|
<note><para>To avoid this scenario altogether, always use UTF-8 as the
|
|
character set.</para></note>
|
|
|
|
<para>If you don't want or can't use UTF-8 as character set for whatever
|
|
reason, you will nevertheless be able to access the file. How does that
|
|
work? When Cygwin converts the filename from UTF-16 to your character
|
|
set, it recognizes characters which can't be converted. If that occurs,
|
|
Cygwin replaces the non-convertible character with a special character
|
|
sequence. The sequence starts with an ASCII CAN character (hex code
|
|
0x18, equivalent Control-X), followed by the UTF-8 representation of the
|
|
character. The result is a filename containing some ugly looking
|
|
characters. While it doesn't <emphasis>look</emphasis> nice, it
|
|
<emphasis>is</emphasis> nice, because Cygwin knows how to convert this
|
|
filename back to UTF-16. The filename will be converted using your
|
|
usual character set. However, when Cygwin recognizes an ASCII CAN
|
|
character, it skips over the ASCII CAN and handles the following bytes as
|
|
a UTF-8 character. Thus, the filename is symmetrically converted back to
|
|
UTF-16 and you can access the file.</para>
|
|
|
|
<note><para>Please be aware that this method is not entirely foolproof.
|
|
In some character set combinations it might not work for certain native
|
|
characters.</para>
|
|
|
|
<para>Only by using the UTF-8 charset you can avoid this problem safely.
|
|
</para></note>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-casesensitive">
|
|
<title>Case sensitive filenames</title>
|
|
|
|
<para>In the Win32 subsystem filenames are only case-preserved, but not
|
|
case-sensitive. You can't access two files in the same directory which
|
|
only differ by case, like <filename>Abc</filename> and
|
|
<filename>aBc</filename>. While NTFS (and some remote filesystems)
|
|
support case-sensitivity, the NT kernel starting with Windows XP does
|
|
not support it by default. Rather, you have to tweak a registry setting
|
|
and reboot. For that reason, case-sensitivity can not be supported by Cygwin,
|
|
unless you change that registry value.</para>
|
|
|
|
<para>If you really want case-sensitivity in Cygwin, you can switch it
|
|
on by setting the registry value</para>
|
|
|
|
<screen>
|
|
HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\obcaseinsensitive
|
|
</screen>
|
|
|
|
<para>to 0 and reboot the machine. For least surprise, Cygwin expects
|
|
this registry value also on Windows NT4 and Windows 2000, which usually
|
|
both don't know this registry key. If you want case-sensitivity on these
|
|
systems, create that registry value and set it to 0. On these systems
|
|
(and <emphasis role='bold'>only</emphasis> on these systems) you don't have to reboot to bring it
|
|
into effect, rather stopping all Cygwin processes and then restarting them
|
|
is sufficient.</para>
|
|
|
|
<note>
|
|
<para>
|
|
When installing Microsoft's Services For Unix (SFU), you're asked if
|
|
you want to use case-sensitive filenames. If you answer "yes" at this point,
|
|
the installer will change the aforementioned registry value to 0, too. So, if
|
|
you have SFU installed, there's some chance that the registry value is already
|
|
set to case sensitivity.
|
|
</para>
|
|
</note>
|
|
|
|
<para>After you set this registry value to 0, Cygwin will be case-sensitive
|
|
by default on NTFS and NFS filesystems. However, there are limitations:
|
|
while two <emphasis role='bold'>programs</emphasis> <filename>Abc.exe</filename>
|
|
and <filename>aBc.exe</filename> can be created and accessed like other files,
|
|
starting applications is still case-insensitive due to Windows limitations
|
|
and so the program you try to launch may not be the one actually started. Also,
|
|
be aware that using two filenames which only differ by case might
|
|
result in some weird interoperability issues with native Win32 applications.
|
|
You're using case-sensitivity at your own risk. You have been warned! </para>
|
|
|
|
<para>Even if you use case-sensitivity, it might be feasible to switch to
|
|
case-insensitivity for certain paths for better interoperability with
|
|
native Win32 applications (even if it's just Windows Explorer). You can do
|
|
this on a per-mount point base, by using the "posix=0" mount option in
|
|
<filename>/etc/fstab</filename>, or your <filename>/etc/fstab.d/$USER</filename>
|
|
file.</para>
|
|
|
|
<para><filename>/cygdrive</filename> paths are case-insensitive by default.
|
|
The reason is that the native Windows %PATH% environment variable is not
|
|
always using the correct case for all paths in it. As a result, if you use
|
|
case-sensitivity on the <filename>/cygdrive</filename> prefix, your shell
|
|
might claim that it can't find Windows commands like <command>attrib</command>
|
|
or <command>net</command>. To ease the pain, the <filename>/cygdrive</filename>
|
|
path is case-insensitive by default and you have to use the "posix=1" setting
|
|
explicitly in <filename>/etc/fstab</filename> or
|
|
<filename>/etc/fstab.d/$USER</filename> to switch it to case-sensitivity,
|
|
or you have to make sure that the native Win32 %PATH% environment variable
|
|
is using the correct case for all paths throughout.</para>
|
|
|
|
<para>Note that mount points as well as device names and virtual
|
|
paths like /proc are always case-sensitive! The only exception are
|
|
the subdirectories and filenames under /proc/registry, /proc/registry32
|
|
and /proc/registry64. Registry access is always case-insensitive.
|
|
Read on for more information.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-posixdevices"> <title>POSIX devices</title>
|
|
<para>There is no need to create a POSIX <filename>/dev</filename>
|
|
directory as Cygwin automatically simulates it internally.
|
|
These devices cannot be seen with the command <command>ls /dev/</command>
|
|
although commands such as <command>ls /dev/tty</command> work fine.
|
|
If you want to be able to see all well-known devices in
|
|
<filename>/dev/</filename>, you can use Igor Pechtchanski's
|
|
<ulink
|
|
url="http://cygwin.com/ml/cygwin/2004-03/txt00028.txt">create_devices.sh</ulink>
|
|
script. This script does not add the raw disk devices, though. Again,
|
|
it's not necessary to see an existing device in /dev to access it. The script
|
|
is just for the fun of it.
|
|
</para>
|
|
|
|
<para>
|
|
Cygwin supports the following character devices commonly found on POSIX systems:
|
|
</para>
|
|
|
|
<screen>
|
|
/dev/null
|
|
/dev/zero
|
|
/dev/full
|
|
|
|
/dev/console Pseudo device name for the standard console window created
|
|
by Windows. Same as the one used for cmd.exe. Every one
|
|
of them has this name. It's not quite comparable with the
|
|
console device on UNIX machines.
|
|
|
|
/dev/tty The current tty of a session running in a pseudo tty.
|
|
/dev/ptmx Pseudo tty master device.
|
|
/dev/ttym
|
|
|
|
/dev/tty0 Pseudo ttys are numbered from /dev/tty0 upwards as they are
|
|
/dev/tty1 requested.
|
|
...
|
|
|
|
/dev/ttyS0 Serial communication devices. ttyS0 == Win32 COM1,
|
|
/dev/ttyS1 ttyS1 == COM2, etc.
|
|
...
|
|
|
|
/dev/pipe
|
|
/dev/fifo
|
|
|
|
/dev/mem The physical memory of the machine. Note that access to the
|
|
/dev/port physical memory has been restricted with Windows Server 2003.
|
|
/dev/kmem Since this OS, you can't access physical memory from user space.
|
|
|
|
/dev/kmsg Kernel message pipe, for usage with sys logger services.
|
|
|
|
/dev/random Random number generator.
|
|
/dev/urandom
|
|
|
|
/dev/dsp Default sound device of the system.
|
|
</screen>
|
|
|
|
<para>
|
|
Cygwin also has several Windows-specific devices:
|
|
</para>
|
|
|
|
<screen>
|
|
/dev/com1 The serial ports, starting with COM1 which is the same as ttyS0.
|
|
/dev/com2 Please use /dev/ttySx instead.
|
|
...
|
|
|
|
/dev/conin Same as Windows CONIN$.
|
|
/dev/conout Same as Windows CONOUT$.
|
|
/dev/clipboard The Windows clipboard, text only
|
|
/dev/windows The Windows message queue.
|
|
</screen>
|
|
|
|
<para>
|
|
Block devices are accessible by Cygwin processes using fixed POSIX device
|
|
names. These POSIX device names are generated using a direct conversion
|
|
from the POSIX namespace to the internal NT namespace.
|
|
E.g. the first harddisk is the NT internal device \device\harddisk0\partition0
|
|
or the first partition on the third harddisk is \device\harddisk2\partition1.
|
|
The first floppy in the system is \device\floppy0, the first CD-ROM is
|
|
\device\cdrom0 and the first tape drive is \device\tape0.</para>
|
|
|
|
<para>The mapping from physical device to the name of the device in the
|
|
internal NT namespace can be found in various places. For hard disks and
|
|
CD/DVD drives, the Windows "Disk Management" utility (part of the
|
|
"Computer Management" console) shows that the mapping of "Disk 0" is
|
|
\device\harddisk0. "CD-ROM 2" is \device\cdrom2. Another place to find
|
|
this mapping is the "Device Management" console. Disks have a
|
|
"Location" number, tapes have a "Tape Symbolic Name", etc.
|
|
Unfortunately, the places where this information is found is not very
|
|
well-defined.</para>
|
|
|
|
<para>
|
|
For external disks (USB-drives, CF-cards in a cardreader, etc) you can use
|
|
Cygwin to show the mapping. <filename>/proc/partitions</filename>
|
|
contains a list of raw drives known to Cygwin. The <command>df</command>
|
|
command shows a list of drives and their respective sizes. If you match
|
|
the information between <filename>/proc/partitions</filename> and the
|
|
<command>df</command> output, you should be able to figure out which
|
|
external drive corresponds to which raw disk device name.</para>
|
|
|
|
<note><para>Apart from tape devices which are not block devices and are
|
|
by default accessed directly, accessing mass storage devices raw
|
|
is something you should only do if you know what you're doing and know how to
|
|
handle the information. <emphasis role='bold'>Writing</emphasis> to a raw
|
|
mass storage device you should only do if you
|
|
<emphasis role='bold'>really</emphasis> know what you're doing and are aware
|
|
of the fact that any mistake can destroy important information, for the
|
|
device, and for you. So, please, handle this ability with care.
|
|
<emphasis role='bold'>You have been warned.</emphasis></para></note>
|
|
|
|
<para>
|
|
Last but not least, the mapping from POSIX /dev namespace to internal
|
|
NT namespace is as follows:
|
|
</para>
|
|
|
|
<screen>
|
|
POSIX device name Internal NT device name
|
|
|
|
/dev/st0 \device\tape0, rewind
|
|
/dev/nst0 \device\tape0, no-rewind
|
|
/dev/st1 \device\tape1
|
|
/dev/nst1 \device\tape1
|
|
...
|
|
/dev/st15
|
|
/dev/nst15
|
|
|
|
/dev/fd0 \device\floppy0
|
|
/dev/fd1 \device\floppy1
|
|
...
|
|
/dev/fd15
|
|
|
|
/dev/sr0 \device\cdrom0
|
|
/dev/sr1 \device\cdrom1
|
|
...
|
|
/dev/sr15
|
|
|
|
/dev/scd0 \device\cdrom0
|
|
/dev/scd1 \device\cdrom1
|
|
...
|
|
/dev/scd15
|
|
|
|
/dev/sda \device\harddisk0\partition0 (whole disk)
|
|
/dev/sda1 \device\harddisk0\partition1 (first partition)
|
|
...
|
|
/dev/sda15 \device\harddisk0\partition15 (fifteenth partition)
|
|
|
|
/dev/sdb \device\harddisk1\partition0
|
|
/dev/sdb1 \device\harddisk1\partition1
|
|
|
|
[up to]
|
|
|
|
/dev/sddx \device\harddisk127\partition0
|
|
/dev/sddx1 \device\harddisk127\partition1
|
|
...
|
|
/dev/sddx15 \device\harddisk127\partition15
|
|
</screen>
|
|
|
|
<para>
|
|
if you don't like these device names, feel free to create symbolic
|
|
links as they are created on Linux systems for convenience:
|
|
</para>
|
|
|
|
<screen>
|
|
ln -s /dev/sr0 /dev/cdrom
|
|
ln -s /dev/nst0 /dev/tape
|
|
...
|
|
</screen>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-exe"><title>The .exe extension</title>
|
|
|
|
<para>Win32 executable filenames end with <filename>.exe</filename>
|
|
but the <filename>.exe</filename> need not be included in the command,
|
|
so that traditional UNIX names can be used. However, for programs that
|
|
end in <filename>.bat</filename> and <filename>.com</filename>, you
|
|
cannot omit the extension. </para>
|
|
|
|
<para>As a side effect, the <command> ls filename</command> gives
|
|
information about <filename>filename.exe</filename> if
|
|
<filename>filename.exe</filename> exists and <filename>filename</filename>
|
|
does not. In the same situation the function call
|
|
<function>stat("filename",..)</function> gives information about
|
|
<filename>filename.exe</filename>. The two files can be distinguished
|
|
by examining their inodes, as demonstrated below.
|
|
<screen>
|
|
<prompt>bash$</prompt> <userinput>ls * </userinput>
|
|
a a.exe b.exe
|
|
<prompt>bash$</prompt> <userinput>ls -i a a.exe</userinput>
|
|
445885548 a 435996602 a.exe
|
|
<prompt>bash$</prompt> <userinput>ls -i b b.exe</userinput>
|
|
432961010 b 432961010 b.exe
|
|
</screen>
|
|
If a shell script <filename>myprog</filename> and a program
|
|
<filename>myprog.exe</filename> coexist in a directory, the shell
|
|
script has precedence and is selected for execution of
|
|
<command>myprog</command>. Note that this was quite the reverse up to
|
|
Cygwin 1.5.19. It has been changed for consistency with the rest of Cygwin.
|
|
</para>
|
|
|
|
<para>The <command>gcc</command> compiler produces an executable named
|
|
<filename>filename.exe</filename> when asked to produce
|
|
<filename>filename</filename>. This allows many makefiles written
|
|
for UNIX systems to work well under Cygwin.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-proc"><title>The /proc filesystem</title>
|
|
<para>
|
|
Cygwin, like Linux and other similar operating systems, supports the
|
|
<filename>/proc</filename> virtual filesystem. The files in this
|
|
directory are representations of various aspects of your system,
|
|
for example the command <userinput>cat /proc/cpuinfo</userinput>
|
|
displays information such as what model and speed processor you have.
|
|
</para>
|
|
<para>
|
|
One unique aspect of the Cygwin <filename>/proc</filename> filesystem
|
|
is <filename>/proc/registry</filename>, see next section.
|
|
</para>
|
|
<para>
|
|
The Cygwin <filename>/proc</filename> is not as complete as the
|
|
one in Linux, but it provides significant capabilities. The
|
|
<systemitem>procps</systemitem> package contains several utilities
|
|
that use it.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-proc-registry"><title>The /proc/registry filesystem</title>
|
|
<para>
|
|
The <filename>/proc/registry</filename> filesystem provides read-only
|
|
access to the Windows registry. It displays each <literal>KEY</literal>
|
|
as a directory and each <literal>VALUE</literal> as a file. As anytime
|
|
you deal with the Windows registry, use caution since changes may result
|
|
in an unstable or broken system. There are additionally subdirectories called
|
|
<filename>/proc/registry32</filename> and <filename>/proc/registry64</filename>.
|
|
They are identical to <filename>/proc/registry</filename> on 32 bit
|
|
host OSes. On 64 bit host OSes, <filename>/proc/registry32</filename>
|
|
opens the 32 bit processes view on the registry, while
|
|
<filename>/proc/registry64</filename> opens the 64 bit processes view.
|
|
</para>
|
|
<para>
|
|
Reserved characters ('/', '\', ':', and '%') or reserved names
|
|
(<filename>.</filename> and <filename>..</filename>) are converted by
|
|
percent-encoding:
|
|
<screen>
|
|
<prompt>bash$</prompt> <userinput>regtool list -v '\HKEY_LOCAL_MACHINE\SYSTEM\MountedDevices'</userinput>
|
|
...
|
|
\DosDevices\C: (REG_BINARY) = cf a8 97 e8 00 08 fe f7
|
|
...
|
|
<prompt>bash$</prompt> <userinput>cd /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM</userinput>
|
|
<prompt>bash$</prompt> <userinput>ls -l MountedDevices</userinput>
|
|
...
|
|
-r--r----- 1 Admin SYSTEM 12 Dec 10 11:20 %5CDosDevices%5CC%3A
|
|
...
|
|
<prompt>bash$</prompt> <userinput>od -t x1 MountedDevices/%5CDosDevices%5CC%3A</userinput>
|
|
0000000 cf a8 97 e8 00 08 fe f7 01 00 00 00
|
|
</screen>
|
|
The unnamed (default) value of a key can be accessed using the filename
|
|
<filename>@</filename>.
|
|
</para>
|
|
<para>
|
|
If a registry key contains a subkey and a value with the same name
|
|
<filename>foo</filename>, Cygwin displays the subkey as
|
|
<filename>foo</filename> and the value as <filename>foo%val</filename>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="pathnames-at"><title>The @pathnames</title>
|
|
<para>To circumvent the limitations on shell line length in the native
|
|
Windows command shells, Cygwin programs expand their arguments
|
|
starting with "@" in a special way. If a file
|
|
<filename>pathname</filename> exists, the argument
|
|
<filename>@pathname</filename> expands recursively to the content of
|
|
<filename>pathname</filename>. Double quotes can be used inside the
|
|
file to delimit strings containing blank space.
|
|
Embedded double quotes must be repeated.
|
|
In the following example compare the behaviors of the bash built-in
|
|
<command>echo</command> and of the program <command>/bin/echo</command>.</para>
|
|
|
|
<example id="pathnames-at-ex"><title> Using @pathname</title>
|
|
<screen>
|
|
<prompt>bash$</prompt> <userinput>echo 'This is "a long" line' > mylist</userinput>
|
|
<prompt>bash$</prompt> <userinput>echo @mylist</userinput>
|
|
@mylist
|
|
<prompt>bash$</prompt> <userinput>cmd</userinput>
|
|
<prompt>c:\></prompt> <userinput>c:\cygwin\bin\echo @mylist</userinput>
|
|
This is a long line
|
|
</screen>
|
|
</example>
|
|
</sect2>
|
|
</sect1>
|