diff --git a/winsup/doc/gprof.xml b/winsup/doc/gprof.xml
new file mode 100644
index 000000000..b2043cbfd
--- /dev/null
+++ b/winsup/doc/gprof.xml
@@ -0,0 +1,250 @@
+
+
+
+Profiling Cygwin Programs
+
+Introduction
+Profiling is a way to analyze your program to find out where it is
+spending its time. You might need to do this if it seems your program is
+taking more time to do its job than you think it should. It is always
+preferable to profile your program than to just guess where the time is
+being spent; even expert programmers are known to guess badly at this.
+
+
+In Cygwin, you enable profiling with a compiler flag and you display
+the resulting profiling data with gprof. Read on to find out how.
+
+
+To enable profiling of your program, first compile it with an
+additional gcc flag: -pg. That flag should be used
+when compiling every source file of the program. If your program has a
+Makefile, you would add the flag to all gcc compilation commands or to the
+CFLAGS= setting. A manual compilation that enables profiling looks like this:
+
+
+
+bash$ gcc -pg -g -o myapp myapp.c
+
+
+The -pg flag causes gcc to do two additional
+things as it compiles your program. First, a small bit of code is added to
+the beginning of each function that records its address and the address it
+was called from at run time. gprof uses this data to generate a call graph.
+Second, gcc arranges to have a special "front end" added to the beginning
+of your program. The front end starts a recurring timer and every time the
+timer fires, 100 times per second, the currently executing address is saved.
+gprof uses this data to generate a "flat profile" showing where your
+program is spending its time.
+
+
+After compiling your program (and linking it, if you do that as a
+separate step), you are ready to profile it. Just run it as you normally
+would. If there are specific code paths you want to profile, take the
+actions that would exercise those code paths. When your program exits,
+you will have an additional file in the current directory: gmon.out.
+That file contains the profiling data gprof processes and displays.
+
+
+gprof has many flags to control its operation. The
+gprof man page details everything gprof can do. We
+will only use a few of gprof's flags here. You launch gprof as follows:
+
+bash$ gprof [flags] appname [datafile]...
+
+If you don't specify any flags, gprof operates as if you gave it flags
+-p -q which means: generate a flat profile with
+descriptive text and generate a call graph with more descriptive text. In
+the examples below we will give specific flags to gprof to demonstrate
+specific displays. We'll also use flag -b which
+means: be brief, i.e. don't display the descriptive text. You can also
+specify a trailing list of one or more profiling data files. If you don't,
+gprof assumes gmon.out is the only file to process and display.
+
+
+
+Examples
+Flat profile
+
+bash$ gprof -b -p myapp
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+ % cumulative self self total
+ time seconds seconds calls s/call s/call name
+ 25.11 13.34 13.34 1 13.34 13.34 func0
+ 25.00 26.62 13.28 1 13.28 13.28 func1
+ 25.00 39.90 13.28 1 13.28 13.28 func3
+ 24.89 53.12 13.22 1 13.22 13.22 func2
+
+
+Call graph
+
+bash$ gprof -b -q myapp
+ Call graph
+
+
+granularity: each sample hit covers 4 byte(s) for 0.02% of 53.12 seconds
+
+index % time self children called name
+ <spontaneous>
+[1] 100.0 0.00 53.12 main [1]
+ 13.34 0.00 1/1 func0 [2]
+ 13.28 0.00 1/1 func1 [3]
+ 13.28 0.00 1/1 func3 [4]
+ 13.22 0.00 1/1 func2 [5]
+-----------------------------------------------
+ 13.34 0.00 1/1 main [1]
+[2] 25.1 13.34 0.00 1 func0 [2]
+-----------------------------------------------
+ 13.28 0.00 1/1 main [1]
+[3] 25.0 13.28 0.00 1 func1 [3]
+-----------------------------------------------
+ 13.28 0.00 1/1 main [1]
+[4] 25.0 13.28 0.00 1 func3 [4]
+-----------------------------------------------
+ 13.22 0.00 1/1 main [1]
+[5] 24.9 13.22 0.00 1 func2 [5]
+-----------------------------------------------
+
+
+Index by function name
+
+ [2] func0 [5] func2
+ [3] func1 [4] func3
+
+
+Source line profile
+
+bash$ gprof -b -l myapp
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+ % cumulative self self total
+ time seconds seconds calls s/call s/call name
+ 25.11 13.34 13.34 1 13.34 13.34 func0 (myapp.c:9 @ 1004010e0)
+ 25.00 26.62 13.28 1 13.28 13.28 func1 (myapp.c:10 @ 10040111a)
+ 25.00 39.90 13.28 1 13.28 13.28 func3 (myapp.c:12 @ 10040118e)
+ 24.89 53.12 13.22 1 13.22 13.22 func2 (myapp.c:11 @ 100401154)
+
+
+ Call graph
+
+
+granularity: each sample hit covers 4 byte(s) for 0.02% of 53.12 seconds
+
+index % time self children called name
+ 13.34 0.00 1/1 main (myapp.c:26 @ 10040123c) [1]
+[2] 25.1 13.34 0.00 1 func0 (myapp.c:9 @ 1004010e0) [2]
+-----------------------------------------------
+ 13.28 0.00 1/1 main (myapp.c:26 @ 10040123c) [1]
+[3] 25.0 13.28 0.00 1 func1 (myapp.c:10 @ 10040111a) [3]
+-----------------------------------------------
+ 13.28 0.00 1/1 main (myapp.c:28 @ 100401246) [5]
+[4] 25.0 13.28 0.00 1 func3 (myapp.c:12 @ 10040118e) [4]
+-----------------------------------------------
+ 13.22 0.00 1/1 main (myapp.c:27 @ 100401241) [7]
+[6] 24.9 13.22 0.00 1 func2 (myapp.c:11 @ 100401154) [6]
+-----------------------------------------------
+
+
+Index by function name
+
+ [2] func0 (myapp.c:9 @ 1004010e0) [6] func2 (myapp.c:11 @ 100401154)
+ [3] func1 (myapp.c:10 @ 10040111a) [4] func3 (myapp.c:12 @ 10040118e)
+
+
+
+Special situations
+Profiling multi-threaded programs
+Multi-threaded programs are profiled just like single-threaded programs.
+There is no mechanism to turn profiling on or off for specific threads.
+gprof combines the data for all threads when generating its displays.
+
+
+
+Profiling programs that fork
+Programs that fork, i.e., use the fork() system call with or without
+using exec() afterwards, require special care. Since there is only one
+gmon.out file, profiling data from the parent process might get overwritten
+by the child process, or vice-versa, after a fork(). You can avoid this by
+setting the environment variable GMON_OUT_PREFIX before running your
+program. If the variable is non-empty, its contents will be used as a
+prefix to name the profiling data files. Here's an example:
+
+
+
+
+bash$ export GMON_OUT_PREFIX=myapp.out
+bash$ ./myapp -fork
+bash$ ls myapp.out*
+myapp.out.2728 myapp.out.3224
+
+bash$ gprof -bp myapp myapp.out.2728
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+ % cumulative self self total
+ time seconds seconds calls s/call s/call name
+ 50.25 30.28 30.28 2 15.14 15.14 func3
+ 24.99 45.34 15.06 1 15.06 15.06 func1
+ 24.76 60.26 14.92 1 14.92 14.92 func2
+
+bash$ gprof -bp myapp myapp.out.3224
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+ % cumulative self self total
+ time seconds seconds calls s/call s/call name
+ 49.25 29.36 29.36 2 14.68 14.68 func3
+ 25.43 44.52 15.16 1 15.16 15.16 func1
+ 25.33 59.62 15.10 1 15.10 15.10 func2
+
+bash$ gprof -bp myapp myapp.out*
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+ % cumulative self self total
+ time seconds seconds calls s/call s/call name
+ 49.75 59.64 59.64 4 14.91 14.91 func3
+ 25.21 89.86 30.22 2 15.11 15.11 func1
+ 25.04 119.88 30.02 2 15.01 15.01 func2
+
+As the last gprof command above shows, gprof can combine the data
+from a selection of profiling data files to generate its displays. Just
+list the names of those files at the end of the gprof command; you can use
+a wildcard here. NOTE: If you update your program, remember to remove stale
+profiling data files before profiling your program again. If you aren't
+careful about this, gprof could combine data from your most recent version
+with stale data from prior versions, possibly giving misleading displays.
+
+
+
+Getting better profiling resolution
+To get better resolution (i.e., more data points) when profiling
+your program, try running it multiple times with the environment variable
+GMON_OUT_PREFIX set, as described in the previous situation. There will be
+multiple profiling data files generated and you can have gprof combine
+the data from all of them into one display.
+
+
+
+Profiling programs with their libraries
+At the time of this writing Cygwin's profiling support only allows
+for one range of addresses per program. It is hard-wired to be the range
+covering the .text segment of your program, which is where your code resides.
+If you build your program with static libraries (e.g., libfoo.a), the code
+from those libraries is linked into your program's .text segment so will be
+included when profiling. But dynamic libraries (e.g., libfoo.dll) reside in
+other address ranges and code within them won't be included.
+
+
+
+Profiling Cygwin itself
+Due to the issue mentioned in the previous situation and other issues,
+at the time of this writing there is no support for profiling Cygwin itself.
+
+
+
+
+
diff --git a/winsup/doc/programming.xml b/winsup/doc/programming.xml
index 4b65c4090..05ca12e90 100644
--- a/winsup/doc/programming.xml
+++ b/winsup/doc/programming.xml
@@ -9,4 +9,5 @@
+
diff --git a/winsup/doc/setup-env.xml b/winsup/doc/setup-env.xml
index 4b519a2a1..ca5acc9a6 100644
--- a/winsup/doc/setup-env.xml
+++ b/winsup/doc/setup-env.xml
@@ -71,9 +71,16 @@ first starts. Most Cygwin applications do not make use of the
dlopen () call and do not need this variable.
+The GMON_OUT_PREFIX environment variable is helpful
+in some situations when profiling Cygwin programs. For more information,
+see . Only software developers with a
+need to profile their programs have a use for this variable.
+
+
In addition to PATH, HOME,
-and LD_LIBRARY_PATH, there are three other environment
+LD_LIBRARY_PATH, and GMON_OUT_PREFIX,
+there are three other environment
variables which, if they exist in the Windows environment, are
converted to UNIX format: TMPDIR, TMP,
and TEMP. The first is not set by default in the