newlib-cygwin/winsup/cygwin/DevNotes

2012-06-02  cgf-000010

<1.7.16>
- Fix emacs problem which exposed an issue with Cygwin's select() function.
  If a signal arrives while select is blocking and the program longjmps
  out of the signal handler then threads and memory may be left hanging.
  Fixes: http://cygwin.com/ml/cygwin/2012-05/threads.html#00275
</1.7.16>

This was try #4 or #5 to get select() signal handling working right.
It's still not there but it should now at least not leak memory or
threads.

I mucked with the interface between cygwin_select and select_stuff::wait
so that the "new" loop in select_stuff::wait() was essentially moved
into the caller.  cygwin_select now uses various enum states to decide
what to do.  It builds the select linked list at the beginning of the
loop, allowing wait() to tear everything down and restart.  This is
necessary before calling a signal handler because the signal handler may
longjmp away.

I initially had this all coded up to use a special signal_cleanup
callback which could be called when a longjmp is called in a signal
handler.  And cygwin_select() set up and tore down this callback.  Once
I got everything compiling it, of course, dawned on me that just because
you call a longjmp in a signal handler it doesn't mean that you are
jumping *out* of the signal handler.  So, if the signal handler invokes
the callback and returns it will be very bad for select().  Hence, this
slower, but hopefully more correct implementation.

(I still wonder if some sort of signal cleanup callback might still
be useful in the future)

TODO: I need to do an audit of other places where this problem could be
occurring.

As alluded to above, select's signal handling is still not right.  It
still acts as if it could call a signal handler from something other
than the main thread but, AFAICT, from my STC, this doesn't seem to be
the case.  It might be worthwhile to extend cygwait to just magically
figure this out and not even bother using w4[0] for scenarios like this.

2012-05-16  cgf-000009

<1.7.16>
- Fix broken console mouse handling.  Reported here:
  http://cygwin.com/ml/cygwin/2012-05/msg00360.html
</1.7.16>

I did a cvs annotate on smallprint.cc and see that the code to translate
%characters > 127 to 0x notation was in the 1.1 revision.  Then I
checked the smallprint.c predecessor.  It was in the 1.1 version of that
program too, which means that this odd change has probably been around
since <= 2000.

Since __small_sprintf is supposed to emulate sprintf, I got rid of the
special case handling.  This may affect fhandler_socket::bind.  If so, we
should work around this problem there rather than keeping this strange
hack in __small_printf.

2012-05-14  cgf-000008

<1.7.16>
- Fix hang when zero bytes are written to a pty using
  Windows WriteFile or equivalent.  Fixes:
  http://cygwin.com/ml/cygwin/2012-05/msg00323.html
</1.7.16>

cgf-000002, as usual, fixed one thing while breaking another.  See
Larry's predicament in: http://goo.gl/oGEr2 .

The problem is that zero byte writes to the pty pipe caused the dread
end-of-the-world-as-we-know-it problem reported on the mailing list
where ReadFile reads zero bytes even though there is still more to read
on the pipe.  This is because that change caused a 'record' to be read
and a record can be zero bytes.

I was never really keen about using a throwaway buffer just to get a
count of the number of characters available to be read in the pty pipe.
On closer reading of the documentation for PeekNamedPipe it seemed like
the sixth argument to PeekNamedPipe should return what I needed without
using a buffer.  And, amazingly, it did, except that the problem still
remained - a zero byte message still screwed things up.

So, we now detect the case where there is zero bytes available as a
message but there are bytes available in the pipe.  In that scenario,
return the bytes available in the pipe rather than the message length of
zero.  This could conceivably cause problems with pty pipe handling in
this scenario but since the only way this scenario could possibly happen
is when someone is writing zero bytes using WriteFile to a pty pipe, I'm
ok with that.

2012-05-14  cgf-000007

<1.7.16>
- Fix invocation of strace from a cygwin process.  Fixes:
  http://cygwin.com/ml/cygwin/2012-05/msg00292.html
</1.7.16>

The change in cgf-000004 introduced a problem for processes which load
cygwin1.dll dynamically.  strace.exe is the most prominent example of
this.

Since the parent handle is now closed for "non-Cygwin" processes, when
strace.exe tried to dynamically load cygwin1.dll, the handle was invalid
and child_info_spawn::handle_spawn couldn't use retrieve information
from the parent.  This eventually led to a strace_printf error due to an
attempt to dereference an unavailable cygheap.  Probably have to fix
this someday.  You shouldn't use the cygheap while attempting to print
an error about the inavailability of said cygheap.

This was fixed by saving the parent pid in child_info_spawn and calling
OpenProcess for the parent pid and using that handle iff a process is
dynamically loaded.

2012-05-12  cgf-000006

<1.7.16>
- Fix hang when calling pthread_testcancel in a canceled thread.
  Fixes some of: http://cygwin.com/ml/cygwin/2012-05/msg00186.html
</1.7.16>

This should fix the first part of the reported problem in the above
message.  The cancel seemed to actually be working but, the fprintf
eventually ended up calling pthread_testcancel.  Since we'd gotten here
via a cancel, it tried to recursively call the cancel handler causing a
recursive loop.

2012-05-12  cgf-000005

<1.7.16>
- Fix pipe creation problem which manifested as a problem creating a
fifo.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00253.html
</1.7.16>

My change on 2012-04-28 introduced a problem with fifos.  The passed
in name was overwritten.  This was because I wasn't properly keeping
track of the length of the generated pipe name when there was a
name passed in to fhandler_pipe::create.

There was also another problem in fhandler_pipe::create.  Since fifos
use PIPE_ACCESS_DUPLEX and PIPE_ACCESS_DUPLEX is an or'ing of
PIPE_ACCESS_INBOUND and PIPE_ACCESS_OUTBOUND, using PIPE_ACCESS_OUTBOUND
as a "never-used" option for PIPE_ADD_PID in fhandler.h was wrong.  So,
fifo creation attempted to add the pid of a pipe to the name which is
wrong for fifos.

2012-05-08  cgf-000004

The change for cgf-000003 introduced a new problem:
http://cygwin.com/ml/cygwin/2012-05/msg00154.html
http://cygwin.com/ml/cygwin/2012-05/msg00157.html

Since a handle associated with the parent is no longer being duplicated
into a non-cygwin "execed child", Windows is free to reuse the pid of
the parent when the parent exits.  However, since we *did* duplicate a
handle pointing to the pid's shared memory area into the "execed child",
the shared memory for the pid was still active.

Since the shared memory was still available, if a new process reuses the
previous pid, Cygwin would detect that the shared memory was not created
and had a "PID_REAPED" flag.  That was considered an error, and, so, it
would set procinfo to NULL and pinfo::thisproc would die since this
situation is not supposed to occur.

I fixed this in two ways:

1) If a shared memory region has a PID_REAPED flag then zero it and
reuse it.  This should be safe since you are not really supposed to be
querying the shared memory region for anything after PID_REAPED has been
set.

2) Forego duping a copy of myself_pinfo if we're starting a non-cygwin
child for exec.

It seems like 2) is a common theme and an audit of all of the handles
that are being passed to non-cygwin children is in order for 1.7.16.

The other minor modification that was made in this change was to add the
pid of the failing process to fork error output.  This helps slightly
when looking at strace output, even though in this case it was easy to
find what was failing by looking for '^---' when running the "stv"
strace dumper.  That found the offending exception quickly.

2012-05-07  cgf-000003

<1.7.15>
Don't make Cygwin wait for all children of a non-cygwin child program.
Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00063.html,
       http://cygwin.com/ml/cygwin/2012-05/msg00075.html
</1.7.15>

This problem is due to a recent change which added some robustness and
speed to Cygwin's exec/spawn handling by not trying to force inheritance
every time a process is started.  See ChangeLog entries starting on
2012-03-20, and multiple on 2012-03-21.

Making the handle inheritable meant that, as usual, there were problems
with non-Cygwin processes.  When Cygwin "execs" a non-Cygwin process N,
all of its N + 1, N + 2, ...  children will also inherit the handle.
That means that Cygwin will wait until all subprocesses have exited
before it returns.

I was willing to make this a restriction of starting non-Cygwin
processes but the problem with allowing that is that it can cause the
creation of a "limbo" pid when N exits and N + 1 and friends are still
around.  In this scenario, Cygwin dutifully notices that process N has
died and sets the exit code to indicate that but N's parent will wait on
rd_proc_pipe and will only return when every N + ...  windows process
has exited.

The removal of cygheap::pid_handle was not related to the initial
problem that I set out to fix.  The change came from the realization
that we were duping the current process handle into the child twice and
only needed to do it once.  The current process handle is used by exec
to keep the Windows pid "alive" so that it will not be reused.  So, now
we just close parent in child_info_spawn::handle_spawn iff we're not
execing.

In debugging this it bothered me that 'ps' identified a nonactive pid as
active.  Part of the reason for this was the 'parent' handle in
child_info was opened in non-Cygwin processes, keeping the pid alive.
That has been kluged around (more changes after 1.7.15) but that didn't
fix the problem.  On further investigation, this seems to be caused by
the fact that the shared memory region pid handles were still being
passed to non-cygwin children, keeping the pid alive in a limbo-like
fashion.  This was easily fixed by having pinfo::init() consider a
memory region with PID_REAPED as not available.  A more robust fix
should be considered for 1.7.15+ where these handles are not passed
to non-cygwin processes.

This fixed the problem where a pid showed up in the list after a user
does something like: "bash$ cmd /c start notepad" but, for some reason,
it does not fix the problem where "bash$ setsid cmd /c start notepad".
That bears investigation after 1.7.15 is released but it is not a
regression and so is not a blocker for the release.

2012-05-03  cgf-000002

<1.7.15>
Fix problem where too much input was attempted to be read from a
pty slave.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00049.html
</1.7.15>

My change on 2012/04/05 reintroduced the problem first described by:
http://cygwin.com/ml/cygwin/2011-10/threads.html#00445

The problem then was, IIRC, due to the fact that bytes sent to the pty
pipe were not written as records.  Changing pipe to PIPE_TYPE_MESSAGE in
pipe.cc fixed the problem since writing lines to one side of the pipe
caused exactly that the number of characters to be read on the other
even if there were more characters in the pipe.

To debug this, I first replaced fhandler_tty.cc with the 1.258,
2012/04/05 version.  The test case started working when I did that.

So, then, I replaced individual functions, one at a time, in
fhandler_tty.cc with their previous versions.  I'd expected this to be a
problem with fhandler_pty_master::process_slave_output since that had
seen the most changes but was surprised to see that the culprit was
fhandler_pty_slave::read().

The reason was that I really needed the bytes_available() function to
return the number of bytes which would be read in the next operation
rather than the number of bytes available in the pipe.  That's because
there may be a number of lines available to be read but the number of
bytes which will be read by ReadFile should reflect the mode of the pty
and, if there is a line to read, only the number of bytes in the line
should be seen as available for the next read.

Having bytes_available() return the number of bytes which would be read
seemed to fix the problem but it could subtly change the behavior of
other callers of this function.  However, I actually think this is
probably a good thing since they probably should have been seeing the
line behavior.

2012-05-02  cgf-000001

<1.7.15>
Fix problem setting parent pid to 1 when process with children execs
itself.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00009.html
</1.7.15>

Investigating this problem with strace showed that ssh-agent was
checking the parent pid and getting a 1 when it shouldn't have.  Other
stuff looked ok so I chose to consider this a smoking gun.

Going back to the version that the OP said did not have the problem, I
worked forward until I found where the problem first occurred -
somewhere around 2012-03-19.  And, indeed, the getppid call returned the
correct value in the working version.  That means that this stopped
working when I redid the way the process pipe was inherited around
this time period.

It isn't clear why (and I suspect I may have to debug this further at
some point) this hasn't always been a problem but I made the obvious fix.
We shouldn't have been setting ppid = 1 when we're about to pass off to
an execed process.

As I was writing this, I realized that it was necessary to add some
additional checks.  Just checking for "have_execed" isn't enough.  If
we've execed a non-cygwin process then it won't know how to deal with
any inherited children.  So, always set ppid = 1 if we've execed a
non-cygwin process.
* DevNotes: Add entry cgf-000010. * select.cc (set_handle_or_return_if_not_open): Remove unneeded final backslash from definition. (cygwin_select): Reorganize to incorporate outer retry loop. Move remaining time recalculation here for retry case. Use select_stuff::wait_states for loop control. (select_stuff::cleanup): Avoid unneeded initialization. (select_stuff::wait): Modify definition to return select_stuff::wait_states. Eliminate is_cancelable. Don't element 1 of an array if it is a cancel handle. Remove loop. Rely on being called from enclosing loop in cygwin_select. Remove time recalculation when restarting. Try harder to always return from the bottom. * select.h (select_stuff::wait_state): New enum. (select_stuff::wait): Modify declaration to return select_stuff::wait_states. 2012-06-03 10:59:20 +08:00			`2012-06-02 cgf-000010`

			`<1.7.16>`
			`- Fix emacs problem which exposed an issue with Cygwin's select() function.`
			`If a signal arrives while select is blocking and the program longjmps`
			`out of the signal handler then threads and memory may be left hanging.`
			`Fixes: http://cygwin.com/ml/cygwin/2012-05/threads.html#00275`
			`</1.7.16>`

			`This was try #4 or #5 to get select() signal handling working right.`
			`It's still not there but it should now at least not leak memory or`
			`threads.`

			`I mucked with the interface between cygwin_select and select_stuff::wait`
			`so that the "new" loop in select_stuff::wait() was essentially moved`
			`into the caller. cygwin_select now uses various enum states to decide`
			`what to do. It builds the select linked list at the beginning of the`
			`loop, allowing wait() to tear everything down and restart. This is`
			`necessary before calling a signal handler because the signal handler may`
			`longjmp away.`

			`I initially had this all coded up to use a special signal_cleanup`
			`callback which could be called when a longjmp is called in a signal`
			`handler. And cygwin_select() set up and tore down this callback. Once`
			`I got everything compiling it, of course, dawned on me that just because`
			`you call a longjmp in a signal handler it doesn't mean that you are`
			`jumping out of the signal handler. So, if the signal handler invokes`
			`the callback and returns it will be very bad for select(). Hence, this`
			`slower, but hopefully more correct implementation.`

			`(I still wonder if some sort of signal cleanup callback might still`
			`be useful in the future)`

			`TODO: I need to do an audit of other places where this problem could be`
			`occurring.`

			`As alluded to above, select's signal handling is still not right. It`
			`still acts as if it could call a signal handler from something other`
			`than the main thread but, AFAICT, from my STC, this doesn't seem to be`
			`the case. It might be worthwhile to extend cygwait to just magically`
			`figure this out and not even bother using w4[0] for scenarios like this.`

* DevNotes: Add entry cgf-000009. * smallprint.cc (__small_vsprintf): Always treat '%c' and '%C' as characters. Don't decode them if they are > 127. (__small_vswprintf): Ditto. 2012-05-17 10:18:41 +08:00			`2012-05-16 cgf-000009`

			`<1.7.16>`
			`- Fix broken console mouse handling. Reported here:`
			`http://cygwin.com/ml/cygwin/2012-05/msg00360.html`
			`</1.7.16>`

			`I did a cvs annotate on smallprint.cc and see that the code to translate`
			`%characters > 127 to 0x notation was in the 1.1 revision. Then I`
			`checked the smallprint.c predecessor. It was in the 1.1 version of that`
			`program too, which means that this odd change has probably been around`
			`since <= 2000.`

			`Since __small_sprintf is supposed to emulate sprintf, I got rid of the`
			`special case handling. This may affect fhandler_socket::bind. If so, we`
			`should work around this problem there rather than keeping this strange`
			`hack in __small_printf.`

* DevNotes: Add entry cgf-000008. * fhandler_tty.cc (bytes_available): Simplify by returning the number of bytes available in the message unless that is zero. 2012-05-16 09:56:41 +08:00			`2012-05-14 cgf-000008`

			`<1.7.16>`
			`- Fix hang when zero bytes are written to a pty using`
			`Windows WriteFile or equivalent. Fixes:`
			`http://cygwin.com/ml/cygwin/2012-05/msg00323.html`
			`</1.7.16>`

			`cgf-000002, as usual, fixed one thing while breaking another. See`
			`Larry's predicament in: http://goo.gl/oGEr2 .`

			`The problem is that zero byte writes to the pty pipe caused the dread`
			`end-of-the-world-as-we-know-it problem reported on the mailing list`
			`where ReadFile reads zero bytes even though there is still more to read`
			`on the pipe. This is because that change caused a 'record' to be read`
			`and a record can be zero bytes.`

			`I was never really keen about using a throwaway buffer just to get a`
			`count of the number of characters available to be read in the pty pipe.`
			`On closer reading of the documentation for PeekNamedPipe it seemed like`
			`the sixth argument to PeekNamedPipe should return what I needed without`
			`using a buffer. And, amazingly, it did, except that the problem still`
			`remained - a zero byte message still screwed things up.`

			`So, we now detect the case where there is zero bytes available as a`
			`message but there are bytes available in the pipe. In that scenario,`
			`return the bytes available in the pipe rather than the message length of`
			`zero. This could conceivably cause problems with pty pipe handling in`
			`this scenario but since the only way this scenario could possibly happen`
			`is when someone is writing zero bytes using WriteFile to a pty pipe, I'm`
			`ok with that.`

* DevNotes: Add entry cgf-000007. * child_info.h (child_info_spawn::parent_winpid): Declare new field. (child_info_spawn::get_parent_handle): Declare new function. * dcrt0.cc (child_info_spawn::get_parent_handle): Define new function. (child_info_spawn::handle_spawn): Recreate parent handle if possible when dynamically loaded. Don't mess with parent handle if it's NULL. * spawn.cc (child_info_spawn::worker): Set parent_winpid appropriately. 2012-05-15 06:42:56 +08:00			`2012-05-14 cgf-000007`

			`<1.7.16>`
			`- Fix invocation of strace from a cygwin process. Fixes:`
			`http://cygwin.com/ml/cygwin/2012-05/msg00292.html`
			`</1.7.16>`

			`The change in cgf-000004 introduced a problem for processes which load`
			`cygwin1.dll dynamically. strace.exe is the most prominent example of`
			`this.`

			`Since the parent handle is now closed for "non-Cygwin" processes, when`
			`strace.exe tried to dynamically load cygwin1.dll, the handle was invalid`
			`and child_info_spawn::handle_spawn couldn't use retrieve information`
			`from the parent. This eventually led to a strace_printf error due to an`
			`attempt to dereference an unavailable cygheap. Probably have to fix`
			`this someday. You shouldn't use the cygheap while attempting to print`
			`an error about the inavailability of said cygheap.`

			`This was fixed by saving the parent pid in child_info_spawn and calling`
			`OpenProcess for the parent pid and using that handle iff a process is`
			`dynamically loaded.`

* DevNotes: Add entry cgf-000006. * thread.cc (pthread::pop_cleanup_handler): Set cancel state to disabled to avoid recursively waiting for cancel. 2012-05-13 04:26:43 +08:00			`2012-05-12 cgf-000006`

			`<1.7.16>`
			`- Fix hang when calling pthread_testcancel in a canceled thread.`
			`Fixes some of: http://cygwin.com/ml/cygwin/2012-05/msg00186.html`
			`</1.7.16>`

			`This should fix the first part of the reported problem in the above`
			`message. The cancel seemed to actually be working but, the fprintf`
			`eventually ended up calling pthread_testcancel. Since we'd gotten here`
			`via a cancel, it tried to recursively call the cancel handler causing a`
			`recursive loop.`

* DevNotes: Add entry cgf-000005. * fhandler.h (PIPE_ADD_PID): Redefine to something we actually DON'T use. * pipe.cc (fhandler_pipe::create): Avoid clearing all open_mode bits when checking for PIPE_ADD_PID. Properly keep track of len so that passed in name is not overwritten. 2012-05-13 03:17:17 +08:00			`2012-05-12 cgf-000005`

			`<1.7.16>`
			`- Fix pipe creation problem which manifested as a problem creating a`
			`fifo. Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00253.html`
			`</1.7.16>`

			`My change on 2012-04-28 introduced a problem with fifos. The passed`
			`in name was overwritten. This was because I wasn't properly keeping`
			`track of the length of the generated pipe name when there was a`
			`name passed in to fhandler_pipe::create.`

			`There was also another problem in fhandler_pipe::create. Since fifos`
			`use PIPE_ACCESS_DUPLEX and PIPE_ACCESS_DUPLEX is an or'ing of`
			`PIPE_ACCESS_INBOUND and PIPE_ACCESS_OUTBOUND, using PIPE_ACCESS_OUTBOUND`
			`as a "never-used" option for PIPE_ADD_PID in fhandler.h was wrong. So,`
			`fifo creation attempted to add the pid of a pipe to the name which is`
			`wrong for fifos.`

* DevNotes: Add entry cgf-000004. * pinfo.cc (pinfo::init): Reuse shared memory if the state is marked with PID_REAPED. * spawn.cc (child_info_spawn::worker): Don't duplicate myself_pinfo into non-cygwin child. * fork.cc (frok::parent): Improve error output. 2012-05-08 23:06:43 +08:00			`2012-05-08 cgf-000004`

Add url for reported mailing list problem 2012-05-09 06:38:42 +08:00			`The change for cgf-000003 introduced a new problem:`
			`http://cygwin.com/ml/cygwin/2012-05/msg00154.html`
			`http://cygwin.com/ml/cygwin/2012-05/msg00157.html`
* DevNotes: Add entry cgf-000004. * pinfo.cc (pinfo::init): Reuse shared memory if the state is marked with PID_REAPED. * spawn.cc (child_info_spawn::worker): Don't duplicate myself_pinfo into non-cygwin child. * fork.cc (frok::parent): Improve error output. 2012-05-08 23:06:43 +08:00
			`Since a handle associated with the parent is no longer being duplicated`
			`into a non-cygwin "execed child", Windows is free to reuse the pid of`
			`the parent when the parent exits. However, since we did duplicate a`
			`handle pointing to the pid's shared memory area into the "execed child",`
			`the shared memory for the pid was still active.`

			`Since the shared memory was still available, if a new process reuses the`
			`previous pid, Cygwin would detect that the shared memory was not created`
			`and had a "PID_REAPED" flag. That was considered an error, and, so, it`
			`would set procinfo to NULL and pinfo::thisproc would die since this`
			`situation is not supposed to occur.`

			`I fixed this in two ways:`

			`1) If a shared memory region has a PID_REAPED flag then zero it and`
			`reuse it. This should be safe since you are not really supposed to be`
			`querying the shared memory region for anything after PID_REAPED has been`
			`set.`

			`2) Forego duping a copy of myself_pinfo if we're starting a non-cygwin`
			`child for exec.`

			`It seems like 2) is a common theme and an audit of all of the handles`
			`that are being passed to non-cygwin children is in order for 1.7.16.`

			`The other minor modification that was made in this change was to add the`
			`pid of the failing process to fork error output. This helps slightly`
			`when looking at strace output, even though in this case it was easy to`
			`find what was failing by looking for '^---' when running the "stv"`
			`strace dumper. That found the offending exception quickly.`

			`2012-05-07 cgf-000003`
* DevNotes: Add entry cgf-000003. * cygheap.h (init_cygheap::pid_handle): Delete. * dcrt0.cc (child_info_spawn::handle_spawn): Keep parent open if we have execed. * pinfo.cc (pinfo::thisproc): Remove pid_handle manipulations. (pinfo::init): Don't consider a reaped process to be available. * spawn.cc (child_info_spawn::worker): Remove pid_handle manipulations. Make wr_proc_pipe and parent noninheritable when starting a program which doesn't use the Cygwin DLL. Conditionally reset wr_proc_pipe to inheritable if CreateProcess fails. Inject wr_proc_pipe handle into non-Cygwin process. Consider a non-cygwin process to be 'synced'. 2012-05-07 23:05:56 +08:00
			`<1.7.15>`
			`Don't make Cygwin wait for all children of a non-cygwin child program.`
			`Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00063.html,`
			`http://cygwin.com/ml/cygwin/2012-05/msg00075.html`
			`</1.7.15>`

			`This problem is due to a recent change which added some robustness and`
			`speed to Cygwin's exec/spawn handling by not trying to force inheritance`
			`every time a process is started. See ChangeLog entries starting on`
			`2012-03-20, and multiple on 2012-03-21.`

			`Making the handle inheritable meant that, as usual, there were problems`
			`with non-Cygwin processes. When Cygwin "execs" a non-Cygwin process N,`
			`all of its N + 1, N + 2, ... children will also inherit the handle.`
			`That means that Cygwin will wait until all subprocesses have exited`
			`before it returns.`

			`I was willing to make this a restriction of starting non-Cygwin`
			`processes but the problem with allowing that is that it can cause the`
			`creation of a "limbo" pid when N exits and N + 1 and friends are still`
			`around. In this scenario, Cygwin dutifully notices that process N has`
			`died and sets the exit code to indicate that but N's parent will wait on`
			`rd_proc_pipe and will only return when every N + ... windows process`
			`has exited.`

			`The removal of cygheap::pid_handle was not related to the initial`
			`problem that I set out to fix. The change came from the realization`
			`that we were duping the current process handle into the child twice and`
			`only needed to do it once. The current process handle is used by exec`
			`to keep the Windows pid "alive" so that it will not be reused. So, now`
			`we just close parent in child_info_spawn::handle_spawn iff we're not`
			`execing.`

			`In debugging this it bothered me that 'ps' identified a nonactive pid as`
			`active. Part of the reason for this was the 'parent' handle in`
			`child_info was opened in non-Cygwin processes, keeping the pid alive.`
			`That has been kluged around (more changes after 1.7.15) but that didn't`
			`fix the problem. On further investigation, this seems to be caused by`
			`the fact that the shared memory region pid handles were still being`
			`passed to non-cygwin children, keeping the pid alive in a limbo-like`
			`fashion. This was easily fixed by having pinfo::init() consider a`
Add additional observation to cgf-000003 2012-05-07 23:28:40 +08:00			`memory region with PID_REAPED as not available. A more robust fix`
			`should be considered for 1.7.15+ where these handles are not passed`
			`to non-cygwin processes.`
* DevNotes: Add entry cgf-000003. * cygheap.h (init_cygheap::pid_handle): Delete. * dcrt0.cc (child_info_spawn::handle_spawn): Keep parent open if we have execed. * pinfo.cc (pinfo::thisproc): Remove pid_handle manipulations. (pinfo::init): Don't consider a reaped process to be available. * spawn.cc (child_info_spawn::worker): Remove pid_handle manipulations. Make wr_proc_pipe and parent noninheritable when starting a program which doesn't use the Cygwin DLL. Conditionally reset wr_proc_pipe to inheritable if CreateProcess fails. Inject wr_proc_pipe handle into non-Cygwin process. Consider a non-cygwin process to be 'synced'. 2012-05-07 23:05:56 +08:00
			`This fixed the problem where a pid showed up in the list after a user`
			`does something like: "bash$ cmd /c start notepad" but, for some reason,`
			`it does not fix the problem where "bash$ setsid cmd /c start notepad".`
			`That bears investigation after 1.7.15 is released but it is not a`
			`regression and so is not a blocker for the release.`

* DevNotes: Add entry cgf-000002. * fhandler_tty.cc (bytes_available): Revert to previous Oct-2011 behavior where a dummy buffer is used to determine how many bytes will be read. (fhandler_pty_master::ioctl): Correct coercion in assignment. 2012-05-04 11:00:43 +08:00			`2012-05-03 cgf-000002`

			`<1.7.15>`
			`Fix problem where too much input was attempted to be read from a`
			`pty slave. Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00049.html`
			`</1.7.15>`

			`My change on 2012/04/05 reintroduced the problem first described by:`
			`http://cygwin.com/ml/cygwin/2011-10/threads.html#00445`

			`The problem then was, IIRC, due to the fact that bytes sent to the pty`
			`pipe were not written as records. Changing pipe to PIPE_TYPE_MESSAGE in`
			`pipe.cc fixed the problem since writing lines to one side of the pipe`
			`caused exactly that the number of characters to be read on the other`
			`even if there were more characters in the pipe.`

			`To debug this, I first replaced fhandler_tty.cc with the 1.258,`
			`2012/04/05 version. The test case started working when I did that.`

			`So, then, I replaced individual functions, one at a time, in`
			`fhandler_tty.cc with their previous versions. I'd expected this to be a`
			`problem with fhandler_pty_master::process_slave_output since that had`
			`seen the most changes but was surprised to see that the culprit was`
			`fhandler_pty_slave::read().`

			`The reason was that I really needed the bytes_available() function to`
			`return the number of bytes which would be read in the next operation`
			`rather than the number of bytes available in the pipe. That's because`
			`there may be a number of lines available to be read but the number of`
			`bytes which will be read by ReadFile should reflect the mode of the pty`
			`and, if there is a line to read, only the number of bytes in the line`
			`should be seen as available for the next read.`

			`Having bytes_available() return the number of bytes which would be read`
			`seemed to fix the problem but it could subtly change the behavior of`
			`other callers of this function. However, I actually think this is`
			`probably a good thing since they probably should have been seeing the`
			`line behavior.`

* ChangeNotes: New file. Add entry cgf-000001. * sigproc.cc (proc_terminate): Don't set parent pid of child to 1 if we've execed since the execed process is still considered the parent. * child_info.h: Bump copyright. 2012-05-03 00:39:39 +08:00			`2012-05-02 cgf-000001`

. 2012-05-03 00:48:13 +08:00			`<1.7.15>`
			`Fix problem setting parent pid to 1 when process with children execs`
			`itself. Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00009.html`
			`</1.7.15>`
* ChangeNotes: New file. Add entry cgf-000001. * sigproc.cc (proc_terminate): Don't set parent pid of child to 1 if we've execed since the execed process is still considered the parent. * child_info.h: Bump copyright. 2012-05-03 00:39:39 +08:00
. 2012-05-03 00:48:13 +08:00			`Investigating this problem with strace showed that ssh-agent was`
			`checking the parent pid and getting a 1 when it shouldn't have. Other`
			`stuff looked ok so I chose to consider this a smoking gun.`
* ChangeNotes: New file. Add entry cgf-000001. * sigproc.cc (proc_terminate): Don't set parent pid of child to 1 if we've execed since the execed process is still considered the parent. * child_info.h: Bump copyright. 2012-05-03 00:39:39 +08:00
			`Going back to the version that the OP said did not have the problem, I`
			`worked forward until I found where the problem first occurred -`
			`somewhere around 2012-03-19. And, indeed, the getppid call returned the`
			`correct value in the working version. That means that this stopped`
			`working when I redid the way the process pipe was inherited around`
			`this time period.`

			`It isn't clear why (and I suspect I may have to debug this further at`
Add url for reported mailing list problem 2012-05-09 06:38:42 +08:00			`some point) this hasn't always been a problem but I made the obvious fix.`
* ChangeNotes: New file. Add entry cgf-000001. * sigproc.cc (proc_terminate): Don't set parent pid of child to 1 if we've execed since the execed process is still considered the parent. * child_info.h: Bump copyright. 2012-05-03 00:39:39 +08:00			`We shouldn't have been setting ppid = 1 when we're about to pass off to`
			`an execed process.`

			`As I was writing this, I realized that it was necessary to add some`
			`additional checks. Just checking for "have_execed" isn't enough. If`
			`we've execed a non-cygwin process then it won't know how to deal with`
			`any inherited children. So, always set ppid = 1 if we've execed a`
			`non-cygwin process.`