newlib-cygwin/winsup/cygwin/DevNotes

2012-05-03  cgf-000002

<1.7.15>
Fix problem where too much input was attempted to be read from a
pty slave.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00049.html
</1.7.15>

My change on 2012/04/05 reintroduced the problem first described by:
http://cygwin.com/ml/cygwin/2011-10/threads.html#00445

The problem then was, IIRC, due to the fact that bytes sent to the pty
pipe were not written as records.  Changing pipe to PIPE_TYPE_MESSAGE in
pipe.cc fixed the problem since writing a line to one side of the pipe
caused exactly that number of characters to be read on the other side,
even if there were more characters in the pipe.

To debug this, I first replaced fhandler_tty.cc with the 1.258,
2012/04/05 version.  The test case started working when I did that.

So, then, I replaced individual functions, one at a time, in
fhandler_tty.cc with their previous versions.  I'd expected this to be a
problem with fhandler_pty_master::process_slave_output since that had
seen the most changes but was surprised to see that the culprit was
fhandler_pty_slave::read().

The reason was that I really needed the bytes_available() function to
return the number of bytes which would be read in the next operation
rather than the number of bytes available in the pipe.  That's because
there may be a number of lines available to be read but the number of
bytes which will be read by ReadFile should reflect the mode of the pty
and, if there is a line to read, only the number of bytes in the line
should be seen as available for the next read.

Having bytes_available() return the number of bytes which would be read
seemed to fix the problem, but it could subtly change the behavior of
other callers of this function.  However, I actually think this is a
good thing since they probably should have been seeing the line
behavior.
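
The difference between "bytes in the pipe" and "bytes the next read
will return" can be illustrated with a small portable model.  This is
only a sketch, not the fhandler_tty.cc code: RecordPipe and
bytes_in_pipe() are hypothetical names, and a std::deque of strings
stands in for a message-mode pipe in which each write is one record.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a message-mode pipe: every write is
// stored as its own record (one line per record).
struct RecordPipe {
    std::deque<std::string> records;

    void write(const std::string& line) { records.push_back(line); }

    // Total number of bytes queued, across all records.
    std::size_t bytes_in_pipe() const {
        std::size_t total = 0;
        for (const auto& r : records)
            total += r.size();
        return total;
    }

    // Number of bytes the next read would return: only the first
    // record, even if more records are waiting behind it.
    std::size_t bytes_available() const {
        return records.empty() ? 0 : records.front().size();
    }

    // Read exactly one record, as a message-mode read would.
    std::string read() {
        if (records.empty())
            return {};
        std::string r = std::move(records.front());
        records.pop_front();
        return r;
    }
};
```

With two lines queued, bytes_in_pipe() reports the sum of both while
bytes_available() reports only the length of the first line, which is
the value a line-oriented reader should be shown.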

2012-05-02  cgf-000001

<1.7.15>
Fix problem setting parent pid to 1 when process with children execs
itself.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00009.html
</1.7.15>

Investigating this problem with strace showed that ssh-agent was
checking the parent pid and getting a 1 when it shouldn't have.  Other
stuff looked ok so I chose to consider this a smoking gun.

Going back to the version that the OP said did not have the problem, I
worked forward until I found where the problem first occurred -
somewhere around 2012-03-19.  And, indeed, the getppid call returned the
correct value in the working version.  That means that this stopped
working when I redid the way the process pipe was inherited around
this time period.

It isn't clear why this hasn't always been a problem (and I suspect I
may have to debug this further at some point), but I made the obvious
fix.  We shouldn't have been setting ppid = 1 when we're about to pass
off to an execed process.

As I was writing this, I realized that it was necessary to add some
additional checks.  Just checking for "have_execed" isn't enough.  If
we've execed a non-cygwin process then it won't know how to deal with
any inherited children.  So, always set ppid = 1 if we've execed a
non-cygwin process.
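
The resulting rule can be written down as a small decision function.
This is a sketch of the logic described above, not the actual
sigproc.cc code: ExecState and should_set_ppid_to_1() are hypothetical
names, and the "not execed" case assumes the pre-existing behavior of
setting ppid = 1 when the parent is simply going away.

```cpp
// Hypothetical description of what the terminating process has done.
enum class ExecState {
    NotExeced,        // process is exiting without having execed
    ExecedCygwin,     // process execed another cygwin process
    ExecedNonCygwin   // process execed a non-cygwin process
};

// Returns true when the children's ppid should be set to 1.
// Only an execed cygwin process can take over inherited children, so
// that is the one case where the ppid is left alone.
bool should_set_ppid_to_1(ExecState state) {
    return state != ExecState::ExecedCygwin;
}
```

Checking only "have_execed" would correspond to treating
ExecedNonCygwin the same as ExecedCygwin, which is the case the
additional check is meant to exclude.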