OBSOLETE Patch-ID# 102110-01
Keywords: libthread timeout hangs sigprocmask thr_setconcurrency memory leak LWPS
Synopsis: OBSOLETED by 101318
Date: Oct/11/94

Solaris Release: 2.3

SunOS release: 5.3

Unbundled Product:

Unbundled Release:

Topic: SunOS 5.3: libthread fixes

BugId's fixed with this patch: 1146922 1153229 1154502 1156649 1171833 1170507 1169391

Changes incorporated in this version: 1171833 1170507 1169391

Relevant Architectures: sparc

Patches accumulated and obsoleted by this patch: 101489-04

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by: 101318 on Jan/19/99

Files included with this patch:

/usr/lib/libthread.so.1

Problem Description:

1171833 memory leak in thr_setspecific
1170507 libthread memory leak
1169391 thrds uses /dev/zero to initialize thrd stack via mmap(); mmap() contains garbag

This patch fixes two different problems. The first is a memory leak in
a multi-threaded process: libthread failed to reclaim the stacks of
idle threads.

The second is that libthread used /dev/zero even before any threads
had been created. libthread opens /dev/zero for its own use, but it
must not do so before any threads exist. This patch ensures that
libthread uses /dev/zero only after some threads have been created.
Some applications close all file descriptors (say, 3 to MAX); they
should continue to work if they link with libthread but do not create
any threads.

Bug 1171833 is that each thread in an MT program which uses Thread
Specific Data (TSD), that is, the thr_setspecific(3t) interfaces,
leaks 2 words plus 1 word per TSD key. For example, if there are
6 TSD keys, 2 + 6 words are leaked per thread created/exited,
i.e. 32 bytes per thread created/exited. If the rate of thread
creation/destruction is high, this can lead to a substantial memory
leak, especially in long-running applications.
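The leak arithmetic in the thr_setspecific description can be checked with a small calculation. The sketch below is illustrative only and is not part of libthread: the function names are hypothetical, and the 4-byte word size is an assumption implied by the "2 + 6 words = 32 bytes" figure in the description.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes leaked per thread created/exited, following the description:
 * 2 words of fixed overhead plus 1 word per TSD key.
 * word_size is an assumption (4 bytes matches the 32-byte figure above).
 */
size_t tsd_leak_per_thread(size_t nkeys, size_t word_size)
{
    assert(word_size > 0);
    return (2 + nkeys) * word_size;
}

/* Total bytes leaked after nthreads create/exit cycles. */
size_t tsd_leak_total(size_t nkeys, size_t word_size, size_t nthreads)
{
    return tsd_leak_per_thread(nkeys, word_size) * nthreads;
}
```

With 6 keys and 4-byte words, tsd_leak_per_thread(6, 4) gives the 32 bytes quoted above. As a hypothetical illustration, a process that had created and exited one million such threads would have leaked 32,000,000 bytes.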
(from 101489-04)

1156649 lwp pool shrinks and doesn't grow regardless of calls thr_setconcurrency

When a multi-threaded program calls thr_setconcurrency() to set the
number of lwps (say n), and then calls thr_setconcurrency() again
after five minutes, when all aged lwps have died, the resulting
concurrency level is not what the second call requested.

The problem can be explained in the following steps, where _minlwps is
the level of concurrency given in the thr_setconcurrency() call and
_nlwps is the current level of concurrency.

i)   On the first invocation of thr_setconcurrency(), _minlwps is set
     to the level of concurrency requested, and the number of lwps
     running on the system becomes equal to _minlwps. This is done
     only when the level of concurrency present at that time
     (i.e. _minlwps) is less than what has been requested.

ii)  After five minutes, when aged lwps are killed, _minlwps is still
     set to the level of concurrency requested earlier, but _nlwps
     (the number of lwps now available) is much lower.

iii) On another invocation of thr_setconcurrency(), since _minlwps is
     equal to the level of concurrency requested, thr_setconcurrency()
     simply returns, even though the number of lwps available is less
     than _minlwps.

This patch solves this problem in thr_setconcurrency().

(from 101489-03)

1154502 libthread panic when a SIGPOLL/SIGIO signal is received during sleep

When a program linked with the threads library receives a SIGPOLL or
SIGIO signal during sleep(), a threads library panic occurs.

(from 101489-02)

1153229 libthread's version of sigprocmask() is clobbering errno

libthread interposes on sigprocmask(2) to provide its own version.
This version is simply a call to thr_sigsetmask(3t), which ensures
that in a multi-threaded (MT) process the masking operation is carried
out only on the calling thread's signal mask. There is no process-wide
signal mask in an MT process.

The bug is that libthread's version of sigprocmask() clears "errno" on
success.
The bug is typically seen in code such as the following:

        ....
        write_ret = write(...);
        sigprocmask(...);
        if (write_ret == -1) {
                printf("write() failed; errno is %d\n", errno);
                exit(1);
        }
        ....

If the call to "write()" above fails, the wrong value for errno is
printed because sigprocmask() clobbers errno. The work-around is to
check "write_ret" *before* calling sigprocmask():

        ....
        write_ret = write(...);
        if (write_ret == -1) {
                printf("write() failed; errno is %d\n", errno);
                exit(1);
        }
        sigprocmask(...);
        ....

In any case, this is the correct programming style: even without this
bug, the system would *legitimately* overwrite errno if sigprocmask()
itself were to fail, and so destroy the error code returned by the
failed "write()".

Another problem is that libraries such as libsocket, which return -1
and put the error code in "errno", might also encounter this libthread
bug. Library calls can then return -1 with a 0 in "errno".
Applications which call such libraries run into this bug indirectly,
with no possibility of implementing a work-around in the application.

(from 101489-01)

1146922 cond_timedwait() misses its timeout and hangs

The cond_timedwait() interface doesn't reliably guarantee that a
thread will wake up even if it has specified a timeout period. Also,
it's possible for signals to be pending on a thread which are never
delivered.

Patch Installation Instructions:
--------------------------------
Generic 'installpatch' and 'backoutpatch' scripts are provided within
each patch package with instructions appended to this section. Other
specific or unique installation instructions may also be necessary and
should be described below.

Special Install Instructions:
-----------------------------
NOTE: Patches 101318-55 through 101318-61 should not be installed
after patch 102110 is installed.
SunSoft recommends that 101318-62 or later be installed instead.
If you must install patches 101318-55 through 101318-61, they MUST be
installed prior to the installation of 102110 and NEVER be installed
after 102110. Installing them after 102110 would back out all fixes
for 102110.

Instructions to install patch using "installpatch"
--------------------------------------------------

1. Become super-user.
2. Apply the patch by typing:
.
See /tmp/log. for reason for failure.
Explanation and recommended action: The installation of one of
the patch packages failed. Installpatch will backout the patch
to leave the system in its pre-patched state. See the log file
for the reason for failure. Correct the problem and
re-apply the patch.
Error message:
Pkgadd of package failed with error code .
Will not backout patch...patch re-installation.
Warning: The system may be in an unstable state!
See /tmp/log. for reason for failure.
Explanation and recommended action: The installation of one of
the patch packages failed. Installpatch will NOT backout the
patch. You may manually backout the patch using backoutpatch,
then re-apply the entire patch. Look in the log file for the
reason pkgadd failed. Correct the problem and re-apply the
patch.
Patch Installation Messages:
---------------------------
Note: the messages listed below are not necessarily considered errors
as indicated in the explanations given. These messages are, however,
recorded in the patch installation log for diagnostic reference.
Message:
Package not patched:
PKG=SUNxxxx
Original package not installed
Explanation: One of the components of the patch would have patched a
package that is not installed on your system. This is not
necessarily an error. A Patch may fix a related bug for several
packages. Example: suppose a patch fixes a bug in both the
online-backup and fddi packages. If you had online-backup installed
but didn't have fddi installed, you would get the message
Package not patched:
PKG=SUNWbf
Original package not installed
This message only indicates an error if you thought the package
was installed on your system. If this is the case, take the
necessary action to install the package, backout the patch (if
it installed other packages) and re-install the patch.
Message:
Package not patched:
PKG=SUNxxx
ARCH=xxxxxxx
VERSION=xxxxxxx
Architecture mismatch
Explanation: One of the components of the patch would have patched a
package for an architecture different from your system. This is not
necessarily an error. Any patch to one of the architecture specific
packages may contain one element for each of the possible
architectures. For example, assume you are running on a sun4m. If
you were to install a patch to package SUNWcar, you would see the
following (or similar) messages:
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4c
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4d
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4e
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
Package not patched:
PKG=SUNWcar
ARCH=sparc.sun4
VERSION=11.5.0,REV=2.0.18
Architecture mismatch
The only time these messages indicate an error condition
is if installpatch does not correctly recognize your architecture.
Message:
Package not patched:
PKG=SUNxxxx
ARCH=xxxx
VERSION=xxxxxxx
Version mismatch
Explanation: The version of software to which the patch is applied is
not installed on your system. For example, if you were running SunOS
5.3, and you tried to install a patch against SunOS 5.2, you would
see the following (or similar) message:
Package not patched:
PKG=SUNWcsu
ARCH=sparc
VERSION=10.0.2
Version mismatch
This message does not necessarily indicate an error. If
the version mismatch was for a package you needed patched, either
get the correct patch version or install the correct package version.
Then backout the patch (if necessary) and re-apply.
Message:
Re-installing Patch.
Explanation: The patch has already been applied, but there is
at least one package in the patch that could be added. For
example, if you applied a patch that had both Openwindows and
Answerbook components, but your system did not have Answerbook
installed, the Answerbook parts of the patch would not have
been applied. If, at a later time, you pkgadd Answerbook, you
could re-apply the patch, and the Answerbook components of the
patch would be applied to the system.
Message:
Installpatch Interrupted.
Installpatch is terminating.
Explanation: Installpatch was interrupted during execution
(usually through pressing ^C). Installpatch will clean up
its working files and exit.
Message:
Installpatch Interrupted.
Backing out Patch...
Explanation: Installpatch was interrupted during execution
(usually through pressing ^C). Installpatch will clean up
its working files, backout the patch, and exit.
Patch Backout Errors:
---------------------
Error message:
prebackout patch exited with return code .
Backoutpatch exiting.
Explanation and corrective action: the prebackout script
supplied with the patch exited with a return code other
than 0. Generate a script trace of backoutpatch to determine
why the prebackout script failed. Correct the reason for
failure, and re-execute backoutpatch.
Error message:
postbackout patch exited with return code .
Backoutpatch exiting.
Explanation and corrective action: the postbackout script
supplied with the patch exited with a return code other than
0. Look at the postbackout script to determine why it failed.
Correct the failure and, if necessary, RE-EXECUTE THE
POSTBACKOUT SCRIPT ONLY.
Error message:
Only one service may be defined.
Explanation and corrective action: You have attempted to specify
more than one service from which to backout a patch. Different
services must have their patches backed out with different
invocations of backoutpatch.
Error message:
The -S and -R arguments are mutually exclusive.
Explanation and recommended action: You have specified both a
non-native service to backout, and a package installation root.
These two arguments are mutually exclusive. If backing out a
patch from a non-native usr partition, the -S option should be
used. If backing out a patch from a client's root
partition (either native or non-native), the -R option
should be used.
Error message:
The service cannot be found on this system.
Explanation and recommended action: You have specified a non-
native service from which to backout a patch, but the
specified service is not installed on your system. Correctly
specify the service when backing out the patch.
Error message:
Only one rootdir may be defined.
Explanation and recommended action: You have specified more than
one package install root using the -R option. The -R option
may be used only once per invocation of backoutpatch.
Error message:
The directory cannot be found on this system.
Explanation and recommended action: You have specified a
directory using the -R option which is either not mounted,
or does not exist on your system. Verify the directory name
and re-backout the patch.
Error message:
Patch has not been successfully applied to this system.
Explanation and recommended action: You have attempted to backout
a patch that is not applied to this system. If you must
restore previous versions of patched files, you may have to
restore the original files from the initial installation CD.
Error message:
Patch has not been successfully applied to this system.
Will remove directory
Explanation and recommended action: You have attempted to back
out a patch that is not applied to this system. While the
patch has not been applied, a residual
/var/sadm/patch/ (perhaps from an unsuccessful
installpatch) directory still exists. The patch cannot be
backed out. If you must restore old versions of the patched
files, you may have to restore them from the initial
installation CD.
Error message:
This patch was obsoleted by patch .
Patches must be backed out in the order in
which they were installed. Patch backout aborted.
Explanation and recommended action: You are attempting to backout
patches out of order. Patches should never be backed-out out
of sequence. This could undermine the integrity of the more
current patch.
Error message:
Patch was installed without backing up the original
files. It cannot be backed out.
Explanation and recommended action: Either the -d option of
installpatch was set when the patch was applied, or the save
area of the patch was deleted to regain space. As a result, the
original files are not saved and backoutpatch cannot be used.
The original files can only be recovered from the original
installation CD.
Error message:
pkgrm of package failed return code .
See /var/sadm/patch//log for reason for failure.
Explanation and recommended action: The removal of one of
the patch packages failed. See the log file for the reason for
failure. Correct the problem and run the backout script again.
Error message:
Restore of old files failed.
Explanation and recommended action: The backout script uses the
cpio command to restore the previous versions of the files
that were patched. The output of the cpio command should
have preceded this message. The user should take the
appropriate action to correct the cpio failure.
KNOWN PROBLEMS:
On client server machines the patch package is NOT applied
to existing clients or to the client root template space.
Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED
THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH
METHOD ON THE CLIENT. See instructions above for
applying patches to a client.
A bug affecting a package utility (e.g. pkgadd, pkgrm, pkgchk)
could affect the reliability of installpatch or backoutpatch
which uses package utilities to install and backout the patch
package. It is recommended that any patch that fixes package
utility problems be reviewed and, if necessary, applied before
other patches are applied. Such existing patches are:
100901 Solaris 2.1
101122 Solaris 2.2
101331 Solaris 2.3
SEE ALSO
pkgadd, pkgchk, pkgrm, pkginfo, showrev, cpio