Patch-ID# 100884-17
Keywords: boot hang security tcp clone kernel panic mfar MP zs nfs mount procfs
Synopsis: SunOS 5.1: Jumbo kernel patch
Date: Jul/15/93
Solaris Release: 2.1
SunOS Release: 5.1
Unbundled Product:
Unbundled Release:
Relevant Architectures: sparc
BugId's fixed with this patch: 1118757 1119235 1109379 1108685 1114069 1108813 1107190 1108112 1108947 1110653 1110373 1105806 1100073 1103645 1106404 1111011 1112756 1113153 1114791 1119071 1110523 1111384 1117508 1120597 1125644 1123266 1112704 1120932 1115127 1119267 1113596 1084913 1104430 1122464 1113596 1111086 1123493 1124179 1121146 1121957 1116255 1120065 1102018 1133751 1132273 1123435
Changes incorporated in this version: 1132273 1123435
Patches accumulated and obsoleted by this patch: 100825-01,100828-01,100829-02,100848-01,100819-01,100858-01,100939-01,100907-01,100947-02
Patches which conflict with this patch:
Patches required with this patch:
Obsoleted by:
Files included with this patch:
kernel/drv/clone
kernel/drv/tcp
kernel/fs/nfs
kernel/sys/nfs
kernel/unix
kernel/fs/procfs
Problem Description:
SunOS 5.1 and SunOS 5.2 can panic with the following message:
panic: page_unlock: pp xxxxxxx is not locked
A watchdog reset is caused when running sundiag on a diskless machine
(swapping over NFS) when the 'mod_uninstall_daemon()' runs out of kernel
stack space. This only happens when swapping over NFS because the call
stack is much deeper. The bug synopsis has nothing to do with the actual
problem. This patch does not fix the clget() warning.
(From 100884-16)
This patch fixes a bug in diskless boot introduced by patch 100884-11's
bugfix for 1113596.
(From 100884-15)
When doing integer multiplication emulation, we should put the low part of
the 64 bit result into dest[0] and the high part into the y-register.
simulate_unimp() attempts to stuff dest[1], which was never set, into rd+1,
therefore trashing the contents of rd+1. (The high part of the 64 bit result
of .umul should be stuffed into the y-register instead. This is already done
in crt.s.)
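As a rough illustration of the intended split (not the kernel's emulation code), the shell arithmetic sketch below separates a 64-bit product into the low word, which belongs in the destination register, and the high word, which belongs in the y-register. The function name umul_split is hypothetical. The sketch assumes a shell with 64-bit arithmetic and products that fit in a signed 64-bit value.

```shell
# Hypothetical helper: split the product of two unsigned 32-bit operands
# into its low and high 32-bit words. Prints "<low> <high>" in hex.
# Assumption: the shell evaluates $(( )) with 64-bit integers.
umul_split() {
    prod=$(( $1 * $2 ))
    low=$(( prod & 0xFFFFFFFF ))            # goes to the destination register
    high=$(( (prod >> 32) & 0xFFFFFFFF ))   # goes to the y-register
    printf '%x %x\n' "$low" "$high"
}
```

For example, `umul_split 0x10000 0x10000` prints `0 1`: the low word is zero and the whole result sits in the high word, which is exactly the part that was being lost.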
(From 100884-14)
Kernel panics with a data fault. kadb or the core dump shows the crash
in a bcopy call made by tcp_reinit_fn.1
(From 100884-13)
In SunOS 5.1, if alarm(n) is called with a large n, the alarm expires
immediately. That is, a SIGALRM signal is delivered right away.
An NIS+ server may hang if a NIS/NIS+ client does a Ctrl-C in the middle
of browsing a large map using ypcat/niscat.
(From 100884-12)
A race condition can occur when two or more processes are trying to
write to the same file over NFS. This patch corrects the problem.
(From 100884-11)
1111086
Galaxy systems with VME devices panicked with an M-bus timeout error
when accessing the VME interrupt/control registers. The system should
not have panicked but should instead print the message "VME dropped INT-ACK cycle".
However, the implementation of sun4m_impl_bustype() on mars is wrong, causing
the panic.
1124179
This is yet another Viking mfar hardware bug workaround.
1123493
This is a fix for a modctl bug.
1113596
This is the second crank of patch 100884-08 for the problem described
below. This fixes a locking problem introduced by the first crank:
lockd may spin and generate multiple lock/unlock requests if it
receives a signal while waiting for a reply to an NFS lock/unlock request.
This is most often manifested when a ksh user logs in or out of a machine
which NFS mounts his/her home directory and types ^C during the brief period
that ksh is locking or unlocking its history file. This causes ksh to hang
and the machine's lockd to consume lots of CPU time.
(From 100884-10)
1104430: The problem is a kernel panic:
panic: recursive mutex_enter ...
This happens when a debugger is applied to a process
whose executable (or any of the shared libraries invoked
by the process) resides on an NFS-mounted filesystem.
1122464: The problem seems to be that the Sun is sending data into a
zero window. The first two packets work OK. The cisco rejects the
third packet because it contains data but its receive window is 0.
This means that the cisco never sees the ACK to its SYN. You'll note
that the cisco keeps retransmitting its SYN, and the sun keeps
retransmitting the 24 bytes of data.
(From 100884-09)
The SO_KEEPALIVE option has no effect in SunOS 5.x. Turning on the
option with setsockopt does not change the operation of TCP. There is
no supporting code to handle the option.
(From 100884-08)
lockd may spin and generate multiple lock/unlock requests if it
receives a signal while waiting for a reply to an NFS lock/unlock request.
This is most often manifested when a ksh user logs in or out of a machine
which NFS mounts his/her home directory and types ^C during the brief period
that ksh is locking or unlocking its history file. This causes ksh to hang
and the machine's lockd to consume lots of CPU time.
(for 100947-02)
If an NFS client mounts a filesystem read-only, access() will still
claim that writes are possible.
(for 100947-01)
When an Ultrix client uses "touch" to create a new file on a Solaris 2.1
server, the file ends up with the wrong permissions, rwsrwsrwt. This
happens because Ultrix sends a (short)-1 instead of a (long)-1 in the
mode field of NFS SETATTR requests. Since only the lower 12 bits are
valid mode bits, we check for both (long)-1 and (short)-1.
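The arithmetic behind this can be sketched in shell. This is only an illustration of the check described above, not the server's code. The function name mode_is_unset is hypothetical, and the sketch assumes a 32-bit long, so (long)-1 is 0xFFFFFFFF and (short)-1 is 0xFFFF.

```shell
# Hypothetical helper: succeed when a SETATTR mode value means
# "do not change the mode", accepting both encodings of -1.
# Assumption: (long)-1 is 0xFFFFFFFF and (short)-1 is 0xFFFF.
mode_is_unset() {
    mode=$(( $1 ))
    [ "$mode" -eq $(( 0xFFFFFFFF )) ] || [ "$mode" -eq $(( 0xFFFF )) ]
}

# Only the lower 12 bits are valid mode bits, so an unchecked (short)-1
# is applied as octal 7777, which ls displays as rwsrwsrwt.
printf '%o\n' $(( 0xFFFF & 07777 ))
```

The final printf prints 7777, which is why the file showed up with every permission, setuid, setgid and sticky bit set.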
(for 100939-01)
If a partition on a 4.1.2 (or 4.1.3) server is full and you write to
it from a 5.1 client, the write appears to succeed and the file is
reported on the client as having grown. On the server, however, the
file is zero length.
(for 100907-01)
This bug causes Concurrent's "pwd" command to report incomplete pathnames.
This may also affect other vendors who rely on the dirent data to include
accurate name length data. KRBrown
The description field as copied from bug report 1119254 follows:
In fastpath, the readdir result is counting the null byte in the file
name. For example the name "." has a count of 2 and a value of ".\0"
Probable cause is the use of copystr() which returns the string length
including the NULL byte.
(From 100884-07)
Calling the clean user windows trap on sun4m running SunOS 5.1
consistently fails with a segmentation violation.
(From 100884-06)
Kernel Bug, 1125644:
One of the kernel bugs we encountered on the Solaris5.1_fcs sun4m (galaxy)
architecture was that the kernel could data fault while taking a pagefault,
due to a de-reference of a bogus pointer to the proc structure. The
pointer was bogus because it was obtained from the lwp structure which
could be in the process of being torn down due to a process exiting.
The fix is to obtain the proc pointer from the current thread (the
thread taking the pagefault), and not from the current lwp.
(From 100884-05)
MP startup is fragile. In some circumstances during boot, the
system may try to service an interrupt on a CPU that is not yet
fully initialized.
This problem has been observed on one or two configurations of MP
machines with the new 'zs' driver, though other 3rd party drivers
active during kernel initialization may provoke the problem.
1120597 zs driver watchdog resets on 4-Viking Galaxy
(From 100884-04)
This panic may be caused when the system fails to prevent a new
segment which overlaps an existing segment, in the address space,
from being created.
1110523 Kernel panics with "srmmu_pteload: remap page..."
1119071 crash in ipc_hash_remove due to outer perimeter bug
There are bugs in the earlier ROSS 605 chips which can lead to data
corruption in Multi-Processor mode. The fix applied determines if
these older chips exist in the system and, if so, boots the system in
Uni-Processor mode only - and prints a warning message on the console
at boot time.
The 'MFAR' bug fix is due to the discovery of a bug in the TI
SuperSPARC chip. Occasionally, due to an unusual set of circumstances on
the MBus, a page fault will occur which latches the wrong faulting
address. The fix is to look at the faulting instruction to determine
the correct fault address.
1111384 sun4m systems should stop the boot if running SVR4 with down-rev
1117508 yet another mfar bug
(From 100884-03)
1118757 data fault while doing putpmsg/getpmsg
1119235 kernel hang with patch 100858-01
(From 100858-01)
TCP maximum segment size option has a lower limit of 128.
(From 100819-01)
If you see one of these kernel panics you need to apply the patch:
panic: tcp_close_detached - no mblk
panic: tcp_clean_death - no mblk
(From 100884-02)
Several problems have been uncovered in the 4m architecture. Most of
these affect either long SunDiag runs (a program used to test various
hardware/software interactions - esp. within Sun manufacturing, but
also at many Sun OEM sites) or long term stability of the 4m machines.
Machines affected by these problems are Sun 4/6XX, SPARCstation 10 (all
models), Sunergy and Sunergy Classic.
1114069 C2 (ss10) boot hang
(From 100884-01)
1100073 mmap() is not working correctly on 5.0.1/sun4m
1103645 sun4m l15 handler doesn't handle viking module error correctly
1106404 mmap system call fails on galaxy causing unexpected trap
1111011 kernel preempts the 2.8 non-preemptible PROM
1112756 fix module_ross.c to check for pfn, Cacheability, etype.
1113153 seg_kmem.c pass 0 for PTE_RM_MASK when the pte is being invalidated
1114791 Sunergy's and Classics are Watchdog Resetting with invalid Level 0 PTP
(From 100848-01)
1108813 security, srmmu window handler does not check %sp
(From 100829-02)
1107190 Page create can potentially return a page without acquiring the
exclusive lock
(From 100828-01)
When asyncio calls are made from the NeWSprint handler for the SPARCprinter
to write the second page of a job, the number of context switches skyrockets
to the point that the user is no longer able to get new input focus until one
of the two threads has finished.
1105806 asyncio calls made in NeWSprint cause too many context switches
(From 100825-01)
The patch fixes various system panics in kmem_alloc/kmem_free when
doing file locking. It also fixes problems with locks being lost
when upgrading locks, and with incorrect counting of locks, which made
the system tunable parameter for the number of locks in the system inaccurate.
1108112 Kernel file locking can hang or crash system.
1108947 Kernel loses track of file locks
1110653 when system lock limit is reached, fcntl() never returns ENOLCK
1110373 system's counting of record locks is incorrect
Patch Installation Instructions:
--------------------------------
Generic 'installpatch' and 'backoutpatch' scripts are provided
within each patch package with instructions appended to this section.
Other specific or unique installation instructions may also be
necessary and should be described below.
Special Install Instructions:
-----------------------------
None.
Instructions to install patch using "installpatch"
--------------------------------------------------
1. Become super-user.
2. Apply the patch by typing:
<patch-dir>/<patch-id>/installpatch <patch-dir>/<patch-id>
where <patch-dir> is the directory containing the patch and
<patch-id> is the patch number. <patch-dir> must be a full path name.
Example:
# /tmp/123456-01/installpatch /tmp/123456-01
3. If any errors are reported, see "Patch Installation Errors" in
the Command Descriptions section below.
Rebooting the system or restarting the application after a successful
patch installation is usually necessary for the patch to take effect.
NOTE: On client server machines the patch package is NOT applied
to existing clients or to the client root template space.
Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED
THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH
METHOD ON THE CLIENT. See the next section for instructions
for installing a patch on a client.
Instructions for installing a patch on a diskless or dataless client
--------------------------------------------------------------------
1. Before applying the patch, the following command must be executed
on the server to give the client read-only, root access to the
exported /usr file system so that the client can execute the
pkgadd command:
share -F nfs -o ro,anon=0 /export/exec/Solaris_2.1_sparc.all/usr
The command:
share -F nfs -o ro,root=<client> \
/export/exec/Solaris_2.1_sparc.all/usr
accomplishes the same goal, but only gives root access to the
client specified in the command.
2. Log in to the client system and become super-user.
3. Continue with step 2 in the "Instructions to install patch using
installpatch" section above.
Instructions for backing out patch using "backoutpatch"
-------------------------------------------------------
1. Become super-user.
2. Change directory to /var/sadm/patch:
cd /var/sadm/patch
3. Back out the patch by typing:
<patch-id>/backoutpatch <patch-id>
where <patch-id> is the patch number.
Example:
# 123456-01/backoutpatch 123456-01
4. If any errors are reported, see "Patch Backout Errors" in
the Command Descriptions section below.
Instructions for identifying patches installed on system:
----------------------------------------------------------
Type:
installpatch -p
This command produces a list of the patch IDs of the patches that
are currently applied to the system. When executed with the -p
option, the installpatch command does not modify the system in
any way.
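As a rough cross-check, and not a substitute for installpatch -p, the sketch below lists directories that look like patch IDs under /var/sadm/patch. It assumes that each applied patch keeps a directory named after its patch ID there, as the backout instructions imply. The function name list_patch_dirs and its optional directory argument are hypothetical.

```shell
# Hypothetical helper: print the names of directories that look like
# patch IDs (NNNNNN-NN) under a patch directory.
# $1 = directory to scan (default /var/sadm/patch).
# Assumption: each applied patch has a directory named after its patch ID.
list_patch_dirs() {
    dir=${1:-/var/sadm/patch}
    for d in "$dir"/[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]; do
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
    return 0
}
```

Like installpatch -p, this only reads from the system and does not modify it.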
Command Descriptions
--------------------
NAME
installpatch - apply patch package to Solaris 2.x system
backoutpatch - remove patch package from Solaris 2.x system
SYNOPSIS
installpatch [-u] [-d] <patch-dir>/<patch-id>
backoutpatch <patch-id>
DESCRIPTION
These installation and backout utilities apply only to
Solaris 2.x associated patches. They do not apply to Solaris
1.x associated patches. These utilities are currently only
provided with each patch package and are not included with
the standard Solaris 2.x release software.
OPTIONS
installpatch
-u unconditional install, do not verify file attributes
-d do not save original files being replaced
-p print a list of the patches currently applied on the system
DIAGNOSTICS
Patch Installation Errors:
--------------------------
Error message: Patch has already been applied.
Explanation and recommended action: This patch has already been
applied to the system. If the patch has to be reapplied
for some reason, backout the patch and then reapply it.
Error message: This patch is obsoleted by a patch which has
already been applied to this system. Application of this
patch would leave the system in an inconsistent state.
Patch installation is aborted.
Explanation and recommended action: Occasionally, a patch
is replaced by a new patch which incorporates the bug fixes
in the old patch and supplies additional fixes also. At
this time, the earlier patch is no longer made available
to users. The second patch is said to "obsolete" the
first patch. However, it is possible that some users
may still have the earlier patch and try to apply it to
a system on which the later patch is already applied.
If the obsoleted patch were allowed to be applied, the
additional fixes supplied by the later patch would no
longer be available, and the system would be left in an
inconsistent state. This error message indicates that
the user attempted to install an obsoleted patch. There
is no need to apply this patch because the later patch
has already supplied the fix.
Error message: The packages to be patched are not installed on
this system.
Explanation and recommended action: None of the packages
to be updated by this patch are installed on the system.
Therefore, this patch cannot be applied to the system.
Error message: This patch is not applicable to client systems.
Explanation and recommended action: The patch is only
applicable to servers and standalone machines. Attempting
to apply this patch to a client system will have no effect on
the system.
Error message: The /usr/sbin/pkgadd command is not executable.
Explanation and recommended action: The /usr/sbin/pkgadd
command cannot be executed. The most likely cause of this
is that installpatch is being run on a diskless or dataless
client and the /usr file system was not exported with
root access to the client. See the section above on
"Instructions for installing a patch on a diskless or
dataless client".
Error message: Patch directory is not of expected format.
Explanation and recommended action: The patch directory
supplied as an argument to installpatch did not contain
any patch packages. Verify that the argument supplied
to installpatch is correct.
Error message: The following validation errors were found:
Explanation and recommended action: Before applying the patch,
the patch application script verifies that the current
versions of the files to be patched have the expected
fcs checksums and attributes. If a file to be patched has
been modified by the user, the user is notified of this
fact. The user then has the opportunity to save the
file and make a similar change to the patched version.
For example, if the user has modified /etc/inet/inetd.conf
and /etc/inet/inetd.conf is to be replaced by the patch,
the user can save the locally modified /etc/inet/inetd.conf
file and make the same modification to the new file
after the patch is applied. After the user has noted all
validation errors and taken the appropriate action for
each one, the user should re-run installpatch using
the "-u" (for "unconditional") option. This time, the
patch installation will ignore validation errors and
install the patch anyway.
Error message: Insufficient space in /var/sadm to save old files.
Explanation and recommended action: There is insufficient
space in the /var/sadm directory to save old files.
The user has two options for handling this problem:
(1) generate additional disk space by deleting unneeded
files, or (2) override the saving of the old files by
using the "-d" (do not save) option when running installpatch.
However if the user elects not to save the old versions of
the files to be patched, backoutpatch CANNOT be used.
One way to regain space on a system is to remove the
save area for previously applied patches. Once the user
has decided that it is unlikely that a patch will be
backed out, the user can remove the files that were saved
by installpatch. The following commands should be executed
to remove the saved files for patch xxxxxx-yy:
cd /var/sadm/patch/xxxxxx-yy
rm -r save/*
rm .oldfilessaved
After these commands have been executed, patch xxxxxx-yy can
no longer be backed out.
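The three commands above can be wrapped in a small function that first checks that the patch directory exists. This is only a sketch: the function name remove_patch_save_area and its optional base-directory argument are hypothetical, and it uses rm -f so that an already-empty save area is not treated as an error.

```shell
# Hypothetical helper: remove the saved files for one patch.
# $1 = patch ID, $2 = patch base directory (default /var/sadm/patch).
# After this runs, the patch can no longer be backed out.
remove_patch_save_area() {
    patch_id=$1
    base=${2:-/var/sadm/patch}
    if [ -z "$patch_id" ] || [ ! -d "$base/$patch_id" ]; then
        echo "no patch directory for '$patch_id' under $base" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$base/$patch_id" || exit 1
        rm -rf save/*
        rm -f .oldfilessaved
    )
}
```

A call such as `remove_patch_save_area xxxxxx-yy` would then replace the manual steps for patch xxxxxx-yy.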
Error message: Save of old files failed.
Explanation and recommended action: Before applying the patch,
the patch installation script uses cpio to save the old
versions of the files to be patched. This error message
means that the cpio failed. The output of the cpio
would have preceded this message. The user should
take the appropriate action to correct the cpio failure.
A common reason for failure will be insufficient disk
space to save the old versions of the files. The user
has two options for handling insufficient disk space:
(1) generate additional disk space by deleting unneeded
files, or (2) override the saving of the old files by
using the "-d" option when running installpatch. However
if the user elects not to save the old versions of the
files to be patched, the patch CANNOT be backed out.
Error message: Pkgadd of package failed. See
/tmp/log. for reason for failure.
Explanation and recommended action: The installation of one of the
patch packages failed. Any previously installed packages
in the patch should have been removed. See the log file
for the reason for failure. Correct the problem and
re-apply the patch.
Error message: error while adding patch to root template
Explanation and recommended action: The install script
determined this system to be a client server. The attempt
to apply the patch package to the appropriate root
template space located under /export/root/templates
failed unexpectedly. Check the log file for any failure
messages. Correct the problem and re-apply the patch.
Patch Backout Errors:
---------------------
Error message: Patch has not been applied to this system.
Explanation and recommended action: The user has attempted to back
out a patch that was never applied to this system. It is
possible that the patch was applied, but that the patch
directory /var/sadm/patch/<patch-id> was deleted somehow.
If this is the case, the patch cannot be backed out. The
user may have to restore the original files from the
initial installation CD.
Error message: Patch was installed without backing up the
original files. It cannot be backed out.
Explanation and recommended action: Either the -d option of
installpatch was set when the patch was applied, or the save
area of the patch was deleted to regain space. As a result, the
original files are not saved and backoutpatch cannot be used. The
original files can only be recovered from the original
installation CD.
Error message: Pkgrm of package failed. See
/var/sadm/patch/<patch-id>/log for reason for failure.
Explanation and recommended action: The removal of one of the
patch packages failed. See the log file
for the reason for failure. Correct the problem and
run the backout script again.
Error message: Restore of old files failed.
Explanation and recommended action: The backout script uses the
cpio command to restore the previous versions of the files
that were patched. The output of the cpio command should
have preceded this message. The user should take the
appropriate action to correct the cpio failure.
KNOWN PROBLEMS:
On client server machines the patch package is NOT applied
to existing clients or to the client root template space.
Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED
THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH
METHOD ON THE CLIENT. See instructions above for
applying patches to a client.
After a patch package has been installed pkginfo(1) will
not recognize the SUNW_PATCHID macro in the patch package
pkginfo file. Instead, to identify patches installed on
the system use the grep command method described in the
patch README.
The pkgadd command shipped with Solaris 2.1 fails (drops core
without any error message) when there are more than 100
entries in the /etc/mnttab file. This means that installpatch
can fail, because it uses pkgadd. Since this is very likely on
any big system with lots of automounts, ANY patch could fail.
Applying patch 100901-01 fixes this problem (the README for
patch 100901 mentions shutting down the automounter while
applying it).
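Before running installpatch on a system with many mounts, it may be worth checking how many entries /etc/mnttab holds. The sketch below assumes one mount entry per line. The function names mnttab_entries and mnttab_over_limit are hypothetical, and the limit of 100 comes from the description above.

```shell
# Hypothetical helper: count entries in an mnttab-style file.
# $1 = file to read (default /etc/mnttab).
# Assumption: the file holds one mount entry per line.
mnttab_entries() {
    file=${1:-/etc/mnttab}
    wc -l < "$file" | tr -d ' '
}

# Succeeds when the file has more than 100 entries, the point at which
# the Solaris 2.1 pkgadd is described as failing.
mnttab_over_limit() {
    [ "$(mnttab_entries "$1")" -gt 100 ]
}
```

A possible use is `mnttab_over_limit && echo "apply patch 100901-01 first"`, run before attempting any other patch.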
SEE ALSO
pkgadd(1), pkgchk(1), pkgrm(1), pkginfo(1)