Patch-ID# 100884-17
Keywords: boot hang security tcp clone kernel panic mfar MP zs nfs mount procfs
Synopsis: SunOS 5.1:  Jumbo kernel patch
Date: Jul/15/93

Solaris Release: 2.1

SunOS Release: 5.1

Unbundled Product: 

Unbundled Release: 

Relevant Architectures: sparc

BugId's fixed with this patch: 1118757 1119235 1109379 1108685 1114069 1108813 1107190 1108112 1108947 1110653 1110373 1105806 1100073 1103645 1106404 1111011 1112756 1113153 1114791 1119071 1110523 1111384 1117508 1120597 1125644 1123266 1112704 1120932 1115127 1119267 1113596 1084913 1104430 1122464 1113596 1111086 1123493 1124179 1121146 1121957 1116255 1120065 1102018 1133751 1132273 1123435

Changes incorporated in this version: 1132273 1123435

Patches accumulated and obsoleted by this patch: 100825-01,100828-01,100829-02,100848-01,100819-01,100858-01,100939-01,100907-01,100947-02

Patches which conflict with this patch: 

Patches required with this patch: 

Obsoleted by: 

Files included with this patch: 

        kernel/drv/clone
        kernel/drv/tcp
        kernel/fs/nfs
        kernel/sys/nfs
        kernel/unix
        kernel/fs/procfs

Problem Description: 

SunOS 5.1 and SunOS 5.2 can panic with the following message:
	panic: page_unlock: pp xxxxxxx is not locked

Running sundiag on a diskless machine (swapping over NFS) can cause a
watchdog reset when mod_uninstall_daemon() runs out of kernel stack
space. This happens only when swapping over NFS, because the call stack
is much deeper in that case. The bug synopsis has nothing to do with the
actual problem. This patch does not fix the clget() warning.

(From 100884-16)
This patch fixes a bug in diskless boot introduced by patch 100884-11's
bugfix for 1113596.

(From 100884-15)
When emulating integer multiplication, the low part of the 64-bit result
should go into dest[0] and the high part into the y-register.
simulate_unimp() attempted to stuff dest[1], which was never set, into rd+1,
thereby trashing the contents of rd+1.  (The high part of the 64-bit result
of .umul should be stuffed into the y-register instead.  This is already
done in crt.s.)
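
The split between the low and high words can be illustrated with a small
shell sketch. It is not the kernel's simulate_unimp() code; umul_words is
a hypothetical helper, and the sketch assumes a POSIX shell whose $(( ))
arithmetic is at least 64 bits wide (the original Bourne shell has no
$(( )) at all).

```shell
# Illustration only: split the 64-bit product of two unsigned 32-bit
# values into the two 32-bit words described above.
# umul_words is a hypothetical name, not a kernel routine.
umul_words() {
    a=$1
    b=$2
    # Work on 16-bit halves so every partial product stays below 2^32.
    al=$(( $a & 0xFFFF )); ah=$(( ($a >> 16) & 0xFFFF ))
    bl=$(( $b & 0xFFFF )); bh=$(( ($b >> 16) & 0xFFFF ))
    ll=$(( al * bl )); lh=$(( al * bh ))
    hl=$(( ah * bl )); hh=$(( ah * bh ))
    # Middle column of the schoolbook multiplication, including carries.
    mid=$(( (ll >> 16) + (lh & 0xFFFF) + (hl & 0xFFFF) ))
    low=$(( ((mid & 0xFFFF) << 16) | (ll & 0xFFFF) ))
    high=$(( hh + (lh >> 16) + (hl >> 16) + (mid >> 16) ))
    # First word: low half (destination register).
    # Second word: high half (y-register).
    printf '%08x %08x\n' "$low" "$high"
}
```

For example, "umul_words 4294967295 4294967295" prints "00000001 fffffffe":
the low word belongs in the destination register and the high word in the
y-register, and no second destination register is involved.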

(From 100884-14)
The kernel panics with a data fault.  kadb or the core dump shows the
crash in bcopy, called by tcp_reinit_fn.1

(From 100884-13)
In SunOS 5.1, if alarm(n) is called with a large n, the alarm expires
immediately.  That is, a SIGALRM signal is delivered right away.

An NIS+ server may hang if a NIS/NIS+ client does a Ctrl-C in the middle
of browsing a large map using ypcat/niscat.

(From 100884-12)
A race condition can occur when two or more processes are trying to
write to the same file over NFS. This patch corrects the problem.

(From 100884-11)
1111086
Galaxy systems with VME devices panicked with an M-bus timeout error
when accessing the VME interrupt/control registers.  The system should
not have panicked; it should instead have printed the message "VME dropped
INT-ACK cycle".  However, the implementation of sun4m_impl_bustype() on
mars is wrong, causing the panic.

1124179
This is yet another Viking mfar hardware bug workaround.

1123493
This is a fix for a modctl bug.

1113596 
This is the second crank of patch 100884-08 for the problem described
below.  This fixes a locking problem introduced by the first crank:

	lockd may spin and generate multiple lock/unlock requests if it
receives a signal while waiting for a reply to an NFS lock/unlock request.
This is most often manifested when a ksh user logs in or out of a machine
which NFS mounts his/her home directory and types ^C during the brief period
that ksh is locking or unlocking its history file.  This causes ksh to hang
and the machine's lockd to consume lots of CPU time.

(From 100884-10)

1104430: The problem is a kernel panic:
	panic: recursive mutex_enter ...
This happens when a debugger is applied to a process
whose executable (or any of the shared libraries used
by the process) resides on an NFS-mounted filesystem.

1122464: The problem seems to be that the Sun is sending data into a 
zero window.  The first two packets work OK.  The cisco rejects the 
third packet because it contains data but its receive window is 0.  
This means that the cisco never sees the ACK to its SYN.  You'll note 
that the cisco keeps retransmitting its SYN, and the sun keeps 
retransmitting the 24 bytes of data.

(From 100884-09)
The SO_KEEPALIVE option has no effect in SunOS 5.x.  Turning on the
option with setsockopt does not change the operation of TCP.  There is
no supporting code to handle the option.

(From 100884-08)
        lockd may spin and generate multiple lock/unlock requests if it
receives a signal while waiting for a reply to an NFS lock/unlock request.
This is most often manifested when a ksh user logs in or out of a machine
which NFS mounts his/her home directory and types ^C during the brief period
that ksh is locking or unlocking its history file.  This causes ksh to hang
and the machine's lockd to consume lots of CPU time.

(for 100947-02)
If an NFS client mounts a filesystem read-only, access() will still
claim that writes are possible.

(for 100947-01)
When an Ultrix client uses "touch" to create a new file on a Solaris 2.1
server, the file ends up with the wrong permissions, rwsrwsrwt. This is
because Ultrix sends a (short)-1 instead of a (long)-1 in the mode field
of NFS SETATTR requests. Since only the lower 12 bits are valid mode
bits, we check for both (long)-1 and (short)-1.
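
The check can be illustrated with a short shell sketch. It is not the NFS
server source: mode_is_unset and mode_perm_bits are hypothetical helpers,
and the sketch assumes that the (short)-1 arrives in the 32-bit mode field
as 0x0000FFFF. It needs a POSIX shell with $(( )) arithmetic.

```shell
# Illustration only, not the NFS server code.
# Assumption: a (short)-1 from the client appears in the 32-bit mode
# field as 0x0000FFFF (decimal 65535).

# mode_is_unset MODE: succeed if MODE means "do not change the mode".
mode_is_unset() {
    case $(( $1 & 0xFFFFFFFF )) in
        4294967295|65535) return 0 ;;   # (long)-1 or (short)-1
        *) return 1 ;;
    esac
}

# mode_perm_bits MODE: print, in octal, the permission bits that MODE
# would set if it were taken literally (only the lower 12 bits count).
mode_perm_bits() {
    printf '%04o\n' $(( $1 & 07777 ))
}
```

"mode_perm_bits 65535" prints 7777, which is the rwsrwsrwt permission set
seen on the server when the (short)-1 value is not recognized.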

(for 100939-01)
If a partition on a 4.1.2 (or 4.1.3) server is full and you write to
it from a 5.1 client, the write appears to succeed and the file is
reported on the client as having grown. On the server, however, the
file is zero length.

(for 100907-01)
This bug causes Concurrent's "pwd" command to report incomplete pathnames.
This may also affect other vendors who rely on the dirent data to include
accurate name length data.  KRBrown

The description field as copied from bug report 1119254 follows:

In fastpath, the readdir result is counting the null byte in the file
name.  For example the name "." has a count of 2 and a value of ".\0"
Probable cause is the use of copystr() which returns the string length
including the NULL byte.

(From 100884-07)
Calling the clean user windows trap on sun4m running SunOS 5.1
consistently fails with a segmentation violation.
 
(From 100884-06)
Kernel Bug, 1125644:

One of the kernel bugs we encountered on the Solaris5.1_fcs sun4m (galaxy)
architecture was that the kernel could data fault while taking a pagefault,
due to a de-reference of a bogus pointer to the proc structure. The
pointer was bogus because it was obtained from the lwp structure which
could be in the process of being torn down due to a process exiting.
The fix is to obtain the proc pointer from the current thread (the
thread taking the pagefault), and not from the current lwp.

(From 100884-05)
MP startup is fragile.  In some circumstances during boot, the
system may try to service an interrupt on a CPU that is not yet
fully initialized.

This problem has been observed on one or two configurations of MP
machines with the new 'zs' driver, though other 3rd party drivers
active during kernel initialization may provoke the problem.

1120597 zs driver watchdog resets on 4-Viking Galaxy

(From 100884-04)
This panic may occur when the system fails to prevent the creation
of a new segment that overlaps an existing segment in the address
space.

1110523 Kernel panics with "srmmu_pteload: remap page..."
1119071 crash in ipc_hash_remove due to outer perimeter bug

There are bugs in the earlier ROSS 605 chips which can lead to data
corruption in Multi-Processor mode.  The fix applied determines if
these older chips exist in the system and, if so, boots the system in
Uni-Processor mode only - and prints a warning message on the console
at boot time.

The 'MFAR' bug fix is due to the discovery of a bug in the TI
SuperSPARC chip.  Occasionally due to an unusual set of circumstances on
the MBus, a page fault will occur which latches the wrong faulting
address.  The fix is to look at the faulting instruction to determine
the correct fault address.

1111384 sun4m systems should stop the boot if running SVR4 with down-rev
1117508 yet another mfar bug

(From 100884-03)
1118757 data fault while doing putpmsg/getpmsg
1119235 kernel hang with patch 100858-01

(From 100858-01)
TCP maximum segment size option has a lower limit of 128.

(From 100819-01)
If you see one of these kernel panics you need to apply the patch:
        panic: tcp_close_detached - no mblk
        panic: tcp_clean_death - no mblk

(From 100884-02)
Several problems have been uncovered in the 4m architecture.  Most of
these affect either long SunDiag runs (a program used to test various
hardware/software interactions - esp. within Sun manufacturing, but
also at many Sun OEM sites) or long term stability of the 4m machines.
Machines affected by these problems are Sun 4/6XX, SPARCstation 10 (all
models), Sunergy and Sunergy Classic.

1114069 C2 (ss10) boot hang

(From 100884-01)
1100073 mmap() is not working correctly on 5.0.1/sun4m
1103645 sun4m l15 handler doesn't handle viking module error correctly
1106404 mmap system call fails on galaxy causing unexpected trap
1111011 kernel preempts the 2.8 non-preemptible PROM
1112756 fix module_ross.c to check for pfn, Cacheability, etype.
1113153 seg_kmem.c pass 0 for PTE_RM_MASK when the pte is being invalidated
1114791 Sunergy's and Classics are Watchdog Resetting with invalid Level 0 PTP
 
(From 100848-01)
1108813 security, srmmu window handler does not check %sp
 
(From 100829-02)
1107190 Page create can potentially return a page without acquiring the
        exclusive lock
 
(From 100828-01)
When asyncio calls are made from the NeWSprint handler for the SPARCprinter
to write the second page of a job, the number of context switches skyrockets
to the point that the user is no longer able to get new input focus until one
of the two threads has finished.
 
1105806 asyncio calls made in NeWSprint cause too many context switches
 
(From 100825-01)
The patch fixes various system panics in kmem_alloc/kmem_free when
doing file locking. It also fixes problems with locks being lost
when locks are upgraded, and with the counting of locks being
incorrect, so that the system tunable parameter for the number of
locks in the system is not accurate.
 
1108112 Kernel file locking can hang or crash system.
1108947 Kernel loses track of file locks
1110653 when system lock limit is reached, fcntl() never returns ENOLCK
1110373 system's counting of record locks is incorrect


Patch Installation Instructions: 
-------------------------------- 
Generic 'installpatch' and 'backoutpatch' scripts are provided
within each patch package with instructions appended to this section.
Other specific or unique installation instructions may also be
necessary and should be described below.

Special Install Instructions: 
----------------------------- 

None.


Instructions to install patch using "installpatch"
--------------------------------------------------

1.  Become super-user.

2.  Apply the patch by typing:

	<dir>/<patch-id>/installpatch <dir>/<patch-id>

    where <dir> is the directory containing the patch and <patch-id>
    is the patch number.  <dir> must be a full path name.

    Example:

	# /tmp/123456-01/installpatch /tmp/123456-01

3.  If any errors are reported, see "Patch Installation Errors" in
    the Command Descriptions section below.

    Rebooting the system or restarting the application after a successful
    patch installation is usually necessary for the patch to take effect.

    NOTE: On client server machines the patch package is NOT applied
	  to existing clients or to the client root template space.  
	  Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED 
	  THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH 
	  METHOD ON THE CLIENT.  See the next section for instructions
	  for installing a patch on a client.
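
The requirements on the installpatch argument (a full path name ending in
the patch ID, with the installpatch script inside it) can be checked
before the script is run. The following is a minimal sketch;
check_patch_dir is a hypothetical helper, not part of the patch package,
and it assumes a POSIX shell.

```shell
# check_patch_dir DIR: succeed if DIR is a full path name whose last
# component looks like a patch ID (six digits, a hyphen, two digits)
# and which contains an installpatch script.
# Hypothetical helper; it does not install anything.
check_patch_dir() {
    case $1 in
        /*) ;;
        *) echo "not a full path name: $1" >&2; return 1 ;;
    esac
    case ${1##*/} in
        [0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]) ;;
        *) echo "last component is not a patch ID: $1" >&2; return 1 ;;
    esac
    [ -f "$1/installpatch" ] || {
        echo "no installpatch script in $1" >&2
        return 1
    }
}
```

A possible use, with the example directory from step 2:

	# check_patch_dir /tmp/123456-01 && /tmp/123456-01/installpatch /tmp/123456-01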


Instructions for installing a patch on a diskless or dataless client
--------------------------------------------------------------------

1.  Before applying the patch, the following command must be executed
    on the server to give the client read-only, root access to the
    exported /usr file system so that the client can execute the
    pkgadd command:

    share -F nfs -o ro,anon=0 /export/exec/Solaris_2.1_sparc.all/usr

    The command:

    share -F nfs -o ro,root=<client_name> \
		/export/exec/Solaris_2.1_sparc.all/usr

    accomplishes the same goal, but only gives root access to the
    client specified in the command.

2.  Log in to the client system and become super-user.

3.  Continue with step 2 in the "Instructions to install patch using
    installpatch" section above.


Instructions for backing out patch using "backoutpatch"
-------------------------------------------------------

1.  Become super-user. 

2.  Change directory to /var/sadm/patch:
 
        cd /var/sadm/patch
 
3.  Back out the patch by typing:
 
        <patch-id>/backoutpatch <patch-id>
 
    where <patch-id> is the patch number.

    Example:

	# 123456-01/backoutpatch 123456-01

4.  If any errors are reported, see "Patch Backout Errors" in 
    the Command Descriptions section below.


Instructions for identifying patches installed on system:
----------------------------------------------------------

Type:

    installpatch -p

This command produces a list of the patch IDs of the patches that
are currently applied to the system.  When executed with the -p
option, the installpatch command does not modify the system in
any way.
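
The list can also be checked from a script. The following is a minimal
sketch; patch_at_least is a hypothetical helper, and it assumes only that
patch IDs appear in the output as NNNNNN-NN tokens, since the exact output
format of "installpatch -p" is not described here. It needs a POSIX shell
and an awk that supports -v.

```shell
# patch_at_least BASE REV: read text on standard input and succeed if
# it names patch BASE at revision REV or later.
# Hypothetical helper; the input format is an assumption (patch IDs
# written as NNNNNN-NN somewhere in the text).
patch_at_least() {
    # Put every run of digits and hyphens on its own line, then look
    # for BASE-xx with xx >= REV.
    tr -cs '0-9-' '\n' |
    awk -F- -v base="$1" -v want="$2" '
        NF == 2 && $1 == base && $2 ~ /^[0-9][0-9]$/ &&
            $2 + 0 >= want + 0 { found = 1 }
        END { exit(found ? 0 : 1) }'
}
```

A possible use:

	# installpatch -p | patch_at_least 100884 17 && echo "100884-17 or later is applied"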


Command Descriptions
--------------------

NAME

     installpatch - apply patch package to Solaris 2.x system
     backoutpatch - remove patch package from Solaris 2.x system

SYNOPSIS

     installpatch [-u] [-d] <patch directory>
     installpatch -p
     backoutpatch <patch-id>

DESCRIPTION

     These installation and backout utilities apply only to
     Solaris 2.x associated patches. They do not apply to Solaris
     1.x associated patches. These utilities are currently only
     provided with each patch package and are not included with
     the standard Solaris 2.x release software.

OPTIONS

    installpatch

	-u  unconditional install, do not verify file attributes

	-d  do not save original files being replaced

	-p  print a list of the patches currently applied on the system

DIAGNOSTICS

    Patch Installation Errors:
    --------------------------

    Error message: Patch has already been applied.

      Explanation and recommended action: This patch has already been
	applied to the system.  If the patch has to be reapplied
	for some reason, back out the patch and then reapply it.

    Error message: This patch is obsoleted by a patch which has
	already been applied to this system.  Application of this
	patch would leave the system in an inconsistent state.
	Patch installation is aborted.

      Explanation and recommended action: Occasionally, a patch
	is replaced by a new patch which incorporates the bug fixes
	in the old patch and supplies additional fixes also.  At
	this time, the earlier patch is no longer made available
	to users.  The second patch is said to "obsolete" the
	first patch.  However, it is possible that some users
	may still have the earlier patch and try to apply it to
	a system on which the later patch is already applied.
	If the obsoleted patch were allowed to be applied, the
	additional fixes supplied by the later patch would no
	longer be available, and the system would be left in an
	inconsistent state.  This error message indicates that
	the user attempted to install an obsoleted patch.  There
	is no need to apply this patch because the later patch
	has already supplied the fix.

    Error message: The packages to be patched are not installed on
	this system.

      Explanation and recommended action:  None of the packages
	to be updated by this patch are installed on the system.
	Therefore, this patch cannot be applied to the system.

    Error message: This patch is not applicable to client systems.

      Explanation and recommended action: The patch is only
	applicable to servers and standalone machines.  Attempting
	to apply this patch to a client system will have no effect on
	the system.

    Error message: The /usr/sbin/pkgadd command is not executable.

      Explanation and recommended action:   The /usr/sbin/pkgadd
	command cannot be executed.  The most likely cause of this
	is that installpatch is being run on a diskless or dataless
	client and the /usr file system was not exported with
	root access to the client.  See the section above on
	"Instructions for installing a patch on a diskless or
	dataless client".

    Error message: Patch directory is not of expected format.

      Explanation and recommended action: The patch directory
	supplied as an argument to installpatch did not contain
	any patch packages.  Verify that the argument supplied
	to installpatch is correct. 

    Error message: The following validation errors were found:
	           <validation error(s)>

      Explanation and recommended action: Before applying the patch,
	the patch application script verifies that the current
	versions of the files to be patched have the expected
	fcs checksums and attributes.  If a file to be patched has
	been modified by the user, the user is notified of this
	fact.  The user then has the opportunity to save the
	file and make a similar change to the patched version.
	For example, if the user has modified /etc/inet/inetd.conf
	and /etc/inet/inetd.conf is to be replaced by the patch,
	the user can save the locally modified /etc/inet/inetd.conf
	file and make the same modification to the new file
	after the patch is applied.  After the user has noted all
	validation errors and taken the appropriate action for
	each one, the user should re-run installpatch using
	the "-u" (for "unconditional") option. This time, the
	patch installation will ignore validation errors and
	install the patch anyway.

    Error message:  Insufficient space in /var/sadm to save old files.

      Explanation and recommended action:  There is insufficient
        space in the /var/sadm directory to save old files. 
	The user has two options for handling this problem: 
	(1) generate additional disk space by deleting unneeded
	files, or (2) override the saving of the old files by
	using the "-d" (do not save) option when running installpatch.
	However if the user elects not to save the old versions of
	the files to be patched, backoutpatch CANNOT be used.

	One way to regain space on a system is to remove the
	save area for previously applied patches.  Once the user
	has decided that it is unlikely that a patch will be
	backed out, the user can remove the files that were saved
	by installpatch.  The following commands should be executed
	to remove the saved files for patch xxxxxx-yy:

	cd /var/sadm/patch/xxxxxx-yy
	rm -r save/*
	rm .oldfilessaved

	After these commands have been executed, patch xxxxxx-yy can
	no longer be backed out.
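
	The same cleanup can be wrapped in a small function that refuses
	to run when the save area is missing. This is a minimal sketch;
	drop_patch_save is a hypothetical helper, and its optional second
	argument (an alternative patch directory root) exists only so the
	function can be tried out away from /var/sadm/patch.

```shell
# drop_patch_save PATCH_ID [PATCH_ROOT]
# Remove the saved files for one patch, as in the three commands above.
# PATCH_ROOT defaults to /var/sadm/patch. Hypothetical helper.
# After it succeeds, the patch can no longer be backed out.
drop_patch_save() {
    dir=${2:-/var/sadm/patch}/$1
    if [ ! -d "$dir/save" ] || [ ! -f "$dir/.oldfilessaved" ]; then
        echo "no saved files found under $dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && rm -r save/* && rm .oldfilessaved )
}
```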

    Error message:  Save of old files failed.

      Explanation and recommended action:  Before applying the patch,
	the patch installation script uses cpio to save the old
	versions of the files to be patched.  This error message
	means that cpio failed.  The output of cpio should
	have preceded this message.  The user should
	take the appropriate action to correct the cpio failure.
	A common reason for failure will be insufficient disk
	space to save the old versions of the files.  The user
	has two options for handling insufficient disk space:
        (1) generate additional disk space by deleting unneeded
        files, or (2) override the saving of the old files by
        using the "-d" option when running installpatch. However
        if the user elects not to save the old versions of the
        files to be patched, the patch CANNOT be backed out.

    Error message: Pkgadd of <pkgname> package failed.  See
	       /tmp/log.<patchnum> for reason for failure.

      Explanation and recommended action:  The installation of one of
	the patch packages failed.  Any previously installed packages
	in the patch should have been removed.  See the log file
	for the reason for failure.  Correct the problem and
	re-apply the patch.

    Error message: error while adding patch to root template

      Explanation and recommended action:  The install script
	determined this system to be a client server.  The attempt 
	to apply the patch package to the appropriate root
	template space located under /export/root/templates
	failed unexpectedly.  Check the log file for any failure
	messages.  Correct the problem and re-apply the patch.


    Patch Backout Errors:
    ---------------------

    Error message:  Patch <patchnum> has not been applied to this system.

      Explanation and recommended action:  The user has attempted to back
	out a patch that was never applied to this system.  It is
	possible that the patch was applied, but that the patch
	directory /var/sadm/patch/<patchnum> was deleted somehow.
	If this is the case, the patch cannot be backed out.  The
	user may have to restore the original files from the
	initial installation CD.

    Error message:  Patch <patchnum> was installed without backing up the 
		original files.  It cannot be backed out.

      Explanation and recommended action:  Either the -d option of
	installpatch was set when the patch was applied, or the save
	area of the patch was deleted to regain space.  As a result, the
	original files are not saved and backoutpatch cannot be used.  The 
	original files can only be recovered from the original 
	installation CD.

    Error message: Pkgrm of <pkgname> package failed.  See
	       /var/sadm/patch/<patchnum>/log for reason for failure.

      Explanation and recommended action:  The removal of one of
	the patch packages failed.  See the log file
	for the reason for failure.  Correct the problem and
	run the backout script again.

    Error message:  Restore of old files failed.

      Explanation and recommended action:  The backout script uses the
	cpio command to restore the previous versions of the files
	that were patched.  The output of the cpio command should
	have preceded this message.  The user should take the
	appropriate action to correct the cpio failure.

KNOWN PROBLEMS:

     On client server machines the patch package is NOT applied
     to existing clients or to the client root template space.
     Therefore, when appropriate, ALL CLIENT MACHINES WILL NEED
     THE PATCH APPLIED DIRECTLY USING THIS SAME INSTALLPATCH
     METHOD ON THE CLIENT.  See instructions above for
     applying patches to a client.
 
     After a patch package has been installed pkginfo(1) will
     not recognize the SUNW_PATCHID macro in the patch package
     pkginfo file.  Instead, to identify patches installed on
     the system use the grep command method described in the
     patch README.

     The pkgadd command shipped with Solaris 2.1 fails (drops core
     without any error message) when there are more than 100
     entries in the /etc/mnttab file.  This means that installpatch
     can fail, because it uses pkgadd.  Since this is very likely on
     any big system with lots of automounts, ANY patch could fail.
     Applying patch 100901-01 fixes this problem (the README for 
     patch 100901 mentions shutting down the automounter while 
     applying it).
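
     Whether a system is exposed to the pkgadd problem can be estimated
     by counting the entries in /etc/mnttab before running installpatch.
     The following is a minimal sketch; mnttab_entries and pkgadd_at_risk
     are hypothetical helpers, and the sketch assumes one entry per line
     and a POSIX shell.

```shell
# mnttab_entries [FILE]: print the number of entries (lines) in a
# mnttab-format file. FILE defaults to /etc/mnttab.
# Hypothetical helper; assumes one mounted file system per line.
mnttab_entries() {
    # Redirect the file so wc prints only the number, then strip padding.
    wc -l < "${1:-/etc/mnttab}" | tr -d ' '
}

# pkgadd_at_risk [FILE]: succeed if the file has more than 100 entries,
# the condition under which the Solaris 2.1 pkgadd is said to fail.
pkgadd_at_risk() {
    [ "$(mnttab_entries "$1")" -gt 100 ]
}
```

     A possible use:

	# pkgadd_at_risk && echo "more than 100 mnttab entries; see patch 100901-01"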

SEE ALSO
     pkgadd(1), pkgchk(1), pkgrm(1), pkginfo(1)