Patch-ID# 103935-57
Keywords: security y2000 libsocket automountd libc sockmod kadb ufs nfs watchdog
Synopsis: SunOS 5.4 CS6400: jumbo patch for kernel
Date: Apr/29/98

Solaris Release: 2.4 CS6400

SunOS release: 5.4 CS6400

Unbundled Product:

Unbundled Release:

Xref: This patch available for non-CS6400 sparc as patch 101945
Xref: This patch available for x86 as patch 101946

Topic: SunOS 5.4 CS6400: jumbo patch for kernel


       NOTE 1: To get the full benefit from the timezone logic correction 
               introduced by this Kernel Update (refer to 4011495: 'zoneinfo' 
               summertime/wintertime switchover anomaly) you also need to 
	       install patch 102595-02 (or newer).  This patch updates
	       the files in /usr/share/lib/zoneinfo. 

       NOTE 2: If this patch is applied to a system installed with the entire
               configuration, the default save option of the installpatch 
	       utility will use approximately 20 MB of free space in /var.

       NOTE 3: You also need to apply patch 102113 if you are printing on a
               NeWSprint printer.  Patch 102113 prevents a hang from occurring
               when printing.

       NOTE 4: If this patch is applied to a server, it should also be applied
               to any dataless clients that mount /usr from that server.
               Failure to do so will generate this error message when openwin
               is started on the client:
                        "Binding Unix socket: Invalid argument".

       NOTE 5: Due to bugfixes 4026740, 4058892, 4058904 and 4059736 in
               101945-53, it is recommended that you install the following
               patches:
               105314-01 (or newer)   kernel/exec/elfexec patch
               105318-01 (or newer)   usr/bin/gcore patch

       NOTE 6: This patch accumulates all fixes from 101945-57 and 104062-02.

Cray SPR's fixed with this patch:   78722 83906 86962 86973 87182 87190 87401 
87473 88067 88366 88934 89507 89851 89899 89908 90854 91072 91079 91328 91672 
92110 92203 93474 94013 94543 94716 95162 96566 96130 96199 98286 99498 99870 
100099 101371 102992 103157 103170 103416 104451 105824

Cray SPR changes incorporated in this version:

BugId's fixed with this patch:  1120225 1124354 1130786 1130791 1143434 1143479 1145457 1150556 1151364 1151509 1151955 1152710 1152922 1155298 1157053 1158574 1159330 1159986 1160112 1162269 1162834 1163335 1163511 1164319 1164519 1164679 1164800 1165675 1165687 1166712 1166779 1166848 1167235 1168398 1169686 1169775 1169823 1169909 1170832 1170862 1171008 1171478 1172009 1172118 1172242 1172243 1172245 1172260 1172710 1172731 1172926 1172979 1172998 1173212 1173301 1173309 1173626 1173969 1173973 1174222 1174572 1174738 1174786 1174830 1174847 1174851 1174913 1175018 1175044 1175115 1175127 1175176 1175304 1175356 1175368 1175478 1175499 1175668 1175829 1175931 1175968 1176247 1176467 1176508 1176618 1176845 1177091 1177100 1177119 1177228 1177469 1177516 1177572 1177578 1177600 1177620 1177644 1177862 1178114 1178128 1178190 1178236 1178295 1178363 1178391
BugId's fixed with this patch:  1178400 1178407 1178506 1178641 1178753 1178761 1178824 1178835 1178889 1178898 1178957 1178985 1179258 1179311 1179403 1179480 1179625 1179738 1179884 1180414 1180578 1180819 1181009 1181201 1181258 1181259 1182051 1182105 1182158 1182458 1182492 1182509 1182597 1182686 1183120 1183215 1183343 1183552 1183568 1183662 1183837 1184134 1184256 1184636 1184991 1185149 1185694 1185775 1186156 1186224 1186287 1186420 1186557 1186805 1186845 1187322 1187536 1187901 1187948 1188259 1188287 1188307 1188367 1188399 1188464 1188701 1188790 1188906 1189271 1189329 1189389 1189511 1189590 1189592 1189967 1189968 1191078 1191422 1191457 1192162 1192238 1192309 1192982 1193007 1193066 1193448 1193696 1193721 1193801 1194263 1194355 1194613 1194878 1194923 1194928 1195432 1195436 1195437 1195797 1195904 1197042 1197596 1197646 1197708 1197979
BugId's fixed with this patch:  1198215 1198278 1198439 1198966 1199124 1199164 1199500 1199579 1199624 1200224 1200502 1200734 1200912 1201471 1201926 1202070 1202675 1203132 1203471 1204479 1204575 1205200 1205240 1205409 1205614 1205731 1205797 1206384 1206598 1206642 1206850 1207181 1207669 1207954 1208034 1208053 1208241 1209012 1209014 1209452 1209687 1209917 1210314 1210355 1210713 1210830 1211022 1211172 1211278 1211537 1211555 1211904 1212472 1212910 1213782 1213871 1213874 1214038 1214043 1214057 1214320 1215792 1216540 1217050 1217220 1217941 1218562 1218578 1218997 1219020 1219295 1219671 1219766 1220257 1220275 1220400 1220411 1220811 1220886 1220902 1220995 1221608 1221620 1221966 1222086 1222599 1222745 1222780 1222902 1223163 1223374 1223632 1223853 1223882 1223900 1223949 1224074 1224089 1224148 1224298 1224486 1224604 1224737 1226653 1226919 1226938 1227376 1227426 1227580 1229031 1229082 1229805 1229843
BugId's fixed with this patch:  1231720 1231871 1231997 1232577 1232825 1232838 1232866 1232869 1233049 1233088 1233719 1233827 1234450 1234879 1235099 1236149 1238343 1238559 1238582 1239343 1240151 1240331 1241056 1241118 1241282 1241611 1242188 1242481 1243116 1244088 1244917 1244971 1245077 1245291 1245300 1245602 1245703 1246302 1246408 1247172 1248090 1248446 1248840 1249319 1249667 1249829 1249985 1250127 1250652 1250848 1250937 1251423 1252953 1252967 1253223 1255435 1255536 1255623 1256153 1256610 1257205 1258151 1259279 1260593 1260769 1260873 1260959 1261245 1261934 1262082 1262095 1262096 1262660 1262694 1265000 1265447 1265930 1266278 1267082 4004147 4004575 4007937 4009069 4010565 4011495 4011648 4015367 4015495 4016973 4017242 4019260 4022354 4022478 4022642 4026339 4026740 4028300 4029971 4030158 4030669 4032974 4033392 4034353 4034355 4034868 4035845 4036063 4036589 4036676 4036746 4038317 4039071 4039165 4043953 4039792 4040476 
BugId's fixed with this patch:  4045229 4045522 4045941 4048895 4050818 4050892 4057818 4058892 4058904 4059632 4059736 4066789 4091935 4095455 4105997

Changes incorporated in this version: 1195797 1212910 1265000 4045229 4050818 4095455 4105997

Relevant Architectures: sparc.cray4d

Patches accumulated and obsoleted by this patch: 101918-01 101969-07 101971-01 101975-01 101981-02 101983-03 102002-05 102007-02 102020-07 102024-02 102119-01 102137-01 102169-01 102216-07 102224-10 102358-01 102509-08 102926-01 103575-01 104062-02 101945-41 641021-01 641022-02 641032-01 641034-01 641039-01

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/etc/fs/nfs/mount
/etc/lib/unix_scheme.so.1
/etc/name_to_sysnum
/kadb
/kernel/drv/arp
/kernel/drv/cn
/kernel/drv/dr
/kernel/drv/esp
/kernel/drv/icmp
/kernel/drv/ip
/kernel/drv/isp
/kernel/drv/logindmux
/kernel/drv/logindmux.conf
/kernel/drv/partn
/kernel/drv/sbi
/kernel/drv/sd
/kernel/drv/sx
/kernel/drv/tcp
/kernel/drv/tl
/kernel/drv/udp
/kernel/fs/autofs
/kernel/fs/cachefs
/kernel/fs/lofs
/kernel/fs/nfs
/kernel/fs/procfs
/kernel/fs/tmpfs
/kernel/fs/ufs
/kernel/misc/dlma
/kernel/misc/klmmod
/kernel/misc/strplumb
/kernel/misc/swapgeneric
/kernel/misc/tlimod
/kernel/sched/TS
/kernel/strmod/arp
/kernel/strmod/rlmod
/kernel/strmod/rpcmod
/kernel/strmod/sockmod
/kernel/strmod/telmod
/kernel/strmod/timod
/kernel/sys/c2audit
/kernel/sys/kaio
/kernel/sys/nfs=../../kernel/fs/nfs
/kernel/unix
/platform/CYRS,Superserver-6400/kernel/drv/esp
/platform/CYRS,Superserver-6400/kernel/drv/ip
/platform/CYRS,Superserver-6400/kernel/drv/isp
/platform/CYRS,Superserver-6400/kernel/drv/sd
/platform/CYRS,Superserver-6400/kernel/drv/tcp
/platform/CYRS,Superserver-6400/kernel/fs/nfs
/platform/CYRS,Superserver-6400/kernel/misc/hswp
/platform/CYRS,Superserver-6400/kernel/sched/RT
/platform/CYRS,Superserver-6400/kernel/sched/TS
/platform/CYRS,Superserver-6400/kernel/sys/nfs
/sbin/ifconfig
/sbin/init
/sbin/su
/sbin/sulogin
/usr/4lib/libc.so.1.8
/usr/4lib/libc.so.2.8
/usr/bin/csh
/usr/bin/ftp
/usr/bin/su
/usr/include/security/ia_appl.h
/usr/include/sys/aio.h
/usr/include/sys/aio_req.h
/usr/include/sys/asynch.h
/usr/include/sys/cpuvar.h
/usr/include/sys/ddi_impldefs.h
/usr/include/sys/disp.h
/usr/include/sys/errno.h
/usr/include/sys/partn.h
/usr/include/sys/proc.h
/usr/include/sys/sema_impl.h
/usr/include/sys/sockmod.h
/usr/include/sys/stropts.h
/usr/include/sys/strsubr.h
/usr/include/sys/thread.h
/usr/kernel/sched/RT
/usr/kvm/adb
/usr/kvm/crash
/usr/kvm/lib/adb/sema
/usr/kvm/libkvm.a
/usr/kvm/libkvm.so.1
/usr/kvm/prtdiag
/usr/lib/autofs/automountd
/usr/lib/fs/autofs/automount
/usr/lib/fs/nfs/inetboot
/usr/lib/fs/nfs/mount
/usr/lib/fs/nfs/umount
/usr/lib/libaio.so.1
/usr/lib/libauth.a
/usr/lib/libauth.so.1
/usr/lib/libc.a
/usr/lib/libc.so.1
/usr/lib/libp/libc.a
/usr/lib/libsocket.a
/usr/lib/libsocket.so.1
/usr/lib/libthread.so.1
/usr/lib/libthread_db.so.0
/usr/lib/pics/libc_pic.a
/usr/lib/security/unix_scheme.so.1
/usr/lib/utmp_update
/usr/sbin/dr_daemon
/usr/sbin/ifconfig
/usr/sbin/in.ftpd
/usr/sbin/in.rlogind
/usr/sbin/in.telnetd
/usr/sbin/init
/usr/sbin/static/rcp
/usr/share/src/uts/cray4d/sys/kmem.h
/usr/share/src/uts/cray4d/sys/kmem_impl.h
/usr/share/src/uts/cray4d/sys/machparam.h
/usr/share/src/uts/cray4d/sys/mman.h
/usr/share/src/uts/sun4d/sys/clock.h
/usr/share/src/uts/sun4d/sys/physaddr.h
/usr/ucblib/libucb.a
/usr/ucblib/libucb.so.1
/usr/src/uts/cray4d/vm/dr_cage.c


Problem Description:

(from 101945-57)

4105997 Y2000 tm_test01 fails with current S2.5.1 strptime()
4045229 strptime and getdate year calculation not count century; strptime range checks
4050818 getdate %C (century) should use current year offset if year offset not given
1265000 "panic: kernel heap corruption detected" while running TStrans (high/long)
1212910 libaio test hangs and/or panics multi-processor systems reproducibly
1195797 Seagate 535M drive failed during Unix boot with "polled command timeout"
4095455 automounter security problem
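As a rough illustration of the two-digit-year problem behind 4105997,
4045229 and 4050818 above, the sketch below resolves a %y value with and
without an explicit %C century.  The pivot used when no century is given
(69-99 map to 19xx, 00-68 map to 20xx) is the POSIX strptime convention.
The function name resolve_year and the current_year default are
hypothetical, and this is not the code shipped in the patch.

```python
def resolve_year(yy=None, century=None, current_year=1998):
    """Combine a two-digit year and an optional century (hypothetical helper).

    Assumptions, not taken from the patch itself:
      * with no century, use the POSIX pivot: 69-99 -> 19xx, 00-68 -> 20xx
      * with a century but no year offset, reuse the current year's offset
    """
    if yy is None:
        # No %y given: fall back to the offset of the current year.
        yy = current_year % 100
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year out of range: %r" % (yy,))
    if century is not None:
        if not 0 <= century <= 99:
            raise ValueError("century out of range: %r" % (century,))
        return century * 100 + yy
    return (1900 if yy >= 69 else 2000) + yy


if __name__ == "__main__":
    for yy in (98, 99, 0, 1):
        print(yy, "->", resolve_year(yy))
```

A caller that supplies both fields, for example resolve_year(0, century=19),
bypasses the pivot entirely, which is why an explicit century removes the
ambiguity.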

(from 101945-56)

4091935 bcp /usr/4lib/libc mktime() fails for specific -ve values in tm structure
4045941 bcp /usr/4lib/libc mktime() doesn't care leap year.
4015367 Solaris 2.5 cannot handle crash dump bigger than 2GB

(from 101945-55)

4033392 ss20 -150 hard hang while accessing mirrored disk

(from 101945-54)

4059632 Kernel watchdog resets with misaligned stack
1223882 Neutron r hangs reproducibly for a back to back NFS I/O load

(from 101945-53)

1226919 ping -sv -i 127.0.0.1 224.0.0.1 causes a panic
4026740 assert failure in segnf_gettype: seg->s_base == addr
4032974 system hangs when lbolt wraps around.
4043953 kernel randomly panicked with assertion failure in callout.c, line 345
4048895 port deadman functionality to Solaris 2.4 for the sun4d architecture
4057818 panic due to procfs access of non-existent mapping
4058892 as_getprot() needs to report real size of ISM segments
4058904 accessing addresses in ISM segments between "real" end and "segment" end loop
4059736 as_memory() does not dump ISM segments
4066789 SD Driver is issuing BDR's on 2.4 and 2.5.

(from 101945-52)

4035845 do_unmount can hang while an NFS server is down
4015495 connect deadlock when ephemeral port is same as dest port
4010565 su can be interrupted by <control-C> and not logged in /var/adm/log

(from 101945-51)

4050892 init_swift_idle_cpu() should not search OBP for property
4045522 need to complete the fix of 1219295
4045229 strptime and getdate year calculation not count century; strptime range checks
4038317 Panic Deadlock condition occurs between relvm and ia_set_process_group.
4036589 mt application hangs if last pthread_create is allowed to exit
4030158 bcp /usr/4lib/libc strptime format %y does not handle year 2000 or later
4026339 /usr/ucb/ps hangs while trying to get anonmap serial_lock in segvn_fault()
1265930 Solaris 2.4: Daylight saving with 101945-37 and higher put off one hour
4011495 'zoneinfo' summertime/wintertime switchover anomaly
4039792 mktime/localtime are broken and need to be fixed
1246408 ftp may be used to get root access from port 20 to other machines
4040476 in.rlogind security problem
1229082 sd does not set resid properly when a partial xfer during recovered error happens

(from 101945-50)

4039165 A broken TLI application can panic the system during T_CONN_REQ
4039071 Interrupt of passwd command can corrupt password
4036063 security problem with writing core files
4022354 kill -9 can not kill application thread in cv_wait called from getandset()
4019260 icmp panic on mi_free() in icmp_close() in 2.5
4007937 Processes hang accessing files over NFS in clnt_tli_kcreate()
1238582 privileged ifconfig ioctls by normal user succeed on sockets created as root

(from 101945-49)

4036746 mcs command fails when no physical disk swap
4036676 sockmod queues MSG_OOB and MSG_PEEK ioctls (no-26)
4034868 Security hole: buffer overflow in bin password. get the effect uid of 0 ( root )
4034355 Solaris ignores KEEPALIVE probes without data
4034353 TCP/IP sometimes forgets PSH after sending URG.
4032974 system hangs when lbolt wraps around.
1267082 nfs mount command should ignore bad options and complete mount
1262082 2.5.1 sun4d hangs w/kernelmap fragmentation
1250848 SC2000 with 85 MHz CPU routinely panic/watchdog in srmmu_tlbflush(

(from 101945-48)

4029971 getopt security problem
4028300 automounter security hole
4022478 bad trap Memory address alignment in tl_discon_ind
4017242 PDB-Systemcrash with BAD TRAP under 101945-41

(from 101945-47)

4022642 TLI application causes Data Fault system panics
4016973 panic in ufs_trans_syncip(2.4)
4009069 2.5 TCP generates wrong checksum and never recovers from error
4004575 High mutex hits, slow performance when c2auditing enabled
4004147 panics in segkp_load when the file command is run
1265447 SYSTEM HANG, CLOCK THREAD IN MUTEX_ENTER WAITING FOR ANOTHER LOCK
1252967 2.3 NFS server can not handle the locking state correctly.
1250937 NFS server can crash NFS client by sending bogus stat() data
1248090 getwd very slow over nfs to 4.1.3 server
1245291 Bug in libthread.so(cond_timedwait()) and libposix4.so(sigtimedwait) in 2.4,2.5
1212472 nanosleep returns too early

(from 101945-46) 

4011648 Fix for bug 1248840 introduces performance degradation in rsh
1262096 IP believes it has correctly reassembled a packet, but one fragment not received
1262095 Memory address alignment panic in pi_willto()
1260593 /usr/ucb/ps hang in rw_enter while other thread wedged in page_lock
1261934 /proc must disallow access to pages not in the address space map
1255536 2.5 data fault panic in tdiraddentry accessing tmpfs
1249985 "deadman" doesn't work correctly on MP systems.
1233088 ioctl(PIOCPSINFO) is 100 times too slow on multi-threaded processes
1226653 IP can send packets larger than MTU size to the driver

(from 103935-45)

4030669 "Deadlock condition" panic with new -45 JKP

(from 103935-43)
 
105284: 2.4 PATCH FOR SPR 105662: DATA CORRUPTION WITH DR-MEM-DETACH ENABLED
 
There are 2 symptoms, neither of which has ever been seen.  These
problems were found in 2.5 and are being retrofitted into 2.4 (see
105825).  The two problems are data corruption and assertion failure -
ASSERT(pp->p_vnode).  Note that asserts are only enabled in debug mode
kernels.
 
(from C101945-41)
 
103170: 2.4 PATCH FOR SPR 103156: DATA FAULT PANIC IN IDLE ROUTINE - SEE SPR 101309
 
An interrupt window allowed a corrupt stack pointer that resulted in a
data fault panic in the idle loop.
 
(from C101945-39)
 
104451: 2.4 PATCH: DR DAEMON INCORRECTLY UPDATES ALTERNATE PATHING PATHGROUP FLAGS
 
When a board containing Alternately Pathed controllers is DR attached
or detached, DR may fail to correctly update the attach/detach flags
for the corresponding Alternate Path pathgroup(s) containing the
controllers.
 
(from C101945-38)
 
103416: 2.4 PATCH FOR SPR 103392: DR DETACH CAUSES A WATCHDOG SYNC WHEN THE BOOT DEVICE
 
Previously, when the boot device was under AP and VxVM control,
attempting a DR detach caused a watchdog sync as a result of a stack
overflow.  This patch corrects the problem by loading the RT module
when the DR module is loaded, which reduces the number of calls on the
stack at the point at which the overflow occurred.
 
103157: 2.4 PATCH FOR SPR 102927: PANIC IN CHECK_REALTIME_THREADS TRYING TO ACCESS NULL POINTER (T_PROCP)
 
Under heavy load, with very dynamic threads/processes coming and
going, a window exists such that if a DR or Hotswap operation is
executed, it is possible to catch a thread data structure that doesn't
have a properly initialized t_procp field, resulting in the Hotswap
quiesce code attempting to access an unexpected NULL pointer.  This
window exists in the thread_create code.
 
(from C101945-37)
 
102992: RECURSIVE MUTEX ENTER PANIC
 
The symptom will be a "recursive mutex enter" panic from the procedure
"srmmu_pageinvalidate". This is a fix to a bug introduced in the fix
for SPR 99870.
 
(from C101945-36)
 
96199: PANIC - RECURSIVE MUTEX_ENTER AND THEN PANIC SYNC TIMEOUT; 2.4 PATCH FOR 99844
 
It is possible with multi-threaded user applications to get recursive
mutex_enter panics while calling various user-level LWP routines.  This
panic is due to the mismanagement of cpu context in the LWP system
calls.
 
99498: HEARTBEAT FAILURE WHILE RUNNING RANDPTN ON 101945-35
 
There was a mutex deadlock condition in the memory partitioning code.
This was hit while moving processors from one partition to another.
 
99870: 2.4 PATCH FOR SPR 99353: SYSTEM PANIC: SRMMU_UNLOCK ORACLE RUNNING AT THE TIME
 
Fixes incorrect flushing of pages in dr_flush which can fail if the
pages have their mappings locked.
 
100099: 2.4 PATCH FOR SPR 99967: WHEN LOADING DR DRIVER, CS6400 PANICS CS6400 WITH DATA FAULT
 
Performing a modload of the DR driver module (/kernel/drv/dr) will
cause a DATA FAULT panic if the system contains memory with holes, i.e.
bad SIMMs.  The workaround is to either repair the bad SIMMs or
blacklist the memory group containing them.
 
101371: 2.4 PATCH FOR SPR 100404: PANIC: SRMMU_PTELOAD - PTE REMAP PANIC WITH JKP-36
 
This fixes a panic that was encountered during internal stress testing
of an earlier build of 101945-36 prior to release.  It is caused by
occasional conditions that exist when the dr_detach logic relocates pages.
 
(from C101945-35)
 
98286: 2.4 PATCH FOR SPR 96374: SSA DISKS ARE NOT ACCESSIBLE AFTER DR BOARD RE/ATTACH O
 
Because OBP no longer rebuilds the entire devinfo tree (in order to
speed boot), DR becomes confused about what was available.
 
(from C101945-34)
 
96566: 2.4 PATCH FOR SPR 96509: BAD TRAP: MEMORY ADDRESS ALIGNMENT (SUSWORD+0X54)
96130: 2.4 PATCH FOR SPR 95922: EXECUTABLE HANGS A CPU - CANNOT SWITCH PROCESS OFF A CPU - MAY HANG SYSTEM
 
This problem will always produce a "BAD TRAP" message with the detail of
"<Memory address alignment>" with an offset that translates to susword+0x54.
 
93474: 2.4 PATCH SPR FOR DR_DAEMON AP SUPPORT
91328: THE META DEVICE DATABASE IS NOT DETECTED AS DEVICE USAGE BY THE DR DAEMON
 
The AP interface code is not enabled in the dr_daemon.  This patch is
needed for AP and DR to operate together.  The dr_daemon does not
report the location of the Sun Online DiskSuite database partitions.
 
92110: DR AND "EXTENDED KERNEL MEMORY" DO NOT SUPPORT DLM (DISTRIBUTED LOCK MANAGER)
 
Virtual memory used by Cray's Extended Kernel Memory overlaps with that
used by the Distributed Lock Manager.  In order to support both features
successfully, the fix submitted by this Patch-RTI is required.
 
(from C101945-32)
 
95162: KMEM_AVAIL() NEEDS TO REPORT BYTES AVAILABLE INSTEAD OF PAGES (LIMIT OF 1GB)
 
The original kmem_avail() routine did not properly support large
memories (>4GB) and could result in hung systems due to the system
believing that memory was low.  Cray repaired the routine to instead
return available memory in pages; however, this caused some 3rd party
drivers that used kmem_avail() to not work properly (note that kmem_avail
is _not_ DDI/DKI).  The solution, based on the general usage of kmem_avail,
was to have kmem_avail return bytes, but with a maximum of 1GB.
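The trade-off described above can be modelled in a few lines.  This is a
minimal sketch, not the kernel code: it assumes a 4096-byte page and
assumes that the large-memory failure was a byte count wrapping in a
32-bit value, which the description does not state explicitly.  The
function names are hypothetical.

```python
PAGESIZE = 4096            # assumed page size in bytes
UINT32_MASK = 2**32 - 1    # models a 32-bit return value (assumption)
ONE_GB = 1 << 30           # cap named in the description


def avail_bytes_32bit(free_pages):
    """Byte count truncated to 32 bits; wraps once free memory passes 4GB."""
    return (free_pages * PAGESIZE) & UINT32_MASK


def avail_bytes_capped(free_pages):
    """Byte count clamped to 1GB, so it can never wrap or look 'low'."""
    return min(free_pages * PAGESIZE, ONE_GB)


if __name__ == "__main__":
    # 4GB + 8MB of free memory, expressed in pages.
    pages = (4 * ONE_GB + 8 * 1024 * 1024) // PAGESIZE
    print("wrapped:", avail_bytes_32bit(pages))
    print("capped: ", avail_bytes_capped(pages))
```

Under these assumptions a caller that only asks "is there at least N bytes
free?" keeps working with the capped value as long as N is below 1GB,
which matches the "general usage" argument in the text.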
 
(from C101945-29)
 
92203: 2.4 PATCH FOR SPR 91630: WATCHDOG: ADDRESS FAULT ON STACK ADDRESS, STORE TO SUPERVISOR SPACE, LEVEL 3
 
Panics and watchdogs have been observed for specific versions of
Cray Solaris. A misoperation of the SuperSPARC processor caused
these under very limited circumstances; this patch prevents
these conditions.
 
94716: 2.4 PATCH FOR SPR 94661: WATCHDOG - SYSTEM THREAD WITH ONE PAGE STACK
 
The system can watchdog reset when interrupts are serviced on the stack
of certain system threads that only have a 1 page stack.
 
94543: 2.4 PATCH FOR SPR 93485: DEADLOCK CONDITION DETECTED: CYCLE IN BLOCKING CHAIN
 
When an lwp library synchronization routine is called (e.g. _lwp_cond_wait,
_lwp_mutex_lock, _lwp_sema_p, etc) the kernel may panic with the message:
"Deadlock condition detected: cycle in blocking chain."
 
94013: RMSCER Phase I
 
(from C101945-27)
 
91079: KMEM_CACHE_KSTAT_UPDATE PANIC
 
System panics in either socket logic or nfs logic with a data fault or
illegal address.
 
89507: FTP OF LARGE BINARY FILES ACROSS ETHERNET AND FDDI HAVE BYTE ORDERING PROBLEMS
89899: PANIC: ASSERTION FAILED N <= UIOP->UIO_RESID
90854: RCP FILE TRANSFERS HANG BETWEEN 2 CS6400 SYSTEMS - FDDI AND ETHERNET
 
During TCP based transfers (ftp and rcp) of large files, the files may
sometimes get corrupted.  The corruption takes the form of incorrect
ordering of contiguous portions of the files.  Usually the files will
maintain a consistent "sum" checksum; however, "sum -r" will show the change.
With respect to rcp, the problem is manifested by a "hang" of the rcp
command due to a breakdown in the rcp protocol between the sending and
receiving host.  The protocol breakdown is speculated to be due to misordering
of messages between the two rcp processes.
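The reason "sum" misses this corruption while "sum -r" catches it can be
shown with a small model.  The sketch assumes that the default "sum" is a
purely additive 16-bit checksum and that "sum -r" is the rotating
(BSD-style) checksum; both functions are illustrative reimplementations,
not the utility's source.

```python
def sysv_sum(data):
    """Additive 16-bit checksum: byte order does not affect the result."""
    total = sum(data)
    folded = (total & 0xFFFF) + ((total & 0xFFFFFFFF) >> 16)
    return (folded & 0xFFFF) + (folded >> 16)


def bsd_sum(data):
    """Rotating 16-bit checksum: each byte's effect depends on its position."""
    checksum = 0
    for byte in data:
        # Rotate right by one bit within 16 bits, then add the next byte.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


if __name__ == "__main__":
    first, second = b"A" * 512, b"B" * 512
    original = first + second
    reordered = second + first      # contiguous portions swapped
    print("additive:", sysv_sum(original), sysv_sum(reordered))
    print("rotating:", bsd_sum(original), bsd_sum(reordered))
```

Because addition is commutative, swapping whole blocks leaves the additive
checksum unchanged, so only an order-sensitive checksum exposes this kind
of misordering.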
 
91672: 1 PROCESSOR SYSTEM WILL NOT BOOT WITH 101945-27 KERNEL PATCH
 
Host (OS) hangs during bootup on systems with only one processor.
 
89851: DR DETACH RELEASE (SOLARIS 2.4 MAINTENANCE UPDATE #2)
 
Release of Detach functionality of Dynamic Reconfiguration completing
the entire DR RAS feature.
 
(from C101945-23)
 
88934: PROTECTION PAGEFAULTING LOOP ENCOUNTERED WITH MEMORY STRIDE TEST
91072: MEMORY ADDRESS ALIGNMENT - PANIC
 
This problem is best characterized by excessive protection pagefaulting in user
programs. The region where this commonly happens is above virtual address
0x20000000. This problem can also cause the system to panic due to
possible corruption of kernel page structs, page tables, anon structs &
anonmap structs.
 
88366: DATA FAULT PANIC DURING SYNC AFTER HOSTINT FROM SSP
 
Syncing the system after doing a hostint may result in a panic.
 
(from C101945-22)
 
88067: KADB DOES NOT CONTAIN THE CRS DEFINITIONS OF THE THREAD AND CPU STRUCTURES
 
The cpu and thread adb macros are missing #ifdef _CRS code added for
processor partitioning code.  The Makefile now defines -D_CRS correctly
for this during the build of kadb.
 
(from C101945-17)
 
87401: EXTENDED KERNEL RMAP OVERFLOW UNDER HEAVY LOAD
 
The system was complaining about the extended kernelmap size being
too small.
 
86962: DR APPEARS TO HANG DURING HOLD IOCTL.  MEM ATTACH PANICS IN DR_PAGE_ALLOC
 
Attempts to issue the DR memory drain may hang for long periods, possibly
hours.  This putback only partially addresses this problem; extensive
DR testing is still required before this hang can be considered repaired.
There are other types of hangs that can occur in processes (threads) that
are (primarily) exiting and freeing up chunks of memory (stuck in page_free).
This putback addresses that specific problem.
DR Attaches that occur _after_ previous Detaches may result in panics
during the memory attach (dr_kphysm_init or dr_page_alloc).  Error messages
regarding boards which are already memory detached are not clear.
Also addressed are panics that may occur on systems with over 4G of memory.
 
87190: DR DOESN'T SUPPORT EXTENDED_KVSEG.  MEMSCRUB LIST NOT UPDATED. TEXT FAULT PANIC
 
System either panics (TEXT FAULT) or hangs during DR (memory) Detach.
DR does not support the extended kvseg extensions.
 
87182: ARBSTOP: XDBUS REPLY FIFO OVERFLOW & BW GRANT TIMEOUT
 
Machine can arbstop.  An analysis of the arbstopdump will show either
BW reply fifo overflow for the BW associated with XDbus #0 for the
booting CPU, or BW grant timeout for the same BW.
 
83906: MPSTAT REPORTS WRONG NUMBER OF CPUS
 
Mpstat cannot obtain stats on cpus that were put online via the psradm command.
 
86973: KMEM ALLOC EXTENSION TO SUPPORT ALLOCATION FROM DIFFERENT MEMORY TYPES
 
This fixes the recurring problem of kernel heap space being exhausted under
conditions of heavy swap space utilization. The kernel heap allocation is now
expanded to support different backends. Specifically, the kmem allocator
now supports allocation from a new 2G extended sysmap space in extended
kernel space.
 
(from C102020-05)
 
89908: PANIC: ISP_I_SCSI_PKTFREE: FREEING FREE PACKET
 
A panic would occur when detaching system boards with isp adapters
installed.  The problem is timing related, so it would only happen when
the isp driver was compiled without -DDEBUG.
 
(from C102020-04)
 
89851: DR DETACH RELEASE (SOLARIS 2.4 MAINTENANCE UPDATE #2)
 
DDI_DR_DETACH of sd did not allow detaching of offlined disks.
Also, DDI_RESUME would not resume activity on a low-activity disk
following a suspend.
 
(from C102020-03)
 
87473: PANIC IN SCSI_HBA_DETACH DURING DR DETACH
 
During DR Detach, the host panicked during the detach of the I/O devices
on the detaching board, specifically during the detach of the ISP scsi
controller.
 
(from C102020-02)
 
78722: LIBAIO SHOULD SUPPORT MORE THAN 50 WORKER THREADS
 
The _max_workers variable governs the maximum number of asynchronous threads
that can be created by a process to handle asynchronous IO requests.  These
threads are created by libaio via calls to aioread() and aiowrite().  The
user of these calls, aioread() and aiowrite(), has no knowledge or control
over the number of threads created.  This decision is made by libaio.  When
we allowed the creation of up to 256 worker threads instead of 50 this
yielded a significant increase in the TPC-B throughput.  We can see no
downside to increasing the maximum number of worker threads.  These threads
will spend most of their life waiting on a synchronous IO and therefore
consume very little CPU time.  When the worker threads have no more IO's
on their queues, they simply sleep until the process exits.
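The behaviour described above, where the library rather than the caller
decides how many worker threads exist and idle workers simply sleep, can
be sketched as a capped worker pool.  This is a hypothetical model written
for illustration; the class name CappedWorkerPool and its methods are not
part of libaio, and the real library's thread-creation policy may differ.

```python
import queue
import threading


class CappedWorkerPool:
    """Hypothetical model: start one worker per request up to max_workers."""

    def __init__(self, max_workers):
        self.max_workers = max_workers
        self.requests = queue.Queue()
        self.workers = []
        self.lock = threading.Lock()

    def submit(self, func, *args):
        """Queue a request; the pool, not the caller, decides on new threads."""
        self.requests.put((func, args))
        with self.lock:
            if len(self.workers) < self.max_workers:
                worker = threading.Thread(target=self._run, daemon=True)
                self.workers.append(worker)
                worker.start()

    def _run(self):
        # An idle worker blocks in get(), so it consumes almost no CPU time.
        while True:
            func, args = self.requests.get()
            try:
                func(*args)
            finally:
                self.requests.task_done()

    def wait(self):
        """Block until every queued request has been handled."""
        self.requests.join()


if __name__ == "__main__":
    pool = CappedWorkerPool(max_workers=256)
    done = []
    for i in range(1000):
        pool.submit(done.append, i)
    pool.wait()
    print(len(done), "requests handled by", len(pool.workers), "workers")
```

Raising max_workers only raises the ceiling; requests beyond the cap wait
on the shared queue, which is the sense in which a larger cap can help
throughput without costing much CPU.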
 
(from C641034-01)
 
88773: USER THREAD IN KERNEL MODE WITH T_KPRI_REQ NON-ZERO IS IN LEVEL 0 DISPATCH Q
 
In a busy system a user thread executing a system call may appear to be hung.
 
(from C641032-01)
 
88135: THE NUMBER OF ONLINE CPUS IS INCORRECT AFTER THE PARTN DRIVER MOVES CPUS
 
The number of online cpus returned by sysconf(2) is incorrect after cpus are
moved into a partition. Each move of a cpu increases the count by 1.
The variable ncpus_online is incorrect after moving cpus to a partition.
Modified move_cpus_to_partn() to decrement ncpus_online after calling
cpu_remove_active() since it is not done there.
 
(from C641022-02)
 
89467: HOTSWAP THINKS NF IS UNSAFE DEVICE
 
Hotswap driver does not recognize the NPI fddi driver, "nf", as being a
DR safe device and so will not suspend/resume systems containing it.
 
(from C641022-01)
 
86960: DR APPL COMPLAINS THAT PSEUDO DRIVERS ARE DR UNSAFE (DETACH)
 
The DR unsafe device query accessible from the DR Attach/Detach (hostview) GUI
mistakenly reports open pseudo drivers as DR unsafe.  Note that such open
drivers will _not_ prevent an OS Quiesce/Resume from taking place.
 
(from C641021-01)
 
86950: DR_DAEMON: DISABLE AP DEPENDENCIES AND DR DETACH
 
Dynamic Reconfiguration detach and Alternate Pathing (AP) will not be
supported in the initial 2.4 release.  This change disables DR detach and
eliminates dr_daemon dependencies upon the AP library.

(from 101945-45)

1259279 x86 byte ordering/endian problem in ip & arp for DL_UNITDATA_REQ
1241282 ftp session dies on cd or dir
1227580 cannot support high TCP connection rates: noncaput errors reported by the driver
1223900 alarm(2) doesn't work properly with large arguments
1219295 automountd doesn't umount the hsfs filesystem.

(from 101945-44)

1266278 freeing free xxx panic; indirtrunc tries to free the same block twice
1262694 Solaris 2.4 hangs due to memory leak in kmem_alloc-8, kmem_alloc_24 and -40 lea
1261245 window probes can cause ack wars
1260959 Streams information delayed 50-100 ms until dbri driver schedules it
1260873 Kernel memory gets corrupted when sharing and unsharing secure NFS.
1258151 Solaris 2.4 nfs -o noac option not working properly with novell nfs server
1257205 Undefined symbol `getlocale_time' in libc.so.1.8 Kernel Patch 101945-37
1255623 getdate() fails on 1st of month with julian date scripts/
1253223 System running 2.3 with KJP-80 on single CPU /24MB hangs in fork test case
1249319 lo_sync() should not flush the underlying filesystem
1248840 solaris 2.3,sc2000, TCP socket can't handle FIN pkt from client surely, deadloc
1248446 arp cache is not getting updated appropriately
1239343 Threaded application dumped core with multi cpu machine.
1238559 sun4m user process can arbitrarily dump core with kadb
1256153 watchdog after continuing from kadb
1234450 NFS (VOP_WRITE &c) returns EINTR when "intr" is not specified on the mount.
1232838 Backport 1229099: sched: sched()/prioctl()/clock() deadlock during heavy swappi
1211537 mxcc parity check disabling is incorrect in sun4m kernel
1151955 The PASSLENGTH attribute in /etc/default/passwd doesn't appear to work
1255435 ftp dumps core if lostpeer signal handler is called right before getreply()

(from 101945-43)

1256610 strwrite fails to call queuerun on error path: big performance hit
1251423 panic - recursive mutex_enter on lwplock
1250652 system crashed with "trap: unexpected MMU trap"
1249829 connection times out if remote only sends zero-windows
1244971 solaris 2.3, patch 101318-77 has a bug, it can't handle `boot -s` correctly.
1233827 tcp retransmits too much for short connections as seen at web sites
1233049 System hangs when user stops thread writing to ODS logging device
1223374 SS20 w/4 CPUs gets watchdog reset
1219671 Memory is given free which was never allocated before.
1215792 delayed availability of freed diskspace when UFS logging with ODS 4.0/3.0
1245602 Logging UFS is slower than UFS for local writes
1198215 ftp can silently lose data when writing to nfs
1175499 repeated getdate calls leaks memory
1240331 ifconfig of a non-existent virtual interface creates one even as a regular user

(from 101945-42)

1246302 sd,ssd: sd_uselabel() function needs to return error if disk label is bad.
1242188 hang waiting for rwlock with holdcnt of -1 but no owner
1227376 panic "Deadlock condition detected: cycle in blocking chain"
1245703 Deadlock condition detected: cycle in blocking chain
1241056 TL driver panics while servicing tl_ordrel
1221966 telmod writes NULL byte into a zero-length mblk
1250127 kernel memory leak - machine hanging - looks like a problem with streams_msg_15
1244088 SS2000 is completely hanging under heavy I/O - Solaris 2.4 + 101945-36
1233827 tcp retransmits too much for short connections as seen at web sites
1206850 Solaris 2.4, telnet/ftp error in single user mode.

(from 101945-41)

1244917 syslog(3) does not correctly cache the file descriptor that it writes on

(from 101945-40)

1241611 Machine panics with page_vpsub
1232869 paging thresholds are too low on very big systems causing kmem alloc failures
1240151 sockmod corrupts unix socket list due to 2 binds on same socket at same time
1198966 Buggy streams programming causes panic
1242481 panic: ufs_putapage: bn == UFS_HOLE
1238343 hang after installing T101945-37
1172118 hat_share fails during seg_dup if parent's L1 has lost ISM mappings

(from 101945-39)

1233719 system hangs due to idle_q > maxninode and IREF set in head inode of ufs_idle_q
1224298 getXXbyYY using bcp gets SEGV after fix for 1211555
1236149 connect() on AF_UNIX/SOCK_STREAM sockets hangs on Solaris x86 platform
1231997 f77 REWIND makes error : "eor/uio [1010] off end of record" on nfs files
1223853 TCP stream may not go away when process dies
1234879 system panic with auditd: zero divide trap when 101318-75 applied.
1229805 popen assumes maximum number of file descriptors is 256
1220995 directory blocks not counted in quotas
1218997 Simple cp/rm operations hang and cannot be killed
1218578 sd: prtvtoc not reporting bad vtoc on drive
1197979 Recursive use of xdr_pointer could blow autofs unmount thread stack

(from 101945-38)

1232866 bogus pkt_len (short one) causes ip to panic
1229843 2.4 ufs `umount' on an errored device hangs system.
1229031 page_unlock: page not locked panic occurring when locking address space
1224089 NFS writes hang when doing copy over fddi 3.0.1
1208034 SC1000 5.4 Data fault out of tcp_rput_data null pointer under heavy load
1188307 ill_frag_timeout() panic on Intergraph machines when heavily stressed
1224148 TCP performance becomes unacceptable due to bad checksums generated by 2.4
1189967 real-time latency limits exceeded occasionally
1231871 cpu_surrender doesn't check for threads waiting on kp queue

(from 101945-37)

1224737 SC2000 panics in setf() with fd >= u.u_nofiles (setf called by lockd)
1221608 Automounter: NIS+ Searchpath
1223949 automountd disappear from jurassic
1213782 informix server hangs in getmsg()
1224486 sd: there should be retries for both read & write in case of media/hw error
1224604 sd: retries on KEY_ABORTED_COMMAND should eventually be given up
1220811 If a system with swap mirrored crashes it takes hours to write dump
1211555 t_open fails to open /dev/tcp under bcp
1222902 panic in tcp_xmit_ctl because referencing through the wrong mblk.
1220400 lofs becomes confused about where the present working directory "." is
1208053 Implementation of LC_* and other environment variables causes security problem.
1222599 pullupmsg() can corrupt kernel memory and hang CPUs
1217941 Data fault from cron in anon_getpage
1205731 mktime malfunction
1222086 file server hangs when many users exceed quotas
1214320 lwp_cond_wait() syscall returns to user holding a kernel mutex
1186845 wait4() emulation doesn't handle pid < 0
1235099 Using sigprof and libaio will cause program to segfault.
1211172 Automountd fails to unmount lofs file system
1210355 TCP goes into "throttled receive" mode when using multithread
1220902 workaround needed for Viking Hardware Problem

(from 101945-36)
1231720 back out the patch for 1176618; it broke the 101945-35 kernel patch
1227426 system hung while doing dlm related activities

(from 101945-35)

1219766 recv() returns 0 bytes if you close socket too quickly after doing a send()
1210314 ufs_bmap complains about user level errors
1197646 mktime() can not handle negative tm_isdst across year boundary
1224074 Sybase OpenServer client application hangs with 101945-32 or -34
1217220 BSD sockets make SVR4 programs hang
1209917 fddi crashes when MTU is larger than 4352
1188906 mbtowc() dumps core in bcp
1223163 mach_small4m.c needs fix for panic: asynchronous faults
1216540 potential deadlock when process auditing is enabled
1209687 Panic from audit_thread_free with saved path not empty
1222780 bug in ftpd that can cause it to dump core.
1213871 srmmu_alloc panic when system assigns pid to more than max_nproc processes
1209012 hard hang in findmod: The code in the while goes round and round and round ...
1180819 out of per-user processes message should include uid of offending user
1199124 out of per-user processes warning message should not be kern.crit
1219020 any user can hang the system with fork(); only recourse is rebooting
1222745 kernel workaround needed for sun4d systems with 85MHZ voyagers
1189590 should automatically enable multiple-command mode in Sun4D systems with MXCC4.2
1211904 sc1000 running solaris 2.4 and FDDi 3.0.1 panics with "UIOSTR: strgetmsg ..."
1217050 system panics with strread: STRUIO
1208241 core dumps generated by set-uid executables possibly reveal data.
1176618 prog dumps core if you printf very large string 37000
1220257 Syslog(3) possibly can be abused to gain root access on Solaris 2.x systems
1210713 Data fault panic during kernel startup
1206642 utmp_update can be used to make bogus entries in utmp
1175668 automountd consumes all cpu time and loops when cd to automount directory

(from 101945-34)

1220886 patch 101945-32 breaks Informix
1221620 aioread(3) coredumps when ENOMEM is expected
1218562 kernel panics with memory address alignment
1214057 crash provides open fd for /dev/mem
1214043 Makefile.master has wrong RELEASE info in /ws/on494-patch
1213874 bug 1155298 integration port mistake affects AF_UNIX socket bind() error returns
1211278 ufs function quotadq() thrashing on a lock
1210830 After installing T101945-33, Lotus 123 will not run in BCP, Falls over with SEGV
1209452 When user go over quota no messages are printed to the console and messages file
1209014 uprintf can cause panic on modified OBP systems
1207954 TCP data got corrupted when using multithread libraries
1205797 Support on sun4d platform for keeping a kernel thread in CTX 0 across resumes
1205409 sun4m intermittently goes down with PANIC: getdiskquota
1203471 Solaris does not guarantee bounded dispatch latency for RT processes
1203132 lockfs -h and umount of the UFS lying under a loopback file system causes panic
1200224 unreferenced vnodes may persist in active locks table when lockd dies
1193448 autofs unnecessarily blocks requests on already mounted filesystems
1183662 Search for lofs to unmount should stop on first match
1193066 system hangs due to deadlock between u_flock and pidlock
1191422 read on af_unix socket returns 0 when other end does a write(len = 0)
1189511 stc: SPC ports hang on SC1000 with Solaris 2.4
1186420 in.ftpd does not call sa_auth_acctmg and thus ignores password aging, etc.
1186287 an unprivileged user can use utmp_update to clear entries from /var/adm/utmp
1184636 watchdog reset when checking residency of pages; segment attached SHM_SHARE_MMU
1184256 data fault in freeproc() caused by race with cfork()
1184134 dlmd needs to pop timod, once TCP problems resolved
1183120 Poor feature interaction between tmpunload and large pte's
1179311 1000's of zombie processes which kill further attempts to login to system.
1177119 Readdir() in BCP does not work properly.
1175356 automounter fails to remount unmounted directories
1175127 2.3 tcp performance over satellite/delayed links is very poor compared to 4.1.3
1130786 multiple mbus-to-sbus asynchronous faults panic system

(from 101945-33)

1211022 after installing T101945-30 you cant login to a dataless client
1207669 NFS writes can fail and no error is ever returned, even to fsync or close.
1207181 kmem_cache_xxxx panics with nfs file systems.
1206598 "Cannot reset access time of file at inode x" messages during backup
1206384 fscanf function failed with EUC character
1205240 Mounts of secure file systems fail at random
1204575 gettimeofday system call in BCP is broken.
1199624 queuerun indirectly causes fork() call to hang
1197708 data fault panic in sockmod
1192309 machine panics with audit_finish: residue audit record
1183343 Sybase gets Error 605 on SS5 during installation on a raw partition

(from 101945-32)

1205614 Some of the exported APIs in libbc (/usr/4lib) are redundant
1204479 sprintf format "%.4S" prints improperly when strings include 0216 or 0217
1202070 one residual stop-the-clock-and-hang-the-system bug in /proc
1201926 strrput is calling queuerun() this may lead to a dead lock.
1200502 fopen() and unlink() makes a corrupt file on multi cpu machine.
1194613 getdate() bug under Solaris 2.4 .
1182105 libaio and libthread are not compatible
1199164 process won't exit due to non-zero refcount
1180578 PPP stops working after system is installed with 101318-63
1179884 Non-blocking socket connection over x25 would hang the system.
1164319 sc2000 panic with sema_v turnstile corruption -

(from 101945-31)

1178889 TCP doesn't close down properly across the loopback interface
1189592 Infinite loop in TCP
1186156 keyserv caches old private key after user's password is changed
1194878 ioctl with TCGETA doesn't work with libsocket
1202675 automountd can dump core due to double endnetconfig() call

(from 101945-30)

1199579 flow controlled send does not generate SIGPOLL
1195432 Panic with a data fault from in.telnetd
1200734 bcp sys5 mount() call returns wrong err
1182509 FTP transfer hangs

(from 101945-29)

1195436 With patch T101945-20 /.profile is not executed
1194928 System panicked when shutting down. Data fault panic in canputnext.
1194355 tcp server detects checksum error and loops
1193801 SysV synopsis mount() call doesn't exist on bcp library.
1193721 data fault in putq() due to NULL q_last pointer
1189389 machine panics with a data fault in mutex enter
1182158 Some cmds (ls, pwd) will hang when executed on nfs mounted directories.
1193696 kernel memory allocator: invalid free: buffer not in cache kmem_alloc_1152
1186805 other [] too many write error and EDQUOT messages from nfs to syslog
1191078 Machine hangs with many proc's in rmalloc_wait - memory leak
1178641 NFS client should fail to open files with the mandlock bit set
1175931 nfs loops on async write errors
1198278 ksh loops in kernel making NFS read calls on its history file.
1180414 streams allocb failure results in data corruption
1169823 synctodr() : unable to sync error message every three days or so
1162834 deadlock between prioctl() and munmap()
1177469 /proc causes page deadlock in NFS
1182597 swapped out lwp->lwp_ar0 in prgetprregs causes data fault and hang
1187536 Deadlock using /proc
1188701 assertion failed: new_state != LMS_WAIT_CPU
1189271 procfs: run-on-last-close doesn't always work
1192982 deadlock condition detected: cycle in blocking chain
1198439 procfs is out-of-spec with respect to microstate accounting
1145457 ksh does not set the correct arguments for su -

(from 101945-28)

1195437 Panic in ip layer (icmp_inbound_error missing pullupmsg)
1181201 port option does not work with autofs.
1124354 Scorpion and Gal-Ross panics on Scheduler stress test

(from 101945-27)

1188464 In Binary capability mode, getpass() echos password
1175044 Signals don't work in 4.1.[23] csh running under 2.4 BCP mode
1187901 Process hung in nanosleep
1185149 nl_langinfo is not MT-safe
1158574 One interrupt level per slot on sun4d

(from 101945-26)

1192238 "noac" mount option not honored immediately after mount
1172926 application hangs on a TCP connection if the remote system dies
1130791 2.x setsockopt SO_SNDBUF fails with protocol error for stream AF_UNIX domain

(from 101945-25)

1189968 Need strict multihoming in IP to prevent breakins over the Internet
1187948 machine is hung because a thread is looping in connopen() (no-25).

(from 101945-24)

1191457 DROPEN failed executing Fortran program

(from 101945-23)

1188259 ls -a .. causes Data fault panic after lockfs -h and umount the file system
1182051 sched: Text fault while telnet test
1173309 system panics with assertion failed: tcp->tcp_rcv_head == NULL
1164519 Socket returns with "address already in use" because conn in "BOUND" state

(from 101945-22)

1188287 NFS mounted files get truncated
1177228 Data fault in freeb routine while running sundiag
1165675 rquotad returns inappropriate error on nfs client

This fix includes a modified sys/errno.h which introduces a
new error number: EDQUOT (49). When an over-quota condition
is encountered, the following filesystem-related system calls
will fail and the errno will be set to EDQUOT. Previously,
the errno was set to ENOSPC.

The affected system calls are: creat(2), link(2), mkdir(2),
mknod(2), open(2), rename(2), symlink(2), and write(2)

Any applications that check for an over-quota condition
during a failed system call may encounter EDQUOT (49) as
a valid value for errno.
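
As an illustration of the note above, an application that used to treat
ENOSPC as "no space" may want to tell the two conditions apart. The sketch
below is a minimal example only; the enum and the function name are
hypothetical and are not part of this patch.

```c
#include <errno.h>

/* Illustrative names only; nothing here is defined by the patch. */
enum space_failure {
    FAIL_OTHER      = 0,   /* some unrelated error */
    FAIL_FS_FULL    = 1,   /* ENOSPC: the file system itself is full */
    FAIL_OVER_QUOTA = 2    /* EDQUOT: the user is over quota */
};

/* Classify the errno left behind by a failed creat(2), link(2), mkdir(2),
 * mknod(2), open(2), rename(2), symlink(2) or write(2). */
enum space_failure classify_space_failure(int err)
{
#ifdef EDQUOT
    if (err == EDQUOT)
        return FAIL_OVER_QUOTA;
#endif
    if (err == ENOSPC)
        return FAIL_FS_FULL;
    return FAIL_OTHER;
}
```

Code that must run both with and without this fix can handle both values,
since an over-quota failure was reported as ENOSPC before the change.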

(from 101945-21)

1181259 NFS mount fails with: couldnt bind to reserved port - ESC# 11042

(from 101945-20)

1178190 MT program with more than 6 thread will consume system resource totally.
1155298 bind of AF_UNIX address simultaneously from multiple processes can fail
1143479 setuid/setgid program takes on default system limits

(from 101945-19)

1186557 pid_ref field wraps around manifests as kmem list corruption.
1186224 socket select hangs in NON-BLOCKED mode
1178506 INN wounded after upgrade to SunOS 5.4
1181009 setsockopt returns error when expanding max receive size to 20KB (AF_UNIX)
1169775 Solaris 2.X does not correctly handle Copy-On-Write faults on a page

(from 101945-18)

1175368 SECURITY anyone can gain root access to a 2.3 machine

(from 101945-17)

1185694 systems with 7 32MB DSIMMS fail to boot
1183568 NFS client get old data after file on server being updated.

(from 101945-16)

1178114 ioctl SIOCSPGRP/FIOSETOWN path broken for MT libsocket(linked to libthread) code
1171008 Mux hangs when expecting messages on lower stream during I_LINK/UNLINK
1159986 lckpwdf causes passwd to crash

(from 101945-15)

1183837 Random processes dump core on sun4d
1176508 panic mutex adaptive exit under 2.4 fcs when accessing directory over nfs.
1172998 x86: auto_lookup(): assertion failure in mutex_exit() on non-existent fs
1170832 (gnu) make in parallel mode will fail on automounted file systems

(from 101945-14)

1182492 automountd's macro_expand function may cause buffer to overflow

(from 101945-13)

1182686 kernel rwlocks can allow readers and writers simultaneously
1177600 No way to cache the root and /usr file systems with CacheFS

(from 101945-12)

1178957 sigurg not delivered on second oob data arrival
1177644 swift specific mmu write function doesn't flush tlb
1164800 panic: ddi_setcallback: no callback structures
1159330 automountd unmounts the wrong lofs
1176247 Performance is poor on sun4m Viking MP systems due to unnecessary cross calls
1166712 significant priority inversion problems when using mmap file access.
1178753 SS20 with 7 x 32Mb Simms installed will panic and hang.
1181258 SAVECORE SEGMENTATION FAULT WHILE TRYING TO SAVE CORE

(from 101945-11)

1177620 gettimeofday doesn't work correctly when dual processors are used
1178363 unstrcpy() can cause an EFAULT failure when it copies certain bytes.

The kernel string copy routine can cause a data fault during exec when
the string being copied contains 0x80 and is aligned in a certain way. 
As most strings copied by the kernel use the 7-bit ASCII code, this 
error will almost never be seen.

On MP sun4m systems, clock interrupts may be serviced more than once
per tick, causing the system's notion of time to drift.

(from 101945-10)

1178835 RCS operations fail on file system NFS mounted from AIX system.

The problem occurs when an application calls fchmod between writes (which is
very rare); the client's view of the file attributes can then differ from the
server's. The fix purges the client cache before doing setattr so that the two
views agree.

(from 101945-09)

1178128 SX Sundiag test gets a segmentation fault (SIGSEGV) when testing contiguous mem
1176845 Sun4m MP CPU startup sequence causes SS-20/Hypersparc machines to hang hard
1179480 sun4d needs to clear important registers on startup

On a sun4d, during the first boot after a power-on reset, the system
may experience panics due to stale bits leftover in the MFSR register
after POST (with a successive boot succeeding). This fix ensures the
MFSR is cleared during the cpu's startup.

1176845
-------
On MP SS-20 machines using 100 MHz HyperSPARC processors, booting from the net
causes the machine to hang hard. The problem is easily reproducible on 4-way
MP systems and also reproducible on 2-way MP systems. It has also been
observed on MP platforms using SuperSPARC processors, but there it is hard to
reproduce and the TTF is very long. This bug is induced by the MP CPU startup
sequence.

The net boot hang was under investigation for nearly three months, and other
bugs had to be fixed along the way in locore (level 14 interrupt handling) and
sun4m/machdep.c (init_mon_clock, start_mon_clock and stop_mon_clock). See
bugids 1168398 and 1175829 for details.

1178128
-------

Sun manufacturing was testing SS-20/Colorado platforms and the SX test failed
with a segmentation violation when testing SX functionality for rendering into
physically contiguous memory.

This caused manufacturing to put the line for SS-20/Colorado platforms on hold.

(from 101945-08)

1178641 NFS client should fail to open files with the mandlock bit set
1178295 /usr/sbin/eeprom caused Aurora machine to panic
1178236 System panics with data fault in free_zero_zero()
1177862 Illegal FP got "Segmentation Fault - core dumped" on MP Grizzly
1177516 SS10/SX panics when user program dumps core with a mapping to ZX
1177100 CPR doesn't support Grizzly configuration
1175478 Panic in prototype inkernel logdmuxunlink() after munlink failed
1175115 nfs write error "(file handle: xxx xxx" message cannot be redirected by syslog
1175018 cpu workarounds not correct for Voyager module on sun4d

Changes to support the SuperSPARC 2 processor module on sun4d.
Note:  User needs to apply this patch to fix 1175018 before the 
       SuperSPARC 2 modules are installed.  Otherwise user will have
       to use the workaround from the bug report to boot the machine.

The problem occurs when NFS encounters write errors. NFS prints a write error
message to the console. In some cases the messages are printed on the physical
console because the console driver is deprived of resources. The fix puts a
throttle on NFS write error messages, so that the administrator can still type
on the console and try to figure out what is going on.

I_UNLINK or I_PUNLINK commands may time out and close the stream before
the multiplexor has processed the command.

o The module identify code for ross625 cpu needs to be added into cpr and
  cprboot code.
 
  To run VAC MP startup code correctly for cprboot, cpu states for non-boot
  cpus need to be cleared to 0 initially for x calls. The offline and quiesced
  flags are set back when we are running on the cpu startup threads.
 
This problem involves a system panic when application software that maps
certain portions of the ZX frame buffer dumps core.

The as_memory() routine in vm/vm_as.c is intended to "weed out" non-seg_vn
segments from a process' address space when determining which segments
should be written out to the core file created in the file system.  

However, as_memory() does not correctly check to determine if a segment
is handled by seg_vn, which results in it informing its caller (i.e.
core_segs() in common/os/core.c) that the segment should be written out.

In the case of device segments managed by the seg_drv driver, the segment
is not backed by real memory.  When core_segs() instructs vn_rdwr() to
copy the contents of the segment data area to the core file (it does this
because as_memory() tells it to), it may touch parts of the device that
will lock up the hardware.  Subsequent device access will timeout and lead
to the panic.

Test case psABI sct2.1 llsi/hard_traps 5 got UNRESOLVED error with
Grizzly (MP) running Solaris-2.4.  The expected result is to catch
SIGILL (signal 4) without core dump.

An isolated demo script is provided and the output for Viking (UP/MP) or
Grizzly (UP) is:
Caught signal number 4

and the output for Grizzly (MP) is:
Segmentation Fault - core dumped

Socket interface networking programs under heavy use may panic the
machine with free_zero_zero() on the kernel call stack. This fixes
the problem in the sockmod module.

This fixes the panic that occurs on an SS5 running Solaris 2.4 with
patch 101945-06 installed when the "eeprom" command is executed in
order to change an option setting in NVRAM.

The NFS server will deny access to mandatory lock files.  This is done
for two reasons.  First, mandatory locking is not supported over NFS.
Second, it is dangerous for the server to access mandatory lock files.
It would be very easy for a normal user to completely hang the NFS
server.  The user could create a file and set the mode to indicate
that it is a mandatory lock file.  It could then lock the file with
a program which then just does a pause.  This user could then go to
an NFS client and try to access the file.  With each request from the
client, including retries, another NFS server daemon on the server
would get blocked, until the server ran out of NFS server daemons.

(from 101945-07)

1177091 prgetstatus can generate pagefault holding p_lock, can deadlock if freemem is 0
1177578 strmakemsg/strgeterr causes panic in strrput due to NULL mblk ptr
1176467 fcntl system call fails in process run by rcmd
1172243 Customer runs application from dumb terminal and system crashes.

The system can freeze under heavy swapping pressure
due to procfs holding a critical lock when it takes
a page fault.

Doing I_SETSIG on a console window through a serial line and then exiting
the process could cause a system panic.

Kernel panic in putnext/ptcwrite.

A socket endpoint not created through the socket library (by dup()
of a socket endpoint, for example) may experience some failures
on fcntl()/ioctl() calls. (This bug is limited to the 2.4 release.)

(from 101945-06)

1174847 SS5 running 4.1.3U1 - running customer application - HARD HANGS
1177572 Installing Solaris 2.4 ON patch 101945-05 and running OW causes machine to panic

The patch to bug ID 1151364 broke OW's consolidation. This happened
because releasef() changed to have an extra argument. OW shouldn't
have been dependent on releasef() which is private to the ON
consolidation. Since this problem was not discovered until after
the patch was made, it made more sense for ON to produce a new
patch which restores releasef() to have its old interface. The
interface changed for kaio. A new interface is added called areleasef()
which is only used by kaio.

This is an enhancement to the workaround created for bug 1161592.
Change is local to sun4m/swift cpu code and has NO impact on other
non-swift platforms.

(from 101945-05)

1174830 savecore on diskless machine didn't generate unix, vmcore is trash
1174738 segmapdev uses condition variables with spin-type mutexes
1172998 x86: auto_lookup(): assertion failure in mutex_exit() on non-existent fs
1175829 booting continuously a diskless machine over network hangs a single CPU machine
1151364 asynchronous I/O in the user level hurts RDBMS performance

This is a performance improvement for applications that are using libaio
for doing async IO to raw files or devices. There are no API changes, only
a new version of libaio.so.1 is installed. One side benefit of this fix
is that async IO to tape should now work.  This fix for bug 1151364
requires installation of libaio/kaio patch 102020-01 or later.

1175829
--------

While booting diskless over the net, a single CPU system hangs. The cause is
that the kernel decides to let the PROM handle the profiling timer (L14)
interrupt after programming the timer with default values. The PROM waits for
the profiling timer (L14) to interrupt, but that never happens because the
timer was mistakenly stopped when it was programmed with default values.

1174738
-------

The problem is that lp is a spin-type mutex (the devctx lock) passed by
segmapdev_fault.

It is illegal, but probably not well documented, to pass a spin-type mutex
to cv_wait().

It is also illegal to pass 1 as the arg to mutex_init() for spin-type mutexes.
That is a machine-dependent spl argument, and 1 isn't a valid choice (it needs
to be an SPL above lock level).

1172998
--------

The panic:

panic: mutex_adaptive_exit: mutex not owned by thread, lp f58936
	70 owner e0000000 lock 0 waiters 0 curthread f5ce6360

Kernel crash dumps generated on diskless sun4m, sun4d or i86pc systems are
not complete.

(from 101945-04)

1175968 non-master cpu network interfaces broken on SS1000
1172243 Customer runs application from dumb terminal and system crashes.
1169686 4.1.3 system on network goes down, hangs 2.3 system

The problem shows up when a "ps" thread is running through the virtual memory
area to get the address space size for a mapped file. The address space lock is
held and a get attributes function is called. This initiates an nfs get 
attribute request. If the machine that the request is made to is not responding
the nfs request will block. The address space lock which is held by the blocked
ps thread might block other processes on the local machine.

Typically when a server goes down all nfs file system activity is blocked
on any clients. The nfs operation resumes once the server comes up. In this
situation a server is powered down and causes a client to hang. The hang is
due to a process pile-up. The client is doing a ps and its thread is holding
the address space lock (as_lock) for a running process, which we will call A.
Process A maps a file from the server. The client ps thread path has reached
rm_assize() which needs to get the file size so it calls VOP_GETATTR()
which goes across the wire to the server. This operation goes nowhere because
the server is not running. The as_lock held by the ps process is blocking
other processes such as init.

The solution is not to go over the wire but to return a cached entry for the
file size. The change is to define a new attribute flag in vnode.h called
ATTR_HINT. The rm_assize() function uses this flag when it
calls VOP_GETATTR(). The nfs getattr function will see that the size of the
file is requested and that the passed in flag is ATTR_HINT. It will return the
file size from the rnode rather than make a request to the server.
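
The idea can be sketched in user-level C as follows. This is only an
illustration under stated assumptions: the structure, the flag value and the
function names are hypothetical stand-ins, not the kernel's definitions, and
the "server" is simulated by a field in the structure.

```c
/* Hypothetical stand-in for the new flag; the real value is in vnode.h. */
#define ATTR_HINT_SKETCH 0x1

/* Hypothetical, much-simplified stand-in for an NFS rnode. */
struct rnode_sketch {
    long cached_size;    /* size remembered from the last server reply */
    long server_size;    /* stands in for the value only the server knows */
    int  wire_requests;  /* counts simulated over-the-wire requests */
};

/* Simulated over-the-wire getattr; the real one can block if the server
 * is down. */
static long over_the_wire_size(struct rnode_sketch *rp)
{
    rp->wire_requests++;
    return rp->server_size;
}

/* With the hint flag set, answer from the cache and never touch the wire;
 * otherwise ask the server and refresh the cache. */
long getattr_size_sketch(struct rnode_sketch *rp, int flags)
{
    if (flags & ATTR_HINT_SKETCH)
        return rp->cached_size;
    rp->cached_size = over_the_wire_size(rp);
    return rp->cached_size;
}
```

The trade-off is that a hinted caller may see a slightly stale size, which is
acceptable for ps because it only reports the address space size.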

Running applications that do I_SETSIG on the console, when the console
is the serial port (i.e. not the frame buffer), causes the system
to crash when it attempts to send a signal to a process.

Support for SC2000E and SS1000E was patched in the 2.3 and 2.4 releases,
and integrated into the 2.5 release. This fix introduced a bug which
causes non-zero system boards to have tpe-link-test turned to the
incorrect setting. This has the effect of rendering the additional
le interfaces non-functional.

(from 101945-03)

1169909 Running xlib code in Realtime class causes code to block. in poll()
1167235 panic data fault in strioctl - apparently doing TIOCSPGRP
1150556 System becomes "panic: Overflow of asynchronous faults".

          This change is for proper handling of memory ECC errors.  Previously,
          an attempt to enqueue an error when the async fault queue was full,
          resulted in panic: "Overflow of asynchronous faults"

          The new functionality is:
          When the queue is full, discard the entry and disable correctable
          error interrupt generation.  Schedule re-enable of interrupt
          generation (via timeout) after a period of 30 minutes.
          Message generation is enabled to log information regarding SIMM and
          faulting address.  An additional message is output:

          Excessive Asynchronous Faults: Possible Memory Deterioration

          Uncorrectable error occurring while the async fault queue is full
          results in immediate panic.

          In addition to queue  overflow handling, the rate of error
          occurrence is also monitored.  If the rate of errors is such
          that 256 errors are reported in less than 1 second, ce interrupts
          are disabled.  Re-enable of the ce interrupts is scheduled for
          30 minutes (via timeout).
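
The rate monitoring described above can be sketched as a small state machine.
This is a simplified user-level illustration, not the kernel code: the
structure and function names are hypothetical, time is passed in as
milliseconds by the caller, and a fixed one-second counting window is assumed
where the text only says "256 errors in less than 1 second".

```c
#define CE_BURST_LIMIT  256                   /* errors, from the text */
#define CE_WINDOW_MS    1000L                 /* assumed counting window */
#define CE_REENABLE_MS  (30L * 60L * 1000L)   /* 30 minutes */

/* Hypothetical bookkeeping for correctable-error (ce) interrupts. */
struct ce_monitor {
    long window_start_ms;   /* start of the current counting window */
    int  count;             /* errors seen in the current window */
    int  enabled;           /* 1 while ce interrupts are enabled */
    long reenable_at_ms;    /* when to re-enable after a disable */
};

void ce_monitor_init(struct ce_monitor *m, long now_ms)
{
    m->window_start_ms = now_ms;
    m->count = 0;
    m->enabled = 1;
    m->reenable_at_ms = 0;
}

/* Plays the role of the timeout: re-enable once 30 minutes have passed.
 * Returns 1 if ce interrupts are enabled at now_ms. */
int ce_monitor_tick(struct ce_monitor *m, long now_ms)
{
    if (!m->enabled && now_ms >= m->reenable_at_ms) {
        m->enabled = 1;
        m->count = 0;
        m->window_start_ms = now_ms;
    }
    return m->enabled;
}

/* Record one correctable error. Returns 1 if ce interrupts stay enabled,
 * 0 if they are (or have just been) disabled. */
int ce_monitor_report(struct ce_monitor *m, long now_ms)
{
    if (!ce_monitor_tick(m, now_ms))
        return 0;
    if (now_ms - m->window_start_ms >= CE_WINDOW_MS) {
        m->window_start_ms = now_ms;   /* start a fresh window */
        m->count = 0;
    }
    if (++m->count >= CE_BURST_LIMIT) {
        m->enabled = 0;
        m->reenable_at_ms = now_ms + CE_REENABLE_MS;
    }
    return m->enabled;
}
```

Errors that arrive slowly never reach the limit because the count restarts
with each window; only a burst of 256 within one window trips the disable.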

Protect with mutex the testing and setting of the session and controlling
terminal related flags in the streamhead. 

Real time stream threads will block in a poll.

(from 101945-02)

1174572 Viking workaround enabled on parts that do not need it
1172979 spurious SIGALRM received in test program that forks child processes
1172009 recv() on sockets should return the error only once for SunOS 4.X compatibility
1170862 kadb hangs on MP configuration
1173626 Race condition in ross625_mp_mmu_writepte() where ref/mod bits can be lost
1172242 HyperSPARC Ross_625 A.2/A.3 has bcopy error if destination page is read-only
1172245 iflush code need to be more intelligent for HyperSPARC-MP
1168398 MP CPU start up causes machine to lock up when booting from net
1166848 L1 A and then sync locks up machine
1166779 Add support for dragon+ dual power supply
1152922 prtdiag(1M) should display SBus clock frequency
1165687 reads on acceptor sockets not non-blocking under Solaris 2 when listener is
1160112 socket library accidentally closes file descriptor on error
1120225 recv() returns EPIPE when called with MSG_PEEK
1152710 socket lib in 2.3/2.2 have problems with not clearing bad connections and errno
1171478 socket recv() calls fail with EINVAL due to bad fix in 5.4

AF_UNIX and AF_INET sockets can sometimes get EPIPE errors for recv(MSG_PEEK).
When the socket library sees the EPIPE error it will in some cases close
the file descriptor causing the application to get EBADF errors for subsequent
operations.

An AF_UNIX listening socket can get into a permanent error state
(returning EPIPE or ECONNRESET) for any operation until the socket is closed.

The non-blocking attribute of a socket endpoint is not transferred
from a non-blocking listener endpoint to an accepting endpoint.
This causes some non-blocking socket programs to block. This
patch fixes the problem by setting the accepting endpoint's non-blocking
attribute if the listener was non-blocking.
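
An application that does not want to depend on this inheritance can copy the
attribute itself after accept(). The helper below is a minimal sketch using
only fcntl(); its name is hypothetical and it is not part of the socket
library.

```c
#include <fcntl.h>

/* Make to_fd's O_NONBLOCK setting match from_fd's (for example, from a
 * listening socket to a descriptor returned by accept()).
 * Returns 0 on success, -1 if any fcntl() call fails. */
int copy_nonblock_flag(int from_fd, int to_fd)
{
    int from_flags = fcntl(from_fd, F_GETFL, 0);
    int to_flags = fcntl(to_fd, F_GETFL, 0);

    if (from_flags == -1 || to_flags == -1)
        return -1;

    if (from_flags & O_NONBLOCK)
        to_flags |= O_NONBLOCK;
    else
        to_flags &= ~O_NONBLOCK;

    return fcntl(to_fd, F_SETFL, to_flags) == -1 ? -1 : 0;
}
```

Calling it right after accept() gives the same result whether or not the
system transfers the attribute on its own.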

Add dual power supply support to SC2000 and SC2000E systems. Systems
with dual power supplies will receive warnings on system console when
one of the redundant power supplies fails.

Modify prtdiag(1M) to indicate SBus Clock frequency. The SC2000E and
SS1000E run with 25 MHz SBus clock frequency. The SC2000 and SS1000
run with 20 MHz SBus clock frequency. This change to prtdiag(1M) 
makes it easy to determine the SBus clock frequency on the system.

For versions greater than 2.10 of the Open Boot Prom, L1-A followed
by "sync" will sometimes hang.

Summary: Patch to support Hypersparc CPU (Colorado) Modules. 

Below is a brief description of each bug:

1170862
-------
On a Colorado MP machine, if you set up break-points in the kernel
and try to resume from there, the machine sometimes hangs. Neither L1-A
nor taking the keyboard helps. One has to power cycle the machine.

1173626
-------
There is a small window in "ross625_mp_mmu_writepte()" where the reference
or modified bits can be lost before cpus are captured.

1172242
-------
This is needed for a Ross625 A0-A3 bug where, under some circumstances,
the hardware bcopy can write into the destination even if the
destination is write protected.

1172245
-------
The iflush code for Ross 625 (virtual address-cache cpu) needed
per-cpu local flush support instead of doing a global broadcast
to flush all cpus every time.

1168398
-------
The system has two processors, CPU 0 and CPU 2. CPU 2 executed
pause_cpus(), which created a pause thread for CPU 0. CPU 2 then spun on
safe_list[0], waiting for CPU 0's pause thread to set safe_list[0] to 1.
But CPU 0 never executed its pause thread. Instead, CPU 0 took a level 14
interrupt, dropped into the PROM, and never resurfaced from the PROM.

With SunOS 4.X sockets, when a read() or recv*() call returns an error, the
application can do another read()/recv*() and get an EOF. This patch applies
this subtle aspect of socket semantics to SunOS 5.X.

This specification of signal actions from the signal(5)
manual page was being violated:

                Setting  a  signal action to SIG_IGN for a signal
     that is pending causes the pending signal to  be  discarded,
     whether or not it is blocked.  Any queued values pending are
     also discarded, and the resources used  to  queue  them  are
     released and made available to queue other signals.

The condition under which the pending signal was not being
discarded was the specific case of SIGALRM signals generated
by the setitimer(ITIMER_REAL) interface.  The malfunction
happens in a narrow race condition, which is triggered by
repeatedly installing a signal handler and then setting the
action to SIG_IGN while the itimer is active.
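
The signal(5) rule quoted above can be checked from user level. The sketch
below is in Python and is not part of this patch; the function name
sig_ign_discards_pending is hypothetical. It sends the signal directly
rather than through setitimer(ITIMER_REAL), so it exercises the rule itself,
not the race condition. It assumes it is called from the main thread, which
Python requires for signal.signal().

```python
import signal
import threading

def sig_ign_discards_pending(signum=signal.SIGALRM):
    """Return (pending_before, pending_after) around setting SIG_IGN.

    Must be called from the main thread (a signal.signal() requirement).
    """
    old_handler = signal.getsignal(signum)
    # Block the signal so that it stays pending instead of being delivered.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        signal.pthread_kill(threading.get_ident(), signum)
        before = signum in signal.sigpending()
        # Per signal(5), this should discard the pending instance,
        # whether or not the signal is blocked.
        signal.signal(signum, signal.SIG_IGN)
        after = signum in signal.sigpending()
        return before, after
    finally:
        # Restore the mask while the signal is still ignored, then
        # put the previous action back.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if old_handler is not None:
            signal.signal(signum, old_handler)
```

On a system that follows the rule, the second element of the returned pair
is False: the signal is no longer pending once its action is SIG_IGN.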

SunOS 5.4 sometimes enables a bug workaround on systems
that do not need it.

(from 101945-01)

1173969 MT process doesn't stop on multi processor systems

dbx appears to malfunction when controlling a multithreaded
process that does many fork1()s.  The bug is in the system, not dbx.

Also, stopping dbx with a job control signal from the terminal, ^Z,
while it is controlling a multithreaded process will cause the
multithreaded process to become permanently stopped.

(from 101918-01)

1157053 ESC8146 System panics when doing a copy to NFS file system mounted across FDDI-S

The problem is caused by non-aligned transfers. The memory address
alignment trap happened in xdr_writeargs() when copying data in a loop:
the address was on a word boundary, not a long word boundary.
nfs_feedback() can adjust the transfer address and size for a request,
for example for a retransmission. xdr_writeargs(), in file nfs_xdr.c,
can make use of bcopy(). A few other functions in this file do a similar
copy operation and should also be changed to use bcopy().

(from 101983-03)

1174913 autofs checking for local subnets doesn't work when NIS+ is the nameservice

This patch fixes bug 1174913: autofs checking for local subnets doesn't
work when NIS+ is the nameservice. The problem is that when a mount takes
place, it does not give preference to the interface that the client machine
is attached to. It should mount first from the server's interface that the
client machine is attached to, and then from an alternate if that one does
not respond. The cause is that the automounter looks up the table "netmask"
when it should look up the table "netmasks".

(from 101983-02)

1151509 automounter's built in timeout is too short for low speed lines

automountd by default only waits 15 seconds for servers to
reply to its initial connection requests. This timeout may be too
short for slow links or for very busy servers. This patch allows the
system administrator to tune the total timeout by specifying the
number of attempts (original + retries). This is done by adding a
retry=n entry to the options list for the busy server entry in the
automounter map. The default is one attempt (retry=0), when no
retry=n option is specified in the options field, or when the
retry=n option is invalid. Each retry is equivalent to approximately
30 seconds.

Since automountd is currently single-threaded, this option should be
used with care, as it will cause automountd to take more time to decide
whether a server is dead or not (reply received or not), causing incoming
autofs kernel requests to be queued for longer periods of time.

For example, the following /etc/auto_home map uses the retry=1 option
to force automountd to send the original request, and retry it once more,
before giving up with a "server not responding" error. If the reply is
received before the next retry, there will be no retransmission.

NOTE: It is not recommended to set this option as the map default, since
      it will cause automountd to needlessly wait longer for replies from
      servers that really are dead. A server that is up would have
      replied without the need for retries.

        /etc/auto_home:

# Home directory map for automounter
#
userx   -nosuid,hard,intr,retry=1       busy_server:/export/home/userx
+auto_home
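
The retry=n rule (one attempt by default, n extra attempts when a valid
retry=n is present) can be sketched as follows. This is an illustration in
Python, not automountd's actual parser; the function name
attempts_for_entry is hypothetical, and treating any value that is not a
non-negative integer as invalid is an assumption.

```python
def attempts_for_entry(map_line):
    """Return the total attempts (original + retries) for one map entry.

    Assumes the options field is the whitespace-separated token that
    starts with '-' and that options are comma-separated.
    """
    retries = 0                                   # default: retry=0
    for field in map_line.split():
        if not field.startswith("-"):
            continue
        for opt in field[1:].split(","):
            name, _, value = opt.partition("=")
            # Assumption: anything but a non-negative integer is invalid
            # and leaves the default in place.
            if name == "retry" and value.isdigit():
                retries = int(value)
    return retries + 1
```

For the userx entry above, this gives two attempts: the original request
plus one retry.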

(from 101983-01)

1174222 automounter does not mount from 4.1.3 NFS servers with libc patch

automountd first makes a null RPC call to the remote portmapper (rpcbind) 
of the server from which it needs to mount to determine if the server 
is able to respond to mount requests or not. In some cases (multiple
servers specified on map entry) it would call the remote portmapper using
version 3, which is not available on non-SVR4 systems. Some systems are
now silent about version mismatches, which causes automountd to assume
the server is dead (or at least its rpcbind/portmapper). This patch fixes
this problem by always using version 2 of the portmapper protocol.

(from 101975-01)

1173301 some files may not show up under cachefs

Files can sometimes be missing from a cachefs mounted directory.  This
can happen if the entry in question is the last one in the directory
block, but would be the first one in the cachefs front file. If a
client system runs touch on this file, it will erase the contents of
the file on the server.

(from 101969-07)

1182458 network interface can hang on NFS server with high ipReasmDuplicates counts

(from 101969-06)

1178985 Multicast routing broken in Solaris 2.4
1179625 freeb(): bad pointer passed to kmem_cache_free

A performance problem with T101969-05 will be seen in TCP/IP
network connections over high bandwidth network interfaces (FDDI,
ATM, 100Base-T, but not 10Base-T Ethernet) resulting in lower
than expected throughput. There is no impact on TCP/IP functionality.

IP Multicast routing does not work correctly in 2.4.
The (publicly available) mrouted program does not receive any 
IGMP_HOST_MEMBERSHIP_REPORT messages due to the kernel incorrectly
thinking that these messages are malformed.

(from 101969-05)

1179625 freeb(): bad pointer passed to kmem_cache_free

A kernel panic in freeb(), called from freemsg(), from tcp_closei(), ...
may be caused by TCP freemsg()ing an already freed mblk. This bug was
introduced by the fix for bug 1167357.

(from 101969-04)

1178391 system with PPP device using the same IP address as le0 will stop working
1178400 NFS copies btw 690MP(512) server and Sunos 4.1.3 corrupt data without any error

Multiprocessor (MP) systems can transmit IP packets with the same ip_id
field, potentially causing fragmented packets to be reassembled incorrectly.
Normally this is not a problem since the corruption will be detected by the
UDP/TCP checksum. However, SunOS 4.X does not by default verify the UDP
checksum, in which case the incorrectly reassembled packets can cause NFS
file corruption.
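
The effect of a duplicated ip_id can be seen in a toy model of fragment
reassembly. The sketch below is in Python and is not the kernel's
algorithm; the function names, addresses, and payloads are made up for
illustration. Fragments are grouped by a key that stands in for (source,
destination, protocol, ip_id), so two datagrams sharing one ip_id fall
into the same group.

```python
def reassemble(fragments, total_len):
    """Toy reassembly: collect fragments per key until total_len bytes are present.

    Each fragment is (key, offset, payload), where key stands in for
    (source, destination, protocol, ip_id).
    """
    partial = {}
    done = []
    for key, offset, payload in fragments:
        parts = partial.setdefault(key, {})
        parts.setdefault(offset, payload)      # first fragment seen for an offset wins
        if sum(len(p) for p in parts.values()) == total_len:
            done.append(b"".join(parts[o] for o in sorted(parts)))
            del partial[key]
    return done

SRC_DST_PROTO = ("10.0.0.1", "10.0.0.2", "udp")   # made-up addresses

def interleaved(id_a, id_b):
    """Two 8-byte datagrams, each sent as two fragments, arriving interleaved."""
    key_a = SRC_DST_PROTO + (id_a,)
    key_b = SRC_DST_PROTO + (id_b,)
    return [(key_a, 0, b"AAAA"), (key_b, 4, b"bbbb"),
            (key_b, 0, b"BBBB"), (key_a, 4, b"aaaa")]
```

With distinct ids, reassemble(interleaved(7, 8), 8) yields the two
original datagrams. With the same id, reassemble(interleaved(7, 7), 8)
yields b"AAAAbbbb" and b"BBBBaaaa", neither of which was ever sent; only
a checksum over the whole datagram would catch the mix-up.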

When an IP address is shared between an ethernet link and a point-to-point
link, and both links go down and the point-to-point link comes up first, the
ethernet link will not be able to come up with the shared IP address.

(from 101969-03)

1172731 After PPP connects improper routing entries cause problems

Sharing an IP address between a point-to-point interface and a numbered
interface can result in invalid routing entries.

(from 101969-02)

1174786 Unnumbered interfaces with respect to PPP have problems
1174851 SC2000 hang due to no ip flow control when used with FDDI board

When a high bandwidth network interface receives a large number of packets
addressed to it, but with no one bound to the specified port, IP does a lot
of processing: it sends an ICMP unreachable packet back to the source of
the original packet. This can consume large amounts of kmem, which can
cause subsequent kmem_alloc() failures, including allocb() failures in the
driver for the high bandwidth interface. The driver can then drop
subsequent fragments of large IP frames. IP holds on to these incomplete
frames for 60 seconds, awaiting the missing fragments, which will never
arrive. In the case of an NPI FDDI interface at 80Mb/S, this can be
300Mbyte of kmem.

When a point-to-point interface shares an IP address with a numbered
interface, the point-to-point link will stop receiving packets if the
numbered interface is shut down.

(from 101969-01)

1162269 all net IP broadcast packets (255.255.255.255) have a ttl of 1

Applications and environments that depend on routers forwarding broadcast
packets might run into problems with the fact that IP sets the TTL of
all broadcast packets to 1 (in order to avoid any broadcast storms when
there are misconfigured machines on the wire). This patch makes it possible
to override the default TTL of 1.

(from 101971-01)

1172260 5.4 <-> 4.1.2 socket connection loses sync and delays transfer of data

A TCP connection might not start immediately when a window update is
received after the SunOS 5.4 side has sent a zero window probe.

With some TCP implementations at the remote end there will be a few seconds 
of delay (waiting for a retransmit timeout).

(from 101981-02)

1179738 Users with 8 characters name or more will not be logged into utmp

(from 101981-01)

1173212 SECURITY: su can display root password in the console

If a username is too long (greater than 8 characters), then when su root
fails or succeeds for that user, the characters typed in as the password
are echoed to the console.

(from 102169-01)

1178761 ufs_putapage:bn == UFS_HOLE panic when filesystem fills up...

A corrupt inode can be created when extending a UFS file and running
out of space.  This can later cause a panic 'ufs_putapage: bn==UFS_HOLE'.

(from 102926-01)

1178824 When catman is invoked on a system w/BSM & 101318-54 the system crashes

(from 102119-01)

1163335 SS1000 interactive performance very poor with > 200 users logged in

Traditional BSD-derived systems require per-character processing to be handled
by the rlogin and telnet daemons.  This is very inefficient, since it often
requires several user level context switches per input character.  This patch
provides a "fast path" entirely in the kernel, which eliminates the added
overhead of processing by a user-level daemon for normal data traffic.

(from 102020-07)

1194263 scsi errors, timeouts, other scsi errors

(from 102020-06)

1193007 ESC - 5.4 vol driver on SS5 internal CDROM-drive fails with 3rd-party CD's

(from 102020-05)

1188367 isp0: unknown capacity, disk offline

(from 102020-04)

1195904 usr/src/uts/common/sys/Makefile missing ascii.h

(from 102020-03)

1185775 KAIO throttling too conservative, Sybase cannot install on low memory machines

(from 102020-02)

1151364 asynchronous I/O in the user level hurts RDBMS performance

The kaio patch for the original RTI is missing several deliverable files.
The purpose of this RTI is to correct the patch 102020. This doesn't 
require a putback because the sources are already in the patch workspace. 
The missing files are listed in the deliverables section for this RTI.

(from 102020-01)

1151364 asynchronous I/O in the user level hurts RDBMS performance

This is a performance improvement for applications that are using libaio
for doing async IO to raw files or devices. There are no API changes, only
a new version of libaio.so.1 is installed. One side benefit of this fix
is that async IO to tape should now work.

(from 102007-02)

1184991 2.3/2.4 - panics in tmpnode_hold - data fault - TMPFS

(from 102007-01)

1175304 vnode v_count is not maintained correctly

vnode v_count numbers are not maintained correctly causing vnodes to
never disappear or, in the earliest bug, drop to zero prematurely and
panic the system.

(from 102358-01)

1183552 ftpd processes hang at httpd (WWW server) sites

(from 103575-01)

1249667 ftp size increases by 8k/2 page size with every open/close session memory leak

(from 102216-07)

1232825 RPC: Unable to send/receive

(from 102216-06)

1245300 5.4 lockd on client side can't handle two outstanding klm requests simutaneousl

(from 102216-05)

1164679 KLM doesn't initialize rsys & rpid correctly

(from 102216-04)

1143434 Secure nfs does not work properly across NIS+ domains

(from 102216-03)

1197596 _svcauth_unix can crash the kernel

(from 102216-02)

1194923 System hangs with many klm_lockctl messages echoed to the screen.

(from 102216-01)

1179403 NFS client starts using unreserved UDP port numbers

(from 102224-10)

1247172 Threads losing signals when preempted

(from 102224-09)

1241118 libthread panic in thr_join handling of zombie threads seems to be broken

(from 102224-08)

1260769 MT application is dropping signal events when run on multi-processor systems

(from 102224-07)

1232577 Signal delivered twice in MT program

(from 102224-06)

	Performance improvement fix for 102224-05

(from 102224-05)

1197042 mutex_unlock() can corrupt user's dynamically allocated memory (no-25)
1214038 Possible deadlock in libthread due to unsafe locking of _reaplock

(from 102224-04)

1192162 programs using cond_timedwait eventually hang
1187322 cannot thr_continue the "main" thread of execution after a call to thr_suspend

(from 102224-03)

1188399 sema_init with USYNC_PROCESS does not work properly

(from 102224-02)

1188790 cond_timedwait occasionally returns incorrectly

(from 102224-01)

1178898 Add thr_stksegment() interface and upgrade thr_main() interface for rtc.

(from 102024-02)

1226938 program that calls c-shell script fails with "/dev/fd/3: Bad file number" error

(from 102024-01)

1172710 csh does not find files which are links going to non-existent files using ?/*.
1175176 csh sends syslog messages on tcp stream

The sequence that leads to the crash is:

1) csh (which has not opened syslog yet) vforks to execute a command.

2) The child csh does NIS+ lookups and opens syslog. The syslog descriptor
   is stored in a global variable for future use. Since the child was
   vfork'ed, the parent's global syslog descriptor variable is also altered.

3) The parent csh vforks another child.

4) This child wants to write a message to syslog, checks the global
   variable, and finds it set (to the value set by the first child).
   However, this descriptor actually points to some other stream (TCP),
   so the syslog message is sent down the TCP stream. This causes a
   system crash.

Customer reports that csh does not find files which are symbolic links
going to non-existent files using ?/*, whereas sh can find them.

(from C641039-01)
 
91281: ISP ATTACH KILLS OFF WATCHDOG TIMER PREMATURELY
 
If an error occurs when attaching any isp instance, the watchdog timer is
killed off.  Since there is one watchdog timer for all isp instances, it
should only be killed off if the attach failure occurred when attaching
the initial isp instance.
 
89908: PANIC: ISP_I_SCSI_PKTFREE: FREEING FREE PACKET
 
A panic would occur when detaching system boards with isp adapters
installed.  The problem is timing related, so it would only happen when
the isp driver was compiled without -DDEBUG.
 
88420: ISP DEADLOCK DURING ISP (DR) DETACH BETWEEN ISP_DR_DETACH AND ISP_I_WATCH
 
ISP driver enters a possible deadlock condition, usually under heavy
I/O activity, which ultimately results in the OS hanging up.
 
(from C102002-05)
 
91104: ESP TIMEOUT LOGIC HAS A SYNCHRONIZATION WINDOW
 
The esp timeout logic has a synchronization window.
 
90922: ESP_DR_DETACH PANICS WHEN DETACHING LAST ESP IN SYSTEM (ESP_SOFTC = NULL)
 
Detaching the last instance of an esp causes a panic.
 
(from C102002-04)
 
87473: PANIC IN SCSI_HBA_DETACH DURING DR DETACH
 
During DR Detach, the host panicked while detaching the I/O devices
on the detaching board, specifically during the detach of the ESP SCSI
controller.

(from 102137-01)

1179258 cg14 does not resume on Kodiak due to a bug in sx resume code

On Kodiak, the cgfourteen screen is not resumed correctly. The openwin screen
is resumed white. Refreshing or restarting openwin will not correct it.

(from 102509-08)

1243116 ESC: tag command queue'ing errors on Data General array box's

(from 102509-07)

1262660 isp: returns incorrect pkt_resid when read command completed without data xfer

(from 102509-06)

1252953 ESC: isp gets 'Fatal Error' system then hard hangs after data fault panic
1245077 SCSI transport failed: reason 'unexpected_bus_free':

(from 102509-05)

1220275 isp: SCSI transport failed
1223632 isp : DMA engine in ISP firmware may potentially cause data corruption

(from 102509-04)

1220275 isp: SCSI transport failed
1220411 booting from the net can cause bad trap data faults with some versions of isp's

(from 102509-03)

1200912 isp: dwis, SCSI, Fatal timeout, Fatal error, resetting interface
1201471 isp driver hangs host

(from 102509-02)

1205200 isp/esp: cannot modify target?-sync-speed property

(from 102509-01)

1199500 isp: with the new ISP PROM, ISP in 2.3 or 2.4 should still run ISP f/w in driver

(from 102002-05)

1189329 esp: implicit restore data pointer fix is not complete (bug id 1183215)

(from 102002-04)

1183215 esp: implicit restore pointer at reconnect is incomplete

(from 102002-03)

1178407 esp: esp_watch_reset_delay is not always restarted

On high availability systems (HA), the sd driver may issue many scsi
bus and bus device resets. These problems were fixed by earlier RTI
1163511 (patch T102002-02).

However, there is a simple additional change required for fixing reset
related problems in the esp driver. This RTI is for updating the patch
with this additional fix.

(from 102002-02)

1163511 esp: bus device reset may result in not completing some requests

On high availability systems (HA), the sd driver may issue many scsi
bus and bus device resets. Usually this works, but sometimes it causes
loss of packets, hangs, or panics. Similar problems occur during
scsi aborts. The fix for this is extensive (i.e., the entire reset/abort
handling had to be rewritten since it was fatally flawed). Code in the
critical path is also affected.

(from 102002-01)

1173973 esp: scsi resets occurring more often with newer fab FAS236 chips

On Sun4d systems (both sundragons and scorpions) we are getting more
scsi timeout resets now with the newer reduced die FAS236 chips (2400150).
The old style chip (2400121) is no longer being made.
The problem does not seem to be configuration dependent.
On sundragons it occurs most often on tape devices.
It also shows up more often with the 1/2 height Exabyte 8 mm tape drive.

Patch Installation Instructions:
--------------------------------
Refer to the Install.info file within the patch for instructions on
using the generic 'installpatch' and 'backoutpatch' scripts provided
with each patch. Any other special or non-generic installation
instructions should be described below.

Special Install Instructions:
-----------------------------

It is necessary to install the same revision of this patch on
both servers and clients for remote login to clients to work.

1) Stop automountd
   # /etc/init.d/autofs stop

2) Install patch

3) Edit the necessary entries on your automounter maps
   (add the retry=n option).

4) Restart automountd
   # /etc/init.d/autofs start

Reboot the system after patch installation.

See NOTES 1-6 near the top of this file.