Patch-ID# 103957-12
Keywords: security floating point signal cpu_surrender strioctl C2 sbus le0 klm
Synopsis: SunOS 5.5 CS6400: kernel jumbo patch
Date: Jun/25/97

Solaris Release: 2.5 CS6400

SunOS release: 5.5 CS6400

Unbundled Product:

Unbundled Release:

Xref: This patch available for non-CS6400 sparc as patch 103093
Xref: This patch available for x86 as patch 103094

Topic: SunOS 5.5 CS6400: kernel jumbo patch

        NOTE: TO GET THE COMPLETE FIX FOR 4032974, ONE NEEDS TO INSTALL
              THE FOLLOWING PATCHES:

              103477-11 (or higher) kernel/strmod/rpcmod and kernel/sys/nfs patch
              104548-03 (or higher) SunOS 5.5 CS6400: isp driver fixes
              102982-02 (or higher) usr/bin/csh patch

              FAILURE TO INSTALL ALL THESE PATCHES WILL CAUSE THE SYSTEM
              TO HANG AFTER 248 DAYS.

Cray SPR's fixed with this patch: 98179 100433 100861 101359 101372 102615 104194 105066 105514

Cray SPR's incorporated in this version:

BugId's fixed with this patch: 1161438 1182705 1189967 1220902 1223900 1227580 1228664 1229015 1229031 1230150 1230478 1230865 1231471 1231759 1231871 1232869 1233084 1233088 1235169 1238559 1238581 1238919 1241118 1243804 1244142 1245291 1245540 1247172 1248186 1249250 1249985 1251421 1251423 1251430 1253223 1253366 1253528 1256153 1256610 1258191 1259966 1260769 1260959 1260982 1261400 1261511 1262082 1262694 1264333 1264890 1265396 1265447 1265705 1266767 4004147 4004575 4009069 4015497 4016316 4017457 4022849 4032974 4035167

Changes incorporated in this version: 1262082 1264890 4032974 4035167

Relevant Architectures: sparc.cray4d

Patches accumulated and obsoleted by this patch: 103093-08 103084-02 103489-01 103325-03 103164-07

Patches which conflict with this patch:

Patches required with this patch: 103477-05 (or higher revs)

Obsoleted by:

Files included with this patch:

/platform/cray4d/kernel/unix
/platform/cray4d/kernel/genunix
/kernel/sys/doorfs
/kernel/misc/klmmod
/kernel/misc/klmops
/usr/lib/libthread.so.1

Problem Description:

(from 103093-12)

4035167 Need a new, private interface
between JVM and libthread to get a thread's TOS
4032974 system hangs when lbolt wraps around.
1264890 Sun4d running 2.5.1 panics bp_map: read_hwmap failed
1262082 2.5.1 sun4d hangs w/kernelmap fragmentation

(from 103093-11)

4022849 2.5.1 kadb kernel panics with kernel heap corruption; appl hang; sys unusable
4016316 On 2.5.1 and 2.5.1 SHWP system goes into a state of soft hang.
4015497 Locking bug in I_NREAD ioctl handler.
4004575 High mutex hits, slow performance when c2auditing enabled
1245291 Bug in libthread.so(cond_timedwait()) and libposix4.so(sigtimedwait) in 2.4,2.5
1182705 Signals may orphan locks on clients

(from 103093-10)

4004147 panics in segkp_load when the file command is run
1265447 SYSTEM HANG, CLOCK THREAD IN MUTEX_ENTER WAITING FOR ANOTHER LOCK

(from 103093-09)

4009069 2.5 TCP generates wrong checksum and never recovers from error
1265396 Ctrl-C typed to dbx is sent to child debugee (not to dbx) when app uses sigwait
1249985 "deadman" doesn't work correctly on MP systems.
1233088 ioctl(PIOCPSINFO) is 100 times too slow on multi-threaded processes
1247172 Threads losing signals when preempted

(from 103093-08)

1227580 cannot support high TCP connection rates: noncaput errors reported by the driver
1223900 alarm(2) doesn't work properly with large arguments
1266767 F_GETLK returns incorrect value on 2.x if a lock is pending
1261400 several processes are hung waiting for rwlock
1245540 The application which worked on non-Ultra does not work on 2.5/Ultra system.
1259966 winlock timed out causes Copy8FFB to enter an infinite loop on ffb single buffer

(from 103093-07)

1265705 Add hyperSPARC Colorado-4 support to S2.5 and later kernels
1264333 _lwp_suspend()/continue() interrupts blocking system calls
1262694 Solaris 2.4 hangs due to memory leak in kmem_alloc-8, kmem_alloc_24 and -40 lea
1260982 rwnext & infonext fix in 2.4 to wait to enter inner perimeter didn't make 2.5
1260959 Streams information delayed 50-100 ms until dbri driver schedules it
1253223 System running 2.3 with KJP-80 on single CPU /24MB hangs in fork test case
1238559 sun4m user process can arbitrarily dump core with kadb
1256153 watchdog after continuing from kadb
1261511 alloc_hunk() bug causes panic with 1MB CPU cache

(from 103957-06)

SPR 105825 2.5 PATCH FOR SPR 105662: DATA CORRUPTION WITH DR-MEM-DETACH ENABLED
        There are 4 symptoms and an optimization; note that only one of
        these symptoms (data corruption) was ever seen at a customer
        installation. All of the problems only occur when dr-mem-detach
        is enabled:
        - data corruption (2 different bugs, same symptom)
        - assertion failure - ASSERT(PTBL_IS_LOCKED(ptbl->ptbl_flags))
          Note that asserts are only enabled in debug mode kernels.
        - assertion failure - ASSERT(pp->p_vnode)
          Note that asserts are only enabled in debug mode kernels.
        - optimization: the caged kernel messages which used to be
          emitted in debug mode are now only emitted if the
          DR_MEMDBG_CAGEDKERN flag is set in dr_mem_debug

4017457 Customer encounters poor I/O performance, is interested in freemem_lock fix

(from 103093-06)

1256610 strwrite fails to call queuerun on error path: bug performance hit
1253528 The problem is associated with the bug found in the SE5 kernel.
1251423 panic - recursive mutex_enter on lwplock
1249250 SIGSEGV handler gets truncated fault address

(from 103957-05)

105066: 2.5 PATCH FOR SPR 105031: DATA FAULT PANIC IN PAGE_LOOKUP+0X4C
        A data fault panic could occur in the exec() flow because the
        kernel attempted to access kernel addresses while accidentally
        in user context. The situation could occur if a process in the
        exec() flow faulted when accessing the ELF header. Very obscure
        bug.

105514: SYSTEM PANIC IN TRASH_USER_WINDOWS+0X6C - SUN BUG 1255692
        Invalid address panic in trash_user_windows().

(from C103093-03)

98179:  THREAD_LOAD ASSUMES STACK IS IN USER SPACE WHICH IS NEVER TRUE: SUNBUG 1230150
103171: 2.5 PATCH SPR 103156: DATA FAULT PANIC IN IDLE ROUTINE - SEE SPR 101309
104194: 2.5 PATCH FOR SPR 104099: PAGE_CREATE_WAIT FAILS TO FREE P->PCF_LOCK MUTEX

(from C103093-02)

100433: 2.5 PATCH FOR SPR 99353: SYSTEM PANIC: SRMMU_UNLOCK ORACLE RUNNING AT THE TIME
        panic: srmmu_unlock [with additional arguments]

100861: 2.5 PATCH FOR SPR 100719: CS6400 PANIC IN CHECKPAGE ROUTINE UNDER 2.5 WITH MEMOR
        The system will panic with a memory address alignment error in
        the checkpage procedure (around checkpage+0x148).

101372: 2.5 PATCH FOR SPR 100404: PANIC: SRMMU_PTELOAD - PTE REMAP PANIC WITH JKP-36
        panic: srmmu_pteload - pte remap

102615: 2.5 PATCH FOR SPR 102190: SYSTEM HANG AFTER NPI FDDI DETACH ERROR
        The problem is the result of a call to srmmu_logger() during a
        pause_cpus() state. The srmmu_logger code indirectly attempted
        to acquire a mutex held by a different thread which had been
        paused by the previous pause_cpus() call.

101359: 2.5 PATCH FOR SPR 101309: REPEATED DATA FAULT PANICS AT IDLE+0X130, LOOKS LIKE NULL ADDRESS IN L7
        System panics in the idle thread while running Oracle database
        testing. The cause was a porting problem from 2.4 to 2.5 in the
        Processor Partition code, due to a change in a Solaris protocol
        related to thread scheduling.
(from 103093-05)

1251421 Files may be corrupted after a power failure

(from 103093-04)

1244142 ULTRA panics with 3rd party ATM card driver
1232869 paging thresholds are too low on very big systems causing kmem alloc failures
1161438 The pageout daemon blows up when a lot of memory is added to a system
1231471 Global register %g1 gets corrupted on SPARC2 and 2.5
1243804 lockfs -h and umount of the UFS lying under a loopback file system causes panic

(from 103093-03)

1238581 indirect system calls fail on sun4u when C2 auditing is enabled
1230150 THREAD_LOAD ASSUMES STACK IS IN USER SPACE WHICH IS NEVER TRUE

(from 103093-02)

1233084 freectty set cred pointer to NULL causing other module panic the system
1231759 strioctl ic_timout changed values from seconds to milliseconds
1229031 page_unlock: page not locked panic occurring when locking address space
1220902 workaround needed for Viking Hardware Problem
1189967 real-time latency limits exceeded occasionally
1231871 cpu_surrender doesn't check for threads waiting on kp queue

(from 103093-01)

1228664 no fp queue in the signal handler mcontext for floating point ieee exceptions

(from 103489-01)

1248186 ULTRA panics with 3rd party ATM card driver

(from 103084-02)

1235169 ftp tests cause "le0 port" hang on Neutron
        ftp sessions hang neutron

(from 103084-01)

1229015 interrupt fails to get to driver

(from 103325-03)

1251430 Solaris 2.5 system panicked with message "lm_get_sysid: too many lm_sysid's"

(from 103325-02)

1251430 Solaris 2.5 system panicked with message "lm_get_sysid: too many lm_sysid's"

(from 103325-01)

1238919 mount causes the system to panic Data fault.

(from 103164-07)

1258191 msgrcv was not interrupted by thr_suspend(SIGLWP).
(from 103164-06)

1260769 MT application is dropping signal events when run on multi-processor systems

(from 103164-05)

1247172 Threads losing signals when preempted

(from 103164-04)

1241118 libthread panic in thr_join handling of zombie threads seems to be broken

(from 103164-03)

1253366 threads deadlock occurs in delivering SIGIO

(from 103164-02)

1230478 deadlock in libthread

(from 103164-01)

1230865 Problem with threads and signals.

Patch Installation Instructions:
--------------------------------
Refer to the Install.info file within the patch for instructions on
using the generic 'installpatch' and 'backoutpatch' scripts provided
with each patch. Any other special or non-generic installation
instructions should be described below.

Special Install Instructions:
-----------------------------
Reboot the system after patch installation.

NOTE: TO GET THE COMPLETE FIX FOR 4032974, ONE NEEDS TO INSTALL
      THE FOLLOWING PATCHES:

      103477-11 (or higher) kernel/strmod/rpcmod and kernel/sys/nfs patch
      104548-03 (or higher) SunOS 5.5 CS6400: isp driver fixes
      102982-02 (or higher) usr/bin/csh patch

      FAILURE TO INSTALL ALL THESE PATCHES WILL CAUSE THE SYSTEM
      TO HANG AFTER 248 DAYS.
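
Background on the 248-day figure (illustrative only, not part of the patch):
the note above ties the hang to bug 4032974, "system hangs when lbolt wraps
around". The sketch below shows arithmetic that is consistent with that
figure. It ASSUMES that lbolt is a signed 32-bit tick counter and that the
clock runs at 100 ticks per second; neither value is stated in this README.
The function names are hypothetical and exist only for this example.

```python
# Illustrative arithmetic only. ASSUMPTIONS (not stated in the patch README):
#   - lbolt is a signed 32-bit tick counter
#   - the clock interrupt rate is 100 ticks per second
HZ = 100
LBOLT_BITS = 32
SECONDS_PER_DAY = 86400


def days_until_wrap(hz=HZ, bits=LBOLT_BITS):
    """Days of uptime before a signed counter of `bits` bits overflows."""
    max_ticks = 2 ** (bits - 1)  # a signed counter turns negative past this
    return max_ticks / hz / SECONDS_PER_DAY


def ticks_elapsed(now, then, bits=LBOLT_BITS):
    """Wrap-safe tick difference: subtract modulo 2**bits instead of
    comparing the raw counter values directly."""
    return (now - then) % (2 ** bits)


if __name__ == "__main__":
    print(f"{days_until_wrap():.2f} days")  # prints "248.55 days"
```

Under these assumptions the counter overflows after roughly 248.5 days of
uptime, which matches the interval quoted in the note. The ticks_elapsed()
helper shows the general technique of taking tick differences modulo the
counter width so that a comparison stays valid across a wrap; it is a sketch
of that idea, not a description of how the patched kernel code works.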