HPE INTERNAL USE ONLY
Analysis Code: 1020
Severity: Error
INEX has found evidence of a Stack Trace in a node’s /var/log/messages[.*] file.
In many cases, the Stack Trace that INEX has called out comes from a different
node than the one whose messages file it appears in. A Stack Trace from one node
is written to another node’s messages file so that the information is not lost;
the OS takes advantage of the fact that the nodes are clustered together.
If a Stack Trace does appear, review the information on the UpDown tab of this
workbook and compare its time stamps with those of the Stack Trace to determine
whether the node in question actually crashed. Then check STaTS for any crash
files (crashtxt and/or crashdmp). If crash-related files are found, process the
crash as you normally would. Keep in mind that the Stack Trace information INEX
has found may be very useful if it is not present in the crashtxt or analysis.*
file of the crash dump. Refer to the INEX User’s Guide.
An example of a Stack Trace seen by INEX:
Node 1 panic stack trace:
(tpd_panic+0x10b)
(lckevt_clock_timeout+0x7d)
(run_timer_softirq+0x14c)
(__do_softirq+0xcf)
(call_softirq+0x1c)
(do_softirq+0x6d)
(irq_exit+0x75)
(smp_apic_timer_interrupt+0x45)
(apic_timer_interrupt+0x13)
As you can see, this Stack Trace is for Node 1.
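
To check quickly which node a Stack Trace belongs to, the header line can be
parsed out of the messages file. The following is only a minimal sketch in
Python, not part of INEX. It assumes the header always has the form
"Node <N> panic stack trace:" and that each frame has the form
"(symbol+0xoffset)", as in the example above. The function name
extract_panic_traces and both patterns are hypothetical and may need adjusting
to the real log format.

```python
import re

# Assumption: header format taken from the example, "Node <N> panic stack trace:"
HEADER_RE = re.compile(r"Node\s+(\d+)\s+panic stack trace:")
# Assumption: each frame looks like "(symbol+0xoffset)"
FRAME_RE = re.compile(r"\(([A-Za-z_][A-Za-z0-9_.]*)\+0x([0-9a-fA-F]+)\)")


def extract_panic_traces(lines):
    """Return a list of (node_id, [(symbol, offset), ...]) tuples.

    Hypothetical helper: 'lines' is any iterable of log lines, for example
    an open /var/log/messages file.
    """
    traces = []
    current = None  # the trace currently being collected, if any
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            # Start a new trace for the node named in the header.
            current = (int(header.group(1)), [])
            traces.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Record the symbol name and its offset (parsed as hexadecimal).
            current[1].append((frame.group(1), int(frame.group(2), 16)))
        elif current is not None and line.strip():
            # A non-empty line that is not a frame ends the current trace.
            current = None
    return traces
```

If the node number returned for a trace differs from the node whose messages
file was read, the Stack Trace was recorded on behalf of another node, and the
UpDown tab for that other node is the one to compare against.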