HPE INTERNAL USE ONLY
Analysis Code: 201
Severity: Warning
INEX is checking “showeventlog_-d_-debug_-oneline.out” for the string:
"scsi_cmnd_retry: pd .* opcode .* rval 0x31"
We are looking for the event occurring 15 times per minute per unique pd, unique port, and unique <N:S:P>.
Based on the information provided, check the following:
· Is the issue seen on a single PD or on multiple PDs?
· Is the issue seen on a single port or on multiple ports?
Plan of Action:
If the issue is seen on a single pd:
For example:
480 pd 1 port b0 on 0:0:1
- controlport rst -l <port>
- After 15 min, check whether the issue is still present. If it is, reseat the drive.
- After another 15 min, check whether the issue is still present. If it is, elevate the issue.
If the issue is seen on multiple drives that all point to the same port:
- controlport rst -l <port>
- After 15 min, check whether the issue is still present. If it is, elevate the issue.
If the issue is seen on multiple drives that all point to the same <N:S>:
For example:
520 pd 240 port a0 on 3:0:1
300 pd 264 port a0 on 3:0:1
264 pd 274 port a0 on 3:0:2
542 pd 276 port a0 on 3:0:2
622 pd 294 port a0 on 3:0:2
- controlport rst -l <port>
If the issue is seen on multiple drives that point to multiple <N:S>:
- This situation may require multiple controlport reset commands, which may affect multiple hosts and their I/O. Weigh the multiple port resets carefully against issuing a cluster shutdown to resolve the issue. Elevate the issue.
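The four cases above can be told apart mechanically from the summarized counts. The following is a minimal sketch, assuming input lines in the `<count> pd <id> port <name> on <N:S:P>` format shown in the examples. `classify_scope` is a hypothetical helper name, not an array CLI command, and it reads "same port" as "same <N:S:P>", which is an assumption.

```shell
# classify_scope: read summary lines "<count> pd <id> port <name> on <N:S:P>"
# on stdin and print which Plan of Action case applies.
# Assumption: "same port" is interpreted here as "same <N:S:P>".
classify_scope() {
    awk '
        $2 == "pd" && $6 == "on" {
            pds[$3] = 1             # distinct physical disks
            nsp[$7] = 1             # distinct <N:S:P> values
            split($7, f, ":")
            ns[f[1] ":" f[2]] = 1   # distinct <N:S> values
        }
        END {
            for (k in pds) npd++
            for (k in nsp) nnsp++
            for (k in ns)  nns++
            if (npd == 0)       print "no-events"
            else if (npd == 1)  print "single-pd"
            else if (nnsp == 1) print "multiple-pds-same-port"
            else if (nns == 1)  print "multiple-pds-same-ns"
            else                print "multiple-pds-multiple-ns"
        }
    '
}
```

For instance, feeding it the five-line 3:0:1 / 3:0:2 example prints `multiple-pds-same-ns`. The result only selects the case; the actions themselves stay as described in the plan.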
The check can be run on the array with the following command, which collects and displays the data.
Get the last 15 min of the event log and check for events matching the "scsi_cmnd_retry: pd .* opcode .* rval 0x31" pattern:
# showeventlog -oneline -debug -min 15 -msg "scsi_cmnd_retry: pd .* opcode .* rval 0x31" |\
sed -e "s/.*scsi_cmnd_retry: //g" -e "s/ - opcode.*//g" | sort | uniq -c
Multiplication factor based on the minutes covered:
- 15 events/minute * 15 min = 225 events per port of the drive
i.e., if a pd/port entry shows 200 or more events, corrective action is needed.
For example, in the data below, pd 1 and pd 3 reported the issue:
480 pd 1 port b0 on 0:0:1
486 pd 3 port b0 on 0:0:1
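The arithmetic and the threshold check above can be applied to the summarized counts directly. The following is a minimal sketch, assuming the `uniq -c` style lines shown in the example. `expected_events` and `over_threshold` are hypothetical helper names, and the default limit of 200 is simply the figure quoted in the text.

```shell
# expected_events <minutes>: 15 events/minute scaled to the window length.
expected_events() {
    echo $(( 15 * $1 ))
}

# over_threshold [limit]: print only summary lines whose leading count
# is at or above the limit (default 200, the figure quoted in the text).
over_threshold() {
    awk -v limit="${1:-200}" '$1 + 0 >= limit + 0'
}
```

`expected_events 15` prints 225, matching the calculation above. Piping the summary lines through `over_threshold` leaves only the pd/port entries that need corrective action; pass a different limit as the first argument if the event log window is not 15 minutes.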