Analysis Code: 64 and 65
Severity: Error
Both analysis codes, 64 and 65, pertain to SanDisk OPTIMUS SSDs, in particular the ECO and ASCEND models.
While both models may present the same error, the cause of the error differs between the two models.
The following CFIs reference this analysis code:
CFI: 5872
Reference SAW: http://h41302.www4.hp.com/km/saw/view.do?docId=emr_na-c04780380
Also, the following information will be useful:
BZ: 142874, OPTIMUS Drive and "Media Format Corruption"
Background:
The following model numbers are impacted:
OPTIMUS Eco:
"DOPE0480S5xnNMRI" SmrtStor
"DOPE1920S5xnNMRI" SmrtStor
Fixed in firmware 3P02 and higher (3P04 is the latest).
OPTIMUS Ascend:
"DOPA0480S5xnNMRI" SanDisk
"DOPA0920S5xnNMRI" SanDisk
Fixed in firmware 3P03 and higher.
SanDisk acquired SmartStor, so both models are now SanDisk drives.
Both of these product lines can present with the same error/symptom, "Media Format Corrupted"; however, the root causes are different.
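The model prefixes and fix levels listed above can be captured in a small lookup. The following is a minimal Python sketch, not a supported tool: the function names are hypothetical, and it assumes that firmware revisions always take the form digits, the letter P, digits (for example 3P02) and that they order numerically.

```python
import re

# Minimum firmware revision that contains the fix, per drive family.
# Keys are the first four characters of the model number.
FIXED_IN = {"DOPE": "3P02", "DOPA": "3P03"}


def parse_revision(rev):
    """Split a revision such as '3P02' into the tuple (3, 2).

    Assumption: revisions look like <digits>P<digits>; anything else is rejected.
    """
    match = re.fullmatch(r"(\d+)P(\d+)", rev.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware revision: {rev!r}")
    return int(match.group(1)), int(match.group(2))


def has_fix(model, firmware):
    """Return True if the firmware is at or above the fix level for the model."""
    family = model.strip().strip('"')[:4].upper()
    if family not in FIXED_IN:
        raise ValueError(f"not an affected OPTIMUS Eco/Ascend model: {model!r}")
    return parse_revision(firmware) >= parse_revision(FIXED_IN[family])
```

Under these assumptions, has_fix("DOPE0480S5xnNMRI", "3P01") is False, while the same model at 3P04 returns True. Being at the fix level does not by itself mean that no reformat is needed.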
Problem Description:
ASCEND (DOPA):
The "Media Format Corrupted" issue on ASCEND drive models results from the drive having been downgraded at some point. If one of these drives is downgraded from a newer firmware version, the older firmware does not understand the changes made by the newer firmware.
Firmware 3P02 and higher uses different data patterns for read interactions with different types of blocks. The LBA range for the unmap function was also increased.
Fix:
Upgrade to 3P03. Because the changes are in the firmware and not on the physical drive itself, the upgrade process accounts for the differences between firmware versions.
Thus, no further action beyond the upgrade is required.
When to reformat:
1. After a drive has been downgraded, or when there is suspicion that the drive may have been downgraded at some point prior to the current upgrade.
2. When the drive encounters the "Media Format Corrupted" error.
ECO (DOPE):
Firmware prior to 3P02 (current is 3P04) could encounter a problem, caused by the timing of a drive reset, in which an entry in the L2P (logical-to-physical) table is marked as stale. If a later I/O tries to access the stale entry in the L2P table, the drive returns the "Media Format Corrupted" error. Because this involves the L2P table, it involves metadata at the physical drive level.
Fix:
Upgrade to 3P04 or higher. Reformatting may be necessary.
When to reformat:
1. If the drives being upgraded are currently at firmware revision 3P01 or earlier.
2. When the drive encounters the "Media Format Corrupted" error.
3. If these are new/replacement drives, even at 3P04, and they are *NOT* from HP directly. We do not know the state of drives from 3rd-party suppliers/OEMs or from our logistics.
4. If the customer has ever downgraded the drive.
Only for ECO (DOPE) PDs:
The SanDisk failure analysis report states: "When 3P02 or later firmware is loaded, the drive must be formatted to ensure that earlier un-formats/power cycle conditions did not leave page count mismatches in LBA’s causing a format corrupt issue."
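The two "When to reformat" lists can be read as a simple check. The sketch below is illustrative Python only: the DriveState fields are hypothetical names for facts a support engineer has to establish by hand, and nothing here queries the array.

```python
from dataclasses import dataclass


@dataclass
class DriveState:
    family: str                    # "DOPA" (ASCEND) or "DOPE" (ECO)
    downgraded: bool               # for ASCEND, also set when a downgrade is only suspected
    media_format_corrupted: bool   # drive has reported "Media Format Corrupted"
    upgraded_from_3p01_or_earlier: bool = False   # ECO criterion 1
    new_or_replacement_not_from_hp: bool = False  # ECO criterion 3


def needs_reformat(drive):
    """Apply the reformat criteria listed for each drive family."""
    common = drive.downgraded or drive.media_format_corrupted
    if drive.family == "DOPA":
        return common
    if drive.family == "DOPE":
        return (
            common
            or drive.upgraded_from_3p01_or_earlier
            or drive.new_or_replacement_not_from_hp
        )
    raise ValueError(f"unknown drive family: {drive.family!r}")
```

For example, an ECO drive that was upgraded from 3P01 needs a reformat even if it has never shown the error, whereas an ASCEND drive with no downgrade history and no error does not.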
Reformatting:
For existing drives that may have data/chunklets allocated to the drive in question:
1. Vacate the drive. Make sure that you use the "-perm" switch, because we do not know what PD ID the OS will assign when the drive is re-admitted. You also need to remove all the spares.
2. Retrieve the WWN of the drive to be reformatted.
3. Upgrade the firmware if necessary. (upgradepd -w <WWN>)
4. Dismiss the drive.
5. Reformat the drive. (use: controlpd format 520 <wwn_of_disk>)
6. Re-admit the drive. Make sure that spares are also allocated back onto the drive.
7. Execute a tune to help move data back.
For replacement drives:
1. Use the noautosmag touch file to disable auto admit with the SMAG operations. (use: onallnodes 'touch /var/opt/tpd/touchfiles/noautosmag')
2. Use showpd -i to find the drive's WWN.
3. Upgrade the firmware. (upgradepd -w <WWN>)
4. Reformat the drive. (use: controlpd format 520 <wwn_of_disk>)
5. Remove the aforementioned touch file.
6. Resume the SMAG operation.
For new drives:
1. Do not admit the drives. (use system variable mvar outlined below)
2. Use showpd -i to find each drive's WWN.
3. Upgrade the firmware. (upgradepd -w <WWN>)
4. Reformat the drive. (use: controlpd format 520 <wwn_of_disk>)
5. Admit the drives.
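The three procedures share the firmware upgrade and format commands and differ mainly in how the drive is kept out of service. As a rough illustration, the Python sketch below returns each procedure as a numbered checklist for a given WWN. It fills in only the commands already quoted in this document (showpd -i, upgradepd -w, controlpd format 520, and the noautosmag touch file); steps for which no command is given are left as plain text. The function name, the scenario keys, and the WWN in the usage line are hypothetical.

```python
# Command quoted in the replacement-drive procedure.
TOUCH_FILE_CMD = "onallnodes 'touch /var/opt/tpd/touchfiles/noautosmag'"


def checklist(scenario, wwn):
    """Return the numbered steps for 'existing', 'replacement', or 'new' drives."""
    find = "Find the drive's WWN: showpd -i"
    upgrade = f"Upgrade the firmware: upgradepd -w {wwn}"
    reformat = f"Reformat the drive: controlpd format 520 {wwn}"
    procedures = {
        "existing": [
            "Vacate the drive with the -perm switch and remove all spares",
            "Retrieve the WWN of the drive to be reformatted",
            f"Upgrade the firmware if necessary: upgradepd -w {wwn}",
            "Dismiss the drive",
            reformat,
            "Re-admit the drive and allocate spares back onto it",
            "Execute a tune to help move data back",
        ],
        "replacement": [
            f"Disable auto admit: {TOUCH_FILE_CMD}",
            find,
            upgrade,
            reformat,
            "Remove the noautosmag touch file",
            "Resume the SMAG operation",
        ],
        "new": [
            "Do not admit the drives",
            find,
            upgrade,
            reformat,
            "Admit the drives",
        ],
    }
    if scenario not in procedures:
        raise ValueError(f"unknown scenario: {scenario!r}")
    return [f"{n}. {step}" for n, step in enumerate(procedures[scenario], start=1)]
```

A call such as print("\n".join(checklist("replacement", "0123456789ABCDEF"))) prints the six replacement-drive steps with the placeholder WWN substituted into the upgrade and format commands.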