Starting statistics collection

You can start the collection of cluster statistics from the Starting the Collection of Statistics panel in the SAN Volume Controller Console.

Introduction

For each collection interval, the SAN Volume Controller Console creates three statistics files: one for managed disks (MDisks), named Nm_stats; one for virtual disks (VDisks), named Nv_stats; and one for nodes, named Nn_stats. The files are written to the /dumps/iostats directory on the node. To retrieve the statistics files from the non-configuration nodes onto the configuration node, use the svctask cpdumps command.
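As a rough illustration of the naming scheme above, the following Python sketch maps a statistics file name to the type of object that it describes. The function name and the prefix-matching approach are assumptions made for illustration only; file names on a node might carry additional suffixes that this topic does not describe.

```python
# Hypothetical helper: classify a statistics file by the name prefixes
# listed in this topic. Prefix matching is an assumption, used so that
# any suffix appended to the base name is tolerated.
STAT_FILE_PREFIXES = {
    "Nm_stat": "MDisk",
    "Nv_stat": "VDisk",
    "Nn_stat": "node",
}


def classify_stats_file(filename):
    """Return 'MDisk', 'VDisk', or 'node', or None if the name is not recognized."""
    for prefix, kind in STAT_FILE_PREFIXES.items():
        if filename.startswith(prefix):
            return kind
    return None
```

A script that processes a copy of the /dumps/iostats directory could use such a helper to route each file to the matching parser.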

A maximum of 16 files of each type can be created for the node. When the 17th file is created, the oldest file for the node is overwritten.

Fields

The following fields are available:
Interval
Specify the interval in minutes between the collection of statistics. You can specify from 1 to 60 minutes in increments of 1 minute.
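Because each node keeps at most 16 files of each type, the interval also determines how much history is available before the oldest file is overwritten. The following Python sketch checks an interval against the 1 to 60 minute range and computes that window. The function and constant names are hypothetical, and the calculation assumes that exactly one file of each type is written per interval, as described in the Introduction.

```python
# Limit taken from this topic: the 17th file overwrites the oldest one.
MAX_FILES_PER_TYPE = 16


def history_window_minutes(interval_minutes):
    """Minutes of history kept per node for one file type at the given interval."""
    if not isinstance(interval_minutes, int) or not 1 <= interval_minutes <= 60:
        raise ValueError("interval must be a whole number of minutes from 1 to 60")
    return MAX_FILES_PER_TYPE * interval_minutes
```

For example, history_window_minutes(15) returns 240, that is, four hours of statistics per node before files start to be overwritten.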
The following tables describe the information that is reported for individual nodes:
Table 1. Statistics collection for MDisks for individual nodes. Describes the MDisk information that is reported for individual nodes
Statistic name Description
idx Indicates the identifier of the MDisk for which the statistics apply.
id Indicates the name of the MDisk for which the statistics apply.
ro Indicates the number of MDisk read operations processed during the sample period.
wo Indicates the number of MDisk write operations processed during the sample period.
rb Indicates the number of blocks of data read during the sample period.
wb Indicates the number of blocks of data written during the sample period.
re Indicates the cumulative read external response time in milliseconds for each MDisk. The cumulative response time for disk reads is calculated by starting a timer when a SCSI read command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
we Indicates the cumulative write external response time in milliseconds for each MDisk. The cumulative response time for disk writes is calculated by starting a timer when a SCSI write command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
rq Indicates the cumulative read queued response time in milliseconds for each MDisk. This time is measured from above the queue of commands that are waiting to be sent to an MDisk because the queue depth is already full. The calculation includes the elapsed time taken for read commands to complete from the time they join the queue.
wq Indicates the cumulative write queued response time in milliseconds for each MDisk. This time is measured from above the queue of commands that are waiting to be sent to an MDisk because the queue depth is already full. The calculation includes the elapsed time taken for write commands to complete from the time they join the queue.
Note: MDisk statistics files for nodes are written to the /dumps/iostats directory on the individual node.
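The cumulative response-time counters in Table 1 are easier to interpret as an average per operation. The following Python sketch shows one way to derive that average. It assumes that re and we keep accumulating from one sample to the next, so that the difference between two consecutive samples is the time spent in the period, and that the operation count passed in (ro or wo) covers the same period. The function name and these assumptions are illustrative and are not taken from the product documentation.

```python
def average_latency_ms(cumulative_ms_prev, cumulative_ms_now, operations):
    """Average response time per operation, in milliseconds, for one sample period.

    cumulative_ms_prev / cumulative_ms_now: a cumulative counter such as re or we
    from two consecutive samples (assumed to keep accumulating between samples).
    operations: the matching operation count (ro or wo) for the same period.
    Returns None when the average cannot be computed.
    """
    if operations <= 0:
        return None
    elapsed_ms = cumulative_ms_now - cumulative_ms_prev
    if elapsed_ms < 0:
        # Assumption: a smaller value means that the counter was reset.
        return None
    return elapsed_ms / operations
```

The same calculation applies to rq and wq. Comparing the queued average with the external average for the same MDisk can suggest whether time is spent waiting in the queue or at the disk controller.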
Table 2. Statistic collection for VDisks for individual nodes. Describes the VDisk information that is reported for individual nodes
Statistic name Description
idx Indicates the VDisk ID for which the statistics apply.
id Indicates the VDisk name for which the statistics apply.
ro Indicates the number of VDisk read operations processed during the sample period.
wo Indicates the number of VDisk write operations processed during the sample period.
rb Indicates the number of blocks of data read during the sample period.
wb Indicates the number of blocks of data written during the sample period.
rl Indicates the cumulative read response time in milliseconds for each VDisk. The cumulative response time for VDisk reads is calculated by starting a timer when a SCSI read command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
wl Indicates the cumulative write response time in milliseconds for each VDisk. The cumulative response time for VDisk writes is calculated by starting a timer when a SCSI write command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
rlw Indicates the worst read response time in microseconds for each VDisk since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
wlw Indicates the worst write response time in microseconds for each VDisk since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
xl Indicates the cumulative data transfer response time in milliseconds for each VDisk since the last time statistics were collected. When this statistic is viewed for multiple VDisks and with other statistics, it can indicate if the latency is caused by the host, fabric, or the SAN Volume Controller.
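Table 2 mixes units: rl and wl are reported in milliseconds, while rlw and wlw are reported in microseconds. The following Python sketch converts the worst-case value to milliseconds and relates it to the average for the same period. The function name is hypothetical, and the sketch assumes that the cumulative value passed in covers only the sample period in question (for example, the difference between two consecutive samples).

```python
def worst_to_average_ratio(worst_us, cumulative_ms, operations):
    """Ratio of the worst response time to the average response time.

    worst_us: rlw or wlw, in microseconds.
    cumulative_ms: rl or wl accumulated in the same period, in milliseconds.
    operations: ro or wo for the same period.
    Returns None when the average cannot be computed.
    """
    if operations <= 0 or cumulative_ms <= 0:
        return None
    average_ms = cumulative_ms / operations
    worst_ms = worst_us / 1000.0  # microseconds to milliseconds
    return worst_ms / average_ms
```

A ratio close to 1 suggests uniform response times. A large ratio suggests that a few operations were much slower than the rest.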
Table 3. Partitioning statistic collection for individual nodes. Describes the statistics that are related to partitions.
Statistic name Description
dlav Indicates the destage latency average for the node since the last statistics collection period.
dlcn Indicates the destage count in the sample period.
dlmn Indicates the destage minimum latency in microseconds.
dlmx Indicates the destage maximum latency in microseconds.
plav Indicates the prestage latency average for the node since the last statistics collection period.
plcn Indicates the prestage count in the sample period.
plmn Indicates the prestage minimum latency in microseconds.
plmx Indicates the prestage maximum latency in microseconds.
slav Indicates the stage latency average in microseconds for the node since the last statistics collection period.
slcn Indicates the stage count in the sample period.
slmn Indicates the stage minimum latency in microseconds.
slmx Indicates the stage maximum latency in microseconds.
mifav Indicates the destaged messages in flight.
mifcn Indicates the number of in-flight messages since the last statistics collection period.
mifmn Indicates the minimum number of in-flight messages in the sample period.
mifmx Indicates the maximum number of in-flight messages in the sample period.
mifgav Indicates the number of in-flight message guides.
mifgcn Indicates the number of in-flight message guides in the sample period.
mifgmn Indicates the minimum number of in-flight message guides in the sample period.
mifgmx Indicates the maximum number of in-flight message guides in the sample period.
taav Indicates the average latency in microseconds for begin track access.
tacn Indicates the number of begin track access in the sample period.
tamn Indicates the minimum latency in microseconds for begin track access.
tamx Indicates the maximum latency in microseconds for begin track access.
tlav Indicates the average latency in microseconds for get track lock.
tlcn Indicates the number of get track lock operations in the sample period.
tlmn Indicates the minimum latency in microseconds for get track lock.
tlmx Indicates the maximum latency in microseconds for get track lock.
pfav Indicates the average for the partition fullness.
pfcn Indicates the partition fullness count in the sample period.
pfmn Indicates the minimum partition fullness in the sample period.
pfmx Indicates the maximum partition fullness in the sample period.
Table 4. Statistic collection for VDisks cache per individual nodes. Describes the VDisk cache information that is reported for individual nodes
Statistic name Description
ctr Indicates the total number of track reads received. For example, if a single read spans two tracks, it is counted as two total track reads.
ctrs Indicates the total number of sectors read for reads received.
ctw Indicates the total number of track writes received. For example, if a single write spans two tracks, it is counted as two total track writes.
ctws Indicates the total number of sectors written for writes received from components.
ctp Indicates the number of track stages that are initiated by the cache that are prestage reads.
ctps Indicates the total number of staged sectors initiated by the cache.
ctrh Indicates the number of total track read-cache hits on prestage or non-prestage data. For example, a single read that spans two tracks, where only one of the tracks obtained a total cache hit, is counted as one track read-cache hit.
ctrhs Indicates the total number of sectors read for reads received from other components that have obtained total cache hits on prestage or non-prestage data.
ctrhp Indicates the number of track reads received from other components that have been treated as cache hits on any prestaged data. For example, if a single read spans two tracks where only one of the tracks obtained a total cache hit on prestaged data, it is counted as one track read for the prestaged data. A cache hit that obtains a partial hit on prestage and non-prestage data still contributes to this value.
ctrhps Indicates the total number of sectors read for reads received from other components that have obtained cache hits on any prestaged data.
ctrm Indicates the number of track reads received from other components that have cache misses. A partial cache hit is counted as a cache miss because the SAN Volume Controller cache does not have the concept of a partial cache hit.
ctrms Indicates the total number of sectors read for reads received from other components that have cache misses.
ctd Indicates the total number of cache track initiated writes submitted to other components as a result of a VDisk cache flush or destage operation on a track basis.
ctds Indicates the total number of sectors written for cache-initiated track writes.
ctwft Indicates the number of track writes received from other components and processed in flush-through mode.
ctwfts Indicates the total number of sectors written for writes received from other components and processed in flush-through mode.
ctwwt Indicates the number of track writes received from other components and processed in write-through mode.
ctwwts Indicates the total number of sectors written for writes received from other components and processed in write-through mode.
ctwfw Indicates the number of track writes received from other components and processed in fast-write mode.
ctwfws Indicates the total number of sectors written for writes received from other components and processed in fast-write mode.
ctwfwsh Indicates the number of track writes in fast-write mode that were written in write-through mode because of a lack of memory.
ctwfwshs Indicates the total number of sectors for track writes in fast-write mode that were written in write-through mode because of a lack of memory.
ctwm Indicates the number of track writes received from other components where some of the sectors in the write data resulted in new dirty data being generated in the cache. A partial write cache hit counts as a write cache miss. Low resource writes do not contribute to this counter.
ctwms Indicates the total number of sectors received from components where some of the sectors in the write data resulted in new dirty data being generated in the cache.
ctwh Indicates the number of track writes received from other components where every sector in the track obtained a write hit on already dirty data in the cache. For a write to count as a total cache hit, the entire track write data must already be marked in the write cache as dirty.
ctwhs Indicates the total number of sectors received from other components where every sector in the track obtained a write hit on already dirty data in the cache.
cm Indicates the number of sectors of modified or dirty data held in the cache.
cv Indicates the number of sectors of read and write cache data held in the cache.
dlav Indicates the destage latency average in microseconds for the VDisk since the last statistics collection period.
plav Indicates the prestage latency average in microseconds for the VDisk since the last statistics collection period.
slav Indicates the stage latency average in microseconds for the VDisk since the last statistics collection period.
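The counters in Table 4 can be combined into a read cache hit percentage. The following Python sketch divides the total track read-cache hits (ctrh) by the total track reads (ctr). The function name is hypothetical, and the sketch assumes that both values come from the same VDisk and cover the same period.

```python
def read_cache_hit_percent(ctr, ctrh):
    """Percentage of track reads that were total cache hits.

    ctr: total number of track reads received.
    ctrh: number of total track read-cache hits.
    Returns None when no track reads were received.
    """
    if ctr <= 0:
        return None
    return 100.0 * ctrh / ctr
```

For example, 150 hits out of 200 track reads gives 75.0 percent. Because a partial hit is counted as a miss, this percentage is a conservative view of how much data was served from the cache.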
Table 5. Statistic collection for VDisks that are used in Metro Mirror and Global Mirror relationships for individual nodes. Describes the VDisk information related to Metro Mirror or Global Mirror relationships that is reported for individual nodes
Statistic name Description
gwo Indicates the total number of overlapping VDisk writes. An overlapping write occurs when the logical block address (LBA) range of a write request collides with another outstanding request to the same LBA range and the write request is still outstanding to the secondary site.
gwot Indicates the total number of fixed or unfixed overlapping writes. When all nodes in all clusters are running SAN Volume Controller version 4.3.1, this records the total number of write I/O requests received by the Global Mirror feature on the primary that have overlapped. When any nodes in either cluster are running SAN Volume Controller versions earlier than 4.3.1, this value does not increment.
gws Indicates the total number of write requests that have been issued to the secondary site.
gwl Indicates the cumulative secondary write latency in milliseconds. This statistic accumulates the secondary write latency for each VDisk. You can calculate the amount of time to recover from a failure based on this statistic and the gws statistic.
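One simple way to combine gwl and gws is to compute the average latency of a write to the secondary site. The following Python sketch shows that division. The function name is hypothetical, and the sketch assumes that both values cover the same period. It illustrates the arithmetic only and is not a documented recovery-time formula.

```python
def average_secondary_write_latency_ms(gwl, gws):
    """Average latency, in milliseconds, of one write issued to the secondary site.

    gwl: cumulative secondary write latency in milliseconds.
    gws: number of write requests issued to the secondary site.
    Returns None when no writes were issued.
    """
    if gws <= 0:
        return None
    return gwl / gws
```

For example, 4500 milliseconds of accumulated latency over 1500 writes gives an average of 3.0 milliseconds per secondary write.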
Table 6. Statistic collection for node ports. Describes the port information that is reported for individual nodes
Statistic name Description
cpu_busy Indicates the number of busy milliseconds since the node was reset. This statistic reports the amount of time that the processor has spent polling while waiting for work versus actually doing work. This statistic accumulates from zero.
id Indicates the port identifier for the node.
wwpn Indicates the worldwide port name for the node.
hbt Indicates the bytes transmitted to hosts.
cbt Indicates the bytes transmitted to disk controllers.
lnbt Indicates the bytes transmitted to other nodes in the same cluster.
rmbt Indicates the bytes transmitted to other nodes in the other clusters.
hbr Indicates the bytes received from hosts.
cbr Indicates the bytes received from controllers.
lnbr Indicates the bytes received from other nodes in the same cluster.
rmbr Indicates the bytes received from other nodes in the other clusters.
het Indicates the commands initiated to hosts.
cet Indicates the commands initiated to disk controllers.
lnet Indicates the commands initiated to other nodes in the same cluster.
rmet Indicates the commands initiated to other nodes in the other clusters.
her Indicates the commands received from hosts.
cer Indicates the commands received from disk controllers.
lner Indicates the commands received from other nodes in the same cluster.
rmer Indicates the commands received from other nodes in the other clusters.
lf Indicates a link failure count.
lsy Indicates the loss-of-synchronization count.
lsi Indicates the loss-of-signal count.
pspe Indicates the primitive sequence-protocol error count.
itw Indicates the number of transmission words that are not valid.
icrc Indicates the number of CRCs that are not valid.
bbcz Indicates the total time in microseconds for which the port had data to send but was prevented from doing so by a lack of buffer credit from the switch.
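The transmit counters in Table 6 split port traffic by destination. The following Python sketch turns the four byte counters into percentage shares, which can help show whether a port is mainly serving hosts, disk controllers, or node-to-node traffic. The function name and the dictionary keys are hypothetical, and the sketch assumes that all four values come from the same port and the same sample.

```python
def transmit_share_percent(hbt, cbt, lnbt, rmbt):
    """Share of transmitted bytes per destination class, as percentages.

    hbt: bytes transmitted to hosts.
    cbt: bytes transmitted to disk controllers.
    lnbt: bytes transmitted to other nodes in the same cluster.
    rmbt: bytes transmitted to nodes in other clusters.
    """
    counters = {
        "hosts": hbt,
        "controllers": cbt,
        "local_nodes": lnbt,
        "remote_nodes": rmbt,
    }
    total = sum(counters.values())
    if total == 0:
        # No traffic at all: report every share as zero.
        return {name: 0.0 for name in counters}
    return {name: 100.0 * value / total for name, value in counters.items()}
```

The same approach works for the receive counters (hbr, cbr, lnbr, and rmbr).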
Table 7. Statistic collection for nodes. Describes the node information that is reported for each node
Statistic name Description
id Indicates the name of the node.
cluster Indicates the name of the cluster.
node_id Indicates the unique identifier for the node.
cluster_id Indicates the unique identifier for the cluster.
ro Indicates the number of messages or bulk data received.
wo Indicates the number of messages or bulk data sent.
rb Indicates the number of bytes received.
wb Indicates the bytes sent.
rq Indicates the accumulated receive latency, including inbound queue time. This statistic is the latency from the time that a command arrives at the node communication layer to the time that the cache completes the command.
re Indicates the accumulated receive latency, excluding inbound queue time. This statistic is the latency that is experienced by the node communication layer from the time that an I/O is queued to cache until the time that the cache gives completion for it.
wq Indicates the accumulated send latency, including outbound queue time. This statistic includes the entire time that data is sent. This time includes the time from when the node communication layer receives a message and waits for resources, the time to send the message to the remote node, and the time taken for the remote node to respond.
we Indicates the accumulated send latency, excluding outbound queue time. This statistic is the time from when the node communication layer issues a message out onto the fibre channel until the node communication layer receives notification that the message has arrived.
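Because rq and wq include queue time while re and we exclude it, the difference between each pair gives the time that was spent in the queue. The following Python sketch shows that subtraction. The function name is hypothetical, and the sketch assumes that both values come from the same node, cover the same period, and use the same units.

```python
def queue_time(latency_including_queue, latency_excluding_queue):
    """Accumulated time spent queued, in the same units as the inputs.

    For the inbound direction, pass rq and re.
    For the outbound direction, pass wq and we.
    A negative difference is clamped to zero (assumption: it would only
    arise from inconsistent samples).
    """
    return max(latency_including_queue - latency_excluding_queue, 0)
```

For example, queue_time(900, 650) returns 250. A queue time that is large relative to the total latency suggests that messages are waiting for resources rather than being delayed on the link.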
Table 8. Global cache statistic collection for nodes. Describes the global cache statistics for nodes
Statistic name Description
drll Indicates the demote list length.
gcll Indicates the number of global copies for the cache.
mcll Indicates the modified list length.
dlav Indicates the destage latency average for the node since the last statistics collection period.
dlcn Indicates the destage count in the sample period.
dlmn Indicates the destage minimum latency.
dlmx Indicates the destage maximum latency.
plav Indicates the prestage latency average for the node since the last statistics collection period.
plcn Indicates the prestage count in the sample period.
plmn Indicates the prestage minimum latency.
plmx Indicates the prestage maximum latency.
slav Indicates the stage latency average for the node since the last statistics collection period.
slcn Indicates the stage count in the sample period.
slmn Indicates the stage minimum latency.
slmx Indicates the stage maximum latency.
cfav Indicates the average for cache fullness.
cfcn Indicates the cache fullness count in the sample period.
cfmn Indicates the minimum percentage of cache fullness.
cfmx Indicates the maximum percentage of cache fullness.
wcfav Indicates the average percentage of write-cache fullness.
wcfcn Indicates the write-cache fullness count in the sample period.
wcfmn Indicates the minimum percentage of write-cache fullness.
wcfmx Indicates the maximum percentage of write-cache fullness.
dtav Indicates the average latency in microseconds for data transfers.
dtcn Indicates the number of data transfers in the sample period.
dtmn Indicates the minimum latency in microseconds for data transfers.
dtmx Indicates the maximum latency in microseconds for data transfers.
taav Indicates the average latency in microseconds for track access.
tacn Indicates the track access count in the sample period.
tamn Indicates the minimum latency for track access.
tamx Indicates the maximum latency for track access.
tlav Indicates the average latency in microseconds for track locks.
tlcn Indicates the number of track locks in the sample period.
tlmn Indicates the minimum latency in microseconds for track locks.
tlmx Indicates the maximum latency in microseconds for track locks.

Actions

The following actions are available:

OK
Click this button to change statistic collection.
Cancel
Click this button to exit the panel without changing statistic collection.
© Copyright IBM Corporation 2003, 2009. All Rights Reserved.