Cisco Unified CallManager Release 5.1/6.0 COP (Cisco Option Package) Release Notes - Version 1

 

July 18, 2007

 

This release note accompanies the Field Notice referenced below:

 

Field Notice#  FN# 62850 - MCS 7825-H2 running Cisco Unified Call Manager 5.1 or Cisco Unified Presence 6.0 sporadic server hang – 7/10/2007

 

 

Products Affected

 

Type | Tech Group | BU | Family / SW Type | Line / SW Id | HW Id / SW Ver | Remarks
Hardware | (VTG) Voice | IPCBU | SERVER | 7820 | MCS-7825-H2-IPC1 | MCS7825-H2, first generation - only affected when running Cisco Unified Communications Manager 5.1 or Cisco Unified Presence 6.0
Hardware | (VTG) Voice | IPCBU | SERVER | 7820 | MCS-7825-H2-IPC2 | MCS7825-H2, second generation - only affected when running Cisco Unified Communications Manager 5.1 or Cisco Unified Presence 6.0
Hardware | (VTG) Voice | IPCBU | SERVER | 7820 | MCS7825H2-K9-CMA2 | Cisco Unified Communications Manager 5.1 running on MCS7825-H2
Software | (VTG) Voice | IPCBU | NON-OS | CALLMGR | CM5.1-K9-DL320G4 | Software Only for Cisco Unified Communications Manager 5.1 running on HP DL320G4
Software | (VTG) Voice | IPCBU | NON-OS | CALLMGR | CM6.0-K9-DL320G4 | Software Only for Cisco Unified Communications Manager 6.0 running on HP DL320G4
Software | (VTG) Voice | IPCBU | NON-OS | CUPS | SW-CUP6.0-K9= | Software Only for Cisco Unified Presence 6.0 - only affected when running on MCS7825-H2 or HP DL320G4
Software | (VTG) Voice | IPCBU | NON-OS | CUPS | SW-CUP6.0-K9P | Software Only Promotion for Cisco Unified Presence 6.0 - only affected when running on MCS7825-H2 or HP DL320G4

 

 

DDTS

DDTS | Remarks
CSCsi75567 | MCS-7825H2-IPC1: Server randomly rebooting for no apparent reason

 

Problem Description

The MCS7825-H2 and its HP equivalent, the HP DL320-G4, are experiencing sporadic errors when running Cisco Unified Communications Manager 5.1(x), Cisco Unified Communications Manager 6.0(x), or Cisco Unified Presence 6.0(x), in which the server becomes unresponsive for a period of time. If that period exceeds 10 minutes, a system failsafe timer causes the server to reboot, after which the system returns to normal operation. This error can occur more than once on the same server.

This error has not been seen on any other MCS server models or on other Cisco Unified Communications Manager or Cisco Unified Presence versions. It has also not been seen on any other Unified Communications applications. At this time, it is limited to the product version and server combinations listed in the Products Affected section above.

Cisco and HP Engineering have identified what they believe to be a root cause and are currently collecting additional data through testing to verify that the fix described in the Workaround/Solution section below resolves the problem. Testing completed thus far has shown positive results. Cisco and HP Engineering are continuing to complete the validation process and this Field Notice will be updated as additional information becomes available.

Background

Starting with Cisco Unified Communications Manager 5.0, Cisco began offering appliance versions of various Cisco Unified Communications applications. Cisco OEMs the MCS 7800 series portfolio from two vendors. Each vendor provides Cisco with the hardware and drivers needed to run Unified Communications applications on its hardware. Although similar, the two platforms are not identical; therefore, the behavior of the appliance and of the overall application can vary from vendor to vendor, from model to model, and from release to release. In this case, the MCS7825-H2 model and the equivalent HP DL320G4 are exhibiting kernel hangs. This has not been seen in any appliance models, or in any application versions running on appliances, other than those listed in the Products Affected section above.

The error results from code executed in the HP Advanced Server Management (ASM) component, versions 7.6.0 - 7.7.0. The HP ASM software is a set of common software components that are supported on multiple operating systems and enable monitoring and control of hardware features in ProLiant servers. These features help monitor CPU utilization and detect and manage critical server software and hardware exceptions, thermal events, memory errors, loss of power, and critical disk, NIC, and network errors.

Version 7.8.0 of the HP Advanced Server Management component is expected to correct this particular system freeze and this newer component is included in the proposed patch in the Workaround/Solution section below.

Problem Symptom

A Cisco Unified Communications Manager 5.1(x) and/or 6.0(1) and/or Cisco Unified Presence 6.0 system running on an MCS7825-H2 or HP DL320G4 with this issue will appear to become non-responsive for a period of time. If that period of time exceeds 10 minutes, the system will reboot and return to normal operation. During the time of non-responsiveness, the system clock will stop updating and will continue to display the same time for the duration of the hang. The frequency of this system hang may vary from only once in many months to multiple times a week.

Workaround/Solution

A diagnostic and a patch file relating to this problem are available. The files are available at http://www.cisco.com/cgi-bin/tablebuild.pl/callmgr-utilpage and are named:

1) ciscocm.hpasm-7.8-verify.cop.sgn - diagnostic file
2) ciscocm.hpasm-7.8-install.cop.sgn - patch file

The first file is a benign diagnostic file that does not affect any server resources; it simply indicates whether the server qualifies for the second file by verifying that the server is an MCS-7825-H2 and that version 7.6 or 7.7 of the HP Advanced Server Management software is installed. This diagnostic file will not install the fix.
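The qualification rule described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria (server model MCS-7825-H2 and HP ASM version 7.6 or 7.7); it is not the contents of the diagnostic COP file, and the function names, the model string format, and the dotted version format are assumptions made for the example.

```python
from typing import Tuple

# Criteria taken from the text above; the exact string formats are assumed.
AFFECTED_MODEL = "MCS-7825-H2"
AFFECTED_ASM_RELEASES = {(7, 6), (7, 7)}


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a dotted version such as '7.6.0'."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return int(parts[0]), int(parts[1])


def qualifies_for_patch(model: str, asm_version: str) -> bool:
    """True when both conditions named in the release note hold."""
    if model.strip().upper() != AFFECTED_MODEL:
        return False
    return parse_major_minor(asm_version) in AFFECTED_ASM_RELEASES


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(qualifies_for_patch("MCS-7825-H2", "7.7.0"))
    print(qualifies_for_patch("MCS-7825-H2", "7.8.0"))
```

A server that already reports HP ASM 7.8 does not qualify under this rule, which matches the diagnostic file's purpose of identifying only servers that still need the patch.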

The second file is the actual patch. It removes the older version of the HP Advanced Server Management software and replaces it with the newer 7.8 version, which addresses the sporadic system halts and reboots. Cisco recommends installing the patch file on all affected servers, even those on which the problem has not yet occurred.

**Note: Once installed, the patch cannot be uninstalled.

Installation procedure for either Diagnostic or Patch files:

1) Navigate to http://www.cisco.com/cgi-bin/tablebuild.pl/callmgr-utilpage and click the right mouse button on the file of interest, select "Save As ...", navigate to an appropriate directory for saving the file, and click OK; make note of the directory in which the file is saved.

2) Log in to the Cisco Unified Communications Manager OS Administration GUI on each MCS-7825-H2 server in the cluster and select Software Upgrades -> Install / Upgrade. Note that it will be necessary either to burn the file to a CD or DVD or to transfer it to a server that is accessible from the MCS-7825-H2 server, depending on which installation/upgrade option is chosen.

The progress and status of the file installation will appear in the Software Installation/Upgrade screen. This screen has two areas: a header area titled "Installation Status" and, below it, a text box titled "Installation Log" that shows the progress of the installation. A successful installation is indicated by a "Success" message in the Installation Status - Status field. See note below. If the file installation fails, please contact your Cisco TAC representative.

**Note: In one known exception, the Installation Status - Status field may show "Error encountered" after the patch file is installed. If this occurs, please review the text in the Installation Log text box to see whether it indicates success.

For the verify COP file, success can be determined from the text in the log describing whether the system is affected. You will see either "AGENT UPGRADE REQUIRED" or "NO CORRECTIVE ACTIONS ARE NEEDED" bracketed by asterisks.

For the patch COP file, "COMPLETED SUCCESSFULLY" will be shown bracketed by asterisks.

As long as the Installation Log indicates success, the "Error encountered" status can be ignored, as it represents a bug in the Install/Upgrade software itself. See CSCsj05998 for further information.
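For administrators who copy the Installation Log text out of the GUI for review, the decision described in the note above can be sketched as follows. The three marker strings come from this release note; everything else (the function names, the returned wording, and the assumption that each marker is surrounded by one or more asterisks with optional whitespace or line breaks) is hypothetical and may not match the exact log layout.

```python
import re

# Marker strings quoted in this release note.
VERIFY_MARKERS = {
    "AGENT UPGRADE REQUIRED": "affected: install the patch file",
    "NO CORRECTIVE ACTIONS ARE NEEDED": "not affected: no action needed",
}
PATCH_MARKER = "COMPLETED SUCCESSFULLY"


def has_marker(log_text: str, marker: str) -> bool:
    """Look for the marker bracketed by asterisks (assumed log layout)."""
    pattern = r"\*+\s*" + re.escape(marker) + r"\s*\*+"
    return re.search(pattern, log_text) is not None


def interpret_install(status: str, log_text: str) -> str:
    """Prefer the Installation Log over the Status field, per the note."""
    for marker, meaning in VERIFY_MARKERS.items():
        if has_marker(log_text, marker):
            return meaning
    if has_marker(log_text, PATCH_MARKER):
        # The log wins even if Status says "Error encountered" (CSCsj05998).
        return "patch installed"
    if status.strip().lower() == "success":
        return "success reported"
    return "failed: contact Cisco TAC"


if __name__ == "__main__":
    sample_log = "...\n*** COMPLETED SUCCESSFULLY ***\n..."
    print(interpret_install("Error encountered", sample_log))
```

The sketch checks the log before the Status field because, as the note explains, the Status field can report "Error encountered" even when the log shows that the file ran successfully.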

New Order Status

To ensure this issue does not affect additional customers, the lead times for the following Product IDs have been moved out and orders for them will be held until a fix is verified:

Product ID

MCS7825H2-K9-CMA2

CM5.1-K9-DL320G4

CM6.0-K9-DL320G4



In addition, the following Product IDs can be installed on multiple hardware versions. To minimize the chance that customers will install them on an affected server, they will remain on New Product Hold with a New Product Hold questionnaire that will query for the type of server Cisco Unified Presence will be loaded onto. Orders that are targeted to have Cisco Unified Presence 6.0 installed on an MCS7825-H2 or HP DL320G4 will be held until the fix is verified.

Product ID

SW-CUP6.0-K9=

SW-CUP6.0-K9P



To keep issues like this from impacting new customer deployments, Cisco dual sources most of its MCS product line. This issue is only being seen on the products and versions listed in the Products Affected section. Customers who wish to place orders for any of the affected products can still order equivalent I2-based product IDs, or can order a different class of server, until this issue has been resolved.