Abstract
This document provides detailed instructions for locating and repairing problems in the SAPC caused by errors or by wrong or abnormal behavior.
1 Introduction
1.1 Document Purpose and Scope
The purpose of this document is to provide detailed instructions for locating and fixing different problems in the SAPC, typically on live sites.
This document requires strong knowledge of the product and of the CBA components it uses. It is addressed to both Ericsson personnel and System Administrators.
This document does not contain periodic maintenance tasks or instructions for changing the configuration of the main functions within the SAPC. The System Administrator Guide contains this type of information.
2 Tools
This section describes the tools that can be used to troubleshoot the SAPC.
2.1 forall
This script runs the same CLI command or commands on several SAPC nodes. Its usage is as follows:
sapcadmin@SC-1:~> forall
Usage: forall <node group> <command>
Examples: forall PLs ps -fe | grep beam
Examples: forall SCs hostname ; exportfs
Reserved 'control', 'payload' and 'cluster' shall have same effect
than 'SCs', 'PLs' and 'AllNodes' node groups. However, could be used
even when IMM is not available because they extract the information
from the cluster file system hierarchy.
2.2 immHelper
This command shows the current state of the SAPC components. Its usage is as follows:
sapcadmin@SC-1:~> immHelper
Usage: immHelper <command: su|su2|sg|sg2|si|si2|ng|ng2> [FILTER]
command:
ng: SAPC node groups | ng2: detailed ng (show [L]ocked/[U]nlocked nodes)
su: service units | su2: detailed su (includes node name)
sg: service groups | sg2: detailed sg (includes node group)
si: service instances | si2: detailed si (includes availability)
comp: components
sw: installed software
You could grep results with 'SAPC' or whatever ...
2.3 amfHelper
This command executes actions on Service Units, according to the following use:
sapcadmin@SC-1:~> amfHelper
Wrapper to execute actions on Service units using amf-adm. It encapsulate the complexity of "Preinstantiable" and handles lock/unlock and lock-in/unlock-in in correct way. The repair option tries to unlock or repair matched service group and service unit that are locked or in a wrong status.
Use: amfHelper -f <filter> [-a <action>] [-v]
Parameters:
-f <filter>: Service unit and service group egrep filter. For example: amfHelper -f 'Pcrf|SubsCharg'
-a <action>: stop | start | restart | status | repair. Interactive menu if missing.
-v: verbose mode
Examples:
amfHelper -f 'Pcrf|SubsCharg' -a stop
amfHelper -f 'SAPC' -a stop
amfHelper -a repair -f 'SAPC'
2.4 sapcHealthCheck
This command performs several checkups to verify the status of the system: SAPC deployment, TIPC communication, DRBD devices, CMW status, AMF status, active FM alarms, existing coredumps, Diameter daemon status, database, and error logs in the system. It also provides an overall status based on the results of the checkups.
Depending on the activity (installation, upgrade, O&M, or scaling workflow), this script applies different checkups and uses different criteria for the overall status. The command also allows each checkup to be run independently.
The command is used as follows:
sapcadmin@SC-1:~> sapcHealthCheck -h
Usage: sapcHealthCheck [-t <seconds>] [-p CHECKUP ]
sapcHealthCheck [-t <seconds>] [BATCH]
OPTIONS:
-h, --help help
-t, --timeout timeout seconds for checking platform commands. Set on 300 sg by default.
-p, --param specific checkup
CHECKUP:
SAPCInstallation
Connectivity
DRBD
CoreMiddleware
AMFState
Alarms
CoreDumps
Diameter
SystemOperative
DataBase
ErrorLogs
BATCH
-d, --deploy deployment/installation checkups
-u, --upgrade upgrade checkups
-o, --oam operation and maintenance checkups
-s, --scaling scaling checkups
In Geographical Redundancy deployments, this command reports Health Check NOK for the zone that is not ACTIVE.
2.5 auto_provision
This command automatically provisions data during the SAPC deployment. Its usage is as follows:
sapcadmin@SC-1:~> auto_provision
Usage:
/usr/local/bin/auto_provision start
This command will provision the data included in provided
files REST /cluster/storage/no-backup/auto_provision/initial_provisioning.rest and XML /cluster/storage/no-backup/auto_provision/initial_configuration.xml for vSAPC.
3 Troubleshooting Functions
3.1 Linux Consoles
The Linux Console is accessed using the SSH protocol towards the System Controller processors using the sapcadmin user through the <OAM VIP>. For more details, refer to System Administrator Guide. If the <OAM VIP> is unavailable, the operation and maintenance scripts cannot be used. Refer to Section 5.9.
For more information about available commands, check the LDE Management Guide.
3.2 Internal Database Command Line Tools
Internal database command line tools can provide useful information about database status. These tools offer data on a cluster level, such as the status of the processors that form the SAPC, memory consumption, and internal connections. To run any of these tools, use the clurun.sh command from any SC or PL processor.
sapcadmin@SC-1:~> clurun.sh
3.3 COM CLI
This console provides a direct CLI for the COM subsystem and also a textual representation of the Management Information Model (MIM). For more details, refer to System Administrator Guide.
3.4 System Health Check
To check if the SAPC is working properly, use the sapcHealthCheck command. It provides information about:
- SAPC Software components installed.
- Nodes communication through TIPC.
- DRBD status.
- CMW and AMF status
- Alarms, coredumps, and error logs in the system
- Diameter daemon status
- DBS status
The next example shows the script output, using default options, for a successful state:
sapcadmin@SC-X:~> sudo sapcHealthCheck
==================== HEALTH CHECK REPORT ====================
Checking the SAPC is installed...
SAPC installation --> OK --> All the 20 ERIC-SAPC SDPs installed are used [main version: ERIC-SAPC-CXP9030138_7-R1A22].
Checking TIPC communication...
TIPC --> OK --> All the 4 available nodes at TIPC level are up.
Checking DRBD devices...
DRBD device --> OK
Checking CMW status...
CMW status --> OK --> All the "node comp app su si sg siass csiass pm" are OK (Stopped PM jobs are ignored).
Checking AMF status...
AMF status --> OK --> All the AMF entities are OK.
Checking active FM alarms...
Alarms --> OK --> There are no active FM Alarms.
Checking existing coredumps...
Coredumps --> OK --> There are no coredumps.
Checking Diameter daemon status...
Diameter stack --> OK --> The stack is alive in every running PL.
Checking system operative...
External peers configured and in use --> OK
Checking Data Base...
All Data Base agents working normally --> OK
Checking error logs in the system...
No errors in the system --> OK
*** SAPC HEALTH CHECK SUMMARY ***
WARNINGS: 0
ERRORS: 0
**********************************
SAPC Health Check finished: OK
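When the report is collected by a script rather than read by a person, the summary lines at the end are the easiest part to evaluate. The following Python sketch is only an illustration and is not part of the SAPC tooling. It assumes that the report has already been captured as text and that the WARNINGS, ERRORS, and final status lines look exactly like the sample output above. The function name parse_health_summary is hypothetical.

```python
import re

def parse_health_summary(report):
    """Extract warning/error counts and the overall verdict from a report.

    Assumes the 'WARNINGS:', 'ERRORS:' and 'SAPC Health Check finished:'
    lines appear as in the sample output; other layouts are not handled.
    """
    summary = {"warnings": None, "errors": None, "overall": None}
    for line in report.splitlines():
        line = line.strip()
        counter = re.match(r"(WARNINGS|ERRORS):\s*(\d+)$", line)
        if counter:
            summary[counter.group(1).lower()] = int(counter.group(2))
            continue
        verdict = re.match(r"SAPC Health Check finished:\s*(\S+)$", line)
        if verdict:
            summary["overall"] = verdict.group(1)
    return summary

# Tail of the sample report shown above.
sample = """\
*** SAPC HEALTH CHECK SUMMARY ***
WARNINGS: 0
ERRORS: 0
**********************************
SAPC Health Check finished: OK
"""
print(parse_health_summary(sample))
```

A value of None in the result means that the corresponding line was not found, which is itself worth investigating before trusting the report.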
3.5 Processor Load (CPU and Memory)
The specific commands to check CPU and memory load are described in Preventive Maintenance.
3.6 Alarms and Notifications
For information about how to check system alarms, refer to Preventive Maintenance.
If any alarm is raised, act on the corresponding OPI to make it cease.
3.7 Logging
For further information about the logging events generated by the SAPC, refer to Logging Events.
3.8 Measurements
The traffic measurements generated by the SAPC also provide useful information when troubleshooting a problem. For more information, refer to Measurements.
3.9 Core Files
To check the existence of system core files, refer to Preventive Maintenance.
If any exist, send them to the next level of maintenance support for analysis.
3.10 System Messages
Important information about the general status of the different processors can be found, as root user, in the following files on any SC processor:
root@<SC-X>:/var/log/<node-id>/auth*
root@<SC-X>:/var/log/<node-id>/kernel*
root@<SC-X>:/var/log/<node-id>/messages*
Note: <SC-X> is SC-1 or SC-2, and <node-id> is SC-1, SC-2, PL-3, PL-4, or PL-n.
3.11 SAPC Reboot
The SAPC can be rebooted with the following procedure.
The procedure implies a downtime of almost 30 seconds, until the internal database is operational again.
- Log on to the system with sapcadmin user, through <OAM VIP>.
- Perform the reboot of the SAPC.
sapcadmin@SC-X> sudo cmw-cluster-reboot [--yes]
If --yes is specified, the command does not require confirmation.
- Wait until the node is back.
- Log on again to the system with sapcadmin user, through <OAM VIP>.
- Check the status of the node according to Preventive Maintenance.
3.12 Processor Lock and Unlock
A processor on the SAPC can be locked from the node. A locked processor is not part of the cluster until the unlock command is performed, after which the processor comes back to the node.
- Log on to the system with sapcadmin user, through <OAM VIP>.
- Perform the lock of the processor in the SAPC.
sapcadmin@SC-X> sudo cmw-node-lock <processor>
Caution! Traffic performance can be affected until the processor is unlocked.
- To get the processor back on the node, execute the following command:
sapcadmin@SC-X> sudo cmw-node-unlock <processor>
- Check the status of the node according to Preventive Maintenance.
4 Troubleshooting Procedure
Troubleshooting a problem in the SAPC requires the use of one or more functions described in previous chapters. The correct use of these tools is needed to prevent overload situations. In a faulty situation, they must be used in the right order to ensure an efficient location of the fault:
- Perform a System Health Check as described in Section 3.4.
- Check processors load. See Section 3.5.
- Check for alarms in the system. To do that, follow Section 3.6.
- Check for logs in the system. To do that, follow Section 3.7.
- Check the traffic measurements. See Section 3.8.
- Check the capacity measurements and purchased capacity licenses. See Section 3.8.
- Check system core dump files. See Section 3.9.
- Check system messages. See Section 3.10.
A troubleshooting workflow is shown in Figure 1.
5 Common Faulty Situations
The following sections describe some common problems that can appear during normal operation, together with possible solutions.
5.1 General Failures
5.1.1 License Is Not Active
If traffic is continuously rejected or misprocessed, the cause can be a license that is not properly installed, that has expired, or whose capacity has been exceeded.
- Check License Manager status and configuration. Refer to View License Information.
- Check for specific alarms regarding License Manager or capacity licenses. Refer to License Management
- If there are active alarms about capacity licenses, check the capacity measurements (see Section 3.8) and purchased capacity licenses.
- In case a license is not properly activated or installed, reinstall licenses. Refer to Install License Key File.
- In case a license has expired, contact supply organization to request an extension.
- To restore system functionality temporarily in extraordinary situations, activate Emergency Unlock mode as described in Activate Emergency Unlock Mode. Consider that the number of activations is limited.
5.1.2 Processor Is Out of Service
The SAPC node is composed of several processors. If, during operation, any processor goes out of service, the rest of the traffic processors must handle all the traffic, so it can result in a higher load situation for them. To verify and correct the situation, follow the next step:
- Check the SAPC platform components status is correct following the steps described in Section 3.4.
5.1.3 Load Regulation
The SAPC continuously monitors CPU and memory use. If the values of these parameters exceed a configurable threshold, the SAPC rejects session establishments on the Gx, Rx, and Smp interfaces by answering with the DIAMETER_TOO_BUSY result code. This guarantees a graceful behavior of the SAPC in overload situations. In case DIAMETER_TOO_BUSY answers are detected continuously during a prolonged period:
- Follow the System Health Check described in Section 3.4. If one or more nodes are not working properly, the rest of them could be in an overload situation.
- Check the values of the Load Regulation constraints as described in Overload Control User Guide. Adjust the values to the manufacturer's recommendation.
- If none of the previous actions have detected a malfunction or erroneous configuration, it is most likely that the SAPC is applying load regulation because of high resource consumption. If this situation persists over time, contact next level of maintenance support.
5.2 Provisioning Failures
Representational State Transfer (REST) is the interface used to provision the SAPC. REST service commands make it possible to provision the SAPC with Subscribers, Subscriber Groups, and Policies.
- If REST provisioning is not successful
Verify that the information that is provisioned is correct according to the following documents:
- If the internal database does not accept more provisioning entries
Check that the database storage capacity limit is not reached.
To verify this, launch the following command, which shows the amount of used and free memory:
sapcadmin@SC-1:~> clurun.sh collect_stats -d dbn
Result from [PL-3.dbn]: DbsService=DBN,DbsPU=PL-3
DbsPU.VS.DBS.Mem.NormalHeap.Free 0
DbsPU.VS.DBS.Mem.NormalHeap.Used 61889
DbsPU.VS.DBS.Mem.RecordHeap.Free 3276215
DbsPU.VS.DBS.Mem.RecordHeap.PUsed 0
DbsPU.VS.DBS.Mem.RecordHeap.Used 0
DbsPU.VS.DBS.Mem.TotalHeap.Free 3276215
DbsPU.VS.DBS.Mem.TotalHeap.Used 61889
...
Result from [PL-4.dbn]: DbsService=DBN,DbsPU=PL-4
DbsPU.VS.DBS.Mem.NormalHeap.Free 0
DbsPU.VS.DBS.Mem.NormalHeap.Used 61978
DbsPU.VS.DBS.Mem.RecordHeap.Free 3276215
DbsPU.VS.DBS.Mem.RecordHeap.PUsed 0
DbsPU.VS.DBS.Mem.RecordHeap.Used 0
DbsPU.VS.DBS.Mem.TotalHeap.Free 3276215
DbsPU.VS.DBS.Mem.TotalHeap.Used 61978
...
Caution! If there is no total heap available, contact Ericsson personnel for more information.
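To judge how close each database processing unit is to its limit, the TotalHeap counters from the output above can be turned into a percentage. The following Python sketch is an illustration only. It assumes that the command output has been saved as text, that the Free and Used counters are expressed in the same unit, and that their sum is the total heap; that relation is an assumption and is not stated by the tool. The function name total_heap_usage is hypothetical.

```python
import re

HEADER = re.compile(r"Result from \[([^\]]+)\]")
COUNTER = re.compile(r"DbsPU\.VS\.DBS\.Mem\.TotalHeap\.(Free|Used)\s+(\d+)")

def total_heap_usage(stats):
    """Return {processing unit: percent of total heap used}."""
    heaps = {}
    current = None
    for line in stats.splitlines():
        header = HEADER.search(line)
        if header:
            # A new "Result from [...]" line starts the next processing unit.
            current = header.group(1)
            heaps[current] = {}
            continue
        counter = COUNTER.search(line)
        if counter and current is not None:
            heaps[current][counter.group(1)] = int(counter.group(2))
    usage = {}
    for unit, heap in heaps.items():
        total = heap.get("Free", 0) + heap.get("Used", 0)  # assumed total
        if total:
            usage[unit] = round(100.0 * heap.get("Used", 0) / total, 2)
    return usage

# TotalHeap lines taken from the sample output above.
sample = """\
Result from [PL-3.dbn]: DbsService=DBN,DbsPU=PL-3
DbsPU.VS.DBS.Mem.TotalHeap.Free 3276215
DbsPU.VS.DBS.Mem.TotalHeap.Used 61889
Result from [PL-4.dbn]: DbsService=DBN,DbsPU=PL-4
DbsPU.VS.DBS.Mem.TotalHeap.Free 3276215
DbsPU.VS.DBS.Mem.TotalHeap.Used 61978
"""
for unit, percent in total_heap_usage(sample).items():
    print(unit, percent)
```

A percentage that keeps growing towards 100 on every processing unit suggests that the storage capacity limit is being approached.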
5.3 Failures during the Initial Configuration and Provisioning in Cloud
Optionally, the SAPC can be automatically configured and provisioned in Cloud deployments. At deployment time, the files containing the data are injected into the SAPC and the auto_provision script is executed (see SAPC VNF Descriptor Generator Tool for more details).
In some scenarios, where the Cloud Infrastructure on which the SAPC is deployed has a slow connection speed, this automatic procedure may not have been executed.
- Confirm whether the script for automatic configuration and provisioning has been successfully executed by looking at the contents of the following log file in SC-1:
SC-1:~ # cat /var/log/auto_provision/auto_provision.log
- If the last text message is "Waiting NDB nodes to be STARTED ...", the execution has failed. In that case, run the script manually from SC-1:
SC-1:~ # /opt/sapc/adapt/init.d/auto_provision/auto_provision start
5.4 Fair Usage Reporting Failures
If no quota is received from the SAPC, verify that:
- The subscriber or subscriber group has usageLimits information configured.
- The content of the Usage Limits is syntactically correct. This can be checked by parsing the JSON structure with some external tool (for example, http://jsonformatter.curiousconcept.com).
- Subscription Date < Current time < Expiry Date.
- Accumulation policies for applicable Reporting Groups (and included counters) evaluate to TRUE.
If quota = 0 is received from the SAPC, check that applicable Reporting Groups are enabled:
- Accumulation policies for applicable Reporting Groups (and included counters) evaluate to TRUE
For further information, refer to Configuration Guide for Fair Usage.
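Two of the checks above, the JSON syntax check and the date window check, can also be done offline without an external web tool. The following Python sketch is an illustration only. It does not know the Usage Limits schema: the key "limit" in the sample strings is a placeholder, the function names are hypothetical, and the dates are passed in directly instead of being read from a subscriber profile.

```python
import json
from datetime import datetime

def json_syntax_error(text):
    """Return None if text parses as JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None

def in_validity_window(subscription, now, expiry):
    """True when Subscription Date < Current time < Expiry Date (strict)."""
    return subscription < now < expiry

# Placeholder documents: the second one has a trailing comma.
print(json_syntax_error('{"limit": 100}'))
print(json_syntax_error('{"limit": 100,}'))

# Placeholder dates, compared against the current time.
print(in_validity_window(datetime(2020, 1, 1), datetime.now(), datetime(2020, 12, 31)))
```

Parsing locally also avoids pasting subscriber data into a public website.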
5.5 Multi-access Failures
Prerequisite: TCP connectivity exists between the peer and the SAPC.
5.5.1 Diameter Connection Problems
If there is any failure related to Diameter traffic, perform the following checks:
- Verify Diameter Flow Policy
Check through an ECLI session that all Flow Policies are defined. To do so, run the following command and compare the result with the expected output:
> show ManagedElement=1,Transport=1,Evip=1,EvipAlbs=1,EvipAlb=alb_tr,EvipFlowPolicies=1
EvipFlowPolicies=1
   EvipFlowPolicy=SCTP_diameter
   EvipFlowPolicy=diameter
   EvipFlowPolicy=diameter_sx
   EvipFlowPolicy=soap
- Verify VIP address for traffic or Diameter port
Check through an ECLI session that the VIP-TRF and DIAMETER-PORT are set. To do so, run the following command and compare the result with the expected output, using the VIP address for traffic handling and port 3868 as the expected values:
> show all ManagedElement=1,Transport=1,Evip=1,EvipAlbs=1,EvipAlb=alb_tr,EvipFlowPolicies=1
EvipFlowPolicies=1
   EvipFlowPolicy=SCTP_diameter dest="<VIP-TRF>" protocol="sctp" soGrp="1011250"
   EvipFlowPolicy=diameter dest="<VIP-TRF>" destPort="<DIAMETER-PORT>" protocol="tcp" targetPool="PLs_rr"
   EvipFlowPolicy=diameter_sx dest="<VIP-TRF>" destPort="<DIAMETER-PORT>" protocol="tcp" targetPool="PLs_rr"
   EvipFlowPolicy=soap_traf dest="<VIP-TRF>" destPort="8080" protocol="tcp" targetPool="PLs_rr"
- Verify the status of the Diameter processes
Check through an SSH connection as sapcadmin user to any SC processor that the C-diameter status is OK. To do so, run the following command and compare the result with the expected output, considering N as the number of PL processors:
sapcadmin@SC-X:/> amfHelper -f CDIA -a status
Searching SUs filtering (egrep) by 'CDIA' ...
#***>> CDIA [Node] [Service Unit DN] [AdminState] [OpState] [PresenceState] [ReadinessState] [Preinst] [Active/Standby]
Done!
SC-2 safSu=ERIC-CDIA-Runtime-1,safSg=ERIC-CDIA-SG,safApp=ERIC-CDIA-Runtime UNLOCKED(1) ENABLED(1) INSTANTIATED(3) IN-SERVICE(2) 0 Active
SC-1 safSu=ERIC-CDIA-Runtime-0,safSg=ERIC-CDIA-SG,safApp=ERIC-CDIA-Runtime UNLOCKED(1) ENABLED(1) INSTANTIATED(3) IN-SERVICE(2) 0 Active
PL-N safSu=ERIC-CDIA-Runtime-N,safSg=ERIC-CDIA-SG,safApp=ERIC-CDIA-Runtime UNLOCKED(1) ENABLED(1) INSTANTIATED(3) IN-SERVICE(2) 0 Active
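With many PL processors, comparing every status line by eye is error-prone. The following Python sketch is an illustration only and is not part of amfHelper. It assumes that the status output has been captured as text and that each service unit line has the column order shown in the header above (node, Service Unit DN, AdminState, OpState, PresenceState, ReadinessState). The function name unhealthy_service_units is hypothetical.

```python
# States shown in the expected output above, without the numeric suffix.
EXPECTED = ("UNLOCKED", "ENABLED", "INSTANTIATED", "IN-SERVICE")

def unhealthy_service_units(status_output):
    """Return (node, service unit DN, states) for lines that deviate."""
    problems = []
    for line in status_output.splitlines():
        fields = line.split()
        # Skip banners and anything that is not a service unit line.
        if len(fields) < 6 or not fields[1].startswith("safSu="):
            continue
        states = tuple(field.split("(")[0] for field in fields[2:6])
        if states != EXPECTED:
            problems.append((fields[0], fields[1], states))
    return problems

# Lines copied from the expected output above.
sample = """\
Searching SUs filtering (egrep) by 'CDIA' ...
Done!
SC-2 safSu=ERIC-CDIA-Runtime-1,safSg=ERIC-CDIA-SG,safApp=ERIC-CDIA-Runtime UNLOCKED(1) ENABLED(1) INSTANTIATED(3) IN-SERVICE(2) 0 Active
SC-1 safSu=ERIC-CDIA-Runtime-0,safSg=ERIC-CDIA-SG,safApp=ERIC-CDIA-Runtime UNLOCKED(1) ENABLED(1) INSTANTIATED(3) IN-SERVICE(2) 0 Active
"""
print(unhealthy_service_units(sample))
```

For the sample lines the result is an empty list, because every service unit is in the expected state. Any entry in the list points at the node and service unit to inspect first.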
5.5.2 Diameter Failures
5.5.2.1 DIAMETER RFC 6733 Messages Failures
If there is any problem with the establishment of Diameter connections, it can be due to one of the following reasons:
- Capabilities-Exchange-Request (CER) message is received from an unknown peer.
- Check that acceptFrom in IMM is <Empty> to allow any unknown peer. This is the recommended configuration, since this value cannot be changed without a restart of the system.
Note:
SC-X:~ # immlist -a host -a acceptFrom `immfind -c OtpdiaTransportTcp`
host=PL-3
acceptFrom=<Empty>
host=PL-4
acceptFrom=<Empty>
- In case the value is defined, check that the neighbor node host IP matches the "PCRE" regular expression.
Note:
Example that accepts peers with IPs from 10.* and 172.* using the PCRE expression 10.*|172.*:
SC-1:~ # immlist -a host -a acceptFrom `immfind -c OtpdiaTransportTcp`
host=PL-3
acceptFrom="10.*|172.*"
host=PL-4
acceptFrom="10.*|172.*"
- A request is received for an unsupported application.
- Check Application Id and Supported Vendor Id.
immlist -a supportedVendorId -a authApplicationId `immfind -c OtpdiaApplications`
- Check the grouped AVP Vendor Specific Application Id.
immlist -a vendorSpecificApplicationId `immfind -c OtpdiaApplications`
vendorSpecificApplicationId=otpdiaVendorSpecificApplicationId=Gx
Use the returned value, otpdiaVendorSpecificApplicationId=Gx, in the next query:
immlist otpdiaVendorSpecificApplicationId=Gx,otpdiaProduct=SAPC -a vendorId -a otpdiaVendorSpecificApplicationId -a authApplicationId
vendorId=10415
otpdiaVendorSpecificApplicationId=otpdiaVendorSpecificApplicationId=Gx
authApplicationId=16777238
- Incorrect Origin-Host, Origin-Realm, or Host-IP-Address from the SAPC
- Check originHost, originRealm, hostIpAddress
immlist `immfind -c OtpdiaService`
- Not possible to establish two or more connections to the same peer
- Check that restrictConnections is set to false for otpdiaService=Pcrf,otpdiaProduct=SAPC in IMM
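For the acceptFrom check above, it helps to test a peer address against the configured expression before touching the system. The following Python sketch is an illustration only. Python's re module is used as a stand-in for PCRE, which behaves the same way for an expression as simple as 10.*|172.*. The sketch assumes that the expression must match the whole address; how the Diameter stack anchors the pattern is not stated in this document. The function name peer_accepted and the stricter pattern with escaped dots are hypothetical.

```python
import re

def peer_accepted(accept_from, peer_ip):
    """Whole-string match of peer_ip against the acceptFrom expression.

    Assumption: the expression has to match the entire address.
    """
    return re.fullmatch(accept_from, peer_ip) is not None

documented = "10.*|172.*"      # expression from the example above
stricter = r"10\..*|172\..*"   # hypothetical variant with escaped dots

for ip in ("10.1.2.3", "172.16.0.9", "192.168.1.1", "100.1.2.3"):
    print(ip, peer_accepted(documented, ip), peer_accepted(stricter, ip))
```

Because an unescaped dot matches any character, the documented expression also matches an address such as 100.1.2.3 under these assumptions, whereas the variant with escaped dots does not. This is worth keeping in mind when a peer is accepted or rejected unexpectedly.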
5.5.3 Gx Failures
5.5.3.1 General Failures
In case any AVP related to a concrete control (Bearer Access Control, Service Access Control, QoS Control for the Default Bearer and the APN, Usage Reporting, and so on) is not obtained in CCA or RAR messages, perform the following checks:
- Check the bit for the corresponding control received in Gx-Capability-List AVP within CCR requests.
- Check also the corresponding control received in Supported-Features AVP.
- Check the configuration of the controls for the PCEF sending the Gx requests.
In case all mobile session establishments are suddenly rejected, it can be caused by the number of mobile sessions exceeding the capacity license. Check for License Manager alarms and verify the measurement mobileActiveSessions against the purchased capacity license.
5.5.3.2 Gx Access Control Failures
First, identify the value of the Diameter Result-Code AVP in the answer message. A protocol analyzer (for example, Wireshark) is recommended for obtaining this value. For further information about the meaning of the Result-Code, refer to Gx Interface Description.
If service authorization is not successful, it can be due to one of the following reasons:
The subscriber received in the Subscription-Id AVP is not found in the SAPC and the special "Unknown" EPC-Subscriber entry is not provisioned:
- Check if UnknownSubscribers measure is abnormally high.
- Check if the subscriber is correctly provisioned.
Service authorization result is not the expected one:
- Check if the specific service is correctly defined.
- Check in the provisioned Subscriber profile the allowed and blacklist services.
- For static services, check if the service is included in the applicable Rule Space (either the one indicated by the PCEF or the one decided by the SAPC).
- Check also whether the service is within the list of services to redirect provisioned both for the subscriber and the active groups the subscriber belongs to.
- Check policies for the specific service. To detect if there is any error in the policy evaluation, temporarily activate the warning logging level.
- To check if there is a protocol error, temporarily activate the warning logging level.
Session to be updated does not exist:
- As root user, check whether there are new logs containing the following message:
Non-Persistent data storage is empty.
The logs are located in the following path:
SC-X:~ # /cluster/storage/no-backup/coremw/var/log/saflog/sapc/
- If the processor load allows it, temporarily activate the warning logging level to check if the session was previously removed.
5.5.3.3 Gx QoS Control for the Default Bearer and the APN Failures
If the QoS Control for the Default Bearer and the APN result is not the expected one:
- Verify that the Bearer QoS Control applies for the PCEF (check the configuration of the PCEF in the SAPC).
- Check the values in QoS-Negotiation, QoS-Upgrade, and QoS-Information AVPs.
- Check if gxQosDowngraded, gxQosUpgraded, and gxQosDeactivated measures are abnormally high. If so, check the values configured in the QoS Profiles associated with QoS Control for the Default Bearer and the APN (either per service or per bearer) policies. Compare them with the values received in the requested QoS Profile.
For details about the provisioning and configuration, refer to Configuration Guide for Bearer QoS Control and Bandwidth Management.
5.5.3.4 Gx Usage Reporting Failures
If no quota is received from SAPC, verify that:
- The subscriber or subscriber group has usageLimits information configured.
- The contents of the Usage Limits are syntactically correct. This can be checked by parsing the JSON structure with some external tool (for example, http://jsonformatter.curiousconcept.com).
- Accumulation policies for applicable Reporting Groups (and included counters) evaluate to "TRUE". To detect if there is any error in the policy evaluation, temporarily activate the info logging level.
- Subscription Date < Current time < Expiry Date.
If no more volume quota available is received from the SAPC, check that applicable Reporting Groups are enabled:
- Accumulation policies for applicable Reporting Groups (and included counters) evaluate to "TRUE". To detect if there is any error in the policy evaluation, temporarily activate the info logging level.
For further information, refer to Configuration Guide for Fair Usage.
5.5.4 Rx Failures
If there is any failure related to Rx interface, verify the following:
- Check service classification related configuration, provisioning, and policies.
Refer to Configuration Guide for Dynamic Policy Control (Rx).
- Check service authorization related configuration, provisioning, and policies.
Refer to Configuration Guide for Dynamic Policy Control (Rx).
- Check service qualification related configuration, provisioning, and policies.
Refer to Configuration Guide for Dynamic Policy Control (Rx).
- Check if the measures rxAaasFailed, rxRaasFailed, rxAsasFailed, rxAaasUnableToComply, rxAaasInvalidInfo, rxAaasIpSessionNotAvailable, RxTerminateUnknownSessions, rxRarsTimeout, and rxAsrsTimeout are abnormally high.
In case all AF session establishments are suddenly rejected, it can be caused by the number of AF sessions exceeding the capacity license. Check for License Manager alarms and verify the afActiveSessions measurement against the purchased capacity license.
5.5.5 Sy Failures
If there is any failure related to Sy interface, verify the following:
- Check Subscriber Charging related configuration, provisioning, and policies.
Refer to Configuration Guide for Integration with OCS for Spending Limit Reporting (Sy).
- Check if the measures sySlrsTimeout, sySlasFailed, sySnasFailed, syStrsTimeout, and syStasFailed are abnormally high.
- If an SLR message is sent by the SAPC to the Online Charging System, but no STR message is sent later on, verify that the destination realm sent within the SLR and the origin realm received within the SLA are both properly defined in the SAPC DIAMETER routing table. Refer to Configuration Guide for Integration with OCS for Spending Limit Reporting (Sy).
5.5.6 Smp Failures
If there is any failure related to Smp interface, verify the following:
- Check PDN-GW and SPID related configuration, provisioning, and policies.
Refer to Configuration Guide for Mobility Based Policy Control for Overlay Deployments.
- Check if the measures sxCcasInitFailed, sxCcasInitInvalidAvp, sxCcasInitMissingAvp, and sxCcasInvalidInfo are abnormally high.
5.6 End User Notifications Failures
In case SMS/SOAP Notifications fail to be sent, verify the following:
- Verify that the enableDelivery attribute of the NotificationConfig COM object is set to "TRUE".
- Verify that End User Notifications are configured properly
- For SMS notifications
Check that SMSCenter and SMSDestination COM object values are correctly configured under Network COM object.
- For SOAP notifications
Check that WebServiceEndPoints, WebServiceEndPoint, and WsDestination COM object values are correctly configured under Network COM object.
- Check that notification policies are correctly configured.
- Check whether the "ConnectionNotificationServerFailed" alarm is raised.
- Check logs for end-user notifications.
Further information on configuring end-user notifications can be found in Configuration Guide for End User Notifications.
5.7 External Database Failures
If there is any failure related to access to external database, verify the following:
- Check through an ECLI session if the VIP address for access to external database <VIP-ExtDB> is defined on the Abstract Load Balancer (alb_tr).
> show ManagedElement=1, Transport=1, Evip=1, EvipAlbs=1, EvipAlb=alb_tr, EvipVips=1
EvipVip= <VIP-ExtDB>
EvipVip= <VIP-TRF>
- Check through an ECLI session if the Local IP address for access to external database is defined on the Entity Data object.
> show ManagedElement=1, PolicyControlFunction=1, EntityData=1
localIp=<VIP-ExtDB>
- Check through an ECLI session the IPs defined for external database.
> show ManagedElement=1, PolicyControlFunction=1,EntityData=1, EDSources=1, EDSource=ExternalRepository
EDSource=ExternalRepository
definition="def ExternalRepository () { dataSource = { url = \"\"; query = \"\"; } fieldDef = { ips = \"136.225.72.9;136.225.72.17;136.225.72.25\"; port = \"389\"; } }"
- Check if there is connection to any of the external databases from your PL.
sapcadmin@PL-X:~> ping -I <VIP-ExtDB> <External Database IP>
- Check if the external database outgoing connections are correctly established with the active IP. 64 connections per PL are expected.
sapcadmin@SC-X:~> forall sapc_pls "hostname; lsof -i@<External Database IP>:389 | grep -i ESTABLISHED | wc -l"
PL-3
64
PL-4
64
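On a large cluster, the per-PL connection counts from the last command can be compared with the expected value automatically. The following Python sketch is an illustration only. It assumes that the captured output consists solely of alternating host names and counts, as in the example above, and it takes the figure of 64 connections per PL from the text. The function name connection_shortfalls is hypothetical.

```python
EXPECTED_CONNECTIONS = 64  # per PL, as stated in the check above

def connection_shortfalls(output, expected=EXPECTED_CONNECTIONS):
    """Return {host: count} for every PL whose count differs from expected.

    Assumes the output is a sequence of host name / count pairs separated
    by any whitespace (spaces or line breaks).
    """
    tokens = output.split()
    shortfalls = {}
    for host, count in zip(tokens[0::2], tokens[1::2]):
        if int(count) != expected:
            shortfalls[host] = int(count)
    return shortfalls

# Counts taken from the example output above.
print(connection_shortfalls("PL-3 64 PL-4 64"))
```

An empty result means that every PL reports the expected number of established connections. Any other entry names the PL on which to investigate the external database link.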
Further information on configuring access to External Database can be found in Database Access.
5.8 SOAP Notification Interface Failures
If there is any failure related to SOAP notification interface, verify the following:
- Check through an ECLI session the Flow Policy for the SOAP incoming notification service. A misconfigured SOAP incoming notification service flow policy in eVIP prevents the correct binding of the SOAP server process to the listening port.
> show ManagedElement=1, Transport=1, Evip=1, EvipAlbs=1, EvipAlb=alb_tr, EvipFlowPolicies=1
EvipFlowPolicy=soap
The following command prints the definition of the flow policy:
> show ManagedElement=1, Transport=1, Evip=1, EvipAlbs=1, EvipAlb=alb_tr, EvipFlowPolicies=1, EvipFlowPolicy=soap
EvipFlowPolicy=soap dest="<VIP-ExtDB>" destPort="8080" protocol="tcp" targetPool="PLs_rr"
- Check if the SOAP incoming notification service port 8080 is listening on the Abstract Load Balancer where the VIP for access to external database is configured (alb_tr).
SC-X:~ # forall sapc_pls "hostname ; netstat -nap | grep :8080 | grep LISTEN"
PL-10
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 11954/soap-notifica
PL-11
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 12538/soap-notifica
PL-12
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 12514/soap-notifica
PL-5
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 10432/soap-notifica
PL-6
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 12556/soap-notifica
PL-7
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 11887/soap-notifica
PL-8
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 12822/soap-notifica
PL-9
tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 11194/soap-notifica
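A PL on which the SOAP server is not listening shows up in the output above only as a host name with no socket line after it, which is easy to overlook. The following Python sketch is an illustration only. It assumes that the captured output contains host names of the form PL-n or SC-n, each optionally followed by netstat fields in the order shown above (protocol, two queue counters, local address, foreign address, state, process). The function name nodes_listening is hypothetical.

```python
import re

NODE = re.compile(r"^(?:SC|PL)-\d+$")

def nodes_listening(output, port=8080):
    """Map each node name to True if a LISTEN socket on `port` follows it."""
    result = {}
    current = None
    tokens = output.split()
    for index, token in enumerate(tokens):
        if NODE.match(token):
            current = token
            result[current] = False
        elif token == "LISTEN" and current is not None and index >= 2:
            # The local address is two fields before the state column.
            if tokens[index - 2].endswith(f":{port}"):
                result[current] = True
    return result

# First line pair from the example above, plus a node with no socket line
# (PL-6 is shown without a socket here only to illustrate the failure case).
sample = "PL-5 tcp 0 0 10.41.30.53:8080 0.0.0.0:* LISTEN 10432/soap-notifica PL-6"
print(nodes_listening(sample))
```

Nodes mapped to False are the ones where the binding of the SOAP server process to the listening port should be checked.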
Further information on configuring SOAP incoming notification web service can be found in SOAP Notification Interface Description.
5.9 SC Absence Feature
If the <OAM VIP> interface is unavailable, the cause can be a failure in both SCs. In this scenario, OAM features are restricted, but traffic can still be processed for 15 more minutes before the whole cluster reboots.
- Recover at least one of the two SCs to prevent the cluster from rebooting. Once it is recovered, the <OAM VIP> is available again.
- If no SC is recovered within 15 minutes, the cluster goes down. In this scenario, recover at least one of the SCs and the system restarts normally. If the SC does not recover or the system does not restart normally, contact the next level of maintenance support.

