1 Introduction
This instruction concerns alarm handling for the SAF, LOTC Time Synchronization Failed alarm.
1.1 Alarm Description
This alarm is related to the Service Availability Forum (SAF). For more information, refer to LOTC Time Synchronization, Reference [3].
The alarm is issued when the Network Time Protocol (NTP) server(s) cannot be contacted, or when the local time is off by more than the threshold value of 10 seconds.
The alarm has the following severity levels:
- Minor
- Major
- Critical
Depending on severity, the possible alarm causes and the corresponding fault reasons, fault locations, and impacts are described in Section 1.1.1.
Depending on severity, the alarm attributes are listed and explained in Section 1.1.2.
1.1.1 Alarm Causes
Minor severity alarm causes are listed in Table 1:
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Not possible to contact one of the NTP servers configured in cluster.conf from a System Controller (SC). | Connectivity from the SCs to the external NTP service cannot be established. | Network infrastructure misconfiguration. | Network infrastructure | Loss of NTP service redundancy |
| | | Internal CUDB node network does not allow communication. | Network infrastructure | |
| | | External NTP server is not available. | External NTP server | |
| | | Network between the CUDB node and the external NTP server does not allow communication between the two endpoints. | Datacenter network | |
| Not possible to contact one of the SC NTP servers from a payload blade or Virtual Machine (VM). | Connectivity from the SCs to the SC NTP service cannot be established. | Internal CUDB node network does not allow communication. | Network infrastructure | |
| | | NTP server in an SC blade is not available, or the SC blade is not available. | | |
Major severity alarm causes are listed in Table 2:
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Unusable: The NTP servers provided in cluster.conf cannot be used by the local NTP daemon, ntpd. | Server is listed in NTP configuration (/etc/ntp.conf), but is not reported in the list of peers provided by the ntpq -p command. | Server name cannot be resolved into IP address. | CUDB configuration | Loss of NTP service redundancy |
| Rejected: None of the configured NTP servers can be selected as a current time source (rejected at initial selection). | The time server could not be selected after 60 minutes from start. | NTP protocol algorithm declares a rejected server for the following reasons: | NTP server or external network | |
| Rejected: None of the configured NTP servers can be selected as a current time source (rejected at reselection). | The time server selection was successful, but the NTP daemon was restarted and the reselection process takes longer than 90 seconds. | | | |
| Unreachable: Not possible to contact any of the NTP servers configured in cluster.conf from an SC. | Connectivity from the SCs to the external NTP service cannot be established. | Network infrastructure misconfiguration. | Network infrastructure | Risk of losing consistent time reference |
| | | Internal CUDB node network does not allow communication. | Network infrastructure | |
| | | External NTP server is not available. | External NTP server | |
| | | Network between the CUDB node and the external NTP server does not allow communication between the two endpoints. | Datacenter network | |
| Not possible to contact any of the SC blade NTP servers from a PL blade. | Connectivity from the SCs to the SC NTP service cannot be established. | Internal CUDB node network does not allow communication between the location of the raised alarm and the SCs. | Network infrastructure | |
(1) This occurs when the time difference between the local and remote servers is greater than 1000 seconds.
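The "Unusable" cause in Table 2 describes a server that is listed in /etc/ntp.conf but is missing from the peer list printed by ntpq -p. The following Python sketch illustrates that comparison on captured text. The function names are hypothetical and are not part of CUDB or LOTC. The sketch assumes the standard ntpq -p layout (one tally character followed by the remote name) and matches names literally, so a name that ntpq shortens or shows as a resolved address may be reported as missing even when the peer is present.

```python
def configured_servers(ntp_conf_text):
    """Return the hosts named on 'server' lines of ntp.conf text."""
    servers = []
    for line in ntp_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and fields[0] == "server":
            servers.append(fields[1])
    return servers


def reported_peers(ntpq_output):
    """Return the 'remote' column from captured 'ntpq -p' output."""
    peers = []
    for line in ntpq_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("remote") or set(stripped) == {"="}:
            continue  # skip blank lines, the header, and the separator
        fields = line[1:].split()  # first character is the tally code
        if fields:
            peers.append(fields[0])
    return peers


def missing_servers(ntp_conf_text, ntpq_output):
    """List configured servers that do not appear among the peers."""
    peers = set(reported_peers(ntpq_output))
    return [s for s in configured_servers(ntp_conf_text) if s not in peers]


if __name__ == "__main__":
    # Illustrative sample data only, not taken from a real node.
    conf = "server ntp1.example.com iburst\nserver 10.0.0.2\n"
    peers = (
        "     remote           refid      st t when poll reach   delay   offset  jitter\n"
        "==============================================================================\n"
        "*10.0.0.2        .GPS.            1 u   33   64  377    0.512    0.104   0.021\n"
    )
    print(missing_servers(conf, peers))  # prints ['ntp1.example.com']
```

A non-empty result only points at candidates for the "Unusable" cause. The actual troubleshooting steps remain those in LOTC Time Synchronization, Reference [3].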
Critical severity alarm causes are listed in Table 3:
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| The time difference between the local system time and the remote time server is greater than the alarm threshold (10 seconds), but smaller than the insane threshold (1000 seconds). | Between 10 and 1000 seconds time difference between the SC time and the external NTP reference. | Time jump in the external NTP server. | External NTP server | Inaccurate system time |
| | | The connection towards external NTP servers is re-established after a period of non-connectivity. | Internal CUDB node network, network infrastructure | |
| | | | Datacenter network | |
| | Between 10 and 1000 seconds time difference between the payload blade or VM time and the SC time. | | | |
| | | The connection towards SC NTP servers is re-established after a period of non-connectivity. | Internal CUDB node network, network infrastructure | |
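The two thresholds used in the tables above (the 10-second alarm threshold and the 1000-second insane threshold) can be summarized in a small Python sketch. This is an illustration of the ranges only, not the platform's implementation. The function name and the returned labels are hypothetical, and the handling of an offset of exactly 1000 seconds is an assumption, because the document does not specify that boundary.

```python
ALARM_THRESHOLD_S = 10.0     # alarm threshold stated in this document
INSANE_THRESHOLD_S = 1000.0  # insane threshold stated in this document


def classify_offset(offset_seconds):
    """Map a time offset in seconds to the range described in the tables.

    Assumption: an offset of exactly 1000 seconds is treated as part of
    the critical range; the document leaves this boundary open.
    """
    magnitude = abs(offset_seconds)  # the sign of the offset is irrelevant
    if magnitude > INSANE_THRESHOLD_S:
        # Per the footnote to Table 2, the server is rejected in this case.
        return "beyond-insane-threshold"
    if magnitude > ALARM_THRESHOLD_S:
        # Range listed in Table 3 for the critical severity alarm.
        return "critical-range"
    return "within-threshold"


if __name__ == "__main__":
    for offset in (0.2, -250.0, 1500.0):
        print(offset, classify_offset(offset))
```

For example, an offset of -250.0 seconds falls in the critical range, while an offset of 1500.0 seconds is beyond the insane threshold.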
1.1.2 Alarm Attributes
Minor severity alarm attributes are listed in Table 4:
| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | Yes |
| Module | SAF |
| Error Code | 11 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number which indicates how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raise. |
| Resource ID | .1.3.6.1.4.1.193.169.9.5.<length>.<NOI> |
| Alarm Model Description | LOTC Time Synchronization, SAF |
| Alarm Active Description | SAF platform: LOTC Time Synchronization, minor, @<NON> |
| ITU Alarm Event Type | other (1) |
| ITU Alarm Probable Cause | timingProblemX733 (550) |
| ITU Alarm Perceived Severity | (5) – Minor |
| Originating Source IP | Node IP where the alarm was raised. |
| Sequence Number | Number which indicates the order in which alarms were raised. |
Major severity alarm attributes are listed in Table 5:
| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | Yes |
| Module | SAF |
| Error Code | 12 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number which indicates how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raise. |
| Resource ID | .1.3.6.1.4.1.193.169.9.5.<length>.<NOI> |
| Alarm Model Description | LOTC Time Synchronization, SAF |
| Alarm Active Description | SAF platform: LOTC Time Synchronization, major, @<NON> |
| ITU Alarm Event Type | other (1) |
| ITU Alarm Probable Cause | timingProblemX733 (550) |
| ITU Alarm Perceived Severity | (4) – Major |
| Originating Source IP | Node IP where the alarm was raised. |
| Sequence Number | Number which indicates the order in which alarms were raised. |
Critical severity alarm attributes are listed in Table 6:
| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | Yes |
| Module | SAF |
| Error Code | 5 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number which indicates how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raise. |
| Resource ID | .1.3.6.1.4.1.193.169.9.5.<length>.<NOI> |
| Alarm Model Description | LOTC Time Synchronization, SAF |
| Alarm Active Description | SAF platform: LOTC Time Synchronization, critical, @<NON> |
| ITU Alarm Event Type | other (1) |
| ITU Alarm Probable Cause | timingProblemX733 (550) |
| ITU Alarm Perceived Severity | (3) – Critical |
| Originating Source IP | Node IP where the alarm was raised. |
| Sequence Number | Number which indicates the order in which alarms were raised. |
In Table 4, Table 5, and Table 6, the indicated variables are as follows:
- <NON> is the notifying object name that indicates where the component that generates the alarm is. For example: PL_2_3
- <NOI> is the notifying object identifier. It corresponds to <NON> in a dot-separated, ASCII-decimal-encoded, character-per-character format. For example: 80.76.95.50.95.51 for safNode=PL_2_3
- <length> is the number of characters in <NON>, which is equivalent to the number of octets in <NOI>. In the previous example, <length> is 6.
For more information about attribute descriptions, refer to CUDB Node Fault Management Configuration Guide, Reference [1].
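The relationship between <NON>, <NOI>, <length>, and the Resource ID described above can be illustrated with a short Python sketch. The function names are hypothetical helpers written for this example and are not part of any CUDB tool. The sketch assumes that <NON> contains only ASCII characters, as the encoding description implies.

```python
# Resource ID prefix as listed in Table 4, Table 5, and Table 6.
RESOURCE_ID_PREFIX = ".1.3.6.1.4.1.193.169.9.5"


def encode_noi(non):
    """Encode <NON> as dot-separated ASCII decimal values, one per character."""
    non.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return ".".join(str(ord(ch)) for ch in non)


def decode_noi(noi):
    """Recover <NON> from its dot-separated ASCII decimal form."""
    return "".join(chr(int(part)) for part in noi.split("."))


def resource_id(non):
    """Build the full Resource ID: prefix, then <length>, then <NOI>."""
    return f"{RESOURCE_ID_PREFIX}.{len(non)}.{encode_noi(non)}"


if __name__ == "__main__":
    print(encode_noi("PL_2_3"))             # prints 80.76.95.50.95.51
    print(decode_noi("80.76.95.50.95.51"))  # prints PL_2_3
    print(resource_id("PL_2_3"))
    # prints .1.3.6.1.4.1.193.169.9.5.6.80.76.95.50.95.51
```

Decoding is useful when only the Resource ID is available, for example in a received trap, and the notifying object name has to be recovered from it.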
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Before starting this procedure, ensure that you have read the following documents:
- CUDB Node Fault Management Configuration Guide, Reference [1], regarding alarm configuration.
- System Safety Information, Reference [4]
- Personal Health and Safety Information, Reference [5]
1.2.2 Tools
Not applicable.
1.2.3 Conditions
Not applicable.
2 Procedure
If the alarm is raised, do the following:
- Follow the instructions specified in LOTC Time Synchronization, Reference [3].
- If the alarm does not cease, contact the next level of maintenance support. Further actions are outside the scope of this Operating Instruction.
Glossary
For the terms, definitions, acronyms, and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [2].
Reference List
| CUDB Documents |
|---|
| [1] CUDB Node Fault Management Configuration Guide. |
| [2] CUDB Glossary of Terms and Acronyms. |
| Other Ericsson Documents |
|---|
| [3] LOTC Time Synchronization. |
| [4] System Safety Information. |
| [5] Personal Health and Safety Information. |
