1 Alarm Description
The alarm is raised when a scheduled backup has failed.
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| A scheduled backup has failed. | A scheduled backup event was triggered but failed to create a backup. | Insufficient disk space | Local hard disk | The Managed Element (ME) cannot be restored to its current state later. More effort can then be needed to bring the ME from an unstable state back to a controlled state, and service availability can be affected. Subsequent scheduled backups may also fail until the fault condition is cleared. |
| | | Conflict with another ongoing task | Managed Element (ME) | |
| | | Error reported by participant | Managed Element (ME) | |
| | | System failover or reboot | Managed Element (ME) | |

Risk of data loss or data corruption.
For Insufficient Disk Space faults, the fault is non-transient and the user must take action or else all subsequent scheduled backups will also fail.
For all other possible fault reasons, subsequent scheduled backups will fail until the fault condition reported in the alarm no longer exists.
This alarm is only cleared after the creation of a scheduled backup of the type (System Data or User Data) that raised the alarm. For example, if the alarm is raised for a failed System Data backup, it can only be cleared when a scheduled System Data backup is successfully created.
2 Procedure
2.1 Handle Alarm BRM, Scheduled Backup Failed
Prerequisites
- This instruction references the following documents:
- Delete Backup
- Export Backup
- List Backups
- Schedule Single Backup
- Set Maximum Number of Scheduled Backups
- Note:
- These Operating Instructions describe only the System Data backup procedure. To apply them to a User Data backup, navigate to the User Data backup manager in the first step as follows:
>dn ManagedElement=<node_name>,SystemFunctions=1,BrM=1,BrmBackupManager=USER_DATA
- No tools are required.
- The following conditions must apply:
- A BRM, Scheduled Backup Failed alarm is raised.
- An Ericsson Command-Line Interface (ECLI) session in Exec mode is in progress.
Steps
- Check the Additional Text attribute of the alarm.
- Select action based on the attribute value:
- If Additional Text contains Scheduled Backup for <backup_type> failed with disk space error, proceed with Section 2.2 Handle Reason Insufficient Disk Space.
- If Additional Text contains Scheduled Backup for <backup_type> failed due to conflict with, proceed with Section 2.3 Handle Reason BRF Conflict with Other Task.
- If Additional Text contains Scheduled Backup for <backup_type> failed due to participant error, proceed with Section 2.4 Handle Reason Participant Reported Error.
- If Additional Text contains Scheduled Backup for <backup_type> failed due to system failover or reboot, proceed with Section 2.5 Handle Reason System Failover or Reboot.
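The selection above can be sketched as a small shell function. This is only an illustration: the function name select_section and the returned section labels are hypothetical, and the match strings are copied from the list above on the assumption that the Additional Text contains them verbatim.

```shell
# Map the alarm's Additional Text to the section that handles it.
# select_section and its return values are illustrative, not part of any product CLI.
select_section() {
    case "$1" in
        *"failed with disk space error"*)            echo "2.2" ;;
        *"failed due to conflict with"*)             echo "2.3" ;;
        *"failed due to participant error"*)         echo "2.4" ;;
        *"failed due to system failover or reboot"*) echo "2.5" ;;
        *)                                           echo "unknown" ;;
    esac
}
```

For example, select_section "Scheduled Backup for System Data failed with disk space error" prints 2.2. Any text that matches none of the four patterns yields unknown, so an unexpected alarm text is not silently routed to a procedure.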
2.2 Handle Reason Insufficient Disk Space
Steps
- Does this alarm occur every time a scheduled backup takes place?
Yes: Continue with the next step.
No: Proceed with Step 7.
- Contact the backup administrator about the backup policy.
Is the maximum number of stored scheduled backups too high?
Yes: Continue with the next step.
No: Proceed with Step 6.
- Decrease the maximum number of stored scheduled backups.
Decreasing the value of attribute maxStoredScheduledBackups below the number of scheduled backups in the system automatically deletes the oldest scheduled backups and triggers a new scheduled backup. If the new scheduled backup is successful, the alarm is cleared.
For information on how to decrease the maxStoredScheduledBackups value, refer to Set Maximum Number of Scheduled Backups.
- Check whether a scheduled backup is triggered and successfully created.
For information on how to list the backups, refer to List Backups.
- Is the alarm cleared?
Yes: Proceed with Step 25.
No: Proceed with Step 7.
- More storage capacity may be needed on the ME. Contact the planning organization and proceed with Step 25.
- List the backups locally stored in the ME.
For information on how to list the backups, refer to List Backups.
- Is any locally stored manual or scheduled backup no longer required on the ME?
Yes: Continue with the next step.
No: Proceed with Step 16.
- Note:
- A local backup file is not required if there is no immediate need to restore it on the ME or once it has been exported to a remote file storage.
- If needed, export the following locally stored backups to the remote file storage:
- Backups that need to be preserved and have not been exported yet
- Backups that have been deleted from the remote file storage
For information on how to export a backup, refer to Export Backup.
- Delete any locally stored backup not required on the ME.
Attention! Risk of data loss or data corruption.
Do not delete backups listed in attribute restoreEscalationList.
For information on how to delete a backup, refer to Delete Backup.
- Has any scheduled backup been manually deleted?
Yes: Continue with the next step.
No: Proceed with Step 14.
- Check whether a scheduled backup is triggered and successfully created.
For information on how to list the backups, refer to List Backups.
- Is the alarm cleared?
Yes: Proceed with Step 25.
No: Proceed with Step 16.
- Schedule a single backup.
For information on how to schedule a single backup, refer to Schedule Single Backup.
- Note:
- Ensure that the scheduled backup is of the backup type that generated the alarm. The backup type, SYSTEM_DATA or USER_DATA, is indicated by additionalText in the alarm.
- Is the new scheduled backup successfully created and is the alarm cleared?
Yes: Proceed with Step 25.
No: Continue with the next step.
- Identify which files are taking the most space and which files are the oldest by listing the files in the file system as follows:
- du -xak / | sort -n | tail -20
The following is an example output:
37120 /usr/lib/perl5/5.10.0
46616 /usr/bin
46908 /usr/lib/perl5
47916 /usr/share
51800 /var
60688 /lib/modules/3.0.74-0.6.10.1.5564.0.PTF-default/kernel/drivers
62752 /opt/lpmsv/loader
66364 /usr/lib
71100 /opt/com/lib/comp
77900 /opt/com/lib
82564 /opt/lpmsv
90328 /lib/modules/3.0.74-0.6.10.1.5564.0.PTF-default/kernel
94164 /lib/modules/3.0.74-0.6.10.1.5564.0.PTF-default
100168 /lib/modules
103560 /opt/com
111096 /lib
128280 /usr/lib64
308568 /usr
333108 /opt
851148 /
- Show a list of files older than a given number of days, for example:
find /cluster/ -mtime +5
The following is an example output:
[...]
/cluster/home
/cluster/hooks
/cluster/hooks/2
/cluster/snapshot
/cluster/lost+found
/cluster/dumps
/cluster/etc/pam.d
/cluster/etc/login.allow
[...]
- Are some of these files normally deleted automatically?
Yes: Continue with the next step.
No: Proceed with Step 20.
- Schedule a single backup.
For information on how to schedule a single backup, refer to Schedule Single Backup.
- Note:
- Ensure that the scheduled backup is of the backup type that generated the alarm. The additionalText attribute, displayed by the show command on the alarm, identifies the backup type.
- Is the new scheduled backup successfully created and is the alarm cleared?
Yes: Proceed with Step 25.
No: Proceed with Step 23.
- Can significant file space be saved by deleting some of these files without damaging the system?
Yes: Continue with the next step.
No: Proceed with Step 23.
- Delete the files:
rm <file1> [<file2> …]
- Proceed with Step 18.
- Perform data collection. Refer to Data Collection Guideline.
- Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
- Job is completed.
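The two file listings used in this procedure (largest entries and oldest files) can be wrapped in small helpers so that the directory and thresholds are explicit. This is a minimal sketch: the function names largest_entries and older_than_days are hypothetical, and the commands are the same du, sort, tail, and find invocations shown in the steps above, with the directory, count, and age passed as arguments.

```shell
# List the N largest entries (sizes in 1024-byte units) under a directory,
# staying on one file system (-x) and including files as well as directories (-a).
# Usage: largest_entries <directory> <count>
largest_entries() {
    du -xak "$1" 2>/dev/null | sort -n | tail -n "$2"
}

# List entries under a directory that were last modified more than N days ago.
# Usage: older_than_days <directory> <days>
older_than_days() {
    find "$1" -mtime +"$2"
}
```

For example, largest_entries / 20 and older_than_days /cluster/ 5 correspond to the commands in the procedure. The helpers only list files. Whether a file can be deleted safely remains a manual decision, as described in the steps.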
2.3 Handle Reason BRF Conflict with Other Task
Steps
- Refer to the alarm Additional Text to determine which task was conflicting with the scheduled backup, for example:
Scheduled Backup for System Data failed due to conflict with Create Backup task for MANUAL backup CMWBackup_20190502_10 of type BRM_SYSTEM_DATA
- Confirm that the ongoing operation has completed.
Navigate to the BrmBackupManager Managed Object (MO) corresponding to the scheduled backup type, for example:
>dn ManagedElement=NODE06ST,SystemFunctions=1,BrM=1,BrmBackupManager=SYSTEM_DATA
- Wait until all tasks have completed. Continue to check the state of the current operation until it is finished:
>show progressReport,state
state=FINISHED
- Schedule a single backup.
For information on how to schedule a single backup, refer to Schedule Single Backup.
- Note:
- It is assumed that there are no scheduled backup events left in the ME, or that the existing scheduled backup events are too far in the future to wait for in order to clear the alarm.
- Wait for the scheduled backup to complete.
- Is the alarm cleared?
Yes: Proceed with Step 9.
No: Continue with the next step.
- Perform data collection. Refer to Data Collection Guideline.
- Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
- Job is completed.
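When the alarm text has been saved to a shell variable or file, the conflicting task named in the first step can be extracted mechanically. This is a sketch only: the function name conflicting_task is hypothetical, and it assumes that the Additional Text follows the wording of the example in this section, with the task description after the phrase "failed due to conflict with".

```shell
# Print the task description that follows "failed due to conflict with".
# Prints nothing if the phrase is not present in the text.
conflicting_task() {
    printf '%s\n' "$1" | sed -n 's/.*failed due to conflict with //p'
}
```

Applied to the example text in this section, the function prints "Create Backup task for MANUAL backup CMWBackup_20190502_10 of type BRM_SYSTEM_DATA".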
2.4 Handle Reason Participant Reported Error
Steps
- Perform data collection. Refer to Data Collection Guideline.
- Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
- Job is completed.
2.5 Handle Reason System Failover or Reboot
Steps
- Wait for the system to fully recover from the failover or reboot. Continue to check the status using:
# cmw-status app
Status OK
- Schedule a single backup.
For information on how to schedule a single backup, refer to Schedule Single Backup.
- Note:
- It is assumed that there are no scheduled backup events left in the ME, or that the existing scheduled backup events are too far in the future to wait for in order to clear the alarm.
- Wait for the scheduled backup to complete.
- Is the alarm cleared?
Yes: Proceed with Step 7.
No: Continue with the next step.
- Perform data collection. Refer to Data Collection Guideline.
- Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
- Job is completed.
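The repeated status check in the first step can be sketched as a bounded polling loop. This is an illustration under stated assumptions: the function name wait_for_status, the polling interval, and the attempt limit are hypothetical, and the only facts taken from this section are the cmw-status app command and its Status OK output.

```shell
# Run a command repeatedly until its output contains the expected text.
# Usage: wait_for_status <expected text> <seconds between attempts> <max attempts> <command> [args...]
# Returns 0 as soon as the text is seen, 1 if all attempts are used up.
wait_for_status() {
    expected=$1
    interval=$2
    attempts=$3
    shift 3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" 2>/dev/null | grep -q -F -- "$expected"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

For example, wait_for_status "Status OK" 10 60 cmw-status app would check every 10 seconds for up to 60 attempts. The attempt limit keeps the loop from running forever if the system does not recover, in which case the procedure continues with data collection.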
