How to use Fault Management

This user guide shows how a subscription for FM is created on DMS-ETSI and how alert rules for DMS-ETSI are set in the monitoring tool (Prometheus).

Prerequisites

  1. To create the PaaS environment for FM, the following software needs to be installed:

    • OpenStack

    • Prometheus

    • Alert Manager

    • Node Exporter

    • Kube-state-metrics

    • Notification server

  2. A containerized VNF should be installed in the PaaS environment. Deploy and instantiate it according to the steps below.

    https://docs.openstack.org/tacker/latest/user/v2/cnf/deployment_using_helm/index.html

Abbreviations

  • FM: Fault Management

  • DMS-ETSI: Deployment Management Services of ETSI, such as Tacker

  • NF FM: in this version, NF FM may be a human operator or a component of the SMO

Procedure

  1. Create FM subscription

    NF FM sends a request to DMS-ETSI to create an FM subscription. Multiple filter conditions can be set in this subscription, for example the faulty resource type (compute, storage, network), as shown in the subscription sample file sample_param_file.json.

    Follow the steps below to create a subscription for faults:

    • Confirm the “ID” of the instantiated VNF by executing the command below.

      $ openstack vnflcm list --os-tacker-api-version 2
      +--------------------------------------+-------------------+---------------------+--------------+----------------------+------------------+--------------------------------------+
      | ID                                   | VNF Instance Name | Instantiation State | VNF Provider | VNF Software Version | VNF Product Name | VNFD ID                              |
      +--------------------------------------+-------------------+---------------------+--------------+----------------------+------------------+--------------------------------------+
      | d2e61392-14dc-4b23-8d33-a19456de65c4 |                   | INSTANTIATED        | Company      | 1.0                  | Sample VNF       | b1bb0ce7-ebca-4fa7-95ed-4840d70a1177 |
      +--------------------------------------+-------------------+---------------------+--------------+----------------------+------------------+--------------------------------------+
      
    • Change the following values in the subscription sample file sample_param_file.json to the actual values confirmed above, then save the file.

      • “vnfdIds” : Set the value of “VNFD ID”

      • “vnfProvider” : Set the string of “VNF Provider”

      • “vnfProductName” : Set the string of “VNF Product Name”

      • “vnfSoftwareVersion” : Set the value of “VNF Software Version”

      • “vnfInstanceIds” : Set the value of “ID”

      The content of the subscription sample sample_param_file.json is as follows:

      {
        "filter": {
          "vnfInstanceSubscriptionFilter": {
            "vnfdIds": [
              "b1bb0ce7-ebca-4fa7-95ed-4840d70a1177"
            ],
            "vnfProductsFromProviders": [
              {
                "vnfProvider": "Company",
                "vnfProducts": [
                  {
                    "vnfProductName": "Sample VNF",
                    "versions": [
                      {
                        "vnfSoftwareVersion": "1.0",
                        "vnfdVersions": ["1.0", "2.0"]
                      }
                    ]
                  }
                ]
              }
            ],
            "vnfInstanceIds": [
              "d2e61392-14dc-4b23-8d33-a19456de65c4"
            ]
          },
          "notificationTypes": [
            "AlarmNotification",
            "AlarmClearedNotification",
            "AlarmListRebuiltNotification"
          ],
          "faultyResourceTypes": [
            "COMPUTE",
            "STORAGE",
            "NETWORK"
          ],
          "perceivedSeverities": [
            "CRITICAL",
            "MAJOR",
            "MINOR",
            "WARNING",
            "INDETERMINATE",
            "CLEARED"
          ],
          "eventTypes": [
            "EQUIPMENT_ALARM",
            "COMMUNICATIONS_ALARM",
            "PROCESSING_ERROR_ALARM",
            "ENVIRONMENTAL_ALARM",
            "QOS_ALARM"
          ],
          "probableCauses": [
            "The server cannot be connected."
          ]
        },
        "callbackUri": "http://10.0.0.194:5000/your-callback-endpoint",
        "authentication": {
          "authType": [
            "BASIC"
          ],
          "paramsBasic": {
            "userName": "nfv_user",
            "password": "devstack"
          }
        }
      }
      
    • Execute the command below to create the FM subscription.

      $ openstack vnffm sub create sample_param_file.json --os-tacker-api-version 2
      
    • Verify the FM subscription by executing the following command.

      $ openstack vnffm sub list --os-tacker-api-version 2
      +--------------------------------------+-----------------------------------------------+
      | ID                                   | Callback Uri                                  |
      +--------------------------------------+-----------------------------------------------+
      | 724b6752-b782-48e8-a8bb-a20a0fdb8d9f | http://10.0.0.194:5000/your-callback-endpoint |
      +--------------------------------------+-----------------------------------------------+
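
The manual edit of sample_param_file.json described in this step can also be scripted. The following Python sketch is only an illustration: the function name build_fm_subscription and the SAMPLE_ROW dictionary are hypothetical, the row values are the ones shown in the vnflcm list output above, and only a subset of the filter fields from the sample file is filled in.

```python
import json

# One row of `openstack vnflcm list`, keyed by the table's column names.
# The values are copied from the sample output in this guide.
SAMPLE_ROW = {
    "ID": "d2e61392-14dc-4b23-8d33-a19456de65c4",
    "VNF Provider": "Company",
    "VNF Software Version": "1.0",
    "VNF Product Name": "Sample VNF",
    "VNFD ID": "b1bb0ce7-ebca-4fa7-95ed-4840d70a1177",
}


def build_fm_subscription(row, callback_uri, user_name, password):
    """Build an FM subscription body from one vnflcm list row (hypothetical helper)."""
    instance_filter = {
        "vnfdIds": [row["VNFD ID"]],
        "vnfProductsFromProviders": [
            {
                "vnfProvider": row["VNF Provider"],
                "vnfProducts": [
                    {
                        "vnfProductName": row["VNF Product Name"],
                        "versions": [
                            {"vnfSoftwareVersion": row["VNF Software Version"]}
                        ],
                    }
                ],
            }
        ],
        "vnfInstanceIds": [row["ID"]],
    }
    return {
        "filter": {
            "vnfInstanceSubscriptionFilter": instance_filter,
            # Further filters (faultyResourceTypes, eventTypes, ...) can be
            # added here in the same way as in sample_param_file.json.
            "notificationTypes": ["AlarmNotification", "AlarmClearedNotification"],
        },
        "callbackUri": callback_uri,
        "authentication": {
            "authType": ["BASIC"],
            "paramsBasic": {"userName": user_name, "password": password},
        },
    }


if __name__ == "__main__":
    body = build_fm_subscription(
        SAMPLE_ROW,
        "http://10.0.0.194:5000/your-callback-endpoint",
        "nfv_user",
        "devstack",
    )
    # Print the JSON; redirect it to sample_param_file.json if desired.
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved as sample_param_file.json and passed to the vnffm sub create command shown above.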
      
  2. Create alert rules on the monitoring tool

    • The Prometheus configuration consists of two files.

      1. deployment.yaml, which contains all the configuration needed to dynamically discover the pods and services running in the Kubernetes cluster. No changes are needed in deployment.yaml.

      2. configmap.yaml, which contains all the alert rules for sending alerts to the Alert Manager.

        The content of the sample configmap.yaml is as follows:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: prometheus-config
          namespace: monitoring
        data:
          prometheus.rules: |-
            groups:
            - name: example
              rules:
              - alert: KubePodCrashLooping
                annotations:
                  probable_cause: The server cannot be connected.
                  fault_type: Server Down
                  fault_details: fault details
                expr: |
                  increase(kube_pod_container_status_restarts_total[10m]) > 0
                for: 1m
                labels:
                  receiver_type: tacker
                  function_type: vnffm
                  vnf_instance_id: 8c93a232-92fb-461a-a5b4-60efa2dd5f81
                  pod: vdu2-798d577c96-6t42j
                  perceived_severity: CRITICAL
                  event_type: EQUIPMENT_ALARM
        
    • After adding, deleting, or modifying an alert rule in the sample configmap.yaml, perform the following steps to make the change effective.

      1. Delete old Prometheus ConfigMap

        $ kubectl delete -f configmap.yaml
        
      2. Delete old Prometheus Deployment File

        $ kubectl delete -f deployment.yaml
        
      3. Delete Prometheus Service

        $ kubectl delete -f service.yaml
        
      4. Create Prometheus ConfigMap with updated ConfigMap

        $ kubectl apply -f configmap.yaml
        
      5. Create Prometheus Deployment File

        $ kubectl apply -f deployment.yaml
        
      6. Create Prometheus Service

        $ kubectl apply -f service.yaml
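
The six kubectl steps above can be run in one go. The following Python sketch is a convenience wrapper only, assuming kubectl is on the PATH and the three manifest files are in the current directory; the names reload_commands and reload_prometheus are hypothetical.

```python
import subprocess

# Manifest files in the order used by the steps above.
MANIFESTS = ["configmap.yaml", "deployment.yaml", "service.yaml"]


def reload_commands(manifests=MANIFESTS):
    """Return the kubectl commands in step order: delete all, then apply all."""
    deletes = [["kubectl", "delete", "-f", m] for m in manifests]
    applies = [["kubectl", "apply", "-f", m] for m in manifests]
    return deletes + applies


def reload_prometheus(run=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for cmd in reload_commands():
        print("$", " ".join(cmd))
        run(cmd, check=True)


if __name__ == "__main__":
    reload_prometheus()
```

Because check=True is passed, a failing delete (for example, when a resource does not exist yet) stops the sequence; on a fresh cluster, run only the apply commands.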
        

Requirements

  1. Receiving Notification

    • The NF FM sends a create subscription request to the DMS-ETSI.

    • After receiving the create subscription request, DMS-ETSI sends a GET request to the callbackUri of NF FM to verify that it is correct. NF FM should accept this request and return HTTP 204 to DMS-ETSI.

  2. Sending Heal Request

    • When a fault occurs in a CNF and matches a subscribed alarm condition, DMS-ETSI sends an Alarm Notification to NF FM.

    • NF FM should receive the notification, extract the VNF/VNFC information (vnfInstanceId, vnfcInstanceId) from it, and then send a Heal CNF request to NF-LCM, which in turn sends the heal request to DMS-ETSI.
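
The two requirements above can be sketched as a small callback endpoint on the NF FM side. This is a minimal illustration rather than a reference implementation, and it rests on several assumptions: the names CallbackHandler, heal_target, and make_server are hypothetical; the notification is assumed to carry the VNF instance ID in alarm.managedObjectId and the affected VNFCs in alarm.vnfcInstanceIds; the BASIC authentication configured in the subscription is not checked; and the Heal request to NF-LCM is left as a placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def heal_target(notification):
    """Return (vnfInstanceId, vnfcInstanceIds) from an alarm notification.

    Assumption: the IDs are found in alarm.managedObjectId and
    alarm.vnfcInstanceIds.
    """
    alarm = notification.get("alarm", {})
    return alarm.get("managedObjectId"), alarm.get("vnfcInstanceIds", [])


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Verification request sent by DMS-ETSI when the subscription is created.
        self.send_response(204)
        self.end_headers()

    def do_POST(self):
        # Notification sent by DMS-ETSI when a subscribed fault occurs.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if body.get("notificationType") == "AlarmNotification":
            vnf_id, vnfc_ids = heal_target(body)
            # Placeholder: a real NF FM would send a Heal CNF request
            # to NF-LCM here.
            print("heal candidate:", vnf_id, vnfc_ids)
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        # Silence the default per-request logging.
        pass


def make_server(host="0.0.0.0", port=5000):
    """Create the callback server; port 5000 matches the sample callbackUri."""
    return HTTPServer((host, port), CallbackHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

With this server running on the host named in callbackUri, the subscription creation in the Procedure section should pass the GET verification.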

References