
Management

Perform system management tasks on your NSO deployment.

System Management

Perform NSO system management and configuration.

NSO consists of a number of modules and executable components. These components are referred to by their command-line names, e.g. ncs, ncs-netsim, and ncs_cli. The name ncs is also used to refer to the running daemon.

Starting NSO

When NSO is started, it reads its configuration file and starts all subsystems configured to start (such as NETCONF, CLI, etc.).

By default, NSO starts in the background without an associated terminal. It is recommended to use a System Install when installing NSO for production deployment. This will create an init script that starts NSO when the system boots and runs NSO under the system service manager.

Licensing NSO

NSO is licensed using Cisco Smart Licensing. To register your NSO instance, you need to enter a token from your Cisco Smart Software Manager account. For more information on this topic, see Cisco Smart Licensing.

Configuring NSO

NSO is configured in the following two ways:

  • Through its configuration file, ncs.conf.

  • Through data configured at run-time over any northbound interface, for example, turning on trace using the CLI.

ncs.conf File

The configuration file ncs.conf is read at startup and can be reloaded. See ncs.conf in Manual Pages for more information. Important configuration settings are:

  • load-path: where NSO should look for compiled YANG files, such as data models for NEDs or Services.

  • db-dir: the directory on disk that CDB uses for its storage and any temporary files being used. It is also the directory where CDB searches for initialization files. This should be a local disk and not NFS mounted for performance reasons.

  • Various log settings.
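As an illustration, a minimal ncs.conf covering the settings above might look as follows (directory paths are example values; element names follow tailf-ncs-config.yang):

```xml
<!-- Minimal illustrative ncs.conf; paths are examples -->
<ncs-config xmlns="http://tail-f.com/yang/tailf-ncs-config">
  <!-- where NSO looks for compiled YANG files and packages -->
  <load-path>
    <dir>./packages</dir>
  </load-path>
  <!-- CDB storage; should be a local disk, not NFS -->
  <cdb>
    <db-dir>./ncs-cdb</db-dir>
  </cdb>
  <!-- basic log settings -->
  <logs>
    <ncs-log>
      <enabled>true</enabled>
      <file>
        <name>./logs/ncs.log</name>
        <enabled>true</enabled>
      </file>
    </ncs-log>
  </logs>
</ncs-config>
```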

The ncs.conf file is described in the NSO Manual Pages. There is a large number of configuration items in ncs.conf; most of them have sane default values. The ncs.conf file is an XML file that must adhere to the tailf-ncs-config.yang model. If we start the NSO daemon directly, we must provide the path to the NCS configuration file, as in ncs -c /etc/ncs/ncs.conf.

However, in a System Install, systemd is typically used to start NSO, and it will pass the appropriate options to the ncs command. Thus, NSO is started with the command systemctl start ncs.

It is possible to edit the ncs.conf file and then tell NSO to reload the edited file without restarting the daemon, using ncs --reload.

This command also tells NSO to close and reopen all log files, which makes it suitable to use from a system like logrotate.
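Because ncs --reload reopens the log files, it can serve as the post-rotation hook in a logrotate rule. A sketch, assuming logs under /var/log/ncs and ncs on root's PATH (both assumptions; adjust to your installation):

```
/var/log/ncs/*.log {
    weekly
    rotate 4
    compress
    missingok
    postrotate
        ncs --reload
    endscript
}
```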

In this section, some of the important configuration settings will be described and discussed.

Exposed Interfaces

NSO allows access through a number of different interfaces, depending on the use case. In the default configuration, clients can access the system locally through an unauthenticated IPC socket (with the ncs* family of commands, port 4569) and plain (non-HTTPS) HTTP web server (port 8080). Additionally, the system enables remote access through SSH-secured NETCONF and CLI (ports 2022 and 2024).

We strongly encourage you to review and customize the exposed interfaces to your needs in the ncs.conf configuration file. In particular, set:

  • /ncs-config/webui/match-host-name to true.

  • /ncs-config/webui/server-name to the hostname of the server.

If you decide to allow remote access to the web server, also make sure you use TLS-secured HTTPS instead of HTTP. Not doing so exposes you to security risks.

Using /ncs-config/webui/match-host-name = true requires you to use the configured hostname when accessing the server. Web browsers do this automatically but you may need to set the Host header when performing requests programmatically using an IP address instead of the hostname.
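The two webui settings above correspond to an ncs.conf fragment along these lines (the server name is an example):

```xml
<webui>
  <match-host-name>true</match-host-name>
  <server-name>nso.example.com</server-name>
</webui>
```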

To additionally secure IPC access, refer to .

For more details on individual interfaces and their use, see .

Dynamic Configuration

Let's look at all the settings that can be manipulated through the NSO northbound interfaces. NSO itself has a number of built-in YANG modules. These YANG modules describe the structure that is stored in CDB. Whenever we change anything under, say /devices/device, it will change the CDB, but it will also change the configuration of NSO. We call this dynamic configuration since it can be changed at will through all northbound APIs.

We summarize the most relevant parts below:

tailf-ncs.yang Module

This is the most important YANG module that is used to control and configure NSO. The module can be found at: $NCS_DIR/src/ncs/yang/tailf-ncs.yang in the release. Everything in that module is available through the northbound APIs. The YANG module has descriptions for everything that can be configured.

tailf-common-monitoring2.yang and tailf-ncs-monitoring2.yang are two modules that are relevant to monitoring NSO.

Built-in or External SSH Server

NSO has a built-in SSH server which makes it possible to SSH directly into the NSO daemon. Both the NSO northbound NETCONF agent and the CLI need SSH. To configure the built-in SSH server we need a directory with server SSH keys - it is specified via /ncs-config/aaa/ssh-server-key-dir in ncs.conf. We also need to enable /ncs-config/netconf-north-bound/transport/ssh and /ncs-config/cli/ssh in ncs.conf. In a System Install, ncs.conf is installed in the "config directory", by default /etc/ncs, with the SSH server keys in /etc/ncs/ssh.
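Put together, the settings mentioned above correspond to ncs.conf fragments like the following (key directory shown for a System Install; verify against tailf-ncs-config.yang):

```xml
<aaa>
  <ssh-server-key-dir>/etc/ncs/ssh</ssh-server-key-dir>
</aaa>
<netconf-north-bound>
  <transport>
    <ssh>
      <enabled>true</enabled>
    </ssh>
  </transport>
</netconf-north-bound>
<cli>
  <ssh>
    <enabled>true</enabled>
  </ssh>
</cli>
```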

Run-time Configuration

There are also configuration parameters that are more related to how NSO behaves when talking to the devices. These reside in devices global-settings.

User Management

Users are configured at the path aaa authentication users.

Access control, including group memberships, is managed using the NACM model (RFC 6536).
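For illustration, a NACM group and a permit-all rule-list for it, expressed in the ietf-netconf-acm model (group, user, and rule names are examples):

```xml
<nacm xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-acm">
  <groups>
    <group>
      <name>admin</name>
      <user-name>alice</user-name>
    </group>
  </groups>
  <rule-list>
    <name>admin</name>
    <group>admin</group>
    <rule>
      <name>any-access</name>
      <action>permit</action>
    </rule>
  </rule-list>
</nacm>
```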

Adding a User

Adding a user includes the following steps:

  1. Create the user: admin@ncs(config)# aaa authentication users user <user-name>.

  2. Add the user to a NACM group: admin@ncs(config)# nacm groups group <group-name> user-name <user-name>.

  3. Verify/change access rules.

It is likely that the new user also needs access to work with device configuration. The mapping from NSO users and corresponding device authentication is configured in authgroups. So, the user needs to be added there as well.
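A possible CLI session covering the steps above, including the authgroups mapping (user name, group, and credentials are examples; NSO will also require the user's mandatory leaves, such as the password, to be set before commit):

```
admin@ncs(config)# aaa authentication users user alice
admin@ncs(config-user-alice)# exit
admin@ncs(config)# nacm groups group admin user-name alice
admin@ncs(config)# devices authgroups group default umap alice remote-name alice remote-password secret
admin@ncs(config)# commit
```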

If the last step is forgotten, device operations for the new user will fail with an authentication error.

Monitoring NSO

This section describes how to monitor NSO. See also .

Use the command ncs --status to get runtime information on NSO.

NSO Status

Checking the overall status of NSO can be done from the shell with ncs --status, or in the CLI with show ncs-state.

For details on the output, see $NCS_DIR/src/ncs/yang/tailf-common-monitoring2.yang.

Below is an overview of the output:

It is also important to look at the packages that are loaded. This can be done in the CLI with show packages.

Monitoring the NSO Daemon

NSO runs the following processes:

  • The daemon: ncs.smp: this is the NCS process running in the Erlang VM.

  • Java VM: com.tailf.ncs.NcsJVMLauncher: service applications implemented in Java run in this VM. There are several options on how to start the Java VM, it can be monitored and started/restarted by NSO or by an external monitor. See the Manual Page and the java-vm settings in the CLI.

  • Python VMs: NSO packages can be implemented in Python. The individual packages can be configured to run a VM each or share a Python VM.

Logging

NSO has extensive logging functionality. Log settings are typically very different for a production system compared to a development system. Furthermore, the logging of the NSO daemon and the NSO Java VM/Python VM is controlled by different mechanisms. During development, we typically want to turn on the developer-log. The sample ncs.conf that comes with the NSO release has log settings suitable for development, while the ncs.conf created by a System Install is suitable for production deployment.

By default, NSO logs to the logs directory under its run directory (this depends on your settings in ncs.conf). You might want the log files to be stored somewhere else. See man ncs.conf for details on how to configure the various logs. Below is a list of the most useful log files:

  • ncs.log: NCS daemon log. See Log Messages and Formats. Can be configured to Syslog.

  • ncserr.log.1, ncserr.log.idx, ncserr.log.siz: if the NSO daemon has a problem, these files contain debug information relevant to support. The content can be displayed with ncs --printlog ncserr.log.

Syslog

NSO can log to a local Syslog daemon. See man ncs.conf for how to configure the Syslog settings. All Syslog messages are documented in Log Messages. The ncs.conf file also lets you decide which of the logs should go into Syslog: ncs.log, devel.log, netconf.log, snmp.log, audit.log, and the WebUI access log. It is also possible to integrate with rsyslog to log the NCS, developer, audit, NETCONF, SNMP, and WebUI access logs to syslog with the facility set to daemon in ncs.conf. For reference, see the upgrade-l2 example, located in examples.ncs/development-guide/high-availability/hcc.

Below is an example of Syslog configuration:

Log messages are described in Log Messages and Formats.

NSO Alarms

NSO generates alarms for serious problems that must be remedied. Alarms are available over all the northbound interfaces and exist at the path /alarms. NSO alarms are managed as any other alarms by the general NSO Alarm Manager, see the specific section on the alarm manager in order to understand the general alarm mechanisms.

The NSO alarm manager also presents a northbound SNMP view, alarms can be retrieved as an alarm table, and alarm state changes are reported as SNMP Notifications. See the "NSO Northbound" documentation on how to configure the SNMP Agent.

This is also documented in the example /examples.ncs/getting-started/using-ncs/5-snmp-alarm-northbound.

Alarms are described in Alarm Types.

Trace ID

NSO can issue a unique Trace ID per northbound request, visible in logs and trace headers. This Trace ID can be used to follow the request from service invocation to configuration changes pushed to any device affected by the change. The Trace ID may either be passed in from an external client or generated by NSO.

Trace ID is enabled by default and can be turned off by adding the following snippet to ncs.conf:
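A sketch of that snippet, assuming the trace-id leaf lives under /ncs-config/logs (verify the exact location against tailf-ncs-config.yang for your release):

```xml
<logs>
  <trace-id>false</trace-id>
</logs>
```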

Trace ID is propagated downwards in LSA setups and is fully integrated with commit queues.

Trace ID can be passed to NSO over NETCONF, RESTCONF, JSON-RPC, or CLI as a commit parameter.

If Trace ID is not given as a commit parameter, NSO will generate one. The generated Trace ID is an array of 16 random bytes, encoded as a 32-character hexadecimal string.
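The shape of such a generated value can be illustrated in Python (this mimics only the format; it is not NSO's internal generator):

```python
import secrets

# 16 random bytes, hex-encoded into a 32-character string:
# the same shape as a generated Trace ID
trace_id = secrets.token_hex(16)

print(trace_id, len(trace_id))
```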

For RESTCONF requests, this generated Trace ID will be communicated back to the requesting client as an HTTP header called X-Cisco-NSO-Trace-ID. The trace-id query parameter can also be used with RPCs and actions to relay a trace-id from northbound requests.

For NETCONF, the Trace ID will be returned as an attribute called trace-id.

Trace ID will appear in relevant log entries and trace file headers in the form trace-id=....

Disaster Management

This section describes a number of disaster scenarios and recommends various actions to take in the different disaster variants.

NSO Fails to Start

CDB keeps its data in four files A.cdb, C.cdb, O.cdb and S.cdb. If NSO is stopped, these four files can be copied, and the copy is then a full backup of CDB.

Furthermore, if none of these files exist in the configured CDB directory, CDB will attempt to initialize from all files in the CDB directory with the suffix .xml.

Thus, there are two different ways to re-initialize CDB from a previously known good state: from .xml files or from a CDB backup. The .xml files would typically be used to reinstall factory defaults, whereas a CDB backup could be used in more complex scenarios.

If the S.cdb file has become inconsistent or has been removed, all commit queue items will be removed, and devices with queued changes not yet processed will be out of sync. In such an event, appropriate alarms will be raised on the devices, and any service instance that has unprocessed device changes will be set in the failed state.

When NSO starts and fails to initialize, the following exit codes can occur:

  • Exit codes 1 and 19 mean that an internal error has occurred. A text message should be in the logs, or if the error occurred at startup before logging had been activated, on standard error (standard output if NSO was started with --foreground --verbose). Generally, the message will only be meaningful to the NSO developers, and an internal error should always be reported to support.

  • Exit codes 2 and 3 are only used for the NCS control commands (see the section COMMUNICATING WITH NCS in the ncs(1) manual page) and mean that the command failed due to timeout. Code 2 is used when the initial connect to NSO didn't succeed within 5 seconds (or the TryTime if given), while code 3 means that the NSO daemon did not complete the command within the time given by the --timeout option.

If the NSO daemon starts normally, the exit code is 0.

If the AAA database is broken, NSO will start, but with no authorization rules loaded. This means that all write access to the configuration is denied. The NSO CLI can be started with the --noaaa flag (ncs_cli --noaaa), which allows full unauthorized access to the configuration.

NSO Failure After Startup

NSO attempts to handle all runtime problems without terminating, e.g., by restarting specific components. However, there are some cases where this is not possible, described below. When NSO is started the default way, i.e. as a daemon, the exit codes will of course not be available, but see the --foreground option in the ncs(1) Manual Page.

  • Out of memory: If NSO is unable to allocate memory, it will exit by calling abort(3). This will generate an exit code as for reception of the SIGABRT signal, e.g. if NSO is started from a shell script, it will see 134 as the exit code (128 + the signal number).

  • Out of file descriptors for accept(2): If NSO fails to accept a TCP connection due to lack of file descriptors, it will log this and then exit with code 25. To avoid this problem, make sure that the process and system-wide file descriptor limits are set high enough, and if needed, configure session limits in ncs.conf. The out-of-file-descriptors issue may also manifest itself in that applications are no longer able to open new file descriptors. In many Linux systems, the default limit is 1024, but if we, for example, assume that there are four northbound interface ports (CLI, RESTCONF, SNMP, WebUI/JSON-RPC, or similar) plus a few hundred IPC ports, a limit on the order of 5 x 1024 == 5120 may be needed. But one might as well use the next power of two, 8192, to be on the safe side.

    Several application issues can contribute to consuming extra ports. In the scope of an NSO application, that could, for example, be a script application that invokes CLI commands or a callback daemon application that does not close its connection socket as it should.

Transaction Commit Failure

When the system is updated, NSO executes a two-phase commit protocol towards the different participating databases, including CDB. If a participant fails in the commit() phase even though it succeeded in the prepare phase, the configuration is possibly in an inconsistent state.

When NSO considers the configuration to be in an inconsistent state, operations will continue. It is still possible to use NETCONF, the CLI, and all other northbound management agents. The CLI has a different prompt which reflects that the system is considered to be in an inconsistent state, and the Web UI shows this as well.

The MAAPI API has two interface functions that can be used to set and retrieve the consistency status: maapi_set_running_db_status() and maapi_get_running_db_status(), respectively. This API can thus be used to manually reset the consistency state. The only other alternative for resetting the state to consistent is to reload the entire configuration.

Backup and Restore

All parts of the NSO installation can be backed up and restored with standard file system backup procedures.

The most convenient way to do backup and restore is to use the ncs-backup command. In that case, the following procedure is used.

Take a Backup

NSO backup saves the database (CDB) files, state files, config files, and rollback files from the installation directory. To take a complete backup (for disaster recovery), use the ncs-backup command.

The backup will be stored in the "run directory", by default /var/opt/ncs, as /var/opt/ncs/backups/ncs-VERSION@DATETIME.backup.gz.

For more information on backup, refer to ncs-backup(1) in Manual Pages.

Restore a Backup

NSO Restore is performed if you would like to switch back to a previous good state or restore a backup.

It is always advisable to stop NSO before performing a restore.

  1. First, stop NSO if it is not already stopped.

  2. Restore the backup.

    Select the backup to be restored from the available list of backups. The configuration and database with run-time state files are restored in /etc/ncs and /var/opt/ncs.

  3. Start NSO.
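For a System Install managed by systemd, the three steps above could look like this (the --restore option prompts for a backup to select from the available list; check ncs-backup(1) for the exact options in your release):

```
# systemctl stop ncs
# ncs-backup --restore
# systemctl start ncs
```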

Rollbacks

NSO supports creating rollback files during the commit of a transaction, which allows rolling back the introduced changes. Rollbacks do not come without a cost and should be disabled if the functionality is not going to be used. Enabling rollbacks both increases the time it takes to commit a change and requires sufficient storage on disk.

Rollback files contain a set of headers and the data required to undo the changes that were made when the rollback file was created. One of the header fields is a unique rollback ID that can be used to address the rollback file independently of the rollback numbering format.

The use of rollbacks from the supported APIs and the CLI is documented in the documentation for the given API.

ncs.conf Config for Rollback

As described earlier, NSO is configured through the configuration file, ncs.conf. In that file, we have the following items related to rollbacks:

  • /ncs-config/rollback/enabled: If set to true, then a rollback file will be created whenever the running configuration is modified.

  • /ncs-config/rollback/directory: Location where rollback files will be created.

  • /ncs-config/rollback/history-size: The number of old rollback files to save.
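An illustrative ncs.conf fragment combining the three settings (directory and history size are example values):

```xml
<rollback>
  <enabled>true</enabled>
  <directory>${NCS_RUN_DIR}/rollbacks</directory>
  <history-size>50</history-size>
</rollback>
```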

Troubleshooting

New users can face problems when they start to use NSO. If you face an issue, reach out to our support team, regardless of whether your problem is listed here or not.

A useful tool in this regard is ncs-collect-tech-report, a Bash script that comes with the product. It collects all log files, a CDB backup, and several debug dumps into a TAR file. Note that it works only with a System Install.

Some noteworthy issues are covered here.

Installation Problems: Error Messages During Installation
  • Error

  • Impact The resulting installation is incomplete.

Problem Starting NSO: NSO Terminating with GLIBC Error
  • Error

  • Impact NSO terminates immediately with a message similar to the one above.

Problem in Running Examples: The netconf-console Program Fails
  • Error You must install the Python SSH implementation Paramiko in order to use SSH.

  • Impact Sending NETCONF commands and queries with netconf-console fails, while it works using netconf-console-tcp.

Problems Using and Developing Services

If you encounter issues while loading service packages, creating service instances, or developing service models, templates, and code, you can consult the Troubleshooting section in Implementing Services.

General Troubleshooting Strategies

If you have trouble starting or running NSO, examples, or the clients you write, here are some troubleshooting tips.

Transcript

When contacting support, it often helps the support engineer to understand what you are trying to achieve if you copy-paste the commands, responses, and shell scripts that you used to trigger the problem, together with any CLI outputs and logs produced by NSO.

Source ENV Variables

If you have problems executing ncs commands, make sure you source the ncsrc script in your NSO directory (your path may differ from the one in the example if you are using a Local Install), which sets the required environment variables.
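For example, for a Local Install unpacked under /opt/nso (the path is illustrative):

```
$ source /opt/nso/ncsrc
$ which ncs_cli
```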

Log Files

To find out what NSO is/was doing, browsing NSO log files is often helpful. In the examples, they are called devel.log, ncs.log, and audit.log. If you are working with your own system, make sure that the log files are enabled in ncs.conf. They are already enabled in all the examples. You can read more about how to enable and inspect various logs in the Logging section.

Verify HW Resources

Both high CPU utilization and a lack of memory can negatively affect the performance of NSO. You can use commands such as top to examine resource utilization, and free -mh to see the amount of free and consumed memory. A common symptom of a lack of memory is NSO or the Java VM restarting. A sufficient amount of disk space is also required for CDB persistence and logs, so you can check disk space with the df -h command. If there is enough space on the disk and you still encounter ENOSPC errors, check the inode usage with the df -i command.

Status

NSO will give you a comprehensive status of daemon status, YANG modules, loaded packages, MIBs, active user sessions, CDB locks, and more if you run ncs --status.

NSO status information is also available as operational data under /ncs-state.

Check Data Provider

If you are implementing a data provider (for operational or configuration data), you can verify that it works for all possible data items using ncs --check-callbacks.

Debug Dump

If you suspect you have experienced a bug in NSO, or NSO told you so, you can give support a debug dump to help diagnose the problem. It contains a lot of status information (including a full ncs --status report) and some internal state information. This information is only readable and comprehensible to the NSO development team, so send the dump to your support contact. A debug dump is created using ncs --debug-dump <file>.

Just as in CSI on TV, the information must be collected as soon as possible after the event. Many interesting traces will wash away with time, or stay undetected if there are lots of irrelevant facts in the dump.

If NSO gets stuck while terminating, it can optionally create a debug dump after being stuck for 60 seconds. To enable this mechanism, set the environment variable $NCS_DEBUG_DUMP_NAME to a filename of your choice.

Error Log

Another thing you can do in case you suspect that you have experienced a bug in NSO is to collect the error log. The logged information is only readable and comprehensible to the NSO development team, so send the log to your support contact. The log actually consists of a number of files called ncserr.log.* - make sure to provide them all.

System Dump

If NSO aborts due to failure to allocate memory (see Disaster Management), and you believe that this is due to a memory leak in NSO, creating one or more debug dumps as described above (before NSO aborts) will produce the most useful information for support. If this is not possible, NSO will produce a system dump by default before aborting, unless DISABLE_NCS_DUMP is set.

The default system dump file name is ncs_crash.dump and it could be changed by setting the environment variable $NCS_DUMP before starting NSO. The dumped information is only comprehensible to the NSO development team, so send the dump to your support contact.

System Call Trace

To catch certain types of problems, especially relating to system start and configuration, the operating system's system call trace can be invaluable. This tool is called strace/ktrace/truss. Please send the result to your support contact for a diagnosis.

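Typical invocations when tracing NSO startup, one per OS (output file names are examples, and flags may vary between versions of the tools):

```
# Linux
strace -f -s 1024 -o nso.strace ncs -c /etc/ncs/ncs.conf --foreground
# BSD
ktrace -ad -f nso.ktrace ncs -c /etc/ncs/ncs.conf --foreground
kdump -f nso.ktrace > nso.kdump
# Solaris
truss -f -o nso.truss ncs -c /etc/ncs/ncs.conf --foreground
```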

Other common ncs.conf settings include:

  • AAA configuration.

  • Rollback file directory and history length.

  • Enabling north-bound interfaces like REST and WebUI.

  • Enabling of High-Availability mode.

Additional sections of the ncs --status output:

    northbound agents

    All northbound agents like CLI, REST, NETCONF, SNMP, etc. are listed with their IP and port. So if you want to connect over REST, for example, you can see the port number here.

    patches

    Lists any installed patches.

    upgrade-mode

    If the node is in upgrade mode, it is not possible to get any information from the system over NETCONF. Existing CLI sessions can get system information.

To inspect the Python VM threads in the CLI, use show python-vm status current to see current threads and show python-vm status start to see which threads were started at startup time.
  • audit.log: central audit log covering all northbound interfaces. See Log Messages and Formats. Can be configured to Syslog.
  • localhost:8080.access: all HTTP requests to the daemon. This is an access log for the embedded Web server. This file adheres to the Common Log Format, as defined by Apache and others. This log is not enabled by default and is not rotated, i.e. use logrotate(8). Can be configured to Syslog.

  • devel.log: developer-log is a debug log for troubleshooting user-written code. This log is enabled by default and is not rotated, i.e. use logrotate(8). This log shall be used in combination with the java-vm or python-vm logs. The user code logs in the VM logs and the corresponding library logs in devel.log. Disable this log in production systems. Can be configured to Syslog. You can manage this log and set its logging level in ncs.conf.

  • ncs-java-vm.log, ncs-python-vm.log: logger for code running in Java or Python VM, for example, service applications. Developers writing Java and Python code use this log (in combination with devel.log) for debugging. Both Java and Python log levels can be set from their respective VM settings in, for example, the CLI.

  • netconf.log, snmp.log: Log for northbound agents. Can be configured to Syslog.

  • rollbackNNNNN: All NSO commits generate a corresponding rollback file. The maximum number of rollback files and file numbering can be configured in ncs.conf.

  • xpath.trace: XPATH is used in many places, for example, XML templates. This log file shows the evaluation of all XPATH expressions and can be enabled in the ncs.conf.

    To debug XPATH for a template, use the pipe target debug in the CLI instead.

  • ned-cisco-ios-xr-pe1.trace (for example): if device trace is turned on a trace file will be created per device. The file location is not configured in ncs.conf but is configured when the device trace is turned on, for example in the CLI.

  • Progress trace log: When a transaction or action is applied, NSO emits specific progress events. These events can be displayed and recorded in a number of different ways, either in CLI with the pipe target details on a commit, or by writing it to a log file. You can read more about it in the Progress Trace.

  • Transaction error log: log for collecting information on failed transactions that lead to either a CDB boot error or a runtime transaction failure. The default is false (disabled). More information about the log is available in the Manual Pages under Configuration Parameters (see logs/transaction-error-log).

  • Upgrade log: log containing information about CDB upgrade. The log is enabled by default and not rotated (i.e., use logrotate). With the NSO example set, the following examples populate the log in the logs/upgrade.log file: examples.ncs/development-guide/ned-upgrade/yang-revision, examples.ncs/development-guide/high-availability/upgrade-basic, examples.ncs/development-guide/high-availability/upgrade-cluster, and examples.ncs/getting-started/developing-with-ncs/14-upgrade-service. More information about the log is available in the Manual Pages under Configuration Parameters (see logs/upgrade-log).

Further exit codes when NSO fails to start:

  • Exit code 10 means that one of the init files in the CDB directory was faulty in some way; further information can be found in the log.

  • Exit code 11 means that the CDB configuration was changed in an unsupported way. This will only happen when an existing database is detected, which was created with another configuration than the current in ncs.conf.

  • Exit code 13 means that the schema change caused an upgrade, but for some reason, the upgrade failed. Details are in the log. The way to recover from this situation is either to correct the problem or to re-install the old schema (fxs) files.

  • Exit code 14 means that the schema change caused an upgrade, but for some reason the upgrade failed, corrupting the database in the process. This is rare and usually caused by a bug. To recover, either start from an empty database with the new schema, or re-install the old schema files and apply a backup.

  • Exit code 15 means that A.cdb or C.cdb is corrupt in a non-recoverable way. Remove the files and re-start using a backup or init files.

  • Exit code 16 means that CDB ran into an unrecoverable file error (such as running out of space on the device while performing journal compaction).

  • Exit code 20 means that NSO failed to bind a socket.

  • Exit code 21 means that some NSO configuration file is faulty. More information is in the logs.

  • Exit code 22 indicates an NSO installation-related problem, e.g., that the user does not have read access to some library files, or that some file is missing.

  • A commonly used command for changing the maximum number of open file descriptors is ulimit -n [limit]. Commands such as netstat and lsof can be useful to debug file descriptor-related issues.

The causes and resolutions for the errors listed under Troubleshooting above are:

    Cause This happens if the installation program has been damaged, most likely because it has been downloaded in ASCII mode.
  • Resolution Remove the installation directory. Download a new copy of NSO from our servers. Make sure you use binary transfer mode every step of the way.

  • Cause This happens if you are running on a very old Linux version. The GNU libc (GLIBC) version is older than 2.3.4, which was released in 2004.
  • Resolution Use a newer Linux system, or upgrade the GLIBC installation.

  • Cause The netconf-console command is implemented using the Python programming language. It depends on the Python SSHv2 implementation Paramiko. Since you are seeing this message, your operating system doesn't have the Python module Paramiko installed.

  • Resolution Install Paramiko using the instructions from https://www.paramiko.org. When properly installed, you will be able to import the Paramiko module without error messages.

    Exit the Python interpreter with Ctrl+D.

  • Workaround A workaround is to use netconf-console-tcp. It uses TCP instead of SSH and doesn't require Paramiko. Note that TCP traffic is not encrypted.

Key sections of the ncs --status output include:

  • daemon-status

    You can see the NSO daemon mode, starting, phase0, phase1, started, stopping. The phase0 and phase1 modes are schema upgrade modes and will appear if you have upgraded any data models.

    version

    The NSO version.

    smp

    Number of threads used by the daemon.

    ha

    The High-Availability mode of the NCS daemon will show up here: secondary, primary, relay-secondary.

    internal/callpoints

    The next section is callpoints. Make sure that any validation points, etc. are registered. (The ncs-rfs-service-hook is an obsolete callpoint, ignore this one).

    • UNKNOWN: code tries to register a call-point that does not exist in a data model.

    • NOT-REGISTERED: a loaded data model has a call-point, but no code has registered it.

    Of special interest is of course the servicepoints. All your deployed service models should have a corresponding service-point. For example:

    internal/cdb

    The cdb section is important. Look for any locks. This might be a sign that a developer has taken a CDB lock without releasing it. The subscriber section is also important. A design pattern is to register subscribers to wait for something to change in NSO and then trigger an action. Reactive FASTMAP is designed around that. Validate that all expected subscribers are OK.

    loaded-data-models

    The next section shows all namespaces and YANG modules that are loaded. If you, for example, are missing a service model, make sure it is loaded.


    cli, netconf, rest, snmp, webui

        <developer-log>
          <enabled>true</enabled>
          <file>
            <name>${NCS_LOG_DIR}/devel.log</name>
            <enabled>false</enabled>
          </file>
          <syslog>
            <enabled>true</enabled>
          </syslog>
        </developer-log>
        <developer-log-level>trace</developer-log-level>
    admin@ncs(config)# python-vm logging level level-info
    admin@ncs(config)# java-vm java-logging logger com.tailf.maapi level level-info
        <xpathTraceLog>
          <enabled>true</enabled>
          <filename>${NCS_LOG_DIR}/xpath.trace</filename>
        </xpathTraceLog>
    admin@ncs(config)# commit | debug template
    admin@ncs(config)# devices device r0 trace pretty
    $ python
    ...
    >>> import paramiko
    >>>
    # ncs -c /etc/ncs/ncs.conf
    # systemctl start ncs
    # ncs --reload
    ncs@ncs(config)#
    Possible completions:
      aaa                        AAA management, users and groups
      cluster                    Cluster configuration
      devices                    Device communication settings
      java-vm                    Control of the NCS Java VM
      nacm                       Access control
      packages                   Installed packages
      python-vm                  Control of the NCS Python VM
      services                   Global settings for services, (the services themselves might be augmented somewhere else)
      session                    Global default CLI session parameters
      snmp                       Top-level container for SNMP related configuration and status objects.
      snmp-notification-receiver Configure reception of SNMP notifications
      software                   Software management
      ssh                        Global SSH connection configuration
    admin@ncs(config)# devices global-settings
    Possible completions:
      backlog-auto-run               Auto-run the backlog at successful connection
      backlog-enabled                Backlog requests to non-responding devices
      commit-queue
      commit-retries                 Retry commits on transient errors
      connect-timeout                Timeout in seconds for new connections
      ned-settings                   Control which device capabilities NCS uses
      out-of-sync-commit-behaviour   Specifies the behaviour of a commit operation involving a device that is out of sync with NCS.
      read-timeout                   Timeout in seconds used when reading data
      report-multiple-errors         By default, when the NCS device manager commits data southbound and when there are errors, we only
                                     report the first error to the operator, this flag makes NCS report all errors reported by managed
                                     devices
      trace                          Trace the southbound communication to devices
      trace-dir                      The directory where trace files are stored
      write-timeout                  Timeout in seconds used when writing data
    admin@ncs(config)# show full-configuration aaa authentication users user
    aaa authentication users user admin
     uid        1000
     gid        1000
     password   $1$GNwimSPV$E82za8AaDxukAi8Ya8eSR.
     ssh_keydir /var/ncs/homes/admin/.ssh
     homedir    /var/ncs/homes/admin
    !
    aaa authentication users user oper
     uid        1000
     gid        1000
     password   $1$yOstEhXy$nYKOQgslCPyv9metoQALA.
     ssh_keydir /var/ncs/homes/oper/.ssh
     homedir    /var/ncs/homes/oper
    !...
    admin@ncs(config)# show full-configuration nacm
    nacm write-default permit
    nacm groups group admin
     user-name [ admin private ]
    !
    nacm groups group oper
     user-name [ oper public ]
    !
    nacm rule-list admin
     group [ admin ]
     rule any-access
      action permit
     !
    !
    nacm rule-list any-group
     group [ * ]
     rule tailf-aaa-authentication
      module-name       tailf-aaa
      path              /aaa/authentication/users/user[name='$USER']
      access-operations read,update
      action            permit
     !
    admin@ncs(config)# show full-configuration devices authgroups
    devices authgroups group default
     umap admin
      remote-name     admin
      remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
     !
     umap oper
      remote-name     oper
      remote-password $4$zp4zerM68FRwhYYI0d4IDw==
     !
    !
    jim@ncs(config)# devices device c0 config ios:snmp-server community fee
    jim@ncs(config-config)# commit
    Aborted: Resource authgroup for jim doesn't exist
    $ ncs --status
    ncs# show ncs-state
    admin> show packages
    packages package cisco-asa
     package-version 3.4.0
     description     "NED package for Cisco ASA"
     ncs-min-version [ 3.2.2 3.3 3.4 4.0 ]
     directory       ./state/packages-in-use/1/cisco-asa
     component upgrade-ned-id
      upgrade java-class-name com.tailf.packages.ned.asa.UpgradeNedId
     component ASADp
      callback java-class-name [ com.tailf.packages.ned.asa.ASADp ]
     component cisco-asa
      ned cli ned-id  cisco-asa
      ned cli java-class-name com.tailf.packages.ned.asa.ASANedCli
      ned device vendor Cisco
        <syslog-config>
          <facility>daemon</facility>
        </syslog-config>
    
        <ncs-log>
          <enabled>true</enabled>
          <file>
            <name>./logs/ncs.log</name>
            <enabled>true</enabled>
          </file>
          <syslog>
            <enabled>true</enabled>
          </syslog>
        </ncs-log>
    <trace-id>false</trace-id>
      -- WARNING ------------------------------------------------------
      Running db may be inconsistent. Enter private configuration mode and
      install a rollback configuration or load a saved configuration.
      ------------------------------------------------------------------
    # ncs-backup
    # systemctl stop ncs
    # ncs-backup --restore
    # systemctl start ncs
    root@linux:/# ncs-collect-tech-report --full 
    tar: Skipping to next header
    gzip: stdin: invalid compressed data--format violated
    Internal error: Open failed: /lib/tls/libc.so.6: version
    `GLIBC_2.3.4' not found (required by
    .../lib/ncs/priv/util/syst_drv.so)
    $ source /etc/profile.d/ncs.sh
    $ ncs --status
    $ ncs --check-callbacks
    $ ncs --debug-dump mydump1
    # strace -f -o mylog1.strace -s 1024 ncs ...
    # ktrace -ad -f mylog1.ktrace ncs ...
    # kdump -f mylog1.ktrace > mylog1.kdump
    # truss -f -o mylog1.truss ncs ...
    servicepoints:
      id=l3vpn-servicepoint daemonId=10 daemonName=ncs-dp-6-l3vpn:L3VPN
      id=nsr-servicepoint daemonId=11 daemonName=ncs-dp-7-nsd:NSRService
      id=vm-esc-servicepoint daemonId=12 daemonName=ncs-dp-8-vm-manager-esc:ServiceforVMstarting
      id=vnf-catalogue-esc daemonId=13 daemonName=ncs-dp-9-vnf-catalogue-esc:ESCVNFCatalogueService

    Cisco Smart Licensing

    Manage purchase and licensing of Cisco software.

    Cisco Smart Licensing is a cloud-based approach to licensing and it simplifies the purchase, deployment, and management of Cisco software assets. Entitlements are purchased through a Cisco account via Cisco Commerce Workspace (CCW) and are immediately deposited into a Smart Account for usage. This eliminates the need to install license files on every device. Products that are smart-enabled communicate directly to Cisco to report consumption.

    Cisco Smart Software Manager (CSSM) enables the management of software licenses and Smart Account from a single portal. The interface allows you to activate your product, manage entitlements, and renew and upgrade software.

    A functioning Smart Account is required to complete the registration process. For detailed information about CSSM, see Cisco Smart Software Manager.

    Smart Accounts and Virtual Accounts

    A Virtual Account exists as a sub-account within the Smart Account. Virtual Accounts are a customer-defined structure based on organizational layout, business function, geography, or any defined hierarchy. They are created and maintained by the Smart Account administrator(s).

    Visit Cisco Software Central to learn how to create and manage Smart Accounts.

    Request a Smart Account

    The creation of a new Smart Account is a one-time event, and subsequent management of users is a capability provided through the tool. To request a Smart Account, visit Cisco Software Central and take the following steps:

    1. After logging in, select Request a Smart Account in the Administration section.

    2. Select the type of Smart Account to create. There are two options: (a) an individual Smart Account, which requires you to agree to represent your company; by creating this Smart Account, you agree to the authorization to create and manage product and service entitlements, users, and roles on behalf of your organization. (b) Create the account on behalf of someone else.

    3. Provide the required domain identifier and the preferred account name.

    Adding Users to a Smart Account

    Smart Account user management is available in the Administration section of Cisco Software Central. Take the following steps to add a new user to a Smart Account:

    1. After logging in, select Manage Smart Account in the Administration section.

    2. Choose the Users tab.

    3. Select New User and follow the instructions in the wizard to add a new user.

    Create a License Registration Token

    1. To create a new token, log into CSSM and select the appropriate Virtual Account.

    2. Click on the Smart Licenses link to enter CSSM.

    3. In CSSM click on New Token.

    Notes on Configuring Smart Licensing

    • If ncs.conf contains configuration for any of java-executable, java-options, override-url/url, or proxy/url under the configure path /ncs-config/smart-license/smart-agent/, any corresponding configuration done via the CLI is ignored.

    • The smart licensing component of NSO runs its own Java virtual machine. Usually, the default Java options are sufficient:

      If you for some reason need to modify the Java options, remember to include the default values as found in the YANG model.
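      For example, a sketch of overriding the options in ncs.conf under /ncs-config/smart-license/smart-agent (the exact element nesting should be checked against ncs.conf(5); note how the default values from the YANG model are kept, with only the maximum heap size raised):

      ```xml
      <smart-license>
        <smart-agent>
          <java-options>-Xmx128M -Xms16M -Djava.security.egd=file:/dev/./urandom</java-options>
        </smart-agent>
      </smart-license>
      ```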

    Validation and Troubleshooting

    Available show and debug Commands

    • show license all: Displays all information.

    • show license status: Displays status information.

    • show license summary: Displays summary.

  • The account request will be pending approval of the Account Domain Identifier. A subsequent email will be sent to the requester to complete the setup process.

  • Follow the dialog to provide a description, expiration, and export compliance applicability before accepting the terms and responsibilities. Click on Create Token to continue.

  • Click on the new token.

  • Copy the token from the dialogue window into your clipboard.

  • Go to the NSO CLI and provide the token to the license smart register idtoken command:

  • show license tech: Displays license tech support information.
  • show license usage: Displays usage information.

  • debug smart_lic all: All available Smart Licensing debug flags.


    Package Management

    Perform package management tasks.

    All user code that needs to run in NSO must be part of a package. A package is basically a directory of files with a fixed file structure or a tar archive with the same directory layout. A package consists of code, YANG modules, etc., that are needed to add an application or function to NSO. Packages are a controlled way to manage loading and versions of custom applications.

    Network Element Drivers (NEDs) are also packages. Each NED allows NSO to manage a network device of a specific type. A NED typically contains a device YANG model and the code specifying how NSO should connect to the device; the exception is third-party YANG NED packages, which do not contain a YANG device model by default (the model must be downloaded and fixed before being added to the package). For NETCONF devices, NSO includes built-in tools to help you build a NED, as described in , that you can use if needed. Otherwise, a third-party YANG NED, if available, should be used instead. Vendors in some cases provide the required YANG device models but not the entire NED. In practice, all NSO instances use at least one NED. The set of NED packages used depends on the number of different device types NSO manages.

    When NSO starts, it searches for packages to load. The ncs.conf parameter /ncs-config/load-path defines a list of directories. At initial startup, NSO searches these directories for packages, copies the packages to a private directory tree in the directory defined by the /ncs-config/state-dir parameter in ncs.conf, and loads and starts all the packages found. On subsequent startups, NSO will by default only load and start the copied packages. The purpose of this procedure is to make it possible to reliably load new or updated packages while NSO is running, with a fallback to the previously existing version of the packages if the reload should fail.

    admin@ncs# license smart register idtoken YzY2YjFlOTYtOWYzZi00MDg1...
    Registration process in progress.
    Use the 'show license status' command to check the progress and result.

    leaf java-options {
      tailf:info "Smart licensing Java VM start options";
      type string;
      default "-Xmx64M -Xms16M -Djava.security.egd=file:/dev/./urandom";
      description
        "Options which NCS will use when starting the Java VM.";
    }

    In a System Install of NSO, packages are always installed (normally through symbolic links) in the packages subdirectory of the run directory, i.e. by default /var/opt/ncs/packages, and the private directory tree is created in the state subdirectory, i.e. by default /var/opt/ncs/state.

    Loading Packages

    Loading of new or updated packages (as well as removal of packages that should no longer be used) can be requested via the reload action - from the NSO CLI:

    This request makes NSO copy all packages found in the load path to a temporary version of its private directory, and load the packages from this directory. If the loading is successful, this temporary directory will be made permanent, otherwise, the temporary directory is removed and NSO continues to use the previous version of the packages. Thus when updating packages, always update the version in the load path, and request that NSO does the reload via this action.

    If the package changes include modified, added, or deleted .fxs files or .ccl files, NSO needs to run a data model upgrade procedure, also called a CDB upgrade. NSO provides a dry-run option to the packages reload action to test the upgrade without committing the changes. Using a reload dry-run, you can tell whether a CDB upgrade is needed or not.
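    For example, from the NSO CLI:

    ```
    admin@ncs# packages reload dry-run
    ```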

    The report all-schema-changes option of the reload action instructs NSO to produce a report of how the current data model schema is being changed. Combined with a dry run, the report allows you to verify the modifications introduced with the new versions of the packages before actually performing the upgrade.

    For a data model upgrade, including a dry run, all transactions must be closed. In particular, users having CLI sessions in configure mode must exit to operational mode. If there are ongoing commit queue items and the wait-commit-queue-empty parameter is supplied, the action will wait for the items to finish before proceeding with the reload. During this time, it will not allow the creation of any new transactions. Hence, if one of the queue items fails with the rollback-on-error option set, the commit queue's rollback will also fail, and the queue item will be locked. In this case, the reload will be canceled, and a manual investigation of the failure is needed in order to proceed with the reload.

    While the data model upgrade is in progress, all transactions are closed and new transactions are not allowed. This means that starting a new management session, such as a CLI or SSH connection to the NSO, will also fail, producing an error that the node is in upgrade mode.

    By default, the reload action will (when needed) wait up to 10 seconds for open transactions to finish (and, if the wait-commit-queue-empty parameter is entered, for the commit queue to empty) before the reload starts.

    If there are still open transactions at the end of this period, the upgrade will be canceled and the reload operation will fail. The max-wait-time and timeout-action parameters to the action can modify this behavior. For example, to wait for up to 30 seconds, and forcibly terminate any transactions that still remain open after this period, we can invoke the action as:
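    Concretely, from the NSO CLI:

    ```
    admin@ncs# packages reload max-wait-time 30 timeout-action kill
    ```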

    Thus the default values for these parameters are 10 and fail, respectively. In case there are no changes to .fxs or .ccl files, the reload can be carried out without the data model upgrade procedure, and these parameters are ignored since there is no need to close open transactions.

    When reloading packages, NSO will give a warning when the upgrade looks suspicious, i.e., may break some functionality. Note that this is not a strict upgrade validation, but only intended as a hint to the NSO administrator early in the upgrade process that something might be wrong. Currently, the following scenarios will trigger the warnings:

    • One or more namespaces are removed by the upgrade. The consequence is that all data belonging to these namespaces is permanently deleted from CDB upon upgrade. This may be intended in some scenarios, in which case it is advised to proceed while overriding the warnings as described below.

    • There are source .java files found in the package, but no matching .class files in the jars loaded by NSO. This likely means that the package has not been compiled.

    • There are matching .class files with modification time older than the source files, which hints that the source has been modified since the last time the package was compiled. This likely means that the package was not re-compiled the last time the source code was changed.

    If a warning has been triggered, it is strongly recommended to fix the root cause. If all of the warnings are intended, it is possible to proceed with the packages reload force command.

    In some specific situations, upgrading a package with newly added custom validation points in the data model may produce an error similar to no registration found for callpoint NEW-VALIDATION/validate or simply application communication failure, resulting in an aborted upgrade. See New Validation Points on how to proceed.

    In some cases, we may want NSO to do the same operation as the reload action at NSO startup, i.e. copy all packages from the load path before loading, even though the private directory copy already exists. This can be achieved in the following ways:

    • Setting the shell environment variable $NCS_RELOAD_PACKAGES to true. This will make NSO do the copy from the load path on every startup, as long as the environment variable is set. In a System Install, NSO is typically started as a systemd system service, and NCS_RELOAD_PACKAGES=true can be set in /etc/ncs/ncs.systemd.conf temporarily to reload the packages.

    • Giving the option --with-package-reload to the ncs command when starting NSO. This will make NSO do the copy from the load path on this particular startup, without affecting the behavior on subsequent startups.

    • If warnings are encountered when reloading packages at startup using one of the options above, the recommended way forward is to fix the root cause as indicated by the warnings as mentioned before. If the intention is to proceed with the upgrade without fixing the underlying cause for the warnings, it is possible to force the upgrade using NCS_RELOAD_PACKAGES=force environment variable or --with-package-reload-force option.

    Always use one of these methods when upgrading to a new version of NSO in an existing directory structure, to make sure that new packages are loaded together with the other parts of the new system.
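    As a sketch (assuming the System Install configuration path used in earlier examples), either startup variant below forces the copy from the load path; the first uses the environment variable for this invocation, the second the command-line option:

    ```
    # NCS_RELOAD_PACKAGES=true ncs -c /etc/ncs/ncs.conf
    # ncs -c /etc/ncs/ncs.conf --with-package-reload
    ```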

    Redeploying Packages

    If it is known in advance that there were no data model changes, i.e. none of the .fxs or .ccl files changed, and none of the shared JARs changed in a Java package, and the declaration of the components in the package-meta-data.xml is unchanged, then it is possible to do a lightweight package upgrade, called package redeploy. Package redeploy only loads the specified package, unlike packages reload which loads all of the packages found in the load-path.

    Redeploying a package allows you to reload updated templates or load new ones, reload private JARs for a Java package, or reload the Python code that is part of the package. Only the changed parts of the package will be reloaded; for example, if only templates changed and not Python code, the Python VM will not be restarted and only the templates are reloaded. The upgrade is not seamless, however: the old templates are unloaded for a short while before the new ones are loaded, so any use of the template during this period will fail, and the same applies to changed Java or Python code. It is hence the responsibility of the user to make sure that the services or other code provided by the package are unused while it is being redeployed.

    The package redeploy will return true if the package's resulting status after the redeploy is up. Consequently, if the result of the action is false, then it is advised to check the operational status of the package in the package list.

    Adding NED Packages

    Unlike a full packages reload operation, new NED packages can be loaded into the system without disrupting existing transactions. This is only possible for new packages, since these packages don't yet have any instance data.

    The operation is performed through the /packages/add action. No additional input is necessary. The operation scans all the load-paths for any new NED packages and also verifies the existing packages are still present. If packages are modified or deleted, the operation will fail.

    Each NED package defines a ned-id, an identifier that is used in selecting the NED for each managed device. A new NED package is therefore a package with a ned-id value that is not already in use.

    In addition, the system imposes some additional constraints, so it is not always possible to add just any arbitrary NED. In particular, NED packages can also contain one or more shared data models, such as NED settings or operational data for private use by the NED, that are not specific to each version of NED package but rather shared between all versions. These are typically placed outside any mount point (device-specific data model), extending the NSO schema directly. So, if a NED defines schema nodes outside any mount point, there must be no changes to these nodes if they already exist.

    Adding a NED package with a modified shared data model is therefore not allowed and all shared data models are verified to be identical before a NED package can be added. If they are not, the /packages/add action will fail and you will have to use the /packages/reload command.

    The command returns true if the package's resulting status after deployment is up. Conversely, if the result for a package is false, the package was added but its code has not started successfully; check the operational status of the package with the show packages package <PKG> oper-status command for additional information. You may then use the /packages/package/redeploy action to retry deploying the package's code once you have corrected the error.

    In a high-availability setup, you can perform this same operation on all the nodes in the cluster with a single packages ha sync and-add command.

    Managing Packages

    In a System Install of NSO, management of pre-built packages is supported through a number of actions. This support is not available in a Local Install, since it is dependent on the directory structure created by the System Install. Please refer to the YANG submodule $NCS_DIR/src/ncs/yang/tailf-ncs-software.yang for the full details of the functionality described in this section.

    Actions

    Actions are provided to list local packages, to fetch packages from the file system, and to install or deinstall packages:

    • software packages list [...]: List local packages, categorized into loaded, installed, and installable. The listing can be restricted to only one of the categories - otherwise, each package listed will include the category for the package.

    • software packages fetch package-from-file <file>: Fetch a package by copying it from the file system, making it installable.

    • software packages install package <package-name> [...]: Install a package, making it available for loading via the packages reload action, or via a system restart with package reload. The action ensures that only one version of the package is installed - if any version of the package is installed already, the replace-existing option can be used to deinstall it before proceeding with the installation.

    • software packages deinstall package <package-name>: Deinstall a package, i.e. remove it from the set of packages available for loading.
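    A typical sequence, sketched here with a hypothetical package name and file path, is to fetch the package file, install it, and then load it with a reload:

    ```
    admin@ncs# software packages fetch package-from-file /tmp/mypkg-1.0.tar.gz
    admin@ncs# software packages install package mypkg-1.0
    admin@ncs# packages reload
    ```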

    There is also an upload action that can be used via NETCONF or REST to upload a package from the local host to the NSO host, making it installable there. It is not feasible to use in the CLI or Web UI, since the actual package file contents are a parameter for the action. It is also not suitable for very large (more than a few megabytes) packages, since the processing of action parameters is not designed to deal with very large values, and there is a significant memory overhead in processing such values.

    More on Package Management

    NSO Packages contain data models and code for a specific function. It might be NED for a specific device, a service application like MPLS VPN, a WebUI customization package, etc. Packages can be added, removed, and upgraded in run-time. A common task is to add a package to NSO to support a new device type or upgrade an existing package when the device is upgraded.

    (We assume you have the example up and running from the previous section). Currently installed packages can be viewed with the following command:

    So the above command shows that NSO currently has one package, the NED for Cisco IOS.

    NSO reads global configuration parameters from ncs.conf (more on NSO configuration later in this guide). By default, it tells NSO to look for packages in a packages directory in the directory where NSO was started. So, in this specific example:

    As seen above, a package is a defined file structure with data models, code, and documentation. NSO comes with a couple of ready-made packages in $NCS_DIR/packages/. There is also a library of packages available from Tail-f, especially for supporting specific devices.

    Adding and Upgrading a Package

    Assume you would like to add support for Nexus devices to the example. Nexus devices have different data models and another CLI flavor. There is a package for that in $NCS_DIR/packages/neds/nexus.

    We can keep NSO running all the time, but we will stop the network simulator to add the Nexus devices to the simulator.

    Add the nexus package to the NSO runtime directory by creating a symbolic link:

    The package is now in place, but until we tell NSO to look for package changes nothing happens:

    So after the packages reload operation NSO also knows about Nexus devices. The reload operation also takes any changes to existing packages into account. The data store is automatically upgraded to cater to any changes like added attributes to existing configuration data.

    Simulating the New Device

    Adding the New Devices to NSO

    We can now add these Nexus devices to NSO according to the below sequence:

    NED Administration
    admin@ncs# packages reload
    reload-result {
        package cisco-ios
        result true
    }
    admin@ncs# packages reload max-wait-time 30 timeout-action kill
    admin@ncs# packages package mserv redeploy
    result true
    admin@ncs# show packages package mserv oper-status
    oper-status file-load-error
    oper-status error-info "template3.xml:2 Unknown servicepoint: templ42-servicepoint"
    admin@ncs# packages add
    add-result {
        package router-nc-1.1
        result true
    }
    admin@ncs# show packages
    packages package cisco-ios
     package-version 3.0
     description     "NED package for Cisco IOS"
     ncs-min-version [ 3.0.2 ]
     directory       ./state/packages-in-use/1/cisco-ios
     component upgrade-ned-id
      upgrade java-class-name com.tailf.packages.ned.ios.UpgradeNedId
     component cisco-ios
      ned cli ned-id  cisco-ios
      ned cli java-class-name com.tailf.packages.ned.ios.IOSNedCli
      ned device vendor Cisco
    NAME      VALUE
    ---------------------
    show-tag  interface
    
     oper-status up
    $ pwd
    .../examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios
    $ ls packages/
    cisco-ios
    $ ls packages/cisco-ios
    doc
    load-dir
    netsim
    package-meta-data.xml
    private-jar
    shared-jar
    src
    $ ncs-netsim stop
    $ cd $NCS_DIR/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/packages
    $ ln -s $NCS_DIR/packages/neds/cisco-nx
    $ ls -l
    ... cisco-nx -> .../packages/neds/cisco-nx
    admin@ncs# show packages
    packages package cisco-ios
    ...
    admin@ncs# packages reload
    
    >>> System upgrade is starting.
    >>> Sessions in configure mode must exit to operational mode.
    >>> No configuration changes can be performed until upgrade has
    completed.
    >>> System upgrade has completed successfully.
    reload-result {
        package cisco-ios
        result true
    }
    reload-result {
        package cisco-nx
        result true
    }
    $ ncs-netsim add-to-network cisco-nx 2 n
    $ ncs-netsim list
    ncs-netsim list for  /Users/stefan/work/ncs-3.2.1/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/netsim
    
    name=c0 ...
    name=c1 ...
    name=c2 ...
    name=n0 ...
    name=n1 ...
    
    
    $ ncs-netsim start
    DEVICE c0 OK STARTED
    DEVICE c1 OK STARTED
    DEVICE c2 OK STARTED
    DEVICE n0 OK STARTED
    DEVICE n1 OK STARTED
    $ ncs-netsim cli-c n0
    n0#show running-config
    no feature ssh
    no feature telnet
    fex 101
     pinning max-links 1
    !
    fex 102
     pinning max-links 1
    !
    nexus:vlan 1
    !
    ...
    admin@ncs(config)# devices device n0 device-type cli ned-id cisco-nx
    admin@ncs(config-device-n0)# port 10025
    admin@ncs(config-device-n0)# address 127.0.0.1
    admin@ncs(config-device-n0)# authgroup default
    admin@ncs(config-device-n0)# state admin-state unlocked
    admin@ncs(config-device-n0)# commit
    admin@ncs(config-device-n0)# top
    admin@ncs(config)# devices device n0 sync-from
    result true

    Alarm Types

    Alarm Type Descriptions

    abort-error


    • Initial Perceived Severity major

    • Description An error happened while aborting or reverting a transaction. The device's configuration is likely to be inconsistent with the NCS CDB.

    • Recommended Action Inspect the configuration difference with compare-config, resolve conflicts with sync-from or sync-to if any.

    • Clear Condition(s) If NCS achieves sync with the device, or receives a transaction id for a netconf session towards the device, the alarm is cleared.

    • Alarm Message(s)

      • Device {dev} is locked

      • Device {dev} is southbound locked

      • abort error

    alarm-type

    alarm-type

    • Description Base identity for alarm types. A unique identification of the fault, not including the managed object. Alarm types are used to identify if alarms indicate the same problem or not, for lookup into external alarm documentation, etc. Different managed object types and instances can share alarm types. If the same managed object reports the same alarm type, it is to be considered to be the same alarm. The alarm type is a simplification of the different X.733 and 3GPP alarm IRP alarm correlation mechanisms and it allows for hierarchical extensions. A 'specific-problem' can be used in addition to the alarm type in order to have different alarm types based on information not known at design-time, such as values in textual SNMP Notification varbinds.

    bad-user-input

    bad-user-input

    • Initial Perceived Severity critical

    • Description Invalid input from user. NCS cannot recognize parameters needed to connect to device.

    certificate-expiration

    certificate-expiration

    • Description The certificate is nearing its expiry or has already expired. The severity depends on the time left to expiry, it ranges from warning to critical.

    • Recommended Action Replace certificate.

    cluster-subscriber-failure

    cluster-subscriber-failure

    • Initial Perceived Severity critical

    • Description Failure to establish a notification subscription towards a remote node.

    commit-through-queue-blocked

    commit-through-queue-blocked

    • Initial Perceived Severity warning

    • Description A commit was queued behind a queue item waiting to be able to connect to one of its devices. This is potentially dangerous since one unreachable device can potentially fill up the commit queue indefinitely.

    commit-through-queue-failed

    commit-through-queue-failed

    • Initial Perceived Severity critical

    • Description A queued commit failed.

    commit-through-queue-failed-transiently

    commit-through-queue-failed-transiently

    • Initial Perceived Severity critical

    • Description A queued commit failed as it exhausted its retry attempts on transient errors.

    commit-through-queue-rollback-failed

    commit-through-queue-rollback-failed

    • Initial Perceived Severity critical

    • Description Rollback of a commit-queue item failed.

    configuration-error

    configuration-error

    • Initial Perceived Severity critical

    • Description Invalid configuration of NCS managed device, NCS cannot recognize parameters needed to connect to device.

    connection-failure

    connection-failure

    • Initial Perceived Severity major

    • Description NCS failed to connect to a managed device before the timeout expired.

    final-commit-error

    final-commit-error

    • Initial Perceived Severity critical

    • Description A managed device validated a configuration change, but failed to commit. When this happens, NCS and the device are out of sync.

    ha-alarm

    ha-alarm

    • Description Base type for all alarms related to high availablity. This is never reported, sub-identities for the specific high availability alarms are used in the alarms.

    ha-node-down-alarm

    ha-node-down-alarm

    • Description Base type for all alarms related to nodes going down in high availablity. This is never reported, sub-identities for the specific node down alarms are used in the alarms.

    ha-primary-down

    ha-primary-down

    • Initial Perceived Severity critical

    • Description The node lost the connection to the primary node.

    ha-secondary-down

    ha-secondary-down

    • Initial Perceived Severity critical

    • Description The node lost the connection to a secondary node.

    missing-transaction-id

    missing-transaction-id

    • Initial Perceived Severity warning

    • Description A device announced in its NETCONF hello message that it supports the transaction-id as defined in http://tail-f.com/yang/netconf-monitoring. However when NCS tries to read the transaction-id no data is returned. The NCS check-sync feature will not work. This is usually a case of misconfigured NACM rules on the managed device.

    ncs-cluster-alarm

    ncs-cluster-alarm

    • Description Base type for all alarms related to cluster. This is never reported, sub-identities for the specific cluster alarms are used in the alarms.

    ncs-dev-manager-alarm

    ncs-dev-manager-alarm

    • Description Base type for all alarms related to the device manager This is never reported, sub-identities for the specific device alarms are used in the alarms.

    ncs-package-alarm

    ncs-package-alarm

    • Description Base type for all alarms related to packages. This is never reported, sub-identities for the specific package alarms are used in the alarms.

    ncs-service-manager-alarm

    ncs-service-manager-alarm

    • Description Base type for all alarms related to the service manager This is never reported, sub-identities for the specific service alarms are used in the alarms.

    ncs-snmp-notification-receiver-alarm

    ncs-snmp-notification-receiver-alarm

    • Description Base type for SNMP notification receiver Alarms. This is never reported, sub-identities for specific SNMP notification receiver alarms are used in the alarms.

    ned-live-tree-connection-failure

    ned-live-tree-connection-failure

    • Initial Perceived Severity major

    • Description NCS failed to connect to a managed device using one of the optional live-status-protocol NEDs.

    out-of-sync

    out-of-sync

    • Initial Perceived Severity major

    • Description A managed device is out of sync with NCS. Usually it means that the device has been configured out of band from NCS point of view.

    package-load-failure

    package-load-failure

    • Initial Perceived Severity critical

    • Description NCS failed to load a package.

    package-operation-failure

    package-operation-failure

    • Initial Perceived Severity critical

    • Description A package has some problem with its operation.

    receiver-configuration-error

    receiver-configuration-error

    • Initial Perceived Severity major

    • Description The snmp-notification-receiver could not setup its configuration, either at startup or when reconfigured. SNMP notifications will now be missed.

    revision-error

    revision-error

    • Initial Perceived Severity major

    • Description A managed device arrived with a known module, but too new revision.

    service-activation-failure

    service-activation-failure

    • Initial Perceived Severity critical

    • Description A service failed during re-deploy.

    time-violation-alarm

    time-violation-alarm

    • Description Base type for all alarms related to time violations. This is never reported, sub-identities for the specific time violation alarms are used in the alarms.

    transaction-lock-time-violation

    transaction-lock-time-violation

    • Initial Perceived Severity warning

    • Description The transaction lock time exceeded its threshold and might be stuck in the critical section. This threshold is configured in /ncs-config/transaction-lock-time-violation-alarm/timeout.

    alarm-type
        certificate-expiration
        ha-alarm
            ha-node-down-alarm
                ha-primary-down
                ha-secondary-down
        ncs-cluster-alarm
            cluster-subscriber-failure
        ncs-dev-manager-alarm
            abort-error
            bad-user-input
            commit-through-queue-blocked
            commit-through-queue-failed
            commit-through-queue-failed-transiently
            commit-through-queue-rollback-failed
            configuration-error
            connection-failure
            final-commit-error
            missing-transaction-id
            ned-live-tree-connection-failure
            out-of-sync
            revision-error
        ncs-package-alarm
            package-load-failure
            package-operation-failure
        ncs-service-manager-alarm
            service-activation-failure
        ncs-snmp-notification-receiver-alarm
            receiver-configuration-error
        time-violation-alarm
            transaction-lock-time-violation

    Recommended Action Verify that the user supplied input are correct.

  • Clear Condition(s) This alarm is not cleared.

  • Alarm Message(s)

    • Resource {resource} doesn't exist

  • Clear Condition(s) This alarm is cleared when the certificate is no longer loaded.

  • Alarm Message(s)

    • Certificate expires in less than {days} day(s)/Certificate has expired.

  • Recommended Action Verify IP connectivity between cluster nodes.

  • Clear Condition(s) This alarm is cleared if NCS succeeds to establish a subscription towards the remote node, or when the subscription is explicitly stopped.

  • Alarm Message(s)

    • Failed to establish netconf notification subscription to node ~s, stream ~s

    • Commit queue items with remote nodes will not receive required event notifications.

  • Clear Condition(s) An alarm raised due to a transient error will be cleared when NCS is able to reconnect to the device.

  • Alarm Message(s)

    • Commit queue item ~p is blocked because item ~p cannot connect to ~s

  • Recommended Action Resolve with rollback if possible.

  • Clear Condition(s) This alarm is not cleared.

  • Alarm Message(s)

    • Failed to authenticate towards device {device}: {reason}

    • Device {dev} is locked

    • {Reason}

    • Device {dev} is southbound locked

    • Commit queue item {CqId} rollback invoked

    • Commit queue item {CqId} has failed: Operation failed because: inconsistent database

    • Remote commit queue item ~p cannot be unlocked: cluster node not configured correctly

  • Recommended Action Resolve with rollback if possible.

  • Clear Condition(s) This alarm is not cleared.

  • Alarm Message(s)

    • Failed to connect to device {dev}: {reason}

    • Connection to {dev} timed out

    • Failed to authenticate towards device {device}: {reason}

    • The configuration database is locked for device {dev}: {reason}

    • the configuration database is locked by session {id} {identification}

    • the configuration database is locked by session {id} {identification}

    • {Dev}: Device is locked in a {Op} operation by session {session-id}

    • resource denied

    • Commit queue item {CqId} rollback invoked

    • Commit queue item {CqId} has failed: Operation failed because: inconsistent database

    • Remote commit queue item ~p cannot be unlocked: cluster node not configured correctly

  • Recommended Action Investigate the status of the device and resolve the situation by issuing the appropriate action, i.e., service redeploy or a sync operation.

  • Clear Condition(s) This alarm is not cleared.

  • Alarm Message(s)

    • {Reason}

  • Recommended Action Verify that the configuration parameters defined in tailf-ncs-devices.yang submodule are consistent for this device.

  • Clear Condition(s) The alarm is cleared when NCS reads the configuration parameters for the device, and is raised again if the parameters are invalid.

  • Alarm Message(s)

    • Failed to resolve IP address for {dev}

    • the configuration database is locked by session {id} {identification}

    • {Reason}

    • Resource {resource} doesn't exist

  • Recommended Action Verify address, port, authentication, check that the device is up and running. If the error occurs intermittently, increase connect-timeout.

  • Clear Condition(s) If NCS successfully reconnects to the device, the alarm is cleared.

  • Alarm Message(s)

    • The connection to {dev} was closed

    • Failed to connect to device {dev}: {reason}

  • Recommended Action Reconcile by comparing and sync-from or sync-to.

  • Clear Condition(s) If NCS achieves sync with a device, the alarm is cleared.

  • Alarm Message(s)

    • The connection to {dev} was closed

    • External error in the NED implementation for device {dev}: {reason}

    • Internal error in the NED NCS framework affecting device {dev}: {reason}

  • Recommended Action Make sure the HA cluster is operational, investigate why the primary went down and bring it up again.

  • Clear Condition(s) This alarm is never automatically cleared and has to be cleared manually when the HA cluster has been restored.

  • Alarm Message(s)

    • Lost connection to primary due to: Primary closed connection

    • Lost connection to primary due to: Tick timeout

    • Lost connection to primary due to: code {Code}

  • Recommended Action Investigate why the secondary node went down, fix the connectivity issue and reconnect the secondary to the HA cluster.

  • Clear Condition(s) This alarm is cleared when the secondary node is reconnected to the HA cluster.

  • Alarm Message(s)

    • Lost connection to secondary

  • Recommended Action Verify NACM rules on the concerned device.

  • Clear Condition(s) If NCS successfully reads a transaction id for which it had previously failed to do so, the alarm is cleared.

  • Alarm Message(s)

    • {Reason}

  • Recommended Action Verify the configuration of the optional NEDs. If the error occurs intermittently, increase connect-timeout.

  • Clear Condition(s) If NCS successfully reconnects to the managed device, the alarm is cleared.

  • Alarm Message(s)

    • The connection to {dev} was closed

    • Failed to connect to device {dev}: {reason}

  • Recommended Action Inspect the difference with compare-config, reconcile by invoking sync-from or sync-to.

  • Clear Condition(s) If NCS achieves sync with a device, the alarm is cleared.

  • Alarm Message(s)

    • Device {dev} is out of sync

    • Out of sync due to no-networking or failed commit-queue commits.

    • got: ~s expected: ~s.

  • Recommended Action Check the package for the reason.

  • Clear Condition(s) If NCS successfully loads a package for which an alarm was previously raised, it will be cleared.

  • Alarm Message(s)

    • failed to open file {file}: {str}

    • Specific to the concerned package.

  • Recommended Action Check the package for the reason.

  • Clear Condition(s) This alarm is not cleared.

  • Recommended Action Check the error-message and change the configuration.

  • Clear Condition(s) This alarm will be cleared when the NCS is configured to successfully receive SNMP notifications

  • Alarm Message(s)

    • Configuration has errors.

  • Recommended Action Upgrade the Device NED using the new YANG revision in order to use the new features in the device.

  • Clear Condition(s) If all device yang modules are supported by NCS, the alarm is cleared.

  • Alarm Message(s)

    • The device has YANG module revisions not supported by NCS. Use the /devices/device/check-yang-modules action to check which modules that are not compatible.

  • Recommended Action Corrective action and another re-deploy is needed.

  • Clear Condition(s) If the service is successfully redeployed, the alarm is cleared.

  • Alarm Message(s)

    • Multiple device errors: {str}

  • Recommended Action Investigate if the transaction is stuck and possibly interrupt it by closing the user session which it is attached to.

  • Clear Condition(s) This alarm is cleared when the transaction has finished.

  • Alarm Message(s)

    • Transaction lock time exceeded threshold.
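    Raised alarms can be inspected from the NSO CLI. A minimal sketch, assuming the standard alarm list in the tailf-ncs-alarms model (the device name c0 and the shown alarm are illustrative):

    ```
    admin@ncs# show alarms alarm-list
    alarms alarm-list number-of-alarms 1
    alarms alarm-list alarm c0 connection-failure ...
    ```

    Each entry carries the alarm type described above, the managed object, the perceived severity, and whether the alarm has been cleared.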

  • NED Administration

    Learn about Cisco-provided NEDs and how to manage them.

    This section provides the necessary information on NED (Network Element Driver) administration, with a focus on Cisco-provided NEDs. If you plan to use NEDs not provided by Cisco, refer to NED Development to build your own NED packages.

    The NED is a key NSO component that makes it possible for the NSO core system to communicate southbound with network devices in most deployments. NSO has a built-in client that can be used to communicate southbound with NETCONF-enabled devices. Many network devices are, however, not NETCONF-enabled, and there exists a wide variety of methods and protocols for configuring network devices, ranging from simple CLI to HTTP/REST-enabled devices. For such cases, it is necessary to use a NED to allow NSO to communicate southbound with the network device.

    Even for NETCONF-enabled devices, it is possible that NSO's built-in NETCONF client cannot be used, for instance, if the devices do not strictly follow the specification for the NETCONF protocol. In such cases, one must also use a NED to seamlessly communicate with the device. See Managing Cisco-provided Third-Party YANG NEDs for more information on third-party YANG NEDs.

    Types of NED Packages

    A NED is a piece of code that enables communication with a particular type of managed device. NEDs are added to NSO as a special kind of package, called NED packages, which NSO uses to manage that type of device.

    A NED package must provide a device YANG model as well as define means (protocol) to communicate with the device. The latter can either leverage the NSO built-in NETCONF and SNMP support or use a custom implementation. When a package provides custom protocol implementation, typically written in Java, it is called a CLI NED or a Generic NED.

    Cisco provides and supports a number of such NEDs. With these Cisco-provided NEDs, a major category are CLI NEDs which communicate with a device through its CLI instead of a dedicated API.

    CLI NED

    This NED category is targeted at devices that use CLI as a configuration interface. Cisco-provided CLI NEDs are available for various network devices from different vendors. Many different CLI syntaxes are supported.

    The driver element in a CLI NED implemented by the Cisco NSO NED team typically consists of the following three parts:

    • The protocol client, responsible for connecting to and interacting with the device. The protocols supported are SSH and Telnet.

    • A fast and versatile CLI parser (+ emitter), usually referred to as the turbo parser.

    • Various transform engines capable of converting data between NSO and device formats.

    The YANG models in a CLI NED are developed and maintained by the Cisco NSO NED team. Usually, the models for a CLI NED are structured to mimic the CLI command hierarchy on the device.

    Generic NED

    A generic NED is typically used to communicate with non-CLI devices, such as devices using protocols like REST, TL1, Corba, SOAP, RESTCONF, or gNMI as a configuration interface. Even NETCONF-enabled devices in many cases require a generic NED to function properly with NSO.

    The driver element in a Generic NED implemented by the Cisco NED team typically consists of the following parts:

    • The protocol client, responsible for interacting with the device.

    • Various transform engines capable of converting data between NSO and the device formats, usually JSON and/or XML transformers.

    There are two types of Generic NEDs maintained by the Cisco NSO NED team:

    • NEDs with Cisco-owned YANG models. These NEDs have models developed and maintained by the Cisco NSO NED team.

    • NEDs targeted at YANG models from third-party vendors, also known as third-party YANG NEDs.

    Generic Cisco-provided NEDs with Cisco-owned YANG Models

    Generic NEDs belonging to the first category typically handle devices that are model-driven. For instance, devices using proprietary protocols based on REST, SOAP, Corba, etc. The YANG models for such NEDs are usually structured to mimic the messages used by the proprietary protocol of the device.

    Third-party YANG NEDs

    As the name implies, this NED category is used for cases where the device YANG models are not implemented, maintained, or owned by the Cisco NSO NED team. Instead, the YANG models are typically provided by the device vendor itself, or by organizations like IETF, IEEE, ONF, or OpenConfig.

    This category of NEDs has some special characteristics that set them apart from all other NEDs developed by the Cisco NSO NED team:

    • Targeted for devices supporting model-driven protocols like NETCONF, RESTCONF, and gNMI.

    • Delivered from the software.cisco.com portal without any device YANG models included. There are several reasons for this, such as legal restrictions that prevent Cisco from re-distributing YANG models from other vendors, or the availability of several different version bundles for open-source YANG, like OpenConfig. The version used by the NED must match the version used by the targeted device.

    • The NEDs can be bundled with various fixes to solve shortcomings in the YANG models, the download sources, and/or in the device. These fixes are referred to as recipes.

    Since the third-party NEDs are delivered without any device YANG models, there are additional steps required to make this category of NEDs operational:

    1. The device models need to be downloaded and copied into the NED package source tree. This can be done by using a special (optional) downloader tool bundled with each third-party YANG NED, or in any custom way.

    2. The NED must be rebuilt with the downloaded YANG models.

    This procedure is thoroughly described in .

    Recipes

    A third-party YANG NED can be bundled with up to three types of recipe modules. These recipes are used by the NED to solve various types of issues related to:

    • The source of the YANG files.

    • The YANG files.

    • The device itself.

    The recipes represent the characteristics and the real value of a third-party YANG NED. Recipes are typically adapted for a certain bundle of YANG models and/or certain device types. This is why there exist many different third-party YANG NEDs, each one adapted for a specific protocol, a specific model package, and/or a specific device.

    The NSO NED team does not provide any super third-party YANG NEDs, for instance, a super RESTCONF NED that can be used with any models and any device.

    Download Recipes

    When downloading the YANG files, it is first of all important to know which source to use. In some cases, the source is the device itself, for instance, when the device is NETCONF-enabled (or, in rare cases, RESTCONF-enabled).

    In other cases, the device does not support model download. This applies to all gNMI-enabled devices and most RESTCONF devices too. In this case, the source can be a public Git repository or an archive file provided by the device vendor.

    Another important question is what YANG models and what versions to download. To make this task easier, third-party NEDs can be bundled with the download recipes. These are presets to be used with the downloader tool bundled with the NED. There can be several profiles, each representing a preset that has been verified to work by the Cisco NSO NED team. A profile can point out a certain source to download from. It can also limit the scope of the download so that only certain YANG files are selected.

    YANG Recipes (YR)

    Third-party YANG files can often contain various types of errors, ranging from real bugs that cause compilation errors to certain YANG constructs that are known to cause runtime issues in NSO. To ensure that the files can be built correctly, the third-party NEDs can be bundled with YANG recipes. These recipes patch the downloaded YANG files before they are built by the NSO compiler. This procedure is performed automatically by the make system when the NED is rebuilt after downloading the device YANG files. For more information, refer to .

    Runtime Recipes (RR)

    Devices enabled for NETCONF, RESTCONF, or gNMI sometimes deviate in their runtime behavior, which can make it impossible for NSO to interact with them properly. These deviations can be on any level of the runtime behavior, such as:

    • The configuration protocol is not properly implemented, i.e., the device lacks support for mandatory parts of, for instance, the RESTCONF RFC.

    • The device returns "dirty" configuration dumps, for instance, JSON or XML containing invalid elements.

    • Special quirks are required when applying new configuration on a device; additional transforms of the payload may also be needed before it is relayed by the NED.

    • The device has aliasing issues, possibly caused by overlapping YANG models. If leaf X in model A is modified, the device will automatically modify leaf Y in model B as well.

    A third-party YANG NED can be bundled with runtime recipes to solve these kinds of issues, if necessary. How this is implemented varies from NED to NED. In some cases, a NED has a fixed set of recipes that are always used. Alternatively, a NED can support several different recipes, which can be configured through a NED setting referred to as a runtime profile. For example, a multi-vendor third-party YANG NED might have one runtime profile for each supported device type.

    NED Settings

    NED settings are configuration options that control the behavior of the NED. They are modeled in YANG and augmented into the NSO configuration under:

    • /devices/global-settings/ned-settings

    • /devices/profiles/ned-settings

    • /devices/device/ned-settings

    Most NEDs are instrumented with a large number of NED settings that can be used to customize the device instance configured in NSO. The README file in the respective NED contains more information on these.
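    As a sketch, a NED setting can be configured per device from the NSO CLI. The placeholders below (device name dev0, and the setting path) are hypothetical; the actual setting names are NED-specific and documented in the NED's README. Many NEDs also require a reconnect before a changed setting takes effect:

    ```
    admin@ncs(config)# devices device dev0 ned-settings <ned-name> <setting> <value>
    admin@ncs(config)# commit
    admin@ncs(config)# devices device dev0 disconnect
    ```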

    Purpose of NED ID

    Each managed device in NSO has a device type that informs NSO how to communicate with the device. When managing NEDs, the device type is either cli or generic. The other two device types, netconf and snmp, are used in NETCONF and SNMP packages and are further described in this guide.

    In addition, a special NED ID identifier is needed. Simply put, this identifier is a handle in NSO pointing to the NED package. NSO uses the identifier when it is about to invoke the driver in a NED package. The identifier ensures that the driver of the correct NED package is called for a given device instance. For more information on how to set up a new device instance, see .

    Each NED package has a NED ID, which is mandatory. The NED ID is a simple string that can have any format. For NEDs developed by the Cisco NSO NED team, the NED ID is formatted as <NED NAME>-<gen | cli>-<NED VERSION MAJOR>.<NED VERSION MINOR>.

    Examples

    • onf-tapi_rc-gen-2.0

    • cisco-iosxr-cli-7.43
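    The naming rule above can be illustrated with a small shell snippet that splits a NED ID into its parts. This is purely illustrative; NSO itself treats the NED ID as an opaque identifier:

    ```shell
    #!/bin/sh
    # Split a NED ID of the form <NED NAME>-<gen|cli>-<MAJOR>.<MINOR>.
    ned_id="cisco-iosxr-cli-7.43"

    version="${ned_id##*-}"   # trailing component: 7.43
    rest="${ned_id%-*}"       # cisco-iosxr-cli
    type="${rest##*-}"        # cli or gen
    name="${rest%-*}"         # cisco-iosxr

    echo "name=$name type=$type version=$version"
    ```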

    The NED ID for a certain NED package stays the same from one version to another, as long as no backward incompatible changes have been done to the YANG models. Upgrading a NED from one version to another, where the NED ID is the same, is simple as it only requires replacing the old NED package with the new one in NSO and then reloading all packages.

    Upgrading a NED package from one version to another, where the NED ID is not the same (typically indicated by a change of major or minor number in the NED version), requires additional steps. The new NED package first needs to be installed side-by-side with the old one. Then, a NED migration needs to be performed. This procedure is thoroughly described in .

    The Cisco NSO NED team ensures that our CLI NEDs, as well as Generic NEDs with Cisco-owned models, have version numbers and NED IDs that indicate any possible backward-incompatible YANG model changes. When a NED with such an incompatible change is released, the minor digit in the version is always incremented. The case is a bit different for our third-party YANG NEDs since it is up to the end user to select the NED ID to be used. This is further described in .

    NED Versioning Scheme

    A NED is assigned a version number consisting of a sequence of numbers separated by dots. The first two numbers represent the major and minor version, and the third number represents the maintenance version.

    For example, the number 5.8.1 indicates a maintenance release (1) for the minor release 5.8. Incompatible YANG model changes require either the major or minor version number to be changed. This means that any version within the 5.8.x series is backward compatible with the previous versions.

    When a newer maintenance release with the same major/minor version replaces a NED release, NSO can perform a simple data model upgrade to handle stored instance data in the CDB (Configuration Database). This type of upgrade does not pose a risk of data loss.

    However, when a NED is replaced by a new major/minor release, it becomes a NED migration. These migrations are complex because the YANG model changes can potentially result in the loss of instance data if not handled correctly.
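    The upgrade-versus-migration rule can be sketched as a comparison of the major.minor prefix of the two version numbers (an illustrative helper, not an NSO tool):

    ```shell
    #!/bin/sh
    # Same major.minor => simple package replacement and data model upgrade;
    # different major.minor => NED migration required.
    old="5.8.1"
    new="5.8.2"
    if [ "${old%.*}" = "${new%.*}" ]; then
        echo "simple upgrade"
    else
        echo "NED migration"
    fi
    ```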

    NED Installation in NSO

    This section describes the NED installation in NSO for Local and System installs. Consult the README.md supplied with the NED for the most up-to-date installation description.

    Local Install of NED in NSO

    This section describes how to install a NED package on a locally installed NSO. See for more information.

    Follow the instructions below to install a NED package:

    1. Download the latest production-grade version of the NED from software.cisco.com using the URLs provided on your NED license certificates. All NED packages are files with the .signed.bin extension, named using the following rule: ncs-<NSO VERSION>-<NED NAME>-<NED VERSION>.signed.bin. The NED package ncs-6.0-cisco-iosxr-7.43.signed.bin is used in the example below, and it is assumed to have been downloaded into the directory /tmp/ned-package-store. The environment variable NSO_RUNDIR needs to be configured to point to the NSO runtime directory.

    2. Unpack the NED package and verify its signature.

    Alternatively, the tar.gz file can be installed directly into NSO. In this case, skip steps 3 and 4, and do as below instead:
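    As a hedged sketch of these steps (the exact file names and commands are given in the NED's README.md; the NSO_RUNDIR value is an assumption):

    ```
    $ export NSO_RUNDIR=~/nso-lab-rundir
    $ cd /tmp/ned-package-store
    $ sh ncs-6.0-cisco-iosxr-7.43.signed.bin
    $ tar -xzf ncs-6.0-cisco-iosxr-7.43.tar.gz -C $NSO_RUNDIR/packages
    ```

    After unpacking, run packages reload in the NSO CLI to load the new package.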

    System Install of Cisco-provided NED in NSO

    This section describes how to install a NED package on a system-installed NSO. See for more information.

    1. Download the latest production-grade version of the NED from software.cisco.com using the URLs provided on your NED license certificates. All NED packages are files with the .signed.bin extension named using the following rule: ncs-<NSO_VERSION>-<NED NAME>-<NED VERSION>.signed.bin. The NED package ncs-6.0-cisco-iosxr-7.43.signed.bin will be used in the example below. It is assumed that the package has been downloaded into the directory named /tmp/ned-package-store.

    2. Unpack the NED package and verify its signature.

      In case the signature cannot be verified (for instance, if internet access is unavailable), do as below instead.

    Configuring a device with the new Cisco-provided NED

    The basic steps for configuring a device instance using the newly installed NED package are described in this section. Only the most basic configuration steps are covered here.

    Many NEDs require additional custom configuration to be operational. This applies in particular to Generic NEDs. Information about such additional configuration can be found in the files README.md and README-ned-settings.md bundled with the NED package.

    The following info is necessary to proceed with the basic setup of a device instance in NSO:

    • NED ID of the new NED.

    • Connection information for the device to connect to (address and port).

    • Authentication information to the device (username and password).

    CLI NED Setup

    For CLI NEDs, it is mandatory to specify the protocol to be used, either SSH or Telnet.

    The following values will be used for this example:

    • NED ID: cisco-iosxr-cli-7.43

    • Address: 10.10.1.1

    • Port: 22

    Do the CLI NED setup as below:

    1. Start an NSO CLI session.

    2. Enter the configuration mode.

    3. Configure a new authentication group to be used for this device.

    4. Configure the new device instance.

    Cisco-provided Generic NED Setup

    This example shows a simple setup of a generic NED.

    The following values will be used for this example:

    • NED ID: onf-tapi_rc-gen-2.0

    • Address: 10.10.1.2

    • Port: 443

    Do the Generic NED setup as below:

    1. Start an NSO CLI session.

    2. Enter the configuration mode.

    3. Configure a new authentication group to be used for this device.

    4. Configure the new device instance.

    Managing Cisco-provided Third-Party YANG NEDs

    The third-party YANG NED type is a special category of the generic NED type targeted for devices supporting protocols like NETCONF, RESTCONF, and gNMI. As the name implies, this NED category is used for cases where the device YANG models are not implemented or maintained by the Cisco NSO NED Team. Instead, the YANG models are typically provided by the device vendor itself or by organizations like IETF, IEEE, ONF, or OpenConfig.

    A third-party YANG NED package is delivered from the software.cisco.com portal without any device YANG models included. It is required that the models are first downloaded, followed by a rebuild and reload of the package, before the NED can become fully operational. This task needs to be performed by the NED user.

    Downloading with the NED Built-in Download Tool

    This section gives a brief instruction on how to download the device YANG models using the special downloader tool that is bundled with each third-party YANG NED. Each specific NED can contain specific requirements regarding downloading/rebuilding. Before proceeding, check the file README-rebuild.md bundled with the NED package. Furthermore, it is recommended to use a non-production NSO environment for this task.

    1. Download and install the third-party YANG NED package into NSO, see .

    2. Configure a device instance as usual. See for more information. The device name dev-1 will be used in this example.

    3. Open an NCS CLI session (non-configure mode).

    4. The installed NED is now basically empty. It contains no YANG models except some used by the NED internally. This can be verified with the following CLI commands:

    Rebuilding NED with Downloaded YANG Files

    The NED must be rebuilt when the device YANG models have been downloaded and stored properly. Compiling third-party YANG files often runs into issues caused by bad or odd YANG constructs. Such issues typically cause compiler errors or unwanted runtime errors in NSO. A third-party YANG NED is configured to take care of all currently known build issues: it automatically patches the problematic files so that they build properly for NSO, using a set of YANG build recipes bundled with the NED package.
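    The recipes are applied by a YANG pre-processor (ypp) bundled with the NED, driven by --from/--to substitution flags (visible in the make output later in this section). The fragment below is only a sketch of that idea; apply_recipes and the sample recipe are hypothetical, not the actual ypp implementation.

```python
def apply_recipes(yang_source, recipes):
    """Illustrative sketch: apply simple from/to textual substitutions,
    in the spirit of the ypp --from/--to flags used by the NED build."""
    for old, new in recipes:
        yang_source = yang_source.replace(old, new)
    return yang_source

# Hypothetical recipe mirroring the ypp substitution seen in the build output.
patched = apply_recipes(
    "leaf secret { type NEDCOM_SECRET_TYPE; }",
    [(" NEDCOM_SECRET_TYPE", " string")],
)
print(patched)  # leaf secret { type string; }
```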

    Adapting the YANG build recipes is a continuous process. If new issues are found, the Cisco NED team updates the recipes accordingly and releases a new version of the NED.

    It is strongly recommended that end users report newly found YANG build issues to the Cisco NSO NED team through a support request.

    Before rebuilding the NED, it is important to know the path to the target directory used for the downloaded YANG files. This is the same as the local directory if the built-in NED downloader tool was used, see .

    This example uses the environment variable NED_YANG_TARGET_DIR to represent the target directory.

    To rebuild the NED with the downloaded YANG files:

    1. Enter the NED build directory, which is the parent directory to the target directory.

    2. Run the make clean all command. The output from the make command can be massive, depending on the number of YANG files, etc. After this step, the NED is rebuilt with the device YANG models included. Lines like below indicate that the NED has applied a number of YANG recipes (patches) to solve known issues with the YANG files:

    Reloading the NED Package into NSO

    This is the final step to make a third-party YANG NED operational. If the NED built-in YANG downloader tool was used together with no local-dir argument specified (i.e., the default), the only thing required is a package reload in NSO, which you can do by running the packages reload or the packages add command.

    If another target directory was used for the YANG file download, it is necessary to first do a proper re-install of the NED package. See .

    Rebuilding the NED with a Unique NED ID

    A common use case is to have many different versions of a certain device type in the network. All devices can be managed by the same third-party YANG NED. However, each device will likely have its own unique set of YANG files (or versions) that the NED has to be rebuilt for.

    To set up NSO for this kind of scenario, some additional steps need to be taken:

    • Each flavor of the NED needs to be built in a separate source directory, i.e., untar the third-party YANG NED package at multiple locations.

    • Each flavor of the re-built NED must have its own unique NED-ID. This will make NSO allow multiple versions of the same NED package to co-exist.

    The default NED ID for a third-party YANG NED typically looks like this: <NED NAME>-gen-<NED VERSION MAJOR DIGIT>.<NED VERSION MINOR DIGIT>

    The NED build system allows for a customized NED ID by setting any combination of the following three make variables when rebuilding the NED:

    • NED_ID_SUFFIX

    • NED_ID_MAJOR

    • NED_ID_MINOR
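    How these variables combine into the resulting NED ID can be sketched as follows. The build_ned_id helper is hypothetical (the real logic lives in the NED package's make files); the example values mirror the NED ID examples later in this section.

```python
def build_ned_id(ned_name, ned_version, suffix="", major=None, minor=None):
    """Sketch of the default NED ID rule and the three make variables.
    Default: <NED NAME>-gen-<MAJOR>.<MINOR>, taken from the NED version."""
    ver_major, ver_minor = ned_version.split(".")[:2]
    if major is not None:
        ver_major = major
    if minor is not None:
        ver_minor = minor
    return f"{ned_name}{suffix}-gen-{ver_major}.{ver_minor}"

# Default ID for the onf-tapi_rc 2.0.x NED:
print(build_ned_id("onf-tapi_rc", "2.0.3"))
# onf-tapi_rc-gen-2.0

# make clean all NED_ID_SUFFIX=_tapi_v2.1.3:
print(build_ned_id("onf-tapi_rc", "2.0.3", suffix="_tapi_v2.1.3"))
# onf-tapi_rc_tapi_v2.1.3-gen-2.0

# make clean all NED_ID_MAJOR=2 NED_ID_MINOR=1.3:
print(build_ned_id("onf-tapi_rc", "2.0.3", major="2", minor="1.3"))
# onf-tapi_rc-gen-2.1.3
```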

    Do as follows to build each flavor of the third-party YANG NED, in iterations, one at a time:

    1. Unpack the empty NED package as described in .

    2. Unpack the NED package again in a separate location. Rename the NED directory to something unique.

    3. Configure a device instance using the installed NED, as described in . Configure it to connect to the first variant of the device.

    4. Follow the instructions in to download the YANG files. Configure local-dir

    Upgrading a Cisco-provided Third-Party YANG NED to a Newer Version

    The NSO procedure to upgrade a NED package to a newer version uses the following approach:

    • If there are no backward-incompatible changes between the schemas (YANG models) of the two NED versions, simply replace the old NED with the new one and reload all packages in NSO.

    • In case there are backward-incompatible changes present in the schemas, some administration is required: the new NED needs to be installed side by side with the old NED, after which a NED migration must be performed to properly update the data in CDB using the new schemas. More information about NED migration is available in .

    Whether two versions of the same NED contain backward-incompatible differences is determined by the NED ID. If the versions have the same NED ID, they are fully compatible; otherwise, the NED IDs differ, typically in the major and/or minor number.

    The third-party YANG NEDs add some extra complexity to the NED migration feature. This is because the device YANG models are not included in the NED package. It is up to the end user to select the YANG model versions to use and also to configure the NED ID. If the same NED, at a later stage, needs to be upgraded and rebuilt with newer versions of the YANG model, a decision has to be made regarding the NED ID: Is it safe to use the same NED ID, or should a new one be used?

    Using a unique NED ID for each NED package is always the safe option. It minimizes the risk of data loss during package upgrade, etc. However, in some cases, it might be beneficial to use the same NED ID when upgrading a NED package since it minimizes the administration in NSO, i.e., simply replace the old NED package with the new one without any need of NED migration.

    This kind of use case can occur when the firmware is upgraded on an NSO-controlled device. For example, assume that we have an optical device that supports the TAPI YANG models from the Open Networking Foundation. Current firmware supports version 2.1.3 of the TAPI bundle. The third-party YANG NED onf-tapi_rc has been rebuilt accordingly with TAPI version 2.1.3 and the default NED ID onf-tapi_rc-gen-2.0. This NED package is installed in NSO and a device instance named dev-1 is configured using it. Next, the optical device is upgraded with the new firmware that supports the TAPI bundle version 2.3.1 instead. The onf-tapi_rc NED needs to be upgraded accordingly. The question is what NED ID to use?

    To upgrade a Cisco-provided third-party YANG NED to a newer version:

    1. Unpack a fresh copy of the onf-tapi_rc NED package.

    2. Download the TAPI models v2.3.1 from the TAPI public Git repository.

    3. Rebuild the NED package with a temporary unique NED ID for this rebuild. Any unique NED ID works for this.

      This will generate the NED ID: onf-tapi_rc-gen-2.3.1.

    NED Migration

    If you upgrade a managed device (such as installing a new firmware), the device data model can change in a significant way. If this is the case, you usually need to use a different and newer NED with an updated YANG model.

    When the changes in the NED are not backward compatible, the NED is assigned a new ned-id to avoid breaking existing code. On the plus side, this allows you to use both versions of the NED at the same time, so some devices can use the new version and some can use the old one. As a result, there is no need to upgrade all devices at the same time. The downside is that NSO doesn't know the two NEDs are related and will not perform any upgrade on its own because of the different ned-ids. Instead, you must manually change the NED of a managed device through a NED migration.

    For third-party NEDs, the end user is required to configure the NED ID and also be aware of the backward incompatibilities. See for an example.

    Migration is required when upgrading a NED and the NED-ID changes, which is signified by a change in either the first or the second number in the NED package version. For example, if you're upgrading the existing router-nc-1.0.1 NED to router-nc-1.2.0 or router-nc-2.0.2, you must perform NED migration. On the other hand, upgrading to router-nc-1.0.2 or router-nc-1.0.3 retains the same ned-id and you can upgrade the router-nc-1.0.1 package in place, directly replacing it with the new one. However, note that some third-party, non-Cisco packages may not adhere to this standard versioning convention. In that case, you must check the ned-id values to see whether migration is needed.
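    The versioning rule above can be expressed as a small check. The needs_ned_migration helper is hypothetical and assumes the standard versioning convention; for packages that do not follow it, compare the ned-id values directly instead.

```python
def needs_ned_migration(old_version, new_version):
    """NED migration is needed when the first or second number of the
    NED package version changes (standard versioning convention only)."""
    return old_version.split(".")[:2] != new_version.split(".")[:2]

# The examples from the text:
assert needs_ned_migration("1.0.1", "1.2.0")      # ned-id changes: migrate
assert needs_ned_migration("1.0.1", "2.0.2")      # ned-id changes: migrate
assert not needs_ned_migration("1.0.1", "1.0.2")  # same ned-id: in-place upgrade
```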

    A potential issue with a new NED is that it can break an existing service or other packages that rely on it. To help service developers and operators verify or upgrade the service code, NSO's migration tooling provides additional options for identifying the paths and service instances that may be impacted. Therefore, ensure that all the other packages are compatible with the new NED before you start migrating devices.

    To prepare for the NED migration process, first load the new NED package into NSO with either the packages reload or the packages add command. Then, use the show packages command to verify that both NEDs, the new and the old, are present. Finally, you may perform the migration of devices either one by one or multiple at a time.

    Depending on your operational policies, this may be done during normal operations and does not strictly require a maintenance window, as the migration only reads from and doesn't write to a network device. Still, it is recommended that you create an NSO backup before proceeding.

    Note that changing a ned-id also affects device templates if you use them. To make existing device templates compatible with the new ned-id, you can use the copy action. It will copy the configuration used for one ned-id to another, as long as the schema nodes used haven't changed between the versions. The following example demonstrates the copy action usage:

    For individual devices, use the /devices/device/migrate action, with the new-ned-id parameter. Without additional options, the command will read and update the device configuration in NSO. As part of this process, NSO migrates all the configuration and service meta-data. Use the dry-run option to see what the command would do and verbose to list all impacted service instances.

    You may also use the no-networking option to prevent NSO from generating any southbound traffic towards the device. In this case, only the device configuration in the CDB is used for the migration but then NSO can't know if the device is in sync. Afterward, you must use the compare-config or the sync-from action to remedy this.

    For migrating multiple devices, use the /devices/migrate action, which takes the same options. However, with this action, you must also specify the old-ned-id, which limits the migration to devices using the old NED. You can further restrict the action with the device parameter, selecting only specific devices.

    It is possible for a NED migration to fail if the new NED is not entirely backward compatible with the old one and the device has an active configuration that is incompatible with the new NED version. In such cases, NSO will produce an error with the YANG constraint that is not satisfied. Here, you must first manually adjust the device configuration to make it compatible with the new NED, and then you can perform the migration as usual.

    Depending on what changes are introduced by the migration and how these impact the services, it might be good to re-deploy the affected services before removing the old NED package. It is especially recommended in the following cases:

    • When the service touches a list key that has changed. As long as the old schema is loaded, NSO is able to perform an upgrade.

    • When a namespace that was used by the service has been removed. The service diffset, that is, the recorded configuration changes created by the service, will no longer be valid. The diffset is needed for the correct get-modifications output, deep-check-sync, and similar operations.

    Revision Merge Functionality

    The YANG modeling language supports the notion of a module revision. It allows users to distinguish between different versions of a module, so the module can evolve over time. If you wish to use a new revision of a module for a managed device, for example, to access new features, you generally need to create a new NED.

    When a model evolves quickly and you have many devices that require many different revisions, you will need to maintain a large number of NEDs that are mostly the same. This can become especially burdensome during NSO version upgrades, when all NEDs may need to be recompiled.

    When a YANG module is only updated in a backward-compatible way (following the upgrade rules in RFC6020 or RFC7950), the NSO compiler, ncsc, allows you to pack multiple module revisions into the same package. This way, a single NED with multiple device model revisions can be used, instead of multiple NEDs. Based on the capabilities exchange, NSO will then use the correct revision for communication with each device.

    However, there is a major downside to this approach. While the exact revision is known for each communication session with the managed device, the device model in NSO does not have that information. For that reason, the device model always uses the latest revision. When pushing configuration to a device that only supports an older revision, NSO silently drops the unsupported parts. This may have surprising results, as the NSO copy can contain configuration that is not really supported on the device. Use the no-revision-drop commit parameter when you want to make sure you are not committing config that is not supported by a device.

    If you still wish to use this functionality, you can create a NED package with the ncs-make-package --netconf-ned command as you would otherwise. However, the supplied source YANG directory should contain YANG modules with different revisions. The files should follow the module-or-submodule-name@revision-date.yang naming convention, as specified in the RFC6020. Some versions of the compiler require you to use the --no-fail-on-warnings option with the ncs-make-package command or the build process may fail.
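    The file naming convention and revision handling can be sketched as follows. The latest_revision helper and the sample filenames are hypothetical, for illustration only; NSO itself selects the revision to use for each device based on the capabilities exchange, and (as noted above) the device model in NSO always uses the latest packaged revision.

```python
import re

# Illustrative pattern for the module-or-submodule-name@revision-date.yang convention.
REV_FILE_RE = re.compile(r"^([\w.-]+)@(\d{4}-\d{2}-\d{2})\.yang$")

def latest_revision(filenames, module):
    """Return the newest revision date packaged for the given module, or None.
    ISO revision dates compare correctly as strings."""
    revisions = []
    for f in filenames:
        m = REV_FILE_RE.match(f)
        if m and m.group(1) == module:
            revisions.append(m.group(2))
    return max(revisions) if revisions else None

# Hypothetical package contents with two revisions of the same module:
files = ["router@2023-01-15.yang", "router@2024-06-01.yang", "tapi-common.yang"]
print(latest_revision(files, "router"))  # 2024-06-01
```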

    The examples.ncs/development-guide/ned-upgrade/yang-revision example shows how you can perform a YANG model upgrade. The original, 1.0 version of the router NED uses the initial revision of the router YANG module. First, the NED is updated to version 1.0.1 with a newer, backward-compatible revision of the module, using the revision merge approach.

    In the second part of the example, a later revision of the module introduces breaking changes; therefore, the version is increased to 1.1 and a different NED-ID is assigned to the NED. In this case, you can't use revision merge, and the usual NED migration procedure is required.

    In case the signature cannot be verified (for instance, if access to the internet is down), do as below instead:

    The result of the unpacking is a tar.gz file with the same name as the .bin file.

  • Untar the tar.gz file. The result is a subdirectory named like <NED NAME>-<NED MAJOR VERSION DIGIT>.<NED MINOR VERSION DIGIT>

  • Install the NED into NSO, using the ncs-setup tool.

  • Finally, open an NSO CLI session and load the new NED package like below:

  • The result of the unpacking is a tar.gz file with the same name as the .bin file.
  • Perform an NSO backup before installing the new NED package.

  • Start an NSO CLI session.

  • Fetch the NED package.

  • Install the NED package (add the argument replace-existing if a previous version has been loaded).

  • Finally, load the NED package.

  • Protocol: ssh
  • User: cisco

  • Password: cisco

  • Next, check the README.md and README-ned-settings.md bundled with the NED package for further information on additional settings to make the NED fully operational.
  • Finally, commit the configuration.

    In the case of SSH, also run:

  • User: admin
  • Password: admin

  • Next, check the README.md and README-ned-settings.md bundled with the NED package for further information on additional settings to make the NED fully operational.
  • Finally, commit the configuration.

  • The built-in downloader tool consists of a couple of NSO RPCs defined in one of the NED internal YANG files.

  • Start with checking the default local directory. This directory will be used as a target for the device YANG models to be downloaded.

    This RPC will throw an error if the NED package was installed directly using the tar.gz file. See NED Installation in NSO for more information.

    If this error occurs, it is necessary to unpack the NED package in some other directory and use that as a target for the download. In the example below it is /tmp/ned-package-store/onf-tapi_rc-2.0/src/yang.

  • Continue with listing the models supported by the connected device.

    The size of the displayed list is device-dependent and so is the level of detail in each list entry. The only mandatory field is the name. Furthermore, not all devices are actually capable of advertising the models supported. If the currently connected device lacks this support, it is usually emulated by the NED instead. Check the README-rebuild.md for more information regarding this.

  • Next, list the download profiles currently supported by the device.

    A download profile is a preset for the built-in download tool. Its purpose is to make the download procedure as easy as possible. A profile can, for instance, define a certain source from where the device YANG models will be downloaded. Another usage can be to limit the scope of the YANG files to download. For example, one profile to download the native device models, and another for the OpenConfig models. All download profiles are defined and verified by the Cisco NSO NED team. There is usually at least one profile available, otherwise, check the README-rebuild.md bundled in the NED package.

  • Finally, try downloading the YANG models using a profile. In case a non-default local directory is used as a target, it must be explicitly specified.

    In case the default local directory is used, no further arguments are needed.

    The tool will output a list with each file downloaded. It automatically scans each YANG file for dependencies and tries to download them as well.

  • Verify that the downloaded files have been stored properly in the configured target directory.
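    The dependency scan performed by the download tool can be illustrated with a minimal sketch. The yang_dependencies helper and the sample module text are hypothetical; the real tool also downloads each discovered module, recursively.

```python
import re

def yang_dependencies(yang_source):
    """Return the module names referenced by import/include statements in a
    YANG source text (simplified: ignores prefixes and revision-date clauses)."""
    return re.findall(r"^\s*(?:import|include)\s+([\w-]+)",
                      yang_source, re.MULTILINE)

src = """
module tapi-connectivity {
  import tapi-common { prefix tapi-common; }
  import tapi-path-computation { prefix tapi-path-computation; }
}
"""
print(yang_dependencies(src))  # ['tapi-common', 'tapi-path-computation']
```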

  • to point to the location configured in .
  • Rebuild a NED package from the location configured in Rebuilding NED with Downloaded YANG Files. Use a suitable combination of NED_ID_SUFFIX, NED_ID_MAJOR, and NED_ID_MINOR.

    Example 1:

    This will result in the NED ID: onf-tapi_rc_tapi_v2.1.3-gen-2.0.

    Example 2:

    This will result in the NED ID: onf-tapi_rc-gen-2.1.3.

  • Install the newly built NED package into NSO, side-by-side with the original NED package. See Configuring a Device with the New Cisco-provided NED for further information.

    Example:

  • Configure a new device instance using the newly installed NED package. Configure it to connect to the first variant of the device, as done in step 3.

  • Verify functionality by executing a sync-from on the configured device instance.

  • Install the new onf-tapi_rc NED package into NSO, side by side with the old one.

  • Now, execute a dry run of the NSO NED migration feature. This command generates a list of all schema differences found between the two packages, like below:

    If the goal is to rebuild the new NED package again using the same NED ID as the old NED package, there are two things to look out for in the list:

    1. Does the list contain any items with backward-compatible false?

    2. If the answer is yes, is the affected schema node relevant for any use case, i.e., referenced by any service code running in NSO?

    Any item listed as backward-compatible false can potentially result in data loss if the old NED is simply replaced with the new one. This might however be acceptable if the affected schema node is not relevant for any use case.

    > tar xfz ncs-6.0-cisco-iosxr-7.43.tar.gz
    > ls -d */
    cisco-iosxr-7.43
        > ncs-setup --package cisco-iosxr-7.43 --dest $NSO_RUNDIR
    > ncs_cli -C -u admin
    admin@ncs# packages reload
    reload-result {
    package cisco-iosxr-cli-7.43
    result true
    }
    > $NCS_DIR/bin/ncs-backup
    > ncs_cli -C -u admin
    admin@ncs# software packages fetch package-from-file
        /tmp/ned-package-store/ncs-6.0-cisco-iosxr-7.43.tar.gz
    admin@ncs# software packages list
      package {
         name ncs-6.0-cisco-iosxr-7.43.tar.gz
         installable
     }
    admin@ncs# software packages install cisco-iosxr-7.43
    admin@ncs# software packages list
      package {
         name ncs-6.0-cisco-iosxr-7.43.tar.gz
         installed
      }
    admin@ncs# packages reload
    admin@ncs# software packages list
      package {
         name cisco-iosxr-cli-7.43
         loaded
     }
    admin@ncs(config)# commit
    admin@ncs(config)# devices device xrdev-1 ssh fetch-host-keys
    admin@ncs(config)# commit
    admin@ncs# devices device dev-1 rpc ?
    Possible completions:
    rpc-get-modules  rpc-list-modules  rpc-list-profiles  rpc-show-default-local-dir
    admin@ncs# devices device dev-1 rpc rpc-show-default-local-dir show-default-local-dir
    result /nso-lab-rundir/packages/onf-tapi_rc-2.0/src/yang
    admin@ncs#
    admin@ncs# devices device dev-1 rpc rpc-show-default-local-dir show-default-local-dir
    Error: External error in the NED implementation for device nokia-srlinux-1: default
        local directory does not exist (/nso-lab-rundir/packages/onf-tapi_rc-2.0/src/yang)
    admin@ncs#
    > cd /tmp/ned-package-store
    > chmod u+x ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > ./ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > tar xfz ncs-6.0-onf-tapi_rc-2.0.3.tar.gz
    > ls -d */
    onf-tapi_rc-2.0
    admin@ncs# devices device netsim-0 rpc rpc-list-modules list-modules
    module {
        name tapi-common
        revision 2020-04-23
        namespace urn:onf:otcc:yang:tapi-common
        schema https://localhost:7888/restconf/tailf/modules/tapi-common/2020-04-23
    }
    module {
        name tapi-connectivity
        revision 2020-06-16
        namespace urn:onf:otcc:yang:tapi-connectivity
        schema https://localhost:7888/restconf/tailf/modules/tapi-connectivity/2020-06-16
    }
    module {
        name tapi-dsr
        revision 2020-04-23
        namespace urn:onf:otcc:yang:tapi-dsr
        schema https://localhost:7888/restconf/tailf/modules/tapi-dsr/2020-04-23
    }
    module {
        name tapi-equipment
        revision 2020-04-23
        namespace urn:onf:otcc:yang:tapi-equipment
        schema https://localhost:7888/restconf/tailf/modules/tapi-equipment/2020-04-23
    }
    
    ...
    admin@ncs# devices device dev-1 rpc rpc-list-profiles list-profiles
    profile {
        name onf-tapi-from-device
        description Download the ONF TAPI YANG models. Download is done directly from device.
    }
    profile {
        name onf-tapi-from-git
        description Download the ONF TAPI YANG models. Download is done from the ONF TAPI github repo.
    }
    profile {
        name onf-tapi
        description Download the ONF TAPI YANG models. Download source must be specified explicitly.
    }
    admin@ncs# devices device dev-1 rpc rpc-get-modules get-modules profile
    onf-tapi-from-device local-dir /tmp/ned-package-store/onf-tapi_rc-2.0/src/yang
    admin@ncs# devices device dev-1 rpc rpc-get-modules get-modules profile onf-tapi-from-device
    result
    Fetching modules:
      tapi-common - urn:onf:otcc:yang:tapi-common (32875 bytes)
      tapi-connectivity - urn:onf:otcc:yang:tapi-connectivity (40488 bytes)
        fetching imported module tapi-path-computation
        fetching imported module tapi-topology
      tapi-dsr - urn:onf:otcc:yang:tapi-dsr (11172 bytes)
      tapi-equipment - urn:onf:otcc:yang:tapi-equipment (33406 bytes)
      tapi-eth - urn:onf:otcc:yang:tapi-eth (93152 bytes)
        fetching imported module tapi-oam
      tapi-notification - urn:onf:otcc:yang:tapi-notification (23864 bytes)
      tapi-oam - urn:onf:otcc:yang:tapi-oam (30409 bytes)
      tapi-odu - urn:onf:otcc:yang:tapi-odu (45327 bytes)
      tapi-path-computation - urn:onf:otcc:yang:tapi-path-computation (19628 bytes)
      tapi-photonic-media - urn:onf:otcc:yang:tapi-photonic-media (52848 bytes)
      tapi-topology - urn:onf:otcc:yang:tapi-topology (43357 bytes)
      tapi-virtual-network - urn:onf:otcc:yang:tapi-virtual-network (13278 bytes)
    fetched and saved 12 yang module(s) to /tmp/ned-package-store/onf-tapi_rc-2.0/src/yang
    > ls -l /tmp/ned-package-store/onf-tapi_rc-2.0/src/yang
    total 616
    -rw-r--r-- 1 nso-user staff 109607 Nov 11 13:15 tailf-common.yang
    -rw-r--r-- 1 nso-user staff  32878 Nov 11 13:15 tapi-common.yang
    -rw-r--r-- 1 nso-user staff  40503 Nov 11 13:15 tapi-connectivity.yang
    -rw-r--r-- 1 nso-user staff  11172 Nov 11 13:15 tapi-dsr.yang
    -rw-r--r-- 1 nso-user staff  33406 Nov 11 13:15 tapi-equipment.yang
    -rw-r--r-- 1 nso-user staff  93152 Nov 11 13:15 tapi-eth.yang
    -rw-r--r-- 1 nso-user staff  23864 Nov 11 13:15 tapi-notification.yang
    -rw-r--r-- 1 nso-user staff  30409 Nov 11 13:15 tapi-oam.yang
    -rw-r--r-- 1 nso-user staff  45327 Nov 11 13:15 tapi-odu.yang
    -rw-r--r-- 1 nso-user staff  19628 Nov 11 13:15 tapi-path-computation.yang
    -rw-r--r-- 1 nso-user staff  52848 Nov 11 13:15 tapi-photonic-media.yang
    -rw-r--r-- 1 nso-user staff  43357 Nov 11 13:15 tapi-topology.yang
    -rw-r--r-- 1 nso-user staff  13281 Nov 11 13:15 tapi-virtual-network.yang
    > make clean all NED_ID_SUFFIX=_tapi_v2.1.3
    > make clean all NED_ID_MAJOR=2 NED_ID_MINOR=1.3
    > cd /tmp/ned-package-store
    > tar cfz onf-tapi_rc-2.0-variant-1.tar.gz onf-tapi_rc-2.0-variant-1
    > ncs-setup --package onf-tapi_rc-2.0-variant-1.tar.gz --dest $NSO_RUNDIR
    > ncs_cli -C -u admin
    admin@ncs# packages reload
    > cd /tmp/ned-package-store
    > tar cfz onf-tapi_rc-2.0-variant-1.tar.gz onf-tapi_rc-2.0-variant-1
    > ncs-setup --package onf-tapi_rc-2.0-variant-1.tar.gz --dest $NSO_RUNDIR
    > ncs_cli -C -u admin
    admin@ncs# packages reload
    
    >>> System upgrade is starting.
    >>> Sessions in configure mode must exit to operational mode.
    >>> No configuration changes can be performed until upgrade has completed.
    >>> System upgrade has completed successfully.
    reload-result {
        package onf-tapi_rc-gen-2.0
        result true
    }
    reload-result {
        package onf-tapi_rc-gen-2.3.1
        result true
    }
    admin@ncs# devices device dev-1 migrate new-ned-id onf-tapi_rc-gen-2.3.1 dry-run
    
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/vnw-constraint/service-layer
        info leaf-list type stack has changed
        backward-compatible false
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/vnw-constraint/requested-capacity/bandwidth-profile
        info sub-tree has been deleted
        backward-compatible false
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/vnw-constraint/latency-characteristic/queing-latency-characteristic
        info sub-tree has been deleted
        backward-compatible false
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/vnw-constraint
        info min/max has been relaxed
        backward-compatible true
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/vnw-constraint
        info list key has changed; leaf 'local-id' has changed type
        backward-compatible false
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/layer-protocol-name
        info node is no longer mandatory
        backward-compatible true
    }
    modified-path {
        path /tapi-common:context/tapi-virtual-network:virtual-network-context/
            virtual-nw-service/layer-protocol-name
        info leaf-list type stack has changed
        backward-compatible false
    }
    admin@ncs(config)# devices device dev-1 ned-settings
    onf-tapi_rc restconf profile vendor-xyz
    > export NSO_RUNDIR=~/nso-lab-rundir
    > cd /tmp/ned-package-store
    > chmod u+x ncs-6.0-cisco-iosxr-7.43.signed.bin
    > ./ncs-6.0-cisco-iosxr-7.43.signed.bin
    > ncs-setup --package cisco-iosxr-7.43.tar.gz --dest $NSO_RUNDIR
    > cd /tmp/ned-package-store
    > chmod u+x ncs-6.0-cisco-iosxr-7.43.signed.bin
    > ./ncs-6.0-cisco-iosxr-7.43.signed.bin
    > ./ncs-6.0-cisco-iosxr-7.43.signed.bin --skip-verification
    > ncs_cli -C -u admin
    admin@ncs# configure
    Entering configuration mode terminal
    admin@ncs(config)#
    admin@ncs(config)# devices authgroup my-xrgroup default-map
    remote-name cisco remote-password cisco
    admin@ncs(config)# devices device xrdev-1 address 10.10.1.1
    admin@ncs(config)# devices device xrdev-1 port 22
    admin@ncs(config)# devices device xrdev-1 device-type cli ned-id cisco-iosxr-cli-7.43 protocol ssh
    admin@ncs(config)# devices device xrdev-1 state admin-state unlocked
    admin@ncs(config)# devices device xrdev-1 authgroup my-xrgroup
    > ncs_cli -C -u admin
    admin@ncs# configure
    Entering configuration mode terminal
    admin@ncs(config)#
    admin@ncs(config)# devices authgroup my-tapigroup default-map remote-name admin
    remote-password admin
    admin@ncs(config)# devices device tapidev-1 address 10.10.1.2
    admin@ncs(config)# devices device tapidev-1 port 443
    admin@ncs(config)# devices device tapidev-1 device-type generic ned-id onf-tapi_rc-gen-2.0
    admin@ncs(config)# devices device tapidev-1 state admin-state unlocked
    admin@ncs(config)# devices device tapidev-1 authgroup my-tapigroup
    > ncs_cli -C -u admin
    > echo $NED_YANG_TARGET_DIR
    /tmp/ned-package-store/onf-tapi_rc-2.0/src/yang
    > cd $NED_YANG_TARGET_DIR/..
    > make clean all
    ======== RUNNING YANG PRE-PROCESSOR (YPP) WITH THE FOLLOWING VARIABLES:
    tools/ypp  --var NCS_VER=6.0  --var NCS_VER_NUMERIC=6000000
        --var SUPPORTS_CDM=YES  --var SUPPORTS_ROLLBACK_FILES_OCTAL=YES
        --var SUPPORTS_SHOW_STATS_PATH=YES \
             \
            --from=' NEDCOM_SECRET_TYPE' --to=' string' \
            'tmp-yang/*.yang'
    touch tmp-yang/ypp_ned
    
    ======== REMOVE PRESENCE STATEMENT ON CONTEXT TOP CONTAINER
    tools/ypp --from="(presence \"Root container)" \
               --to="//\g<1>" \
            'tmp-yang/tapi-common.yang'
    
    ======== ADDING EXTRA ENUM WITH CORRECT SPELLING: NO_PROTECTION
    tools/jypp --add-stmt=/typedef#protection-type/type::"enum NO_PROTECTION;" \
            'tmp-yang/tapi-topology.yang' || true
    
    ======== ADDING EXTRA IDENTITIES USED BY CERTAIN TAPI DEVICES
    tools/jypp --add-stmt=/::"identity DIGITAL_SIGNAL_TYPE_400GBASE-R {  base DIGITAL_SIGNAL_TYPE; }" \
                --add-stmt=/::"identity DIGITAL_SIGNAL_TYPE_GigE_CONV {  base DIGITAL_SIGNAL_TYPE; }" \
                --add-stmt=/::"identity DIGITAL_SIGNAL_TYPE_ETHERNET {  base DIGITAL_SIGNAL_TYPE; }" \
            'tmp-yang/tapi-dsr.yang' || true
    > ncs_cli -C -u admin
    admin@ncs# packages reload
    
    >>> System upgrade is starting.
    >>> Sessions in configure mode must exit to operational mode.
    >>> No configuration changes can be performed until upgrade has completed.
    >>> System upgrade has completed successfully.
    reload-result {
        package onf-tapi_rc-gen-2.0
        result true
    }
    admin@ncs#
    > cd /tmp/ned-package-store
    > chmod u+x ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > ./ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > tar xfz ncs-6.0-onf-tapi_rc-2.0.3.tar.gz
    > ls -d */
    onf-tapi_rc-2.0
    > mv onf-tapi_rc-2.0 onf-tapi_rc-2.0-variant-1
    > cd /tmp/ned-package-store
    > chmod u+x ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > ./ncs-6.0-onf-tapi_rc-2.0.3.signed.bin
    > tar xfz ncs-6.0-onf-tapi_rc-2.0.3.tar.gz
    > ls -d */
    onf-tapi_rc-2.0
    > mv onf-tapi_rc-2.0 onf-tapi_rc-2.0-for-new-firmware
    > ncs_cli -C -u admin
    admin@ncs# devices device dev-1 rpc rpc-get-modules get-modules
            profile onf-tapi-from-git remote { git { checkout v2.3.1 } }
        local-dir /tmp/ned-package-store/onf-tapi_rc-2.0-for-new-firmware/src/yang
    > cd /tmp/ned-package-store/onf-tapi_rc-2.0-for-new-firmware/src/yang
    > make clean all NED_ID_MAJOR=2 NED_ID_MINOR=3.1
    admin@ncs(config)# devices template acme-ntp ned-id router-nc-1.0
    copy ned-id router-nc-1.2
    > ./ncs-6.0-cisco-iosxr-7.43.signed.bin --skip-verification
    > ls *.tar.gz
    ncs-6.0-cisco-iosxr-7.43.tar.gz
    > ls *.tar.gz
    ncs-6.0-cisco-iosxr-7.43.tar.gz
    admin@ncs# devices device dev-1 connect
    result true
    info (admin) Connected to dev-1 - 127.0.0.1:7888
    admin@ncs# show devices device dev-1 module
    NAME                         REVISION    FEATURE  DEVIATION
    -------------------------------------------------------------
    ietf-restconf-monitoring     2017-01-26  -        -
    tailf-internal-rpcs          2022-07-08  -        -
    tailf-ned-onf-tapi_rc-stats  2022-10-17  -        -
    > ncs_cli -C -u admin
    admin@ncs# devices device dev-1 rpc rpc-get-modules get-modules profile
    onf-tapi-from-device local-dir /tmp/ned-package-store/onf-tapi_rc-2.0-variant-1/src/yang

    High Availability

    Implement redundancy in your deployment using High Availability (HA) setup.

    As a single NSO node can fail or lose network connectivity, you can configure multiple nodes in a highly available (HA) setup, which replicates the CDB configuration and operational data across participating nodes. It allows the system to continue functioning even when some nodes are inoperable.

    The replication architecture is that of one active primary and a number of secondaries. This means all configuration write operations must occur on the primary, which distributes the updates to the secondaries.

    Operational data in the CDB may be replicated or not based on the tailf:persistent statement in the data model. If replicated, operational data writes can only be performed on the primary, whereas non-replicated operational data can also be written on the secondaries.

    Replication is supported in several different architectural setups. For example, two-node active/standby designs as well as multi-node clusters with runtime software upgrade.

    Primary - Secondary Configuration
    One Primary - Several Secondaries

    This feature is independent of but compatible with the Layered Service Architecture (LSA), which also configures multiple NSO nodes to provide additional scalability. When the following text simply refers to a cluster, it identifies the set of NSO nodes participating in the same HA group, not an LSA cluster, which is a separate concept.

    NSO supports the following options for implementing an HA setup to cater to the widest possible range of use cases (only one can be used at a time):

    • HA Raft: Using a modern, consensus-based algorithm, it offers a robust, hands-off solution that works best in the majority of cases.

    • Rule-based HA: A less sophisticated solution that allows you to influence the primary selection but may require occasional manual operator action.

    • External HA: NSO only provides data replication; all other functions, such as primary selection and group membership management, are performed by an external application, using the HA framework (HAFW).

    In addition to data replication, having a fixed address to connect to the current primary in an HA group greatly simplifies access for operators, users, and other systems alike. Use the tailf-hcc package or an external load balancer to manage it.

    NSO HA Raft

    Raft is a consensus algorithm that reliably distributes a set of changes to a group of nodes and robustly handles network and node failure. It can operate in the face of multiple, subsequent failures, while also allowing a previously failed or disconnected node to automatically rejoin the cluster without risk of data conflicts.

    Compared to traditional fail-over HA solutions, Raft relies on the consensus of the participating nodes, which addresses the so-called “split-brain” problem, where multiple nodes assume a primary role. This problem is especially characteristic of two-node systems, where it is impossible for a single node on its own to distinguish between losing network connectivity itself versus the other node malfunctioning. For this reason, Raft requires at least three nodes in the cluster.

    Raft achieves this robustness by requiring at least three nodes in the HA cluster. Three is the recommended cluster size, allowing the cluster to keep operating in the face of a single node failure. If you need to tolerate two nodes failing simultaneously, you can add two more nodes for a five-node cluster. However, permanently having more than five nodes in a single cluster is currently not recommended, since Raft requires a majority of the currently configured nodes to reach consensus. Without consensus, the cluster cannot function.

    You can start a sample HA Raft cluster using the examples.ncs/high-availability/raft-cluster example to test it out. The scripts in the example show various aspects of cluster setup and operation, which are further described in the rest of this section.

    Optionally, examples using a separate container for each HA Raft cluster member, based on NSO system installations, are referenced in the examples.ncs/development-guide/high-availability/hcc example in the NSO example set.

    Overview of Raft Operation

    The Raft algorithm works with the concept of (election) terms. In each term, nodes in the cluster vote for a leader. The leader is elected when it receives the majority of the votes. Since each node only votes for a single leader in a given term, there can only be one leader in the cluster for this term.

    Once elected, the leader becomes responsible for distributing the changes and ensuring consensus in the cluster for that term. Consensus means that the majority of the participating nodes must confirm a change before it is accepted. This is required for the system to ensure no changes ever get overwritten and provide reliability guarantees. On the other hand, it also means more than half of the nodes must be available for normal operation.

    Changes can only be performed on the leader, which accepts a change only after the majority of the cluster nodes confirm it. This is the reason a typical Raft cluster has an odd number of nodes: exactly half of the nodes agreeing on a change is not sufficient. It also makes a two-node cluster (or any cluster with an even number of nodes) impractical; the system as a whole is no more available than it would be with one fewer node.

    If the connection to the leader is broken, such as during a network partition, the nodes start a new term and a new election. Another node can become a leader if it gets the majority of the votes of all nodes initially in the cluster. While gathering votes, the node has the status of a candidate. In case multiple nodes assume candidate status, a split-vote scenario may occur, which is resolved by starting a fresh election until a candidate secures the majority vote.

    If there aren't enough reachable nodes to obtain a majority, a candidate can stay in the candidate state for an indefinite time. Otherwise, when a node votes for a candidate, it becomes a follower and stays a follower for that term, regardless of whether the candidate is elected or not.

    Additionally, the NSO node can also be in the stalled state, if HA Raft is enabled but the node has not joined a cluster.

    Node Names and Certificates

    Each node in an HA Raft cluster needs a unique name. Names are usually in the ncsd@ADDRESS format, where ADDRESS identifies a network host on which the NSO process is running, such as a fully qualified domain name (FQDN) or an IPv4 address.

    Other nodes in the cluster must be able to resolve and reach the ADDRESS, which creates a dependency on the DNS if you use domain names instead of IP addresses.

    Limitations of the underlying platform place a constraint on the format of ADDRESS: it can't be a simple short name (one without a dot), even if the system is able to resolve such a name through the hosts file or a similar mechanism.

    You specify the node address in the ncs.conf file as the value of node-address, under the listen container. You can also use the full node name (with the “@” character); however, that is usually unnecessary, as the system prepends ncsd@ as needed.

    Another aspect in which ADDRESS plays a role is authentication. The HA system uses mutual TLS to secure communication between cluster nodes. This requires you to configure a trusted Certificate Authority (CA) and a key/certificate pair for each node. When nodes connect, they check that the certificate of the peer validates against the CA and matches the ADDRESS of the peer.

    Consider that TLS not only verifies that the certificate/key pair comes from a trusted source (certificate is signed by a trusted CA), it also checks that the certificate matches the host you are connecting to. Host A may have a valid certificate and key, signed by a trusted CA, however, if the certificate is for another host, say host B, the authentication will fail.

    In most cases, this means the ADDRESS must appear in the node certificate's Subject Alternative Name (SAN) extension, as a dNSName value.

    Create and use a self-signed CA to secure the NSO HA Raft cluster. A self-signed CA is the only secure option. The CA should only be used to sign the certificates of the member nodes in one NSO HA Raft cluster. It is critical for security that the CA is not used to sign any other certificates. Any certificate signed by the CA can be used to gain complete control of the NSO HA Raft cluster.

    See the examples.ncs/high-availability/raft-cluster example for one way to set up a self-signed CA and provision individual node certificates. The example uses a shell script gen_tls_certs.sh that invokes the openssl command. Consult the Recipe for a Self-signed CA section below for using it independently of the example.

    Examples using separate containers for each HA Raft cluster member with NSO system installations, which use a variant of the gen_tls_certs.sh script, are referenced in the examples.ncs/development-guide/high-availability/hcc example in the NSO example set.

    When using an IP address instead of a DNS name for a node's ADDRESS, you must add the IP address to the certificate's dNSName SAN field (adding it only to the iPAddress field is insufficient). This is a known limitation in the current version.

    The following is an HA Raft configuration snippet for ncs.conf that includes certificate settings and a sample ADDRESS:
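A minimal sketch of what such a snippet might look like, based on the settings named in this section (enabled, cluster-name, node-address under listen, seed-nodes, and the mutual-TLS material). The certificate-related element names and file paths here are assumptions for illustration; verify them against your ncs.conf schema.

```xml
<ha-raft>
  <enabled>true</enabled>
  <!-- All nodes in the cluster must use the same cluster name -->
  <cluster-name>lower-west</cluster-name>
  <listen>
    <!-- ADDRESS: an FQDN (with at least one dot) or an IPv4 address -->
    <node-address>node1.example.org</node-address>
  </listen>
  <!-- Assumed element names for the CA certificate and node key pair -->
  <ssl>
    <ca-cert-file>${NCS_CONFIG_DIR}/raft/ssl/ca.crt</ca-cert-file>
    <cert-file>${NCS_CONFIG_DIR}/raft/ssl/host.crt</cert-file>
    <key-file>${NCS_CONFIG_DIR}/raft/ssl/host.key</key-file>
  </ssl>
  <seed-nodes>
    <seed-node>node1.example.org</seed-node>
    <seed-node>node2.example.org</seed-node>
  </seed-nodes>
</ha-raft>
```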

    Recipe for a Self-signed CA

    HA Raft uses a standard TLS protocol with public key cryptography for securing cross-node communication, where each node requires a separate public/private key pair and a corresponding certificate. Key and certificate management is a broad topic and is critical to the overall security of the system.

    The following text provides a recipe for generating certificates using a self-signed CA. It uses strong cryptography and algorithms that are deemed suitable for production use. However, it makes a few assumptions that may not be appropriate for all environments. Always consider how they affect your own deployment and consult a security professional if in doubt.

    The recipe makes the following assumptions:

    • You use a secured workstation or server to run these commands and handle the generated keys with care. In particular, you must copy the generated keys to NSO nodes in a secure fashion, such as using scp.

    • The CA is used solely for a single NSO HA Raft cluster, with certificates valid for 10 years, and provides no CRL. If a single key or host is compromised, a new CA and all key/certificate pairs must be recreated and reprovisioned in the cluster.

    • Keys and signatures based on ecdsa-with-sha384/P-384 are sufficiently secure for the vast majority of environments. However, if your organization has specific requirements, be sure to follow those.

    To use this recipe, first prepare a working environment on a secure host by creating a new directory and copying the gen_tls_certs.sh script from $NCS_DIR/examples.ncs/high-availability/raft-cluster into it. Additionally, ensure that the openssl command, version 1.1 or later, is available and that the system time is set correctly. Supposing that you have a cluster named lower-west, you might run:
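For instance (the directory name is illustrative):

```
> mkdir raft-ca-lower-west && cd raft-ca-lower-west
> cp $NCS_DIR/examples.ncs/high-availability/raft-cluster/gen_tls_certs.sh .
```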

    Including the cluster name in the directory name helps distinguish the certificates of one HA cluster from another, such as when using an LSA deployment in an HA configuration.

    The recipe relies on the gen_tls_certs.sh script to generate individual certificates. For clusters using FQDN node addresses, invoke the script with full hostnames of all the participating nodes. For example:
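For a three-node cluster, the invocation might look as follows (the hostnames are illustrative):

```
> ./gen_tls_certs.sh node1.example.org node2.example.org node3.example.org
```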

    Using only short hostnames, e.g. node1, will not work.

    If your HA cluster is using IP addresses instead, add the -a option to the command and list the IPs:
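For example (documentation addresses used for illustration):

```
> ./gen_tls_certs.sh -a 198.51.100.11 198.51.100.12 198.51.100.13
```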

    The script outputs the location of the relevant files and you should securely transfer each set of files to the corresponding NSO node. For each node, transfer only the three files: ca.crt, host.crt, and host.key.

    Once certificates are deployed, you can check their validity with the openssl verify command:
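On a node where the files were deployed under an ssl/certs directory, the check might look like `openssl verify -CAfile ssl/certs/ca.crt ssl/certs/node1.example.org.crt`. The following self-contained sketch illustrates the same verification flow end to end using a throwaway CA; it does not use gen_tls_certs.sh, and all names and parameters are illustrative:

```shell
# Work in a scratch directory
workdir=$(mktemp -d)
cd "$workdir"

# Throwaway CA key and self-signed CA certificate (ecdsa P-384, as in the recipe)
openssl ecparam -genkey -name secp384r1 -out ca.key
openssl req -x509 -new -key ca.key -sha384 -days 3650 -subj "/CN=scratch-ca" -out ca.crt

# Node key and CSR; the node ADDRESS goes into the dNSName SAN field
openssl ecparam -genkey -name secp384r1 -out host.key
openssl req -new -key host.key -subj "/CN=node1.example.org" -out host.csr
printf 'subjectAltName=DNS:node1.example.org\n' > san.ext
openssl x509 -req -in host.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -days 3650 -sha384 -extfile san.ext -out host.crt

# Validate the node certificate against the CA; prints "host.crt: OK" on success
openssl verify -CAfile ca.crt host.crt
```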

    This command takes into account the current time and can be used during troubleshooting. It can also display information contained in the certificate if you use the openssl x509 -text -in ssl/certs/node1.example.org.crt -noout variant. The latter form allows you to inspect the incorporated hostname/IP address and certificate validity dates.

    Actions

    NSO HA Raft can be controlled through several actions, all found under /ha-raft/. In the best-case scenario, you will only need the create-cluster action to initialize the cluster, plus the read-only and create-cluster actions when upgrading the NSO version. The remaining actions, such as adjust-membership, handover, reset, and disconnect, are described in the rest of this section.

    Network and ncs.conf Prerequisites

    In addition to the network connectivity required for the normal operation of a standalone NSO node, nodes in the HA Raft cluster must be able to initiate TCP connections from a random ephemeral client port to the following ports on other nodes:

    • Port 4369

    • Ports in the range 4370-4399 (configurable)

    You can change the ports in the second listed range from the default of 4370-4399. Use the min-port and max-port settings of the ha-raft/listen container.
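A sketch of the corresponding ncs.conf fragment, narrowing the range to ten ports. The element placement follows the ha-raft/listen container described in the text; verify the exact names against your ncs.conf schema.

```xml
<ha-raft>
  <listen>
    <node-address>node1.example.org</node-address>
    <!-- Restrict the inter-node communication ports to 4370-4379 -->
    <min-port>4370</min-port>
    <max-port>4379</max-port>
  </listen>
</ha-raft>
```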

    The Raft implementation does not impose any other hard limits on the network but you should keep in mind that consensus requires communication with other nodes in the cluster. A high round-trip latency between cluster nodes is likely to negatively impact the transaction throughput of the system.

    The HA Raft cluster also requires compatible ncs.conf files among the member nodes. In particular, /ncs-config/cdb/operational/enabled and /ncs-config/rollback/enabled values affect replication behavior and must match. Likewise, each member must have the same set of encryption keys and the keys cannot be changed while the cluster is in operation.

    To update the ncs.conf configuration, you must manually update the copy on each member node, making sure the new versions contain compatible values. Then perform the reload on the leader and the follower members will automatically reload their copies of the configuration file as well.

    If a node is a cluster member but has been configured with a new, incompatible ncs.conf file, it is automatically disabled; see the /ha-raft/status/disabled-reason leaf for the reason. You can re-enable the node with the ha-raft reset command once you have reconciled the incompatibilities.

    Connected Nodes and Node Discovery

    Raft has a notion of cluster configuration, in particular, how many and which members the cluster has. You define member nodes when you first initialize the cluster with the create-cluster command, or change them later with the adjust-membership command. Knowing the member nodes allows the cluster to determine how many nodes are required for consensus, among other things.

    However, not all cluster members may be reachable or alive all the time. The Raft implementation in NSO uses TCP connections between nodes to transport data. The TCP connections are authenticated and encrypted using TLS by default (see Security Considerations). A working connection between nodes is essential for the cluster to function, but a number of factors, such as firewall rules or expired/invalid certificates, can prevent the connection from establishing.

    Therefore, NSO distinguishes between configured member nodes and nodes to which it has established a working transport connection. The latter are called connected nodes. In a normal, fully working, and properly configured cluster, the connected nodes will be the same as member nodes (except for the current node).

    To help troubleshoot connectivity issues without affecting cluster operation, the connected nodes list also shows nodes that are not actively participating in the cluster but have established a transport connection to nodes in it. The optional discovery mechanism, described next, relies on this functionality.

    NSO includes a mechanism that simplifies the initial cluster setup by enumerating known nodes. This mechanism uses a set of seed nodes to discover all connectable nodes, which can then be used with the create-cluster command to form a Raft cluster.

    When you specify one or more nodes with the /ha-raft/seed-nodes/seed-node setting in the ncs.conf file, the current node tries to establish a connection to these seed nodes, in order to discover the list of all nodes potentially participating in the cluster. For the discovery to work properly, all other nodes must also use seed nodes and the set of seed nodes must overlap. The recommended practice is to use the same set of seed nodes on every participating node.

    Along with providing an autocompletion list for the create-cluster command, this feature streamlines the discovery of node names when using NSO in containerized or other dynamic environments, where node addresses are not known in advance.

    Initial Cluster Setup

    Creating a new HA cluster consists of two parts: configuring the individual nodes and running the create-cluster action.

    First, you must update the ncs.conf configuration file for each node. All HA Raft configuration comes under the /ncs-config/ha-raft element.

    As part of the configuration, you must:

    • Enable HA Raft functionality through the enabled leaf.

    • Set node-address and the corresponding TLS parameters (see Node Names and Certificates).

    • Identify the cluster this node belongs to with cluster-name.

    The cluster name is simply a character string that uniquely identifies this HA cluster. The nodes in the cluster must use the same cluster name or they will refuse to establish a connection. This setting helps prevent mistakenly adding a node to the wrong cluster when multiple clusters are in operation, such as in an LSA setup.

    With all the nodes configured and running, connect to the node that you would like to become the initial leader and invoke the ha-raft create-cluster action. The action takes a list of nodes identified by their names. If you have configured seed-nodes, you will get auto-completion support, otherwise, you have to type in the names of the nodes yourself.

    This action makes the current node a cluster leader and joins the other specified nodes to the newly created cluster. For example:
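A session on the designated initial leader might look like this; the node names are illustrative, and the exact parameter name for the member list is an assumption, so rely on your CLI's completion:

```
admin@ncs# ha-raft create-cluster member [ ncsd@node2.example.org ncsd@node3.example.org ]
```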

    You can use the show ha-raft command on any node to inspect the status of the HA Raft cluster. The output includes the current cluster leader and members according to this node, as well as information about the local node, such as node name (local-node) and role. The status/connected-node list contains the names of the nodes with which this node has active network connections.

    In case you get an error, such as “Error: NSO can't reach member node 'ncsd@ADDRESS'.”, please verify all of the following:

    • The node at the ADDRESS is reachable. You can use the ping ADDRESS command, for example.

    • The problematic node has the correct ncs.conf configuration, especially cluster-name and node-address. The latter should match the ADDRESS and should contain at least one dot.

    In addition to the above, you may also examine the logs/raft.log file for detailed information on the error message and overall operation of the Raft algorithm. The amount of information in the file is controlled by the /ncs-config/logs/raft-log configuration in the ncs.conf.

    Cluster Management

    After the initial cluster setup, you can add new nodes or remove existing nodes from the cluster with the help of the ha-raft adjust-membership action. For example:
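Illustrative invocations are shown below; the node names are examples and the exact parameter syntax is an assumption, so check your CLI's completion:

```
admin@ncs# ha-raft adjust-membership add-node node ncsd@node4.example.org
admin@ncs# ha-raft adjust-membership remove-node node ncsd@node2.example.org
```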

    When removing a node using the ha-raft adjust-membership remove-node command, the removed node is not made aware of its removal and continues signaling the other nodes. This is a limitation of the algorithm, which must also handle situations where the removed node is down or unreachable. To prevent further communication with the cluster, it is important that you ensure the removed node is shut down, either just before removing it from the cluster or immediately after. The former is recommended, but the latter is required when only two nodes are left in the cluster, since shutting the node down prior to removal would prevent the cluster from reaching consensus.

    Additionally, you can force an existing follower node to perform a full re-sync from the leader by invoking the ha-raft reset action with the force option. Using this action on the leader will make the node give up the leader role and perform a sync with the newly elected leader.

    As leader selection during the Raft election is not deterministic, NSO provides the ha-raft handover action, which allows you to either trigger a new election if called with no arguments or transfer leadership to a specific node. The latter is especially useful when, for example, one of the nodes resides in a different location and more traffic between locations may incur extra costs or additional latency, so you prefer this node is not the leader under normal conditions.
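For instance, to transfer leadership to a specific node (the node name and parameter syntax are illustrative):

```
admin@ncs# ha-raft handover node ncsd@node3.example.org
```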

    Migrating From Existing Rule-based HA

    If you have an existing HA cluster using the rule-based built-in HA, you can migrate it to use HA Raft instead. This procedure is performed in four distinct high-level steps:

    • Ensuring the existing cluster meets migration prerequisites.

    • Preparing the required HA Raft configuration files.

    • Switching to HA Raft.

    • Adding additional nodes to the cluster.

    The procedure does not perform an NSO version upgrade, so the cluster remains on the same version. It also does not perform any schema upgrades, it only changes the type of the HA cluster.

    The migration procedure is performed in place; that is, the existing nodes are disconnected from the old cluster and connected to the new one. This results in a temporary disruption of service, so it should be performed during a service window.

    First, you should ensure the cluster meets migration prerequisites. The cluster must use:

    • NSO 6.1.2 or later

    • tailf-hcc 6.0 or later (if used)

    In case these prerequisites are not met, follow the standard upgrade procedures to upgrade the existing cluster to supported versions first.

    Additionally, ensure that all used packages are compatible with HA Raft, as NSO uses some new or updated notifications about HA state changes. Also, verify the network supports the new cluster communications (see Network and ncs.conf Prerequisites).

    Secondly, prepare all the ncs.conf and related files for each node, such as certificates and keys. Create a copy of all the ncs.conf files and disable or remove the existing <ha> section in the copies. Then add the required HA Raft configuration items to the copies, as described in Initial Cluster Setup and Node Names and Certificates. Do not update the ncs.conf files used by the nodes yet.

    It is recommended but not necessary that you set the seed nodes in ncs.conf to the designated primary and fail-over primary. Do this for all ncs.conf files for all nodes.

    Procedure 1. Migration to HA Raft

    1. With the new configurations at hand and verified, start the switch to HA Raft. The cluster nodes should be in their nominal, designated roles. If not, perform a failover first.

    2. On the designated (actual) primary, called node1, enable read-only mode.

    3. Then take a backup of all nodes.

    4. Once the backup successfully completes, stop the designated fail-over primary (actual secondary) NSO process, update its

    Security Considerations

    Communication between the NSO nodes in an HA Raft cluster takes place over Distributed Erlang, an RPC protocol transported over TLS (unless explicitly disabled by setting /ncs-config/ha-raft/ssl/enabled to 'false').

    TLS (Transport Layer Security) provides authentication and privacy by only allowing NSO nodes to connect using certificates and keys issued by the same Certificate Authority (CA). Distributed Erlang is transported over TLS 1.2. Access for a host can be revoked by the CA by means of a CRL (Certificate Revocation List). To enforce certificate revocation within an HA Raft cluster, invoke the /ha-raft/disconnect action to terminate the pre-existing connection. A connection to the node can be re-established once the node's certificate is valid again.

    Please ensure the CA key is kept in a safe place since it can be used to generate new certificates and key pairs for peers.

    Distributed Erlang supports multiple NSO nodes running on the same host; node addresses are resolved by the epmd (Erlang Port Mapper Daemon) service. Once resolved, the NSO nodes communicate directly.

    The ports that epmd and the NSO nodes listen on are listed in Network and ncs.conf Prerequisites above. epmd binds to the wildcard IPv4 address 0.0.0.0 and the IPv6 address ::.

    If epmd is exposed to a DoS attack, the HA Raft members may be unable to resolve addresses, and communication could be disrupted. Please ensure traffic on these ports is only accepted between the HA Raft members, by using firewall rules or other means.

    Two NSO nodes can only establish a connection if a shared secret "cookie" matches. The cookie is configured from /ncs-config/ha-raft/cluster-name. Please note that the cookie is not a security feature, but a way to isolate HA Raft clusters and avoid accidental misuse.

    Packages Upgrades in Raft Cluster

    NSO contains a mechanism for distributing packages to nodes in a Raft cluster, greatly simplifying package management in a highly available setup.

    You perform all package management operations on the current leader node. To identify the leader node, you can use the show ha-raft status leader command on a running cluster.

    Invoking the packages reload command makes the leader node update its currently loaded packages, just as in a non-HA, single-node setup. At the same time, the leader also distributes these packages to the followers to load. However, the load paths on the follower nodes, such as /var/opt/ncs/packages/, are not updated. This means that if a leader election took place, a different leader was elected, and you performed another packages reload, the system would try to load the versions of the packages found on this other leader, which may be out of date or not even present.

    The recommended approach is therefore to use the packages ha sync and-reload command instead, unless a load path is shared between NSO nodes, such as on the same network drive. This command distributes packages to the follower nodes and updates their load paths, in addition to loading them.

    For the full procedure, first, ensure all cluster nodes are up and operational, then follow these steps on the leader node:

    • Perform a full backup of the NSO instance, such as running ncs-backup.

    • Add, replace, or remove packages on the filesystem. The exact location depends on the type of NSO deployment, for example /var/opt/ncs/packages/.

    • Invoke the packages ha sync and-reload or packages ha sync and-add command to start the upgrade process.

    Note that while the upgrade is in progress, writes to the CDB are not allowed and will be rejected.
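    As an illustrative sketch only, assuming a System Install with packages under /var/opt/ncs/packages/ (the package file name is hypothetical), the steps above might look like this on the leader:

    ```
    $ ncs-backup
    $ cp router-nc-1.0.tar.gz /var/opt/ncs/packages/
    $ ncs_cli -C -u admin
    admin@ncs# packages ha sync and-reload
    ```

    Once the command completes, the packages are loaded on the leader and distributed to the load paths of all followers.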

    For a packages ha sync and-reload example see the raft-upgrade-l2 NSO system installation-based example referenced by the examples.ncs/development-guide/high-availability/hcc example in the NSO example set.

    For more details, troubleshooting, and general upgrade recommendations, see and .

    Version Upgrade of Cluster Nodes

    Currently, the only supported and safe way of upgrading the NSO version of a Raft HA cluster requires that the cluster be taken offline, since the nodes must at all times run the same software version.

    Do not attempt an upgrade unless all cluster member nodes are up and actively participating in the cluster. Verify the current cluster state with the show ha-raft status command. All member nodes must also be present in the connected-node list.

    The procedure differentiates between the current leader node versus followers. To identify the leader, you can use the show ha-raft status leader command on a running cluster.

    Procedure 2. Cluster version upgrade

    1. On the leader, first enable read-only mode using the ha-raft read-only mode true command and then verify that all cluster nodes are in sync with the show ha-raft status log replications state command.

    2. Before embarking on the upgrade procedure, it is imperative to back up each node. This ensures that you have a safety net in case of unforeseen issues. For example, you can use the $NCS_DIR/bin/ncs-backup command.

    3. Delete the $NCS_RUN_DIR/cdb/compact.lock file.

    For a standard System Install, the single-node procedure is described in , but in general depends on the NSO deployment type. For example, it will be different for containerized environments. For specifics, please refer to the documentation for the deployment type.

    For an example see the raft-upgrade-l2 NSO system installation-based example referenced by the examples.ncs/development-guide/high-availability/hcc example in the NSO example set.

    If the upgrade fails before or during the upgrade of the original leader, start up the original followers to restore service and then restore the original leader, using backup as necessary.

    However, if the upgrade fails after the original leader was successfully upgraded, you should still be able to complete the cluster upgrade. If you are unable to upgrade a follower node, you may provision a (fresh) replacement and the data and packages in use will be copied from the leader.

    NSO Rule-based HA

    NSO can manage the HA groups based on a set of predefined rules. This functionality was added in NSO 5.4 and is sometimes referred to simply as the built-in HA. However, since NSO 6.1, HA Raft (which is also built-in) is available as well, and is likely a better choice in most situations.

    Rule-based HA allows administrators to:

    • Configure HA group members with IP addresses and default roles

    • Configure failover behavior

    • Configure start-up behavior

    NSO rule-based HA is defined in tailf-ncs-high-availability.yang, with data residing under the /high-availability/ container.

    In environments with high NETCONF traffic, particularly when using ncs_device_notifs, it's recommended to enable read-only mode on the designated primary node before performing HA activation or sync. This prevents app_sync from being blocked by notification processing.

    Use the following command prior to enabling HA or assigning roles:

    After successful sync and HA establishment, disable read-only mode:
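    Assuming the C-style CLI, these two steps can be sketched using the /high-availability/read-only action with its mode parameter: the first command before enabling HA or assigning roles, the second after successful sync and HA establishment.

    ```
    admin@ncs# high-availability read-only mode true
    admin@ncs# high-availability read-only mode false
    ```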

    NSO rule-based HA does not manage any virtual IP addresses or advertise any BGP routes or similar. This must be handled by an external package. Tail-f HCC 5.x and greater provides this functionality, compatible with NSO rule-based HA. You can read more about the HCC package in the .

    Prerequisites

    To use NSO rule-based HA, HA must first be enabled in ncs.conf - See .

    If the tailf-hcc package with a version earlier than 5.0 is loaded, NSO rule-based HA will not function. These older HCC versions may still be used, but NSO built-in HA will not function in parallel with them.

    HA Member Configuration

    All HA group members are defined under /high-availability/ha-node. Each configured node must have a unique IP address configured and a unique HA ID. Additionally, nominal roles and fail-over settings may be configured on a per-node basis.

    The HA node ID is a unique identifier used to identify NSO instances in an HA group. The HA ID of the local node (relevant, among other things, when an action is called) is determined by matching the configured HA node IP addresses against the IP addresses assigned to the host machine of the NSO instance. As the HA ID is crucial to NSO HA, NSO rule-based HA will not function if the local node cannot be identified.

    To join an HA group, a shared secret must be configured on the active primary and any prospective secondary. It is used for a CHAP-2-like authentication and is specified under /high-availability/token/.

    In an NSO System Install setup, not only does the shared token need to match between the HA group nodes, but the configuration for encrypted strings, by default stored in /etc/ncs/ncs.crypto_keys, must also match between the nodes in the HA group.

    The token configured on the secondary node is overwritten with the encrypted token of type aes-256-cfb-128-encrypted-string from the primary node when the secondary node connects to the primary. If there is a mismatch between the encrypted-string configuration on the nodes, NSO will not decrypt the HA token to match the token presented. As a result, the primary node denies the secondary node access the next time the HA connection needs to be reestablished, with a "Token mismatch, secondary is not allowed" error.

    See the upgrade-l2 example, referenced from examples.ncs/development-guide/high-availability/hcc, for an example setup and the for a description of the example.

    Also, note that the ncs.crypto_keys file is highly sensitive. The file contains the encryption keys for all CDB data that is encrypted on disk. Besides the HA token, this often includes passwords for various entities, such as login credentials to managed devices.

    HA Roles

    NSO can assume HA roles primary, secondary and none. Roles can be assigned directly through actions, or at startup or failover. See for the definition of these roles.

    NSO rule-based HA does not support relay-secondaries.

    NSO rule-based HA distinguishes between the concepts of nominal role and assigned role. The nominal role is configuration data that applies when an NSO instance starts up and at failover. The assigned role is the role that the NSO instance has been ordered to assume, either by an action or as a result of startup or failover.

    Failover

    Failover may occur when a secondary node loses the connection to the primary node. A secondary may then take over the primary role. Failover behavior is configurable and controlled by the parameters:

    • /high-availability/ha-node{id}/failover-primary

    • /high-availability/settings/enable-failover

    For automatic failover to function, /high-availability/settings/enable-failover must be set to true. It is then possible to designate at most one node with nominal role secondary as failover-primary, by setting the parameter /high-availability/ha-node{id}/failover-primary. The failover works in both directions: if a nominal primary is currently connected to the failover-primary as a secondary and loses the connection, it will attempt to take over as primary.
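    As a configuration sketch from the C-style CLI, designating one nominal secondary as failover-primary and enabling automatic failover (the node ID paris is illustrative):

    ```
    admin@ncs(config)# high-availability settings enable-failover true
    admin@ncs(config)# high-availability ha-node paris failover-primary true
    admin@ncs(config)# commit
    ```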

    Before failover happens, a failover-primary-enabled secondary node may attempt to reconnect to the previous primary before assuming the primary role. This behavior is configured by the parameters below, denoting how many reconnect attempts will be made and with which interval, respectively:

    • /high-availability/settings/reconnect-attempts

    • /high-availability/settings/reconnect-interval

    HA Members that are assigned as secondaries, but are neither failover-primaries nor set with a nominal-role primary, may attempt to rejoin the HA group after losing connection to the primary.

    This is controlled by /high-availability/settings/reconnect-secondaries. If this is true, secondary nodes will query the nodes configured under /high-availability/ha-node for an NSO instance that currently has the primary role. Any configured nominal roles will not be considered. If no primary node is found, subsequent attempts to rejoin the HA setup will be issued with an interval defined by /high-availability/settings/reconnect-interval.

    In case a net-split provokes a failover, it is possible to end up in a situation with two primaries, both nodes accepting writes. The primaries are then not synchronized and end up in a split-brain situation. Once one of the primaries joins the other as a secondary, the HA cluster is once again consistent, but any out-of-sync changes are overwritten.

    To prevent split-brain from occurring, NSO 5.7 and later come with a rule-based consensus algorithm. The algorithm is enabled by default; it can be disabled or changed through the parameters:

    • /high-availability/settings/consensus/enabled [true]

    • /high-availability/settings/consensus/algorithm [ncs:rule-based]

    The rule-based algorithm can be used in either of the two HA constellations:

    • Two nodes: one nominal primary and one nominal secondary configured as failover-primary.

    • Three nodes: one nominal primary, one nominal secondary configured as failover-primary, and one perpetual secondary.

    On failover:

    • Failover-primary: become primary but enable read-only mode. Once the secondary joins, disable read-only.

    • Nominal primary: on loss of all secondaries, change role to none. If one secondary node is connected, stay primary.

    In certain cases, the rule-based consensus algorithm results in nodes being disconnected without automatically rejoining the HA cluster, such as in the example above where the nominal primary becomes none on the loss of all secondaries.

    To restore the HA cluster one may need to manually invoke the /high-availability/be-secondary-to action.

    In the case where the failover-primary takes over as primary, it enables read-only mode; if no secondary connects, it remains read-only. This is done to guarantee consistency.

    In a three-node cluster, when the nominal primary takes over as actual primary, it first enables read-only mode and stays in read-only mode until a secondary connects. This is done to guarantee consistency.

    Read-write mode can be manually re-enabled through the /high-availability/read-only action, with the mode parameter set to false.

    When any node loses connection, this can also be observed in high-availability alarms as either a ha-primary-down or a ha-secondary-down alarm.

    Startup

    Startup behavior is defined by a combination of the parameters /high-availability/settings/start-up/assume-nominal-role and /high-availability/settings/start-up/join-ha as well as the node's nominal role:

    Actions

    NSO rule-based HA can be controlled through several actions. All actions are found under /high-availability/. The available actions are listed below:

    Status Check

    The current state of NSO rule-based HA can be monitored by observing /high-availability/status/. It provides information about the currently active HA mode and the currently assigned role. For nodes with active mode primary, a list of connected nodes and their source IP addresses is shown. For nodes with assigned role secondary, the latest result of the be-secondary operation is listed. All NSO rule-based HA status information is non-replicated operational data; the results will differ between nodes connected in an HA setup.

    Tail-f HCC Package

    The Tail-f HCC package extends the built-in HA functionality by providing virtual IP addresses (VIPs) that can be used to connect to the NSO HA group primary node. HCC ensures that the VIP addresses are always bound by the HA group primary and never bound by a secondary. Each time a node transitions between primary and secondary states HCC reacts by binding (primary) or unbinding (secondary) the VIP addresses.

    HCC manages IP addresses at the link layer (OSI layer 2) for Ethernet interfaces, and optionally, also at the network layer (OSI layer 3) using BGP router advertisements. The layer-2 and layer-3 functions are mostly independent and this document describes the details of each one separately. However, the layer-3 function builds on top of the layer-2 function. The layer-2 function is always necessary, otherwise, the Linux kernel on the primary node would not recognize the VIP address or accept traffic directed to it.

    Tail-f HCC version 5.x is not backward compatible with previous versions of Tail-f HCC and requires functionality provided by NSO version 5.4 and greater. For more details, see the .

    Dependencies

    Both the HCC layer-2 VIP and layer-3 BGP functionality depend on iproute2 utilities and awk. An optional dependency is arping (either from iputils or Thomas Habets arping implementation) which allows HCC to announce the VIP to MAC mapping to all nodes in the network by sending gratuitous ARP requests.

    The HCC layer-3 BGP functionality depends on the GoBGP daemon version 2.x being installed on each NSO host that is configured to run HCC in BGP mode.

    GoBGP is open-source software originally developed by NTT Communications and released under the Apache License 2.0. GoBGP can be obtained directly from https://osrg.github.io/gobgp/ and is also packaged for mainstream Linux distributions.

    The HCC layer-3 DNS Update functionality depends on the command line utility nsupdate.

    Tool dependencies are listed below:

    Tool
    Package
    Required
    Description

    As with the built-in HA functionality, all NSO instances must be configured to run in HA mode. See the on how to enable HA on NSO instances.

    Running the HCC Package with NSO as a Non-Root User

    GoBGP uses TCP port 179 for its communications and binds to it at startup. As port 179 is a privileged port, gobgpd normally needs to run as root.

    When NSO is running as a non-root user, the gobgpd command is executed as the same user as NSO, which prevents gobgpd from binding to port 179.

    There are multiple ways of handling this; two are listed here:

    1. Set the CAP_NET_BIND_SERVICE capability on the gobgpd file. This may not be supported by all Linux distributions.

    2. Set the owner to root and set the setuid bit on the gobgpd file. This works on all Linux distributions.

    Tail-f HCC Compared with HCC Version 4.x and Older

    HA Group Management Decisions

    Tail-f HCC 5.x or later does not participate in decisions on which NSO node is primary or secondary. These decisions are taken by NSO's built-in HA and then pushed as notifications to HCC. The NSO built-in HA functionality is available starting with NSO version 5.4; older NSO versions are not compatible with HCC 5.x or later.

    Embedded BGP Daemon

    HCC 5.x or later operates a GoBGP daemon as a subprocess completely managed by NSO. The old HCC function pack interacted with an external Quagga BGP daemon using a NED interface.

    Automatic Interface Assignment

    HCC 5.x or later automatically associates VIP addresses with Linux network interfaces using the ip utility from the iproute2 package. VIP addresses are also treated as /32 without defining a new subnet. The old HCC function pack used explicit configuration to associate VIPs with existing addresses on each NSO host and define IP subnets for VIP addresses.

    Upgrading

    Since version 5.0, HCC relies on the NSO built-in HA for cluster management and only performs address or route management in reaction to cluster changes. Therefore, no special measures are necessary if using HCC when performing an NSO version upgrade or a package upgrade. Instead, you should follow the standard best practice HA upgrade procedure from .

    A reference to upgrade examples can be found in the NSO example set under examples.ncs/development-guide/high-availability/hcc/README.

    Layer-2

    The purpose of the HCC layer-2 functionality is to ensure that the configured VIP addresses are bound in the Linux kernel of the NSO primary node only. This ensures that the primary node (and only the primary node) will accept traffic directed toward the VIP addresses.

    HCC also notifies the local layer-2 network when VIP addresses are bound by sending Gratuitous ARP (GARP) packets. Upon receiving the Gratuitous ARP, all the nodes in the network update their ARP tables with the new mapping so they can continue to send traffic to the non-failed, now primary node.

    Operational Details

    HCC binds the VIP addresses as additional (alias) addresses on existing Linux network interfaces (e.g. eth0). The network interface for each VIP is chosen automatically by performing a kernel routing lookup on the VIP address. That is, the VIP will automatically be associated with the same network interface that the Linux kernel chooses to send traffic to the VIP.

    This means that you can map each VIP onto a particular interface by defining a route for a subnet that includes the VIP. If no such specific route exists the VIP will automatically be mapped onto the interface of the default gateway.

    To check which interface HCC will choose for a particular VIP address, perform a kernel route lookup for the VIP and look at the device (dev) in the output, for example eth0:
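    For example, for a hypothetical VIP address 192.168.123.22, the iproute2 route lookup and an illustrative output might look like:

    ```
    $ ip route get 192.168.123.22
    192.168.123.22 via 192.168.1.1 dev eth0 src 192.168.1.10
    ```

    In this case, HCC would bind the VIP as an alias address on eth0.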

    Configuration

    The layer-2 functionality is configured by providing a list of IPv4 and/or IPv6 VIP addresses and enabling HCC. The VIP configuration parameters are found under /hcc:hcc.

    Global Layer-2 Configuration:

    Parameters
    Type
    Description

    Example Configuration
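    A minimal configuration sketch from the C-style CLI (the VIP address is illustrative; the enabled and vip-address leaf names are based on the tailf-hcc package and should be verified against tailf-hcc.yang):

    ```
    admin@ncs(config)# hcc enabled
    admin@ncs(config)# hcc vip-address [ 192.168.123.22 ]
    admin@ncs(config)# commit
    ```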

    Layer-3 BGP

    The purpose of the HCC layer-3 BGP functionality is to operate a BGP daemon on each NSO node and to ensure that routes for the VIP addresses are advertised by the BGP daemon on the primary node only.

    The layer-3 functionality is an optional add-on to the layer-2 functionality. When enabled, the set of BGP neighbors must be configured separately for each NSO node. Each NSO node operates an embedded BGP daemon and maintains connections to peers but only the primary node announces the VIP addresses.

    The layer-3 functionality relies on the layer-2 functionality to assign the virtual IP addresses to one of the host's interfaces. One notable difference in assigning virtual IP addresses when operating in Layer-3 mode is that the virtual IP addresses are assigned to the loopback interface lo rather than to a specific physical interface.

    Operational Details

    HCC operates a subprocess as an embedded BGP daemon. The BGP daemon is started, configured, and monitored by HCC. The HCC YANG model includes basic BGP configuration data and state data.

    Operational data in the YANG model includes the state of the BGP daemon subprocess and the state of each BGP neighbor connection. The BGP daemon writes log messages directly to NSO where the HCC module extracts updated operational data and then repeats the BGP daemon log messages into the HCC log verbatim. You can find these log messages in the developer log (devel.log).

    GoBGP must be installed separately. The gobgp and gobgpd binaries must be found in the paths specified by the $PATH environment variable. For a System Install, NSO reads $PATH from the systemd service file /etc/systemd/system/ncs.service. Since tailf-hcc 6.0.2, the path to gobgp/gobgpd can no longer be specified through the configuration data leaf /hcc/bgp/node/gobgp-bin-dir.

    Configuration

    The layer-3 BGP functionality is configured as a list of BGP configurations with one list entry per node. Configurations are separate because each NSO node usually has different BGP neighbors with their own IP addresses, authentication parameters, etc.

    The BGP configuration parameters are found under /hcc:hcc/bgp/node{id}.

    Per-Node Layer-3 Configuration:

    Parameters
    Type
    Description

    Each NSO node can connect to a different set of BGP neighbors. For each node, the BGP neighbor list configuration parameters are found under /hcc:hcc/bgp/node{id}/neighbor{address}.

    Per-Neighbor BGP Configuration:

    Parameters
    Type
    Description

    Example
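    As an illustrative sketch only (the node ID, addresses, and AS numbers are hypothetical, and leaf names such as as and router-id are assumptions to be verified against tailf-hcc.yang; only node{id} and neighbor{address} are confirmed by this section):

    ```
    admin@ncs(config)# hcc bgp node paris enabled as 64512 router-id 192.168.31.1
    admin@ncs(config)# hcc bgp node paris neighbor 192.168.31.2 as 64514
    admin@ncs(config)# commit
    ```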

    Layer-3 DNS Update

    The purpose of the HCC layer-3 DNS Update functionality is to notify a DNS server of the IP address change of the active primary NSO server, allowing the DNS server to update the DNS record for the given domain name.

    A geographically redundant NSO setup typically relies on DNS support. To enable this use case, tailf-hcc can dynamically update DNS through the nsupdate utility on an HA status change notification.

    The DNS server used should support updates through the nsupdate command (RFC 2136).

    Operational Details

    HCC listens on the underlying NSO HA notification stream. When HCC receives a notification that an NSO node has become primary, it updates the DNS server with the IP address of the primary NSO node for the given hostname. The HCC YANG model includes basic DNS configuration data and operational status data.

    Operational data in the YANG model includes the result of the latest DNS update operation.

    If the DNS Update is unsuccessful, an error message will be populated in operational data, for example:

    The DNS Server must be installed and configured separately, and details are provided to HCC as configuration data. The DNS Server must be configured to update the reverse DNS record.

    Configuration

    The layer-3 DNS Update functionality needs DNS-related information, such as the DNS server IP address, port, zone, etc., and information about the NSO nodes involved in HA: node, IP, and location.

    The DNS configuration parameters are found under /hcc:hcc/dns.

    Layer-3 DNS Configuration:

    Parameters
    Type
    Description

    Each NSO node can be placed in a separate Location/Site/Availability-Zone. This is configured as a list member configuration, with one list entry per node ID. The member list configuration parameters are found under /hcc:hcc/dns/member{node-id}.

    Parameter
    Type
    Description

    Example

    Here is an example configuration for a setup of two dual-stack NSO nodes, node-1 and node-2, that have an IPv4 and an IPv6 address configured. The configuration also sets up an update signing with the specified key.
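    As an illustrative sketch only (addresses, locations, and all leaf names such as server-address, zone, and ip-address are assumptions derived from the parameters named in this section, not confirmed tailf-hcc leaf names; consult tailf-hcc.yang for the actual model):

    ```
    admin@ncs(config)# hcc dns enabled
    admin@ncs(config)# hcc dns server-address 10.0.0.1 server-port 53
    admin@ncs(config)# hcc dns zone example.com
    admin@ncs(config)# hcc dns member node-1 ip-address [ 10.0.0.11 2001:db8::11 ] location paris
    admin@ncs(config)# hcc dns member node-2 ip-address [ 10.0.0.12 2001:db8::12 ] location london
    admin@ncs(config)# commit
    ```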

    Usage

    This section describes basic deployment scenarios for HCC. Layer-2 mode is demonstrated first and then the layer-3 BGP functionality is configured in addition:

    A reference to container-based examples for the layer-2 and layer-3 deployment scenarios described here can be found in the NSO example set under examples.ncs/development-guide/high-availability/hcc.

    Both scenarios consist of two test nodes, london and paris, with a single IPv4 VIP address. For the layer-2 scenario, the nodes are on the same network. The layer-3 scenario also involves a BGP-enabled router node, as the london and paris nodes are on two different networks.

    Layer-2 Deployment

    The layer-2 operation is configured by simply defining the VIP addresses and enabling HCC. The HCC configuration on both nodes should match; otherwise, the primary node's configuration will overwrite the secondary node's configuration when the secondary connects to the primary node.

    Addresses:

    Hostname
    Address
    Role

    Configuring VIPs:

    Verifying VIP Availability:

    Once enabled, HCC on the HA group primary node will automatically assign the VIP addresses to corresponding Linux network interfaces.

    On the secondary node, HCC will not configure these addresses.

    Layer-2 Example Implementation:

    A reference to a container-based example of the layer-2 scenario can be found in the NSO example set. See the examples.ncs/development-guide/high-availability/hcc/README

    Enabling Layer-3 BGP

    Layer-3 operation is configured for each NSO HA group node separately. The HCC configuration on both nodes should match; otherwise, the primary node's configuration will overwrite the configuration on the secondary node.

    Addresses:

    Hostname
    Address
    AS
    Role

    Configuring BGP for Paris Node:

    Configuring BGP for London Node:

    Check BGP Neighbor Connectivity:

    Check neighbor connectivity on the paris primary node. Note that its connection to neighbor 192.168.31.2 (router) is ESTABLISHED.

    Check neighbor connectivity on the london secondary node. Note that the primary node also has an ESTABLISHED connection to its neighbor 192.168.30.2 (router). The primary and secondary nodes both maintain their BGP neighbor connections at all times when BGP is enabled, but only the primary node announces routes for the VIPs.

    Check Advertised BGP Routes Neighbors:

    Check the BGP routes received by the router.

    The VIP subnet is routed to the paris host, which is the primary node.

    Layer-3 BGP Example Implementation:

    A reference to a container-based example of the combined layer-2 and layer-3 BGP scenario can be found in the NSO example set. See the examples.ncs/development-guide/high-availability/hcc/README

    Enabling Layer-3 DNS

    If enabled prior to the HA being established, HCC will update the DNS server with the IP address of the Primary node once a primary is selected.

    If an HA cluster is already operational and layer-3 DNS is enabled and configured afterward, HCC will not update the DNS server automatically; an automatic DNS server update only happens on an HA switchover. HCC exposes an update action to manually trigger an update of the DNS server with the IP address of the primary node.

    DNS Update Action:

    The user can explicitly update DNS from the specific NSO node by running the update action.

    Check the result of invoking the DNS update utility using the operational data in /hcc/dns:

    One way to verify DNS server updates is through the nslookup program. However, be mindful of the DNS caching mechanism, which may cache the old value for the amount of time controlled by the TTL setting.

    DNS get-node-location Action:

    /hcc/dns/member holds the information about all members involved in HA. The get-node-location action provides information on the location of an NSO node.

    Data Model

    The HCC data model can be found in the HCC package (tailf-hcc.yang).

    Setup with an External Load Balancer

    As an alternative to the HCC package, NSO built-in HA, either rule-based or HA Raft, can also be used in conjunction with a load balancer device in a reverse proxy configuration. Instead of managing the virtual IP address directly as HCC does, this setup relies on an external load balancer to route traffic to the currently active primary node.

    The load balancer uses HTTP health checks to determine which node is currently the active primary. The example, found in the examples.ncs/development-guide/high-availability/load-balancer directory, uses HTTP status codes on the health check endpoint to easily distinguish whether the node is currently primary or not.

    In the example, the freely available HAProxy software is used as a load balancer to demonstrate the functionality. It is configured to steer connections made to localhost TCP port 2024 (SSH CLI) and TCP port 8080 (web UI and RESTCONF) to the active node in a 2-node HA cluster. The HAProxy software is required if you wish to run this example yourself.

    You can start all the components in the example by running the make build start command. At the beginning, the first node n1 is the active primary. Connecting to the localhost port 2024 will establish a connection to this node:
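    For example (the prompt shown is illustrative of connecting to node n1):

    ```
    $ ssh -p 2024 admin@localhost
    admin@n1#
    ```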

    Then, you can disable the high availability subsystem on n1 to simulate a node failure.

    Disconnect and wait a few seconds for the built-in HA to perform the failover to node n2. The time depends on the high-availability/settings/reconnect-interval and is set quite aggressively in this example to make the failover in about 6 seconds. Reconnect with the SSH client and observe the connection is now made to the fail-over node which has become the active primary:

    Finally, shut down the example with the make stop clean command.

    NB Listens to Addresses on HA Primary for Load Balancers

    NSO can be configured for the HA primary to listen on additional ports for the northbound interfaces NETCONF, RESTCONF, the web server (including JSON-RPC), and the CLI over SSH. Once a different node transitions to the primary role, the configured listen addresses are brought up on that node instead.

    When the following configuration is added to ncs.conf, the primary HA node will listen(2) and bind(2) on port 1830 on the wildcard IPv4 and IPv6 addresses.
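    A sketch of such an ncs.conf fragment for the NETCONF interface (values are illustrative; the ha-primary-listen list sits under the SSH transport of /ncs-config/netconf-north-bound):

    ```xml
    <netconf-north-bound>
      <transport>
        <ssh>
          <enabled>true</enabled>
          <ip>127.0.0.1</ip>
          <port>830</port>
          <ha-primary-listen>
            <ip>0.0.0.0</ip>
            <port>1830</port>
          </ha-primary-listen>
          <ha-primary-listen>
            <ip>::</ip>
            <port>1830</port>
          </ha-primary-listen>
        </ssh>
      </transport>
    </netconf-north-bound>
    ```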

    A similar configuration can be added for other NB interfaces, see the ha-primary-listen list under /ncs-config/{restconf,webui,cli}.

    HA Framework Requirements

    If an external HAFW is used, NSO only replicates the CDB data. NSO must be told by the HAFW which node should be primary and which nodes should be secondaries.

    The HA framework must also detect when nodes fail and instruct NSO accordingly. If the primary node fails, the HAFW must elect one of the remaining secondaries and appoint it the new primary. The remaining secondaries must also be informed by the HAFW about the new primary situation.

    Mode of Operation

    NSO must be instructed through the ncs.conf configuration file that it should run in HA mode. The following configuration snippet enables HA mode:
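    A sketch of the /ncs-config/ha fragment in ncs.conf, with illustrative values (the exact elements and defaults should be checked against the ncs.conf documentation for your NSO version):

    ```xml
    <ha>
      <enabled>true</enabled>
      <ip>0.0.0.0</ip>
      <port>4570</port>
      <extra-listen>
        <ip>::</ip>
        <port>4569</port>
      </extra-listen>
      <tick-timeout>PT20S</tick-timeout>
    </ha>
    ```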

    Make sure to restart the ncs process for the changes to take effect.

    The IP address and the port above indicate which IP and which port should be used for communication between the HA nodes. extra-listen is an optional list of ip:port pairs that an HA primary also listens on for secondary connections. For IPv6 addresses, the syntax [ip]:port may be used. If the :port is omitted, the port configured under /ncs-config/ha/port is used. The tick-timeout is a duration indicating how often each secondary must send a tick message to the primary to indicate liveness. If the primary has not received a tick from a secondary within 3 times the configured tick timeout, the secondary is considered dead. Similarly, the primary sends tick messages to all the secondaries. If a secondary has not received any tick messages from the primary within 3 times the timeout, the secondary will consider the primary dead and report accordingly.

    An HA node can be in one of three states: NONE, SECONDARY, or PRIMARY. Initially, a node is in the NONE state. This implies that the node will read its configuration from CDB, stored locally on file. Once the HA framework has decided whether the node should be a secondary or a primary, the HAFW must invoke either the method Ha.beSecondary(primary) or Ha.bePrimary().

    When an NSO HA node starts, it always starts up in mode NONE. At this point, there are no other nodes connected. Each NSO node reads its configuration data from the locally stored CDB and applications on or off the node may connect to NSO and read the data they need. Although write operations are allowed in the NONE state it is highly discouraged to initiate southbound communication unless necessary. A node in NONE state should only be used to configure NSO itself or to do maintenance such as upgrades. When in NONE state, some features are disabled, including but not limited to:

    • commit queue

    • NSO scheduler

    • nano-service side effect queue

    This is to avoid situations where multiple NSO nodes are trying to perform the same southbound operation simultaneously.

    At some point, the HAFW will command some nodes to become secondary nodes of a named primary node. When this happens, each secondary node tracks changes and (logically or physically) copies all the data from the primary. Previous data at the secondary node is overwritten.

    Note that the HAFW, by using NSO's start phases, can make sure that NSO does not start its northbound interfaces (NETCONF, CLI, ...) until the HAFW has decided what type of node it is. Furthermore, once a node has been set to the SECONDARY state, it is not possible to initiate new write transactions towards the node. It is thus never possible for an agent to write directly into a secondary node. Once a node is returned either to the NONE state or to the PRIMARY state, write transactions can once again be initiated towards the node.

    The HAFW may command a secondary node to become primary at any time. The secondary node already has up-to-date data, so it simply stops receiving updates from the previous primary. Presumably, the HAFW also commands the primary node to become a secondary node, takes it down, or otherwise handles the situation. If the old primary has crashed, the HAFW tells the secondary to become primary, restarts the necessary services on the previous primary node, and gives it an appropriate role, such as secondary. This is outside the scope of NSO.

    Each of the primary and secondary nodes has the same set of all callpoints and validation points locally on each node. The start sequence has to make sure the corresponding daemons are started before the HAFW starts directing secondary nodes to the primary, and before replication starts. The associated callbacks will, however, only be executed at the primary. If, for example, validation code executing at the primary needs to read data that is not stored in the configuration and only available on another node, the validation code must perform any needed RPC calls.

    If the order from the HAFW is to become primary, the node will start to listen for incoming secondaries at the ip:port configured under /ncs-config/ha. The secondaries connect to the primary over TCP, and this socket is used by NSO to distribute the replicated data.

    If the order is to be a secondary, the node will contact the primary and possibly copy the entire configuration from the primary. This copy is not performed if the primary and secondary decide that they have the same version of the CDB database loaded. This mechanism is implemented by use of a unique token, the transaction ID: it contains the node ID of the node that generated it and a timestamp, but is effectively opaque.

    This transaction ID is generated by the cluster primary each time a configuration change is committed, and all nodes write the same transaction ID into their copy of the committed configuration. If the primary dies and one of the remaining secondaries is appointed the new primary, the other secondaries must be told to connect to the new primary. They will compare their last transaction ID to the one from the newly appointed primary. If they are the same, no CDB copy occurs. This will be the case unless a configuration change has sneaked in, since both the new primary and the remaining secondaries will still have the last transaction ID generated by the old primary; the new primary will not generate a new transaction ID until a new configuration change is committed. The same mechanism works if a secondary node is simply restarted. No cluster reconfiguration will lead to a CDB copy unless the configuration has changed in between.
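The copy-or-not decision can be sketched as follows. The TxnId class here is hypothetical (NSO's transaction IDs are opaque tokens), but it mirrors the documented content of an originating node ID plus a timestamp:

```java
// Illustrative sketch only: shows when a secondary can skip the full CDB
// copy after (re)connecting to a primary. Not part of the NSO API.
public class TxnId {
    final String nodeId;   // node that generated this transaction id
    final long timestamp;  // commit time

    TxnId(String nodeId, long timestamp) {
        this.nodeId = nodeId;
        this.timestamp = timestamp;
    }

    // A full CDB copy is needed only if the secondary's last-seen
    // transaction id differs from the primary's.
    static boolean needsCdbCopy(TxnId secondaryLast, TxnId primaryLast) {
        return !(secondaryLast.nodeId.equals(primaryLast.nodeId)
                 && secondaryLast.timestamp == primaryLast.timestamp);
    }

    public static void main(String[] args) {
        TxnId fromOldPrimary = new TxnId("m0", 1700000000L);
        // Fail-over with no intervening commit: ids match, no copy needed.
        assert !needsCdbCopy(fromOldPrimary, new TxnId("m0", 1700000000L));
        // A commit sneaked in on the new primary: ids differ, full copy.
        assert needsCdbCopy(fromOldPrimary, new TxnId("m1", 1700000100L));
        System.out.println("ok");
    }
}
```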

    Northbound agents should run on the primary; an agent cannot commit write operations at a secondary node.

    When an agent commits its CDB data, CDB will stream the committed data out to all registered secondaries. If a secondary dies during the commit, nothing will happen; the commit will succeed anyway. When and if the secondary reconnects to the cluster, it will have to copy the entire configuration. All data on the HA sockets between NSO nodes flows only in the direction from the primary to the secondaries. A secondary that isn't reading its data will eventually lead to a situation with full TCP buffers at the primary. In principle, it is the responsibility of the HAFW to discover this situation and notify the primary NSO about the hanging secondary. However, if 3 times the tick timeout is exceeded, NSO will itself consider the node dead and notify the HAFW. The default value for the tick timeout is 20 seconds.

    The primary node holds the active copy of the entire configuration data in CDB. All configuration data has to be stored in CDB for replication to work. At a secondary node, any request to read will be serviced, while write requests will be refused. Thus, the CDB subscription code works the same regardless of whether the CDB client is running at the primary or at any of the secondaries. Once a secondary has received the updates associated with a commit at the primary, all CDB subscribers at the secondary will be duly notified about any changes using the normal CDB subscription mechanism.

    If the system has been set up to subscribe for NETCONF notifications, the secondaries will have all subscriptions as configured in the system, but the subscription will be idle. All NETCONF notifications are handled by the primary, and once the notifications get written into stable storage (CDB) at the primary, the list of received notifications will be replicated to all secondaries.

    Security Aspects

    We specify in ncs.conf which IP address the primary should bind to for incoming secondaries. If we choose the default value 0.0.0.0, it is the responsibility of the application to ensure, through some means of firewalling, that connection requests only arrive from acceptable trusted sources.

    A cluster is also protected by a token, a secret string known only to the application. The Ha.connect() method must be given the token. A secondary node that connects to a primary node negotiates with the primary using a CHAP-2-like protocol; thus both the primary and the secondary are ensured that the other end has the same token without ever revealing their own token. The token is never sent in clear text over the network. This mechanism ensures that a connection from an NSO secondary to a primary can only succeed if they both have the same token.
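As an illustration of the challenge-response idea (this is not NSO's actual wire protocol), a peer can prove knowledge of the shared token by answering a random challenge with an HMAC keyed on the token, so the token itself is never sent:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Illustrative sketch only, NOT NSO's protocol: each side answers a fresh
// random challenge with HMAC-SHA256 keyed on the shared token, proving it
// knows the token without revealing it on the wire.
public class TokenAuth {
    static byte[] response(byte[] challenge, String token) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(
                    token.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            return mac.doFinal(challenge);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Production code would use a constant-time comparison here.
    static boolean verify(byte[] challenge, byte[] resp, String token) {
        return Arrays.equals(resp, response(challenge, token));
    }

    public static void main(String[] args) {
        byte[] challenge = {1, 2, 3, 4};  // primary picks a fresh random nonce
        byte[] resp = response(challenge, "cluster-secret");  // secondary answers
        assert verify(challenge, resp, "cluster-secret");     // same token: accept
        assert !verify(challenge, resp, "wrong-secret");      // mismatch: reject
        System.out.println("ok");
    }
}
```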

    It is indeed possible to store the token itself in CDB; thus an application can initially read the token from the local CDB data and then use that token in the constructor for the Ha class. In this case, it may very well be a good idea to have the token stored in CDB be of type tailf:aes-256-cfb-128-encrypted-string.

    If the actual CDB data that is sent on the wire between cluster nodes is sensitive, and the network is untrusted, the recommendation is to use IPSec between the nodes. An alternative option is to decide exactly which configuration data is sensitive and then use the tailf:aes-256-cfb-128-encrypted-string type for that data. If the configuration data is of type tailf:aes-256-cfb-128-encrypted-string the encrypted data will be sent on the wire in update messages from the primary to the secondaries.

    API

    There are two APIs used by the HA framework to control the replication aspects of NSO. First, there is a synchronous API used to tell NSO what to do; second, the application may create a notifications socket and subscribe to HA-related events, where NSO notifies the application on certain HA-related events such as the loss of the primary. The HA-related notifications sent by NSO are crucial to programming the HA framework.

    The HA-related classes reside in the com.tailf.ha package; see the Javadocs for reference. The HA notifications-related classes reside in the com.tailf.notif package; see the Javadocs for reference.

    Ticks

    The configuration parameter /ncs-config/ha/tick-timeout is by default set to 20 seconds. This means that every 20 seconds each secondary will send a tick message on the socket leading to the primary. Similarly, the primary will send a tick message every 20 seconds on every secondary socket.

    This aliveness detection mechanism is necessary for NSO. If a socket gets closed, all is well; NSO will clean up and notify the application accordingly using the notifications API. However, if a remote node freezes, the socket will not get properly closed at the other end. NSO distributes update data from the primary to the secondaries. If a remote node is not reading the data, the TCP buffers will fill up and NSO will have to start to buffer the data. NSO will buffer data for at most 3 times the tick timeout. If a tick has not been received from a remote node within that time, the node is considered dead. NSO will report accordingly over the notifications socket and either remove the hanging secondary or, if it is a secondary that loses contact with the primary, go into the initial NONE state.

    If the HAFW can be fully trusted, it is possible to set this timeout to PT0S, i.e., zero, in which case the entire dead-node-detection mechanism in NSO is disabled.
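The liveness rule described above boils down to a simple check, sketched here (the class and method names are illustrative, not part of any NSO API):

```java
// Minimal sketch of the documented rule: a peer is presumed dead once no
// tick has been received for 3 x tick-timeout; a timeout of zero (PT0S in
// ncs.conf) disables dead-node detection entirely.
public class TickWatch {
    static boolean peerDead(long lastTickMillis, long nowMillis,
                            long tickTimeoutMillis) {
        if (tickTimeoutMillis == 0) {
            return false;  // detection disabled
        }
        return nowMillis - lastTickMillis > 3 * tickTimeoutMillis;
    }

    public static void main(String[] args) {
        long timeout = 20_000;  // default /ncs-config/ha/tick-timeout, 20 s
        assert !peerDead(0, 59_000, timeout);  // within 3 x 20 s: still alive
        assert peerDead(0, 61_000, timeout);   // past 60 s: considered dead
        assert !peerDead(0, 999_999, 0);       // timeout 0: never declared dead
        System.out.println("ok");
    }
}
```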

    Relay Secondaries

    The normal setup of an NSO HA cluster is to have all secondaries connected directly to the primary. This is a configuration that is both conceptually simple and reasonably straightforward to manage for the HAFW. In some scenarios, in particular a cluster with multiple secondaries at a location that is network-wise distant from the primary, it can however be sub-optimal, since the replicated data will be sent to each remote secondary individually over a potentially low-bandwidth network connection.

    To make this case more efficient, we can instruct a secondary to be a relay for other secondaries, by invoking the Ha.beRelay() method. This will make the secondary start listening on the IP address and port configured for HA in ncs.conf, and handle connections from other secondaries in the same manner as the cluster primary does. The initial CDB copy (if needed) to a new secondary will be done from the relay secondary, and when the relay secondary receives CDB data for replication from its primary, it will distribute the data to all its connected secondaries in addition to updating its own CDB copy.

    To instruct a node to become a secondary connected to a relay secondary, we use the Ha.beSecondary() method as usual, but pass the node information for the relay secondary instead of the node information for the primary. That is, the "sub-secondary" will in effect consider the relay secondary as its primary. To instruct a relay secondary to stop being a relay, we can invoke the Ha.beSecondary() method with the same parameters as in the original call. This is a no-op for a "normal" secondary, but it will cause a relay secondary to stop listening for secondary connections and disconnect any already connected "sub-secondaries".

    This setup requires special consideration by the HAFW. Instead of just telling each secondary to connect to the primary independently, it must set up the secondaries that are intended to be relays, and tell them to become relays, before telling the "sub-secondaries" to connect to the relay secondaries. Consider the case of a primary M and a secondary S0 in one location, and two secondaries S1 and S2 in a remote location, where we want S1 to act as relay for S2. The setup of the cluster then needs to follow this procedure:

    1. Tell M to be primary.

    2. Tell S0 and S1 to be secondary with M as primary.

    3. Tell S1 to be relay.

    4. Tell S2 to be secondary with S1 as primary.
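The setup order above can be sketched as a tiny model in which each secondary records which node it treats as "its" primary (node names taken from the example; the map is purely illustrative, not an NSO API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the relay topology from the text: M is primary, S0 and S1
// connect to M, S1 is promoted to relay, and S2 connects to S1. The
// ordering of the put() calls mirrors the required setup sequence.
public class RelayTopology {
    static Map<String, String> buildTopology() {
        Map<String, String> primaryOf = new LinkedHashMap<>();
        // 1. M becomes primary (no upstream entry).
        // 2. S0 and S1 become secondaries of M.
        primaryOf.put("S0", "M");
        primaryOf.put("S1", "M");
        // 3. S1 is told to be a relay (still a secondary of M).
        // 4. S2 becomes a secondary with S1, not M, as its primary.
        primaryOf.put("S2", "S1");
        return primaryOf;
    }

    public static void main(String[] args) {
        Map<String, String> t = buildTopology();
        assert t.get("S2").equals("S1");  // sub-secondary sees the relay as primary
        assert t.get("S1").equals("M");   // the relay still replicates from M
        System.out.println("ok");
    }
}
```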

    Conversely, the handling of network outages and node failures must also take the relay secondary setup into account. For example, if a relay secondary loses contact with its primary, it will transition to the NONE state just like any other secondary, and it will then disconnect its sub-secondaries, which will cause those to transition to NONE too, since they lost contact with "their" primary. Or if a relay secondary dies in a way that is detected by its sub-secondaries, they will also transition to NONE. Thus, in the example above, S1 and S2 need to be handled differently. For example, if S2 dies, the HAFW probably won't take any action, but if S1 dies, it makes sense to instruct S2 to be a secondary of M instead (and when S1 comes back, perhaps tell S2 to be a relay and S1 to be a secondary of S2).

    Besides the use of Ha.beRelay(), the API is mostly unchanged when using relay secondaries. The HA event notifications reporting the arrival or the death of a secondary are still generated only by the "real" cluster primary. If the Ha.HaStatus() method is used towards a relay secondary, it will report the node state as SECONDARY_RELAY rather than just SECONDARY, and the array of nodes will have its primary as the first element (same as for a "normal" secondary), followed by its "sub-secondaries" (if any).

    CDB Replication

    When HA is enabled in ncs.conf, CDB automatically replicates data written on the primary to the connected secondary nodes. Replication is done on a per-transaction basis to all the secondaries in parallel and is synchronous. When NSO is in secondary mode, the northbound APIs are in read-only mode; that is, the configuration cannot be changed on a secondary other than through replication updates from the primary. It is still possible to read via, for example, NETCONF or the CLI (if they are enabled) on a secondary. CDB subscriptions work as usual. When NSO is in the NONE state, CDB is unlocked and behaves as when NSO is not in HA mode at all.

    Unlike configuration data, operational data is replicated only if it is defined as persistent in the data model (using the tailf:persistent extension).
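For example, a data-model sketch that marks an operational container as persistent using the tailf:cdb-oper and tailf:persistent extensions (the module and leaf names are made up for illustration):

```yang
module example-stats {
  namespace "http://example.com/example-stats";
  prefix exs;

  import tailf-common { prefix tailf; }

  // Operational data (config false) is replicated to secondaries only
  // when it is stored persistently in CDB.
  container stats {
    config false;
    tailf:cdb-oper {
      tailf:persistent true;
    }
    leaf last-sync { type string; }
  }
}
```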

    AAA Infrastructure

    Manage user authentication, authorization, and audit using NSO's AAA mechanism.

    The Problem

    Users log into NSO through the CLI, NETCONF, RESTCONF, SNMP, or via the Web UI. In each case, users need to be authenticated. That is, a user needs to present credentials, such as a password or a public key, to gain access. As an alternative, for RESTCONF, users can be authenticated via token validation.

    Once a user is authenticated, all operations performed by that user need to be authorized. That is, certain users may be allowed to perform certain tasks, whereas others are not. This is called authorization. We differentiate between the authorization of commands and the authorization of data access.

    Reload or restart the NSO process (if already running).

  • Repeat the preceding steps for every participating node.

  • Enable read-only mode on the designated leader to avoid potential sync issues during cluster formation.

  • Invoke the create-cluster action.

  • Nodes use compatible configuration. For example, make sure the ncs.crypto_keys file (if used) or the encrypted-strings configuration in ncs.conf is identical across nodes.

  • HA Raft is enabled; verify using the show ha-raft command on the unreachable node.

  • The firewall configuration on the OS and on the network level permits traffic on the required ports (see Network and ncs.conf Prerequisites).

  • The node uses a certificate that the CA can validate. For example, copy the certificates to the same location and run openssl verify -CAfile CA_CERT NODE_CERT to verify this.

  • Verify the epmd -names command on each node shows the ncsd process. If not, stop NSO, run epmd -kill, and then start NSO again.

  • Stop the node, update its ncs.conf and the related (certificate) files for HA Raft, and then start it again. Connect to this node's CLI, here called node2, and verify HA Raft is enabled with the show ha-raft command.
  • Now repeat the same for the designated primary (node1). If you have set the seed nodes, you should see the fail-over primary show up under connected-node.

  • On the old designated primary (node1), invoke the ha-raft create-cluster action and create a two-node Raft cluster with the old fail-over primary (node2, the current secondary). The action takes a list of nodes identified by their names. If you have configured seed-nodes, you will get auto-completion support; otherwise, you have to type in the name of the node yourself.

    In case of errors running the action, refer to Initial Cluster Setup for possible causes and troubleshooting steps.

  • Raft requires at least three nodes to operate effectively (as described in NSO HA Raft) and currently, there are only two in the cluster. If the initial cluster had only two nodes, you must provision an additional node and set it up for HA Raft. If the cluster initially had three nodes, there is the remaining secondary node, node3, which you must stop, update its configuration as you did with the other two nodes, and start it up again.

  • Finally, on the old designated primary and current HA Raft leader, use the ha-raft adjust-membership add-node action to add this third node to the cluster.

  • Delete the compact.lock file and compact the CDB write log on all nodes using, for example, the $NCS_DIR/bin/ncs --cdb-compact $NCS_RUN_DIR/cdb command.
  • On all nodes, delete the $NCS_RUN_DIR/state/raft/ directory with a command such as rm -rf $NCS_RUN_DIR/state/raft/.

  • Stop NSO on all the follower nodes, for example, invoking the $NCS_DIR/bin/ncs --stop or systemctl stop ncs command on each node.

  • Stop NSO on the leader node only after you have stopped all the follower nodes in the previous step. Alternatively, NSO can be stopped on all the nodes before deleting the HA Raft state and compacting the CDB write log, in which case the compact.lock file does not need to be deleted.

  • Upgrade the NSO packages on the leader to support the new NSO version.

  • Install the new NSO version on all nodes.

  • Start NSO on all nodes.

  • Re-initialize the HA cluster using the ha-raft create-cluster action on the node to become the leader.

  • Finally, verify the cluster's state through the show ha-raft status command. Ensure that all data has been correctly synchronized across all cluster nodes and that the leader is no longer read-only. The latter happens automatically after re-initializing the HA cluster.

  • Assign roles, join HA group, enable/disable rule-based HA through actions

  • View the state of the current HA setup

    • assume-nominal-role false, join-ha true, nominal-role primary: Attempt to join the HA setup as secondary by querying for the current primary. Retries will be attempted. The retry attempt interval is defined by /high-availability/settings/reconnect-interval.

    • assume-nominal-role false, join-ha true, nominal-role secondary: Attempt to join the HA setup as secondary by querying for the current primary. Retries will be attempted. The retry attempt interval is defined by /high-availability/settings/reconnect-interval. If all retry attempts fail, assume the none role.

    • assume-nominal-role false, join-ha true, nominal-role none: Assume the none role.

    • assume-nominal-role true, join-ha true, nominal-role primary: Query the HA setup once for a node with the primary role. If found, attempt to connect as secondary to that node. If no current primary is found, assume the primary role.

    • assume-nominal-role true, join-ha true, nominal-role secondary: Attempt to join the HA setup as secondary by querying for the current primary. Retries will be attempted. The retry attempt interval is defined by /high-availability/settings/reconnect-interval. If all retry attempts fail, assume the none role.

    • assume-nominal-role true, join-ha true, nominal-role none: Assume the none role.

    • assume-nominal-role false, join-ha false, any nominal-role: Assume the none role.

    • arping (package: iputils or arping; optional): Installation recommended. Will reduce the propagation of changes to the virtual IP for layer-2 configurations.

    • gobgpd and gobgp (package: GoBGP 2.x; optional): Required for layer-3 configurations. gobgpd is started by the HCC package and advertises the virtual IP using BGP. gobgp is used to get advertised routes.

    • nsupdate (package: bind-tools or knot-dnsutils; optional): Required for layer-3 DNS update functionality; used to submit Dynamic DNS Update requests to a name server.

    The vipctl script, included in the HCC package, uses sudo to run the ip and arping commands when NSO is not running as root. If sudo is used, you must ensure it does not require password input. For example, if NSO runs as the admin user, the sudoers file can be edited to allow these commands to run without a password.
    The leaf has been removed from the tailf-hcc/src/yang/tailf-hcc.yang module.

    Upgrades: If BGP is enabled and the gobgp or gobgpd binaries are not found, the tailf-hcc package will fail to load. The user must then install GoBGP and invoke the packages reload action or restart NSO with NCS_RELOAD_PACKAGES=true in /etc/ncs/ncs.systemd.conf and systemctl restart ncs.

    • enabled (boolean): If set to true, an outgoing BGP connection to this neighbor is established by the HA group primary node.

    • server (inet:ip-address): DNS server IP address.

    • port (uint32): DNS server port; default 53.

    • zone (inet:host): DNS zone to update on the server.

    • timeout (uint32): Timeout for the nsupdate command; default 300.

    • vip4 (192.168.23.122): Primary node IPv4 VIP address.

    create-cluster

    Initialise an HA Raft cluster. This action should only be invoked once, to form a new cluster when no HA Raft log exists. The members of the HA Raft cluster consist of the NSO node where the /ha-raft/create-cluster action is invoked, which will become the leader of the cluster, and the members specified by the member parameter.

    adjust-membership

    Add or remove an HA node from the HA Raft cluster.

    disconnect

    Disconnect an HA node from all remaining nodes. In the event of revoking a TLS certificate, invoke this action to disconnect the already established connections to the node with the revoked certificate. A disconnected node with a valid TLS certificate may re-establish the connection.

    reset

    Reset the (disabled) local node to make the leader perform a full sync to this local node if an HA Raft cluster exists. If a reset is performed on the leader node, the node will step down from leadership and be synced by the next leader node. An HA Raft member changes role to disabled if its ncs.conf has changes incompatible with the ncs.conf on the leader; a member also changes role to disabled if there are non-recoverable failures upon opening a snapshot. See the /ha-raft/status/disable-reason leaf for the reason. Set force to true to perform a reset even when /ha-raft/status/role is not set to disabled.

    handover

    Handover leadership to another member of the HA Raft cluster or step down from leadership and start a new election.

    read-only

    Toggle read-only mode. If the mode is true, no configuration changes can occur.

    • assume-nominal-role true, join-ha false, nominal-role primary: Assume the primary role.

    • assume-nominal-role true, join-ha false, nominal-role secondary: Attempt to connect as secondary to the node (if any) which has nominal-role primary. If this fails, make no retry attempts and assume the none role.

    • assume-nominal-role true, join-ha false, nominal-role none: Assume the none role.

    be-primary

    Order the local node to assume the HA role primary.

    be-none

    Order the local node to assume the HA role none.

    be-secondary-to

    Order the local node to connect as secondary to the provided HA node. This is an asynchronous operation; the result can be found under /high-availability/status/be-secondary-result.

    local-node-id

    Identify which of the nodes in /high-availability/ha-node (if any) corresponds to the local NSO instance.

    enable

    Enable NSO rule-based HA and optionally assume an HA role according to the /high-availability/settings/start-up/ parameters.

    disable

    Disable NSO rule-based HA and assume the HA role none.

    • ip (package: iproute2; required): Adds and deletes the virtual IP from the network interface.

    • awk (package: mawk or gawk; required): Installed with most Linux distributions.

    • sed (package: sed; required): Installed with most Linux distributions.

    • enabled (boolean): If set to true, the primary node in an HA group automatically binds the set of virtual IPv[46] addresses.

    • vip-address (list of inet:ip-address): The list of virtual IPv[46] addresses to bind on the primary node. The addresses are automatically unbound when a node becomes secondary. The addresses can therefore be used externally to reliably connect to the HA group primary node.

    • node-id (string): Unique node ID. A reference to /ncs:high-availability/ha-node/id.

    • enabled (boolean): If set to true, this node uses BGP to announce VIP addresses when in the HA primary state.

    • as (inet:as-number): The BGP Autonomous System Number for the local BGP daemon.

    • router-id (inet:ip-address): The router ID for the local BGP daemon.

    • address (inet:ip-address): BGP neighbor IP address.

    • as (inet:as-number): BGP neighbor Autonomous System Number.

    • ttl-min (uint8): Optional minimum TTL value for BGP packets. When configured, enables the BGP Generalized TTL Security Mechanism (GTSM).

    • password (string): Optional password to use for BGP authentication with this neighbor.

    • enabled (boolean): If set to true, DNS updates will be enabled.

    • fqdn (inet:domain-name): DNS domain name for the HA primary.

    • ttl (uint32): Time to live for the DNS record; default 86400.

    • key-file (string): Specifies the file path for the nsupdate key file.

    • node-id (string): Unique NSO HA node ID. Valid values are the entries of /high-availability/ha-node when built-in HA is used, or of /ha-raft/status/member for HA Raft.

    • ip-address (inet:ip-address): IP where NSO listens for incoming requests to any northbound interfaces.

    • location (string): Name of the location/site/availability zone where the node is placed.

    • paris (192.168.23.99): Paris service node.

    • london (192.168.23.98): London service node.

    • vip4 (192.168.23.122): NSO primary node IPv4 VIP address.

    • paris (192.168.31.99, AS 64512): Paris node.

    • london (192.168.30.98, AS 64513): London node.

    • router (192.168.30.2 and 192.168.31.2, AS 64514): BGP-enabled router.



    admin@node1# show ha-raft
    ha-raft status role stalled
    ha-raft status connected-node [ node2.example.org ]
    ha-raft status local-node node1.example.org
    > ... output omitted ... <
    admin@node1# ha-raft create-cluster member [ node2.example.org ]
    admin@node1# show ha-raft
    ha-raft status role leader
    ha-raft status leader node1.example.org
    ha-raft status member [ node1.example.org node2.example.org ]
    ha-raft status connected-node [ node2.example.org ]
    ha-raft status local-node node1.example.org
    > ... output omitted ... <
    admin@node1# ha-raft adjust-membership add-node node3.example.org
    admin@node1# show ha-raft status member
    ha-raft status member [ node1.example.org node2.example.org node3.example.org ]
      <ha-raft>
        <!-- ... -->
        <listen>
          <node-address>198.51.100.10</node-address>
        </listen>
        <ssl>
          <ca-cert-file>${NCS_CONFIG_DIR}/dist/ssl/cert/myca.crt</ca-cert-file>
          <cert-file>${NCS_CONFIG_DIR}/dist/ssl/cert/node-100-10.crt</cert-file>
          <key-file>${NCS_CONFIG_DIR}/dist/ssl/cert/node-100-10.key</key-file>
        </ssl>
      </ha-raft>
    $ mkdir raft-ca-lower-west
    $ cd raft-ca-lower-west
    $ cp $NCS_DIR/examples.ncs/high-availability/raft-cluster/gen_tls_certs.sh .
    $ openssl version
    $ date
    $ ./gen_tls_certs.sh node1.example.org node2.example.org node3.example.org
    $ ./gen_tls_certs.sh -a 192.0.2.1 192.0.2.2 192.0.2.3
    $ openssl verify -CAfile ssl/certs/ca.crt ssl/certs/node1.example.org.crt
    Sample HA Raft config for a cluster node
      <ha-raft>
        <enabled>true</enabled>
        <cluster-name>sherwood</cluster-name>
        <listen>
          <node-address>ash.example.org</node-address>
        </listen>
        <ssl>
          <ca-cert-file>${NCS_CONFIG_DIR}/dist/ssl/cert/myca.crt</ca-cert-file>
          <cert-file>${NCS_CONFIG_DIR}/dist/ssl/cert/ash.crt</cert-file>
          <key-file>${NCS_CONFIG_DIR}/dist/ssl/cert/ash.key</key-file>
        </ssl>
        <seed-nodes>
          <seed-node>birch.example.org</seed-node>
        </seed-nodes>
      </ha-raft>
    admin@ncs# request ha-raft read-only-mode true
    admin@ncs# ha-raft create-cluster member [ birch.example.org cedar.example.org ]
    admin@ncs# show ha-raft
    ha-raft status role leader
    ha-raft status leader ash.example.org
    ha-raft status member [ ash.example.org birch.example.org cedar.example.org ]
    ha-raft status connected-node [ birch.example.org cedar.example.org ]
    ha-raft status local-node ash.example.org
    ...
    admin@ncs# request ha-raft read-only-mode false
    admin@ncs# show ha-raft status member
    ha-raft status member [ ash.example.org birch.example.org cedar.example.org ]
    admin@ncs# ha-raft adjust-membership remove-node birch.example.org
    admin@ncs# show ha-raft status member
    ha-raft status member [ ash.example.org cedar.example.org ]
    admin@ncs# ha-raft adjust-membership add-node dollartree.example.org
    admin@ncs# show ha-raft status member
    ha-raft status member [ ash.example.org cedar.example.org dollartree.example.org ]
    admin@node1# high-availability read-only mode true
    admin@ncs# high-availability read-only mode true
    admin@ncs# high-availability read-only mode false
    alarms alarm-list alarm ncs ha-primary-down /high-availability/ha-node[id='paris']
     is-cleared              false
     last-status-change      2022-05-30T10:02:45.706947+00:00
     last-perceived-severity critical
     last-alarm-text         "Lost connection to primary due to: Primary closed connection"
     status-change 2022-05-30T10:02:45.706947+00:00
      received-time      2022-05-30T10:02:45.706947+00:00
      perceived-severity critical
      alarm-text         "Lost connection to primary due to: Primary closed connection"
    alarms alarm-list alarm ncs ha-secondary-down /high-availability/ha-node[id='london'] ""
     is-cleared              false
     last-status-change      2022-05-30T10:04:33.231808+00:00
     last-perceived-severity critical
     last-alarm-text         "Lost connection to secondary"
     status-change 2022-05-30T10:04:33.231808+00:00
      received-time      2022-05-30T10:04:33.231808+00:00
      perceived-severity critical
      alarm-text         "Lost connection to secondary"
    $ sudo setcap 'cap_net_bind_service=+ep' /usr/bin/gobgpd
    $ sudo chown root /usr/bin/gobgpd
    $ sudo chmod u+s /usr/bin/gobgpd
    admin@paris:~$ ip route get 192.168.123.22
    admin@ncs(config)# hcc enabled
    admin@ncs(config)# hcc vip 192.168.123.22
    admin@ncs(config)# hcc vip 2001:db8::10
    admin@ncs(config)# commit
    admin@ncs# show hcc
    NODE    BGPD  BGPD
    ID      PID   STATUS   ADDRESS       STATE        CONNECTED
    -------------------------------------------------------------
    london  -     -        192.168.30.2  -            -
    paris   827   running  192.168.31.2  ESTABLISHED  true
    admin@ncs(config)# hcc bgp node paris enabled
    admin@ncs(config)# hcc bgp node paris as 64512
    admin@ncs(config)# hcc bgp node paris router-id 192.168.31.99
    admin@ncs(config)# hcc bgp node paris neighbor 192.168.31.2 as 64514
    admin@ncs(config)# ... repeated for each neighbor if more than one ...
                ... repeated for each node ...
    admin@ncs(config)# commit
    admin@ncs# show hcc dns
    hcc dns status time 2023-10-20T23:16:33.472522+00:00
    hcc dns status exit-code 0
    admin@ncs# show hcc dns
    hcc dns status time 2023-10-20T23:36:33.372631+00:00
    hcc dns status exit-code 2
    hcc dns status error-message "; Communication with 10.0.0.10#53 failed: timed out"
    admin@ncs(config)# hcc dns enabled
    admin@ncs(config)# hcc dns fqdn example.com
    admin@ncs(config)# hcc dns ttl 120
    admin@ncs(config)# hcc dns key-file /home/cisco/DNS-testing/good.key
    admin@ncs(config)# hcc dns server 10.0.0.10
    admin@ncs(config)# hcc dns port 53
    admin@ncs(config)# hcc dns zone zone1.nso
    admin@ncs(config)# hcc dns member node-1 ip-address [ 10.0.0.20 ::10 ]
    admin@ncs(config)# hcc dns member node-1 location SanJose
    admin@ncs(config)# hcc dns member node-2 ip-address [ 10.0.0.30 ::20 ]
    admin@ncs(config)# hcc dns member node-2 location NewYork
    admin@ncs(config)# commit
    admin@ncs(config)# hcc enabled
    admin@ncs(config)# hcc vip 192.168.23.122
    admin@ncs(config)# commit
    root@paris:/var/log/ncs# ip address list
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
    2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether 52:54:00:fa:61:99 brd ff:ff:ff:ff:ff:ff
        inet 192.168.23.99/24 brd 192.168.23.255 scope global enp0s3
        valid_lft forever preferred_lft forever
        inet 192.168.23.122/32 scope global enp0s3
        valid_lft forever preferred_lft forever
        inet6 fe80::5054:ff:fefa:6199/64 scope link
        valid_lft forever preferred_lft forever
    root@london:~# ip address list
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 ...
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
    2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
        link/ether 52:54:00:fa:61:98 brd ff:ff:ff:ff:ff:ff
        inet 192.168.23.98/24 brd 192.168.23.255 scope global enp0s3
        valid_lft forever preferred_lft forever
        inet6 fe80::5054:ff:fefa:6198/64 scope link
        valid_lft forever preferred_lft forever
    admin@ncs(config)# hcc bgp node paris enabled
    admin@ncs(config)# hcc bgp node paris as 64512
    admin@ncs(config)# hcc bgp node paris router-id 192.168.31.99
    admin@ncs(config)# hcc bgp node paris neighbor 192.168.31.2 as 64514
    admin@ncs(config)# commit
    admin@ncs(config)# hcc bgp node london enabled
    admin@ncs(config)# hcc bgp node london as 64513
    admin@ncs(config)# hcc bgp node london router-id 192.168.30.98
    admin@ncs(config)# hcc bgp node london neighbor 192.168.30.2 as 64514
    admin@ncs(config)# commit
    admin@ncs# show hcc
    NODE    BGPD  BGPD
    ID      PID   STATUS   ADDRESS       STATE        CONNECTED
    ----------------------------------------------------------------
    london  -     -        192.168.30.2  -            -
    paris   2486  running  192.168.31.2  ESTABLISHED  true
    admin@ncs# show hcc
    NODE    BGPD  BGPD
    ID      PID   STATUS   ADDRESS       STATE        CONNECTED
    ----------------------------------------------------------------
    london  494   running  192.168.30.2  ESTABLISHED  true
    paris   -     -        192.168.31.2  -            -
    admin@ncs# show ip bgp
    ...
    Network          Next Hop            Metric LocPrf Weight Path
    *> 192.168.23.122/32
                      192.168.31.99                          0 64513 ?
    admin@ncs# hcc dns update
    admin@ncs# show hcc dns
    hcc dns status time 2023-10-10T20:47:31.733661+00:00
    hcc dns status exit-code 0
    hcc dns status error-message ""
    cisco@node-2:~$ nslookup example.com
    Server:   10.0.0.10
    Address:  10.0.0.10#53
    
    Name: example.com
    Address: 10.0.0.20
    Name: example.com
    Address: ::10
    admin@ncs(config)# hcc dns get-node-location
    location SanJose
    $ make build start
    Setting up run directory for nso-node1
     ... make output omitted ...
    Waiting for n2 to connect: .
    $ ssh -p 2024 admin@localhost
    admin@localhost's password: admin
    
    admin connected from 127.0.0.1 using ssh on localhost
    admin@n1> switch cli
    admin@n1# show high-availability
    high-availability enabled
    high-availability status mode primary
    high-availability status current-id n1
    high-availability status assigned-role primary
    high-availability status read-only-mode false
    ID  ADDRESS
    ---------------
    n2  127.0.0.1
    admin@n1# high-availability disable
    result NSO Built-in HA disabled
    admin@n1# exit
    Connection to localhost closed.
    $ ssh -p 2024 admin@localhost
    admin@localhost's password: admin
    
    admin connected from 127.0.0.1 using ssh on localhost
    admin@n2> switch cli
    admin@n2# show high-availability
    high-availability enabled
    high-availability status mode primary
    high-availability status current-id n2
    high-availability status assigned-role primary
    high-availability status read-only-mode false
    <netconf-north-bound>
      <transport>
        <ssh>
          <enabled>true</enabled>
          <ip>0.0.0.0</ip>
          <port>830</port>
          <ha-primary-listen>
            <ip>0.0.0.0</ip>
            <port>1830</port>
          </ha-primary-listen>
          <ha-primary-listen>
            <ip>::</ip>
            <port>1830</port>
          </ha-primary-listen>
        </ssh>
      </transport>
    </netconf-north-bound>
    <ha>
      <enabled>true</enabled>
      <ip>0.0.0.0</ip>
      <port>4570</port>
      <extra-listen>
        <ip>::</ip>
        <port>4569</port>
      </extra-listen>
      <tick-timeout>PT20S</tick-timeout>
    </ha>
    admin@node2# show ha-raft
    ha-raft status role stalled
    ha-raft status local-node node2.example.org
    > ... output omitted ... <
    $ echo "admin ALL = (root) NOPASSWD: /bin/ip" | sudo tee -a /etc/sudoers
    $ echo "admin ALL = (root) NOPASSWD: /path/to/arping" | sudo tee -a /etc/sudoers
    Structure - Data Models

    The NSO daemon manages device configuration, including AAA information. NSO both manages the AAA information and uses it: the AAA information describes which users may log in, what passwords they have, and what they are allowed to do. This is solved in NSO by requiring a data model to be both loaded and populated with data. NSO uses the YANG module tailf-aaa.yang for authentication, while ietf-netconf-acm.yang (NETCONF Access Control Model (NACM), RFC 8341), as augmented by tailf-acm.yang, is used for group assignment and authorization.

    Data Model Contents

    The NACM data model is targeted specifically towards access control for NETCONF operations and thus lacks some functionality that is needed in NSO, in particular, support for the authorization of CLI commands and the possibility to specify the context (NETCONF, CLI, etc.) that a given authorization rule should apply to. This functionality is modeled by augmentation of the NACM model, as defined in the tailf-acm.yang YANG module.

    The ietf-netconf-acm.yang and tailf-acm.yang modules can be found in the $NCS_DIR/src/ncs/yang directory in the release, while tailf-aaa.yang can be found in the $NCS_DIR/src/ncs/aaa directory.

    NACM options related to services are modeled by augmentation of the NACM model, as defined in the tailf-ncs-acm.yang YANG module. The tailf-ncs-acm.yang module can be found in the $NCS_DIR/src/ncs/yang directory in the release.

    The complete AAA data model defines a set of users, a set of groups, and a set of rules. The data model must be populated with data that is subsequently used by NSO itself when it authenticates users and authorizes user data access. These YANG modules work exactly like all other fxs files loaded into the system, with the exception that NSO itself uses them. The data belongs to the application, but NSO itself is the user of the data.

    Since NSO requires a data model for the AAA information for its operation, it will report an error and fail to start if these data models cannot be found.

    AAA-related Items in ncs.conf

    NSO itself is configured through a configuration file - ncs.conf. In that file, we have the following items related to authentication and authorization:

    • /ncs-config/aaa/ssh-server-key-dir: If SSH termination is enabled for NETCONF or the CLI, the NSO built-in SSH server needs to have server keys. These keys are generated by the NSO install script and by default end up in $NCS_DIR/etc/ncs/ssh. It is also possible to use OpenSSH to terminate NETCONF or the CLI. If OpenSSH is used to terminate SSH traffic, this setting has no effect.

    • /ncs-config/aaa/ssh-pubkey-authentication: If SSH termination is enabled for NETCONF or the CLI, this item controls how the NSO SSH daemon locates the user keys for public key authentication. See Public Key Login for details.

    • /ncs-config/aaa/local-authentication/enabled: The term 'local user' refers to a user stored under /aaa/authentication/users. The alternative is a user unknown to NSO, typically authenticated by PAM. By default, NSO first checks local users before trying PAM or external authentication. Local authentication is practical in test environments. It is also useful when we want to have one set of users that are allowed to log in to the host with normal shell access and another set of users that are only allowed to access the system using the normal encrypted, fully authenticated, northbound interfaces of NSO. If we always authenticate users through PAM, it may make sense to set this configurable to false. If we disable local authentication, it implicitly means that we must use either PAM authentication or external authentication. It also means that we can leave the entire data trees under /aaa/authentication/users and, in the case of external authentication, also /nacm/groups (for NACM) or /aaa/authentication/groups (for legacy tailf-aaa) empty.

    • /ncs-config/aaa/pam: NSO can authenticate users using PAM (Pluggable Authentication Modules). PAM is an integral part of most Unix-like systems. PAM is a complicated, albeit powerful, subsystem. It may be easier to have all users stored locally on the host. However, if we want to store users in a central location, PAM can be used to access the remote information. PAM can be configured to handle most login scenarios, including RADIUS and LDAP. One major drawback of PAM authentication is that there is no easy way to extract the group information from PAM: PAM authenticates users, it does not also assign a user to a set of groups. PAM authentication is thoroughly described later in this chapter.

    • /ncs-config/aaa/default-group: If this configuration parameter is defined and if the group of a user cannot be determined, a logged-in user ends up in the given default group.

    • /ncs-config/aaa/external-authentication: NSO can authenticate users using an external executable. This is further described in External Authentication below. As an alternative, you may consider using package authentication.

    • /ncs-config/aaa/external-validation: NSO can authenticate users by validation of tokens using an external executable. This is further described in External Token Validation below. Where external authentication uses a username and password to authenticate a user, external validation uses a token. The validation script should use the token to authenticate a user and can, optionally, also return a new token to be returned with the result of the request. It is currently only supported for RESTCONF.

    • /ncs-config/aaa/external-challenge: NSO has support for multi-factor authentication by sending challenges to a user. Challenges may be sent from any of the external authentication mechanisms but are currently only supported by JSON-RPC and CLI over SSH. This is further described in External Multi-Factor Authentication below.

    • /ncs-config/aaa/package-authentication: NSO can authenticate users using package authentication. It extends the concept of external authentication by allowing multiple packages to be used for authentication instead of a single executable. This is further described in Package Authentication.

    • /ncs-config/aaa/single-sign-on: With this setting enabled, NSO invokes Package Authentication on all requests to HTTP endpoints with the /sso prefix. This way, Package Authentication packages that require custom endpoints can expose them under the /sso base route. For example, a SAMLv2 Single Sign-On (SSO) package needs to process requests to an AssertionConsumerService endpoint, such as /sso/saml/acs, and therefore requires enabling this setting. This is a valid authentication method for WEB UI and JSON-RPC interfaces and needs Package Authentication to be enabled as well.

    • /ncs-config/aaa/single-sign-on/enable-automatic-redirect: If only one Single Sign-On package is configured (a package with single-sign-on-url set in package-meta-data.xml) and also this setting is enabled, NSO automatically redirects all unauthenticated access attempts to the configured single-sign-on-url.

    Authentication

    Depending on the northbound management protocol, when a user session is created in NSO, it may or may not be authenticated. If the session is not yet authenticated, NSO's AAA subsystem is used to perform authentication and authorization, as described below. If the session already has been authenticated, NSO's AAA assigns groups to the user as described in Group Membership, and performs authorization, as described in Authorization.

    The authentication part of the data model can be found in tailf-aaa.yang:

    AAA authentication is used in the following cases:

    • When the built-in SSH server is used for NETCONF and CLI sessions.

    • For Web UI sessions and REST access.

    • When the method Maapi.authenticate() is used.

    NSO's AAA authentication is not used in the following cases:

    • When NETCONF uses an external SSH daemon, such as OpenSSH.

      In this case, the NETCONF session is initiated using the program netconf-subsys, as described in NETCONF Transport Protocols in Northbound APIs.

    • When NETCONF uses TCP, as described in NETCONF Transport Protocols in Northbound APIs, e.g. through the command netconf-console.

    • When accessing the CLI by invoking ncs_cli, e.g. through an external SSH daemon, such as OpenSSH, or a telnet daemon. An important special case here is when a user has shell access to the host and runs ncs_cli from the shell. This command, as well as direct access to the IPC socket, allows for authentication bypass. It is crucial to consider this case for your deployment: if non-trusted users have shell access to the host, IPC access must be restricted.

    • When SNMP is used. SNMP has its own authentication mechanisms; see Northbound APIs.

    • When the method Maapi.startUserSession() is used without a preceding call of Maapi.authenticate().

    Public Key Login

    When a user logs in over NETCONF or the CLI using the built-in SSH server, with a public key login, the procedure is as follows.

    The user presents a username in accordance with the SSH protocol. The SSH server consults the settings for /ncs-config/aaa/ssh-pubkey-authentication and /ncs-config/aaa/local-authentication/enabled.

    1. If ssh-pubkey-authentication is set to local, and the SSH keys in /aaa/authentication/users/user{$USER}/ssh_keydir match the keys presented by the user, authentication succeeds.

    2. Otherwise, if ssh-pubkey-authentication is set to system, local-authentication is enabled, and the SSH keys in /aaa/authentication/users/user{$USER}/ssh_keydir match the keys presented by the user, authentication succeeds.

    3. Otherwise, if ssh-pubkey-authentication is set to system and the user /aaa/authentication/users/user{$USER} does not exist, but the user does exist in the OS password database, the keys in the user's $HOME/.ssh directory are checked. If these keys match the keys presented by the user, authentication succeeds.

    4. Otherwise, authentication fails.

    In all cases the keys are expected to be stored in a file called authorized_keys (or authorized_keys2 if authorized_keys does not exist), and in the native OpenSSH format (i.e. as generated by the OpenSSH ssh-keygen command). If authentication succeeds, the user's group membership is established as described in Group Membership.

    This is exactly the same procedure that is used by the OpenSSH server, with the exception that the built-in SSH server may also locate the directory containing the public keys for a specific user by consulting the /aaa/authentication/users tree.

    Setting up Public Key Login

    We need to provide a directory where SSH keys are kept for a specific user and give the absolute path to this directory for the /aaa/authentication/users/user/ssh_keydir leaf. If a public key login is not desired at all for a user, the value of the ssh_keydir leaf should be set to "", i.e. the empty string. Similarly, if the directory does not contain any SSH keys, public key logins for that user will be disabled.

    The built-in SSH daemon supports DSA, RSA, and ED25519 keys. To generate and enable RSA keys of size 4096 bits for, say, user "bob", the following steps are required.

    On the client machine, as user "bob", generate a private/public key pair as:
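The original command listing is not reproduced in this rendering; a typical invocation, assuming OpenSSH's ssh-keygen is available, looks like the following (the output directory is an example):

```shell
# Generate a 4096-bit RSA key pair for user "bob" into a scratch
# directory. The path and comment are examples; omit -N "" to be
# prompted for a passphrase instead (recommended in production).
mkdir -p /tmp/bob-keys && rm -f /tmp/bob-keys/id_rsa /tmp/bob-keys/id_rsa.pub
ssh-keygen -t rsa -b 4096 -N "" -C "bob@client" -f /tmp/bob-keys/id_rsa
ls /tmp/bob-keys
```

This produces the private key id_rsa and the public key id_rsa.pub.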

    Now we need to copy the public key to the target machine where the NETCONF or CLI SSH server runs.

    Assume we have the following user entry:
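The referenced entry is not reproduced here; an illustrative tailf-aaa user entry (all values are examples) could look like:

```xml
<user>
  <name>bob</name>
  <uid>1000</uid>
  <gid>1000</gid>
  <password>$6$...</password>
  <ssh_keydir>/var/system/users/bob/.ssh</ssh_keydir>
  <homedir>/var/system/users/bob</homedir>
</user>
```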

    We need to copy the newly generated file id_rsa.pub, which is the public key, to a file on the target machine called /var/system/users/bob/.ssh/authorized_keys.

    Since the release of OpenSSH 7.0, support for ssh-dss host and user keys is disabled by default. If you want to continue using these, you may re-enable them using the following options for the OpenSSH client:
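For reference, the relevant client-side options (documented on the OpenSSH legacy page) take this form in ssh_config, where the host name is a placeholder:

```
Host your-nso-host
    HostKeyAlgorithms +ssh-dss
    PubkeyAcceptedKeyTypes +ssh-dss
```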

    You can find full instructions at the OpenSSH Legacy Options webpage.

    Password Login

    Password login is triggered in the following cases:

    • When a user logs in over NETCONF or the CLI using the built-in SSH server, with a password. The user presents a username and a password in accordance with the SSH protocol.

    • When a user logs in using the Web UI. The Web UI asks for a username and password.

    • When the method Maapi.authenticate() is used.

    In this case, NSO will by default try local authentication, PAM, external authentication, and package authentication, in that order, as described below. It is possible to change the order in which these are tried by modifying the ncs.conf parameter /ncs-config/aaa/auth-order. See ncs.conf(5) in Manual Pages for details.

    1. If /aaa/authentication/users/user{$USER} exists and the presented password matches the encrypted password in /aaa/authentication/users/user{$USER}/password, the user is authenticated.

    2. If the password does not match or if the user does not exist in /aaa/authentication/users, PAM login is attempted, if enabled. See PAM for details.

    3. If all of the above fails and external authentication is enabled, the configured executable is invoked. See External Authentication for details.

    If authentication succeeds, the user's group membership is established as described in Group Membership.

    PAM

    On operating systems supporting PAM, NSO also supports PAM authentication. PAM authentication is convenient since it allows the same set of users and groups that have access to the UNIX/Linux host itself to also have access to NSO.

    PAM is the recommended way to authenticate NSO users.

    If we use PAM, we do not have to have any users or any groups configured in the NSO aaa namespace at all.

    To configure PAM we typically need to do the following:

    1. Remove all users and groups from the AAA initialization XML file.

    2. Enable PAM in ncs.conf by adding the following to the AAA section in ncs.conf. The service name specifies the PAM service, typically a file in the directory /etc/pam.d, but may alternatively be an entry in the file /etc/pam.conf, depending on OS and version. Thus, it is possible to have a different login procedure for NSO than for the host itself.
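A minimal sketch of such an ncs.conf fragment, assuming a PAM service named common-auth exists on the system (the service name is an example):

```xml
<aaa>
  <pam>
    <enabled>true</enabled>
    <service>common-auth</service>
  </pam>
</aaa>
```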

    3. If PAM is enabled and we want to use PAM for login, the system may have to run as root. This depends on how PAM is configured locally. However, the default system authentication will typically require root, since the PAM libraries then read /etc/shadow. If we do not want to run NSO as root, the solution is to change the owner of the helper program $NCS_DIR/lib/ncs/lib/core/pam/priv/epam to root and also set the setuid bit.
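The setuid setup can be sketched as below. It is demonstrated on a scratch copy of /bin/true so the commands can run unprivileged; on a real system the target is the epam helper, and the chown requires root:

```shell
# On a real system (requires root):
#   chown root $NCS_DIR/lib/ncs/lib/core/pam/priv/epam
#   chmod u+s $NCS_DIR/lib/ncs/lib/core/pam/priv/epam
# Unprivileged demonstration on a scratch file:
cp /bin/true /tmp/epam-demo
chmod u+s /tmp/epam-demo
ls -l /tmp/epam-demo    # an "s" in the owner execute slot marks setuid
```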

    As an example, say that we have a user test in /etc/passwd, and furthermore:
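The referenced listing is omitted here; hypothetical /etc/group entries placing the test user in both groups (GIDs are examples) would look like the following, simulated with printf so the snippet is self-contained:

```shell
# Simulated /etc/group content; on a real host, inspect /etc/group itself.
printf 'admin:x:1001:test\noperator:x:1002:test\n' | grep 'test'
```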

    Thus, the test user is part of the admin and the operator groups, and logging in to NSO as the test user through CLI over SSH, Web UI, or NETCONF renders the following in the audit log.

    Thus, the test user was found and authenticated from /etc/passwd, and the crucial group assignment of the test user was done from /etc/group.

    If we wish to be able to also manipulate the users, their passwords, etc. on the device, we can write a private YANG model for that data, store the data in CDB, and set up a normal CDB subscriber for that data. When our private user data is manipulated, the CDB subscriber picks up the changes and updates the contents of the relevant /etc files.

    External Authentication

    A common situation is when we wish to have all authentication data stored remotely, not locally, for example on a remote RADIUS or LDAP server. Such a remote authentication server typically stores not only the users and their passwords but also the group information.

    If we wish to have not only the users but also the group information stored on a remote server, the best option for NSO authentication is to use external authentication.

    If this feature is configured, NSO will invoke the executable configured in /ncs-config/aaa/external-authentication/executable in ncs.conf, and pass the username and the clear text password on stdin using the string notation: "[user;password;]\n".

    For example, if the user bob attempts to log in over SSH using the password 'secret', and external authentication is enabled, NSO will invoke the configured executable and write "[bob;secret;]\n" on the stdin stream for the executable. The task of the executable is then to authenticate the user and also establish the username-to-groups mapping.

    For example, the executable could be a RADIUS client which utilizes some proprietary vendor attributes to retrieve the groups of the user from the RADIUS server. If authentication is successful, the program should write accept followed by a space-separated list of groups that the user is a member of, and additional information as described below. Again, assuming that bob's password indeed was 'secret', and that bob is a member of the admin and the lamers groups, the program should write accept admin lamers $uid $gid $supplementary_gids $HOME on its standard output and then exit.

    There is a general limit of 16000 bytes of output from the externalauth program.

    Thus, the format of the output from an externalauth program when authentication is successful should be:

    "accept $groups $uid $gid $supplementary_gids $HOME\n"

    Where:

    • $groups is a space-separated list of the group names the user is a member of.

    • $uid is the UNIX integer user ID that NSO should use as a default when executing commands for this user.

    • $gid is the UNIX integer group ID that NSO should use as a default when executing commands for this user.

    • $supplementary_gids is a (possibly empty) space-separated list of additional UNIX group IDs the user is also a member of.

    • $HOME is the directory that should be used as HOME for this user when NSO executes commands on behalf of this user.
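As a concrete illustration of this protocol, here is a minimal sketch of an externalauth program in Python. The user table, group names, uid/gid, and HOME path are hypothetical stand-ins for a real backend lookup (e.g. RADIUS or LDAP):

```python
#!/usr/bin/env python3
# Minimal sketch of an external authentication executable. The user and
# group store below is a hypothetical stand-in for a real backend; the
# uid (1000), gid (1000), supplementary gid (100), and HOME path are
# example values.
import sys

USERS = {"bob": "secret"}
GROUPS = {"bob": "admin lamers"}

def authenticate(line):
    # NSO writes "[user;password;]\n" on stdin.
    user, password, _ = line.strip()[1:-1].split(";")
    if USERS.get(user) != password:
        return "reject Bad password\n"
    # "accept $groups $uid $gid $supplementary_gids $HOME\n"
    return "accept %s 1000 1000 100 /home/%s\n" % (GROUPS[user], user)

if __name__ == "__main__":
    sys.stdout.write(authenticate(sys.stdin.readline()))
```

To use such a script, its path would be configured under /ncs-config/aaa/external-authentication/executable and the file made executable.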

    It is further possible for the program to return a token on successful authentication, by using "accept_token" instead of "accept":

    "accept_token $groups $uid $gid $supplementary_gids $HOME $token\n"

    Where:

    • $token is an arbitrary string. NSO will then, for some northbound interfaces, include this token in responses.

    It is also possible for the program to return additional information on successful authentication, by using "accept_info" instead of "accept":

    "accept_info $groups $uid $gid $supplementary_gids $HOME $info\n"

    Where:

    • $info is some arbitrary text. NSO will then just append this text to the generated audit log message (CONFD_EXT_LOGIN).

    Yet another possibility is for the program to return a warning that the user's password is about to expire, by using "accept_warning" instead of "accept":

    "accept_warning $groups $uid $gid $supplementary_gids $HOME $warning\n"

    Where:

    • $warning is an appropriate warning message. The message will be processed by NSO according to the setting of /ncs-config/aaa/expiration-warning in ncs.conf.

    There is also support for token variations of "accept_info" and "accept_warning" namely "accept_token_info" and "accept_token_warning". Both "accept_token_info" and "accept_token_warning" expect the external program to output exactly the same as described above with the addition of a token after $HOME:

    • "accept_token_info $groups $uid $gid $supplementary_gids $HOME $token $info\n"

    • "accept_token_warning $groups $uid $gid $supplementary_gids $HOME $token $warning\n"

    If authentication failed, the program should write "reject" or "abort", possibly followed by a reason for the rejection, and a trailing newline. For example, "reject Bad password\n" or just "abort\n". The difference between "reject" and "abort" is that with "reject", NSO will try subsequent mechanisms configured for /ncs-config/aaa/auth-order in ncs.conf (if any), while with "abort", the authentication fails immediately. Thus "abort" can prevent subsequent mechanisms from being tried, but when external authentication is the last mechanism (as in the default order), it has the same effect as "reject".

    Supported by some northbound APIs, such as JSON-RPC and CLI over SSH, the external authentication may also choose to issue a challenge:

    "challenge $challenge-id $challenge-prompt\n"

    The challenge prompt may be multi-line, which is why it must be base64-encoded.
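Producing such a line can be sketched as follows; the challenge ID and prompt text are hypothetical:

```python
# Build a challenge line; the prompt is base64-encoded because it may
# span multiple lines. The ID and prompt values are examples.
import base64

def make_challenge(challenge_id, prompt):
    encoded = base64.b64encode(prompt.encode()).decode()
    return "challenge %s %s\n" % (challenge_id, encoded)

print(make_challenge("22efa", "Enter the code\nfrom your authenticator:"), end="")
```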

    For more information on multi-factor authentication, see External Multi-Factor Authentication.

    When external authentication is used, the group list returned by the external program is prepended by any possible group information stored locally under the /aaa tree. Hence when we use external authentication it is indeed possible to have the entire /aaa/authentication tree empty. The group assignment performed by the external program will still be valid and the relevant groups will be used by NSO when the authorization rules are checked.

    External Token Validation

    When username and password authentication is not feasible, authentication by token validation is possible. Currently, only RESTCONF supports this mode of authentication. It shares all properties of external authentication, but instead of a username and password, it takes a token as input. The output is also almost the same; the only difference is that the program is also expected to output a username.

    If this feature is configured, NSO will invoke the executable configured in /ncs-config/aaa/external-validation/executable in ncs.conf, and pass the token on stdin using the string notation: "[token;]\n".

    For example, if the user bob attempts to log in over RESTCONF using the token topsecret, and external validation is enabled, NSO will invoke the configured executable and write "[topsecret;]\n" on the stdin stream for the executable.

    The task of the executable is then to validate the token, thereby authenticating the user and also establishing the username and username-to-groups mapping.

    For example, the executable could be a FUSION client that utilizes some proprietary vendor attributes to retrieve the username and groups of the user from the FUSION server. If token validation is successful, the program should write accept followed by a space-separated list of groups that the user is a member of, and additional information as described below. Again, assuming that bob's token indeed was topsecret, and that bob is a member of the admin and the lamers groups, the program should write accept admin lamers $uid $gid $supplementary_gids $HOME $USER on its standard output and then exit.

    There is a general limit of 16000 bytes of output from the externalvalidation program.

    Thus the format of the output from an externalvalidation program when token validation authentication is successful should be:

    "accept $groups $uid $gid $supplementary_gids $HOME $USER\n"

    Where:

    • $groups is a space-separated list of the group names the user is a member of.

    • $uid is the UNIX integer user ID NSO should use as a default when executing commands for this user.

    • $gid is the UNIX integer group ID NSO should use as a default when executing commands for this user.

    • $supplementary_gids is a (possibly empty) space-separated list of additional UNIX group IDs the user is also a member of.

    • $HOME is the directory that should be used as HOME for this user when NSO executes commands on behalf of this user.

    • $USER is the user derived from mapping the token.
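A minimal Python sketch of an externalvalidation program following this format; the token table and the uid/gid/HOME values are hypothetical stand-ins for a real token service:

```python
#!/usr/bin/env python3
# Minimal sketch of an external token-validation executable. The token
# table below is a hypothetical stand-in for a real token backend; the
# uid/gid/supplementary-gid/HOME values are examples.
import sys

TOKENS = {"topsecret": ("bob", "admin lamers")}

def validate(line):
    # NSO writes "[token;]\n" on stdin.
    token = line.strip()[1:-2]
    if token not in TOKENS:
        return "abort\n"
    user, groups = TOKENS[token]
    # "accept $groups $uid $gid $supplementary_gids $HOME $USER\n"
    return "accept %s 1000 1000 100 /home/%s %s\n" % (groups, user, user)

if __name__ == "__main__":
    sys.stdout.write(validate(sys.stdin.readline()))
```

Note that, unlike the externalauth sketch, the username is appended after $HOME, since it is derived from the token.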

    It is further possible for the program to return a new token on successful token validation authentication, by using "accept_token" instead of "accept":

    "accept_token $groups $uid $gid $supplementary_gids $HOME $USER $token\n"

    Where:

    • $token is an arbitrary string. NSO will then, for some northbound interfaces, include this token in responses.

    It is also possible for the program to return additional information on successful token validation authentication, by using "accept_info" instead of "accept":

    "accept_info $groups $uid $gid $supplementary_gids $HOME $USER $info\n"

    Where:

    • $info is some arbitrary text. NSO will then just append this text to the generated audit log message (CONFD_EXT_LOGIN).

    Yet another possibility is for the program to return a warning that the user's password is about to expire, by using "accept_warning" instead of "accept":

    "accept_warning $groups $uid $gid $supplementary_gids $HOME $USER $warning\n"

    Where:

    • $warning is an appropriate warning message. The message will be processed by NSO according to the setting of /ncs-config/aaa/expiration-warning in ncs.conf.

    There is also support for token variations of "accept_info" and "accept_warning" namely "accept_token_info" and "accept_token_warning". Both "accept_token_info" and "accept_token_warning" expect the external program to output exactly the same as described above with the addition of a token after $USER:

    "accept_token_info $groups $uid $gid $supplementary_gids $HOME $USER $token $info\n"

    "accept_token_warning $groups $uid $gid $supplementary_gids $HOME $USER $token $warning\n"

    If token validation authentication fails, the program should write "reject" or "abort", possibly followed by a reason for the rejection and a trailing newline. For example "reject Bad password\n" or just "abort\n". The difference between "reject" and "abort" is that with "reject", NSO will try subsequent mechanisms configured for /ncs-config/aaa/validation-order in ncs.conf (if any), while with "abort", the token validation authentication fails immediately. Thus "abort" can prevent subsequent mechanisms from being tried. Currently, the only available token validation authentication mechanism is the external one.

    Supported by some northbound APIs, such as JSON-RPC and CLI over SSH, the external validation may also choose to issue a challenge:

    "challenge $challenge-id $challenge-prompt\n"

    The challenge prompt may be multi-line, which is why it must be base64 encoded.

    For more information on multi-factor authentication, see External Multi-Factor Authentication.

    External Multi-Factor Authentication

    When username, password, or token authentication is not enough, a challenge may be sent from any of the external authentication mechanisms to the user. A challenge consists of a challenge ID and a base64-encoded challenge prompt, and the user is expected to send a response to the challenge. Currently, only JSON-RPC and CLI over SSH support multi-factor authentication. Responses to challenges of multi-factor authentication have the same output as the token authentication mechanism.

    If this feature is configured, NSO will invoke the executable configured in /ncs-config/aaa/external-challenge/executable in ncs.conf, and pass the challenge ID and response on stdin using the string notation: "[challenge-id;response;]\n".

    For example, suppose the user bob has received a challenge from external authentication, external validation, or an external challenge, and then attempts to log in over JSON-RPC with a response to that challenge, using challenge ID "22efa" and response "ae457b". With the external challenge mechanism enabled, NSO will invoke the configured executable and write "[22efa;ae457b;]\n" on the stdin stream for the executable.

    The task of the executable is then to validate the challenge ID and response combination, thereby authenticating the user and also establishing the username and the username-to-groups mapping.

    For example, the executable could be a RADIUS client which utilizes some proprietary vendor attributes to retrieve the username and groups of the user from the RADIUS server. If validation of the challenge ID and response is successful, the program should write "accept " followed by a space-separated list of groups the user is a member of, and additional information as described below. Again, assuming that bob's challenge ID and response combination was indeed "22efa", "ae457b", and that bob is a member of the admin and lamers groups, the program should write "accept admin lamers $uid $gid $supplementary_gids $HOME $USER\n" on its standard output and then exit.
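    A minimal sketch of such an executable is shown below (illustrative only, not NSO code; the static lookup table, user attributes, and values are assumptions, and a real script would instead query e.g. a RADIUS server). It reads the "[challenge-id;response;]\n" notation and produces an accept or reject line:

```python
#!/usr/bin/env python3
# Hypothetical external challenge executable (sketch, not NSO code).
# Assumed static database: challenge ID -> (expected response, user info).
USERS = {
    "22efa": ("ae457b",
              {"user": "bob", "groups": ["admin", "lamers"],
               "uid": 1000, "gid": 1000, "gids": [100],
               "home": "/home/bob"}),
}

def handle(line):
    # Input notation: "[challenge-id;response;]\n"
    fields = line.strip().strip("[]").split(";")
    challenge_id, response = fields[0], fields[1]
    entry = USERS.get(challenge_id)
    if entry is None or entry[0] != response:
        return "reject Bad challenge response\n"
    u = entry[1]
    # Output notation: "accept $groups $uid $gid $supplementary_gids $HOME $USER\n"
    return "accept {} {} {} {} {} {}\n".format(
        " ".join(u["groups"]), u["uid"], u["gid"],
        " ".join(str(g) for g in u["gids"]), u["home"], u["user"])

# A real executable would end with:
#   import sys; sys.stdout.write(handle(sys.stdin.readline()))
```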

    There is a general limit of 16000 bytes of output from the external challenge program.

    Thus, the format of the output from an external challenge program when challenge-based authentication is successful should be:

    "accept $groups $uid $gid $supplementary_gids $HOME $USER\n"

    Where:

    • $groups is a space-separated list of the group names the user is a member of.

    • $uid is the UNIX integer user ID NSO should use as a default when executing commands for this user.

    • $gid is the UNIX integer group ID NSO should use as a default when executing commands for this user.

    • $supplementary_gids is a (possibly empty) space-separated list of additional UNIX group IDs the user is also a member of.

    • $HOME is the directory that should be used as HOME for this user when NSO executes commands on behalf of this user.

    • $USER is the user derived from mapping the challenge ID and response.

    It is further possible for the program to return a token on successful authentication, by using "accept_token" instead of "accept":

    "accept_token $groups $uid $gid $supplementary_gids $HOME $USER $token\n"

    Where:

    • $token is an arbitrary string. NSO will then, for some northbound interfaces, include this token in responses.

    It is also possible for the program to return additional information on successful authentication, by using "accept_info" instead of "accept":

    "accept_info $groups $uid $gid $supplementary_gids $HOME $USER $info\n"

    Where:

    • $info is some arbitrary text. NSO will then just append this text to the generated audit log message (CONFD_EXT_LOGIN).

    Yet another possibility is for the program to return a warning that the user's password is about to expire, by using "accept_warning" instead of "accept":

    "accept_warning $groups $uid $gid $supplementary_gids $HOME $USER $warning\n"

    Where:

    • $warning is an appropriate warning message. The message will be processed by NSO according to the setting of /ncs-config/aaa/expiration-warning in ncs.conf.

    There is also support for token variations of "accept_info" and "accept_warning", namely "accept_token_info" and "accept_token_warning". Both "accept_token_info" and "accept_token_warning" expect the external program to output exactly the same as described above, with the addition of a token after $USER:

    "accept_token_info $groups $uid $gid $supplementary_gids $HOME $USER $token $info\n"

    "accept_token_warning $groups $uid $gid $supplementary_gids $HOME $USER $token $warning\n"

    If authentication fails, the program should write "reject" or "abort", possibly followed by a reason for the rejection and a trailing newline. For example "reject Bad challenge response\n" or just "abort\n". The difference between "reject" and "abort" is that with "reject", NSO will try subsequent mechanisms configured for /ncs-config/aaa/challenge-order in ncs.conf (if any), while with "abort", the challenge-response authentication fails immediately. Thus "abort" can prevent subsequent mechanisms from being tried. Currently, the only available challenge-response authentication mechanism is the external one.

    Supported by some northbound APIs, such as JSON-RPC and CLI over SSH, the external challenge may also choose to issue a new challenge:

    "challenge $challenge-id $challenge-prompt\n"

    The challenge prompt may be multi-line, so it must be base64 encoded.

    Note that when using challenges with the CLI over SSH, /ncs-config/cli/ssh/use-keyboard-interactive needs to be set to true for the challenges to be sent correctly to the client.

    The SSH client may need to be configured to allow a higher number of password prompts, e.g. with the OpenSSH option -o NumberOfPasswordPrompts; otherwise, the default limit may cause unexpected behavior when the client is presented with multiple challenges.
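    For example, with OpenSSH the limit can be raised per invocation (the host name and port below are placeholders for your NSO CLI SSH endpoint):

```shell
# Allow up to 5 password/challenge prompts in one session (OpenSSH).
ssh -o NumberOfPasswordPrompts=5 -p 2024 admin@nso.example.com
```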

    Package Authentication

    The Package Authentication functionality allows for packages to handle the NSO authentication in a customized fashion. Authentication data can e.g. be stored remotely, and a script in the package is used to communicate with the remote system.

    Compared to external authentication, the Package Authentication mechanism allows specifying multiple packages to be invoked in the order they appear in the configuration. NSO provides implementations for LDAP, SAMLv2, and TACACS+ protocols with packages available in $NCS_DIR/packages/auth/. Additionally, you can implement your own authentication packages as detailed below.

    Authentication packages are NSO packages with the required content of an executable file scripts/authenticate. This executable basically follows the same API and limitations as the external authentication script, but with a different input format and some additional functionality. Other than these requirements, it is possible to customize the package arbitrarily.

    Package authentication is supported for Single Sign-On (see Single Sign-On in Web UI), JSON-RPC, and RESTCONF. Note that Single Sign-On and (non-batch) JSON-RPC allow all functionality, while the RESTCONF interface will treat anything other than an "accept_username" reply from the package as if authentication failed.

    Package authentication is enabled by setting the ncs.conf options /ncs-config/aaa/package-authentication/enabled to true, and adding the package by name in the /ncs-config/aaa/package-authentication/packages list. The order of the configured packages is the order that the packages will be used when attempting to authenticate a user. See ncs.conf(5) in Manual Pages for details.

    If this feature is configured in ncs.conf, NSO will, for each configured package, invoke scripts/authenticate and pass the username, password, original HTTP request (i.e. the user-supplied next query parameter), HTTP request, HTTP headers, HTTP body, client source IP, client source port, northbound API context, and protocol on stdin using the string notation: "[user;password;orig_request;request;headers;body;src-ip;src-port;ctx;proto;]\n".

    The fields user, password, orig_request, request, headers, and body are all base64 encoded.

    If the body length exceeds the partial_post_size of the RESTCONF server, the body passed to the authenticate script will only contain the string '==nso_package_authentication_partial_body=='.

    The original request will be prefixed with the string ==nso_package_authentication_next== before the base64 encoded part. This means supplying the next query parameter value /my-location will pass the following string to the authentication script: ==nso_package_authentication_next==L215LWxvY2F0aW9u.

    For example, assume that package authentication is enabled and configured with the cisco-nso-saml2-auth package, and that single sign-on is enabled. If an unauthenticated user attempts to start a single sign-on process over the northbound HTTP-based APIs, NSO will, for each configured package, invoke the executable scripts/authenticate and write "[;;;R0VUIC9zc28vc2FtbC9sb2dpbi8gSFRUUC8xLjE=;;;127.0.0.1;59226;webui;https;]\n" on the stdin stream for the executable.

    For clarity, this is the same content with the base64-encoded fields decoded: "[;;;GET /sso/saml/login/ HTTP/1.1;;;127.0.0.1;59226;webui;https;]\n".
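    The stdin parsing side of such a package can be sketched as follows (illustrative only, not NSO code; the field handling follows the notation described above, while any credential check or group lookup that would follow is left out):

```python
#!/usr/bin/env python3
# Sketch (not NSO code) of the stdin parsing in an authentication
# package's scripts/authenticate executable.
import base64

FIELDS = ["user", "password", "orig_request", "request",
          "headers", "body", "src_ip", "src_port", "ctx", "proto"]
B64_FIELDS = {"user", "password", "orig_request", "request",
              "headers", "body"}
NEXT_PREFIX = "==nso_package_authentication_next=="

def parse(line):
    # Input notation:
    # "[user;password;orig_request;request;headers;body;src-ip;src-port;ctx;proto;]\n"
    values = line.strip().strip("[]").split(";")[:len(FIELDS)]
    req = dict(zip(FIELDS, values))
    for name in B64_FIELDS:
        val = req[name]
        # The user-supplied "next" query parameter arrives with a prefix
        # before the base64 part, as described above.
        if name == "orig_request" and val.startswith(NEXT_PREFIX):
            val = val[len(NEXT_PREFIX):]
        req[name] = base64.b64decode(val).decode() if val else ""
    return req

# After authenticating, the script would print e.g.
# "accept_username bob admin wheel 1000 1000 100 /home/bob\n"
```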

    The task of the package is then to authenticate the user and also establish the username-to-groups mapping.

    For example, the package could support a SAMLv2 authentication protocol which communicates with an Identity Provider (IdP) for authentication. If authentication is successful, the program should write either "accept", or "accept_username", depending on whether the authentication is started with a username or if an external entity handles the entire authentication and supplies the username for a successful authentication. (SAMLv2 uses accept_username, since the IdP handles the entire authentication.) The "accept_username " is followed by a username and then followed by a space-separated list of groups the user is a member of, and additional information as described below. If authentication is successful and the authenticated user bob is a member of the groups admin and wheel, the program should write "accept_username bob admin wheel 1000 1000 100 /home/bob\n" on its standard output and then exit.

    There is a general limit of 16000 bytes of output from the "packageauth" program.

    Thus the format of the output from a packageauth program when authentication is successful should be either the same as from externalauth (see External Authentication) or the following:

    "accept_username $USER $groups $uid $gid $supplementary_gids $HOME\n"

    Where:

    • $USER is the user derived during the execution of the "packageauth" program.

    • $groups is a space-separated list of the group names the user is a member of.

    • $uid is the UNIX integer user ID NSO should use as a default when executing commands for this user.

    • $gid is the UNIX integer group ID NSO should use as a default when executing commands for this user.

    • $supplementary_gids is a (possibly empty) space-separated list of additional UNIX group IDs the user is also a member of.

    • $HOME is the directory that should be used as HOME for this user when NSO executes commands on behalf of this user.

    In addition to the externalauth API, the authentication packages can also return the following responses:

    • unknown 'reason' - (reason being plain-text) if they can't handle authentication for the supplied input.

    • redirect 'url' - (url being base64 encoded) for an HTTP redirect.

    • content 'content-type' 'content' - (content-type being plain-text mime-type and content being base64 encoded) to relay supplied content.

    • accept_username_redirect url $USER $groups $uid $gid $supplementary_gids $HOME - which combines the accept_username and redirect.

    It is also possible for the program to return additional information on successful authentication, by using "accept_info" instead of "accept":

    "accept_info $groups $uid $gid $supplementary_gids $HOME $info\n"

    Where:

    • $info is some arbitrary text. NSO will then just append this text to the generated audit log message (NCS_PACKAGE_AUTH_SUCCESS).

    Yet another possibility is for the program to return a warning that the user's password is about to expire, by using "accept_warning" instead of "accept":

    "accept_warning $groups $uid $gid $supplementary_gids $HOME $warning\n"

    Where:

    • $warning is an appropriate warning message. The message will be processed by NSO according to the setting of /ncs-config/aaa/expiration-warning in ncs.conf.

    If authentication fails, the program should write "reject" or "abort", possibly followed by a reason for the rejection and a trailing newline. For example "reject 'Bad password'\n" or just "abort\n". The difference between "reject" and "abort" is that with "reject", NSO will try subsequent mechanisms configured for /ncs-config/aaa/auth-order, and packages configured for /ncs-config/aaa/package-authentication/packages in ncs.conf (if any), while with "abort", the authentication fails immediately. Thus "abort" can prevent subsequent mechanisms from being tried, but when external authentication is the last mechanism (as in the default order), it has the same effect as "reject".

    When package authentication is used, the group list returned by the package executable is prepended by any possible group information stored locally under the /aaa tree. Hence when package authentication is used, it is indeed possible to have the entire /aaa/authentication tree empty. The group assignment performed by the external program will still be valid and the relevant groups will be used by NSO when the authorization rules are checked.

    Username/Password Package Authentication for CLI

    Package authentication will invoke scripts/authenticate when a user tries to authenticate using the CLI. In this case, only the username, password, client source IP, client source port, northbound API context, and protocol will be passed to the script.

    When serving a username/password request, script output other than accept, challenge or abort will be treated as if authentication failed.

    Package Challenges

    When this feature is enabled, i.e. /ncs-config/aaa/package-authentication/package-challenge/enabled is set to true, packages will also be used to try to resolve challenges sent to the server; this is only supported by CLI over SSH. The script scripts/challenge will be invoked, passing the challenge ID, response, client source IP, client source port, northbound API context, and protocol on stdin using the string notation: "[challengeid;response;src-ip;src-port;ctx;proto;]\n". The output should follow that of the authenticate script.

    The fields challengeid and response are base64 encoded when passed to the script.

    Authenticating IPC Access

    NSO communicates with clients (client libraries, ncs_cli, and similar) using the NSO IPC socket. The protocol used allows the client to provide user and group information to use for authorization in NSO, effectively delegating authentication to the client.

    By default, only local connections to the IPC socket are allowed. If all local clients are considered trusted, the socket can provide unauthenticated access, with the client-supplied user name. This is what the --user option of ncs_cli does. For example:

        ncs_cli --user admin

    connects to NSO as the user admin. The same is possible for the group. This unauthenticated access is currently the default.

    The main condition here is that all clients connecting to the socket are trusted to use the correct user and group information. That is often not the case, such as untrusted users having shell access to the host to run ncs_cli or otherwise initiate local connections to the IPC socket. Then access to the socket must be restricted.

    In general, authenticating access to the IPC socket is a security best practice and should always be used. NSO implements it as an access check, where every IPC client must prove that it has access to a pre-shared key. See Restricting Access to the IPC Port on how to enable it.

    Group Membership

    Once a user is authenticated, group membership must be established. A single user can be a member of several groups. Group membership is used by the authorization rules to decide which operations a certain user is allowed to perform. Thus the NSO AAA authorization model is entirely group-based. This is also sometimes referred to as role-based authorization.

    All groups are stored under /nacm/groups, and each group contains a number of usernames. The ietf-netconf-acm.yang model defines a group entry:

    The tailf-acm.yang model augments this with a gid leaf:

    A valid group entry could thus look like:
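    (Reconstructed illustration matching the users and group named in the surrounding text; the gid value and namespace prefix are arbitrary examples.)

```xml
<nacm xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-acm"
      xmlns:tacm="http://tail-f.com/yang/acm">
  <groups>
    <group>
      <name>admin</name>
      <user-name>bob</user-name>
      <user-name>joe</user-name>
      <tacm:gid>100</tacm:gid>
    </group>
  </groups>
</nacm>
```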

    The above XML data would then mean that users bob and joe are members of the admin group. The users need not necessarily exist as actual users under /aaa/authentication/users in order to belong to a group. If for example PAM authentication is used, it does not make sense to have all users listed under /aaa/authentication/users.

    By default, the user is assigned to groups by using any groups provided by the northbound transport (e.g. via the ncs_cli or netconf-subsys programs), by consulting data under /nacm/groups, by consulting the /etc/group file, and by using any additional groups supplied by the authentication method. If /nacm/enable-external-groups is set to "false", only the data under /nacm/groups is consulted.

    The resulting group assignment is the union of these methods, if it is non-empty. Otherwise, the default group is used, if configured ( /ncs-config/aaa/default-group in ncs.conf).
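    The resulting assignment logic can be sketched as follows (illustrative pseudologic, not NSO code):

```python
# Sketch of the group assignment described above: the union of
# transport-provided groups, /nacm/groups lookups, /etc/group
# membership, and groups supplied by the authentication method,
# falling back to the configured default group when the union is empty.
def assign_groups(transport, nacm, etc_group, auth_method,
                  enable_external_groups=True, default_group=None):
    if enable_external_groups:
        groups = set(transport) | set(nacm) | set(etc_group) | set(auth_method)
    else:
        # /nacm/enable-external-groups = false: only /nacm/groups counts.
        groups = set(nacm)
    if not groups and default_group is not None:
        groups = {default_group}
    return sorted(groups)
```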

    A user entry has a UNIX uid and UNIX gid assigned to it. Groups may have optional group IDs. When a user is logged in, and NSO tries to execute commands on behalf of that user, the uid/gid for the command execution is taken from the user entry. Furthermore, UNIX supplementary group IDs are assigned according to the gid's in the groups where the user is a member.

    Authorization

    Once a user is authenticated and group membership is established, when the user starts to perform various actions, each action must be authorized. Normally the authorization is done based on rules configured in the AAA data model as described in this section.

    The authorization procedure first checks the value of /nacm/enable-nacm. This leaf has a default of true, but if it is set to false, all access is permitted. Otherwise, the next step is to traverse the rule-list list:

    If the group leaf-list in a rule-list entry matches any of the user's groups, the cmdrule list entries are examined for command authorization, while the rule entries are examined for RPC, notification, and data authorization.

    Command Authorization

    The tailf-acm.yang module augments the rule-list entry in ietf-netconf-acm.yang with a cmdrule list:

    Each rule has seven leafs. The first is the name list key; the following three are matching leafs. When NSO tries to run a command, it matches the command against the matching leafs, and if all of context, command, and access-operations match, the fifth leaf, the action, is applied.

    • name: name is the name of the rule. The rules are checked in order, with the ordering given by the YANG ordered-by user semantics, i.e. independent of the key values.

    • context: context is either of the strings cli, webui, or * for a command rule. This means that we can differentiate authorization rules for which access method is used. Thus if command access is attempted through the CLI, the context will be the string cli whereas for operations via the Web UI, the context will be the string webui.

    • command: This is the actual command getting executed. If the rule applies to one or several CLI commands, the string is a space-separated list of CLI command tokens, for example request system reboot. If the command applies to Web UI operations, it is a space-separated string similar to a CLI string. A string that consists of just * matches any command. In general, we do not recommend using command rules to protect the configuration. Use rules for data access as described in the next section to control access to different parts of the data. Command rules should be used only for CLI commands and Web UI operations that cannot be expressed as data rules. The individual tokens can be POSIX extended regular expressions. Each regular expression is implicitly anchored, i.e. an ^ is prepended and a $ is appended to the regular expression.

    • access-operations: access-operations is used to match the operation that NSO tries to perform. It must be one or both of the "read" and "exec" values from the access-operations-type bits type definition in ietf-netconf-acm.yang, or "*" to match any operation.

    • action: If all of the previous fields match, the rule as a whole matches and the value of action will be taken. I.e. if a match is found, a decision is made whether to permit or deny the request in its entirety. If action is permit, the request is permitted, if action is deny, the request is denied and an entry is written to the developer log.

    • log-if-permit: If this leaf is present, an entry is written to the developer log for a matching request also when action is permit. This is very useful when debugging command rules.

    • comment: An optional textual description of the rule.
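    The implicit anchoring of command-rule tokens described above can be sketched like this (illustrative only; Python's re module stands in for POSIX extended regular expressions, and re.fullmatch() provides the implicit ^...$ anchoring):

```python
# Sketch of matching a command rule's space-separated tokens as
# implicitly anchored regular expressions.
import re

def command_matches(rule, command):
    if rule == "*":  # a string of just * matches any command
        return True
    rule_tokens = rule.split()
    cmd_tokens = command.split()
    if len(rule_tokens) != len(cmd_tokens):
        return False
    # fullmatch() anchors each token pattern at both ends.
    return all(re.fullmatch(pat, tok)
               for pat, tok in zip(rule_tokens, cmd_tokens))
```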

    For the rule processing to be written to the developer log, the /ncs-config/logs/developer-log-level entry in ncs.conf must be set to trace.

    If no matching rule is found in any of the cmdrule lists in any rule-list entry that matches the user's groups, this augmentation from tailf-acm.yang is relevant:

    • If read access is requested, the value of /nacm/cmd-read-default determines whether access is permitted or denied.

    • If exec access is requested, the value of /nacm/cmd-exec-default determines whether access is permitted or denied.

    If access is permitted due to one of these default leafs, the /nacm/log-if-default-permit leaf has the same effect as the log-if-permit leaf for the cmdrule lists.

    RPC, Notification, and Data Authorization

    The rules in the rule list are used to control access to rpc operations, notifications, and data nodes defined in YANG models. Access to invocation of actions (tailf:action) is controlled with the same method as access to data nodes, with a request for exec access. ietf-netconf-acm.yang defines a rule entry as:

    tailf-acm augments this with two additional leafs:

    Similar to the command access check, whenever a user through some agent tries to access an RPC, a notification, a data item, or an action, access is checked. For a rule to match, three or four leafs must match and when a match is found, the corresponding action is taken.

    We have the following leafs in the rule list entry.

    • name: The name of the rule. The rules are checked in order, with the ordering given by the YANG ordered-by user semantics, i.e. independent of the key values.

    • module-name: The module-name string is the name of the YANG module where the node being accessed is defined. The special value * (i.e. the default) matches all modules.

    • rpc-name / notification-name / path: This is a choice between three possible leafs that are used for matching, in addition to the module-name:

    • rpc-name: The name of an RPC operation, or * to match any RPC.

    • notification-name: The name of a notification, or * to match any notification.

    • path: A restricted XPath expression leading down into the populated XML tree. A rule with a path specified matches if it is equal to or shorter than the checked path. Several types of paths are allowed.

      1. Tagpaths that do not contain any keys. For example /ncs/live-device/live-status.

      2. Instantiated keys: e.g. /devices/device[name="x1"]/config/interface matches the interface configuration for the managed device "x1". It is possible to have partially instantiated paths containing only some keys, i.e. combinations of tagpaths and keypaths.

    • context: context is either of the strings cli, netconf, webui, snmp, or * for a data rule. Furthermore, when we initiate user sessions from MAAPI, we can choose any string we want. Similarly to command rules, we can differentiate access depending on which agent is used to gain access.

    • access-operations: access-operations is used to match the operation that NSO tries to perform. It must be one or more of the "create", "read", "update", "delete" and "exec" values from the access-operations-type bits type definition in ietf-netconf-acm.yang, or "*" to match any operation.

    • action: This leaf has the same characteristics as the action leaf for command access.

    • log-if-permit: This leaf has the same characteristics as the log-if-permit leaf for command access.

    • comment: An optional textual description of the rule.

    If no matching rule is found in any of the rule lists in any rule-list entry that matches the user's groups, the data model node for which access is requested is examined for the presence of the NACM extensions:

    • If the nacm:default-deny-all extension is specified for the data model node, the access is denied.

    • If the nacm:default-deny-write extension is specified for the data model node, and create, update, or delete access is requested, the access is denied.

    If examination of the NACM extensions did not result in access being denied, the value (permit or deny) of the relevant default leaf is examined:

    • If read access is requested, the value of /nacm/read-default determines whether access is permitted or denied.

    • If create, update, or delete access is requested, the value of /nacm/write-default determines whether access is permitted or denied.

    • If exec access is requested, the value of /nacm/exec-default determines whether access is permitted or denied.

    If access is permitted due to one of these default leafs, this augmentation from tailf-acm.yang is relevant:

    I.e. it has the same effect as the log-if-permit leaf for the rule lists, but for the case where the value of one of the default leafs permits access.

    When NSO executes a command, the command rules in the authorization database are searched. The rules are tried in order, as described above. When a rule matches the command that NSO is attempting to execute, the action of the matching rule is applied, whether permit or deny.

    When actual data access is attempted, the data rules are searched. E.g. when a user attempts to execute delete aaa in the CLI, the user needs delete access to the entire tree /aaa.

    Another example: if a CLI user types show configuration aaa TAB, it suffices to have read access to at least one item below /aaa for the CLI to perform the TAB completion. If no rule matches, or an explicit deny rule is found, the CLI will not TAB-complete.

    Yet another example: if a user tries to execute delete aaa authentication users, we need to perform a check on the paths /aaa and /aaa/authentication before attempting to delete the sub-tree. Say that we have a permit rule for the path /aaa/authentication/users and a subsequent deny rule for the path /aaa. With this rule set, the user should indeed be allowed to delete the entire /aaa/authentication/users tree, but neither the /aaa tree nor the /aaa/authentication tree.

    We have two variations on how the rules are processed. The easy case is when we actually try to read or write an item in the configuration database. The execution goes like this:

    The second case is when we execute TAB completion in the CLI. This is more complicated. The execution goes like this:

    The idea is that as we traverse down the XML tree (through TAB), we must continue for as long as at least one rule could still match once we have more data. For example, assume we have:

    1. "/system/config/foo" --> permit

    2. "/system/config" --> deny

    If we stand at "/system/config" in the CLI and hit TAB, we want the CLI to show foo as a completion, but none of the other nodes that exist under /system/config. If we instead try to execute delete /system/config, the request must be rejected.
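    The two behaviors can be modeled like this (illustrative only, not NSO code), using the two rules above as ordered (path, action) pairs:

```python
# Model of ordered data-rule evaluation. A rule matches an access check
# if its path is equal to or shorter than (a prefix of) the checked
# path; TAB completion additionally keeps a node visible while a longer
# permit rule could still match deeper down.
RULES = [("/system/config/foo", "permit"),
         ("/system/config", "deny")]

def _parts(path):
    return [p for p in path.split("/") if p]

def check_access(path, rules=RULES, default="deny"):
    target = _parts(path)
    for rule_path, action in rules:
        rp = _parts(rule_path)
        if rp == target[:len(rp)]:        # rule path is a prefix (or equal)
            return action
    return default

def may_complete(path, rules=RULES):
    target = _parts(path)
    for rule_path, action in rules:
        rp = _parts(rule_path)
        if rp == target[:len(rp)]:        # definite answer at this depth
            return action == "permit"
        if action == "permit" and target == rp[:len(target)]:
            return True                   # a permit rule may match deeper
    return False
```

    With these rules, foo is still offered as a completion under /system/config while delete /system/config is denied, matching the behavior described above.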

    By default, NACM rules are configured for entire tailf:action or YANG 1.1 action statements, but not for their input statement child leafs. To override this behavior and enable NACM rules on input leafs, set /ncs-config/aaa/action-input-rules/enabled to true. When enabled, all input leafs given to an action will be validated against NACM rules. If broad deny NACM rules are used, you might need to add permit rules for the affected action input leafs to allow actions to be invoked with parameters.
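    As a sketch, enabling this in ncs.conf would be a fragment along these lines (element placement follows the /ncs-config/aaa/action-input-rules/enabled path):

```xml
<ncs-config xmlns="http://tail-f.com/yang/tailf-ncs-config">
  <aaa>
    <action-input-rules>
      <enabled>true</enabled>
    </action-input-rules>
  </aaa>
</ncs-config>
```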

    NACM Rules and Services

    By design, NACM rules are ignored for changes done by services, whether FASTMAP, Reactive FASTMAP, or Nano services. The reasoning behind this is that a service package can be seen as a controlled way to provide limited access to devices for a user group that is not allowed to apply arbitrary changes on the devices.

    However, there are NSO installations where this behavior is not desired, and NSO administrators want to enforce NACM rules even on changes done by services. For this purpose, the leaf called /nacm/enforce-nacm-on-services is provided. By default, it is set to false.

    Note however that currently, even with this leaf set to true, there are limitations. Namely, the post-actions for nano-services are run in a user session without any access checks. Besides that, NACM rules are not enforced on the read operations performed in the service callbacks.

    It might be desirable to deny everything for a user group and only allow access to a specific service. This pattern could be used to allow an operator to provision the service, but deny everything else. While this pattern works for a normal FASTMAP service, there are some caveats for stacked services, Reactive FASTMAP, and Nano services. For these kinds of services, in addition to the service itself, access should be provided to the user group for the following paths:

    • In case of stacked services, the user group needs read and write access to the leaf private/re-deploy-counter under the bottom service. Otherwise, the user will not be able to redeploy the service.

    • In the case of Reactive FASTMAP or Nano services, the user group needs read and write access to the following:

      • /zombies

      • /side-effect-queue

      • /kickers
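
    A sketch of such rule entries, following the XML rule-list format used in the examples below (the group name oper and the exact operation sets are assumptions to adapt to your deployment):

```xml
<rule-list>
  <name>oper-nano-services</name>
  <group>oper</group>
  <rule>
    <name>zombies</name>
    <path>/zombies</path>
    <access-operations>read create update delete</access-operations>
    <action>permit</action>
  </rule>
  <rule>
    <name>side-effect-queue</name>
    <path>/side-effect-queue</path>
    <access-operations>read create update delete</access-operations>
    <action>permit</action>
  </rule>
  <rule>
    <name>kickers</name>
    <path>/kickers</path>
    <access-operations>read create update delete</access-operations>
    <action>permit</action>
  </rule>
</rule-list>
```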

    Device Group Authorization

    In deployments with many devices, it can become cumbersome to handle data authorization per device. To help with this, there is a rule type that works on device group membership (for more on device groups, see Device Groups). Devices are added to different device groups, and the rule type device-group-rule is used.

    The IETF NACM rule-type choice is augmented with a new case named device-group-rule, which contains a leafref to the device groups. See the following example.

    In the example below, we configure two device groups based on different regions and add devices to them.

    In the example below, we configure an operator for the us_east region:

    In the example below, we configure the device group authorization rules, referring to the us_east device group and the us_east NACM group.

    In summary, device group authorization gives a more compact configuration for deployments where devices can be grouped and authorization can be done on a device group basis.

    It is recommended to restrict modifications of the device-group subtree to a limited set of users.
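
    One way to do this is a deny rule for non-administrator groups; a sketch (the group name oper is an example):

```xml
<rule-list>
  <name>protect-device-groups</name>
  <group>oper</group>
  <rule>
    <name>deny-device-group-changes</name>
    <path>/devices/device-group</path>
    <access-operations>create update delete</access-operations>
    <action>deny</action>
  </rule>
</rule-list>
```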

    Authorization Examples

    Assume that we have two groups, admin and oper. We want admin to be able to see and edit the XML tree rooted at /aaa, but we do not want users who are members of the oper group to even see the /aaa tree. We would have the following rule list and rule entries. Note that here we use the XML data from tailf-aaa.yang as an example; the same applies to all data, for all data models loaded into the system.

    If we do not want the members of oper to be able to execute the NETCONF operation edit-config, we define the following rule list and rule entries:

    To spell it out, the above rule defines four conditions that must all match: the request is a NETCONF operation, the operation is edit-config, the user who runs the command is a member of the oper group, and the requested access is exec (execute). If all of these match, the action is deny.

    The path leaf can be used to specify explicit paths into the XML tree using XPath syntax. For example the following:

    This explicitly allows the admin group to change the password for precisely the user bob, when the user is using the CLI. Had path been /aaa/authentication/users/user/password, the rule would apply to all password elements for all users. Since the path leaf completely identifies the nodes that the rule applies to, we do not need to set tailf-aaa for the module-name leaf.

    NSO applies variable substitution, whereby the username of the logged-in user can be used in a path. Thus:

    The above rule allows all users that are part of the admin group to change their own passwords only.

    A member of oper can execute a NETCONF action operation if that member has exec access on the NETCONF RPC action operation, read access on all instances in the hierarchy of data nodes that identify the specific action in the data store, and exec access on the specific action itself. For example, consider the action defined below.

    To be able to execute the double action through NETCONF RPC, the members of oper need the following rule list and rule entries.

    Or, a simpler rule set such as the following.

    Finally, if we wish members of the oper group to never be able to execute the request system reboot command, also available as a reboot NETCONF rpc, we have:

    Troubleshooting NACM Rules

    In this section, we list some tips to make it easier to troubleshoot NACM rules.

    Use log-if-permit and log-if-default-permit together with the developer log level set to trace.

    Use the log-if-permit leaf, a tailf-acm.yang module augmentation, for rules with action permit. When such a rule triggers a permit action, a trace entry is added to the developer log. To see trace entries, make sure /ncs-config/logs/developer-log-level is set to trace.

    If you have a default rule with action permit, you can use the log-if-default-permit leaf instead.
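
    For example, a permit rule with logging enabled could look like this (the rule itself is illustrative; log-if-permit comes from the tailf-acm namespace, consistent with the other tailf-acm elements in this chapter):

```xml
<rule>
  <name>allow-read-devices</name>
  <path>/devices</path>
  <access-operations>read</access-operations>
  <action>permit</action>
  <log-if-permit xmlns="http://tail-f.com/yang/acm"/>
</rule>
```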

    NACM rules are read at the start of the session and are used throughout the session.

    When a user session is created, it gathers the authorization rules that are relevant for that user's group(s). The rules are then used throughout the user session's lifetime, so updating the AAA rules does not affect active sessions. For example, if an administrator updates the NACM rules in one session, the update will not apply to any other currently active session; it will apply to new sessions created after the update.

    Explicitly state NACM groups when starting the CLI, for example, ncs_cli -u oper -g oper.

    It is the user's group membership that determines which rules apply. Starting the CLI using the ncs_cli command without explicitly setting the groups defaults to the actual UNIX groups the user is a member of. On Darwin, one of the default groups is usually admin, which can lead to the wrong group being used.

    Be careful with namespaces in rule paths.

    Unless a rule path is made explicit by specifying a namespace, it will apply to that specific path in all namespaces. Below we show parts of an example from RFC 8341, where the path element has an xmlns attribute and the path is namespaced. Had these not been namespaced, the rules would not behave as expected.

    In the example above (Excerpt from RFC 8341 Appendix A.4), the path is namespaced.

    The AAA Cache

    NSO's AAA subsystem will cache the AAA information in order to speed up the authorization process. This cache must be updated whenever there is a change to the AAA information. The mechanism for this update depends on how the AAA information is stored, as described in the following two sections.

    Populating AAA using CDB

    To start NSO, the data models for AAA must be loaded. If no actual data is loaded for these models, the defaults allow all read and exec access, while write access is denied. Access may still be further restricted by the NACM extensions, though - e.g. the /nacm container has nacm:default-deny-all, meaning that not even read access is allowed if no data is loaded.

    The NSO installation ships with an XML initialization file containing AAA configuration. The file is called aaa_init.xml and is, by default, copied to the CDB directory by the NSO install scripts.

    The local installation variant, targeting development only, defines two users, admin and oper with passwords set to admin and oper respectively for authentication. The two users belong to user groups with NACM rules restricting their authorization level. The system installation aaa_init.xml variant, targeting production deployment, defines NACM rules only as users are, by default, authenticated using PAM. The NACM rules target two user groups, ncsadmin and ncsoper. Users belonging to the ncsoper group are limited to read-only access.

    The default aaa_init.xml file provided with the NSO system installation must not be used as-is in a deployment without reviewing and verifying that every NACM rule in the file matches the intended access policy of the deployment.

    Normally, the AAA data will be stored as configuration in CDB. This allows changes to be made through NSO's transaction-based configuration management. In this case, the AAA cache is updated automatically when changes are made to the AAA data. If changing the AAA data via NSO's configuration management is not possible or desirable, it is alternatively possible to use the CDB operational data store for AAA data. In this case, the AAA cache can be updated either explicitly, e.g. by using the maapi_aaa_reload() function (see confd_lib_maapi(3) in Manual Pages), or by triggering a subscription notification by using the subscription lock when updating the CDB operational data store (see Using CDB in Development).

    Hiding the AAA Tree

    Some applications may not want to expose the AAA data to end users in the CLI or the Web UI. Two reasonable approaches exist here, both relying on the tailf:export statement: if a module has tailf:export none, it is invisible to all agents. One approach is to define another AAA model and write a transform program that maps our AAA data to the data that must exist in tailf-aaa.yang and ietf-netconf-acm.yang. This way we can choose to export and expose an entirely different AAA model.

    The other, very easy way out is to define a set of static AAA rules whereby a set of fixed users and fixed groups have fixed access to our configuration data. Possibly the only field we wish to manipulate is the password field.
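
    A minimal sketch of a module hidden from all northbound agents (the module name and namespace are placeholders):

```yang
module my-aaa-facade {
  namespace "http://example.com/my-aaa-facade";
  prefix myaaa;

  import tailf-common {
    prefix tailf;
  }

  // Hide this module from all northbound agents.
  tailf:export none;

  // ... transformed AAA data models go here ...
}
```
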

    HostKeyAlgorithms=+ssh-dss
    PubkeyAcceptedKeyTypes=+ssh-dss
    <pam>
      <enabled>true</enabled>
      <service>common-auth</service>
    </pam>
    Since the elements of the path to a given node may be defined in different YANG modules when augmentation is used, rules that have a value other than `*` for the `module-name` leaf may require that additional processing is done before a decision to permit or deny the access can be taken. Thus, if an XPath that completely identifies the nodes that the rule should apply to is given for the `path` leaf (see below), it may be best to leave the `module-name` leaf unset.
        container authentication {
          tailf:info "User management";
          container users {
            tailf:info "List of local users";
            list user {
              key name;
              leaf name {
                type string;
                tailf:info "Login name of the user";
              }
              leaf uid {
                type int32;
                mandatory true;
                tailf:info "User Identifier";
              }
              leaf gid {
                type int32;
                mandatory true;
                tailf:info "Group Identifier";
              }
              leaf password {
                type passwdStr;
                mandatory true;
              }
              leaf ssh_keydir {
                type string;
                mandatory true;
                tailf:info "Absolute path to directory where user's ssh keys
                            may be found";
              }
              leaf homedir {
                type string;
                mandatory true;
                tailf:info "Absolute path to user's home directory";
              }
            }
          }
        }
    # ssh-keygen -b 4096 -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/bob/.ssh/id_rsa):
    Created directory '/home/bob/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/bob/.ssh/id_rsa.
    Your public key has been saved in /home/bob/.ssh/id_rsa.pub.
    The key fingerprint is:
    ce:1b:63:0a:f9:d4:1d:04:7a:1d:98:0c:99:66:57:65 bob@buzz
    # ls -lt ~/.ssh
    total 8
    -rw-------  1 bob users 3247 Apr  4 12:28 id_rsa
    -rw-r--r--  1 bob users  738 Apr  4 12:28 id_rsa.pub
    <user>
      <name>bob</name>
      <uid>100</uid>
      <gid>10</gid>
      <password>$1$feedbabe$nGlMYlZpQ0bzenyFOQI3L1</password>
      <ssh_keydir>/var/system/users/bob/.ssh</ssh_keydir>
      <homedir>/var/system/users/bob</homedir>
    </user>
    # grep test /etc/group
    operator:x:37:test
    admin:x:1001:test
    <INFO> 28-Jan-2009::16:05:55.663 buzz ncs[14658]: audit user: test/0 logged
        in over ssh from 127.0.0.1 with authmeth:password
    <INFO> 28-Jan-2009::16:05:55.670 buzz ncs[14658]: audit user: test/5 assigned
        to groups: operator,admin
    <INFO> 28-Jan-2009::16:05:57.655 buzz ncs[14658]: audit user: test/5 CLI 'exit'
    ncs_cli --user admin
    list group {
      key name;
    
      description
        "One NACM Group Entry.  This list will only contain
         configured entries, not any entries learned from
         any transport protocols.";
    
      leaf name {
        type group-name-type;
        description
          "Group name associated with this entry.";
      }
    
      leaf-list user-name {
        type user-name-type;
        description
          "Each entry identifies the username of
           a member of the group associated with
           this entry.";
      }
    }
    augment /nacm:nacm/nacm:groups/nacm:group {
      leaf gid {
        type int32;
        description
          "This leaf associates a numerical group ID with the group.
           When an OS command is executed on behalf of a user,
           supplementary group IDs are assigned based on 'gid' values
           for the groups that the user is a member of.";
      }
    }
    <group>
      <name>admin</name>
      <user-name>bob</user-name>
      <user-name>joe</user-name>
      <gid xmlns="http://tail-f.com/yang/acm">99</gid>
    </group>
    list rule-list {
      key "name";
      ordered-by user;
      description
        "An ordered collection of access control rules.";
    
      leaf name {
        type string {
          length "1..max";
        }
        description
          "Arbitrary name assigned to the rule-list.";
      }
      leaf-list group {
        type union {
          type matchall-string-type;
          type group-name-type;
        }
        description
          "List of administrative groups that will be
           assigned the associated access rights
           defined by the 'rule' list.
    
           The string '*' indicates that all groups apply to the
           entry.";
      }
    
      // ...
    }
    augment /nacm:nacm/nacm:rule-list {
    
      list cmdrule {
        key "name";
        ordered-by user;
        description
          "One command access control rule. Command rules control access
           to CLI commands and Web UI functions.
    
           Rules are processed in user-defined order until a match is
           found.  A rule matches if 'context', 'command', and
           'access-operations' match the request.  If a rule
           matches, the 'action' leaf determines if access is granted
           or not.";
    
        leaf name {
          type string {
            length "1..max";
          }
          description
            "Arbitrary name assigned to the rule.";
        }
    
        leaf context {
          type union {
            type nacm:matchall-string-type;
            type string;
          }
          default "*";
          description
            "This leaf matches if it has the value '*' or if its value
             identifies the agent that is requesting access, i.e. 'cli'
             for CLI or 'webui' for Web UI.";
        }
    
        leaf command {
          type string;
          default "*";
          description
            "Space-separated tokens representing the command. Refer
             to the Tail-f AAA documentation for further details.";
        }
    
        leaf access-operations {
          type union {
            type nacm:matchall-string-type;
            type nacm:access-operations-type;
          }
          default "*";
          description
            "Access operations associated with this rule.
    
             This leaf matches if it has the value '*' or if the
             bit corresponding to the requested operation is set.";
        }
    
        leaf action {
          type nacm:action-type;
          mandatory true;
          description
            "The access control action associated with the
             rule.  If a rule is determined to match a
             particular request, then this object is used
             to determine whether to permit or deny the
             request.";
        }
    
        leaf log-if-permit {
          type empty;
          description
            "If this leaf is present, access granted due to this rule
             is logged in the developer log. Otherwise, only denied
             access is logged. Mainly intended for debugging of rules.";
        }
    
        leaf comment {
          type string;
          description
            "A textual description of the access rule.";
        }
      }
    }
    augment /nacm:nacm {
      leaf cmd-read-default {
        type nacm:action-type;
        default "permit";
        description
          "Controls whether command read access is granted
           if no appropriate cmdrule is found for a
           particular command read request.";
      }
    
      leaf cmd-exec-default {
        type nacm:action-type;
        default "permit";
        description
          "Controls whether command exec access is granted
           if no appropriate cmdrule is found for a
           particular command exec request.";
      }
    
      leaf log-if-default-permit {
        type empty;
        description
          "If this leaf is present, access granted due to one of
           /nacm/read-default, /nacm/write-default, or /nacm/exec-default
           /nacm/cmd-read-default, or /nacm/cmd-exec-default
           being set to 'permit' is logged in the developer log.
           Otherwise, only denied access is logged. Mainly intended
           for debugging of rules.";
      }
    }
    list rule {
      key "name";
      ordered-by user;
      description
        "One access control rule.
    
         Rules are processed in user-defined order until a match is
         found.  A rule matches if 'module-name', 'rule-type', and
         'access-operations' match the request.  If a rule
         matches, the 'action' leaf determines if access is granted
         or not.";
    
      leaf name {
        type string {
          length "1..max";
        }
        description
          "Arbitrary name assigned to the rule.";
      }
    
      leaf module-name {
        type union {
          type matchall-string-type;
          type string;
        }
        default "*";
        description
          "Name of the module associated with this rule.
    
           This leaf matches if it has the value '*' or if the
           object being accessed is defined in the module with the
           specified module name.";
      }
      choice rule-type {
        description
          "This choice matches if all leafs present in the rule
           match the request.  If no leafs are present, the
           choice matches all requests.";
        case protocol-operation {
          leaf rpc-name {
            type union {
              type matchall-string-type;
              type string;
            }
            description
              "This leaf matches if it has the value '*' or if
               its value equals the requested protocol operation
               name.";
          }
        }
        case notification {
          leaf notification-name {
            type union {
              type matchall-string-type;
              type string;
            }
            description
              "This leaf matches if it has the value '*' or if its
               value equals the requested notification name.";
          }
        }
        case data-node {
          leaf path {
            type node-instance-identifier;
            mandatory true;
            description
              "Data Node Instance Identifier associated with the
               data node controlled by this rule.
    
               Configuration data or state data instance
               identifiers start with a top-level data node.  A
               complete instance identifier is required for this
               type of path value.
    
               The special value '/' refers to all possible
               data-store contents.";
          }
        }
      }
    
      leaf access-operations {
        type union {
          type matchall-string-type;
          type access-operations-type;
        }
        default "*";
        description
          "Access operations associated with this rule.
    
           This leaf matches if it has the value '*' or if the
           bit corresponding to the requested operation is set.";
      }
    
      leaf action {
        type action-type;
        mandatory true;
        description
          "The access control action associated with the
           rule.  If a rule is determined to match a
           particular request, then this object is used
           to determine whether to permit or deny the
           request.";
      }
    
      leaf comment {
        type string;
        description
          "A textual description of the access rule.";
      }
    }
    augment /nacm:nacm/nacm:rule-list/nacm:rule {
    
      leaf context {
        type union {
          type nacm:matchall-string-type;
          type string;
        }
        default "*";
        description
          "This leaf matches if it has the value '*' or if its value
           identifies the agent that is requesting access, e.g. 'netconf'
           for NETCONF, 'cli' for CLI, or 'webui' for Web UI.";
    
      }
    
      leaf log-if-permit {
        type empty;
        description
          "If this leaf is present, access granted due to this rule
           is logged in the developer log. Otherwise, only denied
           access is logged. Mainly intended for debugging of rules.";
      }
    }
    augment /nacm:nacm {
      ...
      leaf log-if-default-permit {
        type empty;
        description
          "If this leaf is present, access granted due to one of
           /nacm/read-default, /nacm/write-default, /nacm/exec-default
           /nacm/cmd-read-default, or /nacm/cmd-exec-default
           being set to 'permit' is logged in the developer log.
           Otherwise, only denied access is logged. Mainly intended
           for debugging of rules.";
      }
    }
    foreach rule {
        if (match(rule, path)) {
           return rule.action;
        }
    }
    rules = select_rules_that_may_match(rules, path);
    if (any_rule_is_permit(rules))
        return permit;
    else
        return deny;
    Device Group Model Augmentation
    augment "/nacm:nacm/nacm:rule-list/nacm:rule/nacm:rule-type" {
      case device-group-rule {
        leaf device-group {
          type leafref {
            path "/ncs:devices/ncs:device-group/ncs:name";
          }
          description
            "Which device group this rule applies to.";
        }
      }
    }
    Device Group Configuration
    <devices>
      <device-group>
        <name>us_east</name>
        <device-name>cli0</device-name>
        <device-name>gen0</device-name>
      </device-group>
      <device-group>
        <name>us_west</name>
        <device-name>nc0</device-name>
      </device-group>
    </devices>
    NACM Group Configuration
    <nacm>
      <groups>
        <group>
          <name>us_east</name>
          <user-name>us_east_oper</user-name>
        </group>
      </groups>
    </nacm>
    Device Group Authorization Rules
    <nacm>
      <rule-list>
        <name>us_east</name>
        <group>us_east</group>
        <rule>
          <name>us_east_read_permit</name>
          <device-group xmlns="http://tail-f.com/yang/ncs-acm/device-group-authorization">us_east</device-group>
          <access-operations>read</access-operations>
          <action>permit</action>
        </rule>
        <rule>
          <name>us_east_create_permit</name>
          <device-group xmlns="http://tail-f.com/yang/ncs-acm/device-group-authorization">us_east</device-group>
          <access-operations>create</access-operations>
          <action>permit</action>
        </rule>
        <rule>
          <name>us_east_update_permit</name>
          <device-group xmlns="http://tail-f.com/yang/ncs-acm/device-group-authorization">us_east</device-group>
          <access-operations>update</access-operations>
          <action>permit</action>
        </rule>
        <rule>
          <name>us_east_delete_permit</name>
          <device-group xmlns="http://tail-f.com/yang/ncs-acm/device-group-authorization">us_east</device-group>
          <access-operations>delete</access-operations>
          <action>permit</action>
        </rule>
      </rule-list>
    </nacm>
    <rule-list>
      <name>admin</name>
      <group>admin</group>
      <rule>
        <name>tailf-aaa</name>
        <module-name>tailf-aaa</module-name>
        <path>/</path>
        <access-operations>read create update delete</access-operations>
        <action>permit</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>oper</name>
      <group>oper</group>
      <rule>
        <name>tailf-aaa</name>
        <module-name>tailf-aaa</module-name>
        <path>/</path>
        <access-operations>read create update delete</access-operations>
        <action>deny</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>oper</name>
      <group>oper</group>
      <rule>
        <name>edit-config</name>
        <rpc-name>edit-config</rpc-name>
        <context xmlns="http://tail-f.com/yang/acm">netconf</context>
        <access-operations>exec</access-operations>
        <action>deny</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>admin</name>
      <group>admin</group>
      <rule>
        <name>bob-password</name>
        <path>/aaa/authentication/users/user[name='bob']/password</path>
        <context xmlns="http://tail-f.com/yang/acm">cli</context>
        <access-operations>read update</access-operations>
        <action>permit</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>admin</name>
      <group>admin</group>
      <rule>
        <name>user-password</name>
        <path>/aaa/authentication/users/user[name='$USER']/password</path>
        <context xmlns="http://tail-f.com/yang/acm">cli</context>
        <access-operations>read update</access-operations>
        <action>permit</action>
      </rule>
    </rule-list>
    container test {
      action double {
        input {
          leaf number {
            type uint32;
          }
        }
        output {
          leaf result {
            type uint32;
          }
        }
      }
    }
    <rule-list>
      <name>oper</name>
      <group>oper</group>
    
      <rule>
        <name>allow-netconf-rpc-action</name>
        <rpc-name>action</rpc-name>
        <context xmlns="http://tail-f.com/yang/acm">netconf</context>
        <access-operations>exec</access-operations>
        <action>permit</action>
      </rule>
      <rule>
        <name>allow-read-test</name>
        <path>/test</path>
        <access-operations>read</access-operations>
        <action>permit</action>
      </rule>
      <rule>
        <name>allow-exec-double</name>
        <path>/test/double</path>
        <access-operations>exec</access-operations>
        <action>permit</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>oper</name>
      <group>oper</group>
    
      <rule>
        <name>allow-netconf-rpc-action</name>
        <rpc-name>action</rpc-name>
        <context xmlns="http://tail-f.com/yang/acm">netconf</context>
        <access-operations>exec</access-operations>
        <action>permit</action>
      </rule>
      <rule>
        <name>allow-exec-double</name>
        <path>/test</path>
        <access-operations>read exec</access-operations>
        <action>permit</action>
      </rule>
    </rule-list>
    <rule-list>
      <name>oper</name>
      <group>oper</group>
    
      <cmdrule xmlns="http://tail-f.com/yang/acm">
        <name>request-system-reboot</name>
        <context>cli</context>
        <command>request system reboot</command>
        <access-operations>exec</access-operations>
        <action>deny</action>
      </cmdrule>
    
      <!-- The following rule is required since the user can -->
      <!-- do "edit system" -->
    
      <cmdrule xmlns="http://tail-f.com/yang/acm">
        <name>request-reboot</name>
        <context>cli</context>
        <command>request reboot</command>
        <access-operations>exec</access-operations>
        <action>deny</action>
      </cmdrule>
    
      <rule>
        <name>netconf-reboot</name>
        <rpc-name>reboot</rpc-name>
        <context xmlns="http://tail-f.com/yang/acm">netconf</context>
        <access-operations>exec</access-operations>
        <action>deny</action>
      </rule>
    
    </rule-list>
    Example: Excerpt from RFC 8341 Appendix A.4
             <rule>
               <name>permit-acme-config</name>
               <path xmlns:acme="http://example.com/ns/netconf">
                 /acme:acme-netconf/acme:config-parameters
               </path>
             ...
  • The path /devices/device/config/interface[name="eth0"] matches the eth0 interface configuration on all managed devices.

  • The wildcard at the end, as in /services/web-site/*, does not match the web-site service instances themselves, but rather all children of the web-site service instances.

  • Thus, the path in a rule is matched against the path in the attempted data access. If the attempted access has a path that is equal to or longer than the rule path, we have a match. If none of the leafs rpc-name, notification-name, or path are set, the rule matches for any RPC, notification, data, or action access.
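
    The path-matching semantics above can be sketched in Python. This is an illustrative model of first-match rule processing, not NSO's implementation (it ignores namespaces and XPath predicates for brevity):

```python
# Illustrative sketch: NACM-style data rules are evaluated in order; the
# first rule whose path is a prefix of (or equal to) the accessed path and
# whose access-operations cover the requested operation decides the outcome.

def path_elements(path):
    """Split an XPath-like path into its element steps."""
    return [step for step in path.split("/") if step]

def rule_matches(rule, access_path, operation):
    ops = rule["access-operations"]
    if ops != "*" and operation not in ops.split():
        return False
    rule_elems = path_elements(rule["path"])
    # Match if the accessed path is equal to or longer than the rule path.
    return path_elements(access_path)[:len(rule_elems)] == rule_elems

def authorize(rules, access_path, operation, default="deny"):
    for rule in rules:
        if rule_matches(rule, access_path, operation):
            return rule["action"]
    return default

rules = [
    {"path": "/aaa/authentication/users/user",
     "access-operations": "read update", "action": "permit"},
    {"path": "/aaa", "access-operations": "*", "action": "deny"},
]

print(authorize(rules, "/aaa/authentication/users/user/password", "read"))  # permit
print(authorize(rules, "/aaa/authentication", "read"))                       # deny
print(authorize(rules, "/devices", "read"))                                  # deny (default)
```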


    Log Messages and Formats

    AAA_LOAD_FAIL

    AAA_LOAD_FAIL

    • Severity CRIT

    • Description Failed to load the AAA data, it could be that an external db is misbehaving or AAA is mounted/populated badly

    • Format String "Failed to load AAA: ~s"

    ABORT_CAND_COMMIT

    ABORT_CAND_COMMIT

    • Severity INFO

    • Description Aborting candidate commit, request from user, reverting configuration.

    ABORT_CAND_COMMIT_REBOOT

    ABORT_CAND_COMMIT_REBOOT

    • Severity INFO

    • Description ConfD restarted while having a ongoing candidate commit timer, reverting configuration.

    ABORT_CAND_COMMIT_TERM

    ABORT_CAND_COMMIT_TERM

    • Severity INFO

    • Description Candidate commit session terminated, reverting configuration.

    ABORT_CAND_COMMIT_TIMER

    ABORT_CAND_COMMIT_TIMER

    • Severity INFO

    • Description Candidate commit timer expired, reverting configuration.

    ACCEPT_FATAL

    ACCEPT_FATAL

    • Severity CRIT

    • Description ConfD encountered an OS-specific error indicating that networking support is unavailable.

    ACCEPT_FDLIMIT

    ACCEPT_FDLIMIT

    • Severity CRIT

    • Description ConfD failed to accept a connection due to reaching the process or system-wide file descriptor limit.

    AUTH_LOGIN_FAIL

    AUTH_LOGIN_FAIL

    • Severity INFO

    • Description A user failed to log in to ConfD.

    AUTH_LOGIN_SUCCESS

    AUTH_LOGIN_SUCCESS

    • Severity INFO

    • Description A user logged into ConfD.

    AUTH_LOGOUT

    AUTH_LOGOUT

    • Severity INFO

    • Description A user was logged out from ConfD.

    BADCONFIG

    BADCONFIG

    • Severity CRIT

    • Description confd.conf contained bad data.

    BAD_DEPENDENCY

    BAD_DEPENDENCY

    • Severity ERR

    • Description A dependency was not found

    BAD_NS_HASH

    BAD_NS_HASH

    • Severity CRIT

    • Description Two namespaces have the same hash value. The namespace hashvalue MUST be unique. You can pass the flag --nshash to confdc when linking the .xso files to force another value for the namespace hash.

    BIND_ERR

    BIND_ERR

    • Severity CRIT

    • Description ConfD failed to bind to one of the internally used listen sockets.

    BRIDGE_DIED

    BRIDGE_DIED

    • Severity ERR

    • Description ConfD is configured to start the confd_aaa_bridge and the C program died.

    CAND_COMMIT_ROLLBACK_DONE

    CAND_COMMIT_ROLLBACK_DONE

    • Severity INFO

    • Description Candidate commit rollback done

    CAND_COMMIT_ROLLBACK_FAILURE

    • Severity ERR

    • Description Failed to rollback candidate commit

    CANDIDATE_BAD_FILE_FORMAT

    • Severity WARNING

    • Description The candidate database file has a bad format. The candidate database is reset to the empty database.

    CANDIDATE_CORRUPT_FILE

    • Severity WARNING

    • Description The candidate database file is corrupt and cannot be read. The candidate database is reset to the empty database.

    CDB_BOOT_ERR

    • Severity CRIT

    • Description CDB failed to start. A grave error in the CDB data files prevented CDB from starting; a recovery from backup is necessary.

    CDB_CLIENT_TIMEOUT

    • Severity ERR

    • Description A CDB client failed to answer within the timeout period. The client will be disconnected.

    CDB_CONFIG_LOST

    • Severity INFO

    • Description CDB found its data files but no schema file. CDB recovers by starting from an empty database.

    CDB_DB_LOST

    • Severity INFO

    • Description CDB found its schema file but not its data file. CDB recovers by starting from an empty database.

    CDB_FATAL_ERROR

    • Severity CRIT

    • Description CDB encountered an unrecoverable error.

    CDB_INIT_LOAD

    • Severity INFO

    • Description CDB is processing an initialization file.

    CDB_OP_INIT

    • Severity ERR

    • Description The operational DB was deleted and re-initialized (because of an upgrade or a corrupt file).

    CDB_UPGRADE_FAILED

    • Severity ERR

    • Description Automatic CDB upgrade failed. This means that the data model has been changed in a non-supported way.

    CGI_REQUEST

    • Severity INFO

    • Description CGI script requested.

    CIPHER_NOT_SUPPORTED

    • Severity ERR

    • Description libcrypto does not support the indicated cipher

    CLI_CMD_ABORTED

    • Severity INFO

    • Description CLI command aborted.

    CLI_CMD_DONE

    • Severity INFO

    • Description CLI command finished successfully.

    CLI_CMD

    • Severity INFO

    • Description User executed a CLI command.

    CLI_DENIED

    • Severity INFO

    • Description User was denied to execute a CLI command due to permissions.

    COMMIT_INFO

    • Severity INFO

    • Description Information about configuration changes committed to the running data store.

    COMMIT_QUEUE_CORRUPT

    • Severity ERR

    • Description Failed to load commit queue. ConfD recovers by starting from an empty commit queue.

    CONFIG_CHANGE

    • Severity INFO

    • Description A change to ConfD configuration has taken place, e.g., by a reload of the configuration file

    CONFIG_TRANSACTION_LIMIT

    • Severity INFO

    • Description Configuration transaction limit reached, rejected new transaction request.

    CONSULT_FILE

    • Severity INFO

    • Description ConfD is reading its configuration file.

    DAEMON_DIED

    • Severity CRIT

    • Description An external database daemon closed its control socket.

    DAEMON_TIMEOUT

    • Severity CRIT

    • Description An external database daemon did not respond to a query.

    DEVEL_AAA

    • Severity INFO

    • Description Developer aaa log message

    DEVEL_CAPI

    • Severity INFO

    • Description Developer C API log message.

    DEVEL_CDB

    • Severity INFO

    • Description Developer CDB log message

    DEVEL_CONFD

    • Severity INFO

    • Description Developer ConfD log message

    DEVEL_ECONFD

    • Severity INFO

    • Description Developer econfd API log message.

    DEVEL_SLS

    • Severity INFO

    • Description Developer Smart Licensing API log message.

    DEVEL_SNMPA

    • Severity INFO

    • Description Developer SNMP agent log message.

    DEVEL_SNMPGW

    • Severity INFO

    • Description Developer SNMP gateway log message.

    DEVEL_WEBUI

    • Severity INFO

    • Description Developer WebUI log message.

    DUPLICATE_NAMESPACE

    • Severity CRIT

    • Description Duplicate namespace found.

    DUPLICATE_PREFIX

    • Severity CRIT

    • Description Duplicate prefix found.

    ERRLOG_SIZE_CHANGED

    • Severity INFO

    • Description Notify change of log size for error log

    EVENT_SOCKET_TIMEOUT

    • Severity CRIT

    • Description An event notification subscriber did not reply within the configured timeout period

    EVENT_SOCKET_WRITE_BLOCK

    • Severity CRIT

    • Description A write on an event socket blocked for too long.

    EXEC_WHEN_CIRCULAR_DEPENDENCY

    • Severity WARNING

    • Description A circular dependency error occurred while evaluating a when-expression.

    EXT_AUTH_2FA_FAIL

    • Severity INFO

    • Description External challenge authentication failed for a user.

    EXT_AUTH_2FA

    • Severity INFO

    • Description External challenge sent to a user.

    EXT_AUTH_2FA_SUCCESS

    • Severity INFO

    • Description An external challenge authenticated user logged in.

    EXTAUTH_BAD_RET

    • Severity ERR

    • Description Authentication is external and the external program returned badly formatted data.

    EXT_AUTH_FAIL

    • Severity INFO

    • Description External authentication failed for a user.

    EXT_AUTH_SUCCESS

    • Severity INFO

    • Description An externally authenticated user logged in.

    EXT_AUTH_TOKEN_FAIL

    • Severity INFO

    • Description External token authentication failed for a user.

    EXT_AUTH_TOKEN_SUCCESS

    • Severity INFO

    • Description An externally token authenticated user logged in.

    EXT_BIND_ERR

    • Severity CRIT

    • Description ConfD failed to bind to one of the externally visible listen sockets.

    FILE_ERROR

    • Severity CRIT

    • Description File error

    FILE_LOAD

    • Severity DEBUG

    • Description System loaded a file.

    FILE_LOAD_ERR

    • Severity CRIT

    • Description System tried to load a file in its load path and failed.

    FILE_LOADING

    • Severity DEBUG

    • Description System starts to load a file.

    FXS_MISMATCH

    • Severity ERR

    • Description A secondary connected to a primary where the fxs files are different

    GROUP_ASSIGN

    • Severity INFO

    • Description A user was assigned to a set of groups.

    GROUP_NO_ASSIGN

    • Severity INFO

    • Description A user was logged in but wasn't assigned to any groups at all.

    HA_BAD_VSN

    • Severity ERR

    • Description A secondary connected to a primary with an incompatible HA protocol version

    HA_DUPLICATE_NODEID

    • Severity ERR

    • Description A secondary arrived with a node id which already exists

    HA_FAILED_CONNECT

    • Severity ERR

    • Description An attempted become-secondary call in the HA library failed because the secondary could not connect to the primary.

    HA_SECONDARY_KILLED

    • Severity ERR

    • Description A secondary node didn't produce its ticks

    INTERNAL_ERROR

    • Severity CRIT

    • Description A ConfD internal error - should be reported to [email protected].

    JIT_ENABLED

    • Severity INFO

    • Description Show if JIT is enabled.

    JSONRPC_LOG_MSG

    • Severity INFO

    • Description JSON-RPC traffic log message

    JSONRPC_REQUEST_ABSOLUTE_TIMEOUT

    • Severity INFO

    • Description JSON-RPC absolute timeout.

    JSONRPC_REQUEST_IDLE_TIMEOUT

    • Severity INFO

    • Description JSON-RPC idle timeout.

    JSONRPC_REQUEST

    • Severity INFO

    • Description JSON-RPC method requested.

    JSONRPC_WARN_MSG

    • Severity WARNING

    • Description JSON-RPC warning message

    KICKER_MISSING_SCHEMA

    • Severity INFO

    • Description Failed to load kicker schema

    LIB_BAD_SIZES

    • Severity ERR

    • Description An application connecting to ConfD used a library version that can't handle the depth and number of keys used by the data model.

    LIB_BAD_VSN

    • Severity ERR

    • Description An application connecting to ConfD used a library version that doesn't match the ConfD version (e.g. old version of the client library).

    LIB_NO_ACCESS

    • Severity ERR

    • Description Access check failure occurred when an application connected to ConfD.

    LISTENER_INFO

    • Severity INFO

    • Description ConfD starts or stops to listen for incoming connections.

    LOCAL_AUTH_FAIL_BADPASS

    • Severity INFO

    • Description Authentication for a locally configured user failed due to a bad password.

    LOCAL_AUTH_FAIL

    • Severity INFO

    • Description Authentication for a locally configured user failed.

    LOCAL_AUTH_FAIL_NOUSER

    • Severity INFO

    • Description Authentication for a locally configured user failed because the user was not found.

    LOCAL_AUTH_SUCCESS

    • Severity INFO

    • Description A locally authenticated user logged in.

    LOGGING_DEST_CHANGED

    • Severity INFO

    • Description The target log file will change to another file.

    LOGGING_SHUTDOWN

    • Severity INFO

    • Description Logging subsystem terminating

    LOGGING_STARTED

    • Severity INFO

    • Description Logging subsystem started

    LOGGING_STARTED_TO

    • Severity INFO

    • Description Write logs for a subsystem to a specific file

    LOGGING_STATUS_CHANGED

    • Severity INFO

    • Description Notify a change of logging status (enabled/disabled) for a subsystem

    LOGIN_REJECTED

    • Severity INFO

    • Description Authentication for a user was rejected by application callback.

    MAAPI_LOGOUT

    • Severity INFO

    • Description A maapi user was logged out.

    MAAPI_WRITE_TO_SOCKET_FAIL

    • Severity INFO

    • Description maapi failed to write to a socket.

    MISSING_AES256CFB128_SETTINGS

    • Severity ERR

    • Description AES256CFB128 keys were not found in confd.conf

    MISSING_AESCFB128_SETTINGS

    • Severity ERR

    • Description AESCFB128 keys were not found in confd.conf

    MISSING_DES3CBC_SETTINGS

    • Severity ERR

    • Description DES3CBC keys were not found in confd.conf

    MISSING_NS2

    • Severity CRIT

    • Description While validating the consistency of the configuration, a required namespace was missing.

    MISSING_NS

    • Severity CRIT

    • Description While validating the consistency of the configuration, a required namespace was missing.

    MMAP_SCHEMA_FAIL

    • Severity ERR

    • Description Failed to setup the shared memory schema

    NETCONF_HDR_ERR

    • Severity ERR

    • Description The cleartext header indicating user and groups was badly formatted.

    NETCONF

    • Severity INFO

    • Description NETCONF traffic log message

    NIF_LOG

    • Severity INFO

    • Description Log message from NIF code.

    NOAAA_CLI_LOGIN

    • Severity INFO

    • Description A user used the --noaaa flag to confd_cli

    NO_CALLPOINT

    • Severity CRIT

    • Description ConfD tried to populate an XML tree but no code had registered under the relevant callpoint.

    NO_SUCH_IDENTITY

    • Severity CRIT

    • Description The fxs file with the base identity is not loaded

    NO_SUCH_NS

    • Severity CRIT

    • Description A nonexistent namespace was referred to. Typically this means that a .fxs was missing from the loadPath.

    NO_SUCH_TYPE

    • Severity CRIT

    • Description A nonexistent type was referred to from a ns. Typically this means that a bad version of an .fxs file was found in the loadPath.

    NOTIFICATION_REPLAY_STORE_FAILURE

    • Severity CRIT

    • Description A failure occurred in the built-in notification replay store.

    NS_LOAD_ERR2

    • Severity CRIT

    • Description System tried to process a loaded namespace and failed.

    NS_LOAD_ERR

    • Severity CRIT

    • Description System tried to process a loaded namespace and failed.

    OPEN_LOGFILE

    • Severity INFO

    • Description Indicates the target file for a certain type of logging.

    PAM_AUTH_FAIL

    • Severity INFO

    • Description A user failed to authenticate through PAM.

    PAM_AUTH_SUCCESS

    • Severity INFO

    • Description A PAM authenticated user logged in.

    PHASE0_STARTED

    • Severity INFO

    • Description ConfD has just started its start phase 0.

    PHASE1_STARTED

    • Severity INFO

    • Description ConfD has just started its start phase 1.

    READ_STATE_FILE_FAILED

    • Severity CRIT

    • Description Reading of a state file failed

    RELOAD

    • Severity INFO

    • Description Reload of daemon configuration has been initiated.

    REOPEN_LOGS

    • Severity INFO

    • Description Logging subsystem, reopening log files

    REST_AUTH_FAIL

    • Severity INFO

    • Description REST authentication for a user failed.

    REST_AUTH_SUCCESS

    • Severity INFO

    • Description A REST-authenticated user logged in.

    RESTCONF_REQUEST

    • Severity INFO

    • Description RESTCONF request

    RESTCONF_RESPONSE

    • Severity INFO

    • Description RESTCONF response

    REST_REQUEST

    • Severity INFO

    • Description REST request

    REST_RESPONSE

    • Severity INFO

    • Description REST response

    ROLLBACK_FAIL_CREATE

    • Severity ERR

    • Description Error while creating rollback file.

    ROLLBACK_FAIL_DELETE

    • Severity ERR

    • Description Failed to delete rollback file.

    ROLLBACK_FAIL_RENAME

    • Severity ERR

    • Description Failed to rename rollback file.

    ROLLBACK_FAIL_REPAIR

    • Severity ERR

    • Description Failed to repair rollback files.

    ROLLBACK_REMOVE

    • Severity INFO

    • Description Found half created rollback0 file - removing and creating new.

    ROLLBACK_REPAIR

    • Severity INFO

    • Description Found half created rollback0 file - repairing.

    SESSION_CREATE

    • Severity INFO

    • Description A new user session was created

    SESSION_LIMIT

    • Severity INFO

    • Description Session limit reached, rejected new session request.

    SESSION_MAX_EXCEEDED

    • Severity INFO

    • Description A user failed to create a new user session due to exceeding the session limit.

    SESSION_TERMINATION

    • Severity INFO

    • Description A user session was terminated for the specified reason.

    SKIP_FILE_LOADING

    • Severity DEBUG

    • Description System skips a file.

    SNMP_AUTHENTICATION_FAILED

    • Severity INFO

    • Description An SNMP authentication failed.

    SNMP_CANT_LOAD_MIB

    • Severity CRIT

    • Description The SNMP Agent failed to load a MIB file

    SNMP_MIB_LOADING

    • Severity DEBUG

    • Description SNMP Agent loading a MIB file

    SNMP_NOT_A_TRAP

    • Severity INFO

    • Description A UDP packet was received on the trap receiving port, but it is not an SNMP trap.

    SNMP_READ_STATE_FILE_FAILED

    • Severity CRIT

    • Description Reading the SNMP agent state file failed.

    SNMP_REQUIRES_CDB

    • Severity WARNING

    • Description The SNMP agent requires CDB to be enabled in order to be started.

    SNMP_TRAP_NOT_FORWARDED

    • Severity INFO

    • Description An SNMP trap was to be forwarded, but couldn't be.

    SNMP_TRAP_NOT_RECOGNIZED

    • Severity INFO

    • Description An SNMP trap was received on the trap receiving port, but its definition is not known

    SNMP_TRAP_OPEN_PORT

    • Severity ERR

    • Description The port for listening to SNMP traps could not be opened.

    SNMP_TRAP_UNKNOWN_SENDER

    • Severity INFO

    • Description An SNMP trap was to be forwarded, but the sender was not listed in confd.conf.

    SNMP_TRAP_V1

    • Severity INFO

    • Description An SNMP v1 trap was received on the trap receiving port, but forwarding v1 traps is not supported.

    SNMP_WRITE_STATE_FILE_FAILED

    • Severity WARNING

    • Description Writing the SNMP agent state file failed.

    SSH_HOST_KEY_UNAVAILABLE

    • Severity ERR

    • Description No SSH host keys available.

    SSH_SUBSYS_ERR

    • Severity INFO

    • Description Typically errors where the client doesn't properly send the "subsystem" command.

    STARTED

    • Severity INFO

    • Description ConfD has started.

    STARTING

    • Severity INFO

    • Description ConfD is starting.

    STOPPING

    • Severity INFO

    • Description ConfD is stopping (due to e.g. confd --stop).

    TOKEN_MISMATCH

    • Severity ERR

    • Description A secondary connected to a primary with a bad auth token

    UPGRADE_ABORTED

    • Severity INFO

    • Description In-service upgrade was aborted.

    UPGRADE_COMMITTED

    • Severity INFO

    • Description In-service upgrade was committed.

    UPGRADE_INIT_STARTED

    • Severity INFO

    • Description In-service upgrade initialization has started.

    UPGRADE_INIT_SUCCEEDED

    • Severity INFO

    • Description In-service upgrade initialization succeeded.

    UPGRADE_PERFORMED

    • Severity INFO

    • Description In-service upgrade has been performed (not committed yet).

    WEB_ACTION

    • Severity INFO

    • Description User executed a Web UI action.

    WEB_CMD

    • Severity INFO

    • Description User executed a Web UI command.

    WEB_COMMIT

    • Severity INFO

    • Description User performed Web UI commit.

    WEBUI_LOG_MSG

    • Severity INFO

    • Description WebUI access log message

    WRITE_STATE_FILE_FAILED

    • Severity CRIT

    • Description Writing of a state file failed

    XPATH_EVAL_ERROR1

    • Severity WARNING

    • Description An error occurred while evaluating an XPath expression.

    XPATH_EVAL_ERROR2

    • Severity WARNING

    • Description An error occurred while evaluating an XPath expression.

    COMMIT_UN_SYNCED_DEV

    • Severity INFO

    • Description Data was committed towards a device with a bad or unknown sync state.

    NCS_DEVICE_OUT_OF_SYNC

    • Severity INFO

    • Description A check-sync action reported out-of-sync for a device

    NCS_JAVA_VM_FAIL

    • Severity ERR

    • Description The NCS Java VM failed or timed out.

    NCS_JAVA_VM_START

    • Severity INFO

    • Description Starting the NCS Java VM

    NCS_PACKAGE_AUTH_BAD_RET

    • Severity ERR

    • Description Package authentication program returned badly formatted data.

    NCS_PACKAGE_AUTH_FAIL

    • Severity INFO

    • Description Package authentication failed.

    NCS_PACKAGE_AUTH_SUCCESS

    • Severity INFO

    • Description A package authenticated user logged in.

    NCS_PACKAGE_BAD_DEPENDENCY

    • Severity CRIT

    • Description Bad NCS package dependency

    NCS_PACKAGE_BAD_NCS_VERSION

    • Severity CRIT

    • Description Bad NCS version for package

    NCS_PACKAGE_CHAL_2FA

    • Severity INFO

    • Description Package authentication challenge sent to a user.

    NCS_PACKAGE_CHAL_FAIL

    • Severity INFO

    • Description Package authentication challenge failed.

    NCS_PACKAGE_CIRCULAR_DEPENDENCY

    • Severity CRIT

    • Description Circular NCS package dependency

    NCS_PACKAGE_COPYING

    • Severity DEBUG

    • Description A package is copied from the load path to private directory

    NCS_PACKAGE_DUPLICATE

    • Severity CRIT

    • Description Duplicate package found

    NCS_PACKAGE_SYNTAX_ERROR

    • Severity CRIT

    • Description Syntax error in package file

    NCS_PACKAGE_UPGRADE_ABORTED

    • Severity CRIT

    • Description The CDB upgrade was aborted, implying that CDB is untouched. However, the package state has changed.

    NCS_PACKAGE_UPGRADE_UNSAFE

    • Severity CRIT

    • Description Package upgrade has been aborted due to warnings.

    NCS_PYTHON_VM_FAIL

    • Severity ERR

    • Description The NCS Python VM failed or timed out.

    NCS_PYTHON_VM_START

    • Severity INFO

    • Description Starting the named NCS Python VM

    NCS_PYTHON_VM_START_UPGRADE

    • Severity INFO

    • Description Starting a Python VM to run upgrade code

    NCS_SERVICE_OUT_OF_SYNC

    • Severity INFO

    • Description A check-sync action reported out-of-sync for a service

    NCS_SET_PLATFORM_DATA_ERROR

    • Severity ERR

    • Description The device failed to set the platform operational data at connect time.

    NCS_SMART_LICENSING_ENTITLEMENT_NOTIFICATION

    • Severity INFO

    • Description Smart Licensing entitlement notification.

    NCS_SMART_LICENSING_EVALUATION_COUNTDOWN

    • Severity INFO

    • Description Smart Licensing evaluation time remaining

    NCS_SMART_LICENSING_FAIL

    • Severity INFO

    • Description The NCS Smart Licensing Java VM failed or timed out.

    NCS_SMART_LICENSING_GLOBAL_NOTIFICATION

    • Severity INFO

    • Description Smart Licensing Global Notification

    NCS_SMART_LICENSING_START

    • Severity INFO

    • Description Starting the NCS Smart Licensing Java VM

    NCS_SNMP_INIT_ERR

    • Severity INFO

    • Description Failed to locate snmp_init.xml in the load path.

    NCS_SNMPM_START

    • Severity INFO

    • Description Starting the NCS SNMP manager component

    NCS_SNMPM_STOP

    • Severity INFO

    • Description The NCS SNMP manager component has been stopped

    NCS_UPGRADE_ABORTED_INTERNAL

    • Severity CRIT

    • Description The CDB upgrade was aborted due to some internal error. CDB is left untouched

    BAD_LOCAL_PASS

    • Severity INFO

    • Description A locally configured user provided a bad password.

    EXT_LOGIN

    • Severity INFO

    • Description An externally authenticated user logged in.

    EXT_NO_LOGIN

    • Severity INFO

    • Description External authentication failed for a user.

    NO_SUCH_LOCAL_USER

    • Severity INFO

    • Description A nonexistent local user tried to log in.

    PAM_LOGIN_FAILED

    • Severity INFO

    • Description A user failed to log in through PAM.

    PAM_NO_LOGIN

    • Severity INFO

    • Description A user failed to log in through PAM.

    SSH_LOGIN

    • Severity INFO

    • Description A user logged in to ConfD's built-in SSH server.

    SSH_LOGOUT

    • Severity INFO

    • Description A user was logged out from ConfD's built-in SSH server.

    SSH_NO_LOGIN

    • Severity INFO

    • Description A user failed to log in to ConfD's built-in SSH server.

    WEB_LOGIN

    • Severity INFO

    • Description A user logged in through the WebUI.

    WEB_LOGOUT

    • Severity INFO

    • Description A Web UI user logged out.
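The messages above all carry one of five syslog-style severity levels (DEBUG, INFO, WARNING, ERR, CRIT). As a minimal sketch of how this reference can be put to use, the hypothetical helper below counts severity keywords across exported log lines; the exact log line layout is an assumption here, not something this list specifies.

```python
import re
from collections import Counter

# Severity keywords used by the message list above.
SEVERITIES = ("CRIT", "ERR", "WARNING", "INFO", "DEBUG")

def count_severities(lines):
    """Count occurrences of each severity keyword in an iterable of log lines.

    Assumes each log line contains the severity keyword as a standalone word;
    adjust the pattern to match your actual log format.
    """
    pattern = re.compile(r"\b(%s)\b" % "|".join(SEVERITIES))
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A report like this makes it easy to spot an unexpected burst of CRIT or ERR messages after an upgrade or restart.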

  • Format String "Aborting candidate commit, request from user, reverting configuration."

  • Format String "ConfD restarted while having a ongoing candidate commit timer, reverting configuration."

  • Format String "Candidate commit session terminated, reverting configuration."

  • Format String "Candidate commit timer expired, reverting configuration."

  • Format String "Fatal error for accept() - ~s"

  • Format String "Out of file descriptors for accept() - ~s limit reached"

  • Format String "login failed via ~s from ~s with ~s: ~s"

  • Format String "logged in via ~s from ~s with ~s using ~s authentication"

  • Format String "logged out <~s> user"

  • Format String "Bad configuration: ~s:~s: ~s"

  • Format String "The dependency node '~s' for node '~s' in module '~s' does not exist"

  • Format String "~s"

  • Format String "~s"

  • Format String "confd_aaa_bridge died - ~s"

  • Format String "Candidate commit rollback done"

  • Format String "Failed to rollback candidate commit due to: ~s"

  • Format String "Bad format found in candidate db file ~s; resetting candidate"

  • Format String "Corrupt candidate db file ~s; resetting candidate"

  • Format String "CDB boot error: ~s"

  • Format String "CDB client (~s) timed out, waiting for ~s"

  • Format String "CDB: lost config, deleting DB"

  • Format String "CDB: lost DB, deleting old config"

  • Format String "fatal error in CDB: ~s"

  • Format String "CDB load: processing file: ~s"

  • Format String "CDB: Operational DB re-initialized"

  • Format String "CDB: Upgrade failed: ~s"

  • Format String "CGI: '~s' script with method ~s"

  • Format String "libcrypto does not support ~s"

  • Format String "CLI aborted"

  • Format String "CLI done"

  • Format String "CLI '~s'"

  • Format String "CLI denied '~s'"

  • Format String "commit ~s"

  • Format String "Resetting commit queue due do inconsistent or corrupt data."

  • Format String "ConfD configuration change: ~s"

  • Format String "Configuration transaction limit of type '~s' reached, rejected new transaction request"

  • Format String "Consulting daemon configuration file ~s"

  • Format String "Daemon ~s died"

  • Format String "Daemon ~s timed out"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "~s"

  • Format String "The namespace ~s is defined in both module ~s and ~s."

  • Format String "The prefix ~s is defined in both ~s and ~s."

  • Format String "Changing size of error log (~s) to ~s (was ~s)"

  • Format String "Event notification subscriber with bitmask ~s timed out, waiting for ~s"

  • Format String "~s"

  • Format String "When-expression evaluation error: circular dependency in ~s"

  • Format String "external challenge authentication failed via ~s from ~s with ~s: ~s"

  • Format String "external challenge sent to ~s from ~s with ~s"

  • Format String "external challenge authentication succeeded via ~s from ~s with ~s, member of groups: ~s~s"

  • Format String "External auth program (user=~s) ret bad output: ~s"

  • Format String "external authentication failed via ~s from ~s with ~s: ~s"

  • Format String "external authentication succeeded via ~s from ~s with ~s, member of groups: ~s~s"

  • Format String "external token authentication failed via ~s from ~s with ~s: ~s"

  • Format String "external token authentication succeeded via ~s from ~s with ~s, member of groups: ~s~s"

  • Format String "~s"

  • Format String "~s: ~s"

  • Format String "Loaded file ~s"

  • Format String "Failed to load file ~s: ~s"

  • Format String "Loading file ~s"

  • Format String "Fxs mismatch, secondary is not allowed"

  • Format String "assigned to groups: ~s"

  • Format String "Not assigned to any groups - all access is denied"

  • Format String "Incompatible HA version (~s, expected ~s), secondary is not allowed"

  • Format String "Nodeid ~s already exists"

  • Format String "Failed to connect to primary: ~s"

  • Format String "Secondary ~s killed due to no ticks"

  • Format String "Internal error: ~s"

  • Format String "JIT ~s"

  • Format String "JSON-RPC traffic log: ~s"

  • Format String "Stopping session due to absolute timeout: ~s"

  • Format String "Stopping session due to idle timeout: ~s"

  • Format String "JSON-RPC: '~s' with JSON params ~s"

  • Format String "JSON-RPC warning: ~s"

  • Format String "Failed to load kicker schema"

  • Format String "Got connect from library with insufficient keypath depth/keys support (~s/~s, needs ~s/~s)"

  • Format String "Got library connect from wrong version (~s, expected ~s)"

  • Format String "Got library connect with failed access check: ~s"

  • Format String "~s to listen for ~s on ~s:~s"

  • Format String "local authentication failed via ~s from ~s with ~s: ~s"

  • Format String "local authentication failed via ~s from ~s with ~s: ~s"

  • Format String "local authentication failed via ~s from ~s with ~s: ~s"

  • Format String "local authentication succeeded via ~s from ~s with ~s, member of groups: ~s"

  • Format String "Changing destination of ~s log to ~s"

  • Format String "Daemon logging terminating, reason: ~s"

  • Format String "Daemon logging started"

  • Format String "Writing ~s log to ~s"

  • Format String "~s ~s log"

  • Format String "~s"

    Format String "Logged out from maapi ctx=~s (~s)"

    Format String "maapi server failed to write to a socket. Op: ~s Ecode: ~s Error: ~s~s"

  • Format String "AES256CFB128 keys were not found in confd.conf"

  • Format String "AESCFB128 keys were not found in confd.conf"

    Format String "DES3CBC keys were not found in confd.conf"

    Format String "The namespace ~s (referenced by ~s) could not be found in the loadPath."

    Format String "The namespace ~s could not be found in the loadPath."

    Format String "Failed to setup the shared memory schema"

    Format String "Got bad NETCONF TCP header"

    Format String "~s"

    Format String "~s: ~s"

    Format String "logged in from the CLI with aaa disabled"

    Format String "no registration found for callpoint ~s of type=~s"

    Format String "The identity ~s in namespace ~s refers to a non-existing base identity ~s in namespace ~s"

    Format String "No such namespace ~s, used by ~s"

  • Format String "No such simpleType '~s' in ~s, used by ~s"

  • Format String "~s"

  • Format String "Failed to process namespaces: ~s"

    Format String "Failed to process namespace ~s: ~s"

    Format String "Logging subsystem, opening log file '~s' for ~s"

    Format String "PAM authentication failed via ~s from ~s with ~s: phase ~s, ~s"

    Format String "pam authentication succeeded via ~s from ~s with ~s"

    Format String "ConfD phase0 started"

    Format String "ConfD phase1 started"

    Format String "Reading state file failed: ~s: ~s (~s)"

    Format String "Reloading daemon configuration."

    Format String "Logging subsystem, reopening log files"

    Format String "rest authentication failed from ~s"

    Format String "rest authentication succeeded from ~s , member of groups: ~s"

    Format String "RESTCONF: request with ~s: ~s"

    Format String "RESTCONF: response with ~s: ~s duration ~s us"

    Format String "REST: request with ~s: ~s"

    Format String "REST: response with ~s: ~s duration ~s ms"

    Format String "Error while creating rollback file: ~s: ~s"

    Format String "Failed to delete rollback file ~s: ~s"

    Format String "Failed to rename rollback file ~s to ~s: ~s"

    Format String "Failed to repair rollback files."

    Format String "Found half created rollback0 file - removing and creating new"

    Format String "Found half created rollback0 file - repairing"

    Format String "created new session via ~s from ~s with ~s"

    Format String "Session limit of type '~s' reached, rejected new session request"

  • Format String "could not create new session via ~s from ~s with ~s due to session limits"

  • Format String "terminated session (reason: ~s)"

    Format String "Skipping file ~s: ~s"

    Format String "SNMP authentication failed: ~s"

    Format String "Can't load MIB file: ~s"

    Format String "Loading MIB: ~s"

    Format String "SNMP gateway: Non-trap received from ~s"

    Format String "Read state file failed: ~s: ~s"

    Format String "Can't start SNMP. CDB is not enabled"

    Format String "SNMP gateway: Can't forward trap from ~s; ~s"

  • Format String "SNMP gateway: Can't forward trap with OID ~s from ~s; There is no notification with this OID in the loaded models."

  • Format String "SNMP gateway: Can't open trap listening port ~s: ~s"

  • Format String "SNMP gateway: Not forwarding trap from ~s; the sender is not recognized"

  • Format String "SNMP gateway: V1 trap received from ~s"

    Format String "Write state file failed: ~s: ~s"

    Format String "No SSH host keys available"

    Format String "ssh protocol subsys - ~s"

    Format String "ConfD started vsn: ~s"

    Format String "Starting ConfD vsn: ~s"

    Format String "ConfD stopping (~s)"

    Format String "Token mismatch, secondary is not allowed"

    Format String "Upgrade aborted"

    Format String "Upgrade committed"

    Format String "Upgrade init started"

    Format String "Upgrade init succeeded"

    Format String "Upgrade performed"

    Format String "WebUI action '~s'"

    Format String "WebUI cmd '~s'"

    Format String "WebUI commit ~s"

    Format String "WebUI access log: ~s"

    Format String "Writing state file failed: ~s: ~s (~s)"

    Format String "XPath evaluation error: ~s for ~s"

    Format String "XPath evaluation error: '~s' resulted in ~s for ~s"

    Format String "Committed data towards device ~s which is out of sync"

    Format String "NCS device-out-of-sync Device '~s' Info '~s'"

    Format String "The NCS Java VM ~s"

    Format String "Starting the NCS Java VM"

  • Format String "package authentication using ~s program ret bad output: ~s"

  • Format String "package authentication using ~s failed via ~s from ~s with ~s: ~s"

    Format String "package authentication using ~s succeeded via ~s from ~s with ~s, member of groups: ~s~s"

    Format String "Failed to load NCS package: ~s; required package ~s of version ~s is not present (found ~s)"

    Format String "Failed to load NCS package: ~s; requires NCS version ~s"

    Format String "package authentication challenge sent to ~s from ~s with ~s"

    Format String "package authentication challenge using ~s failed via ~s from ~s with ~s: ~s"

  • Format String "Failed to load NCS package: ~s; circular dependency found"

  • Format String "Copying NCS package from ~s to ~s"

    Format String "Failed to load duplicate NCS package ~s: (~s)"

    Format String "Failed to load NCS package: ~s; syntax error in package file"

  • Format String "NCS package upgrade failed with reason '~s'"

  • Format String "NCS package upgrade has been aborted due to warnings:\n~s"

  • Format String "The NCS Python VM ~s"

    Format String "Starting the NCS Python VM ~s"

    Format String "Starting upgrade of NCS Python package ~s"

    Format String "NCS service-out-of-sync Service '~s' Info '~s'"

  • Format String "NCS Device '~s' failed to set platform data Info '~s'"

  • Smart Licensing Entitlement Notification
  • Format String "Smart Licensing Entitlement Notification: ~s"

  • Format String "Smart Licensing evaluation time remaining: ~s"

  • Format String "The NCS Smart Licensing Java VM ~s"

  • Format String "Smart Licensing Global Notification: ~s"

  • Format String "Starting the NCS Smart Licensing Java VM"

    Format String "Failed to locate snmp_init.xml in loadpath ~s"

    Format String "Starting the NCS SNMP manager component"

    Format String "The NCS SNMP manager component has been stopped"

  • Format String "NCS upgrade failed with reason '~s'"

  • Format String "Provided bad password"

    Format String "Logged in over ~s using externalauth, member of groups: ~s~s"

    Format String "failed to login using externalauth: ~s"

    Format String "no such local user"

    Format String "pam phase ~s failed to login through PAM: ~s"

    Format String "failed to login through PAM: ~s"

    Format String "logged in over ssh from ~s with authmeth:~s"

    Format String "Logged out ssh <~s> user"

    Format String "Failed to login over ssh: ~s"

    Format String "logged in through Web UI from ~s"

    Format String "logged out from Web UI"
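In the format strings above, each `~s` is a placeholder that NSO fills in with a runtime value (user, source address, context, reason, and so on) when the message is written to a log. A sketch of how such a format string can be turned into a regular expression for extracting those values from a logged line; the helper function and the sample log line are illustrative, not part of NSO:

```python
import re

def format_to_regex(fmt: str) -> "re.Pattern":
    """Turn a '~s' format string into a regex with one capture group per ~s."""
    # Escape regex metacharacters in the literal text, then replace each
    # (escaped) ~s placeholder with a non-greedy capture group.
    escaped = re.escape(fmt)
    pattern = escaped.replace(re.escape("~s"), "(.*?)")
    return re.compile("^" + pattern + "$")

# Hypothetical audit-log line shaped like the
# "local authentication failed via ~s from ~s with ~s: ~s" entry above.
pat = format_to_regex("local authentication failed via ~s from ~s with ~s: ~s")
m = pat.match("local authentication failed via cli from 10.0.0.5 with ssh: bad password")
# m.groups() -> ("cli", "10.0.0.5", "ssh", "bad password")
```

The non-greedy groups keep each placeholder from swallowing the literal text that follows it, which is what makes the first `~s` stop at ` from ` rather than at the last possible match.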