
1. Network troubleshooting and configuration with SR OS and Ansible

Activity name: Network troubleshooting and configuration with SR OS and Ansible
Activity ID: 56
Short Description: With Ansible you can build automation once and easily re-use implemented solutions for other devices. In this activity, we will apply that to configuration management for model-driven SR OS devices using NETCONF.
Difficulty: Intermediate
Tools used: SR OS, Model-Driven CLI (MD-CLI), Ansible, SR OS with Ansible, Ansible.Netcommon
Topology Nodes: PE1, client01, leaf21, client21
References: MD-CLI user guide, SR OS System management guide, SR OS with Ansible, Ansible Netcommon, Ansible network best practices

Ansible is a well-known suite of software tools that started gaining traction with the wider IT community over a decade ago. It has since been acquired by Red Hat, and over the years it has found its way into the networking industry. By that point the existing community had already produced how-to guides, reusable plugins and modules for IT services, and network functionality was added on top of that foundation. This activity follows that same transition. We will start from an Ansible playbook written for another task against a Linux host and expand it to include some of our network automation tasks.

You will be introduced to automated management of SR OS using the NETCONF modules in Ansible's netcommon collection. The netcommon modules are created and supported by Red Hat. They provide a consistent and supported multi-vendor interface.

1.1 Objective

This activity is broken down into the tasks outlined below. The existing playbook sends a ping command from the Linux node client01, which is connected to pe1. The playbook is executed whenever a test of connectivity between client01 and pe1 is required, and it verifies that the VPRN service is functioning correctly.

  1. Inspect the existing playbook available for you in the Hackathon repository under activities/nos/sros/activity-56/. You can find a copy of the repository in your group's hackathon VM under /home/nokia/SReXperts/.
  2. Run the playbook as an initial test to assess the current situation.
  3. Add a task to your playbook that connects to pe1 and retrieves data about the service meant for client01.
  4. Based on the outcome of the previous task, update the configuration of pe1.
  5. Introduce an additional step where the ping is originated from pe1 towards client01.

1.2 Technology explanation

In this chapter, we will discuss the tools and concepts that will be used throughout the exercise.

1.2.1 Ansible

Ansible is a suite of software tools that enables infrastructure as code. The suite is open-source and includes software provisioning, configuration management and application deployment functionality. Ansible is designed with several key principles in mind. It has to be simple to understand, readable, extensible and there should be a gentle learning curve. The language used for defining what Ansible should do is YAML. Check out some of the documentation for the reasoning behind it and to see how YAML is used.

In addition to what is being used in this activity, a commercial offering of Ansible exists in the Ansible Automation Platform (AAP) from Red Hat. It includes additional functionality and support that isn't available out of the box in the regular version.

1.2.1.1 Core and Community

When installing Ansible you may be confronted with several versioning systems. This stems from the differentiation that is made between Ansible Core and Ansible Community. The former provides the language and the runtime that powers any automation use case driven with Ansible, while the latter includes the Core functionality as well as a number of collections. Ansible Community is available for installation as the Python ansible package.

It is Ansible Community that is used for this activity, with special mention for some of the collections that come bundled with it. The version of the package that is used is 11.5.0 and the version of ansible-core that is provided with it is 2.18.5. The bundled collections that come with the package include some that are maintained directly by Ansible, some that are maintained by partner organizations and others that are maintained by community teams. Among these collections is a set of network-specific plugins and modules (including those for NETCONF) that are developed and maintained directly by Ansible.

1.2.1.2 Ansible Netcommon

Created by the Ansible Network Community, the Netcommon collection contains numerous resources that allow it to interface with network devices. The collection contains vendor agnostic elements so any networking equipment that exposes standards-based interfaces can be interfaced with. The collection includes modules and plugins for NETCONF, gRPC, plain CLI, RESTCONF, file operations and a few other tasks, as documented here.

In this activity, in addition to some builtin modules, the netcommon.netconf modules will be used.

1.2.1.3 Jargon

Before diving into the remaining technical topic and the tasks to be completed, let's introduce some of the terms used when talking about automation with Ansible to establish a shared language. The list of terms that will come up is as follows:

Inventory

In any Ansible deployment an inventory must be defined. This inventory includes the target hosts that could be used as well as some key information about them. This could include credentials, reachability information or some specific details required to be able to connect to the device.

Collections, modules and plugins

Plugins augment Ansible's core functionality and are accessible to modules. They are written in Python. Modules are small programs that perform actions on local machines, APIs or remote hosts when instructed to do so by tasks. They can be written in any language, though they are often written in Python. Finally, collections are a distribution format for Ansible content. They typically address a use case like interacting with networking equipment, as is the case for the netcommon collection.

Roles

Roles are a logical way of labelling and grouping tasks according to certain variables or files they need access to, tasks they need to execute or information that is applicable to them. By defining them in a role this information doesn't have to be repeated as often. By assigning or un-assigning roles to hosts they can easily be controlled without having to re-do the implementation of the role's tasks. Roles are created by adhering to a file hierarchy and naming structure.

Plays

Solutions built as Ansible playbooks are broken down into smaller pieces known as plays that are called from a playbook. A play may refer to a mapping of a role to one or more target hosts, though in general it refers to an ordered grouping of tasks mapped to specific hosts.

Playbooks

Finally, a playbook is what is executed from the CLI, potentially with some additional input flags. A playbook can call plays and import roles to perform complicated workflows.


With these terms in mind, we will be able to identify the different elements in the Ansible-based automation we are building in this activity.

1.2.2 Model-driven SR OS

As the term "model-driven" suggests, a model-driven Network Operating System (NOS) such as SR OS has one or more data models at its core. These data models compile together to provide the schema for the system. These data models are written using a language called YANG and, in the case of SR OS, are available online. In this activity, you will need to interact with the SR OS system in various ways and knowing where to find information will be useful here.

Configuration added to model-driven SR OS has to be loaded into the candidate datastore before it can be applied to the system via a commit operation. To begin configuring the router, an operator must first enter a configuration session using configure private or edit-config private. Operations other than commit also exist. Key among them is ping, which is used later in this activity. Knowing this, when looking through the netcommon modules, try to understand what is and isn't being abstracted away for you.

Model-driven SR OS is built to enable automation by providing several helpful commands:

  • pwc can show you in a number of formats what path you are currently in
  • tree shows you what values would be valid further down the tree, useful when you are looking for a certain attribute
  • compare returns the difference between the currently active candidate datastore session and the running configuration. This command has several optional flags, key among which are netconf-rpc and summary. Combining these gives you back an XML payload that would be usable by a NETCONF client to make the same changes again.

Between NETCONF, gRPC and the MD-CLI, model-driven SR OS boasts three model-driven interfaces. These interfaces can be used interchangeably, albeit with some differences in underlying transport and encoding. As such, there is not necessarily a single correct choice of module for the tasks in this activity, and it can come down to personal preference.

1.3 Tasks

You should read these tasks from top-to-bottom before beginning the activity.

It is tempting to skip ahead but tasks may require you to have completed previous tasks before tackling them.

1.3.1 Explore the existing implementation

Look at the Hackathon repository that is available on your group's hackathon VM under /home/nokia/SReXperts. In the folder activities/nos/sros/activity-56 the existing playbook and accompanying files used to test service availability from the client01 machine are included. Look into this folder and try to map the terms introduced previously to the different files. The structure is as follows:

- activity-56/
  - ansible.cfg
  - inventory.yml
  - playbook.yml
  - roles
    - linux_ping/
      - files/
      - tasks/
        - main.yml
      - templates/

Look inside the files to understand where each piece of information is stored and discover where the ping command is actually being sent.

Existing situation

The inventory file inventory.yml contains a group named linux_hosts with a single member, clab-srexperts-client01. In addition, this file sets some variables required to connect to the target host. Setting variables in the inventory file is only one way of doing this. Some other options are documented here.

The playbook file playbook.yml contains a play mapping the linux_ping role to the linux_hosts group so that any tasks in the role are executed using those hosts as targets when the playbook runs.
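As a rough illustration, a play of this kind could look like the sketch below. The play name is taken from the playbook output shown in the next task, while gather_facts: false is an assumption (client01 has no Python interpreter, so fact gathering would likely fail). The real playbook.yml in the repository may differ.

```yaml
---
# Hypothetical reconstruction of playbook.yml; check the actual file in the repository.
- name: Linux ping to test provisioned service
  hosts: linux_hosts        # inventory group containing clab-srexperts-client01
  gather_facts: false       # assumption: skip fact gathering on a host without Python
  roles:
    - role: linux_ping      # tasks are loaded from roles/linux_ping/tasks/main.yml
```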

In the roles/ directory the aforementioned role named linux_ping exists in the form of a subdirectory and three subdirectories exist for that role. The files subdirectory would contain any generated files or other static content required for the task while the templates directory could contain templates used to generate those files. In this case, both are empty. The only subfolder with any content is tasks, which contains the main.yml file. Inside that file the ping command is defined.

The ansible.cfg file is included to disable SSH host key verification. While acceptable in the ephemeral lab environment provided to you for the Hackathon, this isn't recommended for live or production environments. By adding this setting we avoid having to manually check every individual SSH host key in this activity.

1.3.2 Run the existing playbook and look at the output

Run the existing playbook using the ansible-playbook command. Use a CLI flag to point to the inventory file and check the output. What do you notice?

Output
$ ansible-playbook playbook.yml -i inventory.yml

PLAY [Linux ping to test provisioned service] ************************************************************************************************************************

TASK [linux_ping : Test service on PE1] ******************************************************************************************************************************
fatal: [clab-srexperts-client01]: FAILED! => {"changed": true, "msg": "non-zero return code", "rc": 1,
"stderr": "Shared connection to clab-srexperts-client01 closed.\r\n", "stderr_lines": ["Shared
  connection to clab-srexperts-client01 closed."], "stdout": "PING 10.70.11.101 (10.70.11.101) 56(84)
  bytes of data.\r\nFrom 10.70.11.1 icmp_seq=1 Destination Host Unreachable\r\n\r\n--- 10.70.11.101
  ping statistics ---\r\n1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
  \r\n\r\n", "stdout_lines": ["PING 10.70.11.101 (10.70.11.101) 56(84) bytes of data.", "From 10.70.11.1
  icmp_seq=1 Destination Host Unreachable", "", "--- 10.70.11.101 ping statistics ---", "1 packets
  transmitted, 0 received, +1 errors, 100% packet loss, time 0ms", ""]}

The ping is currently failing so the service provided from PE1 is not in the expected state.

1.3.3 Add a step to retrieve state information from pe1

Don't worry, it is expected that your ping from the previous task is not working. We're going to fix it now.

The expected service configuration that should be present on node pe1 is the following:

    configure {
        service {
+           vprn "ansible-vprn" {
+               admin-state enable
+               service-id 600
+               customer "1"
+               interface "client01" {
+                   ipv4 {
+                       primary {
+                           address 10.70.11.101
+                           prefix-length 24
+                       }
+                   }
+                   sap 1/1/c6/1:600 {
+                   }
+               }
+           }
        }
    }

There should similarly be an entry in the system's running state that signifies that the service and the interface inside it are operationally up. The corresponding paths in the MD-CLI are:

  • /state service vprn "ansible-vprn" oper-state
  • /state service vprn "ansible-vprn" interface "client01" oper-state

We could log in using SSH and get to work troubleshooting and fixing any issues, however, that wouldn't be very automated. Let's use Ansible instead. First, using the ansible.netcommon.netconf_get module, make sure that the state paths above are present and the values are in line with our expectations. That not being the case would be the easiest explanation for the failing connectivity check. Note that this check should only be done if the ping from the previous section failed.

To accomplish this, you’ll need to complete the following steps:

  • Create a group called sros_nodes in your inventory and add clab-srexperts-pe1 to it. Look for information on the ansible_network_os and ansible_connection attributes as they will be crucial here.
  • Change the existing role linux_ping to publish a variable ping_result and not fail on errors
  • Create a new role called check_service based on the linux_ping role that uses netconf_get to collect data from the target SR OS nodes
  • Call this role from your playbook after the linux_ping role if the connectivity play returned a nonzero return code
  • Make sure the rest of the playbook knows the outcome of this state verification by publishing another variable. Use the builtin set_fact to reduce the output variable to true or false, answering the question of whether the service is operational.

A few pointers to get you started

When Ansible executes a task against a host and the result is captured using the register directive, the resulting variable is added to that host's hostvars. See the documentation for more information on variables and how to access them throughout the playbook.

The netconf_get task can take an XML payload that is used to filter the amount of data retrieved from the target host. You can specify this in the role's main.yml file. However, to reduce clutter it may be preferable to use the builtin lookup function.

Debugging information

Ansible has several ways in which it can display debugging information. You can make the output of the playbook execution more verbose by adding more v flags to the command, e.g. -v or -vvvvv depending on the desired level of verbosity. The builtin debug module lets you print out variables as part of the playbook itself. Use these freely to follow what is going on whenever you execute your playbook.

Use the debug module to see which attribute(s) of the ping_result variable can tell you whether the ping succeeded or failed.
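A minimal sketch of such debug tasks follows. It assumes they are placed in the linux_ping role after the ping task and that the result was registered as ping_result. The rc attribute used in the second task is the one visible in the failure output of the previous section.

```yaml
# Print the complete registered result to discover which attributes are available.
- name: Show everything that was registered for the ping
  ansible.builtin.debug:
    var: ping_result

# Print a single attribute once you know which one you are interested in.
- name: Show a single attribute of the registered result
  ansible.builtin.debug:
    msg: "ping return code: {{ ping_result.rc }}"
```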

Solution - check_service
$ ansible-playbook playbook.yml -i inventory.yml -v
Using /home/nokia/SReXperts/activities/nos/sros/activity-56/ansible.cfg as config file

PLAY [Linux ping to test provisioned service] **************************************************************

TASK [linux_ping : Test service on PE1] ********************************************************************
fatal: [clab-srexperts-client01]: FAILED! => {"...(snip)", "rc": 1, "stderr": "... (snip)]}
...ignoring

PLAY [Check configuration state on SR OS node] **************************************************************

TASK [check_service : Pre-check PE1 service oper-state] ****************************************************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<data xmlns=\"...(snip)</data>"]}

TASK [check_service : Render retrieved state to boolean `true` if service exists, false otherwise] *********
ok: [clab-srexperts-pe1] => {"ansible_facts": {"service_found": false}, "changed": false}

PLAY RECAP *************************************************************************************************
clab-srexperts-client01    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1
clab-srexperts-pe1         : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Add the following lines to inventory.yml to make Ansible aware of the SR OS node:

sros_nodes:
  hosts:
    clab-srexperts-pe1:
      ansible_connection: "ansible.netcommon.netconf"
      ansible_network_os: "<network-os>"
      ansible_user: "admin"
      ansible_password: #PROVIDED#

Change the existing role to publish a variable and not fail on errors:

---
- name: Test service on PE1
  raw: ping -c 1 10.70.11.101
  ignore_errors: true
  register: ping_result

Build a new role, check_service, by copying the file structure from linux_ping. Populate the files subdirectory with filter.xml. This file should contain a payload for filtering the oper-state of the VPRN and its associated interface to limit the amount of data received.

<state xmlns="urn:nokia.com:sros:ns:yang:sr:state" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nokia-attr="urn:nokia.com:sros:ns:yang:sr:attributes">
    <service>
        <vprn>
            <service-name>ansible-vprn</service-name>
            <oper-state/>
            <interface>
                <interface-name>client01</interface-name>
                <ipv4>
                    <oper-state/>
                </ipv4>
            </interface>
        </vprn>
    </service>
</state>

Then, add a task to your role that checks the state:

---
- name: Pre-check PE1 service oper-state
  connection: netconf
  ansible.netcommon.netconf_get:
    filter: "{{ lookup('file', './filter.xml') }}"
  register: netconf_service_state

- name: Render retrieved state to boolean `true` if service exists, false otherwise
  set_fact:
    service_found: "{{ 'state' in (netconf_service_state.stdout | ansible.utils.from_xml() | from_json)['data'] }}"

from_json

As you might notice in the example solution above, to get to the dictionary variable that we would expect to be returned by the call to from_xml, we first have to pipe it through the from_json function. This is a known issue with a fix available. However, the Ansible version downloaded by default with pip in this environment does not include it.

Add a play to playbook.yml to map the SR OS nodes group to your new role when the ping fails:

- name: Check configuration state on SR OS node
  hosts: sros_nodes
  gather_facts: False
  roles:
    - role: check_service
      when: hostvars['clab-srexperts-client01'].ping_result.rc != 0
  tags: ['check_service']

1.3.4 Make changes to the pe1 configuration so that service 600 works properly

While we could address the original issue of the ping failing by adding the missing configuration manually, that would not serve us should this issue recur. Instead, add the necessary changes to the Ansible files already present so that, when the ping fails and the previous check finds the service missing, the desired configuration is applied to the router.

You can find this configuration in the previous section. Make sure the configuration is not pushed needlessly, so only do this when the configuration is shown to be missing and after we found the ping to not be working.

Use what you have learned so far, together with the available documentation and example resources online, to get as far as you can before looking at the proposed solution.

Some more pointers to get you started

Create another role, config_service, and add a play to your playbook that calls it whenever the existing ping command returned a nonzero return code and the state check failed.

Try to use the MD-CLI to help you build the configuration step by letting compare generate the input needed for ansible.netcommon.netconf_config. The examples available on the Nokia developer portal may come in handy.

Solution - config_service
$ ansible-playbook playbook.yml -i inventory.yml -v
Using /home/nokia/SReXperts/activities/nos/sros/activity-56/ansible.cfg as config file

PLAY [Linux ping to test provisioned service] **************************************************************

TASK [linux_ping : Test service on PE1] ********************************************************************
fatal: [clab-srexperts-client01]: FAILED! => {"...(snip)", "rc": 1, "stderr": "... (snip)]}
...ignoring

PLAY [Check configuration state on SR OS node] *************************************************************

TASK [check_service : Pre-check PE1 service oper-state] ****************************************************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<data xmlns=\"...(snip)</data>"]}

TASK [check_service : Render retrieved state to boolean `true` if service exists, false otherwise] *********
ok: [clab-srexperts-pe1] => {"ansible_facts": {"service_found": false}, "changed": false}

PLAY [Add configuration to SR OS node] *********************************************************************

TASK [config_service : Configure PE1 service] **************************************************************
changed: [clab-srexperts-pe1] => {"changed": true, "server_capabilities": [...(snip)]}

PLAY RECAP *************************************************************************************************
clab-srexperts-client01    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1
clab-srexperts-pe1         : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Add a play to playbook.yml to execute your new role against the SR OS nodes when the ping fails and the service configuration is found to be missing:

- name: Add configuration to SR OS node
  hosts: sros_nodes
  gather_facts: False
  roles:
    - role: config_service
      when: hostvars['clab-srexperts-client01'].ping_result.rc != 0 and not hostvars['clab-srexperts-pe1'].service_found
  tags: ['config_service']

Copy the file structure from check_service and rename filter.xml to service.xml. Replace the contents of the file with a payload that will configure the service.

<nc:config xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
    <configure xmlns="urn:nokia.com:sros:ns:yang:sr:conf" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nokia-attr="urn:nokia.com:sros:ns:yang:sr:attributes">
        <service>
            <vprn>
                <service-name>ansible-vprn</service-name>
                <admin-state>enable</admin-state>
                <service-id>600</service-id>
                <customer>1</customer>
                <interface>
                    <interface-name>client01</interface-name>
                    <ipv4>
                        <primary>
                            <address>10.70.11.101</address>
                            <prefix-length>24</prefix-length>
                        </primary>
                    </ipv4>
                    <sap>
                        <sap-id>1/1/c6/1:600</sap-id>
                    </sap>
                </interface>
            </vprn>
        </service>
    </configure>
</nc:config>

Add a task to your role to configure the service using this payload:

- name: Configure PE1 service
  connection: netconf
  ansible.netcommon.netconf_config:
    content: "{{ lookup('file', './service.xml') }}"
    commit: true
    target: candidate
    lock: never
    format: xml

Check your work

Make sure your configuration was applied correctly, either by checking the node directly or by running your playbook again. If the configuration is now present, that should have a noticeable impact on your playbook's behavior.

1.3.5 Send a ping from pe1 to client01 via Ansible

As client01 isn't model-driven and does not come with a Python interpreter installed, our options on the Ansible side are somewhat limited. If you have followed the example solution, this translates into being able to distinguish between different return codes from the raw ping command or text-scraping.

Let's add a ping that will give us some modeled output that we can more easily interpret within our automation. To do this and to introduce a little read-your-writes verification to your playbook, add another task to the check_service role that will send this ping for you. Add another play to your playbook that repeats the check_service role after the config_service play was executed.

Use a Jinja2 template that takes input variables so that the task will be reusable for any other services or IP addresses you may need later on. Implement the new task in a way that ensures it will be skipped if there is no destination IP specified.

You can use the netcommon.netconf_rpc module for this. Use it to trigger a modeled ping operation on pe1 with the following parameters:

  • payload size of 800 bytes
  • destination IP of 10.70.11.1
  • source IP of 10.70.11.101
  • in routing instance ansible-vprn
  • only send a single ping request

Command-line arguments

We haven't talked about the tags we have been adding in the example solution yet. These let you single out one or more tasks by using the -t parameter when you invoke the playbook. Adding a unique tag to this second instance of check_service may make it easier to troubleshoot this task as it gives you the ability to call it directly.

Another useful thing you can do through CLI parameters uses the -e flag, which lets you define variables from the CLI. Try to override the programmed count of 1 to be 4 instead, using the CLI.
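Extra variables can also be kept in a small YAML file and passed with -e @<file>, which is convenient once more than one value needs overriding. The sketch below assumes a hypothetical file named overrides.yml next to playbook.yml, used as: ansible-playbook playbook.yml -i inventory.yml -e @overrides.yml. It also assumes the ping count is exposed as a variable named count in your play.

```yaml
---
# overrides.yml (hypothetical file name)
# Values given here are extra vars and take precedence over vars set in the play.
count: 4   # send four ping requests instead of the programmed single request
```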

Ping payload and result

For potential comparison with your implementation, consider the following two tabs that contain an example input and output for the ping operation as seen in verbose Ansible logs.

<global-operations xmlns=\"urn:nokia.com:sros:ns:yang:sr:oper-global\">
    <ping>
        <destination>10.70.11.1</destination>
        <router-instance>ansible-vprn</router-instance>
        <source-address>10.70.11.101</source-address>
        <count>1</count>
        <size>800</size>
    </ping>
</global-operations>

<rpc-reply xmlns:nc=\"urn:ietf:params:xml:ns:netconf:base:1.0\" xmlns=\"urn:ietf:params:xml:ns:netconf:base:1.0\" xmlns:nokiaoper=\"urn:nokia.com:sros:ns:yang:sr:oper-global\" message-id=\"urn:uuid:d6ebb25b-d5fa-4599-9f81-00fcd6f7ab42\">
    <nokiaoper:operation-id>35</nokiaoper:operation-id>
    <nokiaoper:start-time>2025-05-18T12:34:52.8Z</nokiaoper:start-time>
    <nokiaoper:results>
        <nokiaoper:test-parameters>
            <nokiaoper:destination>10.70.11.1</nokiaoper:destination>
            <nokiaoper:bypass-routing>false</nokiaoper:bypass-routing>
            <nokiaoper:router-instance>ansible-vprn</nokiaoper:router-instance>
            <nokiaoper:source-address>10.70.11.101</nokiaoper:source-address>
            <nokiaoper:srv6-policy>false</nokiaoper:srv6-policy>
            <nokiaoper:candidate-path>false</nokiaoper:candidate-path>
            <nokiaoper:preference>100</nokiaoper:preference>
            <nokiaoper:count>1</nokiaoper:count>
            <nokiaoper:output-format>detail</nokiaoper:output-format>
            <nokiaoper:do-not-fragment>false</nokiaoper:do-not-fragment>
            <nokiaoper:fc>nc</nokiaoper:fc>
            <nokiaoper:interval>1</nokiaoper:interval>
            <nokiaoper:pattern>sequential</nokiaoper:pattern>
            <nokiaoper:size>800</nokiaoper:size>
            <nokiaoper:timeout>5</nokiaoper:timeout>
            <nokiaoper:tos>0</nokiaoper:tos>
            <nokiaoper:ttl>64</nokiaoper:ttl>
        </nokiaoper:test-parameters>
        <nokiaoper:probe>
            <nokiaoper:probe-index>1</nokiaoper:probe-index>
            <nokiaoper:status>response-received</nokiaoper:status>
            <nokiaoper:round-trip-time>1443</nokiaoper:round-trip-time>
            <nokiaoper:response-packet>
                <nokiaoper:size>808</nokiaoper:size>
                <nokiaoper:source-address>10.70.11.1</nokiaoper:source-address>
                <nokiaoper:icmp-sequence-number>1</nokiaoper:icmp-sequence-number>
                <nokiaoper:ttl>64</nokiaoper:ttl>
            </nokiaoper:response-packet>
        </nokiaoper:probe>
        <nokiaoper:summary>
            <nokiaoper:statistics>
                <nokiaoper:packets>
                    <nokiaoper:sent>1</nokiaoper:sent>
                    <nokiaoper:received>1</nokiaoper:received>
                    <nokiaoper:loss>0.0</nokiaoper:loss>
                </nokiaoper:packets>
                <nokiaoper:round-trip-time>
                    <nokiaoper:minimum>1064</nokiaoper:minimum>
                    <nokiaoper:average>1064</nokiaoper:average>
                    <nokiaoper:maximum>1064</nokiaoper:maximum>
                    <nokiaoper:standard-deviation>0</nokiaoper:standard-deviation>
                </nokiaoper:round-trip-time>
            </nokiaoper:statistics>
        </nokiaoper:summary>
    </nokiaoper:results>
    <nokiaoper:status>completed</nokiaoper:status>
    <nokiaoper:end-time>2025-05-18T12:34:57.3Z</nokiaoper:end-time>
</rpc-reply>

Solution - add ping to check_service
$ ansible-playbook playbook.yml -i inventory.yml -v -t check_service_after -e count=4
Using /home/nokia/SReXperts/activities/nos/sros/activity-56/ansible.cfg as config file

PLAY [Linux ping to test provisioned service] **************************************************************

PLAY [Check configuration state on SR OS node] *************************************************************

PLAY [Add configuration to SR OS node] *********************************************************************

PLAY [Check service state on SR OS node] *******************************************************************

TASK [check_service : Pre-check PE1 service oper-state] ****************************************************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<data ... (snip)</data>"]}

TASK [check_service : Render retrieved state to boolean `true` if service exists, false otherwise] *********
ok: [clab-srexperts-pe1] => {"ansible_facts": {"service_found": true}, "changed": false}

TASK [check_service : Send a ping from the router to the client to confirm activity] ***********************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<rpc-reply xmlns:nc...(snip)"</rpc-reply>"]}

PLAY RECAP *************************************************************************************************
clab-srexperts-pe1         : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Add a template to the check_service role in the file templates/ping.j2:

<global-operations xmlns="urn:nokia.com:sros:ns:yang:sr:oper-global">
  <ping>
    <destination>{{dest_ip}}</destination>
    <router-instance>{{rtr_inst}}</router-instance>
    <source-address>{{src_ip}}</source-address>
    <count>{{count}}</count>
    <size>{{size}}</size>
  </ping>
</global-operations>

To render the template into the payload ping.xml, use the lookup plugin with the template builtin.

Guard the task with Jinja2's defined test on the variable that holds the destination IP, dest_ip in our case, so that the task is skipped when no ping is desired.

Send the rendered payload with the ansible.netcommon.netconf_rpc module.

- name: Send a ping from the router to the client to confirm activity
  when: dest_ip is defined
  ansible.netcommon.netconf_rpc:
    rpc: action
    xmlns: urn:ietf:params:xml:ns:yang:1
    content: "{{ lookup('ansible.builtin.template', './ping.j2') }}"
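The task above only sends the ping; the reply can also be checked. A minimal sketch, assuming the reply contains the nokiaoper:loss element as in the sample output shown earlier and that a loss of 0.0 is the pass criterion. The ping_reply variable name and the second task are hypothetical additions, not part of the activity's solution:

```yaml
- name: Send a ping from the router to the client to confirm activity
  when: dest_ip is defined
  ansible.netcommon.netconf_rpc:
    rpc: action
    xmlns: urn:ietf:params:xml:ns:yang:1
    content: "{{ lookup('ansible.builtin.template', './ping.j2') }}"
  register: ping_reply  # hypothetical variable name

- name: Fail if the router reported packet loss
  when: dest_ip is defined
  ansible.builtin.assert:
    that:
      # Assumes the prefix and element name seen in the sample reply
      - "'<nokiaoper:loss>0.0</nokiaoper:loss>' in ping_reply.stdout"
    fail_msg: "Ping to {{ dest_ip }} reported packet loss"
```

Matching on the raw XML string keeps the sketch short; parsing the reply would be more robust if the formatting of the loss value varies.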

Modify playbook.yml to make use of this functionality via an additional play.

- name: Check service state on SR OS node
  hosts: sros_nodes
  vars:
    dest_ip: "10.70.11.1"
    rtr_inst: "ansible-vprn"
    src_ip: "10.70.11.101"
    count: "1"
    size: "800"
  gather_facts: False
  roles:
    - role: check_service
  tags: ['check_service_after']
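Values that rarely change, such as count and size, could also live in the role's defaults so that a play only sets what differs. A minimal sketch, assuming the standard defaults/main.yml file inside the check_service role; the values shown are illustrative and not taken from the activity:

```yaml
---
# defaults/main.yml inside the check_service role (standard role layout).
# Role defaults have the lowest variable precedence, so play vars and
# -e extra vars still override them.
count: 1   # illustrative default
size: 56   # illustrative default
# dest_ip is deliberately left undefined: the ping task is skipped
# unless a play (or -e) supplies it.
```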

1.3.6 [Optional] Add leaf21 to the service provided and test the connectivity

As an optional extension, can you build the same automation for SR Linux? This time you'll use client21 and leaf21. There are some resources available online (1, 2) that may be able to help get you started working with the SR Linux Ansible Collection.

To be able to use the collection, the JSON-RPC interface on SR Linux needs to be enabled. This is done by default in the provided topology. The interfaces on client21 and leaf21 re-use the IP addressing information from client01 and pe1. The target configuration for leaf21 is the following:

+     network-instance ansible-vprn {
+         type ip-vrf
+         admin-state enable
+         interface client01 {
+             interface-ref {
+                 interface ethernet-1/1
+                 subinterface 600
+             }
+         }
+     }
+     interface ethernet-1/1 {
+         subinterface 600 {
+             type routed
+             ipv4 {
+                 admin-state enable
+                 address 10.70.11.101/24 {
+                 }
+             }
+             vlan {
+                 encap {
+                     single-tagged {
+                         vlan-id 600
+                     }
+                 }
+             }
+         }
+     }

There are several ways to integrate these additional nodes into your automation. The example solution shows one approach, but don't feel bound by it; feel free to experiment.

Example solution
$ ansible-playbook playbook.yml -i inventory.yml -v
Using /home/nokia/SReXperts/activities/nos/sros/activity-56/ansible.cfg as config file

PLAY [Linux ping to test provisioned service] **************************************************************

TASK [linux_ping : Test service on PE1] ********************************************************************
changed: [clab-srexperts-client01] => {"...(snip)", "rc": 0, "stderr": "... (snip)]}
fatal: [clab-srexperts-client21]: FAILED! => {"...(snip)", "rc": 1, "stderr": "... (snip)]}
...ignoring

PLAY [Check configuration state on SR OS node] *************************************************************

TASK [check_service : Pre-check PE1 service oper-state] ****************************************************
skipping: [clab-srexperts-pe1] => {"...(snip)", "skip_reason": "Conditional result was False"}

TASK [check_service : Render retrieved state to boolean `true` if service exists, false otherwise] *********
skipping: [clab-srexperts-pe1] => {"...(snip)", "skip_reason": "Conditional result was False"}

TASK [check_service : Send a ping from the router to the client to confirm activity] ***********************
skipping: [clab-srexperts-pe1] => {"...(snip)", "skip_reason": "Conditional result was False"}

PLAY [Add configuration to SR OS node] *********************************************************************

TASK [config_service : Configure PE1 service] **************************************************************
skipping: [clab-srexperts-pe1] => {"...(snip)", "skip_reason": "Conditional result was False"}

PLAY [Check service state on SR OS node] *******************************************************************

TASK [check_service : Pre-check PE1 service oper-state] ****************************************************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<data xmlns=\"...(snip)</data>"]}

TASK [check_service : Render retrieved state to boolean `true` if service exists, false otherwise] *********
ok: [clab-srexperts-pe1] => {"ansible_facts": {"service_found": true}, "changed": false}

TASK [check_service : Send a ping from the router to the client to confirm activity] ***********************
ok: [clab-srexperts-pe1] => {"changed": false, "output": null, "stdout": "<rpc-reply xmlns:nc=\"...(snip)</rpc-reply>"]}

PLAY [Add configuration to SR Linux node] ******************************************************************

TASK [config_service_srl : Configure Leaf21 service] *******************************************************
changed: [clab-srexperts-leaf21] => {"changed": true, "jsonrpc_req_id": "2025-05-21 08:55:00:898354", "saved": false}

PLAY RECAP *************************************************************************************************
clab-srexperts-client01    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
clab-srexperts-client21    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1
clab-srexperts-leaf21      : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
clab-srexperts-pe1         : ok=3    changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0

Add the additional Linux host client21 to your inventory file under the linux_hosts group. Create a new group, srl_nodes, that contains clab-srexperts-leaf21:

linux_hosts:
  hosts:
    clab-srexperts-client01:
      ansible_connection: "ssh"
      ansible_ssh_user: "user"
      ansible_ssh_private_key_file: "/home/nokia/.ssh/id_rsa"
    clab-srexperts-client21:
      ansible_connection: "ssh"
      ansible_ssh_user: "user"
      ansible_ssh_private_key_file: "/home/nokia/.ssh/id_rsa"
sros_nodes:
  hosts:
    clab-srexperts-pe1:
      ansible_connection: "ansible.netcommon.netconf"
      ansible_network_os: "<network-os>"
      ansible_ssh_user: "admin"
      ansible_ssh_pass: #PROVIDED#
srl_nodes:
  hosts:
    clab-srexperts-leaf21:
      ansible_connection: "ansible.netcommon.httpapi"
      ansible_network_os: "nokia.srlinux.srlinux"
      ansible_user: "admin"
      ansible_password: #PROVIDED#

Create a new role, config_service_srl, and add a task to main.yml that creates the configuration using the nokia.srlinux collection.

---
  - name: Configure Leaf21 service
    nokia.srlinux.config:
      update:
        - path: /interface[name=ethernet-1/1]/subinterface[index=600]
          value:
            index: 600
            type: routed
            ipv4:
              admin-state: enable
              address:
                ip-prefix: 10.70.11.101/24
            vlan:
              encap:
                single-tagged:
                  vlan-id: 600
        - path: /network-instance[name=ansible-vprn]
          value:
            type: ip-vrf
            admin-state: enable
            interface:
              name: client01
              interface-ref:
                interface: ethernet-1/1
                subinterface: 600
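To return leaf21 to its starting point between test runs, the same module can remove the configuration again. A minimal sketch, assuming the delete option of nokia.srlinux.config takes the same path syntax as update; the task itself is a hypothetical addition:

```yaml
- name: Remove Leaf21 service (cleanup sketch)
  nokia.srlinux.config:
    delete:
      # Same paths as in the update task above
      - path: /network-instance[name=ansible-vprn]
      - path: /interface[name=ethernet-1/1]/subinterface[index=600]
```

Keeping this in a separate role or behind its own tag avoids deleting the service by accident during a normal run.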

You are free to add another ping check or try other verifications. The only required addition is a configuration play that calls the new role to configure the SR Linux node.

---
...
- name: Add configuration to SR Linux node
  hosts: srl_nodes
  gather_facts: False
  roles:
    - role: config_service_srl
      when: hostvars['clab-srexperts-client21'].ping_result.rc != 0
  tags: ['config_service_srl']
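One possible verification is to read the state of the new network-instance back from leaf21. A minimal sketch, assuming the nokia.srlinux.get module returns its data under result in the order of the requested paths and that the oper-state leaf appears at the top level of the reply. The play name, tag and ni_state variable are hypothetical:

```yaml
- name: Check service state on SR Linux node
  hosts: srl_nodes
  gather_facts: False
  tasks:
    - name: Read the network-instance state from leaf21
      nokia.srlinux.get:
        paths:
          - path: /network-instance[name=ansible-vprn]
            datastore: state
      register: ni_state  # hypothetical variable name

    - name: Confirm the network-instance is operationally up
      ansible.builtin.assert:
        that:
          # Assumes oper-state is exposed at the top level of the reply
          - ni_state.result[0]['oper-state'] == 'up'
  tags: ['check_service_srl']
```

Running it once with a debug task on ni_state first is a quick way to confirm the actual structure of the reply before relying on the assertion.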

1.4 Summary and review

Congratulations! If you have made it this far, you have completed this activity and achieved the following:

  • You have used a virtual machine provided to you as a development environment
  • You have built upon existing automation to add your own contributions and improvements
  • You have learned how to interface from Ansible to SR OS (and optionally SR Linux)
  • You have written YAML, XML and Jinja2 files
  • You have seen that Ansible is a tool that allows putting things together to make something greater than the sum of its parts

This is a pretty extensive list of achievements! Well done!

If you're hungry for more, have a go at another activity, or try to expand upon this one if you have more ideas. If you would like to take some of this home with you, try installing Ansible on your own machine and see whether your playbook works from there. The relevant SSH, NETCONF and JSON-RPC ports are exposed on your group's hackathon VM, so reachability should not be a problem.

