Are you a network engineer and have had to repeat the same boring task at work, every day? Do you feel that there must be a way for you to do a task once, and then “automate” it? Theoretically, an infinite number of times? Or, have you been spending more time cleaning up and correcting configuration mistakes than you spend implementing those configurations? Or maybe you have been hearing a lot about this hot new “thing” called network programmability, but in the middle of the hype, could not figure out what exactly it is?
If any of those cases (and many others) apply to you, then you are in the right place. The fact that you are here, reading this now, means you know that there is probably a solution to your problem(s) in the realm of automation and/or programmability. In this case, buckle up because you are in for a ride!
If you are a network engineer and browsed to this page by mistake, I still urge you to read on. Netflix, Youtube, Facebook and Twitter will still be there when you are done. (Or not.) This is more fun. Trust me!
A Few Definitions For The Road
Before we dive into the nuances of network programmability and automation, let’s clear up some confusion. I hate nothing in the world more than definitions – well, maybe greasy pizza – but this is a necessary evil! In order to start clean, you must understand each of the following: network management, automation, orchestration, modeling, programmability and APIs.
Network Management is an umbrella term that covers the processes, tools, technologies, and job roles, among other things, required to manage a network and the lifecycle of the services offered by that network.
Many standards and frameworks exist today to define the different components of network management. One of them is FCAPS, where the acronym stands for Fault, Configuration, Accounting, Performance and Security Management. FCAPS is geared towards managing the systems that constitute the network.
Another is ITIL. The acronym stands for Information Technology Infrastructure Library and covers an extensive number of practices for IT Services Management (ITSM), which is basically the lifecycle of the services provided by the network. ITIL is divided into 5 major practices: Service Strategy, Service Design, Service Transition, Service Operation, and Continual Service Improvement. Each practice is divided into smaller sub-practices. For example, Service Design includes Capacity Management, Availability Management and Service Catalogue Management while Service Operation includes Incident Management, Problem Management and Request Fulfilment. Some people make a career being ITIL practitioners.
The Merriam-Webster dictionary defines Automation as “the technique of making an apparatus, a process, or a system operate automatically”. In other words, having a system of some sort do the work for you, work that you would otherwise do manually. However, you will have to tell this system what is it that you want to get done, and sometimes, how to do it.
So, configuring a network of routers with dynamic routing protocols, so that these routers speak with each other and figure out the shortest path per destination, is a form of automation. The alternative would be having someone do the calculations manually on a piece of paper, and then configuring static routes on each router. And so is writing a program that configures a VLAN on your switch – or your 500 switches – without someone having to log in to each switch individually and configuring the VLAN via the CLI.
As you have already guessed, the power of automation is not intrinsically in the automation itself. Logging into one switch manually and configuring one VLAN is probably much faster than writing a Python program to do that for you. So why automate? Obviously, the importance of automation is its application to repetitive tasks.
Automation will not only save the time you will spend repeating a task, it also maintains consistency and accuracy of performing that task, over all its iterations. It does not matter if you have 10 or 500 switches. The program you wrote will always go through the exact same steps for eachevery switch, with the exact same result. Every time. Of course, the assumption here is that no errors will happen because of factors external to your program, such as an unreachable switch, wrong credentials configured on a switch, or a switch with a corrupted IOS. Although, you can write a program to detect and mitigate these error conditions!
When you have several systems working together to get a job done, there is typically a need for a system, or a function, to coordinate the execution of the tasks performed by the different systems towards getting this job completed. This coordination function is called Orchestration.
For example, a private or public cloud that provides virtual machines to its users will include different systems to provision the network, compute, virtualization, operating systems, and maybe the applications, for those VMs. Orchestration will provide the function of coordination between all the different systems and applications to get the VM up and running.
Automation and orchestration work well in tandem. Automation covers single tasks. Using software to configure a VLAN on a switch is automation, and so is provisioning a VM over ESXi, or installing Linux on that VM. Orchestration, on the other hand, is the function of coordinating the execution of these automated tasks, in a specific sequence, each task using its own software and each on its respective system. The scope of automation involves single tasks. The scope of orchestration involves a workflow of tasks.
The concept of Modeling Data is not a new concept and is not exclusive to networks or even automation. Data modeling is very involved and is a major branch of data science. For the humble purpose of this blog, let’s use an example to demonstrate what a model is. In Example 1, you can see a configuration snippet of BGP on an IOS-XR router in the left column. In the right column, the specific values for this particular device were removed and replaced with a description of what should be there. A template of sorts.
Example 1 – BGP configuration snippet on IOS-XR and the corresponding data model in tree notation
As you can see, a model is a little more than just a template.
A model describes data types. An IP address is composed of four octets separated by the “.” character and each octet has a value between 0 and 255. An ASN is an integer between 1 and 65535. These two objects, an IP address and ASN, are leaf objects, each having a specific type and each instance of that leaf has a value.
As you have already guessed, a model also describes the data hierarchy. Address families are child objects to the main BGP process. Then the networks injected for a particular address family are children objects to that address family. And the same applies to neighbors: there are global neighbors that are children to the BGP process, and then there are neighbors defined under the different VRFs. You get the point.
So, in order to reflect hierarchy, other object types may exist in a model besides a leaf, such as leaf lists, lists and containers. A leaf-list is a list of leafs. For example, a snmp server configured on router is a leaf object. A list of snmp servers make up a leaf list. All leafs under a leaf list are of the same type.
A list is a group of other objects and has many instances. For example, the VRF in Example 1 is a list. It has children objects of different types (some of which are themselves lists), and at the same time you may have more than one VRF configured under the BGP process. A container is a group of objects of different types, but a container will only have a single instance. An example of a container is the BGP process itself. This is an over simplification of what a data model is just to elaborate on the concept.
Data models used in the arena of network programmability are described using a language called YANG. A “YANG model” is nothing more than a data model described using YANG.
Defining Programmability is not as easy as the previous terms. The reason for this is that the term is used across a very wide spectrum, and means different things depending on what context it is used in. Programming a device or a system basically means giving it instructions to do what you want it to do. A programmable device is a device that can execute different tasks based on the instructions it is given. In the world of electronics, an ASIC (Application Specific Integrated Circuit) is a chip that does one specific function. If this ASIC is built to accept two numbers as input and add those two numbers, it will always do that. A Microprocessor, on the other hand, accepts instructions describing what you want it to do with the input it is given. You can program it to add two numbers, multiply them, or subtract one from the other. A microprocessor is a programmable device while an ASIC is not.
But doesn’t this equate programming to configuration? Configuring a network device is basically telling that device what to do … right? Well, that is tricky question, and it is here that we discuss programmability as used today in the context of network automation.
Programmability for the purpose of network automation is basically the capability to retrieve data, whether configuration or operational data, from a system, or push configuration to a system, using an Application Programming Interface, or API for short. An API is an interface to a system that is designed for software interaction with this system. Contrast this to a router CLI. A CLI requires human interaction to be useful. An API on that same router would be designed so that a Python program, for example, can interact with the router without any human intervention.
But what really is an API?
An API is software running on a system. This software provides a particular function to other software, while not exposing this other software to how this function is performed. An API will typically have a predefined way to reach it, for example, over a specific TCP port. The API will also define the format of the data that it accepts, as well as the data it sends back. It may also define different message types and specific syntax and semantics to avail the services provided by the API. A device that implements an API is said to expose an API to be consumed by other software.
Now to connect the dots. Orchestration coordinates a number of automated tasks in a workflow to implement one of the disciplines of network management. Automation may leverage an API exposed by a device in order to manage this device programmatically. When a data model is used as a reference during programmatic access, the device is said to leverage model-driven programmability.
Quite a mouthful!
Network Programmability: The Details
Network Programmability as a practice is best summarized by the Programmability Stack in Figure 1 below. The stack defines six layers:
1. Application
2. Model
3. Protocol
4. Encoding
5. Transport
6. Infrastructure
Figure 1 – The Network Programmability Stack
(Disclaimer: The Programmability Stack has not been standardized by any industry-recognized entity such as the OSI 7-Layer Model. Therefore, you will probably run into a number of variations of this stack as you progress in your studies of programmability. I found that the one I have drawn here is the best version for the sake of getting the point across. Feel free to contrast it to other version you find elsewhere and tell me what you think in the comments below.)
At the top of the stack is an application that may be a simple python script or a sophisticated network management system such as Cisco Prime. At the bottom of the stack is the device exposing an API. In order for the application to programmatically speak with the device’s API, it will leverage a choice of components at the different layers of the stack.
The application will choose a model. Different types of models exist, the majority today described in YANG. A model may be vendor-specific or standards-based. In either case, the model will define the structure of the data that the application sends to or receives from the device (through the API).
The application will have to choose a protocol that defines the message types as well as the syntax and semantics used in those messages. There are three primary protocols used today for network programmability: NETCONF, RESTCONF and gRPC.
NETCONF, for example, defines three message types: hello, rpc and rpc-reply. It also defines specific operations that may be used in the rpc message to perform tasks such retrieving operational data from a device or pushing configuration to a device. The messages will use specific, well-defined syntax.
The protocols themselves are sometimes described as the APIs. Don’t get confused just yet ! When a device exposes a NETCONF API, then the application will have no choice but to use the NETCONF protocol to speak with the device. The same applies to the other protocols.
A protocol will have a choice of a data format, typically referred to as the data encoding. The most common data encodings in use today are XML, JSON, YAML and GPB. For example, NETCONF will send and receive data only in the form of XML documents. RESTCONF supports both, XML and JSON.
Then this data will be transported back and forth between the application and the device that is exposing the API using a transport protocol. For example, NETCONF uses SSH while RESTCONF uses HTTP.
This is network programmability in a nutshell!
0 comments:
Post a Comment