Friday, 27 May 2022

Perspectives on the Future of SP Networking: Intent and Outcome Based Transport Service Automation

One lesson we could all learn from cloud operators is that simplicity, ease of use, and “on-demand” are now expected behaviors for any new service offering. Cloud operators built their services with modular principles and well-abstracted service interfaces using common “black box” software programming fundamentals, which allow their capabilities to seamlessly snap together while eliminating unnecessary complexity. For many of us in the communication service provider (CSP) industry, those basic principles still need to be realized in how transport service offerings are requested from the transport orchestration layer.

The network service requestor (including northbound BSS/OSS) initiates an “intent” (or call it an “outcome”) and it expects the network service to be built and monitored to honor that intent within quantifiable service level objectives (SLOs) and promised service level expectations (SLEs). The network service requestor doesn’t want to be involved with the plethora of configuration parameters required to deploy that service at the device layer, relying instead on some other function to complete that information. Embracing such a basic principle would not only reduce the cost of operations but also enable new “as-a-Service” business models which could monetize the network for the operator.

But realizing the vision requires the creation of intent-based modularity for the value-added transport services via well-abstracted and declarative service layer application programming interfaces (APIs). These service APIs would be exposed by an intelligent transport orchestration controller that acts in a declarative and outcome-based way. Work is being done by Cisco in network slicing and network-as-a-service (NaaS) to define this layer of service abstraction into a simplified – yet extensible – transport services model allowing for powerful network automation.

How we got here


Networking vendors build products (routers, switches, etc.) with an extensive set of rich features that we lovingly call “nerd-knobs”. From our early days building the first multi-protocol router, we’ve always taken great pride in our nerd-knob development. Our pace of innovation hasn’t slowed down as we continue to enable some of the richest networking capabilities, including awesome features around segment routing traffic engineering (SR-TE) that can be used to drive explicit path forwarding through the network (more on that later). Yet historically it’s been left to the operator to mold these features together into a set of valuable network service offerings that they then sell to their end customers. Operators also need to invest in building the automation tools required to support highly scalable mass deployments and include some aspects of on-demand service instantiation. While an atomic-level setting of the nerd knobs allows the operator to provide granular customization for clients or services, this level of service design creates complexity in other areas. It drives very long development timelines, service rigidity, and northbound OSS/BSS layer integration work, especially for multi-domain use cases.

With our work in defining service abstraction for NaaS and network slicing and the proposed slicing standards from the Internet Engineering Task Force (IETF), consumers of transport services can soon begin to think in terms of the service intent or outcome and less about the complexity of setting feature knobs on the machinery required to implement the service at the device level. Transport automation is moving towards intent, outcome, and declarative-based service definitions where the service user defines the what, not the how.

In the discussion that follows, we’ll define the attributes of the next-generation transport orchestrator based on what we’ve learned from user requirements. Figure 1 below illustrates an example of the advantages of the intent-based approach weaving SLOs and SLEs into the discussion. Network slicing, a concept inspired by cellular infrastructure, is introduced as an example of where intent-based networking can add value.

Figure 1. Increased confidence with transport services

What does success look like?


The next-generation transport orchestrator should be closed loop-based and implement these steps:

1. Support an intent-based request to instantiate a new transport service that meets specific SLEs/SLOs

2. Map the service intent into discrete changes, validate the proposed changes against available resources and assurance, then implement them (including the service assurance tooling for monitoring)

3. Monitor the health of the service with operational intelligence and service assurance tools, and report on it

4. Observe and signal out-of-tolerance SLO events through operational insights

5. Determine recommended remediations/optimizations with AI tooling drawing on global model data and operational insights

6. Implement recommendations automatically, or pass them to a human for approval

7. Return to monitoring mode
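
The closed-loop steps above can be sketched as a simple control loop. To be clear, this is an illustrative sketch only: the `SLO` class, the function names, and the return values are invented for this example and are not a real Crosswork or CNC API.

```python
from dataclasses import dataclass

@dataclass
class SLO:
    # Quantifiable service level objectives for the transport service (step 1)
    max_latency_ms: float
    max_loss_pct: float

def out_of_tolerance(slo: SLO, latency_ms: float, loss_pct: float) -> bool:
    # Step 4: signal an SLO event when measurements drift outside the intent
    return latency_ms > slo.max_latency_ms or loss_pct > slo.max_loss_pct

def closed_loop_step(slo, telemetry, recommend, apply_change, needs_approval=False):
    """One pass of the monitor -> detect -> remediate loop (steps 3-7)."""
    latency, loss = telemetry()                   # step 3: monitor the service
    if not out_of_tolerance(slo, latency, loss):  # step 4: detect SLO drift
        return "in-tolerance"
    remediation = recommend(latency, loss)        # step 5: AI-driven recommendation
    if needs_approval:                            # step 6: human-in-the-loop option
        return ("pending-approval", remediation)
    apply_change(remediation)                     # step 6: auto-implement
    return ("remediated", remediation)            # step 7: back to monitoring
```

For example, with an SLO of 10 ms latency and a telemetry reading of 25 ms, the loop would return `("remediated", ...)` with whatever remediation the recommendation function proposed.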

Figure 2 shows an example of intent-based provisioning automation. On the left, we see the traditional transport orchestration layer that provides very little service abstraction. The service model is simply an aggregation point for network device provisioning that exposes the many ‘atomic-level’ parameters required to be set by northbound OSS/BSS layer components. The example shows provisioning an L3VPN service with quality of service (QoS) and SR-TE policies, but it’s only possible to proceed atomically. The example requires the higher layers to compose the service, including resource checks, building the service assurance needs, and then performing ongoing change control such as updating and then deleting the service (which may require some order of operations). Service monitoring and the telemetry required to do any service level assurance (SLA) are an afterthought, built separately and not easily integrated into the service itself. The higher-layer service orchestration would need to be custom-built to integrate all these components and wouldn’t be very flexible for new services.

Figure 2. Abstracting the service intent

On the right side of Figure 2, we see a next-gen transport service orchestrator which is declarative and intent-based. The user specifies the desired outcome (in YANG via a REST/NETCONF API), which is to connect a set of network endpoints, also called service demarcation points (SDPs), in an any-to-any way and to meet a specific set of SLO requirements around latency and loss. The idea here is to express the service intent in a well-defined, YANG-modeled way based directly on the user’s connectivity and SLO/SLE needs. This transport service API is programmable, on-demand, and declarative.
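
As a rough illustration, an intent request of this kind might carry nothing more than the endpoints, the connectivity type, and the objectives. The payload below is a sketch expressed as a Python dictionary; the field names are invented for readability and are not the actual YANG paths of the IETF slice service model.

```python
# Hedged sketch of an intent-based transport service request body.
# Field names are illustrative, not real YANG node names.
service_intent = {
    "service-id": "slice-finance-01",
    "sdps": [  # service demarcation points: the endpoints to connect
        {"node": "pe-router-1", "attachment": "GigabitEthernet0/0/1"},
        {"node": "pe-router-2", "attachment": "GigabitEthernet0/0/3"},
        {"node": "pe-router-3", "attachment": "GigabitEthernet0/0/2"},
    ],
    "connectivity": "any-to-any",
    "slo": {  # quantifiable objectives the controller must honor
        "max-one-way-latency-ms": 10,
        "max-packet-loss-pct": 0.01,
        "min-bandwidth-mbps": 500,
    },
    "sle": "low-latency",  # an operator-defined service level expectation
}
```

Notice what is absent: no VPN type, no QoS policy, no SR-TE parameters. The request states only the what; the controller derives the how.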

Figure 3. IETF slice framework draft definitions

The new transport service differentiator: SLOs and SLEs


So how will operators market and differentiate their new transport service offerings? While publishing the SLOs that can be requested (quantifiable bandwidth, latency, reliability, and jitter metrics) will certainly be a part of this, the big differentiators will be the set of SLE “catalog entries” they provide. SLEs are where “everything else” is defined as part of the service intent. What type of SLEs can we begin to consider? See Table 1 below for some examples. Can you think of some new ones? The good news is that operators can flexibly define their own SLEs and map those to explicit forwarding behaviors in the network to meet a market need.

Table 1. Sample SLE offerings
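
One way to picture an operator-defined SLE catalog is as a mapping from each SLE name to the underlay forwarding behavior that realizes it. The sketch below is hypothetical: the SLE names, flex-algo identifiers, and constraint strings are invented for illustration, not taken from Table 1 or any real deployment.

```python
# Hypothetical SLE catalog: each operator-defined SLE maps to the forwarding
# behavior used to realize it. All names and settings here are invented.
SLE_CATALOG = {
    "best-effort":    {"sr-te": None, "metric": "igp"},
    "low-latency":    {"sr-te": "flex-algo-128", "metric": "delay"},
    "encrypted-core": {"sr-te": "flex-algo-129", "metric": "igp",
                       "constraint": "macsec-links-only"},
    "avoid-region":   {"sr-te": "flex-algo-130", "metric": "igp",
                       "constraint": "exclude-affinity:region-x"},
}

def forwarding_behavior(sle_name: str) -> dict:
    # The controller resolves the SLE named in the service intent
    # to the explicit forwarding behavior it maps to.
    try:
        return SLE_CATALOG[sle_name]
    except KeyError:
        raise ValueError(f"SLE {sle_name!r} is not in this operator's catalog")
```

The point of the indirection is that the catalog is the operator's to define: a new market need becomes a new catalog entry, not a new northbound API.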

Capabilities needed in the network


The beauty of intent-based networking is that the approach treats the network as a “black box” that hides detailed configuration from the user. That said, we still need those “nerd-knobs” at the device layer to realize the services (though abstracted by the transport controller in a programmable way). At Cisco, we’ve developed a transport controller called Crosswork Network Controller (CNC) which works together with an IP-based network, utilizing BGP-based VPN technology for the overlay connectivity along with device-layer QoS and SR-TE for the underlay SLOs/SLEs. We’re looking to continue enhancing CNC to meet the full future vision of networking intent and closed loop.

While BGP VPNs (for both L2 and L3), private-line emulation (for L1), and packet-based QoS are well-known industry technologies, we should expound on the importance of SR-TE. SR-TE allows for a very surgical network path forwarding capability that’s much more scalable than earlier approaches. All the services shown in Table 1 will require some aspect of explicit path forwarding through the network. Also, to meet specific SLOs (such as bandwidth and latency), dictating and managing specific path forwarding behavior will be critical to understanding resource availability against resource commitments. Our innovation in this area includes an extensive set of PCE and SR-TE features such as flexible algorithm, automated steering, and “on-demand next-hop” (ODN), as shown in Figure 4.

Figure 4. Intent-based SR-TE with Automated Steering and ODN

With granular path control capabilities, the transport controller, which includes an intelligent path computation element (PCE), can dynamically change the path to keep within the desired SLO boundaries depending on network conditions. This is the promise of software-defined networking (SDN), but when using SR-TE at scale in a service provider-class network, it’s like SDN for adults!

Given the system is intent-based, it should also be declarative. If the user wanted to switch from SLE No. 1 to SLE No. 2 (go from a best-effort latency service to a lowest-latency service), then that should be a simple change in the top-level service model request. The transport controller will then determine the changes required to implement the new service intent and only change what’s needed at the device level (called a minimum-diff operation). This is NOT implemented as a complete deletion of the original service followed by a new service instantiation. Instead, it’s a modify-what’s-needed implementation. This approach allows for on-demand changes that offer the cloud-like flexibility consumers are looking for, including time-of-day and reactive automation.
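
The minimum-diff idea can be sketched as a comparison of the currently rendered device configuration against the newly rendered one, touching only the keys that differ. This is a toy version; a real controller diffs full configuration trees and must order its operations, but the principle is the same.

```python
def minimum_diff(current: dict, desired: dict) -> dict:
    """Return only the device-level changes needed to realize the new intent.

    Toy sketch: real controllers diff whole config trees and sequence changes.
    """
    changes = {}
    for key in current.keys() | desired.keys():
        if key not in desired:
            changes[key] = {"op": "delete"}           # removed from the intent
        elif current.get(key) != desired[key]:
            changes[key] = {"op": "merge", "value": desired[key]}  # modified
    return changes

# Switching SLEs only touches the SR-TE policy; the VPN overlay is untouched.
current = {"l3vpn": {"rd": "65000:1"}, "sr-te": {"policy": "best-effort"}}
desired = {"l3vpn": {"rd": "65000:1"}, "sr-te": {"policy": "low-latency"}}
print(minimum_diff(current, desired))
# → {'sr-te': {'op': 'merge', 'value': {'policy': 'low-latency'}}}
```

The untouched `l3vpn` key is exactly why this is not a delete-and-recreate: the existing overlay keeps forwarding while the underlay policy changes underneath it.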

Even the standards bodies are getting on board


The network slicing concept, initially defined by 3GPP TS 23.501 for 5G services as “a logical network that provides specific network capabilities and network characteristics”, was the first to mandate requesting a service in an intent-based way, based on specific SLOs. This approach has become a generic desire for any network service (not just 5G), and for the transport domain, most service providers look to the IETF for standards definitions. The IETF is working on various drafts to give vendors and operators common definitions and service models for intent-based transport services (called IETF Network Slice Services). These drafts include Framework for IETF Network Slices and IETF Network Slice Service YANG Model.

Figure 5. IETF network slice details

Conclusion


We envision a future where transport network services are requested based on outcomes and intents, in a simplified and on-demand fashion. This doesn’t mean the transport network devices will lose rich functionality – far from it. The “nerd-knobs” will still be there! Rich device features (such as VPN, QoS, and SR-TE) and PCE-level functionality will still be needed to provide the granular control required to meet the desired service objectives and expectations, yet the implementation will now be abstracted into more consumable and user-oriented service structures by the intent-based next-gen transport orchestrator.

This approach is consistent with the industry’s requirements on 5G network slicing and with what some are calling NaaS, which is desired by application developers. In all cases, the common thread is that the service is requested as an outcome that meets specific objectives for a business purpose. Vendors like us are working to develop the proper automation and orchestration systems, for both Cisco and third-party devices, to realize this future-of-networking vision in enhanced, on-demand, API-driven, operator-delivered transport services.

Source: cisco.com

Thursday, 26 May 2022

How to Contribute to Open Source and Why


Getting involved in the open-source community (especially early in your career) is a smart move for many reasons. When you help others, you almost always get help in return. You can make connections that can last your entire career, helping you down the road in ways you can’t anticipate.

In this article, we’ll cover more about why you should consider contributing to open source, and how to get started.

Why Should I Get Involved in Open Source?

Designing, building, deploying, and maintaining software is, believe it or not, a social activity. Our tech careers place us in a network of bright and empathetic professionals, and being in that network is part of what brings job satisfaction and career opportunities.

Nowhere in tech is this more apparent than in the world of free and open-source software (FOSS). In FOSS, we build in public, so our contributions are highly visible and done together with like-minded developers who enjoy helping others. And by contributing to the supply of well-maintained open-source software, we make the benefits of technology accessible around the world.

Where Should I Contribute?

If you’re looking to get started, then the first question you’re likely asking is: Where should I get started? A great starting place is an open-source project that you have used or are interested in.

Most open-source projects store their code in a repository on GitHub or GitLab. This is the place where you can find out what the project’s needs are, who the project maintainers are, and how you can contribute. Because of the collaborative and generous culture of FOSS, maintainers are often receptive to unsolicited offers of help. Often, you can simply reach out to a maintainer and offer to contribute.

For example, are you interested in contributing to Django? They make it very clear: “We need your help to make Django as good as it can possibly be.”


Finding known issues


Most projects keep a list of known issues. You can find a task that fits your knowledge and experience level. For example, the list of issues for Flask shows the following:


Finding tasks for new contributors


Finally, many maintainers take the time to mark specific issues as being better for new contributors. For example, the Electron project applies a “good first issue” label. Notice the “Labels” selector on GitHub. You can use this to filter, showing you the best issues to start with.


Now you’ve got an issue to work on. How should you get started?

The Contribution Process


The basic process for contributing to open source is fairly uniform across all projects. However, you should still read the contributor guidelines for an individual project to be aware of any special requirements.

In general, the process looks like this:

1. Fork the project repository
2. Solve the issue
3. Submit a pull request
4. Wait for feedback

Let’s examine each of these steps in detail. We’ll use GitHub for our examples; most online repositories will operate similarly.

Fork the Project Repository


When you fork a project repository, you create a copy of the project under your own account to do your work on. After you have your own copy, clone it to your machine and be sure to read any special instructions in the project README so that you can get the project up and running locally.

In GitHub, you can simply use the “Fork” button to start this. You’ll find it in the upper-right part of your screen:


As you save the forked repository to your account, you’ll be prompted to provide a name for it.


Solve the Issue


With a forked local copy up and running, you’re now ready to tackle the issue at hand. As you solve the issue, it’s important to keep a few things in mind:

◉ Pay attention to any coding style guidelines provided for the project.
◉ Make sure the project will run as expected, and that any provided tests pass.
◉ Comment your code as needed to help future developers.

Now that you’ve got a solution in place, it’s time to present your solution to the project maintainers.

Submit a Pull Request


The maintainers of the project need to review your proposed changes before they (hopefully) merge those changes into the main project repository. You kick off this process by submitting a pull request (PR).

Open a new PR

You can start PR creation in GitHub right from the original repository by clicking on New pull request on the Pull requests page.


Set up the branch comparison

On the Compare changes page, click on compare across forks.


Choose the branch to merge

When creating a pull request, it’s very important to pay close attention to which branch you want to merge.

The branch in the original repository

First, select the desired branch that the code changes will merge into. Typically this will be the main branch in the original repository, but be sure to check the contributor guidelines.


The branch in your forked repo

Next, select the branch from your forked repository where you did the work.


Give your PR a title and description

Next, you’ll need to provide a title and description for your pull request. Don’t be overly wordy. You can explain your approach, but you should let your code and comments speak for themselves. Maintainers are often tight on time. Make your PR easy to read and review.

Some repositories provide template content for the PR description, and they include a checklist of items to ensure all contributors adhere to their process and format. Pay attention to any special instructions you’ve been given.


Create the pull request

After making sure you’ve provided everything the maintainers are asking for, click Create Pull Request.

You’ve done it! You have submitted your first PR for an open-source project!

Wait for Feedback


You’re likely anxious to hear back on your PR. Again, check the contributor guidelines for what to expect here. Often, it will be some time until you hear back, and maintainers may not want you to nudge them.

If there are any points to address in your PR, maintainers will probably have that conversation with you as a thread in the PR. Watch your email for notifications. Try to respond quickly to comments on your PR. Maintainers appreciate this.

If you need to refactor your code, do so, and then commit the changes. You likely will not need to notify the maintainer, but you should check the contributor guidelines to be sure. The platform (in our case, GitHub) will notify the maintainers of the commit, so they’ll know to look at the PR again.

Source: cisco.com

Tuesday, 24 May 2022

Broadband Planning: Who Should Lead, and How?


As new Federal funding is released to help communities bridge the digital divide, you’ll need to gain a strong understanding of the solutions and deployment options available. Often overlooked, however, is the need to develop and commit to a realistic and inclusive broadband planning process. One that acknowledges the broad variety of stakeholders you’ll encounter and offers a realistic timeline to meet funding mandates. You’ll also need a strong leader. But who should lead and what should the process look like?

Why broadband planning is critical

As a licensed Landscape Architect and environmental planner, I’ve had the opportunity to work with state and local government leaders on a variety of infrastructure projects. In each case, we created and adhered to a detailed planning process. The projects ranged from a few acres to 23,000 acres, from roadways and utilities to commercial and residential communities. Even campuses and parks. Every time, sticking to a detailed planning process made things go smoother, resulting in a more successful project.

As critical infrastructure, broadband projects should adopt the same approach. You’ll benefit greatly by leveraging a well thought out collaborative planning model. Your stress levels will be reduced, your stakeholders happier, and the outcome more resilient and sustainable.

Using a collaborative planning model helps accomplish this by:

◉ Establishing a clear vision and goals

◉ Limiting the scope of the project, preventing “scope-creep”

◉ Creating dedicated milestones to keep you on track

◉ Providing transparency for all stakeholders

◉ Setting a realistic timeline to better plan and promote your project.

Using a collaborative broadband planning process also creates a reference source for media outreach and promotion as milestones are reached. Lastly, by having a recorded process, funding mandates and data-reporting requirements can be met more easily, keeping you and your team in compliance.

Who should lead broadband planning?

My involvement in traditional infrastructure planning has allowed me to experience first-hand how comfortable government personnel are in leading large-scale projects. Why is that? Because:

◉ They’re well versed in local ordinances, regulatory laws, and community standards

◉ They understand their community and its people

◉ They have established relationships that cross the public and private sector.

That’s why I, and many others in the IT industry, feel these same state and local government leaders can offer the most success leading broadband planning in their communities.

In addition, those in planning-specific positions are especially suited to do so, having unique skill sets that address:

◉ What type of infrastructure is needed and where to locate it

◉ Gathering realistic data via surveys, GIS mapping, and canvassing

◉ Construction issues that may serve as potential roadblocks or opportunities

◉ Understanding potential legal and maintenance issues.

A realistic broadband planning process

To help our partners in the public and private sectors achieve greater success in their broadband efforts, we’ve created a new guide. It outlines a realistic, inclusive broadband planning process, including suggested timelines and milestones.


By leveraging our new guide, titled Powering a Future of Inclusive Connectivity, your team can increase the comfort level among stakeholders, increasing buy-in to your project. Moreover, you’ll learn:

◉ Key questions to ask when seeking funding

◉ Considerations when building public/private partnerships

◉ The five steps you need to implement a strong and transparent planning process (including suggested timelines and associated milestones)

◉ Use cases.

Funding for broadband


Up to $800 billion in direct and indirect investment is available over the next 5-10 years to fund broadband. This includes the federal Coronavirus Aid, Relief, and Economic Security (CARES) Act and the American Rescue Plan Act (ARPA). Plus, the Infrastructure Investment and Jobs Act (IIJA).

Each program is unique, so understanding them can be a challenge. As you start your broadband planning process, I encourage you to reach out to the Cisco Public Funding Office. Their experts will be glad to help answer questions and guide you through the funding opportunities that best fit your needs.

Source: cisco.com

Sunday, 22 May 2022

How Cisco DNA Assurance Proves It’s ‘Not a Network Issue’


When something in your house breaks, it’s your problem. When something in your network breaks, it’s everyone’s problem. At least, that’s how it can feel when the sudden influx of support tickets, angry phone calls, and so on start rolling in. They quickly remind you that those numbers behind the traffic visualizations are more than numbers alone. They represent individuals. That includes individuals who don’t notice how the infrastructure supports them until suddenly… it’s not.

The adage that “time is money” applies here, and maybe better than anywhere else. Because when users on the network cannot do what they came to do, the value of their halted actions can add up quickly. That means reaction can’t be the first strategy for preserving a network. Instead, proactive measures that prevent problems (ha, alliteration) become first-order priorities.

That’s where Cisco DNA Center and Assurance come in, along with Leveraging Cisco Intent-Based Networking DNA Assurance (DNAAS) v2.0, the DNAAS course.

Let’s Start with Intent

This will come as no surprise to anyone, but networks are built for a purpose. From a top-down perspective, the network provides the infrastructure necessary to support business intent. Cisco DNA Center allows network admins and operators to make sure that the business intent is translated into network design and functionality. This ensures that the network is actually accomplishing what is needed. Cisco DNA Center has a load of tools, configs, and templates to make the network functional.

What is Cisco DNA Assurance?

Cisco DNA Assurance is the tool that keeps the network live. With it, we can use analytics, machine learning, and AI to understand the health of the intent-based network. DNA Assurance can identify problems before they manifest into critical issues. It allows us to gauge the health of the network across clients, devices, and applications and establish a baseline. From there, we can troubleshoot and identify consistent issues against that baseline before those issues have a significant impact. We don’t have to wait for an outage to act. (Or react.)

We’re no longer stuck in this red-light or green-light situation, where the network is either working or it’s not. When the light goes from green to yellow, we can start saying, “Hey, why is that happening? Let’s get to the root cause and fix it.”

Obviously, this was all-important before the big shift to hybrid work environments, but it’s even more critical now. When you have a problem, you can’t just walk down the hall to the IT guy; you’re sort of stranded on an island, hoping someone else can figure out what’s wrong. And on the other hand, when you’re the person tasked with fixing those problems, you want to know what’s going on as quickly as possible.

One customer I worked with installed Cisco DNA Assurance to ‘prove the innocence of the network.’ He felt that being able to quickly identify the network problem, especially if it was not necessarily a network issue, helped to get fixes done more quickly and efficiently. DNA Assurance helped to rule out the network or ‘prove it was innocent’ and allow him to narrow his troubleshooting focus.

Another benefit of DNA Assurance is that it’s built on Cisco’s expertise. 30+ years of experience with troubleshooting networks and devices have gone into developing Assurance. Its technology doesn’t just give you an overview of the network; it lets you know where things are going wrong and helps you discover solutions.

About the DNAAS course

Leveraging Cisco Intent-Based Networking DNA Assurance (DNAAS) v2.0 is the technology training course we developed to teach users about Cisco DNA Assurance. The course is designed to give a clear understanding of what DNA Assurance can do and to build a deep knowledge of the capabilities of the technology. It’s meant to give new users a firm handle on the technology while increasing the expertise of existing users and empowering them to further optimize their implementation of DNA Assurance.

One of the things we wanted to do was highlight some of the areas that users may not have touched on before. We give them a chance to experience those things and potentially roll them into tangible solutions on their own network. It’s all meant to be immediately actionable. Users can take this course and instantly turn back around and do something with the knowledge.

Labs are one of the ways that we’ve focused on bringing more of the experience to users who are taking the course. New users are going to interact with a real DNA Center instance, and experienced users are going to have the chance to see new configurations. We build out the fundamental skills necessary to use DNA Assurance, rather than focusing on strict use cases.

We treated it like learning to drive a car. We could teach you all the specifics about one highly specialized vehicle, or we could give you the foundational skills necessary to drive anything and allow you to work towards your specific needs.

Overall, students are going to expand their practical knowledge of DNA Assurance and gain actionable skills they can immediately use. DNAAS is an excellent entry into the technology for new users and an equally excellent learning opportunity for experienced users. It helps build important skills that help users to get the most out of the technology and keep their networks running smoothly.

Source: cisco.com

Saturday, 21 May 2022

ChatOps: Managing Kubernetes Deployments in Webex

This is the third post in a series about writing ChatOps services on top of the Webex API. In the first post, we built a Webex Bot that received message events from a group room and printed the event JSON out to the console. In the second, we added security to that Bot, adding an encrypted authentication header to Webex events, and subsequently adding a simple list of authorized users to the event handler. We also added user feedback by posting messages back to the room where the event was raised.

In this post, we’ll build on what was done in the first two posts, and start to apply real-world use cases to our Bot. The goal here will be to manage Deployments in a Kubernetes cluster using commands entered into a Webex room. Not only is this a fun challenge to solve, but it also provides wider visibility into the goings-on of an ops team, as they can scale a Deployment or push out a new container version in the public view of a Webex room. You can find the completed code for this post on GitHub.

This post assumes that you’ve completed the steps listed in the first two blog posts. You can find the code from the second post here. Also, very important, be sure to read the first post to learn how to make your local development environment publicly accessible so that Webex Webhook events can reach your API. Make sure your tunnel is up and running and Webhook events can flow through to your API successfully before proceeding on to the next section. In this case, I’ve set up a new Bot called Kubernetes Deployment Manager, but you can use your existing Bot if you like. From here on out, this post assumes that you’ve taken those steps and have a successful end-to-end data flow.

Architecture

Let’s take a look at what we’re going to build:

[Image: architecture diagram showing the Bot's services and the data flows between them]

Building on top of our existing Bot, we’re going to create two new services: MessageIngestion and Kubernetes. The latter will take a configuration object that gives it access to our Kubernetes cluster and will be responsible for sending requests to the K8s control plane. Our Index Router will continue to act as a controller, orchestrating data flows between services. And our WebexNotification service that we built in the second post will continue to be responsible for sending messages back to the user in Webex.

Our Kubernetes Resources


In this section, we’ll set up a simple Deployment in Kubernetes, as well as a Service Account that we can leverage to communicate with the Kubernetes API using the NodeJS SDK. Feel free to skip this part if you already have those resources created.

This section also assumes that you have a Kubernetes cluster up and running, and both you and your Bot have network access to interact with its API. There are plenty of resources online for getting a Kubernetes cluster set up and getting kubectl installed, both of which are beyond the scope of this blog post.

Our Test Deployment

To keep things simple, I’m going to use Nginx as my deployment container – an easily-accessible image that doesn’t have any dependencies to get up and running. If you have a Deployment of your own that you’d like to use instead, feel free to replace what I’ve listed here with that.

# in resources/nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
        ports:
        - containerPort: 80

Our Service Account and Role

The next step is to make sure our Bot code has a way of interacting with the Kubernetes API. We can do that by creating a Service Account (SA) that our Bot will assume as its identity when calling the Kubernetes API, and ensuring it has proper access with a Kubernetes Role.

First, let’s set up an SA that can interact with the Kubernetes API:

# in resources/sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: chatops-bot

Now we’ll create a Role in our Kubernetes cluster that will have access to pretty much everything in the default Namespace. In a real-world application, you’ll likely want to take a more restrictive approach, only providing the permissions that allow your Bot to do what you intend; but wide-open access will work for a simple demo:

# in resources/role.yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: chatops-admin
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
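
For contrast, a more locked-down Role might grant only the verbs the Bot actually needs on Deployments. This is just a sketch of the idea, and isn’t used in the rest of this post:

# in a hypothetical resources/role-restricted.yaml -- sketch only
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: chatops-deployment-editor
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "patch"]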

Finally, we’ll bind the Role to our SA using a RoleBinding resource:

# in resources/rb.yaml
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: chatops-admin-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: chatops-bot
  apiGroup: ""
roleRef:
  kind: Role
  name: chatops-admin
  apiGroup: ""

Apply these using kubectl:

$ kubectl apply -f resources/sa.yaml
$ kubectl apply -f resources/role.yaml
$ kubectl apply -f resources/rb.yaml

Once your SA is created, fetching its info will show you the name of the Secret in which its Token is stored.
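
For example, something like the following should surface the Secret name and Token. Note that on Kubernetes 1.24 and later, Service Account token Secrets are no longer created automatically, so you may need to create one yourself:

$ kubectl get serviceaccount chatops-bot -o yaml
$ kubectl describe secret <secret-name-from-above>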

[Screenshot: kubectl output showing the Service Account and the name of its token Secret]

Fetching info about that Secret will print out the Token string in the console. Be careful with this Token, as it’s your SA’s secret, used to access the Kubernetes API!

[Screenshot: kubectl output showing the Secret, including the Token string]

Configuring the Kubernetes SDK


Since we’re writing a NodeJS Bot in this blog post, we’ll use the JavaScript Kubernetes SDK for calling our Kubernetes API. You’ll notice, if you look at the examples in the Readme, that the SDK expects to be able to pull from a local kubectl configuration file (which, for example, is stored on a Mac at ~/.kube/config). While that might work for local development, that’s not ideal for Twelve Factor development, where we typically pass in our configurations as environment variables. To get around this, we can pass in a pair of configuration objects that mimic the contents of our local Kubernetes config file and can use those configuration objects to assume the identity of our newly created service account.

Let’s add some environment variables to the AppConfig class that we created in the previous post:

// in config/AppConfig.js
// inside the constructor block
// after previous environment variables

// whatever you’d like to name this cluster, any string will do
this.clusterName = process.env['CLUSTER_NAME'];
// the base URL of your cluster, where the API can be reached
this.clusterUrl = process.env['CLUSTER_URL'];
// the CA cert set up for your cluster, if applicable
this.clusterCert = process.env['CLUSTER_CERT'];
// the SA name from above - chatops-bot
this.kubernetesUserame = process.env['KUBERNETES_USERNAME'];
// the token value referenced in the screenshot above
this.kubernetesToken = process.env['KUBERNETES_TOKEN'];

// the rest of the file is unchanged…

These five lines will allow us to pass configuration values into our Kubernetes SDK, and configure a local client. To do that, we’ll create a new service called KubernetesService, which we’ll use to communicate with our K8s cluster:

// in services/Kubernetes.js

import {KubeConfig, AppsV1Api, PatchUtils} from '@kubernetes/client-node';

export class KubernetesService {
    constructor(appConfig) {
        this.appClient = this._initAppClient(appConfig);
        this.requestOptions = {
            "headers": {"Content-type": PatchUtils.PATCH_FORMAT_JSON_PATCH}
        };
    }

    _initAppClient(appConfig) { /* we’ll fill this in soon */  }

    async takeAction(k8sCommand) { /* we’ll fill this in later */ }
}

This set of imports at the top gives us the objects and methods that we’ll need from the Kubernetes SDK to get up and running. The requestOptions property set on this constructor will be used when we send updates to the K8s API.

Now, let’s populate the contents of the _initAppClient method so that we can have an instance of the SDK ready to use in our class:

// inside the KubernetesService class
_initAppClient(appConfig) {
    // building objects from the env vars we pulled in
    const cluster = {
        name: appConfig.clusterName,
        server: appConfig.clusterUrl,
        caData: appConfig.clusterCert
    };
    const user = {
        name: appConfig.kubernetesUserame,
        token: appConfig.kubernetesToken,
    };
    // create a new config factory object
    const kc = new KubeConfig();
    // pass in our cluster and user objects
    kc.loadFromClusterAndUser(cluster, user);
    // return the client created by the factory object
    return kc.makeApiClient(AppsV1Api);
}

Simple enough. At this point, we have a Kubernetes API client ready to use, and stored in a class property so that public methods can leverage it in their internal logic. Let’s move on to wiring this into our route handler.

Message Ingestion and Validation


In a previous post, we took a look at the full payload of JSON that Webex sends to our Bot when a new message event is raised. It’s worth taking a look again, since this will indicate what we need to do in our next step:

[Screenshot: JSON payload of a Webex message event]

If you look through this JSON, you’ll notice that nowhere does it list the actual content of the message that was sent; it simply gives event data. However, we can use the data.id field to call the Webex API and fetch that content, so that we can take action on it. To do so, we’ll create a new service called MessageIngestion, which will be responsible for pulling in messages and validating their content.

Fetching Message Content

We’ll start with a very simple constructor that pulls in the AppConfig to build out its properties, and one simple public method that calls a couple of stubbed-out private methods:

// in services/MessageIngestion.js
export class MessageIngestion {
    constructor(appConfig) {
        this.botToken = appConfig.botToken;
    }

    async determineCommand(event) {
        const message = await this._fetchMessage(event);
        return this._interpret(message);
     }

    async _fetchMessage(event) { /* we’ll fill this in next */ }

    _interpret(rawMessageText) { /* we’ll talk about this */ }
}

We’ve got a good start, so now it’s time to write our code for fetching the raw message text. We’ll call the same /messages endpoint that we used to create messages in the previous blog post, but in this case, we’ll fetch a specific message by its ID:

// in services/MessageIngestion.js
// inside the MessageIngestion class

// notice we’re using fetch, which requires NodeJS 17.5 or higher, and a runtime flag
// see previous post for more info
async _fetchMessage(event) {
    const res = await fetch(`https://webexapis.com/v1/messages/${event.data.id}`, {
        headers: {
            "Content-Type": "application/json",
            "Authorization": `Bearer ${this.botToken}`
        },
        method: "GET"
    });
    const messageData = await res.json();
    if(!messageData.text) {
        throw new Error("Could not fetch message content.");
    }
    return messageData.text;
}

If you console.log the messageData output from this fetch request, it will look something like this:

[Screenshot: the messageData JSON, with the plain-text and HTML message content highlighted]

As you can see, the message content takes two forms – first in plain text (pointed out with a red arrow), and second in an HTML block. For our purposes, as you can see from the code block above, we’ll use the plain text content that doesn’t include any formatting.

Message Analysis and Validation

This is a complex topic, to say the least, and its complexities are beyond the scope of this blog post. There are a lot of ways to analyze the content of a message to determine user intent. You could explore natural language processing (NLP), for which Cisco offers an open-source Python library called MindMeld. Or you could leverage off-the-shelf (OTS) software like Amazon Lex.

In my code, I took the simple approach of static string analysis, with some rigid rules around the expected format of the message, e.g.:

<tagged-bot-name> scale <name-of-deployment> to <number-of-instances>

It’s not the most user-friendly approach, but it gets the job done for a blog post.

I have two intents available in my codebase – scaling a Deployment and updating a Deployment with a new image tag. A switch statement runs analysis on the message text to determine which of the actions is intended, and a default case throws an error that will be handled in the index route handler. Both have their own validation logic, which adds up to over sixty lines of string manipulation, so I won’t list all of it here. If you’re interested in reading through or leveraging my string manipulation code, it can be found on GitHub.
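
To give you a flavor of it, here’s a heavily simplified stand-in for the scale-command branch of that analysis. This is a sketch only – the real _interpret implementation on GitHub performs much more thorough validation:

```javascript
// Simplified stand-in for the _interpret logic -- the real implementation
// on GitHub includes far more validation.
function interpret(rawMessageText) {
    const text = rawMessageText.trim();
    // match "scale <name-of-deployment> to <number-of-instances>"
    const scaleMatch = text.match(/scale\s+(\S+)\s+to\s+(\d+)/i);
    if (scaleMatch) {
        // shaped like the KubernetesCommand DTO described below
        return {
            type: "scale",
            deploymentName: scaleMatch[1],
            scaleTarget: parseInt(scaleMatch[2], 10)
        };
    }
    // the default case -- handled by the index route handler
    throw new Error(`Unrecognized command: "${text}"`);
}
```

Calling interpret("KubeBot scale nginx-deployment to 3") would yield an object with type "scale", deploymentName "nginx-deployment", and scaleTarget 3.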

Analysis Output

The happy path output of the _interpret method is a new data transfer object (DTO) created in a new file:

// in dto/KubernetesCommand.js
export class KubernetesCommand {
    constructor(props = {}) {
        this.type = props.type;
        this.deploymentName = props.deploymentName;
        this.imageTag = props.imageTag;
        this.scaleTarget = props.scaleTarget;
    }
}

This standardizes the expected format of the analysis output, which can be anticipated by the various command handlers that we’ll add to our Kubernetes service.

Sending Commands to Kubernetes


For simplicity’s sake, we’ll focus on just the scaling workflow of the two I’ve got coded. Suffice it to say, even these two barely scratch the surface of what’s possible in your Bot’s interactions with the Kubernetes API.

Creating a Webex Notification DTO

The first thing we’ll do is craft the shared DTO that will contain the output of our Kubernetes command methods. This will be passed into the WebexNotification service that we built in our last blog post and will standardize the expected fields for the methods in that service. It’s a very simple class:

// in dto/Notification.js
export class Notification {
    constructor(props = {}) {
        this.success = props.success;
        this.message = props.message;
    }
}

This is the object we’ll build when we return the results of our interactions with the Kubernetes SDK.

Handling Commands

Previously in this post, we stubbed out the public takeAction method in the Kubernetes Service. This is where we’ll determine what action is being requested, and then pass it to internal private methods. Since we’re only looking at the scale approach in this post, we’ll have two paths in this implementation. The code on GitHub has more.

// in services/Kubernetes.js
// inside the KubernetesService class
async takeAction(k8sCommand) {
    let result;
    switch (k8sCommand.type) {
        case "scale":
            result = await this._updateDeploymentScale(k8sCommand);
            break;
        default:
            throw new Error(`The action type ${k8sCommand.type} that was determined by the system is not supported.`);
    }
    return result;
}

Very straightforward – if a recognized command type is identified (in this case, just “scale”) an internal method is called and the results are returned. If not, an error is thrown.

Implementing our internal _updateDeploymentScale method requires very little code. However, it leverages the K8s SDK, which, to say the least, isn’t very intuitive. The data payload that we create includes an operation (op) that we’ll perform on a Deployment configuration property (path), with a new value (value). The SDK’s patchNamespacedDeployment method is documented in the Typedocs linked from the SDK repo. Here’s my implementation:

// in services/Kubernetes.js
// inside the KubernetesService class
async _updateDeploymentScale(k8sCommand) {
    // craft a PATCH body with an updated replica count
    const patch = [
        {
            "op": "replace",
            "path":"/spec/replicas",
            "value": k8sCommand.scaleTarget
        }
    ];
    // call the K8s API with a PATCH request
    const res = await this.appClient.patchNamespacedDeployment(
        k8sCommand.deploymentName,
        "default",
        patch,
        undefined, undefined, undefined, undefined,
        this.requestOptions
    );
    // validate the response and return a Notification object to the caller
    return this._validateScaleResponse(k8sCommand, res.body);
}

The method on the last line of that code block is responsible for crafting our response output.

// in services/Kubernetes.js
// inside the KubernetesService class
_validateScaleResponse(k8sCommand, template) {
    if (template.spec.replicas === k8sCommand.scaleTarget) {
        return new Notification({
            success: true,
            message: `Successfully scaled to ${k8sCommand.scaleTarget} instances on the ${k8sCommand.deploymentName} deployment`
        });
    } else {
        return new Notification({
            success: false,
            message: `The Kubernetes API returned a replica count of ${template.spec.replicas}, which does not match the desired ${k8sCommand.scaleTarget}`
        });
    }
}

Updating the Webex Notification Service


We’re almost at the end! We still have one service that needs to be updated. In our last blog post, we created a very simple method that sent a message to the Webex room where the Bot was called, based on a simple success or failure flag. Now that we’ve built a more complex Bot, we need more complex user feedback.

There are only two methods that we need to cover here. They could easily be compacted into one, but I prefer to keep them separate for granularity.

The public method that our route handler will call is sendNotification, which we’ll refactor as follows here:

// in services/WebexNotifications
// inside the WebexNotifications class
// notice that we’re adding the original event
// and the Notification object
async sendNotification(event, notification) {
    let message = `<@personEmail:${event.data.personEmail}>`;
    if (!notification.success) {
        message += ` Oh no! Something went wrong! ${notification.message}`;
    } else {
        message += ` Nicely done! ${notification.message}`;
    }
    const req = this._buildRequest(event, message); // a new private method, defined below
    const res = await fetch(req);
    return res.json();
}

Finally, we’ll build the private _buildRequest method, which returns a Request object that can be sent to the fetch call in the method above:

// in services/WebexNotifications
// inside the WebexNotifications class
_buildRequest(event, message) {
    return new Request("https://webexapis.com/v1/messages/", {
        headers: this._setHeaders(),
        method: "POST",
        body: JSON.stringify({
            roomId: event.data.roomId,
            markdown: message
        })
    })
}
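
The _buildRequest method relies on the _setHeaders helper carried over from the previous post. In case you don’t have it handy, here’s a minimal sketch of its assumed shape, using the same botToken property from the constructor – adapt it to your own class as needed:

```javascript
// Sketch of the _setHeaders helper from the previous post (assumed shape).
class WebexNotifications {
    constructor(appConfig) {
        // the Bot's access token, pulled from the app configuration
        this.botToken = appConfig.botToken;
    }

    // builds the headers used by every call to the Webex API
    _setHeaders() {
        return {
            "Content-Type": "application/json",
            "Authorization": `Bearer ${this.botToken}`
        };
    }
}
```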

Tying Everything Together in the Route Handler


In previous posts, we used simple route handler logic in routes/index.js that first logged out the event data, and then went on to respond to a Webex user depending on their access. We’ll now take a different approach, which is to wire in our services. We’ll start with pulling in the services we’ve created so far, keeping in mind that this will all take place after the auth/authz middleware checks are run. Here is the full code of the refactored route handler, with changes taking place in the import statements, initializations, and handler logic.

// revised routes/index.js
import express from 'express'
import {AppConfig} from '../config/AppConfig.js';
import {WebexNotifications} from '../services/WebexNotifications.js';
// ADD OUR NEW SERVICES AND TYPES
import {MessageIngestion} from "../services/MessageIngestion.js";
import {KubernetesService} from '../services/Kubernetes.js';
import {Notification} from "../dto/Notification.js";

const router = express.Router();
const config = new AppConfig();
const webex = new WebexNotifications(config);
// INSTANTIATE THE NEW SERVICES
const ingestion = new MessageIngestion(config);
const k8s = new KubernetesService(config);

// Our refactored route handler
router.post('/', async function(req, res) {
  const event = req.body;
  try {
    // message ingestion and analysis
    const command = await ingestion.determineCommand(event);
    // taking action based on the command
    const notification = await k8s.takeAction(command);
    // respond to the user 
    const wbxOutput = await webex.sendNotification(event, notification);
    res.statusCode = 200;
    res.send(wbxOutput);
  } catch (e) {
    // respond to the user
    await webex.sendNotification(event, new Notification({success: false, message: e}));
    res.statusCode = 500;
    res.end('Something went terribly wrong!');
  }
});
export default router;

Testing It Out!


If your service is publicly available, or if it’s running locally and your tunnel is exposing it to the internet, go ahead and send a message to your Bot to test it out. Remember that our test Deployment was called nginx-deployment, and we started with two instances. Let’s scale to three:

[Screenshot: Webex room showing a successful scale command and the Bot’s confirmation message]

That takes care of the happy path. Now let’s see what happens if our command fails validation:

[Screenshot: Webex room showing the Bot’s error response to a command that failed validation]

Success! From here, the possibilities are endless. Feel free to share all of your experiences leveraging ChatOps for managing your Kubernetes deployments in the comments section below.

Source: cisco.com