Friday 14 December 2018

Integrating Warehouses and Distribution Centers into the Enterprise Network

We’re witnessing digital transformation all around us, and it’s driving fierce competition, new challenges, and compelling opportunities in every industry. We’ve talked about some of the ways businesses are using the Internet of Things (IoT) to extend operations beyond traditional carpeted spaces inside the office and out to environments of all kinds.

In this edition, we’ll focus on how organizations with complex enterprise supply chains are transforming their businesses. Global retailers, distributors, manufacturers, healthcare, pharmaceuticals, and container operators are all driving innovation in the Extended Enterprise.

These innovators are exploring new digital transformation strategies to increase profitability, fulfill orders faster, improve safety, reduce cost and complexity, and more.

Everything Starts with Connectivity

To achieve these outcomes, secure connectivity across your entire supply chain is key.

You may be hampered by siloed and independent enterprise operations in your warehouses and distribution facilities due to inadequate system integration, lack of real-time visibility, and inconsistent safety controls. Manual and error-prone inventory and reporting processes cost time and money, and create dissatisfied customers. To overcome these challenges and increase the productivity and safety of your employees, you need to digitize your entire enterprise.

This isn’t always easy in busy and open industrial environments with massive volumes of traffic and employees constantly on the move. You need smart cameras and sensors to monitor operations and physical security to protect yourself against asset loss and theft, but also sophisticated cybersecurity to keep sensitive company and customer data from being compromised.

By securely connecting various warehouse systems – sensors, automated conveyors and sorters, safety and security systems, and smart mobile devices – you can monitor all warehouse systems and operations closely and gain access to data from disparate sources. You can then analyze the data and develop insights to improve safety and operational efficiency.

Finally, you need real-time visibility to track heavy equipment and other assets, so you can perform timely maintenance and keep resources available and working at their best.

What You’ll Need

What’s required to connect and digitize warehouses? Enterprise supply chain environments are often harsh and dynamic, may be exposed to the weather, and require ruggedized connectivity to warehouse systems. To support employees and equipment on the move, they also require wireless connectivity everywhere. Physical security and real-time surveillance systems are key to keeping materials, people, and assets safe.

To keep IT operational costs low and centrally manage the entire enterprise, the warehouse network must also be fully manageable using the same familiar management and policy-based orchestration tools used in the core enterprise. The network should provide secure access between warehouse devices and the enterprise, with the ability to scale and replicate to more warehouses and distribution centers as the organization evolves and grows.

Putting the Extended Enterprise to Work

Let’s take a closer look at how Cisco’s IoT solutions can help retailers scale to hundreds of connected and fully digitized warehouses and fulfillment centers.

Cisco Industrial Ethernet Switches and Cisco Outdoor and Industrial Access Points are purpose-built, ruggedized solutions that can operate in environments without air conditioning and deliver high-performance Gigabit Ethernet connectivity for warehouse systems – sensors, cameras, sorters, conveyors, mobile phones, etc. You can use Cisco DNA Center to design, provision, and manage these switches and APs.

Retailers also require end-to-end management and smooth communication with upper-layer systems such as Supervisory Control and Data Acquisition (SCADA), Warehouse Control Systems (WCS), and cloud services. Cisco Industrial Ethernet Switches and Cisco Industrial Security Appliances (ISA) offer strong security mechanisms to enable secure integration and communication with these upper-layer systems. Cisco Industrial Integrated Services Routers extend connectivity further to service fleets such as delivery trucks for efficient fleet management and tracking of shipments.

Cisco Extended Enterprise solutions improve tracking and management of assets and inventory, as well as machinery and shipping equipment. Robust cybersecurity and video surveillance help keep data and assets safe. Retailers also gain complete visibility into operations across the organization by fully integrating their supply chain into business applications.

No matter what type of warehouse or distribution center environment you’re supporting, Cisco Extended Enterprise solutions can help you extend your organization to gain real competitive advantages in today’s IoT age.

Wednesday 12 December 2018

Simplify Your Communications Migration Options: Private & Public Cloud Solutions

In this post, we tackle another common challenge in migration planning – the decision to deploy sites and users on multi-tenant or “public” cloud solutions versus dedicated, virtualized systems in “private” cloud solutions. In the communications industry, PBX vendors and service providers tend to promote the specific solutions that they sell, taking an almost “black and white” perspective on the benefits of public versus private cloud options.

Public or private cloud. Where do you stand?


If you are a service provider or IT manager, you are probably bombarded by strong opinions on each side – especially for telecom and business communications services. The resulting cacophony of opinions only serves to confuse decision-makers and delay migration plans. To bring some clarity, this blog post tackles the merits of both public and private cloud, and particularly where these two solutions can work together.

The reality, of course, is that there are advantages to both private and public cloud. In fact, a 2018 survey of almost 1,000 IT professionals reveals that 81% of businesses have a “multi-cloud” strategy, as shown below. On the right-hand side of the chart, you can see some added color on what “multi-cloud” means: of that 81%, 10% are entirely private cloud, 21% are entirely public cloud, and 51% use a mix of public and private.

Figure 1: RightScale | 2018 State of the Cloud Report

These numbers present a difficult starting point for migration planning for both IT managers and CSPs. While in the long term (10+ years) most applications are envisioned for public cloud, the data show that a variety of intermediate steps and near-term plans (over ~5 years) will require a mix that includes private cloud. To help business IT planners and CSPs, this post will cover:

◈ Description of private and public cloud relative to premises equipment systems
◈ Migration scenarios where a mix of private and public cloud solutions might make sense
◈ CSP opportunity to productize a mix of cloud solution offers

Premises vs. Private vs. Public Cloud

The differences between private and public cloud are subtle but important. From the perspective of end-user applications, there may be no significant difference between the receipt of services from private and public cloud infrastructure. For IT planners and CSPs, the differences can have significant implications. Before describing the attributes of these solutions, it is important to provide some context – the typical starting point for a communications migration, the premises-based PBX system. Below are the key attributes of premises-based PBX, private cloud, and public cloud solutions.

Premises-based PBX:

◈ Financing: Systems infrastructure is purchased as a capital expense and depreciated over the life of the equipment.
◈ Management: Systems are managed by the business’s IT staff or in combination with an IT tech firm to provide maintenance and support.
◈ Location: Individual systems are located on a site – in either a wiring closet or, for bigger sites, in an IT rack with power, cooling, controlled access, etc.
◈ Telecom Services (PSTN): Telecom and PSTN services are delivered from a 3rd party (CSP typically) to a “demarcation” point and interconnected to the PBX to allow PSTN calling and emergency services.
◈ Integrations: Connections into 3rd party cloud applications (e.g., salesforce.com) or local systems (e.g., elevators, entry/exit systems, and alarm systems) are managed as “custom” integrations leveraging native PBX APIs and through a specialist vendor or through a preferred IT tech firm.
◈ Economics: Provides lower cost / seat / month for larger sites with simple telephony needs.

Private cloud or dedicated, virtualized cloud systems:

◈ Financing: Systems infrastructure can be owned by the CSP with services delivered for a monthly service charge or owned by the business and treated as a capital expense. There are a variety of financing models depending on the CSP or PBX vendor.
◈ Management: Systems infrastructure can be managed by the CSP, an IT services firm, or the PBX vendor directly. There are multiple options for who manages systems, provides upgrades, and support.
◈ Location: Systems can be located in CSP data centers, third-party IT data centers, or even the business’s own data centers.
◈ Telecom Services: Telecom and PSTN services can either be bundled communications applications or split off and managed as separate contracts.
◈ Integrations: Similar to PBX systems with most integrations managed as “custom” projects leveraging the PBX APIs. There may be some limitations based on location or if the virtualized system leverages a specialized provisioning and management system or “wrapper.”
◈ Economics: Provides lower cost / seat / month when deployed as part of a broader “private cloud” deployment and serves more basic communications needs.

Public cloud or multi-tenant deployments:

◈ Financing: Systems infrastructure is owned and managed by the CSP. Services are provided off this infrastructure to the business for a monthly fee, typically priced by the user, metered charge (e.g., minutes), or virtual service (e.g., IVRs, Meeting Bridges).
◈ Management: Systems are managed by the CSP, including upgrades, maintenance, and support. CSP systems serve multiple businesses.
◈ Location: Systems are typically managed by the CSP in a state-of-the-art data center that leverages public cloud providers (e.g., Google, Amazon, etc.) with extensive peering to partner networks.
◈ Telecom Services: Telecom and PSTN services are typically bundled with user, group, and virtual services with limited to no options to split traffic across multiple wholesale carriers.
◈ Integrations: Connections to 3rd party cloud providers are typically provided as part of standard service packages. There may be limitations to what is actually supported via CSP APIs.
◈ Economics: Provides lower cost / seat / month for multi-site deployments, with more advanced services, and more 3rd party cloud integrations.

Hybrid Cloud Migration Options

From the above description of the attributes of private and public cloud deployments, you may begin to see some of the scenarios where a business may want to take advantage of a mix of cloud solutions. Some common scenarios are as follows:

Large HQ with many small branch offices: Many businesses and specific verticals combine a large headquarters or regional site with multiple smaller branch offices. Headquarters sites typically serve knowledge workers, office workers, and executives. Within these large sites or even campus locations, PBX systems may integrate with co-located business systems serving specific business applications or added communications needs. These sites are different from branch locations. Consider this scenario in particular for retail sites and branch banks. The smaller sites may only need basic telephony for lightweight customer service and mobility applications.

Recommendation: These particular businesses may prefer private cloud for the larger sites or campus environments and public cloud for the branch locations. Larger sites can take advantage of PBX economics and legacy co-located integrations. The branch sites take advantage of multi-site services, reduced management costs, and enhanced mobility offers from public cloud services.

Site(s) with integrated call/contact center(s): In some cases, the PBX that services a large site serves two roles – providing local calling for business telephony applications and providing inbound and outbound calling for contact center employees. There may be certain integrations and efficiencies already achieved through using the same call control platform to support both applications. Deploying a PBX via this approach could be part of a broader effort to integrate employee calling and contact center services.

Recommendation: Sites with integrated contact centers might be targeted for a private cloud deployment while all other sites are targeted for public cloud. The business may want to look at vendors that bring a combination of contact center and enterprise telephony on a public cloud – though such a migration should be planned on its own timeline.

Differing regional cloud migration strategies: There are still some considerable differences in the maturity of cloud communications offers based on region. These differences apply to CSPs, PBX vendors, and IT tech integrators. They may also apply to how partners support private cloud solutions.

Recommendation: Businesses should complete a thorough assessment of the maturity of both private and public cloud offers across regional sites. A site-specific recommendation might be necessary along with an interim solution that combines private and public cloud communications across regions.

Communications Service Provider – Hybrid Solution Productization

Given all the instances where businesses may prefer a mix of public and private cloud, CSPs should consider how to “productize” such an offer with a mix of solutions. Such an offer may need to include “managed PBX” options where PBXs remain on site for a period of time during interim migration phases. CSPs should include state-of-the-art SIP trunking and network connectivity to support managed premises equipment sites along with public and private cloud service offerings.

Current forecasts show that businesses consume a mix of public and private cloud services along with a mix of SaaS, PaaS, IaaS, and other IT solutions. Increasingly, CSPs benefit from partnerships that let them offer a more complete portfolio, enabling them to drive sales engagements. Note the forecast below from a recent Bain Brief publication, which brings forward data from IDC, Gartner, and Forrester:

Figure 2: Bain.com

A CSP that brings such a mix of managed equipment, private cloud, and public cloud offers opens the possibility of winning coveted large enterprise opportunities and forming deep partnerships with these businesses as it helps them plan a prudent and transparent cloud migration.

Though businesses pay a premium for these types of migration offers, they can ultimately achieve 1) operating and management efficiencies through the consolidation of vendors and systems, 2) productivity benefits as they chart the path to greater enterprise-wide services engagement (see incremental cut-over plans) and 3) reduced risk of disruption to mission-critical communications and data services.

One of the most important considerations for CSPs in productizing a mix of cloud offers is pricing. Studies have shown that a lack of pricing transparency is one of the critical concerns that can delay or deter a business’s cloud migration. For vendors, offering clear pricing is critical to engaging with businesses’ IT leaders. An August 2018 BCG study suggests that over the next five years, over 90% of revenue growth in the hardware and software sectors will come from either on-premises or off-premises offerings sold with “cloud-like” pricing models. BCG goes on to state that re-thinking industry pricing models is “no longer optional.” Thus, the CSP that can position a combination of public and private cloud solutions, all delivered with cloud-like pricing, will likely see strong receptivity in the marketplace.

CSPs should prepare to encounter a variety of existing cloud solutions and “in process” cloud migrations as a “starting point” for preparing hybrid migration offers. This starting point might include an existing mix of premises systems, managed virtualized systems, and some public cloud solutions. The CSP should try to use a small number of relatively simple starting points to enable them to scale their offer. They should consider the Pareto principle (80:20 rule) where the majority of prospects can be targeted with a smaller subset of solutions. Defining these productized solutions may require the CSP to consider geographic or industry vertical characteristics. See below for the wide variety of cloud migration statuses reported in a 2016 McKinsey survey:

Figure 3: McKinsey & Company

This McKinsey chart shows that a CSP may want to consider a different mix of public and private cloud solution offers depending on whether it plans to target financial services, healthcare, or insurance. Most important of all, CSPs need to understand that they will struggle if they try to productize, and especially scale, a hybrid migration offer as a one-size-fits-all approach.

In conclusion, there is a tremendous opportunity for both businesses and CSPs to benefit from a migration strategy that includes a mix of both private and public cloud solutions. The extent to which private and public cloud offers can work together determines how much both businesses and CSPs benefit. In the most basic case, businesses and CSPs should look for combined solutions that can emulate an integrated or single cloud solution. Private and public cloud solutions can emulate a single cloud from a functional, economic, and/or delivery point of view – providing benefits in the form of added productivity, cost transparency, and migration project risk reduction.

Friday 7 December 2018

Automating Your Network Operations, Part 2 – Data Models

Keeping your IT infrastructure operational


Before I get into data models, I want to take a slight diversion to incorporate some of the feedback that I received on the first blog. It was pointed out that my use of the ios_config module was “naïve.” I contend that it is more accurate to say that my use of Ansible in general was “naïve,” since this was a pretty straightforward use of the module. In any case, it was by design. Why? Well, if you are like the majority of the network operators that I’ve worked with over the years, you are not a programmer (or you are a “naïve” programmer). You spend 110% of your time busting your chops making sure that the network that underpins your company’s IT infrastructure is operational. I certainly do not want to discourage network operators from going to classes, workshops, etc. to learn new skills if they have the time and the motivation, but that should not be a prerequisite for consuming network automation. We can do better!

The NTP example that I used in the first blog was also chosen for specific reasons. First, NTP and AAA were the topics I got the most questions about from customers on this issue. They were able to configure these things initially, but ran into problems when they moved to an operational posture. Second, it is representative of the “naïve” approach that you’ll see in the examples and workshops that are being delivered in the Ansible network automation genre. These “naïve” examples and workshops perpetuate the first reason. Lastly, there is no way to use the ios_config module, the most-used Ansible module for Cisco IOS, to fix this. Nor is the issue addressed by an NTP-specific Ansible network module. Yes, there might be some other module in some git repo written by someone that addresses your corner case, but do you want to use it? Is writing a module for every single corner case for network automation a scalable approach? I say no. Partially because it is intractable, partially because it is a support nightmare.

Where do you get help for the modules that underpin the automation?


Red Hat supports a specific set of modules (generally the ones it writes); other vendors support their own modules; other modules are completely unsupported. Where do you get help for the modules that underpin the automation that makes your network work? Incidentally, we’ll cover a technique to augment the ios_config module with a parser to address this problem in a later blog, but it is not intuitively obvious to the casual observer.

I believe that we need to focus on a smaller number of more capable modules and couple that with a more sophisticated, or more realistic, approach to network automation. Subsequent blogs will focus on different aspects of this more realistic approach, but this one will start with the biggest: Data Models.

So, what is a data model?


I know what you are thinking: “Wait a minute! I didn’t need data models for automating my servers!” Well, building a system is a well-defined procedure with relatively few permutations or interdependencies on other systems. Also, provisioning a system generally consists of configuring values like hostname, IPs, DNS, AAA, and packages. Each of these is a key/value pair (e.g., nameservers = 8.8.8.8, 8.8.4.4) that defines the operation of that system, and there are relatively few of them for a given system.

This is not the case for a network element. If we take a standard 48-port Top-of-Rack Switch, each port could have a description, a state, a VLAN, an MTU, etc. A single ToR could have hundreds of key/value pairs that dictate its operation. Multiply that across hundreds or even thousands of switches, and the number of key/value pairs grows rapidly. Collectively, all of these key/value pairs make up the Source of Truth (SoT) of your network and there can be a lot of them. In fact, automating networks is really more of a data management and manipulation problem than it is a programming problem.
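As a purely hypothetical illustration, the Source of Truth entry for a single ToR port might be captured as a handful of key/value pairs like this (the interface name and values below are made up for the example):

# Hypothetical Source of Truth entry for one ToR port (illustrative values only)
Ethernet1/1:
  description: rack12-server03-eth0
  enabled: true
  mode: access
  vlan: 110
  mtu: 9216

Multiply a stanza like that by 48 ports, and then by every switch in the fleet, and the scale of the data management problem becomes obvious.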

So, what is a data model? Generally speaking, a data model is a structure in which the meaning of a key/value pair is defined by its relative position in that structure. As an example, let’s start with the de facto standard in the networking space: YANG. According to RFC 7950, “YANG is a data modeling language used to model configuration data, state data, Remote Procedure Calls, and notifications for network management protocols.”

If we look at the way we set up a simple BGP peering on Cisco IOS and Juniper JunOS, we basically have a bunch of values accompanied by a bunch of words, using a particular grammar, that describe those values.

But the values, two ASNs and an IP address, are the only things that really matter, and they are the same on both platforms.

In fact, the switch hardware does not care about the words that describe those values since they get stored in a config DB anyway. The words are what the engineers gave to the humans to communicate the meaning of those values to the hardware. After all, we can’t just specify 2 ASNs because we need to know which is the local and which is the remote. We could, however, communicate their meaning by order: e.g. <Local ASN>, <Peer ASN>, <Peer IP>. This is basically a small data model. Well, BGP gets A LOT more complicated, so we’ll need a more capable data model. Here we have an example of the same data in the OpenConfig data model rendered in YAML:

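A sketch of what such a rendering might look like is below; the structure loosely follows the openconfig-bgp model but is abbreviated, and the ASNs and peer address are illustrative:

# Abbreviated openconfig-bgp data rendered as YAML (illustrative values)
bgp:
  global:
    config:
      as: 65001
  neighbors:
    neighbor:
      - neighbor-address: 192.0.2.2
        config:
          neighbor-address: 192.0.2.2
          peer-as: 65002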

The data in the model contains the information needed to deliver either of the syntax-specific versions… just add words. Yes, we still have words as tags in the model, but the model normalizes those tags across vendors and gets rid of the grammar needed to specify how values relate to each other. We do not want to add words back if we can avoid it, so the next step is to encode all of this data into XML and shove it into the device via NETCONF.

NETCONF/YANG gives us programmability, but we still need automation since the two are not the same. This is where Ansible enters back into the story. In my opinion, it is the best of the open-source IaC tools for delivering data models to devices.
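As a rough sketch of what that delivery step could look like, here is a hypothetical Ansible task that pushes the OpenConfig BGP data from above as XML via the netconf_config module. It assumes a netconf connection to a device that actually supports the openconfig-bgp model, and the ASNs and address remain illustrative:

# Hypothetical task; assumes connection: netconf and a device that supports openconfig-bgp
- name: Push OpenConfig BGP data over NETCONF
  netconf_config:
    content: |
      <config>
        <bgp xmlns="http://openconfig.net/yang/bgp">
          <global>
            <config>
              <as>65001</as>
            </config>
          </global>
          <neighbors>
            <neighbor>
              <neighbor-address>192.0.2.2</neighbor-address>
              <config>
                <neighbor-address>192.0.2.2</neighbor-address>
                <peer-as>65002</peer-as>
              </config>
            </neighbor>
          </neighbors>
        </bgp>
      </config>

The point is that the playbook carries the data model, not the vendor grammar.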

Wednesday 5 December 2018

Cisco Threat Response with Email Security Integration: Harmonizing Your Security Products

Those of us who have been in security for more than 20 years are very familiar with the assertion that security is a process. For me, security has always been a process, like a melody that ties together all the other parts of the song. Staying with this musical analogy, if process is the melody, and you consider Cisco’s security portfolio as different instruments, then Cisco Threat Response leads this beautiful orchestration of investigation. Threat Response focuses on the process aspect of security. In this blog post, I want to introduce you to its value as an incident response tool and show you how to best utilize it with the integration of the Cisco Email Security product.

What is Cisco Threat Response?


Just as harmony is essential to a great piece of music, it’s equally critical to modern security architecture. Today’s Security Operations Centers (SOCs) are inundated with alerts to potential threats, all scattered across a discordant array of security products that don’t always like to play nice with each other. More often than not, this results in a lot of wasted labor and time. Think of Cisco Threat Response as a conductor that harmonizes the various components of your security infrastructure. And with Cisco’s open architecture, the Security portfolio of products works together like an orchestra.

Threat Response integrates threat intelligence from Cisco Talos and the various third-party sources that make up your SOC to automatically research indicators of compromise (IOCs) and confirm threats quickly. By channeling threat data into a single user-friendly interface, your SOC analysts can quickly aggregate alerts, investigate, and remediate threats lurking in your systems, network, and cloud. No matter how big or small your organization may be, Threat Response is designed to scale, ensuring your cybersecurity programs run efficiently, effectively, and most of all harmoniously.

Cisco Email Security


Email is easily the most pervasive technology across businesses. It’s hard to imagine living without it. However, as integral as it is to day-to-day business functions, it’s also one of the most commonly exploited attack vectors. With over 90% of breaches starting via email, having an email security solution in place is no longer a luxury, but a necessity.

Cisco Email Security provides best-in-class efficacy and has been recognized for the third year in a row as the top vendor in The Radicati Group’s 2018 Secure Email Gateway report. In addition, the integration of Email Security with Cisco Threat Response provides you with a more robust approach to email security. This layered defense provides industry-leading protection against malware, phishing, spoofing, ransomware, and business email compromise (BEC).

Avoid Phishy Melodies


Most threat narratives have a very similar beginning. An attacker has phished their way into a network via an email and successfully socially engineered a user into disclosing their credentials. Unfortunately, this is usually followed by organizations discovering far too late that they’ve been compromised. Remember, however, that in order for the above scenario to play out, the attacker has to complete a series of phases of their attack without being detected. Luckily, as the defender, all you need to do is detect them in one phase! Even if something does get through, if you can detect that threat and respond with a countermeasure before the attacker completes that phase, you have won.

Cisco Threat Response brings your threat detection capabilities into simple focus with:

Simpler Integration: Let’s say you have a situation where AMP for Endpoints has detected a malicious file represented by a file hash (SHA) or Stealthwatch has alerted you to a user connecting to a suspicious URL. You can use Threat Response to pivot directly to Cisco Email Security and ask for the email messages associated with that SHA or URL; this information can then be used to stop the attack campaign dead in its tracks.

Simpler Data Tracking: Want to know where that malicious email or potential threat came from? Threat Response allows you to track and isolate which user received that SHA or URL and lets you block the domain of origin without needing to switch programs. Additionally, if said malicious link was shared across your network, you can also see when and where it was sent and prevent it from spreading further.

Simpler Workflow: As I mentioned earlier, Cisco Threat Response is all about simplifying threat response by harmonizing the various tools you already have in place into a single resource. Not only does this allow security analysts to reduce the number of interfaces they’re using, but Threat Response also provides them with context-based graphs, telemetry charts, and even response suggestions.

Simpler Threat Hunting: As your analysts respond to threats, they can also log their threat response processes and observations into the integrated Casebook function. Since Casebooks are built on cloud APIs and data storage, they can also follow you from product to product, across your entire Cisco Security portfolio. In turn, this allows for faster and more effective threat response and remediation across your enterprise, thanks to analysts’ ability to easily build and share Casebooks.

Sunday 2 December 2018

Automating Your Network Operations, Part 1 – Ansible Basics

I’ve spent the last couple of years at Red Hat helping customers automate their networks with Ansible. If there is one thing that I’ve learned during that time, it is that network automation is not as easy as many would have you believe. That is not to say that tools like Ansible are not good tools for automation or that anyone is trying to sell you snake oil, but I believe that there is a fundamental impedance mismatch in translating the success Ansible has had with automating systems to automating networks.

Part of this disconnect stems from a fundamental misunderstanding of the capabilities that Ansible provides. According to Red Hat, Ansible is a “common language to describe your infrastructure.” In practice, however, Ansible is more of a framework that brings an inventory of things together with a set of modules, plugins, and Jinja2 capabilities that perform operations on those things. The language, rendered in YAML or JSON, just passes key/value pairs between the modules, plugins, and Jinja2 capabilities. (Yes, that’s a simple description of a complex tool, but one that is accurate enough to illustrate the point of this and subsequent blogs.)

That is not to say that Ansible is not a powerful framework, but it has no native linguistic ability to describe a network. When I say an “inventory of things,” it is because Ansible really does not care what that thing is. Because of its agentless approach, it can talk to many things: systems, network devices, clouds, lightbulbs, etc. This is a great capability and part of why Ansible is so popular, but Ansible truly does not know one thing from another. It has no innate prowess for automating networks. It is simply a tool for automating what an operator does task by task. You cannot “describe” what you want OSPF to look like on your network. You simply provide a bunch of key/value pairs that get passed to the devices on your network through modules in hopes of yielding the OSPF configuration that you want.
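To make that concrete, here is a hedged sketch of the kind of task you typically see; the process ID and network statement are illustrative:

# Key/value pairs handed to ios_config in hopes of yielding the OSPF config we want
- ios_config:
    parents:
      - router ospf 1
    lines:
      - network 10.0.0.0 0.255.255.255 area 0

Nothing in that task describes OSPF itself; it is just configuration lines pushed at a device.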

Configuring settings on an IOS device


To illustrate this, let’s look at configuring two simple settings on an IOS device: hostname and NTP servers. Using Ansible parlance, we’ll describe the desired end state of the hostname of a particular device. Hostname is a great use case because it is a scalar (i.e., a single value). To change the hostname, the Ansible ios_config module does a simple textual compare against the configuration. If ‘hostname newname’ is not present, it sends that line to the device. Since hostname is a scalar, the old hostname gets replaced by the desired hostname.
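For reference, a minimal sketch of that hostname task (the name itself is illustrative):

# ios_config compares this line against the running config and only sends it if absent
- ios_config:
    lines:
      - hostname newname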

A list of NTP servers, however, is more difficult. Say you’ve set the NTP server to 1.1.1.1 with:

- ios_config:
   lines:
    - ntp server 1.1.1.1

Now you want to change your NTP server to 2.2.2.2, so you do:

- ios_config:
   lines:
    - ntp server 2.2.2.2

Simple, right? But the problem is that you would end up with 2 NTP servers in the configuration:

ntp server 1.1.1.1
ntp server 2.2.2.2

This is because the Ansible ios_config module does not see `ntp server 2.2.2.2` present in the configuration, so it sends the line. Since ntp server is a list, however, it adds a new NTP server instead of replacing the existing one, giving you 2 NTP servers (one that you do not want). To end up with just 2.2.2.2 as your NTP server, you would have to know that 1.1.1.1 was already defined as an NTP server and explicitly remove it… exactly what an operator would do. This is also the case with ACLs, IP prefix-lists, and any other list in IOS. Ansible does not have a native way to describe the desired end state of something simple like NTP servers on a network device, much less something more complex like OSPF, QoS, or Multicast.
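A sketch of the explicit cleanup that implies, reusing the addresses from the example above:

# Mirrors the manual operator workflow: remove the stale server, then add the new one.
# Note that this does not describe a desired end state and is not idempotent.
- ios_config:
    lines:
      - no ntp server 1.1.1.1
      - ntp server 2.2.2.2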

Does that mean that Ansible is not a great tool for network automation? No, but like any tool, it needs to be used for the right task and can only complete a complex task when used in concert with other tools. As a framework, it is not a complete solution.

The intent of this blog series is to go beyond the hype and simple demonstrations prevalent in network automation conversations today and to dive more deeply into how and why to automate your network operations. In the next installment, I’ll talk about data models and why they are a critical piece of any automation framework.

Friday 30 November 2018

AI Ops and the Self-Optimization of Resources

AI Ops includes the ability to dynamically optimize infrastructure resources through a holistic approach. Cisco Workload Optimization Manager is an important component in our strategy of delivering enhanced customer benefits through AI Ops.

Our Strategy for Delivering the Benefits of AI Ops


Cisco is executing a strategy to consistently enhance the customer benefits we deliver through AI-driven Operations (AI Ops). This blog is the latest in a series that describes our strategy, our open architecture, and how we are implementing each of the benefits. In the first blog in this series we defined four categories of benefits from AI Ops:

1. Improved user experience
2. Proactive support and maintenance
3. Self-optimization of resources
4. Predictive operational analytics

Multi-Dimensional AI Ops Strategy


Vendors use the terms AI, machine learning and AI Ops in a variety of ways. Their focus is primarily on hardware. Our strategy for delivering the customer benefits of AI Ops is a broader architectural vision. This vision includes infrastructure, workloads, and enhanced customer support in on-premises and cloud environments. Cisco’s strategy incorporates an open API framework and integrations with Cisco and partner platforms.

Infrastructure management is one dimension of AI Ops, and Cisco Intersight is an integral component of Cisco’s strategy. Managing workloads is another essential dimension, so Cisco Workload Optimization Manager (CWOM) is also an important component of this strategy.

AI Ops Portfolio Working Together


In a prior blog, we explained how Intersight delivers an AI-driven user experience through our open API framework. We posted two blogs in this series to explain how Intersight delivers benefit #2, AI-driven proactive support and proactive maintenance. The proactive support is enabled through the Intersight integration with the Cisco service desk digital intelligence platform. This AI platform (internally referred to as BORG) is used by the Cisco Technical Assistance Center. It includes AI, analytics, and machine learning. In this blog, I explain how we deliver benefit #3, the self-optimization of resources, through monitoring and automation with Cisco Workload Optimization Manager.

Self-Optimization of Resources


The self-optimization of resources includes both on-premises and public cloud infrastructure. You need to monitor and automate across a variety of virtualized environments, containers and microservices.

To ensure that your applications continuously perform and your IT resources are fully optimized, you need full visibility across compute infrastructure and applications, across networks and clouds. And you need all of this intelligence at your fingertips so you can quickly and easily make the right decisions in real time to assure application performance, operate efficiently, and maintain compliance in your IT environment.

Cisco Workload Optimization Manager is an AI-powered platform that delivers this functionality through integrations with Cisco’s multicloud portfolio, ACI, UCS management, HyperFlex, and a broad ecosystem of partner solutions that will continue to grow over time. CWOM continuously analyzes workload consumption, costs, and compliance constraints and automatically allocates resources in real time.

How Does AI Ops Work?


Resource allocation, workload scheduling, and load balancing are concepts that have been critical to efficient IT operations for decades. Workload Optimization Manager uses AI and advanced algorithms to manage complex multicloud environments. It views on-premises resources and the cloud stack as a supply chain of buyers and sellers. CWOM evaluates the options for running workloads and manages resources as “just-in-time” supply to cost-effectively support workload demands, helping customers maintain a continuous state of application health.

CWOM showing cost analysis of pending actions

Many AI Ops solutions are complex to deploy, and they require a significant amount of time to accumulate information before they can be effective for analysis. Workload Optimization Manager is easy to install, and the agentless technology will instantly begin to detect all the elements in your environment, from applications to individual components. The unique decision engine curates workload demand, so it can generate fast, accurate recommendations after collecting data for only a short period of time. CWOM uses three categories of functionality to optimize the use of available resources:

Abstraction: All workloads (applications, VMs, containers) and infrastructure resources (compute, storage, network, fabric, etc.) are abstracted into a common data model, creating a “market” of buyers and sellers of resources.

Analysis: A decision engine applies the principles of supply, demand, and price to the market. There are costs associated with on-premises infrastructure resources, and cloud providers price their resources based on utilization levels. The analytics ensure the right resource decisions are made at the right time.

Automation: Workloads are precisely resourced, automatically, to optimize performance, compliance and cost in real-time. The workloads become self-managing anywhere, spanning on-premises to public cloud environments.

These combined capabilities enable IT to assure application performance, at the lowest cost, while maintaining compliance with policy – from the data center to the public cloud and edge.

Wednesday 28 November 2018

Accelerating Enterprise AI with Network Architecture Search

AI/ML is a dominant trend in the enterprise. While AI/ML is not fundamentally new, the ubiquity of large amounts of observed data, the rise of distributed computing frameworks, and the prevalence of large hardware-accelerated computing infrastructure have led to a new wave of breakthroughs in AI in the last 5 years or so. Today, enterprises are rushing to apply AI in every part of the organization for a wide range of tasks, from making better decisions to optimizing their processes.

However, to reap the benefits of AI, one needs significant investment in teams who understand the entire AI lifecycle, especially how to understand, design, and tune the mathematical models that apply to their use cases. Often these models use bespoke techniques that are known to a select few who are highly trained in the field. Without this tuning, an enterprise can spend a lot of opex simply running the canonical models. How can we help the enterprise accelerate this step? One way is AutoML.

AutoML is a broad class of techniques that helps solve the pain of iteratively designing and tuning models without the personnel investment. It ranges from tuning an existing model (e.g., hyperparameter search) to designing new network models automatically. For those leveraging deep learning, one approach is Neural Architecture Search (NAS), which aims to find the best neural network topology for a given task automatically.

In recent years, several automated NAS methods have been proposed using techniques such as evolutionary algorithms and reinforcement learning. These methods have found neural network architectures that outperform bespoke, human-designed architectures on problems such as image classification and language modeling, and they have improved the state of the art in accuracy. However, these methods have been largely limited by the resources needed to search for the best architecture.

We present a method for NAS called Neural Architecture Construction (NAC) – an automated method to construct deep network architectures with close to state-of-the-art accuracy in less than 1 GPU day, faster than current state-of-the-art neural architecture search methods. NAC works by pruning and expanding a small base network called an EnvelopeNet. It runs a truncated training cycle, compares the utility of different network blocks, and prunes and expands the base network based on these statistics. Most conventional neural architecture search methods iterate through a full training cycle on a number of intermediate networks, comparing their accuracy, before discovering a final network. The time needed to discover the final network is limited by the need to run a full training and evaluation cycle on each intermediate network generated, resulting in long search times. In contrast, NAC speeds up the construction process because the pruning and expansion can be done without needing to wait for a full training cycle to complete.

Figure 1: Results comparing our NAC with other state-of-the-art work. Note the search times for both datasets. The NAC numbers for ImageNet are preliminary.

Interestingly, our NAC algorithm mirrors theories on the ontogenesis of neurons in the brain. Brain development is believed to consist of neurogenesis, where the neural structure initially develops, gradually followed by apoptosis, where neural cells are eliminated, hippocampal neurogenesis, where more neurons are introduced, and synaptic pruning, where synapses are eliminated. Our NAC algorithm consists of analogous steps run in iterations: model initialization with a prior (neurogenesis), a truncated training cycle, pruning filters (apoptosis), adding new cells (hippocampal neurogenesis), and pruning of skip connections (synaptic pruning). Artificial neurogenesis has been previously studied as, among others, a method for continuous learning in neural networks.

We also open-sourced a tool called AMLA, an Automated Machine Learning frAmework for implementing and deploying neural architecture search algorithms. AMLA is designed to deploy these algorithms at scale and allow comparison of the performance of the networks generated by different AutoML algorithms. Its key architectural features are the decoupling of network generation from network evaluation, support for network instrumentation, an open model specification, and a microservices-based architecture for deployment at scale. In AMLA, AutoML algorithms and training/evaluation code are written as containerized microservices that can be deployed at scale on public or private infrastructure. The microservices communicate via well-defined interfaces, and models are persisted using standard model definition formats, allowing plug-and-play of AutoML algorithms as well as AI/ML libraries. This makes it easy to prototype, compare, benchmark, and deploy different AutoML algorithms in production.

To help users incorporate NAS into their regular AI/ML workflows, we are working on integrating our NAS efforts into Kubeflow, an open-source platform that simplifies the management of AI/ML lifecycles on Kubernetes-based infrastructure. Once integrated, these NAS tools will help users optimize network architectures in addition to hyperparameter optimization (e.g., the Katib tool within Kubeflow).

We believe that this is just the tip of the iceberg (for AutoML, and NAS in particular). However, these early results have given us confidence that we can design better mechanisms for AutoML that require fewer resources to operate, a step toward accelerating the adoption of AI in the enterprise.