Sunday 3 May 2020

Cisco Secure Cloud Architecture for AWS

More and more customers are deploying workloads and applications in Amazon Web Services (AWS). AWS provides a flexible, reliable, secure, easy-to-use, scalable, and high-performance environment for workloads and applications.

AWS recommends a three-tier architecture for web applications. The tiers are separated so that each can perform its function independently: a multilayer web application has a presentation layer (web tier), an application layer (app tier), and a database layer (database tier). This gives you the flexibility to change each tier independently of the others, and because the application requires scalability and availability, the three-tier architecture lets each tier scale and stay available on its own.


Figure 1: AWS three-tier architecture

AWS uses a shared responsibility model, i.e., customers are still responsible for protecting their workloads, applications, and data. The three-tier architecture above offers a scalable and highly available design, and each tier can scale in or scale out independently, but Cisco recommends adding proper security controls for visibility, segmentation, and threat protection.


Figure 2: Key pillars of a successful security architecture

Cisco recommends protecting workloads and applications in AWS using the Cisco Validated Design (CVD) shown in Figure 3. All of the components in this design have been verified and tested in the AWS cloud. The design brings together Cisco and AWS security controls to provide visibility, segmentation, and threat protection.

Visibility: Cisco Tetration, Cisco Stealthwatch Cloud, Cisco AMP for Endpoint, Cisco Threat Response, and AWS VPC flow logs.

Segmentation: Cisco Next-Generation Firewall, Cisco Adaptive Security Appliance, Cisco Tetration, Cisco Defense Orchestrator, AWS security group, AWS gateway, AWS VPC, and AWS subnets.

Threat Protection: Cisco Next-Generation Firewall (NGFWv), Cisco Tetration, Cisco AMP for Endpoints, Cisco Umbrella, Cisco Threat Response, AWS WAF, AWS Shield (DDoS – Basic or Advanced), and Radware WAF/DDoS.

Another key pillar is Identity and Access Management (IAM): Cisco Duo and AWS IAM


Figure 3: Cisco Validated Design for AWS three-tier architecture

Cisco security controls used in the validated design (Figure 3):

◉ Cisco Defense Orchestrator (CDO) – CDO can now manage AWS security groups. CDO provides micro-segmentation capability by managing the host firewalls on the workloads.

◉ Cisco Tetration (SaaS) – The Cisco Tetration agent on AWS instances forwards network flow and process information; this information is essential for gaining visibility and enforcing policy.

◉ Cisco Stealthwatch Cloud (SWC) – SWC consumes VPC flow logs, CloudTrail, AWS Inspector, AWS IAM, and more. SWC includes compliance-related observations and provides visibility into your AWS cloud infrastructure.

◉ Cisco Duo – Cisco Duo provides multi-factor authentication (MFA) for the AWS console and for applications running on the workloads.

◉ Cisco Umbrella – The Cisco Umbrella virtual appliance is available for AWS; using DHCP options, administrators can configure Cisco Umbrella as the primary DNS. The Cisco Umbrella cloud provides a way to configure and enforce DNS-layer security for workloads in the cloud (see the sketch after this list).

◉ Cisco Adaptive Security Appliance Virtual (ASAv): Cisco ASAv provides a stateful firewall, network segmentation, and VPN capabilities in an AWS VPC.

◉ Cisco Next-Generation Firewall Virtual (NGFWv): Cisco NGFWv provides capabilities such as stateful firewalling, application visibility and control, next-generation IPS, URL filtering, and network AMP in AWS.

◉ Cisco Threat Response (CTR): Cisco Threat Response has API-driven integration with Umbrella, AMP for Endpoints, and SWC (coming soon). Using this integration, the security operations team can gain visibility and perform threat hunting.
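
As a rough illustration of the Umbrella item above, the snippet below uses boto3 to create a VPC DHCP options set whose DNS servers point at Umbrella resolvers and to associate it with a VPC. The resolver IPs, region, and VPC ID are placeholders, not values from the CVD, and the actual deployment steps may differ.

# Illustrative sketch only: hand out Umbrella resolvers as the VPC's DNS
# servers through a DHCP options set. All IDs and IPs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# DNS servers point at the Umbrella virtual appliances in the VPC (placeholders)
dhcp_options = ec2.create_dhcp_options(
    DhcpConfigurations=[
        {"Key": "domain-name-servers", "Values": ["10.0.1.10", "10.0.2.10"]},
    ]
)

# Associate the options set with the VPC so new DHCP leases use Umbrella DNS
ec2.associate_dhcp_options(
    DhcpOptionsId=dhcp_options["DhcpOptions"]["DhcpOptionsId"],
    VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
)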

AWS controls used in the Cisco Validated Design (Figure 3):

◉ AWS Security Groups (SG) – AWS security groups provide micro-segmentation capability by applying firewall rules directly on the instance virtual interface (elastic network interface, or ENI); see the sketch after this list.

◉ AWS Web Application Firewall (WAF) – AWS WAF protects against web exploits.

◉ AWS Shield (DDoS) – AWS Shield protects against DDoS.

◉ AWS Application Load Balancer (ALB) and Network Load Balancer (NLB) – AWS ALB and NLB provide load balancing for incoming traffic.

◉ AWS Route 53 – AWS Route 53 provides DNS-based load balancing and is used to load-balance RA VPN (SSL) sessions across multiple firewalls in a VPC.
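
To illustrate the security group item above, here is a minimal boto3 sketch of tier-to-tier micro-segmentation: an app-tier security group that only accepts traffic sourced from the web-tier group. The VPC ID, group IDs, and port are placeholders, not values from the CVD.

# Illustrative sketch only: allow the app tier to be reached solely from
# instances carrying the web-tier security group. IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

app_sg = ec2.create_security_group(
    GroupName="app-tier-sg",
    Description="App tier - allow web tier only",
    VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
)

# Permit TCP/8080 only from members of the web-tier security group
ec2.authorize_security_group_ingress(
    GroupId=app_sg["GroupId"],
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 8080,
            "ToPort": 8080,
            "UserIdGroupPairs": [{"GroupId": "sg-0aaaaaaaaaaaaaaaa"}],  # web-tier SG (placeholder)
        }
    ],
)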

Radware controls used in the Cisco Validated Design (Figure 3):

◉ Radware (WAF and DDoS): Radware provides WAF and DDoS capabilities as a service.

Cisco recommends enabling the following key capabilities on Cisco security controls. These controls not only provide unmatched visibility, segmentation, and threat protection, but they also help in adhering to security compliance requirements.


In addition to the above Cisco security controls, Cisco recommends using the following native AWS security components to protect workloads and applications.


Friday 1 May 2020

Creating More Agile and Upgradable Networks with a Controller-Based Architecture


All enterprises are in a constant state of digital flux, striving to keep existing business processes running efficiently while pushing to build new applications to satisfy customer and B2B requirements. Underlying these initiatives is the enterprise network, connecting employees and applications to the world of customers and business partners. From the data center to wired and wireless campus offices to distributed branch sites and cloud applications, the network unifies communications, collaboration, and commerce.

But the network too is in the throes of digital flux. The network infrastructure devices—switches, routers, wireless access points (APs)—are frequently in need of upgrades to add new capabilities and software fixes and apply security patches to protect against new threats. In other cases, where a business requires extremely high availability—such as a securities trading nexus—upgrades to the network infrastructure may be few and far between and only high-priority security patches are applied over the span of years. In addition, different divisions of the enterprise may require variants of the core network operating system (NOS) in order to fine-tune performance for different business operations, mandated uptime, and traffic types.

Keeping the constellations of routers, switches, and access points up to date with the latest versions and variants of the NOS and security patches is a monumental task for budget-constrained IT organizations. The traditional upgrade path is an extremely manual process: finding the correct golden version of the NOS image, downloading it, testing it on each platform, installing it on each component, and manually comparing the previous network status to the upgraded state—and potentially rolling back the upgrade in case of unexpected issues. One. Box. At. A. Time. The process is often so complex that IT dedicates months to evaluate and test an upgrade before deploying it. Meanwhile, business needs may go unmet as changes to the network are frozen, awaiting new capabilities from an upgrade.

As organizations seek to rapidly adopt new technologies and launch digital transformation projects—IoT, edge computing, mobile applications—and prepare for Wi-Fi 6 and 5G traffic increases, the network must be able to change frequently and on-demand to keep up with business needs. The old ways of manually managing software for thousands of network components simply will not suffice to keep the enterprise competitive.

In this post, we will take a much deeper dive into the benefits of controller-based networking.

What is a Controller-Based Network?


Cisco introduced the idea of Intent-Based Networking as a software-defined architecture that interprets business requirements—intents—and translates them into network actions such as segmentation, security policies, and device onboarding. Controllers are key to bringing purposeful intelligence into an Intent-Based Network. Controllers act as intermediaries between human operators specifying intents, and all the switches, routers, and access points that provide the required connectivity.

Controllers, such as Cisco DNA Center and Cisco vManage, translate intents into configurations and policies that are downloaded to network infrastructure devices—switches, routers, access points—that provide the connectivity to computers, mobile devices, and applications. Controllers provide visibility into the network by actively monitoring network nodes, analyzing telemetry, latency, Quality of Service levels, and error data in real time, and reporting statistics, alerts, and anomalies to IT managers. In turn, this provides insights into how the network has been functioning, along with its current state to ensure that the intents are being accomplished. Insights into network history and current states play a critical role in managing and automating the maintenance and upgrade processes.

Network Process Automation at Scale


With thousands of switches, routers, and APs requiring management, automating as many network processes as possible reduces the workload of IT as well as the chances of human error. In particular, controllers are key to automating enterprise-wide upgrade and patching processes at scale. Instead of upgrading individual switches and routers one at a time, an upgrade intent is set at the controller level, and the controller automates the entire upgrade process in stages (a conceptual sketch follows the list below).

◉ Controllers can run network-wide checks—available storage space, uptime criticality, version control—to ensure readiness before upgrading the image for each type and location of device.

◉ Controllers automatically search and download images from Cisco Cloud repositories based on feature sets and network device types currently in use.

◉ The correct golden images are automatically staged to each switch and router, eliminating the need for an operator to manually copy and monitor the progress one at a time.

◉ Based on the uptime criteria of each section of the network, the actual upgrades are scheduled for the most appropriate time and automatically started.

◉ Controllers perform a pre-check of network devices to catalogue current network operating statistics such as number of clients, number of ports in use, and traffic levels.

◉ During the post-check phase, controllers observe the impact of the upgrades on the network by comparing pre-check statistics with post-upgrade statistics to ensure that the network is operating as expected.

◉ IT can add customized lists of pre- and post-check items—such as ensuring applications (cloud and on-premise) are reachable and responding appropriately—to run before and after the upgrade.

◉ Should network operating parameters be negatively impacted, the controllers can automatically initiate a rollback of the update to the previous state.
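
The following is a conceptual sketch only, not a controller API: it strings the staged phases above into a single flow. Every method on the hypothetical controller object is an assumption made purely for illustration; a real controller such as Cisco DNA Center exposes these phases through its own workflows.

# Conceptual sketch of a staged, controller-driven upgrade (hypothetical API)
def upgrade_site(controller, site, golden_image):
    devices = controller.devices_in(site)

    # Pre-checks: readiness (storage, uptime criticality, version) plus a
    # snapshot of operating statistics for later comparison
    ready = [d for d in devices if controller.readiness_check(d, golden_image)]
    baseline = {d: controller.snapshot_stats(d) for d in ready}

    # Stage the golden image and schedule each upgrade for its window
    for device in ready:
        controller.stage_image(device, golden_image)
        controller.schedule_upgrade(device, window=site.maintenance_window)

    # Post-checks: compare against the baseline and roll back on regression
    for device in ready:
        after = controller.snapshot_stats(device)
        if controller.regressed(baseline[device], after):
            controller.rollback(device)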


These general steps outline a series of programmable events that are ultimately driven by the intents of the organization filtered through IT and the controllers. Controllers provide the ability to change an organization’s structure and operations much faster. For example, an enterprise can react more quickly when preparing for a new application being rolled out to hundreds of branch sites and the network needs an upgrade to local branch routers to ensure application quality of experience. Automating the updates from a central management controller saves time, travel, and quickly prepares the branches for new business processes.

Automation at scale becomes even more important when dealing with the scope of IoT network infrastructure. With a geographically distributed network of nodes that connect thousands of IoT devices in the field and factory, being able to reach out from a central cloud management controller and apply segmentation rules and security policies to protect device connections is a very practical method of securing them. In cases like these, centralized controller automation eliminates thousands of hours of technician truck rolls.

Open Controller APIs Extend Programmability for Third-Party Automations


Automated updates to routers, switches, and access points are only one of the ways a controller-based network increases enterprise speed and agility. With an open set of APIs, controllers can communicate with higher-level applications and services such as application-aware firewalls and Internet Protocol Address Management (IPAM) tools. To help automate network health management and support, IT can also employ APIs to program controllers to send network device analytics and events to an IT Service Management (ITSM) system. In turn, the ITSM can send commands and approvals to controllers to kick off specific actions, such as times to schedule upgrades on the network. The two-way communication through APIs provides IT with a flexible method to minimize hands-on technical operations and free up valuable talent for other projects.
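
As a hedged sketch of that pattern, the snippet below pulls the device inventory from a controller's REST API (the paths follow Cisco DNA Center's published Intent API; verify them against your version) and forwards unreachable devices to a hypothetical ITSM endpoint. The hostnames, credentials, and ITSM URL are placeholders.

# Sketch: read controller inventory and raise ITSM events for unreachable devices
import requests

DNAC = "https://dnac.example.com"                 # placeholder controller
ITSM = "https://itsm.example.com/api/events"      # hypothetical ITSM endpoint

# Obtain an API token (DNA Center uses HTTP basic auth for this call)
token = requests.post(
    f"{DNAC}/dna/system/api/v1/auth/token",
    auth=("admin", "password"),   # placeholder credentials
    verify=False,                 # lab-style certificate handling
).json()["Token"]

# Retrieve the network device inventory
devices = requests.get(
    f"{DNAC}/dna/intent/api/v1/network-device",
    headers={"X-Auth-Token": token},
    verify=False,
).json()["response"]

# Forward unreachable devices to the ITSM system for ticketing
for dev in devices:
    if dev.get("reachabilityStatus") != "Reachable":
        requests.post(ITSM, json={"device": dev["hostname"], "status": dev["reachabilityStatus"]})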

Controllers Are Foundational for Building Intent-Based Networks


As the examples in this post show, controllers are the basis for implementing Intent-Based Networks that intelligently monitor, pinpoint abnormal operations, and proactively apply remedies to keep the network optimized for performance and security. They significantly reduce the hands-on manual labor traditionally required for upgrading and patching complex networks. Controllers automate pre-checks, post-checks, and rollbacks to ensure network continuity and protect against human error. Automation of these processes frees up IT talent to work on business transformation initiatives while making the network easier to change to meet new business needs.

Thursday 30 April 2020

Writing Production-ready Code; Nornir Edition

Let's start with the least interesting topic: documentation. You have to consider two aspects here:

1. Peripheral files: Files not part of the project, such as architecture diagrams or README files, are good places for newcomers to learn about your work. Be complete yet concise; describe how everything fits together as simply as you can. Invest 10 minutes to learn Markdown or reStructuredText if you don't know them.

2. The code itself: Perhaps you’ve heard the term “self-documenting code”. Python makes this easy as many of its idioms and semantics read like plain English. Resist the urge to use overly clever or complex techniques where they aren’t necessary. Comment liberally, not just for others, but as a favor to your future self. Developers tend to forget how their code works a few weeks after it has been written (at least, I know I do)!

I think it is beyond dispute that static code analysis tools such as linters, security scanners, and code formatters are great additions to any code project. I don’t have strong opinions on precisely which tools are the best, but I’ve grown comfortable with the following options. All of them can be installed using pip:

1. pylint: Python linter that checks for syntax errors, styling issues, and minor security issues
2. bandit: Python security analyzer that reports vulnerabilities based on severity and confidence
3. black: Python formatter to keep source code consistent (spacing, quotes, continuations, etc.)
4. yamllint: YAML syntax formatter; similar to pylint but for configuration files

Sometimes you won't find a public linter for the code you care about. Time permitting, write your own. Because the narc project consumes JSON files as input, I wrote a simple jsonlint.py script that just finds all JSON files, attempts to parse Python objects from them, and fails if any exceptions are raised. That's it. I'm only trying to answer the question "Is the file formatted correctly?" I'd rather know right away instead of waiting for Nornir to crash later.

import json
import os
import sys

# Directory containing the JSON variable files (how the real script picks
# this is not shown here; default to the current directory)
path = sys.argv[1] if len(sys.argv) > 1 else "."

failed = False
for varfile in os.listdir(path):
    if varfile.endswith(".json"):
        filepath = os.path.join(path, varfile)
        with open(filepath, "r") as handle:
            try:
                # Attempt to load the JSON data into Python objects
                json.load(handle)
            except json.decoder.JSONDecodeError as exc:
                # Print specific file and error condition, mark failure
                print(f"{filepath}: {exc}")
                failed = True

# If failure occurred, use rc=1 to signal an error
if failed:
    sys.exit(1)

These tools take little effort to deploy and have a very high "return on effort". However, they are superficial in their test coverage and wholly insufficient by themselves. Most developers begin testing their code by first constructing unit tests. These test the smallest, atomic (indivisible) parts of a program, such as functions, methods, or classes. It is like electronics manufacturing, where a component on a circuit board may be tested by measuring the voltage across two pins: the measurement means little in the context of the board's overall purpose, but the component it validates is critical to the larger, complex system. The same concept is true for software projects.

It is conventional to contain all tests, unit or otherwise, in a tests/ directory parallel to the project's source code. This keeps things organized and allows your code project and test structure to be designed differently. My jsonlint.py script lives here, along with several other files beginning with test_. This naming convention is common in Python projects to identify files containing tests. Popular Python testing tools/frameworks like pytest will automatically discover and execute them.

$ tree tests/
tests/
|-- data
| |-- cmd_checks.yaml
| `-- dummy_checks.yaml
|-- jsonlint.py
|-- test_get_cmd.py
`-- test_validation.py

Consider the test_get_cmd.py file first, which tests the get_cmd() function. This function takes in a dictionary representing an ASA rule to check and expands it into a packet-tracer command that the ASA will understand. Some people call this "unparsing" as it transforms structured data into plain text. This process is deterministic and easy to test; given any dictionary, we can predict what the command should be. In the data/ directory, I've defined a few YAML files which contain these test cases. I usually recommend keeping static data out of your test code and developing general test processes instead. The narc project supports TCP, UDP, ICMP, and raw IP protocol flows, so my test file should have at least 4 cases. Using nested dictionaries, we can define individual cases that represent the chk input values, and the expected_cmd field contains the expected packet-tracer command. I think the file is self-explanatory, and you can check test_get_cmd.py to see how this file is consumed (a sketch of that pattern follows the YAML below).

$ cat tests/data/cmd_checks.yaml
---
cmd_checks:
  tcp_full:
    in_intf: "inside"
    proto: "tcp"
    src_ip: "192.0.2.1"
    src_port: 5001
    dst_ip: "192.0.2.2"
    dst_port: 5002
    expected_cmd: >-
      packet-tracer input inside tcp
      192.0.2.1 5001 192.0.2.2 5002 xml
  udp_full:
    in_intf: "inside"
    proto: "udp"
    src_ip: "192.0.2.1"
    src_port: 5001
    dst_ip: "192.0.2.2"
    dst_port: 5002
    expected_cmd: >-
      packet-tracer input inside udp
      192.0.2.1 5001 192.0.2.2 5002 xml
  icmp_full:
    in_intf: "inside"
    proto: "icmp"
    src_ip: "192.0.2.1"
    dst_ip: "192.0.2.2"
    icmp_type: 8
    icmp_code: 0
    expected_cmd: >-
      packet-tracer input inside icmp
      192.0.2.1 8 0 192.0.2.2 xml
  rawip_full:
    in_intf: "inside"
    proto: 123
    src_ip: "192.0.2.1"
    dst_ip: "192.0.2.2"
    expected_cmd: >-
      packet-tracer input inside rawip
      192.0.2.1 123 192.0.2.2 xml
...
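
For reference, a test that consumes the YAML above might look roughly like the sketch below. The real test_get_cmd.py in the narc repo may be organized differently, and the import path for get_cmd() is an assumption.

# Sketch of a data-driven test over cmd_checks.yaml (import path assumed)
import yaml
from narc.helpers import get_cmd  # assumed location of get_cmd()


def test_get_cmd_cases():
    # Load every case defined in the YAML file above
    with open("tests/data/cmd_checks.yaml", "r") as handle:
        cases = yaml.safe_load(handle)["cmd_checks"]

    for name, chk in cases.items():
        # The expected command lives alongside the inputs in each case
        expected = chk.pop("expected_cmd")
        assert get_cmd(chk) == expected, f"case {name} failed"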

All good code projects perform some degree of input data validation. Suppose a user enters an IPv4 address of 192.0.2.1.7 or a TCP port of -1. Surely the ASA would throw an error message, but why let it get to that point? Problems don't get better over time, and we should test for these conditions early. In general, we want to "fail fast". That's what the test_validation.py script does, and it works in conjunction with the dummy_checks.yaml file. Invalid "check" dictionaries should be logged and not sent to the network device.
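
Here's a minimal illustration of that fail-fast idea (not the actual narc validation code): reject malformed IP addresses and out-of-range ports before anything is sent to a device.

# Illustrative fail-fast validation of a "check" dictionary
import ipaddress


def is_valid_check(chk):
    # IP addresses must parse cleanly, e.g. "192.0.2.1.7" is rejected
    try:
        ipaddress.ip_address(chk["src_ip"])
        ipaddress.ip_address(chk["dst_ip"])
    except (ValueError, KeyError):
        return False

    # Ports, when present, must fall within 0-65535 (so -1 is rejected)
    for key in ("src_port", "dst_port"):
        if not 0 <= int(chk.get(key, 0)) <= 65535:
            return False
    return True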

As a brief aside, data validation is inherent when using modeling languages like YANG. This is one of the reasons why model-driven programmability and telemetry are growing in popularity. In addition to removing the arbitrariness of data structures, it enforces data compliance without explicit coding logic.

We've tested quite a bit so far, but we haven't tied anything together yet. Always consider building some kind of integration/system-level testing into your project. For narc, I introduced a feature named "dryrun" that is easily toggled using a CLI argument at runtime. This code bypasses the Netmiko logic and instead generates simulated (sometimes called "mocked") XML output for each packet-tracer command. This runs instantly and doesn't require access to any network devices. We don't really care if the rules pass or fail (hint: they'll always pass), just that the solution is plumbed together correctly.

The diagram below illustrates how mocking works at a high level, and the goal is to keep the detour as short and as transparent as possible. You want to maximize testing before and after the mocking activity. Given Nornir's flexible architecture with easy-to-define custom tasks, I've created a custom _mock_packet_trace task. It looks and feels like netmiko_send_command as it returns an identical result, but is designed for local testing.
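
A rough sketch of what such a custom task can look like in Nornir is shown below; the real _mock_packet_trace in narc builds more realistic XML, and the canned output here is just a placeholder.

# Sketch of a custom Nornir task that returns canned packet-tracer output
from nornir.core.task import Result

# Simplified placeholder output; the real task generates fuller XML
MOCK_XML = "<result><action>allow</action></result>"


def _mock_packet_trace(task):
    # No device connection: return simulated packet-tracer output so the
    # rest of the runbook can be exercised without network access
    return Result(host=task.host, result=MOCK_XML)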


How do we tie this seemingly complex string of events together? Opinions on this topic, as with everything in programming, run far and wide. I’m old school and prefer to use a Makefile, but more modern tools exist like Task which are YAML-based and less finicky. Some people just prefer shell scripts. Makefiles were traditionally used to compile code and link the resulting objects in languages like C. For Python projects, you can create “targets” or “goals” to run various tasks. Think of each target as a small shell script. For example, make lint will run the static code analysis tools pylint, bandit, black, yamllint, and the jsonlint.py script. Then, make unit will run pytest on all test_*.py files. Finally, make dry will execute the Nornir runbook in dryrun mode, testing the system as a whole (minus Netmiko) with mock data. You can also create operational targets unrelated to the project code. I often define make clean to remove any application artifacts, Python byte code .pyc files, and logs.

Rather than having to type out all of these targets, a single target can reference other targets. For example, consider the make test target which runs all 4 targets in the correct sequence. You can simplify it further by defining a “default goal” so that when only make is typed, it invokes make test. We developers are lazy and cherish saving 5 keystrokes per test run!

.DEFAULT_GOAL := test
.PHONY: test
test: clean lint unit dry

Ideally, typing make should test your entire project, from the simplest syntax checking to the most involved integration/system testing. Here are the full, unedited logs from my dev environment relating to the narc project. I recommend NOT obscuring your command outputs; it is useful to see which commands have generated which outputs.

$ make
Starting clean
find . -name "*.pyc" | xargs -r rm
rm -f nornir.log
rm -rf outputs/
Starting clean
Starting lint
find . -name "*.yaml" | xargs yamllint -s
python tests/jsonlint.py
find . -name "*.py" | xargs pylint

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

find . -name "*.py" | xargs bandit --skip B101
[main] INFO profile include tests: None
[main] INFO profile exclude tests: None
[main] INFO cli include tests: None
[main] INFO cli exclude tests: B101
[main] INFO running on Python 3.7.3
Run started:2020-04-07 15:47:27.239623

Test results:
        No issues identified.

Code scanned:
        Total lines of code: 670
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0.0
                Low: 0.0
                Medium: 0.0
                High: 0.0
        Total issues (by confidence):
                Undefined: 0.0
                Low: 0.0
                Medium: 0.0
                High: 0.0
Files skipped (0):
find . -name "*.py" | xargs black -l 85 --check
All done!
11 files would be left unchanged.
Completed lint
Starting unit tests
python -m pytest tests/ --verbose
================= test session starts ==================
platform linux -- Python 3.7.3, pytest-5.3.2, py-1.8.0, pluggy-0.13.1 -- /home/centos/environments/asapt/bin/python
cachedir: .pytest_cache
rootdir: /home/centos/code/narc
collected 11 items

tests/test_get_cmd.py::test_get_cmd_tcp PASSED [ 9%]
tests/test_get_cmd.py::test_get_cmd_udp PASSED [ 18%]
tests/test_get_cmd.py::test_get_cmd_icmp PASSED [ 27%]
tests/test_get_cmd.py::test_get_cmd_rawip PASSED [ 36%]
tests/test_validation.py::test_validate_id PASSED [ 45%]
tests/test_validation.py::test_validate_in_intf PASSED [ 54%]
tests/test_validation.py::test_validate_should PASSED [ 63%]
tests/test_validation.py::test_validate_ip PASSED [ 72%]
tests/test_validation.py::test_validate_proto PASSED [ 81%]
tests/test_validation.py::test_validate_port PASSED [ 90%]
tests/test_validation.py::test_validate_icmp PASSED [100%]

======================================= 11 passed in 0.09s ========================================
Completed unit tests
Starting dryruns
python runbook.py --dryrun --failonly
head -n 5 outputs/*
==> outputs/result.csv <==
host,id,proto,icmp type,icmp code,src_ip,src_port,dst_ip,dst_port,in_intf,out_intf,action,drop_reason,success

==> outputs/result.json <==
{}
==> outputs/result.txt <==
python runbook.py -d -s
ASAV1@2020-04-07T15:47:28.873590: loading YAML vars
ASAV1@2020-04-07T15:47:28.875094: loading vars succeeded
ASAV1@2020-04-07T15:47:28.875245: starting check DNS OUTBOUND (1/5)
ASAV1@2020-04-07T15:47:28.875291: completed check DNS OUTBOUND (1/5)
ASAV1@2020-04-07T15:47:28.875304: starting check HTTPS OUTBOUND (2/5)
ASAV1@2020-04-07T15:47:28.875333: completed check HTTPS OUTBOUND (2/5)
ASAV1@2020-04-07T15:47:28.875344: starting check SSH INBOUND (3/5)
ASAV1@2020-04-07T15:47:28.875371: completed check SSH INBOUND (3/5)
ASAV1@2020-04-07T15:47:28.875381: starting check PING OUTBOUND (4/5)
ASAV1@2020-04-07T15:47:28.875406: completed check PING OUTBOUND (4/5)
ASAV1@2020-04-07T15:47:28.875415: starting check L2TP OUTBOUND (5/5)
ASAV1@2020-04-07T15:47:28.875457: completed check L2TP OUTBOUND (5/5)
ASAV2@2020-04-07T15:47:28.878727: loading JSON vars
ASAV2@2020-04-07T15:47:28.878880: loading vars succeeded
ASAV2@2020-04-07T15:47:28.879018: starting check DNS OUTBOUND (1/5)
ASAV2@2020-04-07T15:47:28.879060: completed check DNS OUTBOUND (1/5)
ASAV2@2020-04-07T15:47:28.879073: starting check HTTPS OUTBOUND (2/5)
ASAV2@2020-04-07T15:47:28.879100: completed check HTTPS OUTBOUND (2/5)
ASAV2@2020-04-07T15:47:28.879110: starting check SSH INBOUND (3/5)
ASAV2@2020-04-07T15:47:28.879136: completed check SSH INBOUND (3/5)
ASAV2@2020-04-07T15:47:28.879146: starting check PING OUTBOUND (4/5)
ASAV2@2020-04-07T15:47:28.879169: completed check PING OUTBOUND (4/5)
ASAV2@2020-04-07T15:47:28.879179: starting check L2TP OUTBOUND (5/5)
ASAV2@2020-04-07T15:47:28.879202: completed check L2TP OUTBOUND (5/5)
head -n 5 outputs/*
==> outputs/result.csv <==
host,id,proto,icmp type,icmp code,src_ip,src_port,dst_ip,dst_port,in_intf,out_intf,action,drop_reason,success
ASAV1,DNS OUTBOUND,udp,,,192.0.2.2,5000,8.8.8.8,53,UNKNOWN,UNKNOWN,ALLOW,,True
ASAV1,HTTPS OUTBOUND,tcp,,,192.0.2.2,5000,20.0.0.1,443,UNKNOWN,UNKNOWN,ALLOW,,True
ASAV1,SSH INBOUND,tcp,,,fc00:172:31:1::a,5000,fc00:192:0:2::2,22,UNKNOWN,UNKNOWN,DROP,dummy,True
ASAV1,PING OUTBOUND,icmp,8,0,192.0.2.2,,8.8.8.8,,UNKNOWN,UNKNOWN,ALLOW,,True

==> outputs/result.json <==
{
  "ASAV1": {
    "DNS OUTBOUND": {
      "Phase": [
        {

==> outputs/result.txt <==
ASAV1 DNS OUTBOUND -> PASS
ASAV1 HTTPS OUTBOUND -> PASS
ASAV1 SSH INBOUND -> PASS
ASAV1 PING OUTBOUND -> PASS
ASAV1 L2TP OUTBOUND -> PASS
Completed dryruns

OK, so now we have a way to regression test an entire project, but it still requires manual human effort as part of a synchronous process: typing make, waiting for completion, observing results, and taking follow-on actions as needed. If your testing takes more than a few seconds, waiting will get old fast. A better solution would be automatically starting these tests whenever your code changes, then recording the results for review later. Put another way, when I type git push, I want to walk away with certainty that my updates will be tested. This is called "Continuous Integration" or CI, and it is very easy to set up. There are plenty of solutions available: GitLab CI, GitHub Actions (new), CircleCI, Jenkins, and many more. I'm a fan of Travis CI, and that's what I've used for narc. Almost all of these solutions use a YAML file that defines the sequence in which test phases are executed. Below is the .travis.yml file from the project in question. The install phase installs all packages in the requirements.txt file using pip, and subsequent phases run various make targets.

$ cat .travis.yml
---
language: "python"
python:
  - "3.7"

# Install python packages for ansible and linters.
install:
  - "pip install -r requirements.txt"

# Perform pre-checks
before_script:
  - "make lint"
  - "make unit"

# Perform runbook testing with mock ASA inputs.
script:
  - "make dry"
...

Assuming you've set up Travis correctly (outside the scope of this blog), you'll see your results in the web interface, which clearly shows each testing phase and the final results.


And that, my friends, is how you build a professional code project!

Wednesday 29 April 2020

Using Advanced Velocity Templates in DNA Center – Part 1



Variables


At the heart of any template is the concept of a “variable”.  Variables allow parts of configuration to be customized for a specific device, while ensuring other parts are standardized across all devices.   A single configuration template can be applied to many devices.  In Velocity, variables begin with “$”.   If you need to have a variable embedded in a string, you can use ${var} to denote the variable.

To configure a hostname for a network device, the CLI command "hostname adam-router" is used, where "adam-router" is the name of the device. When applying this template to a set of devices, the only thing that changes is the variable (${hname}). By setting the variable hname = "adam", "adam-router" would be rendered.

hostname ${hname}-router

Bound Variables


It is possible to access information about the device (from DNAC's perspective) using a binding. Attributes of the device, such as its model number, can be linked to a variable. For example, in the following template I want to set ${device} to the device product ID (PID) from the inventory.

hostname ${hname}-${device}

When the template is defined in template programmer, click on the variable section.


Selecting the Variables for Template

Then click the variable (device) and bind it to a source (bottom right). Select Source = "Inventory", Entity = "Device", and Attribute = "platformId". This indicates the variable should come from the inventory, specifically data about the device. The attribute is optional, but in this case just the "platformId" (model number) is required. For a stack this will be a comma-separated list.


Binding the Variable

This will render as hostname adam-WS-C3650-48PQ-E when applied to a 3650 switch.

Conditionals


Most programming languages provide if-else constructs. In Velocity this is simple: #if/#end statements denote a condition. There are a number of use cases for if statements.

Optional Variables


Sometimes a variable may be optional, and the configuration line should only be rendered if the variable is set. In the example below, if the $data_vlan variable is empty, the VLAN configuration is skipped.

#if($data_vlan != "")
vlan $data_vlan
name userdata
#end

Related Variables


Based on one variable, you may want to set some others. This reduces the number of variables needed in the template. #set is used to assign a variable. In the example below, if the value of $hostname is "switch01", then the variable $loopback is set to "10.10.100.1".

#if ($hostname == "switch01")
#set ($loopback = "10.10.100.1")
#elseif ($hostname == "switch02")
#set ($loopback = "10.10.100.2")
#end

int lo0
ip address $loopback 255.255.255.255

Enable/Disable Trigger


Another example is using a variable to trigger a feature to be either enabled or disabled. For example, a variable could toggle enabling/disabling NetFlow. To simplify the example, assume the definition of the NetFlow collector is also in the template. The interface name could also be a variable. In this example, "apply" is set to "true" to enable NetFlow on the interface, and anything else will disable NetFlow.

int g1/0/10
#if ($apply == "true")
ip flow monitor myflow input
#else 
no ip flow monitor myflow input
#end 

Regular Expressions


The if statements above showed an exact match of a string. How would a pattern match be done? Fortunately, regular expressions (regexp) are supported. A detailed discussion of regexp is outside the scope of this post, as there are lots of other places to find tutorials on regexp.

For example, a single template could do something specific for 9300 switches and still be applied to non-9300 switches. The model number of the switch (from the inventory) is available via a bound variable, as seen in the section above. 9300 series switches have a model number structured as C9300-NNXXX or C9300L-NNXXX-YY, for example C9300-24UB, C9300-48UXM, and C9300L-24P-4G.

The regular expression is "C9300L?-[2|4][4|8].*". The first part is just a string match on "C9300". The "L?" means the "L" is optional; sometimes it is present, sometimes not. "-" is just a literal match. "[2|4]" means either 2 or 4, and the same with "[4|8]". Finally, ".*" matches any remaining characters. The variable is $model, and $model.matches() will return true if the regular expression matches.

#if ($model.matches("C9300L?-[2|4][4|8].*") )
#set ($var = "9300")
#end

The other way regular expressions can be used is to replace parts of a string.   In this example, I want to extract the number of ports on the switch from the model number.

I am using "replaceAll" vs "match". replaceAll takes a regular expression, as well as an argument specifying what to replace the match with. In this case the replacement "$1" refers to the contents captured by the regular expression. The regular expression is identical to the one above, with one difference: "([2|4][4|8])". The parentheses save the pattern inside so it can later be referenced as "$1". This pattern is the number of ports and will match 24 or 48, so $ports would be set to either 24 or 48.

#set($ports = $model.replaceAll("C9300L?-([2|4][4|8]).*","$1"))
$ports

If $model = "C9300L-24P-4G", then $ports will be set to 24.

Tuesday 28 April 2020

Skyrocket Cisco Contact Center Efficiency with Artificial Intelligence


Enhancing Call Center Efficiency


Customer experience is the priority of business leaders of any size and in any vertical. Today customers want to be independent: they want access to self-serve solutions, they do not like to be sold things, and they love to buy solutions to their problems and needs. Thus, a key to delivering an amazing customer experience is to quickly resolve a customer's problem – ideally on the first contact, and even better through self-service.

Artificial Intelligence implementations in contact centers are the primary way to improve first contact resolution (FCR) and drive customer experience and retention, to the point that some analysts predict that a huge percentage of customer interactions can be resolved by well-designed bots.


The Future of Cisco Contact Centers


While I certainly believe that bots will become more and more powerful in the near future, my pragmatic suggestion, based on my field experience, would be slightly different, and the reason is the incredible business case represented by Cognitive Contact Centers.

Let's check some numbers together. Even assuming the monthly cost of an agent is $1,000, optimizing the efficiency of the contact center by just 10% will deliver a huge benefit for the customer – and even more in the case of larger contact centers.


Artificial Intelligence BOT


But what do we mean by 10% optimization?

Quite often it means an Artificial Intelligence bot is able to successfully handle 10% of the incoming calls/chats, from beginning to end, without engaging an agent, thereby improving the scalability of the contact center or giving agents back time to deal with the most complex cases. This approach requires building a bot able to successfully manage the entire conversation with the customer, and therefore one that is potentially sophisticated, even complex, so the first approach to Artificial Intelligence I advise is a different one.

In the vast majority of cases, the first part of a call to a contact center – let's say at least 10%-20% of the time – is about data collection: name, the reason for calling, service ID, etc. This is highly repetitive, a very structured dialog, and therefore much easier to automate with a simple bot meant only to collect the data.


Once all the necessary inputs are collected, saving 10-20% of the agent's time, the bot can hand the most complex part of the call over to the agent, passing along that data and context so the agent can move forward from there.

Artificial Intelligence Use Case and Solution

Together with Marco, in the below video, we offer an example of the SALES and TECHNICAL journey to skyrocket business efficiency in a Cisco Contact Center Enterprise solution with Google Artificial Intelligence DialogFlow platform.

We cover both sales and technical aspects, connecting the dots, helping account managers to scout opportunities, and engineers to design solutions.

The first half is about the use case and the incredible business value of the solution offered by combining the best of both worlds, while in the second part we go through the technical details of the solution.

Monday 27 April 2020

Trustworthy Networking is Not Just Technological, It’s Cultural

Part 1: The Technology of Trust

Trustworthy Networks Are More Important Than Ever


It wasn’t that long ago that most enterprise resources resided in heavily-protected on-premises data centers surrounded by vigilant layers of security. Trust was placed in securing the computing environment itself—the physically isolated data center that only trained and vetted personnel could physically access and the network that strictly controlled connections from enterprise endpoints.

Now all the physical and virtual security layers built into data centers that we relied on for years have changed. Data "centers" are wherever the data and applications are located—in public clouds, SaaS providers, branch sites, and edge compute and storage. Every employee uses mobile devices—some of which are personally owned—for accessing and manipulating corporate resources. Virtualized workloads move among cloud or IaaS providers to improve responsiveness to a workforce in constant motion. Branch sites quickly need new direct internet connections to work with cloud applications. Global black swan events cause enormous shifts as whole populations rapidly move to working remotely. The attack surface has grown exponentially, literally overnight.

Another evolutionary and significant change in the attack surface results from transformations in the basic architecture of router and switch hardware. Just a decade or so ago, most routers and switches that formed the foundation of corporate and world-wide networks were built with proprietary silicon, hardware, and specialized operating systems. Back then, hackers were content to focus on the population of billions of x86-based and Windows-operating PCs and Linux servers, taking advantage of the same vulnerabilities and exploitive toolsets to invade and infect them. They largely ignored routers and switches because they required specialized tools to attack vendor-specific vulnerabilities—too much work for threat actors with such a target-rich environment of x86 machines to troll.


Hackers Turn to the Network for Access

But the choice of targets changed as hackers increasingly turned their attention to attacking enterprise data sources spread over distributed data centers, cloud platforms, and a mobile workforce. The opportunity for attacks became more prevalent as routers and switches gradually standardized on commodity hardware, Linux, and modules of open source code. Some enterprises and cloud providers are even trying white box and bare metal boxes to shave a few dollars off data center networking CAPEX. However, it’s the shared x86 and ARM code, Linux, and open source code that draws the attention of hackers, providing a potential foothold into corporate data-rich networks. By focusing on gaining access to network devices through common attack routes, malevolent hackers may be able to steal security keys and IP addresses among other sensitive data, opening more pathways into the network and the treasure-troves of corporate data. This attack vector can be particularly insidious when open source code that is integrated into the Network Operating System has Common Vulnerabilities and Exposures (CVEs) that go unpatched, offering hackers convenient documented doorways.

Own Networking Devices, Own the Traffic.

As network components became a more attractive target for hackers and nation-state spies looking for access to corporate resources, the sophistication of attacks increased, not only to gain control over routers but also to hide the evidence of infiltration. One particular advancement is the development of persistent malware that can survive hard resets because the malicious code alters the BIOS or even the lower-level boot loaders to gain control of the root. Without built-in protection at the hardware root layer that prevents altered low-level code from loading and infecting the entire OS, detecting and eliminating these persistent malware infections is next to impossible. An infected networking device—router, switch, access point, firewall—becomes an open gateway to all the network traffic.

With the multitude of dangers constantly testing the gateways of networks, there should be no such concept of “implicit trust”. At the core of the defensive network is the principle of proven trustworthy hardware and software working in conjunction to protect network devices from attack. By building on a foundation of trustworthy networking components and secure software, the connected sources of data and applications distributed through cloud services, branch sites, and a mobile workforce are also protected.

What are the fundamentals of building trusted network systems and what should CIOs and CSOs be asking of their vendors?

What CIOs and CSOs Need to Ask About Trustworthiness


It’s interesting to examine the paradox of enterprise CIOs and CSOs insisting that the hardware and operating systems of data center components are verifiably trustworthy while relying on a network fabric based on commodity components. To secure critical data and application resources, trust must permeate the hardware and software that run networks from the data center to campus to branch and cloud. Needless to say, that covers a very large territory of intertwined components, each of which must reinforce the trustworthiness of the complete network. Some of the major points to research when selecting network vendors and choosing components are:

◉ Is the network hardware authentic and genuine?
◉ What is the origin of all the software installed, including BIOS and open source modules?
◉ Have components or installed software been modified while the unit was in transit from manufacturing to customer?
◉ Are network devices running authentic code from the vendor?
◉ Are there any known CVEs with the software, including any open source modules?
◉ Are software and security patches available and applied?
◉ Is enterprise sensitive information, such as encryption keys, stored on network devices protected?

To ensure trust in a network, security-focused processes and technologies must be built into the hardware and software across the full lifecycle of solutions. That high level of engineering and supply chain control is very difficult to accomplish on low-margin, bare metal hardware. If the choice comes down to savings from slightly less costly hardware versus increase in risk, it’s worthwhile remembering the average cost of a single stolen record from security breaches is $155 (in the U.S.), while the cost of the loss of customer trust and theft of intellectual property is incalculable.

Building a Chain of Trust


A Chain of Trust should be built-in, starting with the design, sourcing of parts, and construction phase of both hardware and root-level software. It continues throughout the Secure Development Lifecycle (SDL), all the way to end of life for secure disposal of routers and switches, which can have sensitive data still stored in memory.

Security Embedded in Hardware

Hardware engineers must have an overriding mindset of security starting from sourcing of parts from reliable and certified manufacturers; designing secure boot strategies using Secure Unique Device Identifiers (SUDI) embedded in Trust Anchor modules (TAm); and Root of Trust tamper-resistant chips for secure generation and storage of cryptographic key pairs.

The SUDI is an X.509v3 certificate with an associated key-pair protected in hardware. The SUDI certificate contains the product identifier and serial number and is rooted to the Cisco Public Key Infrastructure. This identity can be either RSA- or ECDSA-based. The key pair and the SUDI certificate are inserted into the TAm during manufacturing so that the private key cannot be exported. The SUDI provides an immutable identity for the router or switch that is used to verify that the device is a genuine product. The result is hardware that can be verified as authentic and untainted, maximizing the security of the traffic flowing through the network.

Secure Software Development Lifecycle

Software developers must strictly follow Secure Development Lifecycle guidelines for coding the network operating systems with a combination of tools, processes, and awareness training that provides a holistic approach to product resiliency and establishes a culture of security awareness. From a trust perspective, the SDL development process includes:

◉ Product security requirements
◉ Management of third-party software, including open source code
◉ Secure design processes
◉ Secure coding practices and common libraries
◉ Static analysis
◉ Vulnerability testing

Security and Vulnerability Audits provide assurance that as problems are uncovered during the software development and testing cycles, they cannot be ignored. Ideally, the audit team reports not to engineering management but to the office of CSO or CEO to ensure that problems are completely fixed or the release red-lighted until they are remediated. This is an example of a culture of trust that permeates across functional departments all the way to the C-level—all in service of protecting the customer.


Download Verified Software Directly to Customer Controllers


Once trust is established for the development and release of networking software, the next step is to ensure that the software images delivered to customers’ controllers are original and untainted. To make a downloadable image trustworthy, it can be protected with a hash created with SHA-512, encrypted with a private key, and combined with the software image as a digital signature package. That image is downloaded directly to customers’ network controllers. The controllers use a matching public key to decrypt the digitally-signed hash and image package to verify nothing has been altered from the original image. Only software images that pass this critical test can be loaded on devices, preventing any malicious code from booting on a controller, router, or switch. This step ensures that only the vendor’s approved and unaltered code is running on network devices.
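
The snippet below is illustrative only, not Cisco's implementation: it shows the general mechanism described above, verifying a downloaded image against a digital signature created over its SHA-512 hash using the vendor's published RSA public key.

# Illustrative verification of a signed software image (not Cisco's code)
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.serialization import load_pem_public_key


def image_is_authentic(image_bytes, signature, public_key_pem):
    # Load the vendor's published public key (PEM-encoded bytes)
    public_key = load_pem_public_key(public_key_pem)
    try:
        # verify() hashes the image with SHA-512 and checks the RSA signature
        public_key.verify(signature, image_bytes, padding.PKCS1v15(), hashes.SHA512())
        return True
    except InvalidSignature:
        # Any tampering with the image or signature fails verification
        return False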

Reporting on CVEs Is Only One Way to Measure Solution Maturity

As previously discussed, open source and third-party code and APIs are commonly integrated into all modern network operating systems. Likewise, the core OS for network devices is commonly based on open source Linux kernels. All of these externally-developed code modules need to be certified by their creators and tested by neutral third-party organizations as well as the ultimate end-use vendor. As the number of products, software packages, and connected devices in networks continue to rise, it’s inevitable that more security vulnerabilities will also increase. Ironically perhaps, one reason for the increase in Common Vulnerabilities and Exposures (CVEs) is that the industry is getting better at finding and reporting them.

To ensure trustworthy networking, vendors must commit to Coordinated Disclosure, starting with a system for tracking every piece of open source and third-party code incorporated into network products. In addition, every CVE discovered either internally or externally needs to be tracked and fixed by a responsible entity in the trusted value chain of partners and suppliers. Customers, partners, and researchers need to be consistently notified of every recently discovered and fixed CVE and the potential for risk. Transparency improves customer security. That’s why at Cisco we report and provide fixes for CVEs that are found by our own internal testing, our customers, and third-party researchers.

A Third-Party Software Digitization (TPSD) program is another example of a corporate-wide initiative for managing third party software—including royalty bearing commercial software as well as free and open source software. The system automates the tracking and tracing of “Where Used, Where Is, Where Distributed” for compliance, contributions, security, quality, and customer accountability. A TPSD ensures that when a CVE is uncovered in open source or third-party code, the exact usage of the code is immediately known so that fixes can be applied in every instance.

As with the Security and Vulnerability Audit team, a Product Security Incident Response Team (PSIRT) that is independent from engineering is critical to keeping an unbiased watchful eye on all internally and externally developed code. PSIRT members continuously monitor open source and third-party code used in NOSs for reported CVEs. In addition, customers and researchers need a documented method of reporting to PSIRTs any CVEs that they discover. In turn, vendors need a transparent process for reporting on fixes and patches and level of severity back to customers, partners, and security researchers.

An important way to collaborate with industry peers on CVEs is through the Forum of Incident Response and Security Teams (FIRST) organization, which authors the PSIRT Framework. The Framework identifies core responsibilities of PSIRT teams and provides guidance on how to build capabilities to investigate and disclose security vulnerabilities and their remediations to customers in a transparent manner. As you consider the trustworthiness of your networking ecosystem, it's worthwhile considering your vendors' involvement with organizations like FIRST and the Industry Consortium for Advancement of Security on the Internet (ICASI)—which are strong indicators of a fundamental commitment to transparency.

Building in Trust Throughout the Product Lifecycle

All the processes and capabilities I have described in this post are integrated into Cisco’s hardware and software development processes. Cisco embeds security and resilience throughout the lifecycle of our solutions including design, test, manufacturing, distribution, support, and end of life. We design our solutions with built-in trustworthy technologies to enhance security and provide verification of the authenticity and integrity of Cisco hardware and software. We use a secure development lifecycle to make security a primary design consideration—never an afterthought. And we work with our partner ecosystem to implement a comprehensive Value Chain Security program to mitigate supply chain risks such as counterfeit and taint.

Sunday 26 April 2020

Architecting Work-from-Home Solutions that Scale

The need for people to work from home challenges us to come up with new solutions that scale. One of these solutions involved the enablement of static IPs for VPN connections for individual users.

But why would anyone want to do that?

The Problem


It turns out that for years, India has had a strict law prohibiting VoIP (Voice over IP) over VPN. This law only came to light for us when the head honcho, Chuck Robbins, encouraged all Cisco employees to work from home. In the past, our call center employees were not allowed to work from home because they were not allowed to use VoIP over VPN; but desperate times call for desperate measures.

With Cisco at the head, a group of tech companies sat down with India's government and came up with an exception allowing VoIP over VPN, as long as each employee is given the same IP upon connection and a record of each employee's work address is provided to the DoT (Department of Telecommunications). With the exception in place, it was left to us to come up with a solution.

The Solution


Conventionally, IP addresses are assigned through an IP address pool, which gives users the first available IP address from a range.

Take, for example, selecting a site within AnyConnect, e.g. "Headquarters", and clicking "Connect". Within the AnyConnect client is an XML file that directly maps the site, "Headquarters", to the URL "headquarters.cisco.com/default". This URL can be broken down into two parts: the device address, "headquarters.cisco.com", and the path, "default", which maps to the tunnel-group "Default_Tunnel_Group". Within the VPN headend configuration is a line that says the address pool for the tunnel-group "Default_Tunnel_Group" is 10.0.0.1-10.0.0.254, or a "/24". I am then assigned the first unallocated IP address in that range, in this case let's say "10.0.0.101", which becomes my IP address within the Cisco network. However, if I disconnect and then reconnect, I will be assigned a new IP address from the above range.


The size of the IP address pool, the number of users connecting to a site, and the number of VPN headend devices (each with a unique address pool) in a cluster for a site, are all factors which make the likelihood of being assigned the same IP address upon connection extremely remote.

Example configuration of an IP address pool and tunnel group:

ip local pool DEFAULT_EMPLOYEE_POOL 10.0.0.1-10.0.0.254 mask 255.255.255.255 

tunnel-group Default_Tunnel_Group type remote-access 
tunnel-group Default_Tunnel_Group general-attributes 
 address-pool DEFAULT_EMPLOYEE_POOL 
 default-group-policy Default_Group_Policy 
tunnel-group Default_Tunnel_Group webvpn-attributes 
 group-url https://headquarters.cisco.com/default enable 

Our first approach to assigning static IPs was a solution that has come up in forums over the years: create a local user account on the ASA and statically assign an IP to that specific user. However, this would require a static password stored on the ASA, and although encrypted, we knew our friends in InfoSec would have an absolute fit over that one. As a long shot, we attempted to authenticate a local user account with no static password against our AAA servers, but this attempt ultimately failed.

Our second attempt was to look at how we could use ISE (Identity Services Engine) in this scenario. ISE handles all of our authorization requests in the corporate network, whether remote or on-site, and it made sense to use it given we were mapping static IPs to users. With ISE we encountered two problems. First, ISE does not proxy all of the information returned by RADIUS servers back to the VPN headends, so it was not a viable solution in our partner network, where we rely on RADIUS groups to handle ACLs. Second, there were concerns over how to complete this at scale: manually creating over 7,000 policies in ISE would take a serious effort in both people and time, and we would be sailing uncharted waters since ISE had never been tested for this type of scenario.

Our third approach was to use Active Directory in place of ISE for the IP address mapping. However, we once again faced the resourcing issue of creating 7,000 entries manually, as well as the unknown strain we would be putting on the system.

Sometimes the best solution is the simplest. After hours of trying fancy group manipulations with ISE and attempting to get it to pass RADIUS group information, we settled on one of the first ideas that came up while brainstorming, and one we knew should work: a unique tunnel group and a one-IP address pool for each user.


The solution can best be summarized by taking me, username “drew”, as an example of a user who needs a statically assigned IP address. Taking the “/24” from before, with the IP range 10.0.0.1-10.0.0.254, we designate 10.0.0.201 as my statically assigned IP address. We declare an address pool of just this one IP address, which is now a “/32”, and assign it to the tunnel group “drew” with the URL “headquarters.cisco.com/drew”.

Example configuration of a static IP address pool and tunnel group:

ip local pool drew 10.0.0.201 mask 255.255.255.255 

tunnel-group drew type remote-access 
tunnel-group drew general-attributes 
 address-pool drew 
 default-group-policy Default_Group_Policy 
tunnel-group drew webvpn-attributes 
 group-url https://headquarters.cisco.com/drew enable 

After the successful testing and implementation of the above configuration (which used the automation detailed below), questions spread through our team like wildfire (and, to the credit of our customers, they have asked similar questions along these lines). The solution seems hacky, to say the least. What are the security implications? And, very importantly, will it scale? We’re talking about a solution that has to work for thousands of Cisco call center employees in India (a number which has approached 7,000 as of today).

Here are some of the most notable questions:

1. How many tunnel groups (and thus users) can you have on each VPN headend?

Cisco ASA documentation states that the number of tunnel groups that can be configured is equivalent to the maximum number of VPN connections the device supports. In our case we are using ASA 5585-X SSP-60s, which support 10,000 VPN connections and thus can be configured with 10,000 tunnel groups.

2. Does the addition of such a large number of tunnel groups increase overhead on the ASA and thus decrease performance?

The ASA uses a hash map for its tunnel groups (constant-time lookup), so although each additional tunnel group consumes some memory, the per-group overhead is small and pales in comparison to the memory used for an ASA’s normal duties of encrypting and decrypting traffic.

Security


With our nerves slightly calmed about the number of tunnel groups we had just deployed to the VPN headend, we had some homework left to do. Because we’re Cisco, a solution is not complete without security, and DAP (Dynamic Access Policies) on the VPN headends is one of our core lines of defense. By keeping all tunnel groups under the same blanket group policy, we were able to maintain our standard DAP checks, such as verifying AnyConnect client and operating system versions, as well as other, more obscure policies such as AnyConnect session timeouts and FQDN split tunneling.

The last item was ensuring that the static IP tunnel groups we had just created were used exclusively by the employees for whom they were intended, and that employees who were supposed to be using these static IPs were not connecting to our regular corporate VPN headends and getting dynamically assigned IPs. To ensure that only the intended employee could successfully connect to a given tunnel group, we applied a LUA check through DAP to the blanket group policy.


EVAL(cisco.aaa.username, "EQ", cisco.aaa.tunnelgroup) 

Essentially, this checks that the authenticating username is the same as the name of the tunnel group, which is purposely the same as the user’s username. This prevents the user “damien” from connecting to my tunnel group, “drew”, and from using my static IP of 10.0.0.201. To ensure employees connected exclusively to their assigned static IP tunnel groups, we used ISE to block all call center employees from connecting to our corporate VPN headends by denying authorization at those sites to users in an Active Directory (AD) group.

Automation and Management


You can find the code we used to generate the ASA configuration for the thousands of tunnel groups and static IPs we needed in DevNet Code Exchange. It uses simple string interpolation along with a text file of users. In addition to generating the tunnel groups, the functions provided also help you tear these tunnel groups down for easier cleanup.

These functions are not meant to be a full-blown solution, but to provide you with the building blocks to make one.
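
As a minimal sketch in the same spirit (these are not the actual Code Exchange functions; the base URL, group policy name, and teardown syntax below are illustrative):

def read_users(path):
    # One "username ip" pair per line, e.g. "drew 10.0.0.201".
    with open(path) as f:
        return [tuple(line.split()) for line in f if line.strip()]

def generate_config(users, base_url="headquarters.cisco.com"):
    # Build the per-user pool and tunnel-group stanzas shown earlier.
    lines = []
    for username, ip in users:
        lines += [
            f"ip local pool {username} {ip} mask 255.255.255.255",
            f"tunnel-group {username} type remote-access",
            f"tunnel-group {username} general-attributes",
            f" address-pool {username}",
            " default-group-policy Default_Group_Policy",
            f"tunnel-group {username} webvpn-attributes",
            f" group-url https://{base_url}/{username} enable",
        ]
    return "\n".join(lines)

def teardown_config(users):
    # Removal syntax can vary by ASA version; shown for illustration only.
    lines = []
    for username, _ in users:
        lines += [
            f"clear configure tunnel-group {username}",
            f"no ip local pool {username}",
        ]
    return "\n".join(lines)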

The solution we created was not as elegant as we would have liked; however, with automation we can change that. Using these configuration generation functions along with our favorite network configuration tools, such as NSO (Network Services Orchestrator), Ansible, or Paramiko, we can create templates to automate the deployment of this configuration to the VPN headends.
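
As a rough illustration of the Paramiko route (the hostname, credentials, and prompt handling are placeholders, and a production push would need real error checking), the generated configuration could be sent to the headend over SSH like this:

import time
import paramiko

def push_config(host, username, password, config_text):
    # Open an interactive shell, enter config mode, and send each line.
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password, look_for_keys=False)
    shell = client.invoke_shell()
    for line in ["enable", password, "terminal pager 0", "configure terminal",
                 *config_text.splitlines(), "end", "write memory"]:
        shell.send(line + "\n")
        time.sleep(0.2)
    time.sleep(2)
    output = shell.recv(65535).decode(errors="ignore")
    client.close()
    return output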

Taking things a step further, we can build on top of these network configuration tools with an application that manages these tunnel groups, paired with a database of the users and their statically assigned IPs. Then, when you want to add or remove a user, the application does the allocation or deallocation of IPs for you, without having to trawl through thousands of lines of configuration.
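
A hypothetical helper along those lines (the file name and pool are made up for illustration) might track assignments in a small JSON file and hand out the next free address from the /24:

import ipaddress
import json
from pathlib import Path

DB_FILE = Path("static_ip_assignments.json")
POOL = [str(ip) for ip in ipaddress.ip_network("10.0.0.0/24").hosts()]

def _load():
    return json.loads(DB_FILE.read_text()) if DB_FILE.exists() else {}

def add_user(username):
    # Reuse an existing assignment or allocate the next free IP.
    db = _load()
    if username not in db:
        db[username] = next(ip for ip in POOL if ip not in db.values())
        DB_FILE.write_text(json.dumps(db, indent=2))
    return db[username]

def remove_user(username):
    # Deallocate the user's IP so it can be handed out again.
    db = _load()
    ip = db.pop(username, None)
    DB_FILE.write_text(json.dumps(db, indent=2))
    return ip

The usernames and IPs returned here could then feed the configuration generation and teardown functions above.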


In regard to the Walk-Run-Fly journey, we see our solution as being in the “Run” state. We welcome and encourage use and enhancements from the community to achieve “Fly” status.

Closing Thoughts


It has now been over a month since we deployed our static IP solution for call center employees, and for the most part, things have been relatively smooth for such an ad hoc implementation. That is not to say we have not faced issues since then. However, we have continued to work on improvements, such as adding redundancy to our call center VPN headends using an active failover configuration.

With all that being said, I cannot stress enough how much automation saved us in this hacky situation, and how it continues to keep the management of these static IP tunnel groups simple.