Saturday, 16 April 2022

Intersight Workload Optimizer: How to Tame the Public Cloud

In this installment, we’re going to focus on public cloud optimization, which differs slightly from its on-premises counterpart. In an on-premises data center, infrastructure is generally finite in scale and fixed in cost. By the time a new physical server hits the floor, the capital has been spent and has taken a hit on your business’s bottom line. In this context, on-premises optimization means maximizing utilization of the sunk cost of capital infrastructure (while still assuring performance of the workload, of course).

In the public cloud, however, infrastructure is effectively infinite. Resources are generally far more elastic and often paid for out of an operating expenditure budget rather than a capital budget. In this case, cloud optimization means minimizing cloud spend, and the burden of maximizing hardware utilization falls to the cloud provider. Minimizing cloud spend proves to be a daunting exercise for cloud administrators given the public cloud’s vast array of instance sizes and types (over 400 in Amazon Web Services alone, as shown in Figure 1), all with slightly different resource profiles and costs, and with new options and pricing changes arriving almost daily. At scale, selecting the ideal instance type, size, term, etc. for every workload at every moment in order to assure performance and minimize spend is arguably an impossible task for a human, but is an ideal use case for the IWO decision engine.

Figure 1: Amazon Web Services instance types
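To make the selection problem concrete, here is a minimal sketch of picking the cheapest instance type that still satisfies a workload's needs. The catalog, the instance names, the prices, and the `cheapest_fit` helper are all hypothetical and exist only for illustration; this is not how the IWO decision engine is implemented, and a real catalog has hundreds of entries that change frequently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_cost: float  # illustrative price, not real provider pricing


# Hypothetical four-entry catalog standing in for a provider's full list.
CATALOG = [
    InstanceType("small", 2, 4.0, 0.05),
    InstanceType("medium", 4, 8.0, 0.10),
    InstanceType("mem-opt", 2, 16.0, 0.09),
    InstanceType("large", 8, 32.0, 0.30),
]


def cheapest_fit(vcpus_needed: int, memory_needed_gib: float,
                 catalog=CATALOG) -> Optional[InstanceType]:
    """Return the lowest-cost instance type that meets both requirements."""
    candidates = [
        i for i in catalog
        if i.vcpus >= vcpus_needed and i.memory_gib >= memory_needed_gib
    ]
    # None signals that nothing in the catalog is large enough.
    return min(candidates, key=lambda i: i.hourly_cost, default=None)


choice = cheapest_fit(vcpus_needed=2, memory_needed_gib=12.0)
print(choice.name if choice else "no fit")
```

Even in this toy catalog the cheapest fit for a memory-heavy workload is not the next size up but a differently shaped instance, which hints at why the choice gets hard with hundreds of types.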

Taking action in the public cloud


So let’s take a look at the types of real-time actions IWO offers for public cloud optimization. In Figure 2, starting on the Cloud tab of the main Supply Chain screen, we see a number of widgets on the right with actionable information – Pending Actions, Top Accounts, Necessary Investments, Potential Savings, etc.

Figure 2: Supply Chain view of the Public Cloud and Pending Actions widget

Clicking on “Show All” in the Top Accounts widget, we see a list of all our public cloud accounts and subscriptions in a hierarchical table, as shown in Figure 3.

Figure 3: Public cloud account details table

Clicking on one of the green action buttons on the right, we see the current pending actions for a specific account, as shown in Figure 4.  There we see a number of storage volume actions highlighted, some relating to performance needs, others to recoup savings due to over-provisioning (i.e. you can move to a cheaper tier of storage and still assure performance).

Figure 4: Action Center table with details on specific pending storage actions for a given account

In this specific example, a keen-eyed reader might notice something curious about the two performance actions at the top of the list: even though the actions are being taken to provide more IOPS (moving from 160 to 3000 IOPS) to assure performance, the cost impact is actually lower. That’s right: these actions provide more performance for less cost! While maybe not entirely common, this example shows just how quirky the plethora of options in the public cloud is, and how difficult it can be for humans to avoid leaving money on the table. (This example is also non-disruptive and reversible, as noted in the table, with the ability to execute immediately with the click of a button. What’s not to like?)
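One pricing pattern that can produce "more IOPS for less money" is an older tier whose IOPS scale with volume size sitting next to a newer tier with a flat IOPS baseline and a lower per-GiB price. The sketch below assumes exactly that pattern; the tier names, prices, and IOPS figures are made up for illustration and are not taken from the account shown in Figure 4 or from any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    name: str
    price_per_gib_month: float  # illustrative, not real pricing
    iops_per_gib: float         # IOPS that scale with volume size
    baseline_iops: int          # IOPS floor regardless of size

    def iops(self, size_gib: float) -> float:
        return max(self.baseline_iops, self.iops_per_gib * size_gib)

    def monthly_cost(self, size_gib: float) -> float:
        return self.price_per_gib_month * size_gib


# Hypothetical tiers chosen to mirror the 160 -> 3000 IOPS move in the text.
OLD_TIER = StorageTier("size-scaled", 0.10, 2.0, 100)
NEW_TIER = StorageTier("flat-baseline", 0.08, 0.0, 3000)


def better_on_both_counts(current: StorageTier, candidate: StorageTier,
                          size_gib: float) -> bool:
    """True when the candidate tier is both faster and cheaper at this size."""
    return (candidate.iops(size_gib) > current.iops(size_gib)
            and candidate.monthly_cost(size_gib) < current.monthly_cost(size_gib))


size = 80  # GiB
print(OLD_TIER.iops(size), NEW_TIER.iops(size))
print(better_on_both_counts(OLD_TIER, NEW_TIER, size))
```

With these assumed numbers an 80 GiB volume goes from 160 to 3000 IOPS while its monthly cost drops, which is the kind of move a person scanning a price list can easily miss.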

Clicking on the Scale Virtual Machines tab in the Action Center list, we see the current pending actions to rightsize our VMs, as shown in Figure 5.

Figure 5: Action Center table with details on specific pending VM actions for a given account

Clicking on the details button in the first row takes us to the Action Details window providing us clear data behind the decision, as well as the expected outcome of the action from both a performance and a cost perspective, as shown in Figure 6. We can also conveniently run the action with a single button click, right from the dashboard interface.

Figure 6: Action Details for a specific VM scaling action

This detailed information is available for every action IWO recommends, across all workloads in all cloud accounts. Choosing the right action, with even just a handful of workloads, is difficult for a human. Getting it right across many tens, hundreds, or thousands of workloads spread across multiple accounts in multiple clouds in real time is a problem that IWO is uniquely positioned to solve.

Reserved instances: rent or lease?


To further complicate matters for a cloud administrator, you have the option of consuming instances in an on-demand fashion — i.e., pay as you use — or via Reserved Instances (RIs) which you pay for in advance for a fixed term (usually a year or more). RIs can be incredibly attractive as they are typically heavily discounted compared to their on-demand counterparts, but they are not without their pitfalls.

The fundamental challenge of consuming RIs is that you will pay for the RI whether you use it or not. In this respect, RIs become more like the sunk cost of a physical server on-premises than the intermittent cost of an on-demand cloud instance. One can think of on-demand instances as being well-suited for temporary or highly variable workloads, analogous to a car-less city dweller renting a car: usually cost-effective for an occasional weekend trip, but cost-prohibitive for long-term use. RIs are akin to leasing a car: often the right economic choice for longer-term, more predictable usage patterns (say, commuting an hour to work each day).
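The rent-versus-lease trade-off comes down to a break-even point. Here is a small sketch of that arithmetic, assuming a single instance, a fixed term, and an "effective hourly" RI price (total RI cost divided by hours in the term). The function names and all prices are hypothetical.

```python
def break_even_utilization(on_demand_hourly: float,
                           reserved_effective_hourly: float) -> float:
    """Fraction of the term a workload must run before the RI is cheaper."""
    if on_demand_hourly <= 0:
        raise ValueError("on-demand price must be positive")
    return reserved_effective_hourly / on_demand_hourly


def cheaper_option(hours_used: float, hours_in_term: float,
                   on_demand_hourly: float,
                   reserved_effective_hourly: float) -> str:
    on_demand_cost = hours_used * on_demand_hourly
    # The RI is paid for the whole term whether it is used or not.
    reserved_cost = hours_in_term * reserved_effective_hourly
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


HOURS_PER_YEAR = 8760
print(break_even_utilization(0.10, 0.06))
print(cheaper_option(2000, HOURS_PER_YEAR, 0.10, 0.06))
print(cheaper_option(8000, HOURS_PER_YEAR, 0.10, 0.06))
```

With these made-up prices the RI only pays off when the workload runs for more than about 60% of the term: the weekend renter stays on-demand, and the daily commuter reserves.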

When faced with a myriad of instance options and terms, you are generally forced down one of two paths: 1) only purchase RIs for workloads that are deemed static and consume on-demand instances for everything else (hoping, of course, that static workloads really do remain that way); or 2) pick a handful of RI instance types — e.g., small, medium, and large — and shoehorn all workloads, static or variable, into the closest fit. Both methods leave a lot to be desired.

In the first case, it’s not at all uncommon for static workloads to have their demand change over time as app use grows or new functionality comes online. In these cases, the workload will need to be relocated to a new instance type, and the administrator will have an empty hole to fill in the form of the old, already paid-for RI (see examples in Figure 7).

Figure 7: Changes in workload demand can trigger numerous cascading decisions for RI consumption

What should be done with that hole? What’s the best workload to move into it? And if that workload is coming from its own RI, the problem simply cascades downstream. The unpredictability of such headaches often negates the potential cost savings of RIs.

In the second scenario, limiting the RI choices almost by definition means mismatching workloads to instance types, negatively affecting either workload performance or cost savings, or both. In either case, human beings, even with complicated spreadsheets and scripts, will invariably get the answer wrong because the scale of the problem is too large and everything keeps changing, all the time, so the analysis done last week is likely to be invalid this week.

Thankfully, IWO was developed to understand both on-demand instances and RIs in detail through native API target integrations with popular public cloud providers like AWS and Azure. IWO capabilities are constantly receiving real-time data on consumption, pricing, and instance options directly from the cloud providers, and combining such data with the knowledge of applicable customer-specific pricing and enterprise agreements to determine the best actions available at any given point in time.

Figure 8: Detailed inventory information and purchase actions for RIs

Not only does IWO technology understand current and historical workload requirements and an organization’s current RI inventory (see above), but it also has the capability to intelligently recommend the optimal consumption of existing RI inventory and additional RI purchases to minimize future spending. In Figure 9, we have a Pending Action to buy 13 RIs which would take the RI coverage up to the horizontal black line in the chart.  Most of the area under the blue and turquoise curves, representing the workload resource requirements, would be covered by RIs – everything below the black line.  The peaks above the black line would be covered by on-demand purchases. While you could purchase enough RIs to cover all the area under the curve, this is not the most cost-effective option to meet workload demand.

Figure 9: Details supporting a specific RI purchase action
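The reasoning behind the black line can be sketched as a small optimization: RIs are paid for every hour, while on-demand capacity is paid only for the hours in which demand exceeds the RI count. The example below uses a brute-force search over a made-up demand series and made-up prices. It assumes one interchangeable instance type and is only an illustration of the trade-off, not IWO's actual analysis.

```python
def total_cost(ri_count: int, demand: list, ri_hourly: float,
               od_hourly: float) -> float:
    """Cost of covering `demand` with ri_count RIs plus on-demand overflow."""
    reserved = ri_count * ri_hourly * len(demand)   # paid every hour
    overflow = sum(max(0, d - ri_count) for d in demand)
    return reserved + overflow * od_hourly


def best_ri_count(demand: list, ri_hourly: float, od_hourly: float) -> int:
    """Try every RI count from 0 to peak demand and keep the cheapest."""
    if not demand:
        return 0
    return min(range(max(demand) + 1),
               key=lambda n: total_cost(n, demand, ri_hourly, od_hourly))


# Hypothetical hourly demand (instances needed), with one short peak.
demand = [10, 10, 12, 13, 20, 13, 12, 10]
n = best_ri_count(demand, ri_hourly=0.6, od_hourly=1.0)
print(n, total_cost(n, demand, 0.6, 1.0))
print(max(demand), total_cost(max(demand), demand, 0.6, 1.0))
```

Covering the full peak with RIs costs noticeably more here than reserving up to the level that is busy most of the time and buying the peak on demand, which matches the shape of the recommendation in Figure 9.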

Continuing with our car analogy, in addition to knowing whether it’s better to rent or lease a car in any given circumstance, IWO can even suggest a car lease (RI purchase) that can be used as a vehicle for ride-sharing. IWO can fluidly move on-demand workloads in and out of a given RI to achieve the lowest possible cost while still assuring performance.

In short, IWO has the ability to understand the optimal combination of RI purchases and on-demand spending across your entire public cloud estate, in real-time.

Cloud Migration Planning


Finally, because IWO uses the same underlying decision engine for both the on-premises and public cloud environments, it can bridge the gap between them. The process of migrating VM workloads from on-prem to the public cloud can be simulated in IWO’s planning module and will allow the selection of specific VMs or VM groups to generate the optimal purchase actions required to run them, as shown in Figure 10.

Figure 10: On-prem to public cloud workload migration planning results

These plan results offer two options: Lift & Shift and Optimized, depicted in the blue and green columns, respectively. Lift & Shift shows the recommended instances to buy, and their costs, assuming no changes to the size of the existing VMs. Optimized allows for VM right-sizing in the process of moving to the cloud, which often results in a lower overall cost if current VMs are oversized relative to their workload needs. Software licensing (e.g., bring-your-own vs. buy from the cloud) and RI profile customizations are also available to further fine-tune the plan results.
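The difference between the two columns can be illustrated with a short sketch: Lift & Shift sizes each cloud instance to what the VM is allocated today, while Optimized sizes it to what the VM actually uses at peak plus some headroom. The catalog, the VM records, the 20% headroom, and the `plan_costs` helper are all assumptions made for this example.

```python
import math

# Hypothetical catalog: (name, vCPUs, memory GiB, hourly cost). Prices are made up.
CATALOG = [
    ("small", 2, 4, 0.05),
    ("medium", 4, 8, 0.10),
    ("large", 8, 16, 0.20),
    ("xlarge", 16, 32, 0.40),
]


def cheapest_fit(vcpus, mem_gib):
    fits = [c for c in CATALOG if c[1] >= vcpus and c[2] >= mem_gib]
    if not fits:
        raise ValueError("no instance type in the catalog is large enough")
    return min(fits, key=lambda c: c[3])


def plan_costs(vms, headroom=1.2):
    """Return (lift_and_shift, optimized) hourly cost for a list of VMs."""
    lift_and_shift = 0.0
    optimized = 0.0
    for vm in vms:
        # Lift & Shift: match the current allocation as-is.
        lift_and_shift += cheapest_fit(vm["vcpus"], vm["mem_gib"])[3]
        # Optimized: size to observed peak usage plus headroom.
        need_cpu = math.ceil(vm["vcpus"] * vm["peak_cpu_util"] * headroom)
        need_mem = vm["mem_gib"] * vm["peak_mem_util"] * headroom
        optimized += cheapest_fit(need_cpu, need_mem)[3]
    return lift_and_shift, optimized


vms = [
    {"vcpus": 8, "mem_gib": 16, "peak_cpu_util": 0.2, "peak_mem_util": 0.3},
    {"vcpus": 4, "mem_gib": 8, "peak_cpu_util": 0.7, "peak_mem_util": 0.6},
]
print(plan_costs(vms))
```

Only the oversized first VM changes instance type in this example; the second is already well matched, so its cost is the same in both columns.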

Have your cake and eat it too


IWO has the unique ability to apply the same market abstraction and analysis to both on-premises and public cloud workloads, in real-time, enabling it to add value far beyond any cloud-specific or hypervisor-specific, point-in-time tools that may be available. Besides being multi-vendor, multi-cloud, and real-time by design, IWO does not force you to choose between performance assurance and cost/resource optimization.

Source: cisco.com

Thursday, 14 April 2022

Expanding workloads for UCS X-Series with UCS X-Fabric Technology

First, we increased the amount of storage – going from two drives on a B-Series to six drives on an X-Series. Now we are adding GPUs to X-Series that were previously only available in rack servers.

How does it work?

The VIC (Virtual Interface Card) on the server node connects to the UCS X9416 X-Fabric Module. The X-Fabric Module connects to the UCS X440p PCIe node with the GPUs. This elegant, easily upgradable, cable-free solution is only possible on a midplane-free chassis design like UCS X-Series.

Cisco UCS X9416 X-Fabric Module

The first UCS X-Fabric Technology module is PCIe Gen 4 expansion for the UCS X210c M6 Compute Node. The two X9416 X-Fabric Modules expand the PCIe bus from the server to the UCS X440p PCIe Node. No cables, no fuss, no muss.


Cisco UCS X440p PCIe Node


More and more applications can benefit from accelerators like GPUs. Ranging from AI/ML to the stalwart VDI, adding one or more GPUs to a server can greatly improve the user experience and application performance.

The Cisco UCS X440p PCIe Node allows you to add up to four GPUs to a Cisco UCS X210c Compute Node in conjunction with the UCS X9416 X-Fabric Module.


Different workloads require different types of GPUs. Cisco initially supports:

◉ Up to two Nvidia A100 Tensor Core GPUs
◉ Up to two Nvidia A16 GPUs
◉ Up to two Nvidia A40 GPUs
◉ Up to four Nvidia T4 Tensor Core GPUs

The modular design of UCS X-Series allows you to decouple the CPU / GPU refresh cycles. GPU suppliers like Nvidia, AMD, and Intel release their products at a different cadence than CPUs. If your application performance is sensitive to GPU performance, being able to simply slide in a UCS X440p PCIe node with the latest GPUs allows you to extend the investment in all the other solution components (chassis, servers, IFMs, and PSUs), providing a better overall TCO.

Cisco Intersight


Management has always been a UCS superpower. Being able to manage every component for X-Series in a single app is paramount. From inventory to firmware updates, you manage all the UCS X-Fabric Technology components with the same process you manage the server and Intelligent Fabric Modules.

Run any app


Your requirements for modern, accelerated workloads shouldn’t dictate your server form factor. They should seamlessly integrate into a system that also runs all your traditional applications from core infrastructure, to database, to VSI. A single system, managed from the cloud, spanning all your workloads not only simplifies your environment, but will allow you to focus on business needs, not figuring out what unique hardware is needed for a specific application.

Source: cisco.com

Wednesday, 13 April 2022

Hybrid cloud networks are calling – Cisco Nexus Dashboard has the answer

The scale and complexity of modern enterprise infrastructure environments are exploding as workloads become pervasive and infrastructure becomes more hybrid and distributed across data centers, edge, and public cloud resources. Recent events have put the importance of successful network operations at the forefront. Through a barrage of obstacles, today’s resilient and successful organizations are modernizing and automating their network operations to stay ahead of the curve. Their destination? Hybrid Cloud — and by a vast majority.

According to IDC research, 55% of organizations currently have hybrid cloud and use it as a framework to deploy applications where scale-out architecture and high availability networks are needed. Another 29% reported not having a hybrid cloud, but they planned to create one within a year. And by 2025, 70% of organizations will modernize their applications based on drivers like data security, organizational flexibility and agility, and productivity gains versus drivers like IT cost savings.

To solve hybrid cloud application complexity, your IT needs to focus on automating application infrastructure management. This includes the many personas involved in configuration, provisioning, lifecycle operations, and orchestration of cloud and hybrid datacenter environments. But scaling applications into hybrid cloud also increases the cost of managing thousands of distributed devices, containers, and network services. These elements require more knowledge and more time to troubleshoot the many interconnected parts using multiple purpose-built solution tools and methods.

To be successful, it’s critical to have visibility and insights into your network wherever your data is created, with consistent network and policy orchestration across multiple data centers, whether on-premises, in the cloud, or at the edge. At Cisco, we know that any move to hybrid cloud brings an uncompromising need for a centralized approach to managing network capabilities. Cisco’s Nexus Dashboard is our newest cloud networking platform innovation to help with this very problem. With its One View presentation of all your hybrid cloud network sites, your IT operators can use a single agile platform to operate all their network infrastructure in a single place. How many split personas (NetOps, DevOps, SecOps, and CloudOps) do you have? Cisco Nexus Dashboard bridges all the tools needed by each persona with a flexible operational model for all use cases on a single platform.

Figure: Cisco Nexus Dashboard: Centralized hybrid cloud networking platform

We listened. Here are the recent innovations that you asked for:

Bolster Cloud Neutral Support


Recent innovations include expansion to the hybrid cloud with added support for Google Cloud, simplifying network management across multiple public cloud sites. Nexus Dashboard is available in the AWS and Azure marketplaces and will also be featured in the Google Cloud marketplace.

Improved Intelligence and Site Management

Key new features support air-gapped environments and simplify the experience, for example by reducing app downtime caused by resource constraints during installs and upgrades. The dashboard determines in advance whether you have the resources the app and environment need to run smoothly.

With air gap support, customers that are not connected to Cisco Cloud can use insights advisory features to better identify risks to their infrastructure and get decrypted updates on PSIRTs, EOS/EOL, field notices, etc. Syslog support, as well as customization and personalization features, is newly available in the interface.

Decreased Dependence on Physical Hardware

Additional scale improvements are being implemented with the virtual form factor of Nexus Dashboard, where additional physical hardware is not required to run the Nexus Dashboard in your environment. Please refer to the Nexus Dashboard datasheet for more details.

End to End Visibility


External devices such as firewalls and integrations such as vCenter offer broader visibility, correlated telemetry, and deeper insights beyond the core network. For end-to-end visibility, it is imperative to understand exactly where the problem lies, which allows for quick remediation. L4-L7 and cross-domain integrations are a key strategy to gain comprehensive visibility. With the new vCenter integration, the Insights function can incorporate virtualized workload data such as hypervisor name, VM name, and VM health into telemetry. This will enable visibility across silos and into the virtualized environments, and enable faster MTTR for customers by correlating network events and application issues.


The Insights function is also able to enable optimal network and application performance and ensure continuous availability with recent AppDynamics SaaS support.

Considering recent news of companies that have implemented changes resulting in outages, we’d like to emphasize that pre-change validation with upgrade assist are key capabilities of the Insights function. It’s a critical capability that evaluates configuration changes before they are deployed to allow IT Ops to make changes with confidence. This removes unintended consequences that take down applications and/or the network. I encourage you to check out these Nexus Dashboard capabilities.

Which performance zone is right for you?


Customers have different reasons to operate in the various performance zones (be it OKRs, metrics, speed of the business, or foundational capabilities) in terms of people, process, and technology alignment.


With the Nexus Dashboard, we are helping customers figure out where they are in the network infrastructure automation journey. We then help them move their performance zone from reactive to proactive, then to optimizing, and finally to the visionary self-healing, self-driving, and self-diagnostic network.

Source: cisco.com

Tuesday, 12 April 2022

Announcing Risk-Based Endpoint Security with Cisco Secure Endpoint and Kenna Security

With a tidal wave of vulnerabilities out there and brand-new vulnerabilities coming out daily, security teams have a lot to handle. Addressing every single vulnerability is nearly impossible and prioritizing them is no easy task either since it’s difficult to effectively focus on the small number of vulnerabilities that matter most to your organization. Moreover, the shift to hybrid work makes it harder to assess and prioritize your vulnerabilities across your endpoints with traditional vulnerability scanners.

Kenna Security maps out the vulnerabilities in your environment and prioritizes the order in which you should address them based on a risk score. We’re excited to announce that after Cisco acquired Kenna Security last year, we have recently launched an integration between Kenna and Cisco Secure Endpoint to add valuable vulnerability context into the endpoint.

With this initial integration, Secure Endpoint customers can now perform risk-based endpoint security. It enables customers to prioritize endpoint protection and enhances threat investigation to accelerate incident response with three main use cases:

1. Scannerless vulnerability visibility: In a hybrid work environment, it’s increasingly difficult for traditional vulnerability scanners to account for all devices being used. Instead of relying on IP address scanning to identify vulnerabilities in an environment, you can now use the existing Secure Endpoint agent to get a complete picture of the vulnerabilities you need to triage.

2. Risk-based vulnerability context: During incident response, customers now have an additional data point in the form of a Kenna risk score. For example, if a compromised endpoint has a risk score of 95+, there is a high likelihood that the attack vector relates to a vulnerability that Kenna has identified. This can dramatically speed up incident response by helping the responder focus on the right data.

3. Accurate, actionable risk scores: Organizations often struggle to prioritize the right vulnerabilities since most risk scores such as Common Vulnerability Scoring System (CVSS) are static and lack important context. In contrast, the Kenna Risk Score is dynamic with rich context since it uses advanced data science techniques such as predictive modeling and machine learning to consider real-world threats. This enables you to understand the actual level of risk in your environment and allows you to effectively prioritize and remediate the most important vulnerabilities first.

How does the Kenna integration work?

The Kenna integration brings Kenna Risk Scores directly into your Secure Endpoint console. As an example of this integration, the computer in the screenshot below (Figure 1) has been assigned a Kenna Risk Score of 100.

Figure 1: Kenna Risk Score in the Secure Endpoint console

Risk scores can be anywhere from 0 (lowest risk) to 100 (highest risk). The score is inferred based on the reported OS version, build, and revision update information, combined with threat intelligence on vulnerabilities from Kenna.

Clicking on the actual numeric score itself brings you to a page with a detailed listing of all vulnerabilities present on the endpoint (see Figure 2 below).

Figure 2: List of all vulnerabilities on an endpoint

Each vulnerability has a risk score, an identifier, and a description that includes icons with additional details based on vulnerability intelligence from Kenna:

◉ Active Internet Breach: This vulnerability is being exploited across active breaches on the Internet
◉ Easily Exploitable: This vulnerability is easy to exploit with proof-of-concept code being potentially available
◉ Malware Exploitable: There is known malware exploiting this vulnerability


All of this information is extremely valuable context during an incident investigation. Exploiting vulnerabilities is one of the most common ways malicious actors carry out attacks, so by quickly understanding which vulnerabilities are present in the environment, incident responders have a much easier time homing in on how an attacker got into their organization.
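As a rough illustration of how a responder might use these fields, the sketch below orders an endpoint's vulnerabilities by risk score and then by how many of the three indicators are set. The `Vulnerability` record, its field names, and the tie-breaking rule are assumptions for this example only. They are not the Secure Endpoint or Kenna data model, and the identifiers are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    identifier: str
    risk_score: int  # 0 (lowest risk) to 100 (highest risk)
    active_internet_breach: bool = False
    easily_exploitable: bool = False
    malware_exploitable: bool = False
    fix_available: bool = False


def triage_order(vulns):
    """Highest score first; ties broken by number of indicators, then by ID."""
    def key(v):
        indicators = (v.active_internet_breach
                      + v.easily_exploitable
                      + v.malware_exploitable)
        return (-v.risk_score, -indicators, v.identifier)
    return sorted(vulns, key=key)


vulns = [
    Vulnerability("VULN-C", 70, easily_exploitable=True),
    Vulnerability("VULN-A", 95, active_internet_breach=True,
                  malware_exploitable=True, fix_available=True),
    Vulnerability("VULN-B", 95),
    Vulnerability("VULN-D", 70),
]
for v in triage_order(vulns):
    print(v.identifier, v.risk_score, "fix available" if v.fix_available else "")
```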

Additionally, for vulnerabilities that currently have fixes available, clicking on the green “Fix Available” button on each vulnerability displays a box with links to the applicable patches, knowledge base articles, and other relevant information (see Figure 3 below). This gives analysts the information they need to efficiently act on an endpoint.

Figure 3: Recommended fixes for each vulnerability

Who can access the Kenna integration?


Vulnerability information and Risk Scores from Kenna Security are now available in the Cisco Secure Endpoint console for:

◉ Windows 10 computers running Secure Endpoint Windows Connector version 7.5.3 and newer
◉ Customers with a Secure Endpoint Advantage or Premier tier license, including Secure Endpoint Pro

Most vulnerabilities in our customer base occur on Windows 10 workstations, so we decided to release first with Windows 10 to deliver this integration faster. We plan on adding support for other Windows versions and operating systems such as Windows 11, Windows Server 2016, 2019, and 2022 in the near future.

We hope that you find this integration useful! This is the first of many steps that we are taking to incorporate vulnerability information from Kenna Security into Secure Endpoint, and we are excited to see what other use cases we can enable for our customers.

The Cisco Secure Choice Enterprise Agreement is a great way to adopt and experience the complete Secure Endpoint and Kenna technology stack. It provides instant cost savings and the freedom to grow, and you pay only for what you need.

Source: cisco.com

Sunday, 10 April 2022

Supercharging indoor IoT management – Cisco DNA Spaces IoT Services Policy Engine

IoT Management at scale

Cisco DNA Spaces IoT Services provides tools to manage a myriad of IoT devices easily. However, the management of these IoT devices was still a manual operation. Each IoT device had to be individually onboarded and configured. If there was an error, it needed to be manually reconfigured. This becomes cumbersome as the number of managed devices increases. Furthermore, manual maintenance for a large number of managed devices is equally taxing. How do we know when a device is about to run out of battery? How to ensure that customer experience is not impacted if someone moves a beacon from one zone to another? How to roll out firmware upgrades without impacting operation? Even with IoT Management, these problems remained intractable at scale.

IoT Services Policy Engine

Enter Cisco DNA Spaces IoT Services Policy Engine. IoT Service policies are use-case-based and address unique problems that the scale and complexity of a large IoT deployment entail. Devices no longer need to be individually onboarded to deploy a use case. Customized policies can be created beforehand and associated with a class of devices at a specific location. Whenever a new device is turned on, it inherits the policy associated with that location and gets auto-configured. IoT Services even provides policy templates to support single-click use case deployment.

Groups

Policies are configured to act on device groups. Classes of devices can be logically organized into groups. Groups can be created manually or based on some logical criteria such as the beacon location, manufacturer, or MAC address prefix. Let’s say a customer wants to enable Asset Tracking on all the beacons in a certain zone of a building. In that case, the customer first creates a dynamic group targeting the zone. Whenever DNA Spaces locates a beacon in that zone, it automatically assigns it to the group. Group assignment for a beacon gets propagated through firehose notifications as well.

Fig #1: Dynamic Grouping
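Dynamic grouping can be pictured as a set of match rules evaluated whenever a beacon is located. The sketch below is a simplified model of that idea. The `Beacon` and `DynamicGroup` classes, their fields, and the sample values are hypothetical and do not reflect the DNA Spaces API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Beacon:
    mac: str
    manufacturer: str
    zone: str


@dataclass(frozen=True)
class DynamicGroup:
    name: str
    zone: Optional[str] = None          # None means "any zone"
    manufacturer: Optional[str] = None  # None means "any manufacturer"
    mac_prefix: Optional[str] = None    # None means "any MAC address"

    def matches(self, beacon: Beacon) -> bool:
        return (
            (self.zone is None or beacon.zone == self.zone)
            and (self.manufacturer is None
                 or beacon.manufacturer == self.manufacturer)
            and (self.mac_prefix is None
                 or beacon.mac.lower().startswith(self.mac_prefix.lower()))
        )


def assign_groups(beacon: Beacon, groups) -> list:
    """Return the names of every group whose criteria the beacon satisfies."""
    return [g.name for g in groups if g.matches(beacon)]


groups = [
    DynamicGroup("lobby-assets", zone="Lobby"),
    DynamicGroup("vendor-x", manufacturer="VendorX", mac_prefix="AA:BB:CC"),
    DynamicGroup("all-beacons"),
]
print(assign_groups(Beacon("aa:bb:cc:00:11:22", "VendorX", "Lobby"), groups))
print(assign_groups(Beacon("11:22:33:44:55:66", "VendorY", "Dock"), groups))
```

Re-running `assign_groups` each time a beacon's location changes models the behavior described above: a beacon carried into the lobby joins the zone group without anyone onboarding it by hand.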

Policies


Policies help in rolling out use cases across device groups. Each policy solves a specific customer use case and comes with a suggested policy template which helps in rolling out a policy across a group easily. Customers can thus deploy a policy once and then DNA Spaces IoT Services ensures that the use case is always enforced across all the targeted beacons. This completely eliminates the need for manual onboarding or maintaining IoT devices.

Fig #2: Policy Configuration

Once a policy is deployed, IoT Services also displays the number and list of devices on which the policy got applied.

Fig #3: Policy Device count

Alerts


When a policy is applied or fails to apply, an alert is generated. Alerts may be system alerts that can be viewed in the DNA Spaces dashboard or notification alerts like emails. Notification alerts are batched and delivered every 15 minutes.

Fig #4: Policy Alert
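Batching can be thought of as bucketing alerts into fixed 15-minute windows and sending one notification per window. The following sketch shows one simple way to do that bucketing. The function names and sample alerts are illustrative assumptions, and the actual delivery mechanism in DNA Spaces is not described in this post.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BATCH_WINDOW = timedelta(minutes=15)


def window_start(ts: datetime, window: timedelta = BATCH_WINDOW) -> datetime:
    """Round a timestamp down to the start of its batching window."""
    seconds_into_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    offset = seconds_into_day % int(window.total_seconds())
    return ts - timedelta(seconds=offset, microseconds=ts.microsecond)


def batch_alerts(alerts, window: timedelta = BATCH_WINDOW) -> dict:
    """Group (timestamp, message) pairs by window, in chronological order."""
    batches = defaultdict(list)
    for ts, message in sorted(alerts):
        batches[window_start(ts, window)].append(message)
    return dict(batches)


day = datetime(2022, 4, 10)
alerts = [
    (day.replace(hour=10, minute=16), "battery low"),
    (day.replace(hour=10, minute=2), "policy applied"),
    (day.replace(hour=10, minute=14, second=59), "policy failed"),
]
for start, messages in batch_alerts(alerts).items():
    print(start.time(), messages)
```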

Alerts are especially important for monitoring and security-based policies such as battery monitoring or beacon spoofing.

A New Era


Cisco DNA Spaces IoT Services Policy ushers in a new era of hands-free enterprise IoT Management. It brings together unmatched processing and machine intelligence to deliver a seamless management experience hitherto unseen in enterprise IoT. With new policies being added over time, it is destined to become a bedrock for IoT Management.

Source: cisco.com

Saturday, 9 April 2022

Addressing the noisy neighbor syndrome in modern SANs

The noisy neighbor syndrome on cloud computing infrastructures

The noisy neighbor syndrome (NNS) represents a problematic situation often found in multi-tenant infrastructures. IT professionals associate this figurative expression with cloud computing. It manifests when a co-tenant virtual machine monopolizes resources such as network bandwidth, disk I/O, CPU, or memory. Ultimately, it will negatively affect the performance of other VMs and applications. Without proper safeguards, appropriate and predictable application performance is difficult to achieve, resulting in end-user dissatisfaction.

The noisy neighbor syndrome originates from the unfair sharing of common resources. In a world of finite resources, if someone takes more than their fair share, others will only get leftovers. To some extent, it is acceptable that some VMs utilize more resources than others. However, this should not come with a reduction in performance for the less demanding VMs. This is arguably one of the main reasons why many organizations prefer to avoid virtualizing their business-critical applications. This way they try to reduce the risk of exposing business-critical systems to noisy neighbor conditions.

To tackle the noisy neighbor syndrome on hosts, different solutions have been considered. One possibility comes from reserving resources for applications. The downside is a reduction in the average infrastructure utilization. Moreover, it will increase cost and impose artificial limits on the vertical scaling of some workloads. Another possibility comes from rebalancing and optimizing workloads on hosts in a cluster. Tools exist to resize or reallocate VMs to hosts for better performance. All this happens at the expense of an additional level of complexity.

In other cases, greedy workloads might be best served on a bare metal server rather than virtualized. Using bare metal instead of virtualized applications can address the noisy neighbor challenge at the host level. This is because bare metal servers are single tenant, with dedicated CPU and RAM resources. However, the network and the centralized storage system remain shared resources and so multi-tenant. Infrastructure over-commitment due to greedy workloads remains a possibility and that would limit overall performance.

The noisy neighbor syndrome on storage area networks

Generalizing the concept, the noisy neighbor syndrome can also be associated with storage area networks (SANs). In this case, it is more typically described in terms of congestion. Four well-categorized situations cause congestion at the network level: poor link quality, lost or insufficient buffer credits, slow drain devices, and link overutilization.

The noisy neighbor syndrome does not manifest in the presence of poor link quality or lost and insufficient buffer credits, nor with slow drain devices. That’s because those conditions essentially involve underperforming links or devices. The noisy neighbor syndrome is instead primarily associated with link overutilization. At the same time, the noisy neighbor terminology refers to a server, not a disk. That’s because communication, whether reads or writes, originates from initiators, not targets.


The SAN is a multi-tenant environment, hosting multiple applications and providing connectivity and data access to multiple servers. The noisy neighbor effect occurs when a rogue server or virtual machine uses a disproportionate quantity of the available network resources, such as bandwidth. This leaves insufficient resources for other end points on the same shared infrastructure, causing network performance issues.

The treatment for the noisy neighbor syndrome may happen at one or multiple levels, such as the host, network, and storage levels, depending on the specific circumstances. A common challenge arises when a backup application monopolizes bandwidth on ISLs for a long period of time. This may degrade the performance of other systems in the environment, because other applications are forced to reduce throughput or increase their wait time. This challenge is best solved at the network level. Another example is a virtualized application monopolizing the shared host connection. In this case, the solution might involve remediation at both the host and network levels. Intuitively, this phenomenon becomes more pervasive as the number of hosts and applications in data center environments increases.

Strategies to solve the noisy neighbor syndrome


The solution to the noxious noisy neighbor syndrome is not found by statically assigning resources to all applications in a democratic way. In fact, not all applications need the same quantity of resources or have the same priority. Dividing available resources into equal parts and assigning them to applications would not do justice to the heaviest and often mission-critical ones. Also, the need for resources might change over time and be hard to predict with any degree of accuracy.

The true solution for silencing noisy neighbors comes from ensuring that any application in a shared infrastructure receives the necessary resources when needed. This is possible by properly designing and sizing the data center infrastructure. It should be able to sustain the aggregate load at any time and include ways to dynamically allocate resources based on need. In other words, instead of provisioning your data center for the average load, you should design it to handle the peak load, or close to it.

At the storage network level, the best way to solve the noisy neighbor challenge is through proper design and by adding bandwidth, as well as frame buffers, to your SAN. At the same time, make sure storage devices can handle input/output operations per second (IOPS) above and beyond the typical demand. Multiport all-flash storage arrays can reach IOPS levels in the range of millions. Their adoption has virtually eliminated storage I/O contention on the controllers and media, shifting the focus onto storage networks.

Overprovisioning of resources is an expensive strategy and not often a possibility. Some companies prefer to avoid this and postpone investments. They strive to find a balance between the cost of infrastructure and an acceptable level of performance. When shared resources are insufficient to satisfy all needs simultaneously, a possible line of defense comes from prioritization. This way, mission-critical applications will be served appropriately, while accepting that less important ones may get impacted.
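As an illustration of the prioritization idea described above, the sketch below grants bandwidth to applications in strict priority order until the shared capacity runs out. The function name, the application names, the numbers, and the convention that a lower number means higher priority are all assumptions made for this example, not part of any Cisco product.

```python
def allocate_by_priority(capacity, demands):
    """Grant bandwidth in priority order until the shared capacity is used up.

    demands: list of (name, priority, demand) tuples. In this sketch a lower
    priority number means a more important application (an assumption).
    """
    remaining = capacity
    grants = {}
    for name, _priority, demand in sorted(demands, key=lambda d: d[1]):
        granted = min(demand, remaining)  # never hand out more than is left
        grants[name] = granted
        remaining -= granted
    return grants


# Hypothetical demands in MB/s against a 1000 MB/s shared link.
demands = [
    ("backup", 3, 600),
    ("oltp-db", 1, 500),
    ("analytics", 2, 400),
]
grants = allocate_by_priority(1000, demands)
print(grants)
```

With these made-up numbers, the mission-critical database and the analytics job receive their full demand, while the backup job absorbs the shortfall.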

Features like network and storage quality of service (QoS) can control IOPS and throughput for applications, limiting the noisy neighbor effect. By setting IOPS limits, port rate limits, and network priority, we can control the quantity of resources each application receives. As a result, no single server or application instance monopolizes resources and hinders the performance of others. The drawback of the QoS approach is the added administrative burden. It takes time to determine the priority of individual applications and to configure the network and storage devices accordingly. This explains the low adoption of this methodology.
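To make the idea of an IOPS limit more concrete, here is a minimal token bucket sketch, a common general-purpose rate-limiting technique. It is not how any particular switch or storage array implements QoS; the class name, the parameters, and the explicit `now` timestamp (used to keep the example deterministic) are assumptions of this illustration.

```python
class TokenBucket:
    """Caps an application's I/O rate at `rate` operations per second,
    while allowing short bursts of up to `burst` operations."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous request, in seconds

    def allow(self, now, ops=1):
        # Refill tokens for the elapsed time, never beyond the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= ops:
            self.tokens -= ops
            return True
        return False


bucket = TokenBucket(rate=100, burst=10)
# 50 requests arrive at the same instant: only the burst allowance is admitted.
admitted_first = sum(bucket.allow(now=0.0) for _ in range(50))
# 50 ms later the bucket has refilled by 100 ops/s * 0.05 s = 5 tokens.
admitted_later = sum(bucket.allow(now=0.05) for _ in range(50))
print(admitted_first, admitted_later)
```

A greedy application is admitted only as fast as the bucket refills, which leaves headroom for its neighbors. The administrative cost mentioned above shows up here as the need to pick a `rate` and `burst` for every application.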

Another consideration is that the traffic profile of applications changes over time. Fast detection and identification of SAN congestion might not be sufficient. The traditional methods for fixing SAN congestion are manual and unable to react quickly to changing traffic conditions. Ideally, prefer a dynamic solution that adjusts the allocation of resources to applications.

Cisco MDS 9000 to the rescue


The Cisco MDS 9000 Series of switches provides a set of nifty capabilities and high-fidelity metrics that can help address the noisy neighbor syndrome at the storage network layer. First and foremost, the availability of 64G FC technology, coupled with a generous allocation of port buffers, proves helpful in eliminating bandwidth bottlenecks, even over long distances. In addition, a proper design can alleviate network contention. This includes using a low oversubscription ratio and making sure the aggregate ISL bandwidth matches or exceeds the overall storage bandwidth.
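The two design checks mentioned above come down to simple arithmetic. The sketch below computes an oversubscription ratio and verifies that the aggregate ISL bandwidth covers the storage bandwidth. The port counts and speeds are hypothetical, nominal link speeds stand in for actual throughput, and the function names are made up for this example.

```python
def oversubscription_ratio(host_port_gbps, isl_gbps):
    """Aggregate host-facing bandwidth divided by aggregate ISL bandwidth."""
    return sum(host_port_gbps) / sum(isl_gbps)


def isl_covers_storage(isl_gbps, storage_port_gbps):
    """True when aggregate ISL bandwidth matches or exceeds storage bandwidth."""
    return sum(isl_gbps) >= sum(storage_port_gbps)


# Hypothetical fabric: nominal speeds in Gbps.
hosts = [32] * 24    # 24 host ports at 32G FC  -> 768G
isls = [64] * 4      # 4 ISLs at 64G FC         -> 256G
storage = [32] * 8   # 8 storage ports at 32G   -> 256G

ratio = oversubscription_ratio(hosts, isls)
covered = isl_covers_storage(isls, storage)
print(f"host-to-ISL oversubscription {ratio:.1f}:1, ISLs cover storage: {covered}")
```

In this made-up fabric the host edge is oversubscribed 3:1 against the ISLs, and the ISLs exactly match the storage bandwidth. What counts as a "low" ratio depends on the workload.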

Several monitoring options, including the Cisco Port-Monitor (PMON) feature, provide a policy-based configuration to detect, notify, and take automatic port-guard actions to prevent any form of congestion. Application prioritization can be achieved by configuring QoS at the zone level. Port rate limits can impose an upper bound on voracious workloads. Automatic buffer credit recovery mechanisms, link diagnostic features, and preventive link quality assessment using advanced Forward Error Correction techniques can help address congestion caused by poor link quality or lost and insufficient buffer credits. The list of remedies also includes Fabric Performance Impact Notifications (FPIN) and Congestion Signals, once host drivers and HBAs support that standards-based feature. But there is more.

Cisco MDS Dynamic Ingress Rate Limiting (DIRL) software prevents congestion at the storage network level with an exclusive approach, based on an innovative buffer-to-buffer credit pacing mechanism. Not only does Cisco MDS DIRL software immediately detect situations of slow drain and overutilization in any network topology, but it also takes proper action to remediate them. The goal is to reduce or eliminate congestion by providing the end device with the amount of data it can accept, and no more. The result is a dynamic allocation of bandwidth to all applications, which eventually eliminates congestion from the SAN. What is especially interesting about DIRL is that it is network-centric and does not require any compatibility with end hosts.
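The general shape of such a dynamic rate limiter can be pictured as a feedback loop: cut the ingress limit while congestion is observed, then recover gradually once it clears. The sketch below is purely illustrative and is not Cisco's DIRL implementation. The function name, the 50% reduction, the 25% recovery step, the 1% floor, and the sequence of congestion observations are all assumptions chosen for the example.

```python
def adjust_ingress_limit(limit, congested, line_rate,
                         reduce_factor=0.5, recover_step=0.25):
    """One control step of a hypothetical dynamic ingress rate limiter.

    On congestion the limit is cut (but never below 1% of line rate);
    otherwise it recovers gradually until it reaches line rate again.
    """
    if congested:
        return max(limit * reduce_factor, 0.01 * line_rate)
    return min(limit * (1 + recover_step), line_rate)


line_rate = 32.0   # Gbps, nominal speed of the rogue host's port (assumed)
limit = line_rate
history = []
# Assumed observations: two congested intervals, then the congestion clears.
for congested in [True, True] + [False] * 7:
    limit = adjust_ingress_limit(limit, congested, line_rate)
    history.append(limit)
print(history)
```

In this toy run the limit drops from 32 to 8 Gbps during the congested intervals and then climbs back to line rate over several quiet intervals, which mirrors the self-tuning behavior described above without claiming to reproduce the actual mechanism.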

The diagram below shows a noisy neighbor host becoming active and monopolizing network resources, causing throughput degradation for two innocent hosts. Let’s now enable DIRL on the Cisco MDS switches. When the same scenario is repeated, DIRL prevents the rogue host from monopolizing network resources and gradually adjusts it to a performance level at which the innocent hosts see no impact. With DIRL, the storage network self-tunes and reaches a state where all the neighbors happily coexist.


The trouble-free operation of the network can be verified by using the Nexus Dashboard Fabric Controller, the graphical management tool for Cisco SANs. Its slow drain analysis menu reports congestion at the port level and assists administrators with an easy-to-interpret, color-coded display. Similarly, the deep traffic visibility offered by the SAN Insights feature exposes metrics at the FC flow level in real time. This further validates optimal network performance or helps evaluate possible design improvements.

Final note


In conclusion, the Cisco MDS 9000 Series provides all the capabilities necessary to counter and eliminate the noisy neighbor syndrome at the storage network level. By combining proper network design with high-speed links, congestion avoidance techniques such as DIRL, slow drain analysis, and SAN Insights, IT administrators can deliver an optimal data access solution on a shared network infrastructure. And don’t worry if your network and storage utilization does not come close to 100%. In a way, that is your safeguard against the noisy neighbor syndrome.

Source: cisco.com

Thursday, 7 April 2022

Three Reasons to Prepare for Your Next Broadband Infrastructure Investment


Two years after the COVID-19 pandemic proved the internet invaluable with so many of us working, shopping, educating our children, and accessing health care – all from home – we’re still faced with a digital divide between those who have access to broadband internet and those who don’t. Efforts by service providers to upgrade their network infrastructure to handle increased load have been both rapid and impressive, but more is needed. There remains a significant percentage of the population lacking sufficient broadband to fully participate in the digital economy and society. This must change, but how?

There are three areas we need to focus on if we hope to expand much-needed internet access to those who lack it: bridging the digital divide, locating and securing available funds, and improving expertise and planning. But first, let’s examine the numbers on the ever-increasing value of the internet and on those who lack full access to its benefits.

In March 2022, Cisco released its Global Broadband Index Report surveying more than 60,000 workers across 30 different markets about their home broadband access, quality, and usage. Below are a few stats that caught my eye:

• 84% use the internet at home for four or more hours each day

• 78% agree that everyone should be able to securely connect to fast and reliable internet regardless of location

• 65% believe access to affordable and reliable broadband will become a major issue in the future

• 58% state that they were unable to access critical services during lockdown due to unreliable internet

In the United States, about 20 million people lack access to high-speed broadband services, and some 17 million school children don’t have internet access at home. Ensuring broadband access and affordability is critical to closing the digital divide. The problem is significantly greater in rural areas, where about 19.3% of the total U.S. population resides. In rural areas, the cost to build and deliver broadband internet services is much higher due to lower population density, harsher environments, and other factors.

Bridging the digital divide is a great idea, but who’s going to pay for it?

The good news is that the U.S. federal government is providing another $65 billion in grant dollars on top of the $38 billion in pre-pandemic grants for broadband internet build-outs. Along with wireless expansion, the government’s funding focus has also shifted to fiber, and this new money, provided by the Infrastructure Investment and Jobs Act (IIJA), is part of a five-year program. This funding makes it easier to scale your network infrastructure: with the government helping to fund the last mile, service providers can upgrade their middle mile as well to support additional users and increased bandwidth. Using federal grants helps you build up a network backbone that might otherwise have been too costly.

The additional $65 billion seeks to address the digital divide and specifically focuses on groups of people that are “underserved” and “unserved” as defined in the law. By underserved we’re talking about those who are served by lower speed broadband that doesn’t exceed a certain threshold, for example 100 Mbps download by 20 Mbps upload. Unserved refers to those having internet speeds below 25 Mbps download by 3 Mbps upload.
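The two thresholds above can be expressed as a small classification function. This is only a sketch of the definitions as summarized in this post. Treating a location as below a threshold when either the download or the upload speed falls short is an assumption of the example, and the function name and the "served" label are made up.

```python
def classify_service(download_mbps, upload_mbps):
    """Classify a location by its best available broadband speeds.

    Thresholds follow the figures quoted above: 25/3 Mbps for "unserved"
    and 100/20 Mbps for "underserved". Falling short on either direction
    counts as being below the threshold (an assumption of this sketch).
    """
    if download_mbps < 25 or upload_mbps < 3:
        return "unserved"
    if download_mbps < 100 or upload_mbps < 20:
        return "underserved"
    return "served"


for speeds in [(10, 1), (50, 10), (300, 30)]:
    print(speeds, classify_service(*speeds))
```

A model like this can serve as a first pass over coverage data when identifying which locations a grant application might target.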

Below are some of the U.S. federal programs that are in the middle of funding broadband deployments, waiting on program rules, or still waiting for funding to be appropriated.


The most significant grant program for both public and private entities is the Broadband Equity, Access, and Deployment (BEAD) program, with $42 billion set aside for last-mile broadband deployment. This is where both public and private entities can win grant money to deploy broadband to the unserved and underserved. This also means there’s a need for new affiliations like Public-Private Partnerships (PPPs), which are contracts between a private party and a government agency to offer a public asset or service, such as municipality-provided broadband delivered through a partnership with an internet service provider. PPPs make obtaining rights-of-way much easier because you’re directly partnering with cities and counties.

PPPs provide many benefits to public entities, such as Wi-Fi access and improved broadband for schools, and they help scale the economy because you’re adding subscribers who will consume content, shop online, and seek out other internet-based services. Public entities need ISP partners in order to deliver these benefits.

Knowledge and expertise are key to success


Yet funding alone is not enough to close the digital divide. You need to determine the right combination of solutions for a particular use case, region, and implementation to get the results you expect. This may require extensive expertise, and answering all the questions ahead of time has proved difficult, until now.

Cisco is delivering a new generation of network infrastructure technologies and innovations that provide more capacity and greater flexibility at a lower cost per subscriber, helping to improve the economics of the internet. Here are a few examples:

• Capacity at lower cost with Cisco Silicon One and Routed Optical Networking
• Lower OpEx with simplified networks and automation
• Improved sustainability and flexibility for remote deployment scenarios
• Flexible consumption and payment methods that enable you to pay as you grow

These technologies can make it much easier and less expensive for service providers to expand their offerings in rural regions. Now you can experience them up close and in person at the Cisco Broadband Innovation Center located in Research Triangle Park, NC. This is a perfect opportunity to expand your knowledge and expertise in rural broadband development. Not only will you see how to model and address your own specific use cases, but service providers can also focus on how to be more prepared for grant applications by understanding ways to benefit from Cisco’s next-generation network innovations. And it’s important to remember that federal grants will be awarded to the service providers with the best solutions, so it’s critical to work with a proven company at the forefront of rural broadband development.

Source: cisco.com