Introduction
Network and security teams seem to have had a love-hate relationship with each other since the early days of IT. Having worked extensively with both and built expertise in each over the past few decades, we often notice how similar their goals are: both seek to provide connectivity and bring value to the business. At the same time, there are certainly notable differences. Network teams tend to focus on building architectures that scale and provide universal connectivity, while security teams tend to focus on limiting that connectivity to prevent unwanted access.
Often, these teams work together — sometimes on the same hardware — where network teams will configure connectivity (BGP/OSPF/STP/VLANs/VxLANs/etc.) while security teams configure access controls (ACLs/Dot1x/Snooping/etc.). Other times, we find that Security defines rules and hands them off to Networking to implement. Many times, in larger organizations, we find InfoSec also in the mix, defining somewhat abstract policy, handing that down to Security to render into rulesets that then either get implemented in routers, switches, and firewalls directly, or else again handed off to Networking to implement in those devices. These days Cloud teams play an increasingly large part in those roles, as well.
All in all, each team contributes important pieces to the larger puzzle, albeit speaking slightly different languages. What is key to organizational success is for these teams to come together, communicate using a common language and framework, and work to decrease the complexity surrounding security controls while increasing the level of security provided, which together minimizes risk and adds value to the business.
As container-based development continues its rapid expansion, who provides security and where those security enforcement points live are both changing quickly as well.
The challenge
For the past few years, organizations have been significantly enhancing their security postures, moving from enforcing security only at the perimeter in a North-to-South fashion to enforcing it throughout their internal Data Centers and Clouds alike in an East-to-West fashion. Granular control at the workload level is typically referred to as microsegmentation. This move toward distributed enforcement points has great advantages, but it also presents unique new challenges: where will those enforcement points be located, and how will rulesets be created, updated, and deprecated when necessary, with precise accuracy and at the same speed the business and its developers move?
At the same time, orchestration systems running container pods, such as Kubernetes (K8S), accelerate that shift toward new security constructs through mechanisms such as the CNI, or Container Network Interface. CNI provides exactly what it sounds like: an interface through which networking can be provided to a Kubernetes cluster. A plugin, if you will. There are many CNI plugins for K8S: some are pure software implementations, such as Flannel (leveraging VxLAN overlays) and Calico (leveraging BGP), while others tie the worker nodes running the containers directly into the hardware switches they are connected to, shifting the responsibility for connectivity back into dedicated hardware.
Regardless of which CNI is used, the instantiation of networking constructs shifts from traditional CLI on a switch to structured, declarative text in the form of YAML or JSON, which is sent to the Kubernetes cluster via its API server.
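To make that concrete, here is a minimal sketch of what such a declaration might look like; the names and ports are hypothetical, and the manifest would typically be sent to the API server with a command like `kubectl apply -f web-service.yml`:

```yaml
# Hypothetical example: a networking construct declared as YAML rather than switch CLI.
# Once applied, the API server stores the object and the CNI plugin in use
# programs the actual connectivity on each worker node.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # pods labeled app=web receive the traffic
  ports:
    - protocol: TCP
      port: 80          # port exposed inside the cluster
      targetPort: 8080  # port the container actually listens on
```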
With that groundwork laid, we can begin to see how things get interesting.
Scale and precision are key
As we can see, we are talking about placing a firewall in front of every single workload and ensuring that those firewalls are always up to date with the latest rules.
Say we have a relatively small operation with only 500 workloads, some of which have already been migrated into containers, with more migrations planned every day.
In the traditional environment, this means deploying and maintaining a firewall for every workload that has not yet been migrated, plus a way to enforce the necessary rules for the containerized workloads as well. Now, imagine that a new Active Directory server has just been added to the forest and holds the LDAP server role. A slew of new rules must be added to nearly every single firewall, allowing the workload it protects to talk to the new AD server over a range of ports: TCP 389, 636, 88, and so on. If the workload is Windows-based, it likely also needs MS-RPC open, which means TCP 49152-65535; if it is not a Windows box, those ports most certainly should not be opened.
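For the workloads that have already moved into Kubernetes, the same rule churn shows up as policy objects rather than firewall CLI. As a hedged sketch only, with a hypothetical namespace, labels, and server address, the new AD rules might be rendered something like this:

```yaml
# Illustrative sketch: allow pods in this namespace to reach the new AD server
# on the standard Kerberos/LDAP ports. Namespace, CIDR, and ports are examples only;
# once egress policy is in effect, any other egress would need its own rule.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-new-ad
  namespace: prod
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.5.10/32   # hypothetical address of the new AD server
      ports:
        - protocol: TCP
          port: 88               # Kerberos
        - protocol: TCP
          port: 389              # LDAP
        - protocol: TCP
          port: 636              # LDAPS
```

Multiply that by every application, every namespace, and every new server, and the scale of the maintenance problem becomes clear.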
It quickly becomes clear that physical firewalls are untenable at this scale in traditional environments, and that even dedicated virtual firewalls still present the complex challenge of requiring centralized policy with distributed enforcement. Neither does much to help secure East-to-West traffic between containers within the Kubernetes cluster. One might accurately surmise, however, that any solution business leaders are likely to consider must handle all of these scenarios equally well from a policy creation and management perspective.
It also seems apparent that this centralized policy must be hierarchical in nature, defined in natural human language such as “dev cannot talk to prod” rather than with the archaic and unmanageable method of IP/CIDR addressing like “deny ip 10.4.20.0/24 10.27.8.0/24”; yet the system must still translate that natural language into machine-understandable addressing.
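In the Kubernetes world, one hedged illustration of that translation is a label-based policy. The namespaces and labels below are hypothetical, and Kubernetes NetworkPolicy is allow-list based, so “dev cannot talk to prod” is expressed by admitting only prod-labeled sources:

```yaml
# Illustrative sketch: "dev cannot talk to prod" rendered as labels rather than CIDRs.
# Only namespaces labeled env=prod may reach pods in the prod namespace;
# everything else, including dev, is implicitly denied once this policy selects the pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: prod-ingress-from-prod-only
  namespace: prod
spec:
  podSelector: {}              # all pods in the prod namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              env: prod        # only prod-labeled namespaces may connect
```

The controller and CNI still translate those labels into concrete addresses on every node, but people now reason about intent rather than CIDR blocks.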
The only way this works at any scale is to distribute those rules into every single workload running in every environment, leveraging the native and powerful built-in firewall co-located with each. For containers, this means the firewalls running on the worker nodes must secure traffic between containers (pods) within the node, as well as between nodes.
Business speed and agility
Back to our developers.
Businesses must move at the speed of market change, which can be dizzying at times. Developers must be able to write code, check it in to an SCM like Git, and have it pulled, automatically built, tested, and, if the tests pass, pushed into production. If everything works properly, we are talking between five minutes and a few hours, depending on complexity.
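A hedged sketch of such a pipeline, using GitHub Actions syntax purely as an illustration (the image name, registry, and test command are all hypothetical):

```yaml
# Illustrative pipeline only: check in code, build, test, and deploy with no manual hand-off.
name: build-test-deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t registry.example.com/webapp:${{ github.sha }} .
      - name: Run tests
        run: docker run --rm registry.example.com/webapp:${{ github.sha }} npm test
      - name: Push image
        run: docker push registry.example.com/webapp:${{ github.sha }}
      - name: Deploy to the cluster
        run: kubectl apply -f k8s/deploy-dev.yml
```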
Whether five minutes or five hours, I have personally never witnessed a corporate environment where a ticket could be submitted to have security policies updated to reflect new code requirements and be completed within a single day, setting aside for a moment input accuracy and the remediation required when a rule is entered incorrectly. It is usually a two-day to two-week process.
This is absolutely unacceptable given the rapid development process we just described, not to mention the dissonance experienced from disaggregated people and systems. This method is rife with problems and is the reason security is so difficult, cumbersome, and error prone within most organizations. As we shift to a more remote workforce, the problem becomes even further compounded, as relevant parties cannot so easily congregate in “war rooms” to collaborate through the decision-making process.
The simple fact is that policy must accompany code and be implemented directly by the build process itself, and this has never been truer than with container-based development.
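Concretely, and again as a hedged sketch continuing the illustrative pipeline above, the segmentation policy can live in the same repository as the application and be applied by the same deploy step, so code and policy are versioned, reviewed, and rolled out together (file names are hypothetical):

```yaml
# Illustrative fragment of the deploy job above: the policy manifest travels with the code.
      - name: Deploy application and its policy together
        run: |
          kubectl apply -f k8s/deploy-dev.yml   # the application itself
          kubectl apply -f k8s/policy-dev.yml   # the segmentation policy committed alongside it
```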
Simplicity of automating policy
With Cisco Secure Workload (Tetration), automating policy is easier than you might imagine.
Think with me for a moment about how developers work today when deploying applications on Kubernetes. They will create a deployment.yml file, in which they are required to input, at a minimum, the L4 port on which their containers can be reached. Developers have thus become familiar with enough networking and security policy to provision connectivity for their applications, but they may not be fully aware of how their application fits into the wider scope of an organization's security posture and risk tolerance.
This is illustrated below with a simple example: deploying a frontend load balancer and a webapp that is reachable on port 80 and has connections to both a production database (PROD_DB) and a development database (DEV_DB). The sample policy for this deployment can be seen in the `deploy-dev.yml` file below: