Eliminate Legacy. Forever.

As we release AOS 3.0, it's time for the industry to take a hard look at legacy and its crippling effects. It's also a fitting opportunity to propose a path forward that enables organizations to eliminate legacy forever.

Indeed, everyone hates legacy, at least when defined in the context of infrastructure management. So why does it happen, and how can it be eliminated? Legacy is created when growing complexity in your infrastructure prevents you from making the changes your business requires with the necessary agility and without unreasonable risk. To compete, you need to make changes that are agile and safe. Yet even though you own your infrastructure (and not the other way around), legacy prevents you from doing so.

The challenge, of course, is to move away from your legacy to a modern infrastructure in a manner that doesn’t disrupt your business. You want to be able to say “it’s not you, it’s me” when you feel the time is right.

The key reason change is so difficult is that you may not even know what your current state is. How do you avoid disrupting your business when you cannot say with certainty what business applications you are running and what their requirements are?

Your business applications are essentially collections of compute endpoints. These endpoints have specific reachability requirements. They may have distinct security requirements. Some of them may be members of a load balancing group. They may have different HA and QoS requirements. They may differ in how mission critical they are to your business. Typically, all of these aspects (reachability, security, load balancing, HA, QoS) are implemented using different, separate enforcement mechanisms.

Now, as you attempt to change or evolve your legacy infrastructure, or even move to a different one, you first need to understand the current enforcement mechanisms and their interactions. Then you need to understand and leverage the new enforcement mechanisms in your evolved infrastructure (as they have likely changed) and how they map to the ones implemented in the legacy infrastructure, while at the same time ensuring your business applications' requirements are still met.

The foundational automation architecture principle that helps with this situation is the separation between service/policy specification (the representation of your business applications and their requirements) and the enforcement mechanisms (how those requirements are implemented and enforced). It states that the specification of your business needs should be decoupled from the way you implement, satisfy, and enforce them. Once you have that separation, the portability of your workloads becomes feasible. Only with that separation in place can you map service intent to enforcement mechanisms; the separation is the prerequisite.
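To make the principle concrete, here is a minimal Python sketch. The ServiceSpec, AclBackend, and FirewallBackend names are hypothetical, not AOS or any vendor API: the point is only that the same enforcement-agnostic specification can be handed to interchangeable back ends.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class ServiceSpec:
    """Enforcement-agnostic statement of business intent."""
    name: str
    source_group: str
    destination_group: str
    allowed_ports: List[int]


class EnforcementBackend(Protocol):
    """Anything capable of realizing a ServiceSpec."""
    def render(self, spec: ServiceSpec) -> List[str]: ...


class AclBackend:
    def render(self, spec: ServiceSpec) -> List[str]:
        # Realize the intent as switch ACL entries.
        return [f"permit tcp {spec.source_group} {spec.destination_group} eq {port}"
                for port in spec.allowed_ports]


class FirewallBackend:
    def render(self, spec: ServiceSpec) -> List[str]:
        # Realize the same intent as firewall rules.
        return [f"allow {spec.source_group} to {spec.destination_group} port {port}"
                for port in spec.allowed_ports]


# The specification never changes; only the enforcement back end does.
spec = ServiceSpec("web-to-db", "web", "db", [5432])
for backend in (AclBackend(), FirewallBackend()):
    print(backend.render(spec))
```

Swapping one back end for another without touching the specification is exactly the migration step that legacy otherwise blocks.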

So the first question is, what should that service specification look like? The answer, in principle, has been around for quite a while. Variants of the concept of endpoint/group-based policy specification can be found in OpenStack, AWS, and Azure. What does it look like?

Business application intent is expressed as a composition of endpoints that are placed into groups for the purpose of expressing the need for some common behavior: reachability, security, or load balancing requirements, to name a few examples. The endpoint definition can vary: it can be an application, a virtual machine, a container, an external (unmanaged) endpoint, a physical/logical port, and so on. Policies are instantiated and related to groups or individual endpoints to define that behavior. Policies can relate to groups in a directional or non-directional manner. Policies are collections of rules that follow a "condition followed by an action" pattern. Groups can be composed of other groups, creating hierarchy.
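A rough sketch of those building blocks, expressed as illustrative Python data structures rather than any vendor's schema, might look like this:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    name: str        # a VM, container, external host, physical/logical port, ...
    address: str


@dataclass
class Rule:
    condition: str   # e.g. "tcp/443"
    action: str      # e.g. "permit"


@dataclass
class Policy:
    name: str
    rules: List[Rule]            # "condition followed by an action"
    directional: bool = True     # applied from one group toward another, or symmetrically


@dataclass
class Group:
    name: str
    endpoints: List[Endpoint] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)   # groups compose into hierarchies
    policies: List[Policy] = field(default_factory=list)
```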

Endpoints, groups, policies, and rules can be thought of as building blocks for expressing intent. Dynamism is achieved by adding endpoints to (or removing them from) groups and by inheriting the policies applied to those groups. Behavior is changed by modifying the policies and rules that apply to groups and endpoints, as sketched below.
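Continuing the sketch above, and again using hypothetical names rather than a product API, dynamism amounts to membership changes plus policy inheritance down the group hierarchy:

```python
from typing import Dict, List, Optional


def effective_policies(group: Group,
                       inherited: Optional[List[Policy]] = None) -> Dict[str, List[Policy]]:
    """Map each endpoint to the policies it inherits from every enclosing group."""
    inherited = (inherited or []) + group.policies
    result = {ep.name: list(inherited) for ep in group.endpoints}
    for child in group.children:
        result.update(effective_policies(child, inherited))
    return result


# Intent is edited, never device configuration:
web = Group("web", endpoints=[Endpoint("web-1", "10.0.0.11")])
app = Group("app", children=[web],
            policies=[Policy("https-in", [Rule("tcp/443", "permit")])])

web.endpoints.append(Endpoint("web-2", "10.0.0.12"))   # scale out; the new endpoint inherits https-in
print(effective_policies(app))
```

Adding web-2 changes nothing in the policy definitions; the new endpoint simply inherits the behavior already declared for its group.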

So if this has been around for a while, what are we missing? When modeling, there are two places where things can go wrong. The first is at model definition time. To do it right, the model has to be expressive enough to cater to all use cases while not being overly complicated. It needs to be complete, yet minimal. Endpoint/group-based policy specification is expressive enough. The second opportunity for error is at model application time. There are many instances where this model is applied directly to enforcement mechanisms rather than to service/policy declaration. What one gets, as a result, is the "right model, applied to the wrong domain".

Then there is the task of mapping service intent to enforcement mechanisms mentioned earlier. If not architected correctly, this can be a very challenging task. Intent-Based Networking systems at Level 2 of the IBN taxonomy have the necessary foundations to complete it, namely a Single Source of Truth and real-time, event-based reaction to change.
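As a toy illustration of those two foundations (a conceptual sketch only, not how AOS is implemented), a single source of truth plus event-driven reconciliation might look like this:

```python
import queue
from typing import Dict, List

# Single source of truth: the declared intent plus the devices that must enforce it.
ssot: Dict[str, List[str]] = {"specs": ["web-to-db"], "devices": ["leaf-1"]}
events: "queue.Queue[str]" = queue.Queue()


def render(state: Dict[str, List[str]]) -> None:
    # Stand-in for mapping intent onto whatever enforcement mechanisms the devices offer.
    print(f"rendering {state['specs']} onto {state['devices']}")


def reconcile() -> None:
    """React to each change event by re-deriving enforcement from the source of truth."""
    while not events.empty():
        events.get()
        render(ssot)


ssot["devices"].append("leaf-2")   # the infrastructure changed
events.put("device-added")
reconcile()                        # enforcement is re-derived; the intent itself never moved
```

The point is that a change event never edits enforcement directly; it only triggers a re-derivation of enforcement from the declared intent.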

At Apstra, we’ve build AOS to incorporate the foundations into the architecture of the solution. In the absence of these foundations, overcoming the challenge typically results in leaking of enforcement abstractions into service/policy abstractions. ( See micro-segmentation blog for example). This violates the separation of policy and enforcement principle and results in non-portable workloads, which was the primary motivation for the separation in the first place. Note that leaking abstractions are sometimes introduced on purpose as vendors building these APIs may have little or no interest in universal portability of the workloads. That spells “lock-in”, which spells inability to change, which spells legacy.

We discussed in earlier blogs that the inability to deal with Day 2 operations turns your greenfield into a brownfield overnight. Business requirements specified in an enforcement-agnostic, portable manner eliminate this danger.

This is one reason I'm excited about AOS 3.0, which we released today. It introduces, for the first time, Group Based Policies implemented using the architectural principles described above. With AOS 3.0, customers have the right starting point to eradicate legacy forever.

You can read more about AOS 3.0 in the press release and the data sheet. Please register for our upcoming webinar.

About the Author – Sasha Ratkovic:

Sasha Ratkovic is a thought leader in Intent-Based Analytics and a very early pioneer in Intent-Based Networking. He has deep expertise in domain abstraction and intent-driven automation. Sasha holds a Ph.D. in Electrical Engineering from UCLA.


The Dangers of Hardware Vendor Lock-in

Hardware Vendor Lock-In: A Long and Messy Past

If you think about the beginnings of networking, hardware lock-in was the norm. The original IBM Token Ring, for example, was a proprietary networking solution, and every new purchase order had to go to the same vendor if customers wanted to ensure connectivity.

Customers demanded interoperability between hardware vendors, and so came Ethernet, which promised to be an open, interoperable standard. However, vendors recreated lock-in by implementing proprietary VLAN extensions. Some vendors even went as far as taking other vendors who implemented private VLANs to court, to the detriment of customers and the industry as a whole!

Internet Protocol (IP) was another open standard. Yet hardware vendors complicated matters, coming out with proprietary routing protocols such as IGRP that were designed to lock customers exclusively into that vendor's equipment.

Today, with the emergence of merchant silicon, whitebox switches, and open source switch operating systems, it is more important than ever for businesses to defend themselves from lock-in and to retain the flexibility to be vendor-agnostic so they can leverage those options.

Yet by the same token, it has become more critical than ever for the hardware vendors to defend themselves from the competition by locking in their customers!  

So with white box switches, open source device operating systems, and commoditization on the rise, what is the new hardware vendor lock-in strategy? Hardware vendors are coming out with proprietary management APIs that lock their hardware to their management systems. These proprietary network management solutions are the ultimate form of lock-in: not the SNMP add-ons of the past, but sophisticated network monitoring, configuration, and troubleshooting solutions that only work with that one vendor's equipment. Uber lock-in!

Why is Hardware Lock-In Dangerous?

The dangers of hardware vendor lock-in are real, consequential, and affect every aspect of IT operations. Hardware vendor lock-in is a primary inhibitor of organizations' digital transformation initiatives and of their ability to compete. It starts with the fact that IT is completely at the vendor's mercy, which has profound consequences.

Extremely high costs

When an organization locks itself into its hardware vendor, two things happen:

1. IT loses control over the pricing of hardware:

How can IT have any leverage when there is only one vendor to talk to? IT may have negotiated an initial deal for the first batch of hardware, or for the first year. But when IT is locked in, nothing prevents the hardware vendor from coming back with higher prices as soon as they are able.

Hardware vendors aggressively promote and position their own management software because they know that once it is deployed they gain account control, become deeply entrenched, and are difficult to replace. At that point, IT has all but guaranteed the highest prices and total cost of ownership (TCO).

In this report, Gartner comments that “By introducing competition in this thoughtful manner, Gartner has seen clients typically achieve sustained savings of between 10% and 30% and of as much as 300% on specific components like optical transceivers”. In another report, Gartner analysts Mark Fabbi and Debra Curtis find that “Sole-sourcing with any vendor will cost a minimum 20% premium, with potential savings generally reaching 30% to 50% or more of capital budgets when dealing with premium-priced vendors”.

2. IT loses control over operational expenses

Even more importantly, with vendor lock-in IT loses control over the ability to reduce operational expenses. This happens for many reasons:

  1. Reducing operational expenses is achieved by using automation tools that meet IT's business needs. Those tools are generally built by specialized, best-of-breed companies that are 100% focused on solving those problems. It is highly unlikely that management software built by the hardware vendor, and designed to lock the customer into that vendor, will meet all of the customer's automation requirements.
  2. The hardware vendor then positions extremely expensive and highly profitable "professional services" that build spaghetti code to remediate this divergence between IT requirements and the capabilities of their software solution. With every new version of the management software, the spaghetti code has to be upgraded, at excessive cumulative expense. The result is massive costs, coupled with an innate inability to deliver on the needs of the business.
  3. The IT team becomes mired in hardware vendor minutiae, from arcane commands to arcane workflows to arcane vendor-specific tools; this wastes resources and time and keeps IT away from the tasks that are more aligned with the needs of the business. It is no wonder that, according to ZK Research, businesses are now dedicating 82% of their IT budgets solely to keeping the lights on, leaving very little budget for innovation.

In fact, Gartner finds that by breaking their lock-in to a single vendor, some organizations were able to reduce their opex costs by as much as 95%!

Outages and security risks

1. Substantially higher rate of outages

When IT is locked into one hardware vendor, the business is completely at the mercy of that vendor's bugs and quality problems (both hardware and software). When problems occur, customers have no recourse but to depend on the hardware vendor, and the vendor alone, to solve them. Even when the vendor is motivated to fix the issues, it is limited by its development cycles, and it may take far longer to solve a problem than what IT has written into an SLA contract, or than what is acceptable to the business. Even if IT is able to hold the vendor accountable for violating the SLA, that doesn't help the business, and frankly IT has very little recourse since it is locked in. Being locked into the management software and analytics that the hardware vendor provides also means not being able to take advantage of other tools on the market that may be more effective at preventing these outages in the first place.

2. Exposure to security vulnerabilities

Security vulnerabilities are common and are routinely discovered on devices in the infrastructure. When a hardware vendor discovers a security vulnerability in the hardware and device OS that a customer is locked into, the customer must wait for the vendor to provide a patch, which may take months; and when the patch arrives, the customer must then run it through their qualification process, which may take many more months. Skipping the qualification process is akin to rolling the dice on potential unknown bugs (a very common occurrence with new device OS versions) that could cause bigger problems, performance degradation, or outages. Gartner analyst Andrew Lerner wrote a great blog about the pain involved in network upgrades, in which he compares the process to going to the dentist!

In summary, when a customer locked into a hardware vendor faces a security vulnerability, they risk being exposed for months before it can be remediated.

Failing to deliver for the business and losing relevance as a result

1. Unable to meet the requirements of the business

When IT locks itself into hardware, a massive opportunity cost is incurred. As engineers are deployed to become experts on vendor-specific hardware and device OS versions, they are unable to focus on the initiatives that are most critical to the business: the cloudification of their infrastructure, improving that infrastructure to avoid outages and increase application availability, and automation efforts in support of the business's digital transformation initiative, to name a few.

Indeed, it is common for Fortune 500 enterprises to delay critical automation initiatives because of their investment in becoming world experts in the vendor's latest hardware or latest device operating systems. That investment is often viewed as a "sunk cost," with dramatic consequences for the business and for the relevance of infrastructure teams. As businesses digitally transform and application teams grow impatient with their hardware-first infrastructure teams' inability to deliver on their requirements, application teams turn elsewhere to make progress.

2. Shadow IT

What often happens is that application teams and DevOps teams start spinning up workloads in the cloud, bypassing the IT infrastructure teams completely! The phenomenon, often referred to as "Shadow IT," is quite pervasive. For example, a 2014 PwC report estimated that "80% of enterprises have used cloud platforms and SaaS applications that have not been approved by IT; and for those enterprises, the percentage of applications that were running on shadow IT was a whopping 35%."

Needless to say, shadow IT significantly increases the security risks for an organization; it also results in costs ballooning out of control.

In a future blog, I'll tell you how a software-first approach featuring Intent-Based Networking can help you avoid vendor lock-in.
