posted by: Eric Siegel
Let's take a few moments and think about what Cisco is doing with its AON (Application-Oriented Networking).
Cisco has already sold a router to every organization, enterprise, man, woman, child, and household pet in the entire known universe. Now its hungry army of salespeople needs new markets, and "moving up the protocol stack" into the world of applications is an obvious direction.
AON has the potential to be much more than a quick move into a few hot applications markets, however. There are some fascinating technological history and concepts behind Cisco's recent moves. It seems to me that Cisco is beginning to re-create the Tandem Computers message-based operating system, Guardian, in a heterogeneous world.
The Tandem Computers system, which is still very successful and is sold by HP as the "NonStop" computer system, is a loosely-coupled, linearly-scalable, fault-tolerant, highly-integrated set of hardware, OS, transaction monitor, and SQL database. It was originally designed for transaction processing but is now expanding into other areas that can use massive scalability and fault-tolerance with a UNIX and SQL programmer interface. One of the keys to its success is its message-based OS kernel, which guarantees that any message is delivered exactly one time to its destination and that a reply is returned exactly one time to the message's originator. The message destination is a process name (which can be an application process or a systems process, such as a database or communications link), and there can be dozens, or hundreds, of identical processes in different processors sharing the same name and therefore sharing the workload. The OS distributes the workload over the processes, and it also handles recovery from failed processes. If a process hasn't successfully returned a reply to a message's originator, the OS automatically ensures that another process with the same name gets the work and that there isn't any database corruption. (I worked at Tandem Computers many years ago, before it became part of Compaq and then part of HP.)
Let's see what Cisco is doing, and compare it to the NonStop systems.
First, we need a message-based set of standards for applications to talk to one another. What a surprise! It seems that XML and SOAP are already standards, and they're already widely adopted. They can be used to send messages, and they have standardized internal metadata that can be used to find the information and context for load balancing. A load balancer (or Cisco device!) can look for a few key internal message fields for choosing a server; there's no need to understand everything in the message.
With XML and SOAP, it's easy to build a general-purpose message switch that looks for a few internal fields. Note the contrast with raw TCP/IP, which is a stream of unformatted data. Doing the same thing with raw TCP/IP would require that the organization build its own custom internal format and program its own message switches to handle that format.
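To make the idea concrete, here is a minimal sketch of such a switch in Python. It reads one header field from a SOAP 1.1 envelope and ignores the rest of the message. The routing namespace `urn:example:routing`, the `ServiceClass` header field, and the pool names are all made up for illustration; nothing here reflects an actual Cisco or AON format.

```python
import xml.etree.ElementTree as ET

# The SOAP 1.1 envelope namespace is standard; the routing namespace is hypothetical.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "r": "urn:example:routing",
}

# Hypothetical mapping from a service class to a pool of servers.
ROUTES = {"orders": "orders-pool", "billing": "billing-pool"}

SAMPLE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:r="urn:example:routing">
  <soap:Header>
    <r:ServiceClass>orders</r:ServiceClass>
  </soap:Header>
  <soap:Body>
    <PlaceOrder><Item>router</Item><Quantity>1</Quantity></PlaceOrder>
  </soap:Body>
</soap:Envelope>
"""


def choose_pool(message: str) -> str:
    """Pick a server pool from one header field; the body is never inspected."""
    root = ET.fromstring(message.strip())
    field = root.find("soap:Header/r:ServiceClass", NS)
    if field is None or not (field.text or "").strip():
        raise ValueError("message carries no routing field")
    return ROUTES[field.text.strip()]


print(choose_pool(SAMPLE))
```

The switch only needs to agree with the applications on that one field, which is the point of the contrast with raw TCP/IP.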
Second, we need to load-balance among multiple members of the same class. This capability is already provided by WCCP Version 2 and other load-balancing systems, especially for context-free messages, which is what we're talking about here.
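As a rough illustration of sharing work among members of one class, the sketch below hands out servers in plain round-robin order. This is not how WCCP itself assigns traffic; it is just the simplest policy that works when messages are context-free. The `ServerClass` name and the server labels are invented for the example.

```python
import itertools


class ServerClass:
    """A named group of interchangeable servers, picked in round-robin order."""

    def __init__(self, name, members):
        members = list(members)
        if not members:
            raise ValueError("a server class needs at least one member")
        self.name = name
        self._rotation = itertools.cycle(members)

    def pick(self):
        # Because messages are context-free, any member can take any message.
        return next(self._rotation)


orders = ServerClass("orders", ["srv-a", "srv-b", "srv-c"])  # hypothetical servers
picks = [orders.pick() for _ in range(4)]
print(picks)
```

Round-robin only makes sense because no message depends on an earlier one having gone to the same server.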
Third, we need to provide the ability to return the reply to the originator of the message. Again, a Cisco device could easily do that.
And fourth, we need to guarantee delivery exactly once. This is more complex, but not insurmountable. Using SOAP header fields, the Cisco devices could embed a guaranteed-unique message identifier that Cisco devices would use to check that each message is actually delivered once, to suppress unwanted duplicates, and to re-deliver messages in a controlled way if the initial server fails before generating a reply.
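Here is one way the bookkeeping could look, as a sketch only. The switch tags each message with a unique identifier, remembers the reply for each identifier so a duplicate is answered from the record instead of being run again, and moves on to another server if the first fails before replying. The class and function names are hypothetical, and the sketch assumes a failed server left no partial side effects behind.

```python
import uuid


class ServerFailed(Exception):
    """Raised by a server that dies before producing a reply."""


class ExactlyOnceSwitch:
    def __init__(self, servers):
        # Each server is a callable: (message_id, body) -> reply.
        self.servers = list(servers)
        self.replies = {}  # message_id -> recorded reply

    def new_message_id(self):
        # A random UUID stands in for the guaranteed-unique identifier.
        return str(uuid.uuid4())

    def deliver(self, message_id, body):
        # Duplicate suppression: a message already answered is not re-run.
        if message_id in self.replies:
            return self.replies[message_id]
        # Controlled re-delivery: try the next server if one fails.
        for server in self.servers:
            try:
                reply = server(message_id, body)
            except ServerFailed:
                continue
            self.replies[message_id] = reply
            return reply
        raise RuntimeError("no server produced a reply")


def flaky(message_id, body):
    raise ServerFailed()


handled = []


def healthy(message_id, body):
    handled.append(message_id)
    return body.upper()


switch = ExactlyOnceSwitch([flaky, healthy])
mid = switch.new_message_id()
first = switch.deliver(mid, "debit account")
second = switch.deliver(mid, "debit account")  # duplicate of the same message
```

In this run the healthy server processes the message once, and the duplicate gets the recorded reply. A real device would also have to persist the reply table and cope with servers that fail partway through their work.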
There's more that should be done, such as the integration of a SQL database and transaction commit/rollback capabilities, which would provide fault-tolerant updates of a database despite server failure. But, clearly, it's possible for Cisco to start building the equivalent of the HP NonStop kernel on Cisco equipment.
And look at the result: Cisco would own the infrastructure underlying a massive distributed-processing system, which could use processors and operating systems from many different vendors. They'd need to use XML and SOAP, but they're already doing that! There would be a requirement that applications be context free and that replies repeat the transaction identifier, but that's minor. And many systems are already heading in that direction.
Cisco would be perfectly positioned to provide more than guaranteed delivery and load balancing. It could also provide QoS prioritization and classification, security authentication and encryption, and -- because it would then know the security keys -- it could also provide detailed measurement, diagnostic transaction tracing (with synchronized clocks and records of the transaction IDs, of course!), logging, WAN performance optimization, and XML message and protocol transformation.
Cisco wouldn't need to get involved in the complexities of the operating systems and applications, although it would be sensible to build, or standardize on, a distributed-architecture SQL database and transaction monitor. Cisco could leave applications processors, operating systems, and applications to existing vendors.
And so, after enumerating all of these thoughts, I asked John Chambers at the Cisco Networkers conference if he was remaking Cisco as a distributed-computer company.
His answer? "Yes."
