Dynamic Networks for Enterprise Applications
Posted by indroneel on December 2, 2007
Modern enterprise applications are characterized by a preponderance of distributed computing paradigms and methodologies. As a consequence, the network between different application elements (services) now plays an increasingly significant role. Typically, the network layer of most distributed environments has the following three characteristics:
- the nodes in the network (e.g. the services themselves) encapsulate complex functionality.
- the communication part is hierarchically layered into a set of relatively simple transports and protocols.
- the network topology is static, with the location and capability of services defined a priori.
In this article, we shall explore the feasibility and possible benefits of having a dynamic topology for the networking layer of enterprise applications.
Some Classic Examples
Dynamic network topologies are not a new concept. Multiple protocols and implementations exist to support the same, typically as low-level infrastructure services. The Service Location Protocol (SLP), Universal Plug and Play (UPnP) and Jini are good examples of such implementations.
To many application developers, the most familiar examples of dynamic network topologies are Microsoft LAN Manager and peer-to-peer file sharing over Gnutella and the various torrent protocols. Set against the relatively static nature of an enterprise intranet, these examples have resulted in certain misconceptions regarding dynamic networks:
- It is a transport-level feature oriented more towards resource sharing than towards distribution of functionality/logic.
- Resources and functionalities offered are transient in nature (ad-hoc networking).
- Performance (throughput) will be lower in the absence of durable entities.
The move to SOA
Even though distributed computing has been in use for the past two decades, only recently, with service-oriented architectures, do we see its widespread acceptance for enterprise systems.
Traditionally, Internet-based applications have adhered to a server-centric approach using simple and static network topologies. The application is logically divided into presentation, business and data access layers (the so-called three-tier model). Incidentally, all these layers are deployed as part of the same process and memory space (e.g. an application server). The network is relevant only for database connectivity, browser-to-server communication and interchange with message-oriented middleware.
As enterprises grew in complexity and size, service-oriented architectures gained prominence as the alternative for an increasing number of heterogeneous and disjoint applications within corporate IT spaces. Under the new scheme, diverse functionalities from different applications are unified and streamlined into a set of discrete services bound together using a standard infrastructure (e.g. ESB, web services and BPM, to name a few).
In today’s SOA-driven implementations, services are extremely complex entities that are one step away from being full-blown standalone applications. With time, as the SOA concept matures, this complexity is expected to change. One can predict the following to happen with SOAs of the future:
- The IT space for a typical enterprise will be powered by a large number of service instances.
- These services will be more fine-grained and less complex than what they are today.
- There will be few limitations on the number of service instances available or on the number of physical locations used to host these services.
- The network in between these services will become more complex both in terms of visualization and implementation.
A dynamic network topology is one such complexity that can be expected of enterprise networks of the future.
The following are some of the features of dynamic networks relevant to enterprise systems that are making use of certain distributed computing paradigms and/or SOA principles.
A dynamic network topology represents a shrink-and-grow environment. New nodes can be added to the network and existing nodes dropped. The introduction or removal of a node does not affect the availability of other nodes within the network. This is not the case with static topologies wherein a downtime (a rolling outage for better-designed systems) is necessary to finalize such changes.
One important feature, inherent in such dynamics, is that of transparent sharing and take-over. The introduction of a node should automatically redirect a portion of the system load to the newly available entity. Similarly, once a node is removed, the corresponding load is distributed among other available nodes within the network.
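The sharing and take-over behaviour described above can be sketched in a few lines. The example below is only an illustration and not a mechanism prescribed by any particular product: it uses rendezvous (highest-random-weight) hashing, a technique chosen here for its simplicity, and the node names and request keys are hypothetical.

```python
import hashlib

def owner(key, nodes):
    """Return the node responsible for a key (rendezvous hashing).

    Every node gets a pseudo-random score for the key and the highest
    score wins, so all participants agree without a central allocator.
    """
    if not nodes:
        raise ValueError("no nodes available")

    def score(node):
        digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=score)

def assignment(keys, nodes):
    """Map every key to the node that currently owns it."""
    return {key: owner(key, nodes) for key in keys}

# Hypothetical workload and node names, for illustration only.
keys = [f"request-{i}" for i in range(1000)]
before = assignment(keys, ["node-a", "node-b", "node-c"])

# A node joins: only the keys it wins are moved, and all of them move to it.
after_join = assignment(keys, ["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in keys if before[k] != after_join[k]]
print(len(moved), "keys moved to the new node")

# A node leaves: only the keys it owned are handed to the remaining nodes.
after_leave = assignment(keys, ["node-a", "node-c"])
orphaned = [k for k in keys if before[k] != after_leave[k]]
print(len(orphaned), "keys redistributed after node-b left")
```

The point of the design is that a change in membership disturbs only the share of work belonging to the node that joined or left; everything else stays where it was.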
In a dynamic environment, localization of capabilities can have serious effects on the overall availability of functionality and data. It is therefore imperative to replicate features and information across multiple nodes within the network. For example, purely data-driven systems, like peer-to-peer file sharing, rely on the fact that any file being shared will be available (in part or whole) on multiple machines within the network.
Availability takes on a different meaning under dynamic network topologies. The focus is no longer on the uptime of individual nodes, as is the case with static networks. Instead, one should consider the availability of overall functionalities and information that is powered by and replicated across multiple nodes within the network.
A general observation in this regard is that managed environments (e.g. a corporate intranet) provide for higher functional availability while open environments (e.g. peer-to-peer networks on the Internet) enable better availability of information.
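The shift from node uptime to functional availability can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that replicas fail independently of one another; real nodes that share hardware, power or a network segment will not satisfy this assumption, so the result should be read as an upper bound rather than a prediction.

```python
def functional_availability(node_availabilities):
    """Probability that at least one replica of a function is reachable.

    Assumes independent failures (an illustrative simplification).
    Each value is the fraction of time a single node is up, 0.0 to 1.0.
    """
    all_down = 1.0
    for availability in node_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - availability
    return 1.0 - all_down

# Three modest nodes, each up 90% of the time, replicate one function.
print(round(functional_availability([0.9, 0.9, 0.9]), 6))  # roughly 0.999
```

Under this assumption, three nodes that are individually unremarkable together offer the function at about "three nines".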
Discovery of Capabilities
Nodes in a network must be aware of each other’s capabilities for collaboration (this is true for both static and dynamic networks). In a static scenario, this amounts to updating each node configuration followed by a downtime to finalize the changes. Unfortunately, in a dynamic environment, it may not be a good idea to perform such manual configuration for individual node elements. Not only would this involve downtime, but the volume of nodes available, combined with their transient nature, makes such procedures difficult to implement.
To counter this, most dynamic networks include a feature that allows nodes to discover one another’s availability and capabilities. This is an automated mechanism, involving some combination of multicasting (both active and passive), broadcasting, information routing and directory-based services.
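As a rough illustration of the directory-based variant, consider the sketch below. It is an in-memory stand-in, not the API of SLP, UPnP or Jini: the class name, method names and the 30-second lease are all assumptions made for this example. Nodes announce their capabilities under a lease and must renew it periodically, so a node that disappears silently is forgotten once its lease runs out.

```python
import time

class Directory:
    """Hypothetical in-memory capability directory with expiring leases."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self._lease = lease_seconds
        self._clock = clock
        self._entries = {}  # node id -> (capabilities, expiry time)

    def announce(self, node_id, capabilities):
        """Register a node, or renew its lease if already registered."""
        expires_at = self._clock() + self._lease
        self._entries[node_id] = (frozenset(capabilities), expires_at)

    def withdraw(self, node_id):
        """Remove a node that is leaving the network gracefully."""
        self._entries.pop(node_id, None)

    def lookup(self, capability):
        """Return the ids of live nodes offering the given capability."""
        now = self._clock()
        # Forget nodes whose lease has lapsed without renewal.
        self._entries = {n: e for n, e in self._entries.items() if e[1] > now}
        return sorted(n for n, (caps, _) in self._entries.items()
                      if capability in caps)

# Usage with a hand-driven clock so the example is deterministic.
now = [0.0]
directory = Directory(lease_seconds=30.0, clock=lambda: now[0])
directory.announce("billing-1", {"invoice", "tax"})
directory.announce("billing-2", {"invoice"})
print(directory.lookup("invoice"))   # ['billing-1', 'billing-2']

now[0] = 20.0
directory.announce("billing-2", {"invoice"})  # billing-2 renews its lease
now[0] = 40.0
print(directory.lookup("invoice"))   # ['billing-2']
```

In a real network the announcements would travel over multicast, broadcast or a dedicated directory service; the bookkeeping, however, stays much the same.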
Let us now consider the benefits of dynamic network topologies in an enterprise context vis-a-vis the features described so far.
Managed networks (e.g. a corporate intranet) represent finite spaces containing services that are stable in nature. In such networks, a dynamic topology augments durability by addressing scenarios where services become unavailable. This unavailability can be due to abnormal program termination or planned downtime (e.g. for maintenance and upgrades).
In server-centric applications, non-availability of services usually means downtime of the application in whole or in part. For example, the downtime of the database server constitutes a shutdown of the whole application. The downtime of a mail server on the other hand causes a temporary suspension of the email-related capabilities but may not require a system downtime.
The situation is different for services deployed in a dynamic network environment. Since redundancy is a key feature of such topologies, there will be multiple services within the same network that are capable of performing the same set of functions using the same set of information. In the event of a failure or termination of one service, the network ensures (through transparent distribution and allocation) that the workload is automatically picked up by another service with similar capabilities. This results in virtually no loss of availability and maintains quality of service of the system as a whole.
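The take-over described above can be pictured from the caller's side as follows. This is a minimal sketch under stated assumptions: the provider names, the `ServiceUnavailable` exception and the handler functions are hypothetical, and in a real dynamic network the retry would normally happen inside the networking layer rather than in application code. Retrying on another node is also only safe when the request is idempotent or the failed attempt has been rolled back.

```python
class ServiceUnavailable(Exception):
    """Raised when a provider cannot serve a request (hypothetical)."""

def call_with_failover(providers, request):
    """Try equivalent providers in order until one serves the request.

    `providers` is a list of (name, handler) pairs that all offer the
    same function. Returns the name of the serving node and its result.
    """
    errors = []
    for name, handler in providers:
        try:
            return name, handler(request)
        except ServiceUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why, then move on
    raise ServiceUnavailable("all providers failed: " + "; ".join(errors))

def failed_node(request):
    raise ServiceUnavailable("node terminated")

def healthy_node(request):
    return request.upper()

served_by, result = call_with_failover(
    [("pricing-1", failed_node), ("pricing-2", healthy_node)], "quote")
print(served_by, result)  # pricing-2 QUOTE
```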
Scalability is expressed as the upper limit for a variety of parameters that depict the workload that a system is capable of handling without critical errors or significant loss in performance. These parameters include the number of simultaneous user sessions, transactions per second and page hits per second. For enterprise applications, the usual practice to improve scalability is the introduction of additional identical service nodes within the network (e.g. multiple application servers in a cluster) for load sharing.
Introduction of a new node in a cluster that is deployed on a static network can be quite an involved process. Every other node in the cluster, as well as the load-balancer, needs to be configured with the new node being introduced. Subsequent to configuration, the cluster may require a restart (downtime), which is a costly operation for heavily-loaded, transaction-oriented sites.
Dynamic networks come with inbuilt support for clustering and load-balancing. Since these are shrink-and-grow spaces, addition of new nodes/services is always transparent and does not require manual intervention or interruptions in the form of downtimes.
Single Point of Failure
A single point of failure is one of the most commonly encountered problems with distributed systems. Simply stated, a single point of failure is any service whose non-availability results in a downtime of the overall application. Such failures occur when critical portions of the application are not duplicated with backup elements and fail-safe mechanisms.
In the absence of redundancy, every service within an application is a potential single point of failure. In a distributed setup with fail-safe mechanisms, the common points of failure include the database, load-balancers and transaction managers (these are the critical elements that determine work allocation, coordination and data sharing).
Even the most carefully designed distributed systems can suffer from a single point of failure. However, dynamic networks, with their inherent support for redundancy, can to a large extent limit the probability of such errors.
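Given a picture of which nodes currently offer which capability, spotting the single points of failure is straightforward. The sketch below assumes such a picture is available as a plain dictionary; the capability and node names are hypothetical.

```python
def single_points_of_failure(providers_by_capability):
    """Return capabilities that exactly one distinct node can provide."""
    return sorted(capability
                  for capability, nodes in providers_by_capability.items()
                  if len(set(nodes)) == 1)

# Hypothetical snapshot of a small deployment.
snapshot = {
    "database": ["db-1"],
    "load-balancer": ["lb-1"],
    "order-service": ["app-1", "app-2", "app-3"],
}
print(single_points_of_failure(snapshot))  # ['database', 'load-balancer']
```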
The failure of any service in a distributed system usually involves some kind of rollback operations. Even if the workload is taken over by and distributed among other backup services, the operation that failed would need to be re-executed.
In addition, the amount of rollback that needs to be done before resumption of normal operations also requires consideration. For well-designed systems the usual practice is to break up complex operations into simpler ones with frequent save-points in between.
Since dynamic networks deal with potentially transient entities, there is a natural and implicit drive to organize functionality and data into manageable chunks to avoid major rollbacks in the event of failures.
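The effect of frequent save-points on the amount of rollback can be shown with a small sketch. Everything here is illustrative: the `savepoint` dictionary stands in for durable storage that survives the failure of a node, and the order-processing steps are hypothetical.

```python
import copy

def run_with_savepoints(steps, initial_state, savepoint):
    """Run steps in order, recording a save-point after each one.

    On a rerun, work resumes after the last completed step, and any
    partial changes made by the step that failed are discarded.
    """
    done = savepoint.get("done", 0)
    state = copy.deepcopy(savepoint.get("state", initial_state))
    for index in range(done, len(steps)):
        state = steps[index](state)          # may raise midway
        savepoint["done"] = index + 1
        savepoint["state"] = copy.deepcopy(state)
    return state

calls = []
ship_attempts = [0]

def reserve(order):
    calls.append("reserve")
    order["reserved"] = True
    return order

def charge(order):
    calls.append("charge")
    order["charged"] = True
    return order

def ship(order):
    calls.append("ship")
    ship_attempts[0] += 1
    order["shipped"] = True
    if ship_attempts[0] == 1:
        raise RuntimeError("node dropped out")  # simulated failure
    return order

steps = [reserve, charge, ship]
savepoint = {}
try:
    run_with_savepoints(steps, {"id": 42}, savepoint)
except RuntimeError:
    pass  # only the unfinished 'ship' step is lost

result = run_with_savepoints(steps, {"id": 42}, savepoint)
print(calls)   # ['reserve', 'charge', 'ship', 'ship']
print(result)
```

Only the step that failed is executed again; the two steps completed before the failure are not repeated.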
Configuration of services in a distributed environment is an involved process from a maintenance point of view. For complex services, the number of configuration items also increases proportionately to include:
- platform (static) configuration, e.g. application assembly descriptors and web application configuration.
- application (dynamic) configuration that is read from a combination of properties, INI and XML files external to the code-base.
- configuration for integration with other services within the domain, e.g. establishing connectivity with database servers and mail servers, and interoperability with indexing and directory services, to name a few.
Of these, the first two items are defined by the application developer and require simple replication for multiple services during deployment. Management of the third type of configuration can potentially get complex, especially with large distributed systems.
Dynamic networks, with their inbuilt features of service discovery and capabilities evaluation, can help minimize the effort involved in maintaining the third type of configuration.
Resources and Links
 Dynamic Networks Overview Presentation by Mark Wallis.
 Detlef Schoder and Kai Fischbach, Core Concepts in Peer-to-Peer (P2P) Networking. In: Subramanian, R.; Goodman, B. (eds.): P2P Computing: The Evolution of a Disruptive Technology, Idea Group Inc, Hershey.
 The Gnutella Protocol Documentation project on SourceForge.
 The Distributed Contributing Site provides comprehensive information on articles, projects news and links related to this particular style of computing.
 Nadiminti, Dias de Assunção, Buyya (September 2006). “Distributed Systems and Recent Innovations: Challenges and Benefits“. InfoNet Magazine, Volume 16, Issue 3, Melbourne, Australia.