How performance requirements can prove a stumbling block for cloud transformation
As businesses look to cloud for faster, more flexible growth, they confront significant challenges from a legacy application base that has varying levels of cloud suitability.
Some applications have specific kinds of performance requirements that may limit or eliminate their eligibility for virtualisation, which is fundamental to optimising cloud efficiencies. These performance characteristics come in various flavours: a requirement for specialty hardware, a requirement for particularly extreme CPU or memory utilisation, or a requirement for real-time, deterministic performance.
Under the hood
The core value proposition of virtualisation and cloud is using a single standard hardware platform across a wide range of applications or services. In some cases, though, specialised hardware might be required to provide some function most effectively. Sophisticated maths can sometimes benefit from Graphics Processing Units (GPUs), high-throughput systems benefit from solid-state disk, and cryptographic applications can use hardware random number generators, which can be difficult to come by in virtual environments. Low-latency stock trading environments often use specialised low-latency switches and network taps for operations and instrumentation. While a specialty hardware requirement may not prevent migrating an application to a cloud environment, it may limit the vendor options or require special accommodation in the case of private clouds.
Another common obstacle to virtualisation is a requirement for raw processing horsepower. The hypervisor itself consumes some CPU cycles to schedule and manage the various virtual machines running on the server. Maximising the CPU available to an application means running only one virtual machine on that hardware, at which point the question becomes whether it is cost-effective to use a hypervisor at all.
A more subtle performance requirement that is particularly troublesome in shared service environments is deterministic performance. Some transactions or activities need to happen within a fixed – often short – amount of time. Software-defined networking (SDN) solutions, some types of live media broadcasting, real-time big data analytics applications and algorithmic trading platforms all benefit from deterministic, consistent performance. Cloud-like, virtual, shared resource provisioning is subject to the “noisy neighbour” problem: without some workload planning and engineering, it’s difficult to know what other applications might be running on the same hardware. This yields unpredictable performance patterns which can impact the usability of the application.
While this issue is most obvious in multi-tenant public clouds, private clouds, where there is more knowledge of the overall workload, can be problematic as well. Mixing different operating environments, commonly done as a cost or flexibility measure, can create issues. The classic example is sharing hardware between development and QA or production. Development virtual machines can often come and go, with different performance characteristics between each machine and its predecessor. If hardware is shared with production, this may influence production performance. While horsepower performance may be satisfactory, the “noise” brought on by changing out virtual machines may create unacceptable inconsistency.
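The difference between satisfactory "horsepower" and unacceptable inconsistency can be made concrete by looking at more than the average. The following is a minimal sketch, not a monitoring tool: the function name latency_profile and both sample sets are hypothetical, chosen so that a steady workload and a "noisy" one have the same mean response time but very different worst-case behaviour.

```python
import statistics


def latency_profile(samples_ms):
    """Summarise response-time samples (milliseconds).

    Returns the mean, the nearest-rank 99th percentile and the
    population standard deviation, used here as a simple jitter measure.
    """
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p99: ceil(0.99 * n), computed with integer arithmetic.
    rank = max(1, (99 * n + 99) // 100)
    return {
        "mean": statistics.fmean(ordered),
        "p99": ordered[rank - 1],
        "jitter": statistics.pstdev(ordered),
    }


# Hypothetical data: every request takes 10 ms on a quiet host.
steady = [10.0] * 100
# Hypothetical data: most requests are faster, but two stall badly
# while a neighbouring virtual machine is busy.
noisy = [5.0] * 98 + [255.0] * 2

steady_profile = latency_profile(steady)
noisy_profile = latency_profile(noisy)

if __name__ == "__main__":
    print("steady:", steady_profile)
    print("noisy: ", noisy_profile)
```

Both sample sets average 10 ms, so a report based on the mean alone would rate the two hosts as equivalent. The percentile and jitter figures are what expose the inconsistency.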
Various strategies can be used to manage performance capacity more actively. In a private cloud, it is possible to be selective about how VMs are packed onto hardware and where workloads are deployed. In public clouds, buying larger instances might limit the number of neighbours. Another strategy is to forgo virtualisation altogether and run directly on physical hardware while trying to retain some of the self-service and on-demand qualities of cloud computing. The market is responding in this area with bare metal options from providers like Rackspace and IBM. The open source cloud platform OpenStack has a sub-project that brings bare metal provisioning into the same framework as virtual machine provisioning.
Latency and congestion
Along with processing and memory, network and storage latency requirements for an application should be evaluated. Latency is the amount of time it takes a packet to traverse a network end to end. Since individual transactions can use several packets (or blocks, for storage), the round trip time can add up quickly. While there are strategies to offset latency – TCP flow control, multi-threaded applications like web servers – some applications remain latency sensitive.
There are three different primary latency areas to examine. First, latency between the application and the end-user or client can create performance issues. In today’s applications, that connection is commonly a web server and browser. Modern web servers and browsers generally use multi-threading to get multiple page components at once, so the latency issue is obscured by running multiple connections. But there are instances where code is downloaded to the browser (Flash, Java, HTML5) that relies on single-threaded connectivity to the server. It is also possible the communication itself may be structured in a way that exacerbates latency issues (retrieving database tables row by row, for example). Finally, an application may have a custom client that is latency sensitive.
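The row-by-row case shows how quickly round trips add up. The sketch below is a back-of-the-envelope model only: it assumes each request costs exactly one round trip and ignores bandwidth and server processing time. The function name transfer_time_ms, the 40 ms round-trip time and the batch size of 500 are all illustrative assumptions.

```python
def transfer_time_ms(rows, rows_per_request, rtt_ms):
    """Estimate the time spent waiting on the network to fetch `rows` rows.

    Assumes one round trip per request and nothing else; real transfers
    also pay for bandwidth and server-side work.
    """
    requests = -(-rows // rows_per_request)  # ceiling division
    return requests * rtt_ms


# Hypothetical figures: 10,000 rows over a link with a 40 ms round trip.
row_by_row_ms = transfer_time_ms(10_000, 1, 40.0)
batched_ms = transfer_time_ms(10_000, 500, 40.0)

if __name__ == "__main__":
    print(f"row by row: {row_by_row_ms / 1000:.1f} s")
    print(f"batched:    {batched_ms / 1000:.1f} s")
```

Under these assumptions the same data takes 400 seconds to fetch one row at a time and 0.8 seconds in batches of 500, with no change to the network itself.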
The second primary latency area is storage latency, or the latency between the application and its data. This will show up when application code does not have direct local access to the back-end storage, be it a database or some other persistent store. This is a common case in cloud environments where the storage interface for the compute node tends not to be directly attached.
As a result, there is network latency due to the storage network and latency due to the responsiveness of the underlying storage hardware. In public clouds, storage traffic also competes with traffic from other customers of the cloud provider. Latency can build up around any of these areas, particularly if an application uses smaller reads and writes. Writing individual transactions to database logs is a good example of this, and special attention to I/O tuning for transaction logs is common. Storage latencies and bandwidth requirements can be offset by running multiple VMs, using multiple storage volumes, or using a service based on solid-state disk, but the choice will impact the functional and financial requirements of the cloud solution.
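The transaction log example can be turned into a rough upper bound. The sketch below assumes a simplified model in which each log write must complete before the next one starts, so throughput is capped by write latency; the function name max_commits_per_second, its parameters and the latency figures are hypothetical, and the volumes parameter assumes load spreads evenly across volumes.

```python
def max_commits_per_second(write_latency_ms, commits_per_write=1, volumes=1):
    """Upper bound on commits per second for strictly serial log writes.

    write_latency_ms:  time for one small write to be acknowledged
    commits_per_write: transactions grouped into a single write
    volumes:           independent volumes, assumed evenly loaded
    """
    writes_per_second = 1000.0 / write_latency_ms
    return writes_per_second * commits_per_write * volumes


# Hypothetical figures: 2 ms per write on network-attached storage,
# 0.5 ms per write on a solid-state service.
network_storage = max_commits_per_second(2.0)
solid_state = max_commits_per_second(0.5)
grouped = max_commits_per_second(2.0, commits_per_write=10)

if __name__ == "__main__":
    print(network_storage, solid_state, grouped)
```

In this model, grouping ten transactions into each write does more for throughput than moving to the faster storage, which is one reason transaction log tuning tends to receive separate attention.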
The third primary latency area is between cooperating applications; that is, application code may depend on some other application or service to do its job. In this case, the network latency between the applications (along with latencies in the applications themselves) can often cause slowness in the overall end-user or service delivery experience.
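How dependencies are called matters as much as how far away they are. As a rough sketch, assume each dependency is described by a hypothetical pair of (network latency, service time) in milliseconds: calls made one after another cost the sum of those pairs, while calls issued concurrently cost roughly the slowest one. The figures are illustrative only.

```python
def sequential_latency_ms(calls):
    """Total wait when each dependency is called after the previous one returns."""
    return sum(network + service for network, service in calls)


def parallel_latency_ms(calls):
    """Approximate wait when all dependencies are called concurrently."""
    return max(network + service for network, service in calls)


# Hypothetical dependencies: (network latency ms, service time ms).
calls = [
    (2.0, 8.0),   # service on the same LAN
    (2.0, 15.0),  # database on the same LAN
    (30.0, 5.0),  # service reached across a WAN
]

sequential_total = sequential_latency_ms(calls)
parallel_total = parallel_latency_ms(calls)

if __name__ == "__main__":
    print("sequential:", sequential_total, "ms")
    print("parallel:  ", parallel_total, "ms")
```

With these numbers the sequential path takes 62 ms and the concurrent path 35 ms, and in both cases the single WAN hop dominates. Moving one application to the cloud while its dependencies stay behind can introduce exactly that kind of hop.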
Depending on the specifics of the cloud solution being considered, these various latencies may be LAN, SAN, WAN, or even Internet dependent. Careful analysis of the latencies in an application and the likely impact of the planned cloud implementation is warranted. Often, as with raw performance above, consistency rather than raw speed is important.
In short, while moving applications into a private or public cloud environment may present an opportunity to save costs or improve operations, applications vary in their suitability for cloud infrastructures. The common technical concerns presented here can add complexity but are manageable with proper planning, design, and execution. Evaluating applications for cloud readiness allows evidence-based planning to take best advantage of cloud economics and efficiencies in the enterprise.