Defying data gravity: How can organisations escape cloud vendor lock-in?

The process of deriving the maximum possible business value from the data you hold is not a new challenge, but it is one that all too many organisations are still learning to address in the most sustainable way.

The concept of ‘data gravity’, coined by software engineer Dave McCrory in 2010, refers to the ability of bodies of data to attract applications, services and other data. The larger the amount of data, the more applications, services and other data will be ‘attracted’ to it and the faster they will be drawn. As the amount of data increases exponentially it gains mass, and becomes far more rooted in place. In a business context, it becomes harder and harder for that data to be moved to different environments.

Data also has far more mass than the compute instances that use it. For example, moving 1,000 virtual machines to the cloud is far easier than moving 1,000GB of data to the cloud, and the same is true when migrating out of the cloud. Because of data gravity, it has become more important than ever where that data resides, and how ‘portable’ it really is, if it is to be used to its full potential. Increasingly, the ‘where’ for many businesses is in the cloud.
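To make the idea of data ‘mass’ concrete, the following is a rough Python sketch of how transfer time grows with data volume. The function name, the 1 Gbps link and the 80% link-efficiency figure are illustrative assumptions, not measurements from any provider or network.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Rough time, in hours, to move `data_tb` terabytes over a network link.

    `efficiency` is an assumed fraction of the nominal bandwidth that is
    actually usable; real links vary widely.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8  # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Hypothetical data volumes over an assumed 1 Gbps link.
for size_tb in (1, 100, 1000):
    hours = transfer_hours(size_tb, link_gbps=1.0)
    print(f"{size_tb:>5} TB over 1 Gbps: {hours:,.1f} hours")
```

The time scales linearly with volume, so a dataset a thousand times larger takes a thousand times longer to move over the same link, whereas compute instances can usually be recreated at the destination rather than copied.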

Locked-in to the cloud?

Most forward-looking businesses agree that it is no longer enough to rely on the tools in their own on-premises datacentres if they want to thrive. While cloud security is still among the top concerns for CIOs, the cloud is starting to play a key role for organisations. The ease of migrating data to the cloud creates a common trap that businesses are falling into: the assumption that by moving data and compute to one cloud provider, their digital transformation journey is complete. On the contrary.

Cloud providers are constantly leapfrogging each other in their ability to provide the ‘next big thing’, so businesses need to define a clear strategy to ensure that data gravity is not tying them down to one cloud provider. Thinking back to data gravity, this is easier said than done! At present, the structure of the cloud market and the volumes of data we’re dealing with have brought many to a position where they are stuck using the compute functionality of the provider they’ve been using for data storage, due to the sheer cost and complexity of extracting and moving that data to another cloud provider. Rather than gaining flexibility and agility and letting the cloud vendors compete for their business, businesses are back in a state of lock-in and can only gain the level of agility that their particular provider has chosen to give them. In many ways their competitive advantage is in the hands of the cloud provider.

Remaining competitive in the age of digital transformation means being able to respond and adapt to the latest technologies available to you as a business. So what is the next step in breaking away from cloud vendor lock-in?

Regain the sovereignty of your data

The next development in cloud is the disaggregation of storage and compute, with the introduction of a sovereign storage cloud, which is provider-agnostic and ‘neutral’ while at the same time physically located within the same building as the cloud providers to avoid adding latency.

When Google wrote the famous 2003 white paper that laid the foundations for big data, they wrote that placing the data inside the nodes was the only option, as no external storage could handle hundreds of terabytes at the time. However, times have changed, and 15 years later this assumption no longer holds true. Taking data out of the servers (disaggregating data and compute) has not only become possible, it also holds the key to reducing cost and improving the efficiency of your clusters.

The data within this neutral cloud-adjacent storage can be utilised for compute instances on any cloud platform or environment, allowing businesses to pick and choose cloud compute instances based on which service provides the functionality they need at the time. This is the next iteration of a sustainable multi-cloud strategy for large enterprises, where data is immune to this lock-in.

With businesses gaining the ability to pick and choose cloud computing services whilst entrusting their data to neutral cloud-adjacent storage, this new model will also usher in a greater level of real-time competition for customer workloads between the public cloud providers themselves. The net result of this competition is a win-win for businesses in terms of greater choice, flexibility, and cost benefits.

By regaining the sovereignty of their data whilst still allowing for total flexibility and freedom to adopt the latest innovations in the cloud sphere, forward-thinking businesses will emerge head and shoulders above competitors.

23 Nov 2018, 2:01 p.m.

Eran - that is an interesting approach, but how should enterprises deal with other issues of multiple cloud providers: for example, differing functionality across providers?

Should they adopt a lowest common denominator approach, and hence not take advantage of 'the next big thing' from a provider that you refer to?

Or should they expect to have to continually invest in ongoing development to facilitate migration of workloads between cloud providers?

Or in practice is a cloud-agnostic approach not practical for most companies?


7 Jan 2019, 10:14 a.m.

You raise a great point @mc110, one that often confuses customers.
For the vast majority of functionalities critical to the business, all cloud providers align *over time*, so if you choose to use a feature that is unique to AWS today, it will likely become available in Azure in the next 12 months, essentially limiting the vendor lock-in.
However, the question then becomes: will you be able to leverage Azure (in that example) after 12 months, or will you remain locked in to AWS?
The only way to be able to leverage the other providers is by:
1) Avoiding data gravity lock-in by placing data outside of the cloud
2) Avoiding multiple copies of the data by placing data in a cloud-neutral location.

The best example is an ML model you train in AWS and then want to move to GCP, Azure, etc. If the training data (usually in the terabytes to petabytes range) is inside AWS, the egress cost alone will kill the project. If the data is in a vendor-neutral location but still adjacent to all the cloud providers, you can train on GCP and Azure, compare them to AWS on price and quality, and change.
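As a rough illustration of the egress arithmetic, here is a minimal Python sketch. The function name, the 500 TB dataset and the flat per-GB rate are hypothetical placeholders for illustration only, not quoted prices from any provider; real pricing is typically tiered and changes over time.

```python
def egress_cost_usd(dataset_tb: float, rate_usd_per_gb: float, moves: int = 1) -> float:
    """Estimate the cost of copying a dataset out of a cloud `moves` times."""
    if dataset_tb < 0 or rate_usd_per_gb < 0 or moves < 0:
        raise ValueError("inputs must be non-negative")
    dataset_gb = dataset_tb * 1000  # decimal units: 1 TB = 1,000 GB
    return dataset_gb * rate_usd_per_gb * moves


# Hypothetical inputs: a 500 TB training set and an assumed flat rate.
HYPOTHETICAL_RATE_USD_PER_GB = 0.05

one_move = egress_cost_usd(500, HYPOTHETICAL_RATE_USD_PER_GB)
two_moves = egress_cost_usd(500, HYPOTHETICAL_RATE_USD_PER_GB, moves=2)
print(f"Copy to one other cloud:  ${one_move:,.0f}")
print(f"Copy to two other clouds: ${two_moves:,.0f}")
```

The cost scales with both the data volume and the number of clouds you want to compare, which is why keeping a single copy in a neutral location changes the calculation.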

At the end of the day, avoiding vendor lock-in is a business imperative, and as is often the case with business imperatives, it's the IT department that provides the technical solutions and enablers.

Eran B