
Understanding Data Gravity: The Force Surrounding Big Data and Machine Learning

21 October 2020

Data-intensive applications like analytics, Artificial Intelligence (AI), and IoT are modernizing and, in many cases, driving digital transformation for global businesses. This is prompting new concepts and trends to surface as we shape our understanding of the impacts these applications will have on the future of IT architectures. Data gravity is one of those concepts.

Now a trending term in the IT lexicon, particularly since Digital Realty released its global Data Gravity Index based on a year’s worth of research, the ‘data gravity’ phenomenon is becoming top of mind for industry leaders.

According to the report, Data Gravity Intensity, measured in gigabytes per second, is expected to grow at a compound annual growth rate (CAGR) of 139% globally through 2024 “…as data stewardship drives global enterprises to increase their digital infrastructure capacity to aggregate, store, and manage the majority of the world’s data.”
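To put that growth rate in perspective, a CAGR compounds year on year, so 139% means the figure multiplies by 2.39 every year. The short Python sketch below works through the arithmetic. The four-year horizon (2020 through 2024) and the function name are assumptions made for illustration; the article itself states only the rate and the end year.

```python
def compound_growth(cagr: float, years: int) -> float:
    """Return the overall growth multiple after `years` of compounding at `cagr`.

    `cagr` is a fraction, so 139% is written as 1.39.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 + cagr) ** years


if __name__ == "__main__":
    CAGR = 1.39  # 139% per year, the rate quoted from the report
    HORIZON = 4  # assumed: 2020 through 2024, not stated in the article

    for year in range(1, HORIZON + 1):
        multiple = compound_growth(CAGR, year)
        print(f"after {year} year(s): {multiple:.2f}x the starting intensity")
```

Under those assumptions, four years of compounding works out to roughly a 33-fold increase over the starting value.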

We got to grips with this trend and its impact, gaining insights from Cloudscene founder Bevan Slattery on how the prevalence of data gravity is connected with Big Data and the increased application of Machine Learning, and on how these types of ecosystems are shaping the future for businesses and the IT industry.

What exactly is data gravity?

Data gravity is essentially a metaphor, referring to the ability of masses of data to attract applications, services, and other data. “It’s very much like gravity in the traditional sense,” Slattery says. The attraction between objects such as planets, which is explained by the law of gravity, is in principle very similar to the idea of data sets being drawn to other data.

“The denser the planet is, the greater gravity it has to attract other objects. Data gravity has the same connotation where a mass of data that is hosted or resides somewhere, in itself, attracts more data,” he explains.

Gauging the pull of data gravity by region

Digital Realty’s Data Gravity Index measures the intensity and gravitational force of enterprise data growth across 21 metros globally. Each metro’s score provides a relative proxy for measuring data creation, aggregation, and processing. These scores take into account enterprises’ data mass, data activity, total aggregate available bandwidth, and latency, all of which contribute to a region’s potential for data gravity.
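As a rough illustration of how those four factors might combine into a single score, the Python sketch below multiplies data mass, data activity, and bandwidth, then divides by latency squared. That form is the one commonly cited for the Index, but this article does not spell out the formula, so treat it as an assumption. The function name, units, and sample values are hypothetical and are not taken from the report.

```python
def data_gravity_score(mass: float, activity: float,
                       bandwidth: float, latency: float) -> float:
    """Hypothetical score: (mass * activity * bandwidth) / latency**2.

    The formula shape is an assumption; the inputs are unitless
    illustrative values, not figures from the Data Gravity Index.
    """
    if latency <= 0:
        raise ValueError("latency must be positive")
    return (mass * activity * bandwidth) / latency ** 2


if __name__ == "__main__":
    # Made-up inputs that differ only in latency.
    slow = data_gravity_score(mass=10, activity=2, bandwidth=5, latency=2)
    fast = data_gravity_score(mass=10, activity=2, bandwidth=5, latency=1)
    print(f"latency 2: {slow}")
    print(f"latency 1: {fast}")
```

In a formula of this shape, latency carries the most weight: halving it quadruples the score, whereas doubling any one of the other three factors only doubles it.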

The Index reveals that, with London scoring 167.05, the EMEA region is home to the greatest intensity of data gravity across the 21 metros, followed by APAC, whose highest-scoring city was Tokyo at 80.32, and then North America, which was led by New York City with a score of 79.61.

The intensity of data gravity across different regions can be attributed to market-specific activities such as the increasing digitization of enterprise workflows, mergers and acquisitions, and growth in the volumes of data being aggregated and stored globally.

Seeking greater value from data processing

As enterprises operating in the digital economy seek to obtain the greatest possible value from their data-intensive applications, their growing understanding of the impacts of data gravity has prompted different considerations for data processing, namely the choice between aggregated processing and distributed processing at the edge.

“Historically, enterprises have only really considered the cost and performance associated with the processing and transmission of data,” Slattery says. “But there is now a third dimension emerging, which surrounds the value of that data.”

Slattery believes that while operating closer to the edge of a network has been key for many providers to enable cheaper, faster, and more convenient data processing, some are now shifting their focus to the value of aggregating this data as a means to process it through Machine Learning.

Data lakes are becoming more important among Big Data companies that acknowledge the value of storing aggregated raw data sets to which AI can be applied for processing, which Slattery says will “…become a major driver of data services in the near future.”

Fundamentally, it is the value created by applying Machine Learning to large data sets that is driving the concept of data gravity today. What is proving most critical for organizations is the Machine Learning applied to these aggregated data sets; combining the two produces the insights into their businesses and industries that they need to thrive and remain competitive.

Looking ahead at Big Data and Machine Learning

Although the term was first coined by software engineer Dave McCrory back in 2010, data gravity has made a bold resurgence in the industry in more recent times. And according to Slattery, we have Machine Learning to thank for that.

“The last couple of years represent the first time that all the key ingredients of wide-scale Machine Learning have really come to the fore, and with that, we’re seeing these massive data lakes and data gravity take hold, which will only accelerate from now,” Slattery says.

In the next five years, he thinks we are going to see massive amounts of raw data being transferred into larger and larger data lakes as people begin to understand the exponential value that a bigger data lake offers when Machine Learning is applied to it.

Companies driving this change will likely be those that specialize in Big Data and already recognize the value in applying AI, such as Palantir Technologies, Snowflake, and even cloud giants like Google, Microsoft, Amazon, and Oracle. 

“These multi-billion dollar companies are creating tens – if not, soon to be hundreds – of billions of dollars worth of enterprise value by applying the latest Machine Learning to the largest data set they can get their hands on, and this is only the beginning,” Slattery says.

As more organizations around the world start to integrate Machine Learning into their processing systems, data services and operations will only continue to evolve. As a result, data gravity is bound to intensify as we move into a future driven by those who see value in applying AI to critical data sets and workloads.
