Experts discuss managing data explosion as Digital Transformation takes hold

Data is, without a doubt, extremely valuable and is becoming more so as Digital Transformation dominates the technology industry. Its growth in sheer volume is causing concern among IT leaders, who are tasked with storing, preserving and managing the data. A resulting concern is, unavoidably, cost.

Businesses across the globe are attempting to discover new ways to tackle the explosion of data and companies such as Scality are offering a helping hand. As a leader in software solutions for global data orchestration and distributed file and object storage, Scality has reported how the Scality RING brings unique efficiencies in today’s modern healthcare and genomic data centres. 

Over 40 global hospitals, hospital systems and genomics research institutions in the US, UK, Germany, France, Switzerland, Israel, Japan and South America have implemented the Scality RING. These customers trust Scality with their mission-critical diagnostic imaging data managed by leading PACS and VNA solutions, as well as key genomics data for use in development of new biopharmaceuticals. Many of these customers have experienced the ease of initial deployment and the scaling of RING with seamless capacity expansions to petabytes of storage.

IT leaders in these healthcare institutions share significant data growth challenges for radiological imaging, genomic sequencing and other healthcare-related services and applications. To modernise and address these challenges, IT leaders are now deploying scale-out, software-defined storage for on-premises private and hybrid cloud environments.

Respondents reported that scale-out storage deployments are 52% faster than traditional storage, require 46% less staff time managing the storage platform and result in a 28% lower TCO (saving US$270,000 per petabyte over three years). Expand these savings over five years and these IT leaders are saving millions of dollars in resource and capital expenditures with software-defined storage solutions.
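
The arithmetic behind these figures can be sketched in a few lines of Python. This is a rough illustration rather than the survey's own methodology: the function names, the 5 PB capacity and the assumption that savings scale linearly with capacity and time are all hypothetical. Only the 28% reduction and the US$270,000 per petabyte over three years come from the figures above.

```python
# Illustrative arithmetic only. The linear scaling and the example capacity
# are assumptions, not part of the reported survey results.

SAVING_PER_PB_3YR_USD = 270_000  # reported saving per petabyte over three years
TCO_REDUCTION = 0.28             # reported 28% lower TCO


def implied_traditional_tco_per_pb(saving=SAVING_PER_PB_3YR_USD,
                                   reduction=TCO_REDUCTION):
    """Three-year TCO per petabyte of traditional storage implied by the figures."""
    return saving / reduction


def projected_saving(capacity_pb, years, saving_per_pb_3yr=SAVING_PER_PB_3YR_USD):
    """Saving for a given capacity and period, assuming simple linear scaling."""
    annual_saving_per_pb = saving_per_pb_3yr / 3
    return capacity_pb * annual_saving_per_pb * years


if __name__ == "__main__":
    baseline = implied_traditional_tco_per_pb()
    print(f"Implied traditional 3-year TCO per PB: US${baseline:,.0f}")
    # Hypothetical 5 PB deployment over five years
    print(f"Projected 5-year saving for 5 PB: US${projected_saving(5, 5):,.0f}")
```

Under these assumptions, a multi-petabyte deployment reaches savings in the millions of dollars within five years, which is consistent with the claim above.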

“The growing nature of data in the age of COVID-19 puts more pressure on the healthcare and genomics industries to modernise with cost-effective solutions,” said Amita Potnis, IDC Research Director. “Our survey suggests that IT leaders who are required to build on-prem private or hybrid clouds can rest a little easier with a cost-effective software-defined, scale-out object storage solution like Scality offers.”

Industry experts discuss this further and offer some solutions to the management of data in today’s digital environment.

Assaad El Saadi, Regional Director – Middle East, Pure Storage: “Remember those photos you took during your holidays last year and are now stored on your phone? One day you might want to see them again or send them to someone. But for the most part, these photos are just taking up space. If this happens often enough, the day comes when you have no idea what is stored there due to the massive number of files.

“This same thing happens, on a larger scale, to companies. Data is being saved or collected every day and it takes time for someone to become aware that there’s a huge amount of useless information stored on servers. This is what we call ‘dark data’. Gartner defines this data as the information assets that organisations collect, process and store during regular business activities, but are not generally used for other purposes such as analytics, business relationships and direct monetisation.

“Storing and protecting unused data often incurs more expense and sometimes greater risk than value. Still, the existence of dark data cannot be ignored. According to Heinz College of Carnegie Mellon University, about 90% of corporate information falls into this category as organisations generally retain this data for compliance purposes only. For Deloitte, the generally accepted figure has long been 80% — known as ‘the 80% rule’ — though recent estimates put the number closer to 90%.

“This type of data should not be limited to regulatory and compliance use cases. It could be quite useful for gaining insight for decision-makers. In this sense, data analysis is fundamental. Knowing what type of data will be relevant and should be stored is a differential that can directly impact company spending. Moreover, turning this data into high-quality information and insights is another point that needs to be taken into account.

“In a 2019 global survey on data quality by Serasa Experian, it was found that 95% of companies believe that poor quality of data in businesses negatively impacts consumer interaction, reputation and the efficiency of operations. Thus, it is becoming increasingly evident that the best way to deal with the situation is to apply an analytical base to the data even before storing it, and the sooner this information is structured, the sooner one can know what should be available and what should be stored.

“With the billions of files that many companies keep, it’s not possible to do a manual analysis. But there are several tools, built on leading-edge technologies such as all-flash, Artificial Intelligence and Machine Learning, that you can leverage to categorise what can be used and eliminate what isn’t of use to the company. This information management becomes essential to the future of a business as it enables intelligent access to the company’s most valuable asset and maintains its continuity, providing important decision-making tools as data advancement progresses.”

Michael Cade, Senior Global Technologist, Veeam: “As more of our work and personal lives have become digital, we’ve seen a staggering growth in the amount of data we’re generating, storing and accessing. According to various studies, Google processes 3.5 billion searches every day, while 4.3 million videos are watched on YouTube. By 2025, it’s estimated that 463 exabytes of data will be created each day globally. And with around 40% of the world’s population still to be connected online, the amount of data we’ll need to store and manage will skyrocket further.

“The staggering amount of data we’re generating is already causing challenges, with data centre technologies requiring significant power and cooling, as well as ongoing maintenance and monitoring. We could be moving towards a huge bottleneck in the capabilities that are available, as both the volumes and speed of access to data increase further. What’s more, hardware such as servers, hard drives and flash storage can degrade.

“One alternative to our current storage devices could be DNA-based data storage. Being ultra-compact and easy to replicate – thanks to its primary role in creating life – gives DNA two big advantages. One gram of DNA could potentially hold as much as 455 exabytes of data, according to the New Scientist. That’s more than all the digital data currently in the world, by a huge margin. And while DNA is itself quite fragile, when stored in the right conditions it can be incredibly stable. Thousand-year-old fossilised remains have been found with DNA still intact. The longevity of cassettes and CDs just doesn’t compare, and so from an archiving and backup perspective, it could be the perfect material.

“Progress on the technology has been extremely promising, with Microsoft and University of Washington researchers last year developing the world’s first DNA storage device that can carry out the entire process automatically.

“While techniques might be steadily improving, the time and cost of decoding the information needs to come down before DNA data storage can be used commercially.

“The business of backup could be transformed by DNA. Archives and data centres, and their immense physical footprints, could be eliminated. The sum of the world’s knowledge may well one day be stored on something you need a microscope to observe. And as we generate even more data and reach the limit of our current storage technologies, the value of powerful alternatives will only become greater. Today’s complex backup efforts could be reduced down to a single record, created once, that lasts well beyond any living memory. The next generation of storage technology is in some ways already here – we just need to learn how to harness it.”

Don Schuerman, CTO and Vice President of Product Strategy and Marketing at Pegasystems: “As businesses continue to roll out their Digital Transformation plans – which have been accelerated by the Coronavirus pandemic – an inevitable side-effect has been the explosion of data, specifically customer data, as consumers have been forced to shift almost their entire lives online.

“But organisations must not let this valuable information go to waste. It can be harnessed to help improve customer interactions. However, because of the increase in the scale and complexity of customer data and the need for real time insights, businesses need AI to step in.

“Used wisely, Artificial Intelligence (AI) yields a deeper understanding of customers across different contexts and channels. AI can read signals and sense your customer’s unique intent – to purchase, to upgrade, to get support, even to cancel – before they act. By feeding the technology with real-time data, AI can serve up unique, relevant actions – offers, yes, but also conversation and guidance – automatically, or guide customer service representatives (CSRs) to make the right offer at the right time. In highly regulated industries, AI can also be an invaluable transparency tool to demonstrate why you are presenting particular offers to specific customers and prove that no unconscious bias is at work.

“An additional approach that organisations can take to handle the influx of customer data is by introducing process automation. For example, this type of automation can digitise forms for employees and automate their routine processes, meaning customer service representatives don’t have to spend as long trawling through information to get to what they need. Instead, they have more time to spend interacting with the customers who need serving most, providing a truly tailored service. 

“Lastly, as the volume of data grows, organisations need to be sure they have the correct governance in place to guarantee compliance and keep customer data safe. For any successful Digital Transformation journey, governance is an important aspect to have as it helps organisations to properly manage data assets and ensure they have the right customer data processes in place.”

David Friend, Founder and CEO of Wasabi: “We used to think about data storage as an unfortunate but necessary expense, what I call the ‘scarcity’ mentality. As the cost of data storage continues to drop, it frees us to think about data as a resource, what I call the ‘abundance’ mentality. Taken to its extreme, if the cost of data storage were zero, everybody would store everything forever. Why not? 

“For most companies, data is the lifeblood of Digital Transformation. The increasing sophistication of technologies like AI depends on a need to collect and store huge troves of data. In my job running a cloud storage company, I frequently talk with customers who now regret having thrown away data because the cost of storage outweighed its likely value. AI has changed that equation. It turns out that there could have been much to learn from that old discarded data. Today, the trend is to keep more data because AI is moving so fast that you risk grossly underestimating its potential future value. 

“By 2025, IDC estimates that the amount of data stored worldwide will explode to 175 zettabytes (that’s 175 billion terabytes). This represents a compound annual growth rate of 61%, which is such a frenetic pace that it is forcing organisations to rethink how they store the data they generate.  Storing all of their data on-premises, as enterprises traditionally did, becomes a logistical challenge when storage is expanding at that pace. Many are coming to the conclusion that it’s time to migrate data storage to the cloud where there is virtually unlimited capacity. 

“Cloud data storage comes at a fraction of the overall cost of on-premises storage. There’s little strategic advantage to managing one’s own storage infrastructure. Moving to cloud not only saves money, but it increases reliability and, by definition, makes data far more accessible to people who are working remotely. Companies save on manpower, space, electricity, the distraction of periodic equipment upgrades and data migrations, and having to make and manage off-site backups. Organisations can focus on projects that benefit the revenue-generating side of their business.

“Moreover, to cope with the explosion of data that Digital Transformation heralds, many organisations – wary of putting all their eggs in one basket – are considering a multi-cloud approach. Data that resides in the cloud still needs to be backed up and it would be foolish to keep backups in the same cloud as the primary data. So it’s increasingly common to see organisations that might have their primary data in Amazon’s cloud but back up their data to another cloud like Microsoft or Wasabi.

“A multi-cloud approach also gives you more business leverage over your cloud providers. If one vendor has all the copies of your data, it’s easy to envision circumstances, such as billing disputes, in which the cloud vendor holds all the cards. Furthermore, the hyperscalers are also engaged in businesses that may be directly competitive with their customers.

“Companies are clearly waking up to the benefits of the multi-cloud approach to support their transformation efforts. According to a recent IBM survey, 85% of companies are already using multiple clouds for their business needs and this is set to increase in the coming years.

“The hyperscalers have made their APIs incompatible. One might reasonably conclude that they want to make it as hard as possible for customers to migrate from one to another. Not only are the APIs incompatible, but they also charge egress fees if you want to get your data back. For me, the opportunity to store all the world’s data (or as much as we can win) is a sufficiently large opportunity.”

David Craig, CEO, Iceotope: “Digital Transformation is creating a roadmap for divergent requirements in data centres that no longer conform to the historic ‘one design suits all’. Customers are now more knowledgeable about specific approaches to data management, whether that is HPC, streaming low latency content, or general enterprise access requirements – the data infrastructure provided must help them add value to their results.

“Gartner predicts that public cloud services will grow to a US$368 billion market by 2022, and that all major countries will experience between 15% and 33% growth. This can only serve to drive colocation expansion and that will develop the regional cores that service a multitude of Edge deployments.

“For most corporations, hybrid IT has become mainstream and future IT infrastructures will be multi-cloud based. Applications and workloads will be located where they have the best fit and can deliver the best possible business outcomes.

“Western Europe has a highly developed colocation market and deregulated telecom infrastructure, which provided for fast expansion, but must also cope with compliance requirements from individual countries, for which localised data centres and Edge facilities can provide the solution. This explosion in data generation and the expansion in infrastructure deployed requires data centre providers to take the lead on facilities design and HVAC and become more transparent and open-minded to tailored offerings to their customers.

“The Uptime Institute recently stated that the average data centre PUE in 2020 is 1.58. This has not significantly improved in the last seven years. Many data centre developers are still wedded to a chilled air-cooled approach to technology spaces, which rolls out the older style fan-assisted servers. This legacy approach consumes large amounts of water and up to 30% of a data centre’s energy in cooling, while restricting the server capability at a time when greatly increased data throughput is expected. A more enlightened approach, whether in data centres or at the Edge, is sealed chassis-level immersive liquid cooling technology, which has a significantly lower PUE of 1.03.
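
As an aside to the PUE figures quoted above: Power Usage Effectiveness is the ratio of total facility energy to the energy consumed by the IT equipment alone, so a PUE of 1.0 would mean no overhead at all. The short Python sketch below compares the two quoted values. The function names and the 1,000 kWh IT load are hypothetical, chosen only for illustration.

```python
# Illustrative comparison of the two PUE values quoted in the article.
# PUE = total facility energy / IT equipment energy.
# The IT load below is a hypothetical figure, not taken from the article.


def facility_energy_kwh(it_energy_kwh, pue):
    """Total facility energy needed to deliver a given IT energy at a given PUE."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


def overhead_energy_kwh(it_energy_kwh, pue):
    """Energy spent on everything other than IT equipment (cooling, power losses)."""
    return facility_energy_kwh(it_energy_kwh, pue) - it_energy_kwh


if __name__ == "__main__":
    it_load = 1_000  # hypothetical IT energy in kWh
    for label, pue in [("2020 average", 1.58), ("chassis-level liquid cooling", 1.03)]:
        overhead = overhead_energy_kwh(it_load, pue)
        print(f"{label}: PUE {pue:.2f}, overhead {overhead:.0f} kWh per {it_load} kWh of IT load")
```

The calculation shows how much energy goes to overhead rather than computing at each PUE level. It does not model water use or server density, which the remarks above also raise.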

“Liquid cooling is 1,000 times more efficient than air cooling and eliminates the requirement for refrigerants. It removes the need for server fans while dramatically increasing the compute density that can be effectively managed in each server rack.

“Digital transition and the requirement for IoT and 5G networks in local environments will situate more data centre capacity at the Edge. The current IT equipment manufacturers’ approach to repackaging fan-reliant servers demonstrates that data centre operators need to take a step back and ask – what is the most effective and sustainable design for this site?

“Data centres compete for space and resources with people, but hide in out-of-the-way locations. Building out the Edge will place them among us and therefore they must be as efficient and non-intrusive as possible.”
