Deduplication and compression – are you asking the right questions?


Posted on Thursday, May 18, 2017

The exponential growth of corporate data holdings presents a significant challenge for the CTO who is tasked with finding a way to store it all. The realisation that 10-30% (or more) of all data held is duplicated is an obvious concern, particularly when upper capacity limits are reached.

Two options – cloud or deduplication systems

One of the most popular solutions to this conundrum is simply to offload some datasets to the cloud. Because hosted storage services can scale on demand, your business can keep all the data it generates – including duplicates – without fear of running out of capacity.

Vendors have a different take, however. Dedicated deduplication systems can identify and remove duplicate copies of files. They can even detect duplicate data at the point of writing to disk and insert a pointer to the existing copy rather than store another one. These units are powerful and effective, and they can be expensive.
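To make the pointer idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular vendor's product works: the DedupStore class, its method names and the 4 KB block size are all assumptions made for this example. Each block is hashed as it is written, and if an identical block is already held, only a reference to it is recorded.

```python
import hashlib


class DedupStore:
    """Toy in-memory store (hypothetical): identical blocks are kept once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes, stored once
        self.files = {}   # file name -> list of digests (the "pointers")

    def write(self, name, data, block_size=4096):
        """Split data into blocks and store only blocks not seen before."""
        pointers = []
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            digest = hashlib.sha256(block).hexdigest()
            # The hash and lookup run before anything is stored.
            if digest not in self.blocks:
                self.blocks[digest] = block
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        """Rebuild a file by following its pointers."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually held, after deduplication."""
        return sum(len(block) for block in self.blocks.values())
```

Writing the same file twice under two names leaves stored_bytes() unchanged after the second write, while both names still read back in full.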

A third way – storage redeployment

It is worth noting that the cost of deduplication units is not purely financial. These systems introduce latency into the write process because information is checked before it is stored, which slows disk operations. When speed is critical to operations, this latency and processing overhead may simply be unacceptable.

The emergence of software-defined storage (SDS) offers a third option, however. Using cloud-like storage allocation techniques, you can redeploy post-warranty arrays to increase overall capacity.

SDS can be configured to create as many or as few replica copies of data as you require, further reducing the risk of data loss or an availability event. Data can then be moved automatically across the storage fabric according to application needs, helping to maintain a high quality of service for users.

At the same time, SDS allows you to shift low-priority data onto the redeployed, lower-performing storage when applications no longer require it. You can also retain everything as is, maintaining the status quo.
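As a rough illustration of that kind of placement rule, the Python sketch below assigns a dataset to a tier based on how long it has been idle. Real SDS platforms have their own policy engines; the Dataset type, the tier names and the 90-day threshold here are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    days_since_last_access: int


def choose_tier(dataset, cold_after_days=90):
    """Return 'redeployed' for data idle longer than the threshold, else 'primary'.

    The threshold is an assumed value, not a recommendation.
    """
    if dataset.days_since_last_access > cold_after_days:
        return "redeployed"
    return "primary"
```

With these assumptions, an archive untouched for 200 days would be placed on the redeployed arrays, while a database accessed three days ago would stay on primary storage.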

Because you are redeploying existing assets, the financial impact of storing duplicate data is virtually nothing.

And rather than deploying in-line data deduplication, consider making your data protection solutions work harder. Many already include deduplication options. Put these into action and you can create a retention policy that not only prevents duplication, but also stores backups on low-cost disk arrays to reduce overall storage costs. In this way your retention policy also aligns with the value of the data being stored.

Next steps

To learn more about SDS, data deduplication and the redeployment of your post-warranty storage assets, please get in touch.