Next-Gen De-Dupe and Compression: Object-Based De-Dupe

By John Everett, storage business manager, Dell EMEA

The Next Generation of Deduplication and Compression

With data growing at an unprecedented rate, organisations of all sizes are looking to maximise the efficiency of how they store and manage data throughout its entire lifecycle. This ongoing challenge has led to the proliferation of technologies such as thin provisioning, automated tiering and scale-out storage, which can deliver both Capex and Opex savings through smart resource management for better utilisation rates, increased energy efficiency and simplified administration.

Now, advances in deduplication and compression technologies are allowing organisations to push utilisation rates even higher through what Dell calls 'content-aware storage optimisation' - also known as object-based deduplication - shrinking meaningful amounts of data for significant cost and management savings.

At a basic level, deduplication is the process of eliminating duplicate copies of data and replacing them with pointers to a single copy. It serves two primary goals: reducing the storage capacity needed to hold an organisation's data, and decreasing the amount of data in flight during backup or replication. The dominant use case for deduplication today is backup storage, because of the large amount of static data that organisations have to back up. Nevertheless, deduplication has spread to other data centre storage platforms such as NAS.
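The idea of replacing duplicates with pointers to a single copy can be illustrated with a minimal Python sketch. It is a toy in-memory model, not any vendor's implementation: the class name `SingleInstanceStore`, its methods and the choice of SHA-256 as the fingerprint are all assumptions made for illustration.

```python
import hashlib


class SingleInstanceStore:
    """Toy deduplicating store: each unique content is kept exactly once."""

    def __init__(self):
        self.blobs = {}     # fingerprint -> bytes (the single stored copy)
        self.pointers = {}  # item name -> fingerprint (the "pointer")

    def put(self, name, data):
        # Identical content always produces the same fingerprint.
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        self.pointers[name] = digest

    def get(self, name):
        # Follow the pointer back to the single stored copy.
        return self.blobs[self.pointers[name]]

    def stored_bytes(self):
        # Capacity actually consumed, counting each unique content once.
        return sum(len(blob) for blob in self.blobs.values())
```

Storing the same 100-byte item under two names consumes 100 bytes rather than 200, while both names still resolve to the full content.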

Some deduplication processes examine files in their entirety to determine whether they are duplicates; this is referred to as file-level deduplication, or 'Single Instance Storage'. Others break the data into blocks and look for duplicates among the blocks; this is referred to as block-level deduplication. Block-level deduplication typically provides more granularity and a greater reduction in utilised storage capacity than file-level deduplication, which is particularly appealing from a backup perspective. Both types are commonly used and offered today; however, there is a growing appreciation that these approaches may not be sufficient to handle the growth of big data in verticals such as oil and gas, life sciences, and media and entertainment.
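The difference in granularity can be seen in a small Python sketch that compares the two approaches on a file and a slightly edited copy of it. The 4 KB fixed block size, the function names and the sample data are assumptions chosen for illustration; real products may use different block sizes or variable-size chunking.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into consecutive fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def file_level_bytes(files):
    """Bytes stored when only whole identical files are deduplicated."""
    unique = {hashlib.sha256(f).digest(): len(f) for f in files}
    return sum(unique.values())


def block_level_bytes(files, size=BLOCK_SIZE):
    """Bytes stored when identical blocks are deduplicated across all files."""
    unique = {}
    for f in files:
        for block in split_blocks(f, size):
            unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())


# Four distinct 4 KB blocks, then a copy with only the final byte overwritten.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
edited = original[:-1] + b"\xff"

print(file_level_bytes([original, edited]))   # both files stored in full
print(block_level_bytes([original, edited]))  # three blocks are shared
```

In this example the two files differ in a single byte, so file-level deduplication keeps both in full while block-level deduplication keeps the four original blocks plus one changed block. Note that fixed-size blocks only line up when data is overwritten in place; an insertion shifts every later block, which is one reason some systems use variable-size chunking instead.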

A more intelligent form of deduplication has emerged in the form of object-based deduplication. Organisations can now take advantage of next-generation technology that is tailored to their particular vertical. This is achieved with a solution that bridges the gap between applications and native storage platforms to optimise the way data is stored. The optimisation technology identifies how a given file is structured, breaks it down into its component sub-files, and then selects, from a library of more than 100 different compression algorithms, the one that is most effective for the targeted file. Even if the file has never been identified before and there is no content-specific compressor for it, the technology will infer information about the structure and nature of the contents to select the most effective data-reduction algorithm. By understanding the layout of specific application files, such as those of an email program or a digital image, IT can make intelligent decisions about how to de-dupe and compress that data for optimal storage.
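Selecting the most effective compressor from a library can be sketched in a few lines of Python. This is a deliberately simple stand-in, not Dell's selection logic: it tries three general-purpose compressors from the Python standard library (`zlib`, `bz2`, `lzma`) rather than content-specific ones, and simply keeps the smallest result. The names `CANDIDATES`, `pick_best` and `restore` are hypothetical.

```python
import bz2
import lzma
import zlib

# A tiny stand-in "library" of compressors: name -> (compress, decompress).
CANDIDATES = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def pick_best(data):
    """Try every candidate and return (name, blob) for the smallest output.

    Falls back to storing the data uncompressed ("raw") when no
    compressor makes it smaller.
    """
    best_name, best_blob = "raw", data
    for name, (compress, _decompress) in CANDIDATES.items():
        blob = compress(data)
        if len(blob) < len(best_blob):
            best_name, best_blob = name, blob
    return best_name, best_blob


def restore(name, blob):
    """Reverse pick_best using the recorded compressor name."""
    if name == "raw":
        return blob
    return CANDIDATES[name][1](blob)
```

Trying every compressor on every file is costly, which is why a production system would need some way to predict a good choice from the file's structure instead of testing exhaustively.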

The central components of Dell's data-processing system include two types of content-aware algorithms and a neural net framework for testing and selecting different compressors for best run-time efficiency. The two types of content-aware algorithms are de-layering algorithms, which dissect files to identify the contiguous sub-objects, and data-shrinking algorithms, which include deduplication and compression. These custom compressors are better able to achieve meaningful reductions in the kinds of data that burden specific verticals.
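To make the de-layering idea concrete, the following Python sketch treats a ZIP archive as a stand-in for a layered application file, since many document formats are ZIP containers. It unpacks each container into its members and deduplicates at the level of those sub-objects. This is an illustrative assumption, not a description of Dell's algorithms; the function names and the sample documents are hypothetical.

```python
import hashlib
import io
import zipfile


def make_container(members):
    """Build an in-memory ZIP container from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def delayer(container):
    """Dissect a container into its sub-objects: {member name: raw bytes}."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def unique_objects(containers):
    """Deduplicate sub-objects across containers by content fingerprint."""
    store = {}
    for container in containers:
        for payload in delayer(container).values():
            store.setdefault(hashlib.sha256(payload).hexdigest(), payload)
    return store


# Two different documents that embed the same (fake) logo image.
logo = b"\x89PNG" + bytes(1000)
doc_a = make_container({"text.xml": b"<doc>report A</doc>", "logo.png": logo})
doc_b = make_container({"text.xml": b"<doc>report B</doc>", "logo.png": logo})

store = unique_objects([doc_a, doc_b])
print(len(store))  # distinct sub-objects kept across both documents
```

The two containers are different files, so file-level deduplication would keep both in full; once they are de-layered, the shared logo is recognised as a single object.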

To reap the full benefits of deduplication, the technology should apply seamlessly across the entire IT infrastructure. To this end, Dell is rolling out storage optimisation technology across a variety of solutions for primary storage, archive, and backup. Deduplication and compression will be integrated in the Dell Scalable File System and Dell Object storage; once data is deduplicated, it can move in a deduplicated state from one storage system to another. For example, data that is deduplicated on Dell primary storage solutions can be backed up without rehydration to Dell backup storage, and that backup can then be replicated in a deduplicated state over a LAN/WAN to a Dell backup storage replica. It is this end-to-end optimisation of data, from the server to storage to the cloud, that brings the most value to an end-user organisation in a data-heavy world.
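Why moving data in a deduplicated state saves bandwidth can be shown with a short Python sketch. It models a replica as a dictionary of chunks keyed by fingerprint, and sends only the chunks the replica does not already hold, along with a "recipe" of fingerprints for rebuilding the data. The function names, the tiny 4-byte chunk size and the in-memory replica are assumptions for illustration only, not Dell's replication protocol.

```python
import hashlib


def chunk(data, size):
    """Cut data into consecutive fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def replicate(data, replica_chunks, size=4):
    """Send data to a replica, transferring only chunks it lacks.

    Returns (recipe, sent_bytes): the ordered list of chunk fingerprints
    needed to rebuild the data, and the payload bytes actually sent.
    """
    recipe = []
    sent_bytes = 0
    for block in chunk(data, size):
        digest = hashlib.sha256(block).hexdigest()
        recipe.append(digest)
        if digest not in replica_chunks:
            replica_chunks[digest] = block  # simulated transfer over the wire
            sent_bytes += len(block)
    return recipe, sent_bytes


def rehydrate(recipe, replica_chunks):
    """Rebuild the original data on the replica from its recipe."""
    return b"".join(replica_chunks[digest] for digest in recipe)


replica = {}
recipe_1, sent_1 = replicate(b"AAAABBBBCCCC", replica)
recipe_2, sent_2 = replicate(b"AAAABBBBDDDD", replica)
print(sent_1, sent_2)  # the second backup sends only its one new chunk
```

The data is only rehydrated when it is read back; on the wire and at rest it stays in chunk form.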

Deduplication and compression technology has been around for a few years; it is here to stay and is evolving rapidly. To be truly effective in today's business world as well as tomorrow's, organisations should look for a solution that adheres to three main tenets:
  • to be transparent to the end user and applications, meaning that there should not be any performance delays upon retrieval;
  • to be customised to specific verticals with more and better algorithms and logic; and
  • to be utilised end-to-end across the entire workflow to ensure optimisation of the overall IT environment.
