Gartner: Hype Cycle for Storage in 2017 – Part One

34 technologies profiled, 7 mature, 7 emerging, 12 in early mainstream
This is a Press Release edited by StorageNewsletter.com on 2017.11.09

This paper, Hype Cycle for Storage Technologies, 2017 (ID: G00313764), was published on July 19, 2017, by Gartner, Inc., and written mainly by analysts Pushan Rinnen and John McArthur. Here we publish the first part of the report; the second part appears today just after this top news.

Summary
This Hype Cycle evaluates storage-related hardware and software technologies in terms of their business impact, adoption rate and maturity level to help users decide where and when to invest.

Sectors analyzed:
Shared Accelerated Storage
Management SDS
Cloud Backup
Backup Tools for Mobile Devices
File Analysis
Open-Source Storage
Copy Data Management
Infrastructure SDS
Integrated Systems: Hyperconvergence
Data Sanitization
Integrated Backup Appliances
Storage Cluster File Systems
Cross-Platform Structured Data Archiving
Information Dispersal Algorithms
Object Storage
Solid-State DIMMs
Emerging Storage Protection Schemes
Hybrid DIMMs
Enterprise Endpoint Backup
Cloud Storage Gateways
DR as a Service
Public Cloud Storage
VM Backup and Recovery

Analysis

What You Need to Know
The report does not add any new emerging technology. Storage hardware and storage management software technologies are showing signs of aging. Out of the 34 technologies profiled, seven (20%) have entered the mature mainstream stage, while 12 (35%) have become early mainstream. In the maturing storage market, innovative technologies such as deduplication and SSDs have been adopted by many competitive products, resulting in vendor consolidation, reduced product differentiation and downward pricing trends. This environment is ideal for IT leaders to engage in and benefit from pricing negotiations. Only seven (20%) of the technologies profiled are still at the emerging stage. Specifically, two emerging technologies are showing fast movement on the Hype Cycle curve and increased adoption: file analysis and cloud backup. Both are driven by the increased fragmentation in terms of data location and data management as a side effect of cloud computing and SaaS applications such as Microsoft Office 365.

The Hype Cycle
This research informs I&O leaders and infrastructure technology vendors about what new storage technologies are entering the market, how Gartner evaluates highly hyped technologies or concepts, and how quickly existing technologies are being adopted. In addition to cloud backup and file analysis, other technologies that have experienced fast movement in the past year include DR as a service (DRaaS), integrated backup appliances, VM backup and recovery, online compression, object storage, and solid-state dual in-line memory modules (DIMMs). DRaaS (often cloud-based) has entered the early mainstream stage with an estimated 400 providers - 60% annual growth in the number of providers - triggering rapid price declines and increased pilot adoption. On the backup side, integrated backup appliances are resuming healthy adoption rates due to new product entrants from high-profile vendors. VM backup has become mature mainstream, especially for VMware VMs; this technology profile may be obsolete before reaching the plateau because most vendors have standardized on change-block-tracking-based backup and because the major VM-only backup vendor has introduced physical server support. Object storage is showing a revival due to standardization on the S3 API and cloud storage vendors' promotion of solutions. Solid-state arrays' movement on the Hype Cycle has slowed as the installed base becomes large. Hyperconverged integrated systems have fallen off the peak of the Hype Cycle as the technology matures, while two other high-profile technologies - management software-defined storage (SDS) and open-source storage - have barely moved in the past year, indicating a wide gap between vendors' marketing hype and the degree to which the message resonates with users.

High-profile technologies with high and transformational business impacts tend to reach their Hype Cycle peak and trough quickly, while low-profile technologies or technologies with low or moderate business impact typically move much more slowly and sometimes never reach their peak before approaching the Trough of Disillusionment or becoming obsolete. One technology marked 'obsolete before plateau' this year is cloud storage gateways. Although such products are still being sold in the market, they are mainly deployed to address niche workloads that are not readily addressed by traditional on-premises products.

Figure 1. Hype Cycle for Storage Technologies, 2017

The Priority Matrix
The Priority Matrix maps the benefit rating for each technology against the length of time before Gartner expects it to reach the beginning of mainstream adoption. This alternative perspective can help users determine how to prioritize their storage hardware and storage software technology investments and adoption. In general, companies should begin with fast-moving technologies that are rated transformational or high in business benefits and are likely to reach mainstream adoption quickly. These technologies tend to have the most dramatic impact on business processes, revenue or cost-cutting efforts.

After these transformational technologies, users are advised to evaluate high-impact technologies that will reach mainstream adoption status in the near term, and work downward and to the right from there. This year, management SDS's benefit rating was lowered from high to moderate, while the cloud storage gateway's benefit rating was lowered to low.

Figure 2 shows where the storage technologies evaluated in this year's Hype Cycle fall on the Priority Matrix. Note that this is Gartner's generic evaluation; each organization's graph will differ based on its specific circumstances and goals.

Figure 2. Priority Matrix for Storage Technologies, 2017

Off the Hype Cycle
Technology profiles that have fallen off the chart because of high maturity and widespread adoption may still be discussed in related Gartner IT Market Clock research. This year, the enterprise file sharing and synchronization technology profile has been retired.

On the Rise
Shared Accelerated Storage
Analysis by Julia Palmer

Definition: Shared accelerated storage is an approach that has emerged to tackle new data-intensive workloads by bringing high-performance, high-density, next-generation solid-state shared storage to compute servers over a low-latency network. The technology delivers the benefits of shared storage with the performance of server-side flash by combining standardized, very-low-latency NVMe PCIe devices with a low-latency NVMe over Fabrics (NVMe-oF) network.

Position and Adoption Speed Justification: Shared accelerated storage is an emerging architecture that takes advantage of the latest nonvolatile memory, currently flash technology, to address the needs of extreme-low-latency workloads. NVMe technology is evolving fast and replacing server-side flash used to accelerate workloads at the compute layer, but it has limited capacity and is managed as a silo on a server-per-server basis. The NVM Express consortium has now extended the specification beyond locally attached devices to create the NVMe over Fabrics (NVMe-oF) standard. Enterprise workloads can now benefit from NVMe-oF protocol capabilities such as the performance and simplicity of server-side NVMe acceleration, together with the efficiency and scalability of shared solid-state arrays. The NVMe-oF protocol can take advantage of high-speed RDMA networks, and will accelerate the adoption of next-generation architectures, such as disaggregated compute, scale-out software-defined storage and hyperconverged infrastructures, bringing HPC-like performance to the mainstream enterprise. Unlike server-attached flash storage, shared accelerated storage can scale out to high capacity, uses ultra-dense flash, has HA features and can be managed from a central location, serving dozens of compute clients. Current shared accelerated storage specifications promise to deliver more than 1 million IO/s per rack unit. Shared accelerated storage is a nascent technology, with just a few vendors and products in the market. It is expected to grow as the cost of NVMe continues to decline. Expect it to be delivered as a stand-alone storage array product, as software-defined storage or as part of a hyperconverged integrated system offering in the next five years.

User Advice: Shared accelerated storage products feature disaggregated compute and storage architecture that allows flash memory acceleration to be deployed centrally, but accessed by many servers, connected via high-bandwidth, low-latency networks. The typical design would have dozens of high-capacity PCIe-based flash modules forming a single pool of storage that is presented to servers over an NVMe-oF network. NVMe fabric technology provides low-latency access to the shared storage resource pool by avoiding the overhead associated with traditional protocol translation. Shared accelerated storage products will appeal to the most demanding workloads that require extreme performance, storage persistency, ultra-low latency and a compact form factor. Today, shared accelerated storage is well suited for extremely-high-performance and big data analytics workloads, such as applications built on top of Hadoop, and in-memory database use cases, such as SAP Hana.
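
To make the connection model described above concrete, the following is a minimal sketch, assuming a Linux host with the nvme-cli utility installed and an RDMA-capable fabric; the target address and subsystem NQN are placeholders for illustration, not references to any specific product. It shows how a host discovers and attaches a namespace exposed over NVMe-oF, after which the shared pool appears as a local NVMe block device.

    import subprocess

    # Hypothetical values for illustration only.
    TARGET_ADDR = "192.168.10.20"                         # storage node on the RDMA fabric
    TARGET_NQN = "nqn.2016-06.io.example:shared-pool-01"  # NVMe subsystem qualified name

    # Ask the target's discovery controller which subsystems it exposes.
    subprocess.run(
        ["nvme", "discover", "-t", "rdma", "-a", TARGET_ADDR, "-s", "4420"],
        check=True,
    )

    # Attach the advertised subsystem; its namespaces then show up on the
    # host as /dev/nvmeXnY block devices served over the fabric.
    subprocess.run(
        ["nvme", "connect", "-t", "rdma", "-a", TARGET_ADDR, "-s", "4420",
         "-n", TARGET_NQN],
        check=True,
    )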

While the benefits of 10 times higher IO/s and five times lower latency than industry-leading solid-state arrays are very attractive for tier 0 HPC, the technology is still in the early stages of adoption, so cost and capacity are inhibiting mass deployment. In the next five years - as adoption accelerates, awareness increases and the price point drops - any application that benefits from ultra-low-latency persistent storage can take advantage of this architecture. Buyers need to be aware that nonvolatile memory technology is evolving, and the solution architecture must be flexible enough to adapt to the newest, most effective technologies, such as 3D XPoint.

The most prominent use cases for shared accelerated storage will be online transaction processing (OLTP) databases, data mining, real-time analytics, HPC applications for video editing, financial processing and analyses, online trading, oil and gas exploration, genomic research and fraud detection.

Business Impact: Shared accelerated storage can have a dramatic impact on business use cases where large bandwidth, high IO/s and low latency are critical to the bottom line of the enterprise. It is designed to power emerging tier 0 workloads - enterprise analytics; real-time, big data analyses; and high-volume transactions - that require performance, capacity and availability beyond what modern general-purpose solid-state arrays provide. While it requires some retooling on the compute side (PCIe/NVMe) to integrate with this type of storage, its benefits are likely to attract high-performance computing customers that will be able to show a positive ROI. Unlikely to be relevant as general-purpose storage in the near future, it is targeted at analytics and transactional workloads where low latency is a crucial requirement and a business differentiator.

Benefit Rating: High
Market Penetration: Less than 1% of target audience
Maturity: Emerging
Sample Vendors: Cray; E8 Storage; Excelero; Mangstor; Micron; Pivot3; Pure Storage; WekaIO; X-IO Technologies; Zstor

Management SDS
Analysis by Julia Palmer and John McArthur

Definition: Management SDS coordinates the delivery of storage services to enable greater storage agility. It can be deployed as an out-of-band technology with robust policy management, I/O optimization and automation functions to configure, manage and provision other storage resources. Products in the management SDS category enable abstraction, mobility, virtualization, storage resource management (SRM) and I/O optimization of storage resources to reduce expenses; external storage virtualization software products are therefore a subset of the management SDS category.

Position and Adoption Speed Justification: While management SDS is still largely a vision, it is a powerful notion that could revolutionize storage architectural approaches and storage consumption models over time. The concept of abstracting and separating physical or virtual storage services by bifurcating the storage control plane (action signals) from the data plane (how data actually flows) is foundational to SDS. This is achieved largely through programmable interfaces (such as APIs), which are still evolving. Management SDS requests will negotiate capabilities through software that, in turn, will translate those capabilities into storage services that meet a defined policy or SLA. Storage virtualization abstracts storage resources, which is foundational to SDS, whereas the concepts of policy-based automation and orchestration - possibly triggered and managed by applications and hypervisors - are key differentiators between simple virtualization and SDS.

The goal of Management SDS is to deliver greater business value than traditional implementations via better linkage of storage to the rest of IT, improved agility and cost optimization. This is achieved through policy management, such that automation and storage administration are simplified with less manual oversight required, which allows larger storage capacity to be managed with fewer people. Due to its hardware-agnostic nature, management SDS products are more likely to provide deep capability for data mobility between private and public clouds to enable a hybrid cloud enterprise strategy.

User Advice: Gartner's opinion is that management SDS targets end-user use cases where the ultimate goal is to improve or extend existing storage capabilities and improve Opex. However, the value propositions and leading use cases of management SDS are not clear, as the technology itself is fragmented into many categories. The software-defined storage market is still in a formative stage, with many vendors entering the marketplace and tackling different SDS use cases. When looking at different products, identify and focus on the use cases applicable to your enterprise, and investigate each product for its capabilities.

Gartner recommends proof of concept (POC) implementations to determine suitability for broader deployment.

Top reasons for interest in SDS, as gathered from interactions with Gartner clients, include:
• Improving the management and agility of the overall storage infrastructure through better programmability, interoperability, automation and orchestration
• Storage virtualization and abstraction
• Performance improvement by optimizing and aggregating storage I/O
• Better linkage of storage to the rest of IT and the software-defined data center
• Opex reductions by reducing demands on administrators
• Capex reductions from more efficient utilization of existing storage systems

Despite the promise of SDS, some storage point solutions have been rebranded as SDS to present a higher value proposition versus built-in storage features, and such products need to be carefully examined for ROI benefits.

Business Impact: Management SDS's ultimate value is to provide broad capability in the policy management and orchestration of many storage resources. While some management SDS products are focusing on enabling provisioning and automation of storage resources, more comprehensive solutions feature robust utilization and management of heterogeneous storage services, allowing mobility between different types of storage platforms on-premises and in the cloud. As a subset of management SDS, I/O optimization SDS products can reduce storage response times, improve storage resource utilization and control costs by deferring major infrastructure upgrades. The benefits of management SDS are in improved operational efficiency by unifying storage management practices and providing common layers across different storage technologies. The operational ROI of management SDS will depend on IT leaders' ability to quantify the impact of improved ongoing data management, increased operational excellence and reduction of opex.

Benefit Rating: Moderate
Market Penetration: 1% to 5% of target audience
Maturity: Emerging
Sample Vendors: DataCore Software; Dell EMC; FalconStor; ioFABRIC; IBM; Infinio; Primary Data; VMware

Cloud Backup
Analysis by Pushan Rinnen and Robert Rhame

Definition: Cloud backup refers to policy-based backup tools that can back up and restore production data generated natively in the cloud. Such data could be generated by software-as-a-service applications, such as Microsoft Office 365 and Salesforce, or by infrastructure-as-a-service compute services, such as Amazon Elastic Compute Cloud (EC2) instances. Backup copies can be stored in the same or a different cloud location, and various restore/recovery options should be offered in terms of restore granularity and recovery location.

Position and Adoption Speed Justification: Backup of data generated natively in the public cloud is an emerging requirement, as cloud providers offer only infrastructure HA and DR and are not responsible for application or user data loss. Most software-as-a-service (SaaS) applications' natively included data protection capabilities are not true backup and lack secure access control and consistent recovery points to recover from internal and external threats. As Microsoft Office 365 (O365) gains more momentum, O365 backup capabilities have begun to emerge from mainstream backup vendors in addition to small vendors. Infrastructure-as-a-service (IaaS) backup, on the other hand, is a more nascent area that caters to organizations' need to back up production data generated in the IaaS cloud. Native backup of IaaS usually resorts to snapshots and scripting, which may lack application consistency, restore options, storage efficiency and policy-based automation. A few small vendors offer cloud storage snapshot management tools that address some cloud-native limitations, while data center backup vendors have introduced public cloud guest agents to back up cloud-native production workloads.
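
As an illustration of the 'snapshots and scripting' approach mentioned above, here is a minimal sketch assuming an AWS environment and the boto3 SDK; the volume ID and tag values are placeholders. It produces only a crash-consistent snapshot and handles none of the application consistency, policy automation or storage efficiency that dedicated backup tools add, which is exactly the gap noted above.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Create a point-in-time snapshot of a single EBS volume (hypothetical ID).
    response = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",
        Description="nightly crash-consistent backup",
        TagSpecifications=[{
            "ResourceType": "snapshot",
            "Tags": [{"Key": "retention-days", "Value": "30"}],
        }],
    )
    print("Snapshot started:", response["SnapshotId"])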

User Advice: Before migrating critical on-premises applications to SaaS or IaaS, organizations should have a thorough understanding of cloud-native backup and recovery capabilities. If they decide the native capabilities fall short, such as in terms of application consistency, security requirements and recovery point objectives, they should factor additional backup costs into the TCO calculation before making the decision to migrate to the cloud.

Organizations planning to use cloud-native recovery mechanisms should ensure that their contract with the cloud provider clearly specifies the capabilities and cost associated with the following items in terms of native data protection:
• Backup/restore methods: This describes how user backup and restore are done, including any methods to prevent users from purging their own 'backup copies' and to speed up recovery after a propagated attack such as ransomware.
• Retention period: This measures how long cloud providers can retain native backups, free of charge or at additional cost.
• Clear expectations in writing, if not SLA guarantees, regarding RTOs: RTO measures how long it takes to restore at different granular levels, such as a file, a mailbox or an entire application.
• Additional storage cost due to backup: Insist on concrete guidelines on how much storage the IaaS provider's native snapshots will consume, so that organizations can predict backup storage cost.

For third-party backup tools, focus on low-cost licensing, ease of cloud deployment, policy automation for easy management, storage efficiency and flexible options in terms of backup/recovery granularity and location.

Business Impact: As more production workloads migrate to the cloud (either in the form of SaaS or IaaS), it has become critical to protect data generated natively in the cloud. SaaS and IaaS providers typically offer infrastructure resiliency and availability to protect their systems from a site failure. However, when data is lost due to their infrastructure failure, the providers are not financially responsible for the value of the lost data and only provide limited credit for the period of downtime. When data is lost due to user errors, software corruption or malicious attacks, user organizations are fully responsible themselves. The more critical the cloud-generated data, the more important it is for users to ensure its recoverability.

Benefit Rating: High
Market Penetration: 5% to 20% of target audience
Maturity: Emerging
Sample Vendors: Arcserve; Commvault; Datos IO; Datto; Dell EMC; Druva; Microsoft; N2W Software; Spanning; Veeam

At the Peak
Backup Tools for Mobile Devices
Analysis by John Girard and Pushan Rinnen

Definition: This technology profile describes tools and services that back up and restore mobile device data, settings and images. Backups may be made via a cable (that is, tethered) or over the internet to a server hosted at a company site or by a cloud service provider. Restoration of some user data, but usually not the entire system image, is accomplished after loss, theft or migration.

Position and Adoption Speed Justification: Unlike choices in the workstation world, enterprise backup solutions for phones and tablets cannot be consistently implemented. While solutions are available for Android, Apple restricts full backup to personal iTunes and iCloud. With a major platform beyond their control, companies tend to leave backup unsolved. IT organizations assume that reloading device settings and refreshing email will satisfy most needs, but this is a mistake when mobile devices contain business apps and local data. Users are often left to make their own choices and may resort to simpler and less secure methods, including the use of commercial shared file and sync tools, or even free solutions. The result is unmanaged and incomplete backups that are unlikely to meet business recovery, security and privacy requirements. This technology earns pre-peak status because the majority of companies only dimly realize that they need consolidated mobile backup, and are not able to obtain the cross-platform solutions they need.

User Advice: Users of both company and personal devices are responsible for ensuring that important business data is preserved, given the inconsistency of current backup methods and the chaotic effects of distributed mobile information storage. In typical circumstances where a restoration is needed, users may, for some period of time, have access to more than one copy of the same calendar, contacts and emails (simultaneously from a workstation, tablet and phone, because of persistent storage of the inbox), as well as to certain files that had been placed in cloud storage. Companies should get in the habit of archiving copies of important information as part of normal workflow design, and should provide enterprise-managed choices for enterprise file sync and share (EFSS) and backup solutions that meet business security requirements. Users must be directed to get in the habit of regularly backing up critical business information, or face the consequences of data loss.

Business Impact: IT planners should document best and worst practices for mobile backups, and tie them into their business continuity plans, training and help desk procedures. EFSS tools may not meet business security and privacy requirements and are not designed for managing structured backups. Left on their own, users will prefer free or inexpensive services lacking company-controlled encryption, where ownership of and future access to stored data may fall into the hands of the vendor. Neither EFSS nor enterprise backup systems are yet suited to comprehensively back up mobile devices; however, long-term plans should prioritize centrally managed, certified, professional-grade tools. Recent debates over the need for and use of legally mandated back doors raise scenarios where companies could completely lose disclosure control over confidential information. A solid backup solution can help companies comply promptly with information requests, before they become subject to blanket search demands.

Benefit Rating: Moderate
Market Penetration: 5% to 20% of target audience
Maturity: Emerging
Sample Vendors: Asigra; Commvault; Datacastle; Druva; IDrive

File Analysis
Analysis by Alan Dayley and Julian Tirsu

Definition: File analysis (FA) tools analyze, index, search, track and report on file metadata and, in most cases (such as in unstructured data environments), on file content. FA tools are usually offered as software options. FA tools report on file attributes and provide detailed metadata and contextual information to enable better information governance and data management actions.

Position and Adoption Speed Justification: FA is a growing technology that assists organizations in understanding their ever-growing repositories of unstructured 'dark' data, including file shares, email databases, SharePoint, enterprise file sync and share (EFSS) platforms, and cloud platforms, whose use is expanding with the rapid adoption of Microsoft Office 365. Metadata reports include data owner, location, duplicate copies, size, last accessed or modified, security attribute changes, file types and custom metadata.
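
As a rough illustration of the metadata layer such tools work from, the sketch below walks a file share and records basic attributes using only Python's standard library; the share path is a placeholder, and real FA products add owner and permission resolution, duplicate detection, content indexing and classification on top of this.

    import csv
    import os
    from datetime import datetime, timezone

    SHARE = "/mnt/corporate-share"   # hypothetical file share mount point

    with open("file_inventory.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "last_modified", "last_accessed"])
        for root, _dirs, files in os.walk(SHARE):
            for name in files:
                path = os.path.join(root, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue   # unreadable file; skip it
                writer.writerow([
                    path,
                    st.st_size,
                    datetime.fromtimestamp(st.st_mtime, timezone.utc).isoformat(),
                    datetime.fromtimestamp(st.st_atime, timezone.utc).isoformat(),
                ])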

The primary use cases for FA for unstructured data environments include but are not limited to:
• Organizational efficiency and cost optimization
• Information governance and analytics
• Risk mitigation

The desire to mitigate business risks (including security and privacy risks), identify sensitive data, optimize storage cost and implement information governance is a key factor driving the adoption of FA. The identification, classification, migration, protection, remediation and disposition of data are key features of FA tools.

User Advice: Organizations should use FA to better understand their unstructured data, including where it resides and who has access to it. Data visualization maps created by FA can be presented to other parts of the organization and be used to better identify the value and risk of the data, enabling IT, line-of-business and compliance organizations to make better-informed decisions regarding classification, information governance, storage management and content migration. Once known, redundant, outdated and trivial data can be defensibly deleted, data can be migrated or quarantined, and retention policies can be applied to other data.

Business Impact: FA tools reduce risk by identifying which files reside where and who has access to them. They support remediation in areas such as the elimination or quarantining of sensitive data, identifying and protecting intellectual property, and finding and eliminating redundant and outdated data that may lead to unnecessary business risk. FA shrinks costs by reducing the amount of data stored. It also classifies valuable business data so that it can be more easily leveraged and analyzed, and it supports e-discovery efforts for legal and regulatory investigations. In addition, FA products feed data into corporate retention initiatives by using file attributes.

Benefit Rating: High
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Active Navigation; Bloomberg; Capax Discovery; Hewlett Packard Enterprise; IBM (StoredIQ); Kazoup; Komprise; STEALTHbits; Varonis; Veritas Technologies

Open-Source Storage
Analysis by Julia Palmer and Arun Chandrasekaran

Definition: Open-source storage is core storage software that is used to create a storage system, as well as data protection and management software. It involves software abstracted from the underlying hardware for which the source code is made available to the public through a free distribution license. Similar to proprietary storage, open-source storage software supports primary, secondary and tertiary storage tiers, as well as heterogeneous management.

Position and Adoption Speed Justification: Although open-source storage (OSS) has been around for over a decade, it has been mainly relegated to file-serving and backup deployments in small business environments. Recent innovations in x86 hardware and flash, combined with an innovative open-source ecosystem, are making OSS attractive for cloud and big data workloads and as a potential alternative to proprietary storage. As cloud computing, big data analytics and information archiving push the capacity, pricing and performance frontiers of traditional scale-up storage architectures, there has been renewed interest in OSS as a means to achieve high scalability in capacity and performance at lower acquisition costs.

The emergence of open-source platforms such as Apache Hadoop and OpenStack, which are backed by large, innovative communities of developers and vendors, together with the entry of disruptive vendors such as Red Hat (Gluster Storage, Ceph Storage) is enabling enterprises to consider OSS for use cases such as cloud storage, big data and archiving. There has been a growing number of OSS projects for container-based storage such as Minio.

User Advice: Although OSS offers a less-expensive upfront alternative to proprietary storage, IT leaders need to measure the benefits, risks and costs accurately. Some enterprise IT leaders often overstate the benefits and understate the costs and risks. Conversely, with the emerging maturity of open-source storage solutions, enterprise IT buyers should not overlook the value proposition of these solutions. IT leaders should actively deploy pilot projects, identify internal champions, train storage teams and prepare the overall organization for this disruptive trend. Although source code can be downloaded for free, it is advisable to use a commercial distribution and to obtain support through a vendor, because OSS requires significant effort and expertise to install, maintain and support. IT leaders deploying 'open core' or 'freemium' storage products need to carefully evaluate the strength of lock-in against the perceived benefits. This is a model in which the vendor provides proprietary software - in the form of add-on modules or management tools - that functions on top of OSS.

In most cases, open-source storage is not general-purpose storage. Therefore, choose use cases that leverage the strengths of open-source platforms - for example, batch processing or a low-cost archive for Hadoop and test/development private cloud for OpenStack - and use them appropriately. It is important to focus on hardware design and choose cost-effective reference architectures that have been certified by the vendors and for which support is delivered in an integrated manner. Overall, on-premises integration, management automation and customer support should be key priorities when selecting OSS solutions.

Business Impact: OSS is playing an important role in enabling cost-effective, scalable platforms for new cloud and big data workloads. Gartner is seeing rapid adoption among technology firms and service providers, as well as in research and academic environments. Big data, dev/test and private cloud deployments in enterprises are also promising use cases for open-source storage, in which Gartner is witnessing keen interest. As data continues to grow at a frantic pace, OSS will enable customers to store and maintain data, particularly unstructured data, at a lower acquisition cost, with 'good enough' availability, performance and manageability.

Benefit Rating: High
Market Penetration: 1% to 5% of target audience
Maturity: Emerging
Sample Vendors: Cloudera; Hortonworks; iXsystems; IBM; Intel; Minio; OpenIO; Red Hat; SUSE; SwiftStack

Copy Data Management
Analysis by Pushan Rinnen and Garth Landers

Definition: Copy data management (CDM) refers to products that capture application-consistent data via snapshots in primary storage and create a live 'golden image' in a secondary storage system where virtual copies in native disk format can be mounted for use cases such as backup/recovery or test/development. Support for heterogeneous primary storage is an essential component. Different CDM products have different additional data management capabilities.
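
The copy-virtualization idea at the heart of this definition can be sketched conceptually as follows; this is not any vendor's API, just an illustration of one physical 'golden image' serving many lightweight, mountable virtual copies. All names and paths are hypothetical.

    # Conceptual sketch only: one physical golden image, many virtual copies.

    class GoldenImage:
        """Single physical, application-consistent copy held on secondary storage."""
        def __init__(self, app_name, snapshot_time):
            self.app_name = app_name
            self.snapshot_time = snapshot_time

    class VirtualCopy:
        """A mountable view that references the golden image rather than
        duplicating its blocks."""
        def __init__(self, golden, purpose):
            self.golden = golden
            self.purpose = purpose   # e.g., "recovery", "test-dev", "analytics"

        def mount_point(self):
            return "/mnt/{}/{}".format(self.golden.app_name, self.purpose)

    golden = GoldenImage("erp-db", "2017-07-19T02:00:00Z")
    for purpose in ("recovery", "test-dev", "analytics"):
        copy = VirtualCopy(golden, purpose)
        print(copy.mount_point(), "references snapshot taken at", golden.snapshot_time)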

Position and Adoption Speed Justification: CDM has become a hyped term as various vendors start to use it to promote their product capabilities. CDM awareness and deployments are increasing. Some storage array vendors use the term to describe their internal array capabilities, which don't fit Gartner's definition because they lack heterogeneous primary array support. CDM adoption continues to focus on two areas: (1) consolidated backup and DR, and (2) test/development workflow automation. Newer products tend to focus more on the first use case. The main challenge for CDM products is being leveraged across these two very different use cases, which represent very different buying centers and decision makers. The lack of products from major vendors and the inconsistent usage of the CDM term impede greater adoption.

User Advice: IT should look at CDM as part of a backup modernization effort, or when managing multiple application copies for testing/development has become costly, overwhelming or a bottleneck. CDM could also be useful for organizations that are looking for active access to secondary data sources for reporting or analytics due to its separation from the production environment. Enterprises should also look at opportunities for database and application archiving for storage reduction or governance initiatives to further justify investment. Due to the short history of the new architecture and vendors, new use cases beyond the common ones are not field-proven and should be approached with caution.

Business Impact: IT organizations have historically used different hardware and software products to deliver backup, archive, replication, test/development, legacy application archiving and other data-intensive services, with very little control or management across these services. This results in over-investment in storage capacity, software licenses and the operational expenditure associated with managing multiple solutions. CDM facilitates the use of one copy of data for all of these functions via virtual copies, thereby dramatically reducing the need for multiple physical copies of data and enabling organizations to cut the costs associated with multiple disparate software licenses and storage islands. The separation of the 'golden image' from the production environment can facilitate aggressive RPOs and RTOs. In the case of test/development, CDM improves the workflow process and operational efficiency by giving database administrators and application developers more self-service capabilities.

Benefit Rating: High
Market Penetration: 1% to 5% of target audience
Maturity: Emerging
Sample Vendors: Actifio; Catalogic Software; Cohesity; Dell EMC; Delphix; Druva; Rubrik; Veritas Technologies

Infrastructure SDS
Analysis by Julia Palmer and John McArthur

Definition: Infrastructure software-defined storage (SDS) creates and provides data center services to replace or augment traditional storage arrays. It can be deployed as a VM, in a container or as software on a bare-metal, industry-standard x86 server, allowing organizations to deploy a storage-as-software package. This creates a storage platform that can be accessed via file, block or object protocols.

Position and Adoption Speed Justification: Infrastructure SDS is positioned to change the economics and delivery model of enterprise storage infrastructures. Whether deployed independently or as an element of a hyperconverged integrated system, SDS is altering how organizations buy and deploy enterprise storage. Following web-scale IT's lead, I&O leaders are deploying SDS as hardware-agnostic storage, breaking the bond with higher-priced, proprietary, legacy external-controller-based (ECB) storage hardware. The power of multicore Intel x86 processors, the use of SSDs and high-throughput networking have essentially eliminated hardware-associated differentiation, transferring all of the value to storage software. Expect new infrastructure SDS vendors and products to emerge and to target a broad range of delivery models and workloads, including server virtualization, archiving, big data analytics and unstructured data. Comprehensive analyses of SDS TCO benefits involve both Capex and Opex, including administrative design, verification, deployment, and ongoing management, maintenance and support, as well as potential improvements in business agility.

User Advice: Infrastructure SDS is the delivery of data services and storage-array-like functionality on top of industry standard hardware.

Enterprises choose a software-defined approach when they wish to accomplish some or all of the following goals:
• Build a storage solution at a low acquisition price point on commodity x86 platform.
• Decouple storage software and hardware to standardize their data center platforms.
• Establish a scalable solution specifically geared toward Mode 2 workloads.
• Build agile, 'infrastructure as code' architecture to enable storage to be a part of software-defined data center automation and orchestration framework.

Advice to end users:
• Recognize that infrastructure SDS remains a nascent, but growing, deployment model that will be focused on web-scale deployment agility.
• Implement infrastructure SDS solutions that enable you to decouple software from hardware, reduce TCO and enable greater data mobility.
• Assess emerging storage vendors, technologies and approaches, and create a matrix that matches these offerings with the requirements of specific workloads.
• Deploy infrastructure SDS for single workload or use case. Take the lessons learned from this first deployment and apply SDS to additional use cases.
• For infrastructure SDS products, identify upcoming initiatives where SDS could deliver high value. Use infrastructure SDS with commodity hardware as the basis for a new application deployment aligned with these initiatives.
• Build infrastructure SDS efficiency justification as a result of a proof-of-concept deployment, based on capital expenditure ROI data and potential operating expenditure impact, as well as better alignment with the core business requirements.

Business Impact: Infrastructure SDS is a hardware-agnostic platform. It breaks the dependency on proprietary storage hardware and lowers acquisition costs by utilizing the industry-standard x86 server platform of the customer's choice. Some Gartner customers report up to a 40% TCO reduction with infrastructure SDS, which comes from the use of industry-standard x86 hardware and lower costs for upgrades and maintenance. However, the real value of infrastructure SDS in the long term is the increased flexibility and programmability that Mode 2 workloads require. I&O leaders that have successfully deployed and benefited from infrastructure SDS have usually belonged to large enterprises or cloud service providers that pursued web-scale-like efficiency, flexibility and scalability, and viewed SDS as a critical enablement technology for their IT initiatives. I&O leaders should look at infrastructure SDS not as another storage product but as an investment in improving storage economics and providing data mobility, including hybrid cloud integration.

Benefit Rating: Transformational
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Dell EMC; Hedvig; IBM; Maxta; Nexenta; Red Hat; Scality; StorMagic; SwiftStack; VMware

Sliding Into the Trough
Integrated Systems: Hyperconvergence
Analysis by George J. Weiss and Andrew Butler

Definition: Hyperconverged systems are integrated systems that apply a modular, shared compute/network/storage building-block approach, with a unified management layer, running on commodity hardware with DAS and leveraging scale-out clusters.

Position and Adoption Speed Justification: Hyperconverged infrastructure is a rapidly expanding market segment that is expected to grow at a 48% CAGR from 2016 to 2021. By 2021, hyperconverged integrated systems (HCISs) will represent 54% of total converged infrastructure shipments by revenue, with HCIS reaching $10.8 billion. HCIS enables IT to start from a small base - a single or dual node - and incrementally scale out as demand requires. The modular-building-block approach enables enterprises to take small steps, rather than make the significant upfront investments required by traditional integrated systems, which typically have a costly proprietary chassis with fabric infrastructure.

HCIS continues to expand, with new providers entering and most traditional system vendors shipping HCIS as part of their portfolios. Systems will continue to evolve with additional feature/function deliverables and broader vendor portfolios to address mixed workloads and hybrid cloud integration. The fast pace of HCIS growth will begin slowing by 2020-2021, but will remain well ahead of the growth rates of other integrated and converged systems. We expect continued feature/function evolution, hybrid cloud IaaS configurations, and higher levels of application delivery agility and efficiency through advancements such as composable infrastructure and cloud management functions.

User Advice: IT leaders should recognize HCIS as an evolution within the broader category of integrated systems that lays the foundation for ease of use, simplicity, virtualization, cloud deployment and eventual bimodal implementations. IT should be able to harness its fundamental advantages in efficiency, utilization, agility, data protection, continued life cycle deployment and orchestration as part of a strategic data center modernization objective. Plan strategically, but invest tactically in hyperconverged systems, because the market is subject to volatility (new entrants or M&A). Plan for a payback of two years or less to ensure financial success and investment value. Test the scalability limits of solutions and TCO benefits, because improvements will occur rapidly during this period.

Business Impact: HCIS adoption will generally be driven by expectations of lower TCO and by data center modernization initiatives. HCIS will continue to experience an enthusiastic reception from the mid-market, due to the simplicity and convenience of appliance configurations. Use cases especially well-suited to HCIS include virtual desktop infrastructures (VDIs), server virtualization and consolidation, data migration, DR, private cloud, remote or branch office, relational databases, Hadoop, and dedicated application infrastructures. Moreover, general-purpose workloads are increasingly being moved from server virtualization blades to exploit HCIS's favorable TCO properties.

Benefit Rating: High
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Atlantis Computing; Cisco; Dell; HPE SimpliVity; Nutanix; Pivot3; Scale Computing

Data Sanitization
Analysis by Rob Schafer, Philip Dawson and Christopher Dixon

Definition: Data sanitization is the consistently applied, disciplined process of reliably and completely removing all data from a read/write medium so that it can no longer be read or recovered.

Position and Adoption Speed Justification: Growing concerns about data privacy and security, leakage, regulatory compliance, and the ever-expanding capacity of storage media are making robust data sanitization a core requirement for all IT organizations.

This requirement should be applied to all devices with storage components (including PCs, mobile phones, tablets, and high-end printers and copiers) when they are repurposed, returned to the supplier/lessor, sold, donated to charity or otherwise disposed of. Where organizations lack this robust data sanitization competency, it is often due to handling the various stages of the asset life cycle as isolated events, with little coordination between business boundaries (such as finance, security, procurement and IT). Thus, the personnel assigned to IT asset disposition (ITAD) are often different from those responsible for risk management and compliance, which can put the organization at risk of both internal and external noncompliance.

For mobile devices, a remote data-wiping capability is commonly implemented, triggered either by the user logging into a website or by an administrator remotely invoking a mobile device manager (MDM). Although such a remote capability should not be considered a fail-safe mechanism, its reliability should be adequate for the significant majority of lost or stolen mobile devices. The degree to which various hardware storage technologies are reliably wiped varies according to organization and device type.

User Advice: Follow a life cycle process approach to IT risk management that includes making an explicit decision about data sanitization and destruction, device reuse and retirement, and data archiving.

Implement policies that assign explicit responsibility for all media carrying sensitive or regulated data - whether corporate or personal - to ensure that they are properly wiped or destroyed at the end of their production use.

Collaborate with data sanitization stakeholders (e.g., security, compliance, legal, IT) to create appropriate data sanitization/destruction standards that provide specific guidance on the end-to-end destruction process, based on data sensitivity.

Regularly (e.g., annually) verify that your ITAD vendor consistently meets your data sanitization security specifications and standards.

Understand the security implications of personal devices and plug-and-play storage. Organizations that have yet to address portable data-bearing devices (e.g., USB drives) are even less prepared to deal with these implications.

Consider using whole-volume encryption for portable devices and laptops, and self-encrypting devices in the data center.

Consider destroying storage devices containing highly sensitive and/or regulated data (e.g., in organizations in the financial and healthcare industries), either by mechanical means or by using degaussing machines, rendering them permanently unusable and ensuring that the data is not recoverable.

Consider software tape shredding. Tape shredding performs a three-pass wipe of the selected virtual tapes using an algorithm specified by the U.S. Department of Defense (Standard 5220.22-M), which helps IT managers meet security and regulatory compliance requirements.
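
For illustration, a minimal sketch of a multi-pass overwrite in the spirit of the wipe described above is shown below. It targets an ordinary file (a hypothetical virtual tape image) and omits the verification, device-geometry and bad-block handling a real sanitization tool must provide; software overwriting alone is also not sufficient for flash media with wear leveling.

    import os

    CHUNK = 1024 * 1024   # write in 1 MiB blocks

    def _write_pass(f, size, make_block):
        f.seek(0)
        remaining = size
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(make_block(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())

    def wipe(path):
        size = os.path.getsize(path)
        with open(path, "r+b") as f:
            _write_pass(f, size, lambda n: b"\x00" * n)   # pass 1: zeros
            _write_pass(f, size, lambda n: b"\xff" * n)   # pass 2: ones
            _write_pass(f, size, os.urandom)              # pass 3: random data

    wipe("/tmp/virtual_tape_0001.img")   # hypothetical virtual tape image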

Forbid the use of USB memory sticks for sensitive, unencrypted files. Some undeleted but largely inaccessible data remains on most USB memory sticks.

Understand end-of-contract implications, and ask current and potential providers for an explanation of their storage reuse and account retirement practices. This advice applies to buyers of any form of externally provisioned service.

Business Impact: At a relatively low cost, the proper use of encryption, wiping and, when necessary, destruction will help minimize the risk that proprietary and regulated data will leak.

By limiting data sanitization to encryption and/or software wiping, organizations can preserve the asset's residual market value; the destruction of data-bearing devices within an IT asset typically reduces the asset's residual value (RV) to salvage, incurring the cost of environmentally compliant recycling.

The National Association for Information Destruction (NAID) supports best practices in data destruction services, and offers a list of service providers. Also refer to the National Institute of Standards and Technology (NIST) December 2014 revision of its Special Publication 800-88: Guidelines for Media Sanitization.

Benefit Rating: Moderate
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: Blancco Technology Group; DestructData; ITRenew; Kroll Ontrack

Integrated Backup Appliances
Analysis by Robert Rhame

Definition: An integrated backup appliance is an all-in-one backup software and hardware solution that combines the functions of a backup application server, media server (if applicable) and backup target device. The appliance is typically pre-configured and fine-tuned to cater to the capabilities of the onboard backup software. It is a simpler and easier-to-deploy backup solution than the traditional approach of separate software and hardware installations, but it offers less flexibility in hardware choices and scalability.

Position and Adoption Speed Justification: Integrated backup appliances have been around for many years without much fanfare. The current hype is driven by existing large backup software vendors that have started packaging their software in an appliance, and by innovative emerging vendors offering all-in-one solutions. The momentum of integrated backup appliances is driven by the desire to simplify the setup and management of the backup infrastructure, because 'complexity' is a leading challenge when it comes to backup management. Overall, integrated backup appliances have resonated well with many small and midsize enterprise customers that are attracted by the one-stop-shop support experience and tight integration between software and hardware. As the appliances scale up, they will be deployed in larger environments.

Within the integrated backup appliance market, the former clear segmentation by backup repository limitations has vanished, with all vendors adding cloud target or tiering capabilities.

There are generally three types of vendor selling integrated backup appliances, separated primarily by heritage:
• The first kind includes backup software vendors that package their software with hardware in order to offer customers integrated appliances. Examples include Arcserve, Dell EMC, and Veritas Technologies.
• The second type is made up of emerging products that tightly integrate software with hardware, such as Actifio, Cohesity and Rubrik.
• The third kind is a cloud backup provider that offers customers an on-premises backup appliance as part of a cloud backup solution. Examples include Barracuda Networks, Ctera Networks, Datto and Unitrends.

User Advice:
• Organizations should first evaluate backup software functions to ensure that their business requirements are met, before making a decision about acquiring an integrated backup appliance or a software-only solution.
• Once a specific backup software product is chosen, deploying an appliance with that software will simplify operational processes and address any compatibility issues and functionality gaps between backup software-only products and deduplication backup target appliances.
• Customers should keep in mind that integrated appliances can also act as a lock-in for the duration of the useful life of the hardware.
• If customers prefer deploying backup software-only products to gain hardware flexibility, they should carefully consider which back-end storage to choose - be it generic disk array/network-attached storage (NAS) or deduplication backup target appliances.

Business Impact: Integrated backup appliances ride the current trend of converged infrastructure and offer tight integration between software and hardware, simplify the initial purchase and configuration process, and provide the one-vendor support experience with no finger-pointing risks.

On the downside, an integrated backup appliance tends to lack the flexibility and heterogeneous hardware support offered by backup software-only solutions, which is often needed by large, complex environments.

Benefit Rating: Moderate
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: Actifio; Arcserve; Barracuda Networks; Cohesity; Ctera Networks; Datto; Dell EMC; Rubrik; Unitrends; Veritas Technologies

Storage Cluster File Systems
Analysis by Julia Palmer and Arun Chandrasekaran

Definition: Distributed file system storage uses a single parallel file system to cluster multiple storage nodes together, presenting a single namespace and storage pool to provide high bandwidth for multiple hosts in parallel. Data is distributed over multiple nodes in the cluster to handle availability and data protection in a self-healing manner, and to provide high throughput and scalable capacity in a linear manner.

Position and Adoption Speed Justification: The strategic importance of storing and analyzing large-scale, unstructured data is bringing scale-out storage architectures to the forefront of IT infrastructure planning. Storage vendors are continuing to develop distributed cluster file systems to address performance and scalability limitations in traditional, scale-up, NAS environments. This makes them suitable for batch and interactive processing, and other high-bandwidth workloads. Apart from academic HPC environments, commercial vertical industries - such as oil and gas, financial services, media and entertainment, life sciences, research and telecommunication services - are leading adopters for applications that require highly scalable storage bandwidth.

Beyond the HPC use case, large home-directory storage, rich-media streaming, content distribution, backup and archiving are other common use cases for cluster file systems. Built on a 'shared nothing' architecture, distributed file systems provide resilience at the software layer, and may not require proprietary hardware. Products from vendors such as Panasas, DataDirect Networks (DDN) and Cray are most common in HPC environments. Most leading storage vendors, such as Dell EMC, Huawei and IBM, as well as emerging vendors, such as Elastifile, Red Hat and Qumulo, also have a presence in this segment. Vendors are also increasingly starting to offer software-based deployment options in a capacity-based perpetual licensing model, or with subscription-based licensing, to stimulate market adoption.

Hadoop Distributed File System (HDFS) is starting to see wide enterprise adoption for big data, batch processing use cases and beyond. With the growing demand for high IO/s and aggregated bandwidth for shared storage, cluster file systems are expected to see robust adoption in the future.
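
As a small illustration of the batch-processing pattern mentioned above, the sketch below lands a local dataset in HDFS using the standard hdfs dfs commands driven from Python; it assumes a configured Hadoop client, and both paths are placeholders.

    import subprocess

    LOCAL_FILE = "/data/exports/clickstream-2017-07.csv"   # hypothetical source file
    HDFS_DIR = "/warehouse/raw/clickstream"                # hypothetical HDFS directory

    # Create the target directory, copy the file in, then list it to confirm.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", HDFS_DIR], check=True)
    subprocess.run(["hdfs", "dfs", "-put", LOCAL_FILE, HDFS_DIR], check=True)
    subprocess.run(["hdfs", "dfs", "-ls", HDFS_DIR], check=True)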

User Advice: Storage cluster file systems have been around for decades although vendor maturity varies widely. Users that need products that enable them to pay as they grow in a highly dynamic environment, or that need high bandwidth for shared storage, should put distributed file systems on their shortlists. Most commercial and open-source products specialize in tackling specific use cases, but integration with workflows may be lacking in several products. Evaluate your application I/O requirements to select a pertinent cluster file system.

Prioritize scalability, performance, manageability, independent software vendor (ISV) support, deployment flexibility and resiliency features as important selection criteria. There is little technical know-how regarding scale-out file systems in many enterprise IT organizations; hence, I&O leaders should allocate a portion of the storage budget to training.

Business Impact: Storage cluster file systems are alternatives to traditional architectures that scale storage bandwidth more linearly, surpassing expensive monolithic frame storage arrays in this capability. The business impact of storage cluster file systems is most pronounced in environments in which applications generate large amounts of unstructured data, and the primary access is through file protocols. However, they will also have an increasing impact on traditional data centers that want to overcome the limitations of dual-controller storage designs, as well as on use cases such as backup and archiving. Many of these file systems are deployed as software-only products on top of industry-standard x86 server hardware, which has the potential to deliver lower TCO than ECB storage arrays. Many storage cluster file systems will have a significant impact on private cloud services, which require a highly scalable, resilient and elastic infrastructure. IT professionals keen to consolidate file server or NAS file sprawl should consider using cluster file system storage products that offer operational simplicity and nearly linear scalability.

Benefit Rating: High
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: Cray; DDN; Dell EMC; Elastifile; Huawei; IBM; Panasas; Quantum; Qumulo; Red Hat

Cross-Platform Structured Data Archiving
Analysis by Garth Landers

Definition: Cross-platform structured data archiving software moves data from custom or commercially provided applications to an alternate file system or DBMS while maintaining data access and referential integrity. Reducing the volume of data in production instances can improve performance, shrink batch windows, and reduce storage acquisition costs, facility requirements and environmental footprints, as well as the cost of preserving data for compliance when retiring applications. Archives can also be used for historical and other analyses.

Position and Adoption Speed Justification: Structured data archiving tools have been available for two decades and have historically seen more adoption in larger enterprises. These products provide functionality to identify aging application data and manage it appropriately. Although ROI can be high, developing policies for retaining and deleting old application data is difficult and often not seen as a priority. In addition, vendor offerings are expensive, and enterprises will only engage when events demand it. Organizations generally tend to add more database licenses or use native database capabilities, such as purging and partitioning, to address application growth. The technology has long been seen as a cost avoidance measure used to contain operational and capital expenditures related to data growth, as well as to improve factors like application performance.

Today's data archiving products are mature and will face challenges as various Hadoop distributions add capabilities such as retention management and enterprises embrace big data/NoSQL environments. As enterprises adopt SaaS-based ERP, CRM and other systems, the need for archiving to an independent repository has not presented itself. In addition, new approaches to curbing structured data growth are emerging in areas such as copy data management. This approach, while less mature when applied to these use cases, is growing and is less complex than the leading structured data archiving offerings. Application retirement continues to be a significant driver. Organizations are looking for ways to cut the costs of maintaining no-longer-needed legacy applications while preserving application data for compliance or historical value. Data center consolidations, moves to the cloud, and mergers and acquisitions are contributing to the interest in structured data archiving solutions as a way to reduce the number of enterprise applications.

Competition often comes from internal resources who want to build it themselves and from improvements in storage technology that transparently improve performance while reducing storage acquisition and ownership costs - more specifically, auto-tiering, SSDs, data compression and data deduplication. Do-it-yourself efforts typically lack appropriate governance controls such as secure access, data masking and retention management. The allure of tools that can support multiple applications and underlying databases, and the added capabilities these tools provide for viewing data independent of the application, are driving administrators to consider them as viable solutions. New capabilities - such as better search and reporting, integration with big data analysis tools, retention management, support for database partitioning, and support for SAP archiving in preparation for Hana adoption - are broadening their appeal. Increasingly, product offerings are beginning to include unstructured data along with relational data for a more holistic approach to application archiving.

User Advice: The ROI for implementing a structured data archiving solution can be exceptionally high, especially to retire an application or to deploy a packaged application for which vendor-supplied templates are available to ease implementation and maintenance. Structured data archiving often makes sense in heterogeneous application and database environments. Expect that the planning phase may take longer than the implementation. Among the roadblocks to implementation are required consulting services, gaining application owner acceptance, defining archiving policies and building the initial business case. Application retirement projects with large data footprints and numerous applications can span more than a year. Most vendors in this space can provide good references, and organizations should speak with references that have similar application portfolios and goals for managing their data. Enterprises should consider developing their own solutions when the number of applications being retired is very low, data retention requirements are short (such as one to two years), or governance requirements such as audit or litigation are unlikely.

Business Impact: Creating an archive of infrequently accessed data and reducing the size of the active application database (and all related copies of that database) improve application performance and recoverability, and lower costs related to database and application licenses, servers, infrastructure and operations. Transferring old, rarely accessed data from a disk archive to tape can further reduce storage requirements. Most vendors in this space support cloud storage as the repository for archived data. Retiring or consolidating legacy applications cuts the costs and risks associated with maintaining these systems. Optimally, historical data can be preserved for analysis, supporting improvements to digital business. Overall, organizations can experience better information governance, including reduced risk associated with governance events such as audits.

Benefit Rating: Moderate
Market Penetration: 5% to 20% of target audience
Maturity: Mature mainstream
Sample Vendors: Actifio; Delphix; HPE; IBM; Informatica; OpenText; PBS; Solix Technologies

Information Dispersal Algorithms
Analysis by Valdis Filks

Definition: Information dispersal algorithms provide a methodology for storing information in pieces (i.e., dispersed) across multiple locations, so that redundancy protects the information in the event of localized outages, and unauthorized data access at a single location does not provide usable information. Only the originator or a user with a list of the latest pointers created by the original dispersal algorithm can properly assemble the complete information.
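
As a toy illustration of the splitting-and-reassembly idea in the definition (and only that; production information dispersal algorithms also add redundancy, so a subset of pieces suffices to rebuild), the sketch below cuts a blob into n pieces, each of which is indistinguishable from random data on its own, and recombines all of them to recover the original:

import os
from functools import reduce

def disperse(data: bytes, n: int) -> list[bytes]:
    # n - 1 random pieces plus one XOR "closing" piece; no single piece is usable alone.
    pieces = [os.urandom(len(data)) for _ in range(n - 1)]
    last = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(data, *pieces))
    return pieces + [last]

def reassemble(pieces: list[bytes]) -> bytes:
    # XOR of all pieces cancels the random pads and yields the original data.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*pieces))

pieces = disperse(b"customer ledger 2017", 4)   # store each piece at a different location
assert reassemble(pieces) == b"customer ledger 2017"

Real dispersal algorithms (such as Rabin's) replace this all-or-nothing toy with erasure-coded fragments, so the data survives the loss of some locations as well.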

Position and Adoption Speed Justification: These algorithms are being used in more data center storage devices to improve data availability and scale. Commercial solutions continue to become available for the data center from large, established vendors and smaller start-ups, as well as for domestic use, IoT, and file sync and share; the algorithms are also built into home consumer storage appliances. These solutions differ from the presently prevailing centralized cloud storage offerings in that they are not centralized but distributed and, similar to the internet, have no central control or single fault domain. The technology has been extended to peer-to-peer (P2P) file-sharing technologies and protocols, such as those based on the open-source BitTorrent protocol, which has proved robust on the internet and is used in P2P networks to store and recreate data among systems, and even to blockchain. This is an early neo-cloud technology in which the data is truly dispersed in a cloud outside of a small number of centralized, hyperscale or traditional data centers; fault tolerance, and some data protection, are therefore provided by the nature of the design. Due to their innate design, many scale-out storage systems implement RAID-like schemes that disperse data among nodes within racks. This technology is developing into geographically distributed, file-sharing nodes, blurring the lines between scale-out storage systems, information dispersal algorithms and cloud storage.

User Advice: Many vendors have been using this technology in scale-out storage systems for more than five years, and it has proved to be reliable. However, these technologies and solutions can store data in thousands to millions of separate dispersed nodes across the internet, not within the traditional cloud data center. Customers that are not satisfied with centralized cloud storage offerings, or developers of IoT applications that store and preprocess data at the source, should investigate information dispersal algorithms, as they reduce dependence on a few large hyperscale vendors and locations that still use the traditional centralized data center design. In many ways, these algorithms are tantamount to a form of encryption. The design, coding and testing of the attack resistance of dispersal algorithms is a difficult undertaking of similar complexity to the design of encryption implementations. Just as proprietary forms of encryption should not be considered as reliable as implementations based on well-proven algorithms and code, the robustness of proprietary dispersal algorithms - and especially their implementations - should not automatically be trusted. Buyers that expect to rely on this technology for confidentiality control should seek evidence of rigorous testing and peer review.

Business Impact: Information dispersal algorithms are used in the latest storage arrays, integrated systems and SDS software. They can provide secure storage over the internet and other public or private networks without the overhead and other costs of encryption (or of blockchain) and without the need for centralized hyperscale data centers, such as those from Amazon and Google. Use of the BitTorrent protocol has been politically charged because one of its early applications was to share copyrighted data via the internet among home PCs. However, the protocol is content-neutral and simple to use. It could just as easily be used by software companies to distribute software, updates and any digital information that is stored and geographically dispersed among many nodes and computers in a network.

Open-source implementations are integrated into products by commercial companies as a new method to distribute and store digital data. This is one factor that increases the amount of unstructured data stored on the planet.

Benefit Rating: High
Market Penetration: 5% to 20% of target audience
Maturity: Early mainstream
Sample Vendors: BitTorrent; Caringo; Ctera Networks; Dell; Hedvig; HGST; HPE SimpliVity; IBM; Storj; Vivint

Object Storage
Analysis by Raj Bala and Arun Chandrasekaran

Definition: Object storage refers to software, often paired with commodity hardware, that houses data in structures called 'objects' and serves hosts via APIs such as Amazon Simple Storage Service (S3). Conceptually, objects are similar to files in that they are composed of content and metadata. In general, objects support richer metadata than file- and block-based storage platforms and are stored in a flat namespace.
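
For illustration, here is a minimal sketch of object access through the S3 API using the boto3 SDK; the bucket, key and metadata values are placeholders, not a reference to any specific deployment:

import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="example-media-archive",            # assumed bucket name
    Key="projects/2017/cut-final.mov",          # keys live in a flat namespace
    Body=b"raw video bytes",                    # object content (normally a file or stream)
    Metadata={"codec": "prores", "editor": "j.doe"},   # rich, user-defined metadata
)

obj = s3.get_object(Bucket="example-media-archive", Key="projects/2017/cut-final.mov")
print(obj["Metadata"])        # {'codec': 'prores', 'editor': 'j.doe'}
data = obj["Body"].read()     # object content as bytes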

Position and Adoption Speed Justification: Although object storage products have been around for more than a decade, the first-generation products had limitations in scalability and performance, and induced lock-in through proprietary interfaces. Broad adoption of second-generation commercial object storage has remained low so far; however, it is increasing, albeit slowly, as enterprises seek storage infrastructure at a lower TCO. The growing maturity of solutions from emerging vendors and refreshed products from large storage portfolio vendors are expected to further stimulate adoption as the addressable use cases for these products increase. While cost containment of traditional SAN/NAS infrastructure continues to be the key driver for object storage adoption, application development use cases in industries such as media and entertainment, life sciences, the public sector, and education/research are spawning new investments. Object storage products are available in a variety of deployment models - virtual appliances, managed hosting, purpose-built hardware appliances or software that can be consumed in a flexible manner.

User Advice: IT leaders that require highly scalable, self-healing and cost-effective storage platforms for unstructured data should evaluate the suitability of object storage products. The common use cases that Gartner sees for object storage are archiving, content distribution, analytics and backup. When building on-premises object storage repositories, customers should evaluate the product's API support for dominant public cloud providers, so that they can extend their workloads to a public cloud, if needed. Amazon's S3 has emerged as the dominant API over vendor-specific APIs and OpenStack Swift, which is in precipitous decline. Select object storage vendors that offer a wide choice of deployment (software-only versus packaged appliances versus managed hosting) and licensing models (perpetual versus subscription) that can provide flexibility and reduce TCO. These products are capable of huge capacity scale and are better suited to workloads that require high bandwidth than to transactional workloads that demand high IO/s and low latency.

Business Impact: Rapid growth in unstructured data (40% year over year) and the need to store and retrieve it in a cost-effective, automated manner will drive the growth of object storage. Enterprises often deploy object storage on-premises when they seek to provide a public cloud infrastructure as a service (IaaS) experience within their own data centers. Relatedly, object storage is well-suited to multi-tenant environments and requires no lengthy provisioning for new applications. There is growing interest in object storage from enterprise developers and DevOps team members looking for agile and programmable infrastructures that can be extended to the public cloud. Object storage software, deployed on commodity hardware, is emerging as a threat to external controller-based (ECB) storage hardware vendors in big data environments with heavy volume challenges.

Benefit Rating: High
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Caringo; Cloudian; DataDirect Networks; Dell EMC; Hitachi Data Systems; IBM; NetApp; Red Hat; Scality; SwiftStack

Solid-State DIMMs
Analysis by Alan Priestley

Definition: Solid-state dual in-line memory modules (SS DIMMs) are all-flash versions of nonvolatile DIMMs (NVDIMMs) that reside on the double data rate (DDR) DRAM memory channel and are persistent. These devices integrate nonvolatile memory (currently NAND flash) and a system controller chip.

Gartner's definition for solid-state DIMMs encompasses NVDIMM-F as classified by the NVDIMM Special Interest Group, a consortium within the Storage Networking Industry Association (SNIA).

Position and Adoption Speed Justification: Since DIMMs connect directly to a dedicated memory channel rather than a storage channel, they do not face the bottlenecks of a traditional storage system. Because of this, these SS DIMMs can achieve drastically lower latencies (at least 50% lower) than any existing solid-state storage solution and can be viable alternatives to DRAM memory, if the speeds are acceptable.

SS DIMMs were introduced in 2014, when IBM debuted its eXFlash device through a partnership with Diablo Technologies. However, market adoption was affected by litigation brought by Netlist, thwarting Diablo's ability to amass more vendor support. In September 2015, the litigation was closed and decisively ruled in Diablo's favor. In February 2016, Xitore introduced its NVDIMM-X product, which operates in much the same way but boasts much better performance and lower latency, due to an improved cache architecture, local high-speed buffers and an enhanced memory controller with the flexibility to operate with a variety of nonvolatile memory technologies on the back end.

3D XPoint, an emerging nonvolatile memory technology from Intel and Micron, boasts substantial performance and reliability gains over flash memory, but it has only recently become commercially available. Evolution of 3D XPoint for solid-state DIMMs will provide a higher-performance nonvolatile alternative to NAND flash in 2017-2018 and beyond.

Use of any solid-state DIMM requires some or all of the following: support by the host chipset, optimization for the OS and optimization for the server hardware. As such, to achieve greater adoption, support will be required across a wide range of server vendors and OSs. In addition, use cases for memory channel products will need to spread beyond the extremely high-performance, high-bandwidth and ultra-low-latency applications for which they are attracting most interest today.

This technology has faced a number of challenges and has not yet reached maturity. 3D XPoint will likely replace NAND flash on SS DIMMs; however, this transition has yet to mature and become available in high volume. The latest specification for persistent memory DIMMs (NVDIMM-P), which uses a combination of DRAM and nonvolatile memory, may also affect the use of SS DIMMs. For these reasons, this technology is transitioning into the trough.

User Advice: Server memory density and capacity are increasing, and workloads are being optimized to exploit large direct-mapped memory subsystems. IT professionals should evaluate solid-state DIMMs for use as a new tier of storage; however, they should be aware that new persistent memory DIMMs are in development and may better meet their long-term needs for nonvolatile memory.

IT professionals should analyze the memory demands and utilization of their workloads, assess the roadmaps of the major server and storage OEMs along with those of the SSD appliance vendors that will be launching DIMM-based storage systems, and weigh the benefits for their needs. They should be aware that servers, applications, OSs and drivers will need to be customized to support SS DIMMs.

NAND flash vendors should consider solid-state DIMMs to enhance their value proposition for commodity NAND flash and expand their addressable market. However, they should also be aware that newer nonvolatile memory technologies are coming, which may affect the long-term viability of NAND flash SS DIMMs.

Business Impact: This technology's impact on users will be improved system performance overall. There may also be an impact on traditional storage subsystems as applications are rearchitected to take advantage of large amounts of nonvolatile memory accessible as part of the main server system memory.

Benefit Rating: Moderate
Market Penetration: 1% to 5% of target audience
Maturity: Emerging
Sample Vendors: Diablo Technologies; Huawei; Intel; Micron Technology; Supermicro; Xitore

Emerging Storage Protection Schemes
Analysis by Stanley Zaffos

Definition: Emerging storage protection schemes based on erasure code variants, such as Reed-Solomon, are enabling the use of ever-larger HDDs by delivering a higher mean time between data loss (MTBDL) than traditional RAID schemes. MTBDLs are further improved by replacing the concept of spare disks with spare capacity, which enables parallel rather than sequential data rebuilds, and by intelligent rebuilds that reconstruct only the data actually stored on failed disks or nodes.

Position and Adoption Speed Justification: HDD capacity is growing faster than HDD data rates. The result is ever-longer rebuild times that increase the probability of experiencing subsequent disk failures before a rebuild has completed - hence the focus on reducing rebuild times and increasing the fault tolerance and/or resiliency of the data protection scheme. Erasure coding and dispersal algorithms, which add the physical separation of storage nodes to erasure coding, take advantage of inexpensive and rapidly increasing compute power to store blocks of data as systems of equations, and to transform these systems of equations back into blocks of data during read operations. Allowing users to specify the number of failures that can be tolerated before data integrity can no longer be guaranteed lets them trade off data protection overheads (costs) against MTBDLs. Erasure coding and dispersal algorithms are most commonly used to support applications that are not response-time-sensitive and/or have high read/write ratios.
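
The overhead-versus-resilience trade-off can be made concrete with simple arithmetic: a k+m erasure code stores k data fragments plus m coding fragments, tolerates the loss of any m of them, and leaves k/(k+m) of raw capacity usable. The figures below are arithmetic only, not vendor benchmarks:

def erasure_profile(k: int, m: int) -> dict:
    # k data fragments + m coding fragments spread across disks or nodes
    return {
        "scheme": f"{k}+{m}",
        "tolerated_failures": m,
        "usable_capacity_pct": round(100 * k / (k + m), 1),
        "capacity_overhead_pct": round(100 * m / k, 1),
    }

for k, m in [(4, 2), (8, 2), (10, 4), (16, 4)]:
    print(erasure_profile(k, m))
# e.g., 8+2 tolerates two failures at 80.0% usable capacity, whereas 3-way
# replication also tolerates two failures but leaves only ~33.3% usable capacity.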

User Advice: Have vendors profile the performance/throughput of their storage systems supporting your workloads using the various protection schemes that they support, with storage efficiency features such as compression, deduplication and auto-tiering turned on and off, to better understand performance-overhead trade-offs. Confirm that the choice of protection scheme does not limit the use of other value-added features. Request minimum/average/maximum rebuild times to size the likely rebuild window of vulnerability in a storage system supporting your production workloads. Cap microprocessor consumption at 75% of available cycles to ensure that the system's ability to meet service-level objectives is not compromised during rebuilds and microcode updates. Give extra credit to vendors willing to guarantee rebuild times.

Business Impact: Advanced data protection schemes enable vendors and users to continue lowering storage costs and to power the evolution of digital businesses by enabling the deployment of low-cost, high-capacity disks as soon as they become technically and economically attractive. The rapid adoption of new high-capacity HDDs lowers environmental footprints and reduces the frequency and urgency of repair activities by encouraging the deployment of fewer, larger storage systems, which may also enable users to delay or avoid facilities upgrades or expansions.

Benefit Rating: Moderate
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Caringo; DDN; Dell EMC; IBM; NEC; Panasas; Scality; SwiftStack

Hybrid DIMMs
Analysis by Alan Priestley

Definition: Hybrid dual in-line memory modules (hybrid DIMMs) are nonvolatile DIMMs that reside on the double data rate (DDR) DRAM memory channel, function as DRAM memory and focus on preserving data in case of power failure in critical applications. They integrate DRAM, nonvolatile memory (currently NAND flash) and a system controller. Gartner's definition for hybrid DIMMs encompasses NVDIMM-N and NVDIMM-P as classified by the NVDIMM Special Interest Group, a consortium within the Storage Networking Industry Association.

Position and Adoption Speed Justification:
There are two classes of hybrid DIMMs:
• Current-generation devices that comprise a DRAM array, an associated flash-based backup store and an ultracapacitor powerful enough to give the module time to write the DRAM contents to the flash array in the case of a power failure. Only the DRAM on these devices is accessible to the main system microprocessor; the flash memory is used solely to back up the DRAM contents. These are NVDIMM-N devices.
• A new generation of devices that include DRAM and nonvolatile storage, flash or 3D XPoint, where both the DRAM and nonvolatile storage are accessible by the system microprocessor, which manages what data is stored in which memory device type. This is commonly known as persistent memory or NVDIMM-P.

Hybrid DIMMs use the same industry-standard DDR4 DIMM sockets and - with declines in NAND flash pricing - it is becoming economical to design systems with sufficient storage capacity to enable persistent memory capability. Hybrid DIMMs appear to OSs as standard memory-mapped DRAM. It is necessary, however, for the system BIOS to support data recovery after a power failure.

Industry support and standardization are critical for adoption. Currently, only a few major server and storage OEMs support the technology. In addition, hybrid DIMMs are currently available from only two major DRAM and NAND vendors - Micron (via AgigA Tech and Hewlett Packard Enterprise) and SK hynix - and from custom module providers such as Viking Technology, Smart Modular Technologies and a few other small module vendors. With the advent of the NVDIMM-P specification and the introduction of 3D XPoint memory by Intel and Micron, we expect other major memory vendors to enter the market, along with other custom module providers that are already involved in both DRAM and flash-based memory modules. We expect adoption of this technology to increase in the next two to three years.

The slow pace of new hybrid DIMM introductions, slow growth of OEM support, slow migration of users to new technologies and a lack of education about the potential benefits of hybrid DIMMs have limited penetration of this technology. We have, therefore, moved its position closer to the Trough of Disillusionment on the Hype Cycle compared with last year.

User Advice: IT professionals should educate themselves about the various nonvolatile DIMM options available to meet their persistent system memory needs. They should examine the roadmaps of major server and storage OEMs, as well as those of solid-state array (SSA) vendors, to see which will launch hybrid DIMM-based systems. They should ascertain whether hybrid DIMMs are supported by the OS and server they wish to use, and whether the required BIOS changes have been implemented in their target systems. The latencies of hybrid DIMMs and all-DRAM DIMMs require that server, system and OS timing routines are tuned properly.

Although the current focus of hybrid DIMMs is DRAM backup, the new generation of devices will also be usable as a new tier of storage with access times closer to DRAM, as they are faster than conventional SSDs and have a denser form factor that allows for greater system capacities. For this use case, users should consider both hybrid and solid-state DIMMs.

Business Impact: Hybrid DIMMs have several advantages over conventional battery-backed DIMMs, including faster speed, lower maintenance costs, greater reliability, high availability and improved system performance. Currently, the cost premium over existing solutions is considerable due to the need for both DRAM and flash, which also limits the storage capacity achievable within the DIMM form factor.

Memory vendors should evaluate hybrid DIMMs and the newer NVDIMM-P specification as a way of adding value to what are essentially two commodity products - DRAM and NAND flash. By exploiting the attributes of these devices, they will not only enhance their own value proposition, but also expand their addressable market.

Benefit Rating: Moderate
Market Penetration: 1% to 5% of target audience
Maturity: Adolescent
Sample Vendors: AgigA Tech; Hewlett Packard Enterprise; Intel; Micron Technology; Netlist; Samsung; SK hynix; Smart Modular Technologies; Viking Technology

Enterprise Endpoint Backup
Analysis by Pushan Rinnen

Definition: Enterprise endpoint backup refers to backup products for laptops, desktops, tablets and smartphones that can recover corrupted or lost data, as well as personal settings residing on the devices. Endpoint backup differs from file sync and share's versioning capabilities in that backup preserves secure, centrally managed copies that cannot be changed or deleted by end users, and that it protects PC/laptop data in a more comprehensive way.

Position and Adoption Speed Justification: Overall, more organizations are adopting endpoint backup to tackle different risks, including ransomware, insider threats and the potential risks exposed by enterprise file sync and share solutions, including Office 365 OneDrive for Business. Those with globally distributed offices and employees like to leverage web-scale public cloud storage providers and backup-as-a-service providers that offer a multiple-country presence. As employees become more mobile, laptop backup has been the driving force for organizations to adopt endpoint backup, not only to restore lost data, but also to enable more efficient ongoing laptop refresh/migration, to comply with company policies, and to perform legal hold and e-discovery. On the technology side, vendors have added more features to cater to the mobile nature of laptops, such as VPN-less backup over the internet, cellular network awareness and remote wipe. Other new product developments focus on security and compliance capabilities, device replacement/migration automation and full-text search for faster restore/recovery. Old performance issues are tackled by client-side deduplication, incremental-forever backups, near-CDP technologies, and CPU and network throttling.
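
As a generic sketch of how client-side deduplication keeps incremental-forever backups small (not a description of any listed vendor's protocol), the client fingerprints fixed-size chunks and sends only those the backup service has not seen before; 'known_hashes' and 'upload' stand in for the product's server-side index and transport:

import hashlib

CHUNK = 4 * 1024 * 1024  # 4 MiB fixed-size chunks, for simplicity

def backup(path: str, known_hashes: set[str], upload) -> list[str]:
    """Return the chunk-hash manifest for this file version."""
    manifest = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:      # only new or changed data leaves the laptop
                upload(digest, chunk)
                known_hashes.add(digest)
            manifest.append(digest)
    return manifest  # stored server-side so any version can be rebuilt chunk by chunk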

Backup of mobile devices, such as tablets and smartphones, continues to be problematic due to lack of APIs for integration with third-party backup software. As a result, most organizations don't have a policy regarding mobile backup.

User Advice: Protecting endpoint user data must be part of a robust enterprise data protection and recovery plan. Organizations should evaluate and deploy a laptop/PC backup solution, be it on-premises or in the cloud, to maintain control and prevent data loss or leakage, instead of depending on employees to create their own backups.

Business Impact: Endpoint backup and recovery have become increasingly important as the global workforce has become more mobile and is creating more business content on various endpoint devices. Moreover, new malicious attacks, such as ransomware, have increased risk profiles, and organizations often rely on backup to restore data instead of paying the ransom. If employees don't back up their endpoint devices regularly (and many do not on their own), companies may face significant risks when important or sensitive data is lost, stolen or leaked, including R&D setbacks, fines, legal actions and the inability to produce user data in a lawsuit. Based on Gartner's estimates, laptop/PC data loss as a result of lack of backup could cost an organization of 10,000 employees about $1.8 million a year.

Benefit Rating: High
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: Code42; Commvault; Ctera Networks; Datacastle; Dell EMC; Druva; Infrascale; Micro Focus

Cloud Storage Gateways
Analysis by Raj Bala

Definition: Cloud storage gateways refer to the physical or virtual appliances that reside in an organization's enterprise data center and/or public cloud network. They provide users and applications with seamless access to data stored in a public or private cloud. Users and applications typically read and write data through network file system or host connection protocols. Data is then transparently written to remote cloud storage through web service APIs, such as those offered by Amazon Web Services (AWS) and Microsoft Azure.
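
A highly simplified sketch of the write/read path such a gateway implements, assuming boto3 against an S3-compatible endpoint; the cache directory and bucket name are placeholders, and real products add caching policy, deduplication, encryption and multi-protocol front ends:

import os
import shutil
import boto3

CACHE_DIR = "/var/cache/gateway"          # assumed local cache location
BUCKET = "example-gateway-tier"           # assumed cloud bucket
s3 = boto3.client("s3")

def gateway_write(name: str, local_path: str) -> None:
    cached = os.path.join(CACHE_DIR, name)
    shutil.copyfile(local_path, cached)    # fast local write for users and applications
    s3.upload_file(cached, BUCKET, name)   # transparent copy to cloud object storage

def gateway_read(name: str) -> str:
    cached = os.path.join(CACHE_DIR, name)
    if not os.path.exists(cached):         # cache miss: pull the object back from the cloud
        s3.download_file(BUCKET, name, cached)
    return cached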

Position and Adoption Speed Justification: Cloud storage gateways act as a technology bridge between on-premises storage and public cloud storage. But technology bridges are often temporary; they are eventually dismantled when users understand how to get to the other side, and that is already happening in the market for cloud storage gateways. As public cloud has matured and become mainstream, enterprises no longer need a bridge to consume public cloud infrastructure as a service (IaaS). Customer use cases for cloud storage gateways are narrowing to specialized, niche workloads for synchronizing large files in the architecture, construction and engineering verticals. Customers in these verticals often use products from vendors such as Nasuni and Panzura, which provide a global namespace for files across disparate offices, but with local file access performance.

User Advice: There are no vendors in this market with high-growth revenue. Compared with adjacent storage markets, such as those for hyperconverged integrated systems (HCIS) and solid-state arrays (SSA), vendors in the cloud storage gateway market have very modest revenue and customer adoption. As a result, there is inherent risk in becoming dependent on a cloud storage gateway offered by a start-up.

However, there is unique functionality that isn't present in products from more mature markets, such as HCIS. In particular, no other categories of storage products provide a global namespace and file locking. These features serve collaboration and file sharing use cases across disparate geographies that are otherwise underserved by the larger storage market.

Enterprises that require this functionality should factor in the risk associated with small vendors that may eventually get acquired by larger portfolio vendors.

Business Impact: Cloud storage gateways can provide compelling, cloud-based alternatives for customers that want to reduce their in-house footprint for backup/DR processes, archives and unstructured data. Some organizations deploy cloud storage gateways as virtual appliances in compute instances running in public cloud IaaS providers, such as AWS and Google Cloud Platform. The gateways then connect back to the customer's enterprise data center and act as a bridge between elastically scaled compute instances in the public cloud and the data stored on primary storage platforms inside the customer's data center. This scenario is particularly useful for big data workloads where the compute capacity is best used in a temporary, elastic fashion. This model flips the traditional notion of an enterprise's use of public cloud: the enterprise data center becomes an extension of the public cloud, rather than the opposite.

Benefit Rating: Low
Market Penetration: 5% to 20% of target audience
Maturity: Adolescent
Sample Vendors: Amazon Web Services; Avere Systems; Ctera Networks; Dell EMC; Microsoft; Nasuni; NetApp; Panzura

DR as a Service (DRaaS)
Analysis by John P. Morency

Definition: DR as a service (DRaaS) is a cloud-based recovery service in which the service provider is responsible for managing VM replication, VM activation and recovery exercise orchestration. Increasingly, in addition to service offerings that just recover VMs, a growing number of service providers are now offering managed hosting services for hybrid recovery configurations that are composed of both physical and virtual servers.

Position and Adoption Speed Justification: DRaaS worldwide revenue is projected to exceed $2 billion by the end of 2017 and will grow to $3.7 billion by 2022. In addition, the number of providers offering DRaaS now exceeds 400. DRaaS growth will be strongest in areas where customers have limited public cloud options in highly regulated industries (such as finance and healthcare), and where business processes consist of IT systems beyond virtualized x86 environments (such as bare-metal servers, as well as legacy Unix platforms for those providers that support it).

Initially, small organizations with fewer than 100 employees were the DRaaS early adopters. Service uptake in smaller organizations was driven by the fact that they often lacked the recovery data center, experienced IT staff and specialized skill sets needed to manage a DR program on their own, which made managed recovery in the cloud an extremely attractive option. However, since the beginning of 2014, many large enterprises (with 1,000 to 5,000 employees) and very large enterprises (with more than 5,000 employees) have also begun initial piloting or have moved beyond the piloting stage to full production. In 2016, the number of DRaaS-specific Gartner client inquiries increased 88% over 2015.

Because of the growing number of service providers, rapidly falling service pricing and significant increases in service pilot evaluations, Gartner has increased the Hype Cycle position of DRaaS to post-trough 15%.

User Advice: Clients should not assume that the use of cloud-based recovery services will subsume the use of traditional DR providers or self-managed DR any time in the near future. The key reasons for this are computing-platform-specific recovery requirements, security concerns, data sovereignty constraints, active-active operations requirements, software licensing and cost advantages of noncloud alternatives, among others. Therefore, it is important to look at DRaaS as just one possible alternative for addressing in-house recovery and continuity requirements.

Consider cloud infrastructure when:
• You need DR capabilities for existing Windows- or Linux-based production applications.
• Formally managed recovery service levels are required.
• The alternative to a cloud-based recovery approach is the acquisition of additional servers and storage equipment for building out a dedicated recovery site.

Additionally, because public cloud services are still rapidly evolving, carefully weigh the cost benefits against the service management risks as an integral part of your DR sourcing decision making.

Business Impact: The business impact is moderate today. The actual benefits will vary, depending on the diversity of computing platforms that require recovery support and the extent to which service customers can orchestrate recurring recovery exercises that need to be performed. An additional consideration is the extent to which the customer can transparently and efficiently use same-provider cloud storage for ongoing backup, replication and archival. The key challenge is ensuring that these services can be securely, reliably and economically used to complement or supplant the use of more traditional equipment subscription-based services or the use of dedicated facilities. In addition, given that no service, including DRaaS, is immune to scope creep, it is incumbent on service users to ensure that providers consistently deliver on committed recovery time and availability service levels, especially as the size of the in-scope configuration increases and the in-scope data center configuration becomes more heterogeneous.

Benefit Rating: Moderate
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: BIOS ME; Bluelock; C&W Business; IBM Resiliency Services; Infrascale; Microsoft (Azure); NTT Communications; Recovery Point; Sungard Availability Services; Webair

Climbing the Slope
Public Cloud Storage
Analysis by Raj Bala

Definition: Public cloud storage is infrastructure as a service (IaaS) that provides block, file and/or object storage services delivered through various protocols. The services are stand-alone but often are used in conjunction with compute and other IaaS products. The services are priced based on capacity, data transfer and/or number of requests. The services provide on-demand storage and are self-provisioned. Stored data exists in a multitenant environment, and users access that data through the block, file and REST protocols provided by the services.
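
Because the services are priced on capacity, data transfer and requests, a rough monthly estimate reduces to a short formula. The rates below are illustrative placeholders only, not any provider's actual price list:

RATE_PER_GB_MONTH = 0.02      # $/GB-month stored (illustrative placeholder)
RATE_PER_GB_EGRESS = 0.09     # $/GB transferred out (illustrative placeholder)
RATE_PER_10K_REQUESTS = 0.05  # $/10,000 requests (illustrative placeholder)

def monthly_cost(stored_gb: float, egress_gb: float, requests: int) -> float:
    return (stored_gb * RATE_PER_GB_MONTH
            + egress_gb * RATE_PER_GB_EGRESS
            + requests / 10_000 * RATE_PER_10K_REQUESTS)

# e.g., 50 TB stored, 2 TB egress and 30 million requests in a month:
print(round(monthly_cost(50_000, 2_000, 30_000_000), 2))  # 1330.0 with these rates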

Position and Adoption Speed Justification: Public cloud storage has become a critical component of analytics-related workloads, such as Hadoop and Spark, that utilize object storage as a 'data lake.' Innovations in analytics and querying capabilities as they relate to object storage enable enterprises to derive meaning from data faster and to apply those lessons to produce better outcomes. The falling cost of raw storage components and advances in software developed by the hyperscale vendors are enabling enterprises to retain data for longer at lower cost. The result is that data can be reanalyzed and re-evaluated as business conditions change.

User Advice: Do not choose a public cloud storage provider based simply on cost or your enterprise's existing relationship with the provider. The lowest cost providers may not have the scale and operational capabilities required to become viable businesses that are sustainable over the long term. Moreover, these providers are also unlikely to have the engineering capabilities to innovate at the rapid pace set by the leaders in this market. Upheaval in this market warrants significant consideration of the risks if organizations choose a provider that is not one of the hyperscale vendors such as Alibaba, Amazon Web Services, Google and Microsoft. Many of the tier 2 public cloud storage providers that exist today may not exist in the same form tomorrow, if they exist at all.

Utilize public cloud storage services when deploying applications in public cloud IaaS environments, particularly those workloads focused on analytics. Match workload characteristics and cost requirements to a provider with equivalently suited services.

Business Impact: Public cloud storage services are part of the bedrock that underpins public cloud IaaS. Recent advances in the performance of these storage services have enabled enterprises to use cloud IaaS for mission-critical workloads in addition to new, Mode-2-style applications. Security advances allow enterprises to utilize public cloud storage services and experience the agility of a utility model, yet retain complete control from an encryption perspective.

Benefit Rating: High
Market Penetration: 20% to 50% of target audience
Maturity: Early mainstream
Sample Vendors: Alibaba Cloud; Amazon Web Services; AT&T; Google; IBM; Microsoft; Oracle; Rackspace

VM Backup and Recovery
Analysis by Pushan Rinnen and Dave Russell

Definition: VM backup and recovery focuses on protecting and recovering data from VMs, as opposed to the physical server they run on. Backup methods optimized for VM backup typically leverage hypervisor-native APIs for changed block tracking (CBT), which enables block-level incremental forever backup, eliminating the general need for the in-guest agent backup method. Some backup vendors create their own CBT driver before a hypervisor vendor introduces its own and adopt hypervisor-native CBT when it becomes available.
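
A generic sketch of the incremental-forever loop that CBT enables is shown below; the disk and repository objects and the query_changed_blocks call are hypothetical stand-ins for hypervisor-native APIs, not an actual VMware or Hyper-V interface:

def incremental_backup(disk, repo, last_change_id):
    snapshot_id = disk.create_snapshot()                   # quiesce a point-in-time view
    # Hypothetical API: returns (offset, length) extents modified since the
    # change ID recorded at the previous backup, plus a new change ID.
    extents, new_change_id = disk.query_changed_blocks(last_change_id)
    for offset, length in extents:
        repo.store_extent(disk.id, offset, disk.read(offset, length))  # only changed blocks move
    repo.commit_restore_point(disk.id, new_change_id)      # synthesizes a full from increments
    disk.delete_snapshot(snapshot_id)
    return new_change_id                                   # persisted for the next run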

Position and Adoption Speed Justification: Enterprise VM backup typically focuses on VMware and Hyper-V, as they are the most deployed hypervisors in enterprise data centers. For VMware backup, most backup software solutions have abandoned the traditional guest OS agent approach and adopted image-based backup, leveraging VMware's snapshots and CBT. However, snapshots with vSphere versions earlier than 6.0 can cause the 'VM stun' issue in large environments with high change rates, and some backup vendors have developed methods to overcome that issue. Others can convert VM formats between VMware and Hyper-V, which is useful for migration. On the downside, many traditional backup applications require the installation of guest OS agents to perform granular item restore for applications such as Exchange and SharePoint running on VMware. Other differentiators center on ease of use, scalability and self-service capabilities. For Hyper-V, Microsoft introduced its native CBT function in Windows Server 2016, and many backup vendors have integrated with it. While both VMware and Microsoft have native backup tools for small homogeneous VM environments, data center customers continue to use third-party backup tools that are more scalable and have more options. More vendors are offering a socket-based pricing model in addition to a capacity-based model, and a few have also added VM-based pricing.

Agentless backup for other VM platforms is spotty at best, although a few vendors have added KVM support. Backup for containers such as Docker hasn't become a user requirement, as most container deployments are for test/development and for workloads that don't require persistent storage and data recovery. Due to space limitations, the sample vendor list includes only vendors whose products exclusively or primarily back up VMs.

User Advice: Recoverability of the virtual infrastructure is a significant component of an organization's overall data availability, backup/recovery and DR plan. Protection of VMs needs to be taken into account during the planning stage of a server virtualization deployment, as virtualization presents new challenges and new options for data protection.

Evaluate application data protection and restoration requirements before choosing VM-level backup. Additionally, snapshot, replication and data reduction techniques, and deeper integration with the hypervisor provider, should also be viewed as important capabilities. With hundreds to thousands of VMs deployed in the enterprise, and typically with 10 or more mission-critical VMs on a physical server, improved data capture, bandwidth utilization, and monitoring and reporting capabilities will be required to provide improved protection without complex scripting and administrative overhead.

Business Impact: As production environments have become highly or completely virtualized, the need to protect data in these environments has become critical. VM backup and recovery solutions help recover from the impact of disruptive events, including user or administrator errors, application errors, external or malicious attacks, equipment malfunction, and the aftermath of disaster events. The ability to protect and recover VMs in an automated, repeatable and timely manner is important for many organizations.

Benefit Rating: High
Market Penetration: More than 50% of target audience
Maturity: Mature mainstream
Sample Vendors: Actifio; Cohesity; Hewlett Packard Enterprise; Rubrik; Veeam; Vembu; Zerto
