GigaOm Report: Storage for Kubernetes

Container storage interface specs still immature, which is reflected in products proposed by most vendors

This report is authored by Enrico Signoretti, senior storage analyst at GigaOm, independent analyst, blogger, and contributor to El Register, based in the Rimini area, Italy.


Market Landscape Report
GigaOm Radar for storage for Kubernetes

Kubernetes adoption is quickly accelerating, and enterprises are now in a transition phase. In the last few years, we have seen an increasing interest in container-based application development. As a result, IT organizations started to implement PoCs and labs, which later moved to development and test platforms. In this period, the entire industry has matured, both in terms of the core technology (container formats and development tools) and orchestrators, with several companies trying to push their solutions (e.g., Docker Swarm, Mesos DC/OS, Google Kubernetes, and others).

Now that Kubernetes is the clear winner, the number of organizations moving to the production phase is finally growing as well. In most cases, Kubernetes infrastructures are still relatively small, and applications running on them are fairly simple, with limited storage needs. On the other hand, more and more stateful applications are migrating to these platforms, requiring additional resources and performance. At the same time, enterprises of all sizes are embracing hybrid cloud strategies that are becoming more complex and structured. We are quickly moving from a first adoption phase where data and applications are distributed manually and statically in different on-premises and cloud environments to a new paradigm in which data and application mobility is the key for flexibility and agility.

Now, organizations want the freedom to choose dynamically where applications and data should run, depending on several business, technical, and financial factors. They choose the public cloud for its flexibility and agility, while on-premises infrastructures are still a better option from efficiency, cost, and reliability perspectives. In this scenario, it is highly likely that development and testing are done on the public cloud while production could be on-premises, in the cloud, or both, depending on the business, regulatory, economic, and technical needs of the particular enterprise.

Kubernetes is instrumental in executing this vision, but it needs the right integration with infrastructure layers, such as storage, to make it happen. Persistent and reliable storage, alongside data management and security, are vital factors to consider when evaluating Kubernetes deployments in enterprise environments today. These factors expand the scope of the orchestrator to a broader set of applications and use cases across different types of on-premises and cloud infrastructures. The goal is to provide a common storage layer that is abstracted from physical and cloud resources, with a standard set of functionalities, services, protection, security, and management.
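As a rough illustration of what a storage layer "abstracted from physical and cloud resources" looks like from the application side, the sketch below builds two Kubernetes PersistentVolumeClaim manifests in Python. The claim name (`orders-db`) and the storage class names (`onprem-fast`, `cloud-standard`) are hypothetical and do not come from the report; the point is only that the request keeps the same shape while the class name selects the backend.

```python
import json


def make_pvc(name: str, storage_class: str, size_gi: int) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The storage class is the only backend-specific field.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical class names: one backed by an on-premises array,
# one backed by a public cloud volume service.
on_prem = make_pvc("orders-db", "onprem-fast", 50)
cloud = make_pvc("orders-db", "cloud-standard", 50)

# Serialize one claim; JSON is accepted wherever YAML manifests are.
print(json.dumps(on_prem, indent=2))
```

Which data services (snapshots, replication, DR) sit behind each class still depends on the storage product and its CSI driver, which is the gap the rest of this report examines.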

Market Categories and Deployment Types
For a better understanding of the market and vendor positioning (Table 1), we categorized solutions for Kubernetes storage by target market segment (small-medium enterprise, large enterprise, or ISP/MSP) and by architecture (enterprise storage systems with a container storage interface (CSI) plug-in, software-defined storage (SDS) solutions with optimizations for containers, or cloud-native solutions):

  • Small-medium enterprise: In this category, we find solutions that meet the needs of very small businesses and can grow to address those of medium-sized infrastructures. They can also be solutions adopted by large enterprises for departmental use cases: they lack a very rich feature set and have limited data mobility and management capabilities, but are easy to use and deploy.
  • Large enterprise: Usually adopted for larger and business-critical projects, solutions in this category have a strong focus on data management and mobility features; they also provide additional features to improve security and data protection. Scalability is another big differentiator, including the ability to manage multiple clusters from the same user interface.
  • ISP/MSP: Even though the number of solutions in the Internet service provider/managed service provider category is still limited, they usually have the same characteristics as those for enterprises, with an added focus on multi-tenancy and manageability.
  • Traditional storage arrays with CSI plug-ins: This type of solution is the most common at the moment and, usually, is the first to be adopted by users. They are easy to deploy and allow the reuse of storage resources already in place, with a minimal initial investment. On the other hand, most of them are immature, limited in operational performance, data management, and scalability. Some of the systems in this category, not designed to cope with the number of back-end operations necessary to run a Kubernetes cluster, risk creating bottlenecks that can impact SLAs of critical production environments.
  • SDS with optimizations for containers: The flexibility of this type of solution mitigates some of the deficiencies of traditional arrays while keeping the storage infrastructure compatible with traditional workloads and applications. The optimizations allow users to adopt Kubernetes gradually while granting good SLAs to both traditional and next-gen environments.
  • Cloud-native solutions: These solutions are expressly designed to work with containers and Kubernetes across on-premises and public cloud environments, in a hybrid and multi-cloud fashion. Usually architected around a set of core features focused on data management and mobility, with specific data services tailored for containers, they take advantage of storage resources local to each single node in the cluster, cloud storage, or traditional enterprise shared storage such as NAS and SAN infrastructures.

Table 1: Vendor Positioning

Key Criteria Comparison
Following the general indications introduced with the Key Criteria for Evaluating Hybrid Cloud Data Protection Report, Table 2 quickly summarizes how each vendor included in this research performs in the areas that we consider differentiating and critical for modern data protection. The objective is to give the reader a snapshot of the technical capabilities of different solutions and define the perimeter of the market landscape.

Table 2: Key Criteria and Evaluation Metrics Comparison

GigaOm Radar
All the key criteria and the critical feature impact analysis are consolidated in the following graphic representation: the GigaOm Radar (Figure 1). This vector-based graphic gives an overall perspective on all the vendors included in this research in terms of technical capabilities and features (Table 2) and execution on the vision, regardless of their market share or segment (Table 1).

Figure 1: GigaOm Radar for storage for Kubernetes

Vendor Roundup/Overview

DataCore
It has a two-fold strategy for persistent storage for Kubernetes. On one hand, it is developing a CSI plug-in for its block and file storage products, aimed at servicing its current customers. On the other hand, it is investing in Maya Data, the sponsor behind OpenEBS, an open source solution designed for Kubernetes storage. This approach will help the company give answers immediately while having access to IP for the development of future solutions.
Strengths: The CSI plug-in has a simple and efficient design and aligns with the latest CSI specs. It is easy to adopt and carries no additional license fees.
Weaknesses: CSI specs are still immature and cannot take advantage of all the potential offered by the company’s platform.

Datera
Its Data Services Platform is a flexible, scale-out SDS solution that bridges the gap between legacy and modern infrastructures. It is integrated with Kubernetes CSI and aligns well with the latest API specs.
Strengths: Storage classes are implemented very well, enabling the DevOps team to take full advantage of the backend resources and flexibility of the solution.
Weaknesses: Advanced data services (data protection, remote replication, DR) are not yet implemented due to CSI limitations.

Dell EMC
CSI plug-ins are now available for all Dell EMC storage systems, and the company is actively working to improve them. The implementation is still basic, and there are no features aimed at simplifying the deployment and management of Kubernetes storage in large infrastructures.
Strengths: Simplicity, free plug-ins, and ease of adoption for customers allow users to take advantage of existing infrastructures and start quickly without further investments.
Weaknesses: The solution is still immature and does not allow planning for hybrid cloud infrastructures. There are no DR options and no official integration yet with Prometheus or other Kubernetes monitoring platforms.

Diamanti
It offers an end-to-end solution that resembles HCI in the virtualization space. It comes with a fully supported standard Kubernetes distribution pre-installed, or the customer can choose Red Hat OpenShift instead.
Strengths: High resilience, performance, and ease of deployment and management are the most important characteristics of this platform, which also offers several features to overcome the limitations of CSI and Kubernetes networking. Thanks to the support of custom resource definitions (CRDs), the firm can seamlessly replicate data to the public cloud for migrations or DR.
Weaknesses: The solution is relatively expensive and focused on large enterprise deployments. Even though it is based on commodity hardware and is very efficient, the company offloads many storage and network operations to the hardware, limiting hardware choice.

Hitachi Vantara
It offers a CSI plug-in for all its storage systems. The plug-in is available to all its customers and gives them a quick path to adopting Kubernetes.
Strengths: The firm has a forward-looking vision around containers, and Kubernetes in particular, which is aligned with its current product portfolio regarding data analytics and IIoT while protecting existing infrastructure investments.
Weaknesses: Current CSI plug-in is still immature and lacks some features such as remote replication for DR. This is due to the company strategy, focused on adhering to CSI specs and APIs without adding non-standard functionalities.

IBM
It offers CSI plug-ins for all its major storage platforms. The plug-ins are open source and free to download and install.
Strengths: Customers can take advantage of the installed base to start quickly with their Kubernetes deployments.
Weaknesses: Plug-ins are immature, and the firm’s storage products do not allow for planning for hybrid cloud infrastructure deployments.

Infinidat
One of the main advantages offered is the possibility to consolidate a large number of applications, workloads, and data in very few systems. In this context, the company provides a compelling solution for customers that want to consolidate Kubernetes applications alongside others.
Strengths: The new CSI plug-in aligns with the latest specs. Additionally, the firm simplifies DR and data migrations across on-premises and major cloud providers thanks to its Neutrix Cloud service.
Weaknesses: Even though data volumes for Kubernetes can be grouped for monitoring in the UI, there is no specific integration with Prometheus yet.

Maya Data
Its OpenEBS Enterprise bundles enterprise functionalities and support with the OpenEBS open source project, including a series of tools to improve data migration, mobility, visibility, and infrastructure hardening. Users and developers can adopt OpenEBS at no cost, while the company offers very flexible licensing.
Strengths: It offers a no lock-in approach for users that want to deploy an open-source storage solution for Kubernetes in a multi-cloud environment.
Weaknesses: The solution offers tools to control and migrate data between Kubernetes clusters, but its current implementation has some limitations that reduce its potential for highly demanding enterprise use cases. Future releases of the product will address this issue.

NetApp
It is building Trident, a complete storage orchestration platform to address Kubernetes challenges. This simplifies storage provisioning and management for DevOps teams while offering a consistent user experience across private and public cloud deployments.
Strengths: Overall strategy is solid, and Trident is a software component that fits very well in the company’s Data Fabric vision. The solution shows a good feature set and an interesting roadmap, contributing to a good ROI on the NetApp solution.
Weaknesses: Trident is fully CSI compliant but still misses some advanced features that would simplify data migrations, remote replication, DR orchestration, and backup.

Portworx
It provides one of the most compelling solutions for enterprise storage dedicated to Kubernetes infrastructures. Its cloud-native architecture is combined with unique data services that enable enterprise organizations to deploy business and mission-critical applications without the limitations imposed by traditional solutions.
Strengths: Portworx Enterprise simplifies most of the operations that are currently limiting enterprise Kubernetes deployments for stateful applications, improving data protection and management processes with a positive impact on overall infrastructure TCO. It can work with local storage installed on cluster nodes or traditional enterprise shared storage.
Weaknesses: Most storage vendors offer free CSI plug-ins and backend resource orchestration, allowing enterprises to start with minimal or no investment.

Pure Storage
It demonstrated again its ability to propose solutions with the right combination of performance and usability that adapt to a broad range of use cases. From this point of view, Pure Storage Orchestrator (PSO) has the right characteristics to allow a smooth adoption of Kubernetes that leverages existing storage resources and minimizes initial investment.
Strengths: PSO is a good solution and has the potential to become a key differentiator for the company when end users evaluate storage solutions ready to support containers alongside virtualized and physical systems.
Weaknesses: The solution is not ready for all use cases, especially when backup and remote replication for DR are involved. Pure Storage has best practices in place to mitigate this limitation while working on the right implementation following the evolution of CSI specs.

Red Hat
Its OpenShift Container Storage (OCS), based on Ceph, has been designed for simplicity and ease of use. It is integrated with OpenShift, and all management operations are automated using Rook (a CNCF orchestration tool for storage), enabling DevOps teams with little or no storage knowledge to get a simplified hyperconvergence-like experience.
Strengths: Simplicity and ease of use of the solution, completely integrated with the Red Hat OpenShift Container Platform. The solution can be deployed on-premises as well as in the cloud. Simple support subscription model aligned with OCP support and licensing.
Weaknesses: Rigidity of the configuration and limited scalability can pose some risks for large deployments, and the customer may be forced to replace OCS with a standard Ceph installation.

StorageOS
It offers a cloud-native solution designed explicitly for Kubernetes. As such, the product is more flexible and easier to manage and integrate with the Kubernetes platform than traditional storage systems. It can be deployed both on-premises and in the cloud, providing the same functionalities in different environments.
Strengths: Good, lightweight, and efficient architectural design. The company offers development licenses (up to 500GB of storage).
Weaknesses: It still lacks data services features (e.g., snapshots and remote replication) and DR orchestration, limiting the possibility to use it in complex environments that need advanced data protection, data migrations, and fast cloning.

CSI specs are still immature, and this is reflected in the products proposed by most vendors. Many storage vendors have chosen a conservative approach that follows the development of CSI specs. This approach means that the integration between the storage platform and Kubernetes is limited, offering scarce support for the features that are usually considered mandatory in an enterprise environment.

We identified three groups of vendors in this space, characterized by the level of sophistication of their approach and the features available on their platform.

The most conservative ones provide a basic CSI plug-in and expose limited functionality from the array. In this group, we find Dell EMC, Hitachi Vantara, and IBM. This approach severely limits the possibility of implementing Kubernetes for mission-critical environments, especially in hybrid cloud scenarios.

The second group of vendors (Red Hat, Infinidat, Datera, Pure Storage, NetApp, DataCore, StorageOS) opted for a more sophisticated approach that gives additional options to the end user. Some enterprise features are still missing, but the overall strategy of these vendors is much more aggressive, with clear roadmaps and best practices to overcome the limitations imposed by current CSI specs. Two of particular note in this group are NetApp and Pure Storage; their approach is more holistic and offers better flexibility in large infrastructure and hybrid cloud environments.

Maya Data is worth a mention as well because of the open-source core (OpenEBS) and the innovative tools and services included in the enterprise support subscription.

The leading group is now composed of a couple of start-ups (Diamanti, Portworx), with products designed specifically for Kubernetes and able to overcome the limitations imposed by CSI. In this group, it is worth noting the different approaches. Diamanti is a hyper-converged solution aimed at building a cloud-like Kubernetes experience for large enterprises, with high performance and ease of use as its primary characteristics. In contrast, Portworx is more focused on flexibility and a consistent user experience across different environments. It offers a very extensive feature set that extends storage with data management functionalities, aimed at building a consistent data services layer for Kubernetes that spans different clouds.