
Exclusive Interview With Brian Biles, CEO and Co-Founder, Datrium

Came from Data Domain and EMC.
Who is Brian Biles?

He is CEO and co-founder of Datrium.

He was:

  • Founder and VP of product management at EMC backup recovery systems division (following Data Domain acquisition in 2009)
  • Founder, VP of product management and business development for Data Domain
  • VP marketing at VA Linux
  • Director software product marketing at Sun Microsystems
  • Sales and SE at Data General

StorageNewsletter: With Data Domain, you invented a new way to protect data with the now-ubiquitous data reduction approach. And we know the story: Data Domain was acquired by EMC in 2009 in a significant transaction. With Datrium, it’s a new story again, changing some paradigms. Could you share with us the genesis of the company?

Brian Biles: Datrium combined two CTO teams – the first two CTOs from Data Domain (one of whom designed SnapVault for NetApp, deduplication at Data Domain, and was later an EMC Fellow), and two principal engineers from the early days of VMware. We wanted to create the simplest converged infrastructure in the world. Early converged and hyperconverged infrastructures existed at that point, and even though they scored some points, they both had problems.  

The solution for on-prem infrastructure in our mind was to mimic the roles of compute and flash on EC2, with persistence on S3, but make transitions and high efficiency automatic and always-on, eliminating the need for cluster settings and along the way, backup. It also lets hosts be stateless, so it eliminates the fragility of HCI hosts.

As it turns out, this also made it easier to port to AWS for hybrid cloud data management.

And what does the name Datrium mean?
“Data atrium” – reflecting the network gap between DVX data services (such as erasure coding, encryption and de-dupe, on hosts) and data persistence (off hosts).

Datrium started with an idea based on the clear distinction between servers with local SSD storage, known as compute nodes, and a common persistent storage pool with HDDs, aka data nodes, meaning that this approach provides independent scalability at the compute and storage layers. Why did you choose such a design?
While DVX does separate compute and storage, it is more notable for separating persistence from data services and IO speed. No other storage approach for VMs or containers does that, and it makes scaling much, much simpler.

It becomes super simple to decide how to configure, and all expansion can be incremental. There’s no array controller bottleneck whose sizing requires some kind of machine learning. Speed is from hosts and their configurations, capacity is separate and scaled with incremental HA nodes in a scale-out pool.

Unlike HCI, if hosts fail, other hosts still have data access. Services are always fast and always on, so there is no cluster configuration to manage.

It’s automatically faster than AFAs, because nothing’s faster than flash reads within the host. Adjusting IO speed dynamically becomes a matter of normal vMotion. Without persistence on hosts, less hardware needs to be on hosts, so we can support any leading server vendor or type, with their flash, using our host software. We also sell turnkey pods with our own pre-installed Compute Nodes.

Finally, there’s no need to buy separate backup software and an appliance silo. We built the DVX persistence layer on de-duped/compressed/encrypted low-cost drives, with backup-grade policies, catalog and search. This data management layer, supporting more than a million snaps per DVX, has also been ported to AWS for snap storage on S3, as just one more type of replica. The only thing that sucks more than tape is separate backup.

You have started with VMware support, obviously the right choice based on its market presence, we read that you support other hypervisors. Right?
This year we announced support for Red Hat Linux or CentOS with KVM, as well as bare-metal containers with Docker Persistent Volumes, providing data protection policies at that granularity. These hosts can run in the same DVX as ESXi hosts.

And what is your container strategy?
While container developers have a history of stateless containers, there is an emerging interest in persistent volumes (PVs) with container granularity. In our model, starting with Docker PVs, you can snap a container PV on one host and instantly clone it on another. For rapid development, we can eliminate a lot of the time, network bandwidth and capacity related to the copies and data sharing that go on in dev/test. We use containers in our own development, and it boosted the output rate in our dev testbeds by about 2x.
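To illustrate why a PV clone can be effectively instant, here is a toy sketch, not Datrium's implementation: in a content-addressed chunk pool, a snapshot is just an immutable list of chunk fingerprints, so a clone only copies references. Names like `ChunkStore` and the fixed 4 KB chunk size are invented for the example.

```python
import hashlib

class ChunkStore:
    """Content-addressed pool shared by all hosts (toy stand-in for the DVX pool)."""
    def __init__(self):
        self.blobs = {}

    def put(self, data):
        h = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(h, data)  # identical chunks stored once
        return h

def snapshot(store, volume_bytes, chunk=4096):
    """A snapshot is an immutable list of chunk fingerprints."""
    return [store.put(volume_bytes[i:i + chunk])
            for i in range(0, len(volume_bytes), chunk)]

def clone(snap):
    """Cloning copies references, not data: O(metadata), effectively instant."""
    return list(snap)

def read(store, snap):
    """Materialize a volume by following its fingerprints."""
    return b"".join(store.blobs[h] for h in snap)

store = ChunkStore()
pv = b"config" * 1000
snap = snapshot(store, pv)
copy = clone(snap)              # no bulk data moved
assert read(store, copy) == pv  # the clone reads back identically
```

Because the clone shares chunks with its source until either side writes, the dev/test copies described above cost almost no extra capacity or bandwidth.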

Do you provide a pure software flavor of your solution? I mean is DVX downloadable to run on any hardware to leverage commodity and generic servers?
Our host software is available for any recent server from a leading supplier with ESXi or Linux, subject to a modest HCL regarding RAID controllers and their firmware. It is pure software.

To constrain HA testing, our data nodes are pre-configured from Datrium. The components are not exotic, but it gives us a simpler stack to verify. You typically need an order of magnitude fewer of these than hosts.

What is Open Convergence, an approach Datrium promotes a lot?
Converged infrastructure (CI or HCI) is characterized by a couple of constraints. It usually requires a single brand of servers, and in HCI’s case, it prefers single-use app configurations per cluster (clusters are typically fewer than 16 hosts). Mixed-use or mixed-server environments are off-target.

DVX strongly supports mixed-use, mixed vendor, mixed-server-type (e.g. blade, 4-socket, whatever), mixed hypervisor or container settings in the same pod. Host I/O is not host-to-host, and hosts are stateless, which simplifies the constraints of HCI.

What is your view on cloud? How do you integrate it with DVX? It’s a natural tier, right?
Yes. We’ve announced snapshot replication to AWS S3, which we demonstrated at VMworld in August. Once it’s set up, it’s just like any DVX replica – VM or container snaps can be found through search, and restored to a physical DVX in another site. Because our replication is de-duplicated on the wire, it also eliminates a substantial portion of cloud egress costs. We also provide end-to-end “blanket” encryption for all data on the host, across the network and at rest. This is a perfect fit for backup to AWS – providing a secure extension from the private cloud. S3-type services are the Iron Mountain of the future.
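To make "de-duplicated on the wire" concrete, here is a minimal sketch under stated assumptions: fixed-size chunking and SHA-256 fingerprints, with a plain dict standing in for the remote replica (real systems, Datrium's included, use their own chunking and a network protocol, so treat every name here as hypothetical).

```python
import hashlib

CHUNK = 4096  # fixed-size chunks for simplicity; real systems often chunk variably

def chunks(data):
    """Split a byte string into fixed-size chunks."""
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def replicate(data, remote_store):
    """Send only chunks the remote side does not already hold."""
    sent = 0
    manifest = []                          # ordered fingerprints describing the snap
    for c in chunks(data):
        h = hashlib.sha256(c).hexdigest()
        manifest.append(h)
        if h not in remote_store:          # fingerprint check happens before transfer
            remote_store[h] = c            # only unseen bytes cross the wire
            sent += len(c)
    return manifest, sent

store = {}
snap1 = b"A" * 8192 + b"B" * 4096
m1, sent1 = replicate(snap1, store)        # first snap: 8 KB travels (the two
                                           # identical "A" chunks dedupe to one)
snap2 = b"A" * 8192 + b"C" * 4096          # second snap shares 8 KB with the first
m2, sent2 = replicate(snap2, store)        # only the changed 4 KB chunk is sent
```

The second snapshot transfers a fraction of its logical size, which is exactly why wire-level dedup cuts cloud egress costs.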

What are the unique features and differentiators you have developed that put Datrium in its position on the market?
In addition to what we covered in your previous questions, here are some other benefits of our approach:

  • I touched on this in the cloud backup use case – because we implement data encryption in host software co-implemented with data reduction, encryption is end-to-end while preserving data reduction, including encryption in flight, without using added hardware, such as PCI cards or SEDs.
  • Because our persistence layer is log structured, its always-on wide erasure coding does not slow writes, unlike with HCI vendors. This means customer data consumes much less overhead capacity for double-failure protection, and performs well even for hot data.
  • DVX performance is automatically so high it’s ridiculous, so it allows users to stop caring. Read bandwidth in our full configuration is over 200GB/s across 128 hosts, or 18 million IO/s (on Broadwell with SATA SSDs). Compare that to your favorite all-flash array, and note that DVX host SSDs are purchased from server vendors at commodity rates.
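For readers unfamiliar with the erasure-coding overhead argument above, here is a toy illustration, not Datrium's code (which uses a wider, double-failure-tolerant code): XOR parity shows how a stripe rebuilds a lost block, and the arithmetic shows why wide erasure coding needs far less spare capacity than mirroring. The stripe widths in the comparison are illustrative assumptions.

```python
from functools import reduce

def parity(blocks):
    """XOR parity across a stripe (single-failure protection only;
    production codes like DVX's tolerate double failures)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
p = parity(data)

# Lose one data block; rebuild it from the survivors plus parity.
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]

# Capacity overhead for 2-failure tolerance (illustrative stripe widths):
mirror_overhead = 2 / 1    # 3-way mirroring: 200% extra per byte stored
ec_overhead = 2 / 10       # e.g. 10 data + 2 parity blocks: 20% extra
assert ec_overhead < mirror_overhead
```

The wider the stripe, the smaller the parity fraction, which is the "much less overhead capacity" claim in the bullet above; the catch is that naive wide stripes slow writes, which is what a log-structured layout avoids.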

What are the use cases you target, and the ones where you excel without real competition?
Our top use cases are databases and VDI, especially with other mixed apps in the same datacenter, because of automatic high performance and low cost scaling. We’ve always done well with SQL, and we now also support Oracle, for both single-instance and RAC.

Our #1 reason for competitive wins comes from overall infrastructure simplification and consolidation, which is not app-specific. DVX is higher performance than AFAs, while also integrating a backup suite with retention as cost effective as a purpose-built backup array.

How many customers do you have? What are the segments where you see the fastest growth?
In just 6+ quarters, we have hundreds of deployments. Bookings have roughly doubled each quarter this year. We are typically selling to mid-range to large enterprises, government and service providers.

In terms of business model, we understand you’re 100% channel-oriented? How many partners do you have? What is the profile of these partners? Are you looking for new ones?
They are typically partners who have a strong infrastructure business. While the roster is growing, we are not trying to be promiscuous and over-distributed. Instead we are focused on the ones who focus on us. We have over 100 active partners today, predominantly in the US, and will be turning our attention to EMEA partners next quarter.

What about OEM or other alternatives?
We don’t have any today.

How do you price your product?
Compute nodes are priced like well-known x86 servers on the web.

Our host software list price is a flat subscription fee, independent of host scale or size of flash.

Data nodes, with about 100TB effective capacity, are a flat fee with annual support, similar to the price of array drive shelves; you can pool up to 10 of these for more than 1PB aggregate capacity in our 3.0 release.

Could you share your revenue range? We heard something between $20 and $40 million, but it’s a large window. Are we far from reality?
No, this is about right.

How do you see NVMe and NVMe-oF? Hugo Patterson, Datrium CTO, recently explained and detailed architectural differences. What is your plan, given that you already use NVMe devices?
Right, we have supported NVMe in third-party servers since we began shipping in early 2016. Our new compute nodes now also support local NVMe drives. We have not announced anything for NVMe-oF access, but it looks like a promising approach for composable systems in the future.

More generally, how do you see the future of the company and the product? What are the next steps? Any new direction?
We are working hard on some very interesting hybrid cloud technology for next year, extending the world’s simplest converged infrastructure on-prem to the public cloud. Stay tuned.

Read also:
Start-Up Profile: Datrium
Converged storage and compute in a new way, Open Convergence, to simplify webscale and tier-1 private cloud infrastructure deployments.
by Jean Jacques Maleval | 2017.04.07 | News
