Exclusive Interview With Peter Godman, CTO and Co-Founder, Qumulo

He is:

Founder and CTO at Qumulo Inc.

He was:

Founder and CEO at Qumulo
Founder, VP engineer then CEO at Corensic Inc.
Director of software engineering at Isilon Systems
Development lead at RealNetworks
Software development engineer at Gemstar-TV Guide

StorageNewsletter.com: Could you remind us of the roots and genesis of Qumulo, as well as the background of its founders?
Peter Godman: Qumulo was founded in March 2012 by Neal Fachan, Aaron Passey, and myself. We met each other while working at Isilon Systems, where we were distinguished engineer, chief architect, and director of software engineering respectively. We all left Isilon in 2008 and went to work at start-ups that revolved around CPU scalability and database scalability. We decided to get back together and solve some new problems in storage in 2012, and founded Qumulo with the mission to be the company the world trusts to store, manage, and curate its data forever. We wanted to create a solution that offered enterprise customers the freedom to store, manage and access their file-based data in any operating environment, at petabyte and global scale.

So far Qumulo has raised $130 million. Any new round in 2018? Are you far from being profitable?
As a private company, Qumulo doesn’t disclose financial details. What I can say is that Qumulo is experiencing a period of rapid growth and expansion and we will consider tapping into the capital markets to maximize the market opportunity in front of us.

Qumulo represents a new iteration in scale-out NAS following the success of Isilon. What motivated you to start this adventure, and what limitations did you find in OneFS?
I remember 15 years ago a great manager I had posed me a question: what would it mean to make a storage system that could handle a billion files? Back then it was an inspiring question, but not a practical one. Today a billion is quotidian. The equivalent scale challenge is a trillion files.

One of the best things about working in storage is that the need for capacity only grows over time, but the nature of the problem changes simultaneously. Today the problem is about utilizing the cloud well, dealing with a medium that is declining rapidly in cost but which offers a fraction of the performance/capacity that it delivered 10 years ago, and managing data sets at trillion-file scale.

The exec team has changed a bit, and the company mission is pretty much the same. But it seems that something has changed in the company during the last few quarters, as illustrated by market adoption. What has changed?
Hierarchical file storage just celebrated its 50th anniversary a couple of years ago. Over those 50 years, file has attracted a huge amount of complexity, from client protocols to caching and locking, hierarchy, replication, snapshots, quotas, and many other ‘table stakes’ pieces of functionality. It’s tough to cross the chasm if you try to build every piece of functionality that anyone who uses file storage needs.

Qumulo chose a path of delighting a narrow set of potential customers and use cases, and gradually broadening that set out to successive generations of use cases and customers. When you see Qumulo taking off in the market, what you’re seeing is Qumulo crossing the chasm for successive use cases and verticals, and the flywheel spinning up. Today we’re enjoying success in many use cases in media and entertainment, life sciences, automotive, research computing, and various other verticals. As long as we stay focused on making sure we delight some rather than disappoint all, this flywheel will continue to spin faster.

Why do you say in your corporate presentation that the company was founded in 2015 and not in 2012, as listed in various places?
Qumulo was founded in 2012. We did not publicly launch out of stealth, however, until March 2015. Some folks have speculated that we pivoted or rebooted, but this isn’t the case. Qumulo is doing what it’s always been doing: delighting customers storing and using giant data sets. Building mature file systems that scale with quality data services is hard; it is one of the most challenging endeavors in storage, if not the most challenging. Qumulo offers a more mature product 5 years in than many have offered at 15.

Many founding CEOs change position after a few years, often in step with a VC round. We understand this was the case at Qumulo, where you took the CTO position. Do you consider this one of the reasons for Qumulo’s new era: the need for a dedicated CTO and a new CEO with fresh ideas and market approaches?
From day one at Qumulo, I was clear with the rest of our team that I’d be CEO as long as I felt like that would result in the company’s growth and success. As the business started to scale, I came to believe that our success would come from a structure where I focused on product and technology, and a great partner led the business and company. I recruited Bill – who I’ve known and worked with for ten years – to Qumulo, and we’ve been crushing it at a high level ever since.

Some object storage vendors dreamed of replacing file storage, and some of them even claimed to be Isilon 2.0. Tell us why offering a file system on top of object storage is hard and ultimately can’t deliver on the same promises?
Fundamentally it’s really tough to implement file on top of object.

1. Files can be changed after they’re created. Typically, objects cannot. So, if I want to have every file be an object, do I need to rewrite the object every time I want to change the file?

2. Objects live in a flat namespace, files live in a hierarchy that can be changed. If I want the files /home/pete/movie{1..1000}.mov to have object names “/home/pete/movie{1..1000}.mov”, then what happens when I want to rename /home/pete to /home/pete.old? In file, I change one directory name. In object, I rename 1,000 objects.

3. You can solve the above problems by not having files correspond to objects, in which case the contents of the object store become meaningless… a scenario end-users often refer to as a ‘blob’.

4. File systems have been optimized for decades for performance. They handle caching, including write-back caching on the client side. They provide for users locking assets against other users, and many other things. Object typically does none of this, or if it does, it does it through a non-scalable gateway that renders the objects gobbledygook per #3.

5. Object storage systems still frequently have silly limits in number of objects per bucket, object size, etc. Since ZFS arrived on the scene, these sorts of limits are less common in file systems.
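
Point 2 above can be made concrete with a small sketch. The two classes below are hypothetical toy models written for this illustration (they are not any vendor's API): one keeps objects in a flat namespace keyed by full path, the other keeps a directory tree. Renaming /home/pete touches every key in the flat model but only one directory entry in the hierarchical one.

```python
# Hypothetical toy models; names and methods are assumptions for illustration.

class FlatObjectStore:
    """Objects keyed by their full path in one flat namespace."""

    def __init__(self):
        self.objects = {}  # full key -> bytes

    def put(self, key, data):
        self.objects[key] = data

    def rename_prefix(self, old, new):
        """Re-key every object under `old`; return how many keys were rewritten."""
        moved = [k for k in self.objects if k.startswith(old + "/")]
        for k in moved:
            self.objects[new + k[len(old):]] = self.objects.pop(k)
        return len(moved)


class Directory:
    def __init__(self):
        self.entries = {}  # name -> Directory or bytes


class HierarchicalFS:
    def __init__(self):
        self.root = Directory()

    def _walk(self, parts, create=False):
        node = self.root
        for p in parts:
            if create and p not in node.entries:
                node.entries[p] = Directory()
            node = node.entries[p]
        return node

    def write(self, path, data):
        *dirs, name = path.strip("/").split("/")
        self._walk(dirs, create=True).entries[name] = data

    def read(self, path):
        *dirs, name = path.strip("/").split("/")
        return self._walk(dirs).entries[name]

    def rename_dir(self, old, new):
        """Rename within the same parent: one entry changes, whatever the subtree size."""
        *parent_parts, old_name = old.strip("/").split("/")
        new_name = new.strip("/").split("/")[-1]
        parent = self._walk(parent_parts)
        parent.entries[new_name] = parent.entries.pop(old_name)
        return 1


store, fs = FlatObjectStore(), HierarchicalFS()
for i in range(1, 1001):
    key = f"/home/pete/movie{i}.mov"
    store.put(key, b"")
    fs.write(key, b"")

object_ops = store.rename_prefix("/home/pete", "/home/pete.old")  # 1,000 keys rewritten
file_ops = fs.rename_dir("/home/pete", "/home/pete.old")          # 1 entry changed
```

The sketch only handles a rename within the same parent directory, which is all the example in the text requires.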

Exposing an object interface on top of a file system is almost trivial. In other words, people don’t care about object storage as an architecture; they care about how to interact with the storage system or service through an object protocol. What’s your position on that? And how do you view and work with the cloud, S3 and others?
By contrast with the previous question, it’s easy to build immutability on top of a system of mutable files. It’s easy to build size limits into objects on top of a file system without limits. It’s easy to build a key-value store on top of a hierarchy.
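
As a rough illustration of that answer, here is a minimal sketch of a write-once key-value store layered on an ordinary directory tree. The class name, the `max_object_size` parameter and the key-to-path mapping are all assumptions made for this example; it is not how Qumulo or any S3 gateway is implemented.

```python
import os
import tempfile


class ImmutableObjectStore:
    """Write-once key-value store on top of a directory tree (illustrative sketch)."""

    def __init__(self, root, max_object_size=1024):
        self.root = root
        self.max_object_size = max_object_size  # artificial limit imposed on top of files

    def _path(self, key):
        # Keys map directly onto the hierarchy; this sketch does not sanitize "..".
        return os.path.join(self.root, *key.strip("/").split("/"))

    def put(self, key, data):
        if len(data) > self.max_object_size:
            raise ValueError("object exceeds size limit")
        path = self._path(key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Mode "xb" raises FileExistsError if the file exists: write-once semantics.
        with open(path, "xb") as f:
            f.write(data)

    def get(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()


with tempfile.TemporaryDirectory() as root:
    store = ImmutableObjectStore(root, max_object_size=16)
    store.put("videos/clip1", b"frame-data")
    first_read = store.get("videos/clip1")

    try:
        store.put("videos/clip1", b"changed")
        overwrite_rejected = False
    except FileExistsError:
        overwrite_rejected = True

    try:
        store.put("videos/big", b"x" * 17)
        oversize_rejected = False
    except ValueError:
        oversize_rejected = True
```

Immutability, the size limit and the key-value mapping each take only a few lines here because the file system underneath already provides the harder primitives.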

What consistency model did Qumulo choose? Within a DC and across DCs?
Within a DC, Qumulo is fully consistent. Across DCs, using replication, Qumulo is eventually consistent.

As volumes grow to billions of files, walking the file tree becomes impossible. We can’t imagine running the Linux find command anymore. How did you address this challenge? You told us recently about QumuloDB, could you elaborate on that?
Over time we see file systems and document databases converging. Qumulo builds an index into the nodes of its file system, which we call QumuloDB. It’s a data structure that’s a variant of a heap or a tournament tree or Merkle tree. It makes many operations, like du (disk usage Unix/Linux command), much faster or instantaneous, and this functionality continues to evolve as Qumulo matures. This is an involved topic which a blog post and our whitepaper cover in depth.
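
The general idea of keeping aggregates in the tree nodes can be sketched as follows. This is a generic illustration, not QumuloDB's actual design, which the interview does not detail; the `Node` class and its fields are hypothetical. Each node caches the total bytes beneath it, every size change is pushed up to the ancestors, and du becomes a single lookup instead of a tree walk.

```python
class Node:
    """A file or directory that caches the total bytes stored beneath it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.own_bytes = 0    # size of this file itself (0 for directories)
        self.total_bytes = 0  # aggregate: own_bytes plus all descendants

    def child(self, name):
        """Return the named child, creating it if needed."""
        if name not in self.children:
            self.children[name] = Node(name, parent=self)
        return self.children[name]

    def set_size(self, size):
        """Set this file's size and push the delta up to every ancestor."""
        delta = size - self.own_bytes
        self.own_bytes = size
        node = self
        while node is not None:
            node.total_bytes += delta
            node = node.parent


def du(node):
    """Disk usage read from the cached aggregate, without walking the subtree."""
    return node.total_bytes


def du_by_walking(node):
    """Reference version: the full recursive walk that the aggregate avoids."""
    return node.own_bytes + sum(du_by_walking(c) for c in node.children.values())


root = Node("/")
pete = root.child("home").child("pete")
pete.child("movie1.mov").set_size(700)
pete.child("movie2.mov").set_size(300)
root.child("tmp").child("scratch").set_size(50)
pete.child("movie1.mov").set_size(500)  # a later write shrinks the file
```

The trade-off is that each write costs one update per ancestor directory (proportional to path depth), in exchange for usage queries that no longer depend on the number of files in the subtree.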

The company has made some recent announcements around QF2, i.e. Qumulo File Fabric. Could you summarize them? I’m especially interested in understanding your asynchronous continuous real-time replication. It’s not yet block-based and de-duped. When do you plan to add these?
Most of what we used to call Qumulo Core we now call Qumulo File Fabric (QF2). QF2 can now run in cloud instances as well as appliances and HPE hardware. Qumulo’s replication technology connects these data locations together on a per-directory basis allowing for cloud-bursting and cloud-native file workflows.

For the use cases that Qumulo focuses on, transparent data projection into the cloud and remote collaboration are perhaps the most important applications for replication beyond traditional DR. So, our focus is on delivering replication that facilitates completely new workflows rather than taking on use cases like VMs and databases that we feel are reasonably well-served by other vendors’ products. There are certainly scenarios where block-based (incremental-within-file) and deduplicated replication are primary concerns, but these are typically minor concerns in large scale unstructured data, the broad class of use cases we cover. Again, were we focused on hosting VMs and databases, these features would be of core concern.

Initially you offered only appliances, and today you let users choose between software and appliance. How do you position the two approaches? How does it impact the pricing, as you leave money on the table when you sell software?
We consider the transition to software-defined storage as inevitable. At the same time, it’s 100% required that a storage vendor ensures that its customers use hardware that isn’t going to let them down. Some companies have made the mistake of letting customers use any old hardware with their storage systems, and have lost their data as a consequence.

Because Qumulo has considered this transition inevitable, we’ve only ever sold software subscriptions and minimally marked-up hardware. So, regardless of whether people use our hardware or HPE’s or anyone else’s, Qumulo charges the same for software. We don’t leave money on the table for not selling hardware. We provide the valuable service of turning standard hardware into arbitrarily-scalable, high-performance filers, and this commands the same value regardless of hardware.

Your approach is about delivering super fast file storage based on standard file sharing protocols such as NFS and SMB. Any plan to add a parallel flavor to it? It could be with the support of pNFS, as the Red Hat distribution comes with a built-in client, or by offering a special software layer. And NFS 4.2 seems to be pretty good for that. This is an approach also chosen in media and entertainment with other file storage solutions.
Parallel file systems such as NFSv4.2 solve an important problem, specifically for massively concurrent reads. Fundamentally, a system where data doesn’t need to make two hops to get to the client is going to exhibit superior read performance after all optimization is done. Although Qumulo already delivers better than 2GB/s read performance from its appliances, and even though we have a clear line of sight to 5GB/s, that read performance will increase even further with parallel file system support. The challenge in using parallel file systems is managing the impact on clients. Expect to see Qumulo embrace parallel file system technologies over time.

Qumulo is a choice for many special effects and animation studios. Could you list some of them and tell us what were the key differentiators in your favor? What are the I/O characteristics that make Qumulo the right fit?
Qumulo’s media and entertainment customers include alter ego, Ant Farm, Atomic Fiction, Awesometown, Blind Studios, Crafty Apes, The DAVE School, Deluxe VR, DreamWorks Animation, Eight Solutions, FotoKem, FuseFX, Intelligent Creatures, Mr. X, MSG Networks, Pipeline Studios, RodeoFX, Sportvision, ZOIC Studios. These media and entertainment companies are choosing QF2 to accelerate their mission-critical file-based workloads, including VR, VFX, 4K, animation rendering and broadcast.

QF2 is the world’s first universal-scale file storage system. It was designed from the ground up to meet all of today’s requirements for scale. QF2 runs in the data center and on AWS, and can scale to billions of files. It handles small files as efficiently as large ones. QF2’s analytics let administrators drill down to the file level, get answers, and solve problems in real-time.

Perhaps most importantly, Qumulo is giving animation and VFX studios a path to burst rendering in cloud. These workflows are notoriously peaky, and it’s well-understood that the future is renting rather than owning the peaks. Qumulo QF2 lets studios do this.

What are the other use cases and market segments you address?
Qumulo has quickly become the most scalable and high performance file storage solution in the world, with marquee customers in nearly every industry segment including media and entertainment, life sciences, oil and gas, automotive, telecommunications, higher education and rapidly emerging workloads for IoT, machine learning and AI.

From a go-to-market perspective, you sell through channel partners. What about HPE or other hardware vendors, and OEMs more generally?
We are 100% channel-centric with more than 100 partners across North America. We currently have a deep technology partnership with HPE. In 2017, we joined the HPE Complete program, which adds our solutions to the HPE ecosystem, giving enterprise customers the opportunity to purchase complete, validated HPE and Qumulo solutions directly from HPE and its resellers. In 2018 you can expect to see us partner with several other hardware vendors.

We estimate your revenue around $40-50 million, are we far from the reality?
As a private company, Qumulo doesn’t disclose financial details. Having said this, your estimates are not wildly off target.

How did you finish 2017?
2017 was another landmark year for Qumulo. We grew our business and customer base exponentially, while leading an industry-high Net Promoter Score (NPS). One of the drivers of that success is rapid product improvement. In 2017 we expanded the use cases and verticals we serve by growing our platform coverage, we brought to market QF2 on an all-flash platform, partnered with HPE to deliver QF2 on world-class HPE Apollo servers, delivered new peaks in performance, modernized enterprise features like real-time quotas, patented snapshot technology, and blazing fast intelligent replication. On top of all of this, we launched the world’s fastest and most fully featured file product in AWS and expanded our operations in EMEA.

What do you plan to add to the product in 2018?
We have aggressive product plans for 2018. We will expand our platform coverage to address more points on the performance/capacity curve, deepen our partnership with HPE, and introduce new OEM vendors. In the same time frame we will introduce intelligent performance in our product, enabling QF2 to ‘learn’ customers’ workloads. Our cloud product will break new ground, offering cloud-native file experiences for provisioning resources and designing workloads. Finally, as our customer base continues to grow rapidly, we will continue to invest in enterprise features that help more large customers standardize on Qumulo.
