Qumulo Helps Undisclosed US Telco Provider Process Logs From Billions of Connections Per Day

Replacing Isilon system with data-aware scale-out NAS

One of the world’s largest telecom providers needed to replace an aging scale-out system with a modern storage infrastructure to handle the immense volume and velocity of machine log data, which is accessed by various analytics tools including Splunk, Inc., for its global network.

Qumulo, Inc.’s data-aware scale-out NAS delivered the necessary capacity, reliability and performance, with improved data visibility and customer support at lower cost.

Finding the Right Solution to Replace Critical Infrastructure
The world’s largest telecommunications providers make billions of daily endpoint connections, and each connection generates log data that must be ingested, stored and processed to identify events or anomalies. These logs represent terabytes of machine data each day, multiple petabytes in total.

Storing that immense volume and velocity of data requires capacity and performance, as well as 100% availability, given the need to analyze data in real time on a continuous basis using analytics tools. Constant uptime takes both reliable equipment and customer service. That was the challenge facing a top US-based carrier.

“When you’re ingesting terabytes of data each day from more than 60 billion incoming events, and hitting processing peaks of 25,000 IO/s for analyzing that data, you need storage capacity and performance that’s very, very scalable,” notes a high-level executive. “You can’t afford a misfire with the production system.”

The relevant data includes various types of machine-generated events such as server syslogs and numerous router telemetry events from many networks. The team stored that machine data on an Isilon system that had been in continuous use for five years. In fact, it hadn’t even been offline in the last three years, for fear the aging system might not come back up.

Knowing the system needed to be replaced, the team initially considered a substantial upgrade. However, after the EMC acquisition of Isilon, the Isilon product roadmap, service and support weren’t meeting expectations, and the option was deemed too high a risk to consider further. With their current vendor out of the running, the team started looking elsewhere for a new system.

The company decided to evaluate object storage to address scalability and leverage the open programmability provided via an API. They learned that sacrificing the standard file protocols the business had come to rely on, such as NFS, was a daunting task that compromised daily operations. Ultimately, the team needed to find a solution with the scalability and API programmability of object storage without sacrificing the reliability of a standards-based scale-out NAS filesystem.

The team was left with an aging Isilon system whose product future was unclear and an object storage solution that didn’t meet their current operational requirements. Fortunately, they also had an ace in their back pocket. The company had spent the last several years watching the progress of a storage alternative as it made its way to the market.

A New Gold Standard for Storage
“Right now, Qumulo’s the closest thing to an Apple unboxing, setup, and support experience in the storage world,” explain a high-level executive and senior engineers on the team. “We loved the innovation and energy of Isilon’s early days, and that’s what we saw in Qumulo. In fact, given all the attention and promise, our only question was ‘could they deliver?’ Fortunately, we found the answer was a resounding yes.”

Qumulo’s data-aware scale-out NAS delivers real-time analytics that provide visibility into data usage and storage across fast, flexible and scalable commodity hardware. This combination offered the storage performance and scale the company required, along with the flexible API-based programmability the object system had promised. Just as importantly, Qumulo’s hybrid solution was less expensive than either of the other systems.

The carrier initially selected Qumulo’s QC208 hybrid storage appliances for its primary ‘gold master data’ cluster, deploying 20 nodes for more than 4PB of raw storage. These systems receive all incoming data, which is centralized and made available to the various analytics tools including Splunk. Centralizing all of the data on Qumulo gives the carrier the flexibility to utilize a variety of tools for analysis. It also deployed eight QC24 nodes, configured as two four-node clusters that serve as a QA test bed for changes to the primary system.

Ease of setup was important to the team, as installation and cable runs are all handled in-house. That wasn’t a problem with the Qumulo system.

“Turns out we had the initial four-node cluster set up before the notification even went out to Qumulo confirming it was on the loading dock,” notes a senior engineer on the team.

Given the degrading service experience for its previous Isilon system after the EMC acquisition, Qumulo was also under the gun to show how nimble it could be – something that’s much easier with agile two-week development and release cycles for OS updates.

“During the sales process, we realized we needed support for Ethernet jumbo frames. Our Qumulo sales rep spoke with development, and the company had it for us in the next OS update,” he says.

Centralizing Machine Data on Qumulo
The carrier migrated all its data over a 30-day period to the new Qumulo production cluster, which currently hosts almost 2PB of machine data and is growing at a rate of more than 2.4TB per day. While scalability is key to handling that capacity load, it’s also critical for meeting the performance requirements, particularly for data reads. A steady stream of log data means high, but manageable, data-write requirements. The system frequently gets pounded for reads, as multiple simultaneous processes and various analytical tools analyze the data looking for anomalies or changes, while applying metadata tags and other details. Throughput and performance demands on the cluster peak above 50MB/s and, as mentioned earlier, 25,000 IO/s.
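As a rough illustration of what those figures imply for capacity planning, the sketch below estimates how long the production cluster could keep absorbing data at the quoted growth rate. It uses only the numbers given in the article (about 4PB raw, almost 2PB stored, 2.4TB per day) and assumes decimal units and constant growth. The function name and the `usable_fraction` parameter are hypothetical, added here because raw capacity overstates what is usable once data protection overhead is taken into account; the article does not state the usable figure.

```python
# Figures quoted in the article, in decimal units (1 PB = 1000 TB).
RAW_CAPACITY_TB = 4000.0    # "more than 4PB of raw storage"
STORED_TB = 2000.0          # "almost 2PB of machine data"
GROWTH_TB_PER_DAY = 2.4     # "more than 2.4TB per day"


def days_until_full(capacity_tb, stored_tb, growth_tb_per_day, usable_fraction=1.0):
    """Days until stored data reaches usable capacity at a constant growth rate.

    usable_fraction is a hypothetical knob for protection overhead;
    1.0 treats all raw capacity as usable, which is optimistic.
    """
    if growth_tb_per_day <= 0:
        raise ValueError("growth rate must be positive")
    headroom_tb = capacity_tb * usable_fraction - stored_tb
    return max(headroom_tb, 0.0) / growth_tb_per_day


runway_days = days_until_full(RAW_CAPACITY_TB, STORED_TB, GROWTH_TB_PER_DAY)
print(f"{runway_days:.0f} days (~{runway_days / 365:.1f} years)")
```

Under these simplified assumptions the estimate is an upper bound of a little over two years. Any realistic `usable_fraction` below 1.0 shortens it, which helps explain the emphasis on adding nodes to scale.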

Qumulo’s real-time data visibility is also increasingly important for the carrier, enabling the team to monitor and manage usage, and see capacity and performance trends. It also supports the team’s objective to move from the previous ‘vault’ storage mentality to an open platform that can be used by other groups within the carrier. Qumulo gives real-time insight on when files get committed, which groups are using what space, etc., all supporting clear communication and easy internal charge-backs.

The clean, intuitive web interface helps with that usability, as does Qumulo’s REST-based API.

“Honestly, the management process is painless,” notes the senior engineer. “I couldn’t ask for things to be any easier.”

Qumulo has also been instrumental in ensuring reliability and adaptability for this massive system. Staff and executives regularly meet with the team to discuss current and future needs. And the Qumulo Care support organization constantly monitors the carrier’s cluster for potential issues.

The team is so pleased with their Qumulo cluster that they have already expanded the deployment with the purchase of another 20 nodes (4PB) to mirror the production cluster for DR, bringing the total current installed base to 48 nodes.

Given the success of this deployment, the organization overall now intends to standardize on Qumulo for its machine storage worldwide.

“I’m reassured that as we grow, Qumulo is right there with us every step of the way,” he concludes.

Solution Overview:

  • 40 QC208 hybrid storage appliances
  • 8 QC24 hybrid storage appliances
  • NFS and REST protocols
  • Qumulo Care enterprise support

Benefits:

  • Achieves the performance needed to handle daily loads of billions of packets and tens of thousands of IO/s
  • Scales throughput and capacity linearly to support theoretically limitless terabyte/day growth with additional nodes
  • Provides real-time data analytics independent of scale for immediate insight on massive capacity and performance trends
  • Delivers usage statistics enabling transition from closed vault to open global intra-department storage platform
  • Ensures uninterrupted operation through continuous Qumulo Care monitoring and proactive support
  • Adapts to demanding requirements of even the heaviest production environments through agile and continual software evolution

