Syncsort to Enhance Hadoop and Drive Adoption of Big Data Framework
Commitment to open source community and new DMExpress Hadoop edition
This is a Press Release edited by StorageNewsletter.com on May 25, 2011 at 2:49 pm
Syncsort announced plans to contribute an external sort plug-in to the Hadoop open source community as part of the company’s commitment to help make the Hadoop framework more robust and easier to use.
Accelerating MapReduce processing and improving the performance of the standard Hadoop sort benchmarks are widely seen as areas where domain experts can play a significant role in enhancing Hadoop and unlocking value for the entire community.
Syncsort’s contribution seeks to enhance the sort framework in Hadoop for all users by making it more modular, flexible and extensible. Organizations can simply use the plug-in to bring their existing investments in sort technology into Hadoop, regardless of whether they are using Syncsort’s DMExpress or another solution.
Additionally, Syncsort is announcing a special DMExpress Hadoop Edition of its record-setting data integration acceleration software that will include Hadoop Distributed File System (HDFS) connectivity and the ability to create jobs in DMExpress’ graphical user interface and run them in MapReduce.
However, what sets the solution apart is its ability to improve performance in MapReduce by shifting transformations to the DMExpress engine and utilizing new Hadoop accelerator technology that Syncsort is bringing to market.
The Hadoop accelerator will make use of Syncsort’s plug-in contribution to the Hadoop community to seamlessly improve the performance of MapReduce jobs through sort, and will invoke high performance compression as needed to deliver storage savings.
DMExpress Hadoop Edition makes MapReduce processing more efficient by providing a simple, self-tuning alternative that dramatically enhances performance and facilitates ongoing development and maintenance.
The solution takes advantage of DMExpress’ light install and resource footprint to enable seamless deployment on all nodes in the Hadoop framework.
Initially offered through a limited beta program, DMExpress Hadoop Edition will become available later in the calendar year.
Applying 40+ Years of Performance Expertise in Data-Intensive Environments to Hadoop
- Hadoop Acceleration – DMExpress Hadoop Edition features proprietary sort algorithms and transformations that optimize Hadoop. Elapsed processing time for existing Hadoop jobs has been reduced by up to 40 percent in TeraSort benchmarks.
- Greater Efficiency – The solution significantly reduces resource utilization, including CPU, memory and I/O, while improving scalability, which lowers hardware requirements and associated costs.
- Easier to Use and Maintain – DMExpress Hadoop Edition requires no tuning, no coding and no MapReduce scripting, which significantly increases IT staff productivity and enables greater focus on strategic projects.
Accelerating Hadoop Processing by 2x in Testing at comScore
- comScore, Inc., a firm that measures the digital world and a preferred source of digital business analytics, has built and defined a market by leveraging ‘Big Data’ to help its customers succeed.
- The company monitors, collects and analyzes more than 20 billion records a day, amounting to terabytes of information, to provide unique insights about users’ online and offline behavior.
- An existing DMExpress customer, comScore engaged Syncsort to accelerate Hadoop processing and, in benchmark testing, achieved 2x faster performance with DMExpress without additional hardware and with minimal coding and tuning.
- The benchmark testing was completed on a 6-node cluster running Cloudera’s Distribution for Hadoop Version 3 (CDH3) and involved terabytes of data.
comScoreSort Benchmark
"We see tremendous potential for using lightweight, easy to deploy tools like DMExpress for accelerating MapReduce processing and making it more efficient," said Michael Brown, Chief Technology Officer, comScore. "The benchmark testing we have completed with Syncsort has exceeded our expectations and we applaud their leadership in contributing to the open source community and applying their expertise to make it easier for data-intensive organizations like comScore to build and maintain transformations in Hadoop."
"Interest in Hadoop continues to rise as organizations are looking to increase the speed of their analysis on large data sets and utilize computing resources more efficiently," said David Menninger, VP and Research Director, Ventana Research. "Syncsort DMExpress Hadoop Edition is designed to address both of these important issues by leveraging their proven algorithms in combination with access to data stored in the Hadoop distributed file system."
"We believe that it is critical for vendors like Syncsort to find meaningful ways to contribute to the Hadoop community and help make Hadoop a stronger, more viable alternative for organizations," said Flavio Santoni, Chief Executive Officer, Syncsort. "As customers increasingly find that it is not economically or technically feasible to track and manage their exploding data volumes in commercial databases, we have a responsibility to innovate and provide solutions that address their most complex, data-intensive environments. We welcome the opportunity to work with the open source community and customers alike to apply Syncsort’s unique expertise and technology to today’s big data challenges."