
EMC Greenplum Analytics Workbench With Partners

1,000-node platform with 24PB to accelerate Hadoop testing and development.

At EMC World 2012, EMC Corp. announced that the Greenplum Analytics Workbench – a 1,000-node cluster that will act as a lab environment for accelerating the pace of big data innovation – is now live.

One of the primary uses will be to act as an environment for running scale validation of the Apache Hadoop code base.

Greenplum is working with the Apache Software Foundation to make results from the Analytics Workbench available to the open source community, using the Workbench's resources to further accelerate the development of Hadoop as a revolutionary technology for big data.

Technology from some of the leading software and hardware manufacturers provides the infrastructure.

Greenplum will use it to test the limits of scale-out infrastructure technology and also to explore the models for applying big data analytics.

Whether that involves working with visionary academic institutions on data-intensive research studies, or collaborating with big data application developers, Greenplum has plans to provide innovative thinkers in the data space with access to the Analytics Workbench.

The 1,000-node cluster will also be made available to members of Greenplum’s training and certification classes for Hadoop. With the first publicly available courses launching this summer, Greenplum will offer a set of Hadoop training programs designed to provide participants with the knowledge and programming skills required to leverage Hadoop. A unique aspect of Greenplum’s Hadoop training program is that any individual who completes the course will be granted access to the 1,000-node cluster to use as a sandbox environment.

The Greenplum Analytics Workbench is the result of a collaboration among several hardware and software companies to facilitate the development of Apache Hadoop as a tool for big data analytics, including:

  • EMC
  • Intel
  • Mellanox Technologies
  • Micron
  • Seagate
  • SuperMicro
  • Switch
  • VMware

In addition to 1,000-plus hardware nodes (or 10,000 nodes with the addition of virtual machines), the test bed cluster consists of 24PB of physical storage. This is the equivalent of nearly half of the entire written works of mankind, from the beginning of recorded history.

Chaitan Baru, director, Center for Large-scale Data Systems research (CLDS), San Diego Supercomputer Center, UC San Diego, said: "The Workshop on Big Data Benchmarking organized by the Center for Large-scale Data Systems Research (CLDS), UC San Diego on May 8-9 in San Jose has generated a lot of enthusiasm in developing industry benchmark standards for big data applications. A big data benchmarking community is now self-organizing in order to make progress in this area. Access to the 1000-node cluster from Greenplum will be essential in assisting the community in making progress in this important area, which will have impact in business as well as scientific applications."

Amir Prescher, VP of business development at Mellanox Technologies, said: "Mellanox is excited to be a part of the largest Hadoop test bed cluster ever built and to provide the critical elements that help enable this leading unstructured data analytics Hadoop solution. The new Analytics Workbench takes advantage of Mellanox’s 10/40GbE and FDR 56Gb/s IB interconnect solutions, including its Unstructured Data Accelerator (UDA) software that extends Mellanox’s interconnect capabilities of low latency, high-throughput, low CPU overhead and Remote Direct Memory Access (RDMA), to optimize big data application efficiency with up to 2X faster Hadoop job run-time."

Wally Liaw, VP of Sales, International, Super Micro Computer, Inc., said: "Supermicro has contributed to the 1,000 data node infrastructure and integration resources behind EMC’s Greenplum Analytics Workbench to accelerate innovation and the development of new applications within the Hadoop developer community. Our enterprise-class server platforms offer the highest performance, open standards and cost-effective architecture for massive-scale structured and unstructured data analytics. We are excited to offer growing support to the worldwide Hadoop community as big data sciences expand."

Scott Yara, SVP of products and co-founder, Greenplum, said: "We’re thrilled to announce that the Greenplum Analytics Workbench is now live. With more companies implementing big data analytics than ever before, Hadoop-based batch processing of data at massive scale, with continuous testing, is a key component to driving even better, faster data analytics."
