What are you looking for ?
Advertise with us
RAIDON

Cloudera Distribution for Hadoop

An open source product to store and process petabytes of information distributed across thousands of servers

Cloudera, the commercial Hadoop company, announced the general availability of the Cloudera Distribution for Hadoop, an open source product used to store and process big data: petabytes of information, often distributed across thousands of servers. Hadoop is in production use at most of the world’s largest Web companies, including Facebook, Google, and Yahoo!. Cloudera, with the financial backing of Accel Partners, is the first company to develop technology to bring Hadoop into enterprise data centers.

cloudera_distribution_for_hadoop

"After working with large Hadoop deployments at companies like Facebook, Google and Yahoo!, we came to realize that people needed Hadoop installation, configuration, and management to be much easier," said Christophe Bisciglia, Cloudera founder and former manager of Google’s Hadoop cluster. "Cloudera is advancing Hadoop technology to make it easier for everyone to store and process the same types of big data that large Web companies are successfully using in their businesses."

The Cloudera Distribution for Hadoop is freely available for download and immediate use. The product is distributed as a pre-packaged RPM bundle for Red Hat Linux systems or an Amazon EC2 image. To make Hadoop easy to install and use, Cloudera is launching a new portal called where people can use a Web-based configuration tool to create custom packages that are optimized to their specific needs. Settings for the cluster can also be saved on the portal to enable automatic updates. There is no charge to use it. The RPM packages and EC2 images are freely distributed under the Apache 2 software license.

"Since we use Hadoop to help run our business, we are excited that Cloudera is offering commercial support for Hadoop and is making the technology more accessible to businesses," said David C Peterson, SVP Technology at ContextWeb, Inc., a leading contextual advertising company and operator of the ADSDAQ Exchange. "Businesses need to feel confident that there is a company like Cloudera to stand behind Hadoop in order for this great open source technology to become widely used by companies."

Cloudera is also making a pre-configured VMware image freely available for evaluation and use with their free online training. People that want to test the Cloudera Distribution for Hadoop or learn more about Hadoop and Cloudera’s online training can download the image and run it on their Linux, Mac or Windows desktop. The image ships with example code and all the components needed to use the Cloudera Distribution for Hadoop, including a master server and single node.

The Cloudera Distribution for Hadoop
is a complete system to handle
the processing and storage of big data.
Major components include:

HDFS – Hadoop Distributed File System, a distributed and fault- tolerant file system designed to run on commodity hardware. HDFS assumes that hardware failure is normal and provides quick detection and automatic recovery. HDFS can support tens of millions of files in a single instance;
MapReduce implementation to divide applications into many small blocks of work for automatic parallelization and execution on large clusters. Cloudera’s implementation of MapReduce takes care of partitioning of input data, scheduling program execution across distributed machines, and the handling of machine failure;
Hive – a data warehousing infrastructure built on top of Hadoop that provides tools for easy data summary generation, ad hoc querying, and analysis. Hive comes with Hive QL, a simple query language based on SQL.
Pig – a platform for analyzing large data sets in Hadoop using a high-level language for expressing data analysis programs, PigLatin.

Comments

Founded in late 2008 by experts from Facebook, Google, Oracle and Yahoo!, Cloudera has closed recently a $5 million round of Series A financing led by Accel Partners. Among angel investors there are Diane Greene, former CEO of VMware, Marten Mickos, previously CEO of MySQL and Gideon Yu, CFO of Facebook. Advisors include the founders of the Hadoop project, Doug Cutting and Mike Cafarella.

Articles_bottom
ExaGrid
AIC
ATTOtarget="_blank"
OPEN-E