From Caringo, Direct Massively Parallel, Compliant Storage for Hadoop

Swarm HadoopFS

Caringo, Inc. announced Swarm HadoopFS, a native Hadoop 2+ connector for its Swarm object storage software that saves time and resources with efficient, direct parallel MapReduce processing, paired with compliance features such as WORM, Integrity Seals and Legal Hold.

Using Hadoop for data processing and analytics typically involves a time-consuming and resource-intensive bulk load of data from an archive or file server into the Hadoop Distributed File System (HDFS). With Caringo’s direct approach, HDFS can read data directly from Swarm and, because of Swarm’s massively parallel approach in which all nodes cooperate to perform all processes, each HDFS server can pull data in parallel. This eliminates the time-consuming extract-and-ingest step, shortening the time to the MapReduce stage while reducing reliance on expensive NAS or filer storage in a Hadoop environment.

Additionally, organizations can use the standard compliance and data protection features in Swarm to ensure their data is safe, accessible and hasn’t been tampered with. Swarm supports the ability to store data so that it can’t be deleted (WORM); the ability to prove in a court of law that content hasn’t been tampered with (Integrity Seals); and the ability to take a snapshot of data and store it immutably (Legal Hold).

These features, combined with Swarm’s ability to automatically manage the data lifecycle by moving data between erasure coding and replication, all on the same servers, make Swarm an option for organizations that want to leverage Hadoop but have stringent regulatory requirements.

“The ability to quickly analyze and act upon data is a key competitive advantage,” said Mark Goros, CEO, Caringo. “Organizations of every size understand this and have been deploying Hadoop clusters in a fragmented nature, often relying on HDFS with JBOD for long-term storage which it wasn’t designed for. With SwarmFS we enable resilient, compliant and highly efficient long-term storage for all unstructured data in a highly automated fashion. This includes data that you may not even know you want to analyze yet, all instantly accessible by Hadoop in a direct, massively parallel fashion.”

Swarm HadoopFS is available.
