MarkLogic Updates Hadoop Connector for Enterprise NoSQL Database

MarkLogic Corporation announced an update to its connector for Hadoop that gives Hadoop applications access to data indexed and managed by its Enterprise NoSQL database platform.

The updated connector helps enterprises realize the value of Hadoop by simplifying data management, reducing infrastructure costs, and increasing development agility. Using it, a Hadoop application can read all of the data from MarkLogic’s compressed data files stored in HDFS, without going through a MarkLogic database or exporting the data.

“There is no doubt that enterprise adoption of Hadoop is increasing and MarkLogic 7 helps organizations capitalize on their investment. With the ability to easily run MarkLogic directly on top of an existing Hadoop installation, enterprises can build a new class of real-time operational applications,” said Joe Pasqua, SVP, product strategy, MarkLogic. “Sharing this same data as it ages from operational, to historical, to archival with the analytics in MapReduce jobs, companies can dramatically simplify data management, reduce costs and gain better insight from their data.”

With MarkLogic running on Hadoop, indexes are created once and can be used over the life of the data for real-time, transactional queries and updates as well as large-scale batch analysis using MapReduce. This helps to reduce storage and infrastructure costs normally associated with siloed data marts and special-purpose analytic environments. Having fewer copies of the data also simplifies data governance, reducing risk. This is critical to organizations in highly regulated industries – such as financial services, healthcare, and the public sector – that are moving to Hadoop for data management infrastructure.

The connector update is a milestone in MarkLogic’s effort to bring more value to customers using Hadoop technology. Last year, the company unveiled its tiered storage strategy and announced tiered storage features in MarkLogic 7.

The tiered storage offering allows customers to deploy the database platform using a mix of locally attached SSD and spinning disk, SAN, NAS, S3, and HDFS storage within the same database. Administrators can move data between tiers with transactional guarantees and zero downtime. This approach can reduce storage costs while making it easy to incorporate Hadoop into an enterprise architecture. Combined with MarkLogic’s schema-agnostic data model, tiered storage gives administrators the flexibility to make smarter tradeoffs among cost, performance, and availability in live systems without having to change application code.