Availability of Ascend Structured Data Lake for Unified and Optimized Access to Entire Data Lifecycle

Storage layer accessible to external processing engines and notebooks to extend usability across the ecosystem

Ascend announced the availability of its Structured Data Lake, allowing users to directly connect their existing data processing engines, notebooks, and BI tools to the company’s optimized data management system.

Ascend Structured Data Lake Screen

For the first time, data scientists, architects, and engineers can build on top of a common data lake that automatically ensures data integrity, tracks data lineage, and optimizes performance. With the company’s Dataflow Control Plane managing storage across the data lifecycle, the Structured Data Lake delivers faster time-to-value for users across the enterprise and decreases overall data management costs.

Ascend Dataflow

“We’re excited to extend the value of Ascend to an even larger user base,” said Sean Knapp, founder and CEO, Ascend. “The release of the Structured Data Lake is a huge step forward for accelerating the data development lifecycle. Teams can not only access more data than ever before, but can do so with confidence and security at any stage of development. And, with easy integration into the broader ecosystem, we are eliminating siloed access based on preferred tools or skills.”

Backing the firm’s Autonomous Dataflow Service is an optimized data store that is managed by the company’s Dataflow Control Plane, resulting in the first data lake that understands and reacts to the pipelines running against it.

This data management is what unlocked Queryable Dataflows, which bring the interactivity of data warehouses to the scale of pipelines. It now also extends traditional data lake architectures with structured, secure, and optimized access to data flowing across the enterprise.

With Structured Data Lake, all managed data is unified and dynamically synchronized with the pipelines that operate on it, making even mid-pipeline data sets available to existing processing engines such as external Apache Spark, Presto, or Apache Hadoop, as well as to familiar tools such as Jupyter and Zeppelin notebooks, all with no additional code or management complexity.

Automated and intelligent management of the Structured Data Lake introduces a number of capabilities from large-scale data management architectures, including:

  • Trusted data integrity: The Structured Data Lake manages all data and updates as they happen with guaranteed data accuracy, automatic lineage tracking, and dependency management. It also supports atomic updates at scale and intelligent partitioning for safe and optimized access.

  • Deduplication of redundant storage and operations: As a unified storage layer, the Structured Data Lake has full visibility into every operation and data set being developed against it. From this, it ensures that no duplicate operations or data sets occur, and has the intelligence to materialize correct data sets as needed. This results in decreased storage costs as well as improved performance for repeat queries and operations.

  • Automated storage maintenance: With management of all data down to fine-grained partitions, the Structured Data Lake can automate some of the more tedious aspects of storage such as garbage collection based on active development, and intelligent backfilling with minimal reprocessing.
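The deduplication idea in the list above can be illustrated with a minimal sketch. This is not Ascend's implementation, which the announcement does not describe. It assumes a simple scheme in which each operation is fingerprinted from its name, its parameters, and the fingerprints of its inputs, so that an identical request reuses an already materialized result. The class `MaterializationCache`, its methods, and the example path are all hypothetical names chosen for illustration.

```python
import hashlib
import json


class MaterializationCache:
    """Hypothetical store that reuses results for identical operations."""

    def __init__(self):
        self._results = {}     # fingerprint -> materialized result
        self.computations = 0  # number of times an operation actually ran

    @staticmethod
    def fingerprint(operation, params, input_fingerprints):
        # Serialize deterministically so equal requests hash to the same key.
        # Input order is preserved because it can matter (for example, in joins).
        payload = json.dumps(
            {"op": operation, "params": params, "inputs": list(input_fingerprints)},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def materialize(self, operation, params, input_fingerprints, compute):
        key = self.fingerprint(operation, params, input_fingerprints)
        if key not in self._results:
            # Only run the operation when no identical result exists yet.
            self._results[key] = compute()
            self.computations += 1
        return key, self._results[key]


cache = MaterializationCache()
# Hypothetical source data set, identified by an illustrative path.
raw = MaterializationCache.fingerprint("read", {"path": "events/2020-01-01"}, [])

# Two users request the same filter over the same input.
key_a, rows_a = cache.materialize("filter", {"min_value": 2}, [raw], lambda: [2, 3])
key_b, rows_b = cache.materialize("filter", {"min_value": 2}, [raw], lambda: [2, 3])
```

In this sketch the second request returns the stored result without running the operation again, which is the source of the storage and repeat-query savings the release describes. Changing any parameter or any upstream input produces a different fingerprint and therefore a fresh materialization.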

“Ascend’s vision for data pipelines and orchestration is impressive, and its technology roadmap really speaks to the needs of many data and infrastructure pros,” said Eric Kavanagh, CEO, The Bloor Group, an independent research firm. “The Structured Data Lake release is interesting when you consider the extensibility it provides to existing data services. Watch for Ascend to play an important role in the rapid maturation of DataOps.”

Structured Data Lake is available as part of the Ascend Autonomous Dataflow Service.

About Ascend
Ascend provides an Autonomous Dataflow Service, enabling data engineers to build, scale, and operate continuously optimized, Apache Spark-based pipelines with 85% less code. Running natively in Microsoft Azure, Amazon Web Services, and Google Cloud Platform, it combines declarative configurations and automation to manage the underlying cloud infrastructure, optimize pipelines, and eliminate maintenance across the entire data lifecycle. The company is backed by venture capital firms including Accel, Sequoia Capital, Lightspeed Venture Partners, and 8VC, and is supported by notable advisors such as Kevin Scott, CTO of Microsoft; Maynard Webb, board member at Salesforce and Visa; Scott McNealy, former Sun Microsystems CEO; Luanne Dauber, former CMO at Confluent and VP marketing at Pure Storage; and Deep Nishar, senior managing partner of Softbank Vision Fund.