Welcome to the first product update of 2023! Our engineering team has been busy with customer-driven requirements you can read about in our release notes, but this week let’s highlight two key new capabilities that enhance your intelligent data pipelines.
New Data Plane Usage Report
As your data workloads grow, it becomes increasingly important to balance the costs of data pipelines against the new business value being created. Because Ascend runs workloads on data cloud providers such as Snowflake, Databricks, and BigQuery, the biggest costs are incurred in your accounts on those platforms.
To help you manage them, we are rolling out new features that provide the transparency needed to manage costs, guide new investments, and focus tuning efforts. The new usage screens in this week’s release consolidate cost data across your data cloud providers and give you instant access to visualizations of your workloads over time.
With just a few clicks, engineers using Ascend can:
- See how their data pipelines utilize their data cloud provider accounts
- See how workloads vary over time due to the order of operations in the pipelines
- Pinpoint the most costly and most frequent operations in their pipelines
- Validate their data cloud providers’ invoices
- Validate their Ascend invoices
This transparency helps engineers manage costs more directly, augmenting Ascend automation that already helps reduce overall running costs by:
- Eliminating all development of orchestration code
- Only processing data pipelines incrementally
- Never re-running datasets unnecessarily
- Sequencing operations efficiently
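To make the idea of pinpointing the most costly and most frequent operations concrete, here is a minimal sketch of the kind of aggregation such a report implies. It is not the Ascend implementation or API: the record layout, pipeline names, operation names, and dollar amounts are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical usage records: (pipeline, operation, cost in USD).
# Names and values are invented for illustration; they do not come from Ascend.
records = [
    ("orders", "transform_join", 4.0),
    ("orders", "transform_join", 6.0),
    ("orders", "ingest_raw", 1.5),
    ("billing", "aggregate_daily", 3.0),
    ("billing", "ingest_raw", 0.5),
    ("billing", "ingest_raw", 0.5),
    ("billing", "ingest_raw", 0.5),
]


def summarize(records):
    """Return total cost and run count for each (pipeline, operation) pair."""
    totals = defaultdict(lambda: {"cost": 0.0, "runs": 0})
    for pipeline, operation, cost in records:
        entry = totals[(pipeline, operation)]
        entry["cost"] += cost
        entry["runs"] += 1
    return dict(totals)


summary = summarize(records)

# The operation with the highest total cost, and the one that ran most often.
most_costly = max(summary, key=lambda k: summary[k]["cost"])
most_frequent = max(summary, key=lambda k: summary[k]["runs"])

print("Most costly:", most_costly, summary[most_costly])
print("Most frequent:", most_frequent, summary[most_frequent])
```

In this made-up data set, the most expensive operation and the most frequent one are different, which is the kind of distinction that helps decide where tuning effort pays off.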
Databricks Merge
We continue to release features in support of Databricks. Starting this week, engineers can use the updated native MERGE operation to combine data in Delta tables as a regular instruction in an Ascend Transform component.
Like any other Transform, the operation is automatically orchestrated by the Ascend DataAware™ control plane with zero additional code. The lineage of the datasets upstream and downstream of the Transform is guaranteed, and the execution is monitored by the Ascend platform for completion status and costs.
Databricks MERGE performs simultaneous updates, insertions, and deletions on a Delta Lake table. The newly optimized implementation of MERGE substantially improves performance for common workloads by reducing the number of shuffle operations and preserving the data layout of existing data that is not modified. Check out the Databricks documentation on this cool capability!
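For readers new to MERGE, the following sketch illustrates the update, insert, and delete semantics described above using plain Python dictionaries. It is a simulation under stated assumptions, not Databricks or Ascend code: the table names, the `id` key, and the `deleted` flag column are hypothetical, and the SQL string is included only to show what a comparable MERGE statement might look like. How such a statement is entered in an Ascend Transform is not covered here.

```python
# A comparable MERGE statement in Databricks SQL, shown for reference only.
# Table and column names (customers, updates, id, name, deleted) are hypothetical.
MERGE_SQL = """
MERGE INTO customers AS t
USING updates AS s
ON t.id = s.id
WHEN MATCHED AND s.deleted = true THEN DELETE
WHEN MATCHED THEN UPDATE SET t.name = s.name
WHEN NOT MATCHED AND s.deleted = false THEN INSERT (id, name) VALUES (s.id, s.name)
"""


def merge(target, source, key="id"):
    """Apply source rows to target rows, mimicking MERGE semantics.

    - matched and flagged deleted -> row is removed from the target
    - matched otherwise           -> target row is updated
    - not matched, not deleted    -> source row is inserted
    """
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        k = row[key]
        is_delete = row.get("deleted", False)
        # Drop the helper flag so it never lands in the target rows.
        values = {c: v for c, v in row.items() if c != "deleted"}
        if k in merged:
            if is_delete:
                del merged[k]
            else:
                merged[k].update(values)
        elif not is_delete:
            merged[k] = values
    return sorted(merged.values(), key=lambda r: r[key])


target = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
source = [{"id": 2, "name": "B"}, {"id": 3, "deleted": True}, {"id": 4, "name": "d"}]

result = merge(target, source)
print(result)
```

Row 1 is untouched, row 2 is updated, row 3 is deleted, and row 4 is inserted, all in a single pass over the source rows.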
Additional Reading and Resources
- Ascend for Databricks Data Pipelines
- Introducing Data Products to Deliver Better Value from Data
- Why Free Data Ingestion Accelerates Business Value