Enhanced Lakehouse Federation Streamlines Data Governance

September 13, 2023

New Unity Catalog functionality enables organizations to centrally discover, query, and manage all of their data, irrespective of its location.

 

Databricks has announced new Lakehouse Federation capabilities that let organizations build a scalable data mesh architecture with unified governance. By bringing previously separate data systems under the Databricks Lakehouse Platform, companies can discover, query, and govern data across multiple platforms from within Databricks, without first moving or copying it.

For many businesses, data is scattered across diverse systems, which complicates discovery and governance. Consolidating it requires complex, time-consuming engineering work. The new Unity Catalog enhancements address these challenges directly, making it easier to expose and govern isolated data systems as extensions of the lakehouse.

The key features of these enhancements include:

  • Query federation: Customers can map and consolidate data assets from platforms outside Databricks, such as PostgreSQL, MySQL, and Amazon Redshift. Users can then securely access, analyze, and audit that data through a single interface, even when it spans multiple platforms.
  • Governance beyond Databricks: Unity Catalog provides customers with consistent access policies for rows, tables, columns, and tags on any data asset registered within it. Future capabilities will enable customers to define data access policies in Unity Catalog and apply them across other data warehouses for consistent enforcement, eliminating the need for redundant policy definitions.
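
The two features above can be pictured with a small, purely illustrative Python sketch. It is not the Unity Catalog API: the `FederatedCatalog` class, its method names, and the catalog and table names are all hypothetical, and two in-memory SQLite databases stand in for external systems such as PostgreSQL and MySQL. The sketch shows only the general idea of registering external sources in place, applying one permission model across them, and recording every access attempt.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class FederatedCatalog:
    """Toy governance layer (hypothetical; not the Unity Catalog API)."""
    sources: dict = field(default_factory=dict)    # catalog name -> live connection
    grants: set = field(default_factory=set)       # (user, catalog, table) triples
    audit_log: list = field(default_factory=list)  # (user, catalog, table, allowed)

    def register(self, catalog, connection):
        # Register the external system itself; no rows are copied.
        self.sources[catalog] = connection

    def grant_select(self, user, catalog, table):
        # One permission model shared by every registered source.
        self.grants.add((user, catalog, table))

    def select_all(self, user, catalog, table):
        allowed = (user, catalog, table) in self.grants
        # Every attempt is recorded, whether or not it is allowed.
        self.audit_log.append((user, catalog, table, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read {catalog}.{table}")
        # The query runs against the source where the data already lives.
        return self.sources[catalog].execute(f'SELECT * FROM "{table}"').fetchall()


def build_demo():
    # Two in-memory SQLite databases stand in for external platforms.
    ops = sqlite3.connect(":memory:")
    ops.execute("CREATE TABLE meters (id INTEGER, region TEXT)")
    ops.executemany("INSERT INTO meters VALUES (?, ?)", [(1, "north"), (2, "south")])

    billing = sqlite3.connect(":memory:")
    billing.execute("CREATE TABLE outages (meter_id INTEGER, minutes INTEGER)")
    billing.execute("INSERT INTO outages VALUES (1, 42)")

    catalog = FederatedCatalog()
    catalog.register("postgres_ops", ops)
    catalog.register("mysql_billing", billing)
    catalog.grant_select("analyst", "postgres_ops", "meters")
    return catalog
```

In this sketch, `build_demo().select_all("analyst", "postgres_ops", "meters")` reads the rows directly from the registered source, while the same user asking for `mysql_billing.outages` raises `PermissionError`; both attempts end up in `audit_log`.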

Databricks has also recently introduced a Hive Metastore (HMS) interface for Unity Catalog. This allows software compatible with Apache Hive to connect with Unity Catalog, streamlining data management, discovery, and governance. Organizations can centralize these aspects in Unity Catalog and connect to it from various computing platforms, including Apache Spark, Amazon EMR, Amazon Athena, and more.

These Lakehouse Federation enhancements collectively offer a uniform data serving and governance layer for data mesh architectures, facilitating distributed domain ownership, reducing complex data integration tasks, minimizing storage costs from data duplication, and enhancing overall data security and governance.

“We’re cementing Databricks as the most open and flexible lakehouse platform for data, analytics and AI. We’re making it clear that we want to help you unify all your data, no matter where it lives, no matter the format,” said Matei Zaharia, Co-Founder and Chief Technologist at Databricks. “We’re excited to see what customers do with this new functionality. We’re giving organizations access to all of the data they need through one system, which will lead to more innovation — and the best part about that innovation is that it doesn’t sacrifice security. By enabling customers to easily apply the rules consistently across platforms and track data usage, we’ll help them meet compliance requirements while pushing their businesses forward.”

The Lakehouse Federation capabilities offer several benefits to Databricks customers, including:

  • Data democratization and discoverability: Users gain a secure and unified approach to exploring structured and unstructured data, irrespective of its location.
  • Faster access to data: Firms can swiftly expose domain-specific data sources for various use cases without the need for data ingestion. Databricks Lakehouse caching and optimization further accelerate interactive queries.
  • Unified governance across data sources: Users can establish a single permission model for their entire data landscape, ensuring consistent data governance with built-in data lineage and auditability.

Lakehouse Federation and the Hive Metastore interface will be available for public preview soon.

 

About Utility Analytics Institute (UAI)

UAI Enables Utility Transformation Through Analytics

UAI is a utility-led membership organization that provides support to the industry to advance the analytics profession and utility organizations of all types, sizes, and analytics maturity levels, as well as analytics professionals throughout every phase of their career.
