At the 2022 Data Cloud Summit, Google announced BigLake, a new cross-platform storage engine that went into preview on April 13 and is designed to make it easier to analyze data, regardless of where or how it’s stored. It’s big news for developers looking for a more uniform storage engine to streamline their business data.
Because BigLake builds on BigQuery, Google’s popular cloud data warehouse, data in separate lakes and warehouses can now be brought together in a single, unified data lake for a better analytics experience.
What Is Google BigLake?
Google BigLake is a storage engine built on top of BigQuery, Google’s multi-cloud data warehouse that allows users to query vast amounts of data at lightning speed.
Data lakes store vast amounts of raw data in its native format until it’s needed for analytics applications, and they’re typically siloed from other lakes. BigLake is different in that it extends BigQuery’s capabilities to data lakes on Google Cloud Storage, creating a “bottomless” data lake where developers can pull in data from various sources and house it all in one place.
Google says BigLake will be at the center of Google Cloud’s data platform strategy and that it will ensure its tools and capabilities integrate with it.
Key features include:
- A flexible and cost-effective open data lakehouse architecture
- Fine-grained security controls that let users apply table-, row-, and column-level security policies to object store tables without first needing to grant file-level access
- Central management for security policies and a single source of data across Google Cloud products and services as well as outside open-source engines using connectors
- Multi-cloud support, with the ability to expose data in Amazon S3 and Azure Data Lake Storage Gen2 as BigLake tables through the Data Catalog, allowing for a standard semantic model
- Access to the most popular open data formats, including Parquet, Avro, ORC, CSV, and JSON
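To make the features above more concrete, here is a minimal sketch in Python that only builds BigQuery DDL strings and never contacts Google Cloud. It assumes the `CREATE EXTERNAL TABLE ... WITH CONNECTION` and `CREATE ROW ACCESS POLICY` statement forms from BigQuery's SQL reference. The two helper functions are our own illustration, not part of any Google library, and every project, dataset, connection, bucket, and group name is a hypothetical placeholder.

```python
# Open formats named in the feature list above.
SUPPORTED_FORMATS = {"PARQUET", "AVRO", "ORC", "CSV", "JSON"}


def biglake_table_ddl(table_id, connection_id, file_format, uris):
    """Build DDL for a BigLake table over files in object storage.

    Illustrative helper (an assumption, not a Google API).
    """
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {file_format}")
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table_id}`\n"
        f"WITH CONNECTION `{connection_id}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}]);"
    )


def row_access_policy_ddl(policy_name, table_id, grantees, filter_expr):
    """Build DDL for a row-level policy on a table.

    Illustrative helper (an assumption, not a Google API).
    """
    grantee_list = ", ".join(f"'{g}'" for g in grantees)
    return (
        f"CREATE ROW ACCESS POLICY {policy_name}\n"
        f"ON `{table_id}`\n"
        f"GRANT TO ({grantee_list})\n"
        f"FILTER USING ({filter_expr});"
    )


if __name__ == "__main__":
    # Hypothetical names throughout.
    table = "my-project.sales.orders"
    print(biglake_table_ddl(
        table,
        "my-project.us.lake-connection",
        "parquet",
        ["gs://example-bucket/orders/*.parquet"],
    ))
    print(row_access_policy_ddl(
        "us_only",
        table,
        ["group:us-analysts@example.com"],
        "region = 'US'",
    ))
```

In this sketch, access is granted on the table and its rows rather than on the underlying files, which mirrors the security model described in the list. The generated statements would still need to be run in BigQuery against a real connection and bucket.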
Unifying Data Lakes and Warehouses Everywhere
These days, enterprises hold staggering and ever-growing amounts of data, typically stored across different platforms and environments. But spreading data across so many systems isn’t without costs or risks.
BigLake provides the ability to query the underlying data stores through a single system while eliminating the need to move or duplicate data. Whether it sits on AWS S3 or Azure Data Lake Storage Gen2 doesn’t matter, and that’s what makes BigLake a big deal.
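As a rough illustration of the "single system" idea, the sketch below runs the same query against several tables through one client. It assumes an object that behaves like `bigquery.Client` from the `google-cloud-bigquery` package, where `client.query(sql).result()` yields rows addressable by column name. The function name and table IDs are hypothetical, the tables are assumed to already exist as BigLake tables over different object stores, and location and credential handling are left out.

```python
def count_rows(client, table_ids):
    """Return {table_id: row_count}, using one query path for every table.

    `client` is assumed to expose query(sql).result() like bigquery.Client.
    """
    counts = {}
    for table_id in table_ids:
        sql = f"SELECT COUNT(*) AS n FROM `{table_id}`"
        rows = client.query(sql).result()
        # The query returns a single row with one column named "n".
        counts[table_id] = next(iter(rows))["n"]
    return counts


# Hypothetical tables whose files live in different clouds.
TABLES = [
    "my-project.gcs_lake.orders",
    "my-project.aws_lake.orders",
    "my-project.azure_lake.orders",
]

# Example wiring (needs google-cloud-bigquery and valid credentials):
#   from google.cloud import bigquery
#   print(count_rows(bigquery.Client(), TABLES))
```

The calling code never references a bucket or file path. It only names tables, which is the point the paragraph above makes about not having to move or duplicate data.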
“With BigLake, Google is redefining multi-cloud computing by breaking down the data silos within the Google Cloud ecosystem and between other cloud service providers like AWS and Azure,” said Jason Blythe, Director of Data Engineering and Analytics at 66degrees. “It’s what makes Google the true leader of multi-cloud solutions.”
Start Modernizing Your Data with Google Cloud
Google BigLake is currently available in preview. Learn more about BigLake and the array of Google Cloud data modernization services today by connecting with one of our cloud experts. As a Google Cloud Premier Partner, we can help you explore the power of BigLake and show what it could do for your cloud environment.