Lakehouse: Difference between revisions

From NovaOrdis Knowledge Base
=External=
* https://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics by Michael Armbrust, Ali Ghodsi, Reynold Xin, Matei Zaharia
=Internal=
* [[Data Lake]]


=Overview=

Revision as of 20:53, 1 May 2023

Overview

A lakehouse is a data management architecture that is built on open, direct-access data formats (such as Apache Parquet and ORC), has first-class support for machine learning and data science workloads, and offers state-of-the-art performance. It builds on the concept of the Data Lake.

Related Concepts

* Data warehouse
* Schema on write
* Business Intelligence (BI)
* Unstructured data
* Data Lake