What is the difference between Azure Data Lake and Azure Data Warehouse?

2 Answers


A Data Warehouse is a traditional way of storing data that is still widely used. A Data Lake is complementary to a Data Warehouse, i.e. data held in a data lake can also be stored in the data warehouse, as long as certain rules are followed (for example, the data must fit the warehouse's schema).

DATA LAKE | DATA WAREHOUSE
Complementary to the data warehouse | May be sourced from the data lake
Data is detailed or raw and can be in any form; you simply take the data and dump it into the data lake | Data is filtered, summarised, and refined
Schema on read (data is not structured up front; you can define the schema in any number of ways) | Schema on write (data is written in structured form, in a particular schema)
One language to process data of any format (U-SQL) | Uses SQL
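
To make the schema-on-read versus schema-on-write row concrete, here is a minimal Python sketch. It does not use any Azure API; the sample events, the `WAREHOUSE_SCHEMA` mapping, and both function names are hypothetical and exist only for illustration.

```python
import json

# Hypothetical raw events, as they might land in a data lake: no schema enforced.
RAW_EVENTS = [
    '{"user": "ana", "amount": "12.50", "note": "first order"}',
    '{"user": "bo", "amount": 7}',
    '{"user": "cy"}',
]

# Assumed warehouse schema: column name -> type every stored row must satisfy.
WAREHOUSE_SCHEMA = {"user": str, "amount": float}


def write_to_warehouse(table, raw_line):
    """Schema on write: validate and coerce the record before storing it."""
    record = json.loads(raw_line)
    row = {}
    for column, column_type in WAREHOUSE_SCHEMA.items():
        if column not in record:
            raise ValueError(f"missing column: {column}")
        row[column] = column_type(record[column])
    table.append(row)  # extra fields such as "note" are dropped


def read_from_lake(raw_lines, schema):
    """Schema on read: the reader picks a schema when querying the raw data."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            column: (column_type(record[column]) if column in record else None)
            for column, column_type in schema.items()
        }
```

With this sketch, the third event is rejected by `write_to_warehouse` because it has no `amount`, while `read_from_lake(RAW_EVENTS, {"user": str})` still returns all three users from the same raw lines.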

Azure Data Lake and Azure Data Warehouse are both widely used to store big data, but they are not synonymous and cannot be used interchangeably. Azure Data Lake is a huge pool of raw data. Azure Data Warehouse, on the other hand, is a repository for structured, filtered data that has already been processed for a specific purpose.

Following is a list of key differences between Azure Data Lake and Azure Data Warehouse:

Azure Data Lake | Azure Data Warehouse
Holds raw data, meaning data that has not yet been processed for a specific purpose. | Holds processed data, meaning data that a larger audience can easily understand.
Primarily stores raw, unprocessed data. | Primarily stores only processed data, saving storage space by not keeping data that may never be used.
Complementary to Azure Data Warehouse: data held in a data lake can also be stored in the data warehouse, but certain rules must be followed. | A traditional way of storing data, and one of the most widely used stores for big data.
The purpose of the stored data is not yet determined. | The purpose of the stored data is already determined, because the data is currently in use.
Mainly used by data scientists, because the data is huge and unprocessed. | Mainly used by business professionals, because the data is processed and easy for a larger audience to understand.
Data is highly accessible and quick to update. | Data is more complicated and costly to change.
Uses one language to process data of any format. | Uses SQL, because the data is already processed.
Requires much larger storage capacity than a data warehouse. | Usually requires smaller storage capacity.
Ideal for machine learning. | Ideal for a specific purpose within the organization.
The schema is defined after the data is stored, when the data is read (schema on read). | The schema is defined before the data is stored (schema on write).
Follows the ELT (Extract, Load, Transform) process. | Follows the ETL (Extract, Transform, Load) process.
Stores unprocessed data, so without appropriate data quality controls it can turn into a data swamp. | Does not store garbage data, so storage space is not wasted on data that may never be used.
The best platform for in-depth analysis. | The best platform for operational users.
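
The ELT versus ETL row can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the source rows, the per-region summary, and the function names `extract`, `transform`, `etl`, `elt`, and `query_lake` are all hypothetical, and plain Python containers stand in for the lake and the warehouse.

```python
def extract():
    """Hypothetical source rows; real pipelines would read files or APIs."""
    return [
        {"region": "east", "sales": "100"},
        {"region": "west", "sales": "250"},
        {"region": "east", "sales": "50"},
    ]


def transform(rows):
    """Summarise sales per region, converting string amounts to integers."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + int(row["sales"])
    return totals


def etl(warehouse):
    """ETL: transform first, then load only the refined result."""
    warehouse.update(transform(extract()))


def elt(lake):
    """ELT: load the raw rows unchanged; transformation happens later."""
    lake.extend(extract())


def query_lake(lake):
    """Transform at query time, over whatever raw rows the lake holds."""
    return transform(lake)
```

In this sketch the warehouse ends up holding only the per-region totals, while the lake keeps every raw row and produces the same totals only when `query_lake` is called, which is why the lake needs more storage but leaves other analyses possible later.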
