Difference between Data Lakes and Data Warehouses

| Parameter | Data Lake | Data Warehouse |
| --- | --- | --- |
| Data | Stores everything. | Focuses only on business processes. |
| Processing | Data is mainly unprocessed. | Data is highly processed. |
| Type of Data | Can be unstructured, semi-structured, or structured. | Mostly tabular and structured. |
| Task | Shared data stewardship. | Optimized for data retrieval. |
| Agility | Highly agile; can be configured and reconfigured as needed. | Less agile than a data lake; has a fixed configuration. |
| Users | Mostly used by data scientists. | Widely used by business professionals. |
| Storage | Designed for low-cost storage. | Uses expensive storage that gives fast response times. |
| Security | Offers less control. | Allows better control of the data. |
| Replacement of EDW | Can be a source for the EDW. | Complementary to the EDW (not a replacement). |
| Schema | Schema on read (no predefined schemas). | Schema on write (predefined schemas). |
| Data Processing | Helps with fast ingestion of new data. | Time-consuming to introduce new content. |
| Data Granularity | Data at a low level of detail or granularity. | Data at the summary or aggregated level of detail. |
| Tools | Can use open-source tools such as Hadoop/MapReduce. | Mostly commercial tools. |
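
The Schema and Data Processing rows can be illustrated with a small sketch. It is only an illustration under stated assumptions: the sample events, the function names (`lake_ingest`, `lake_read_amounts`, `warehouse_ingest`) and the `orders` table are hypothetical, a Python list stands in for data lake storage, and an in-memory SQLite database stands in for a warehouse. The lake side accepts every record as-is and applies a schema only when reading; the warehouse side enforces a predefined schema at write time and rejects rows that do not fit.

```python
import json
import sqlite3

# Hypothetical raw events, as they might land in a data lake (one JSON document each).
RAW_EVENTS = [
    '{"order_id": 1, "amount": 19.99, "note": "gift"}',
    '{"order_id": 2, "amount": "n/a"}',
    '{"order_id": 3, "amount": 5.00, "coupon": {"code": "SPRING"}}',
]


def lake_ingest(lines):
    """Schema on read: store every record unchanged, with no validation."""
    return list(lines)


def lake_read_amounts(stored):
    """Apply a schema only when reading: keep records whose amount is numeric."""
    amounts = []
    for line in stored:
        record = json.loads(line)
        amount = record.get("amount")
        if isinstance(amount, (int, float)):
            amounts.append(float(amount))
    return amounts


def warehouse_ingest(lines):
    """Schema on write: a predefined table rejects rows that do not conform."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " amount REAL NOT NULL CHECK (typeof(amount) = 'real')"
        ")"
    )
    rejected = []
    for line in lines:
        record = json.loads(line)
        try:
            # Only the predefined columns are loaded; extra fields are dropped.
            conn.execute(
                "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                (record["order_id"], record["amount"]),
            )
        except sqlite3.IntegrityError:
            rejected.append(record["order_id"])
    return conn, rejected


if __name__ == "__main__":
    stored = lake_ingest(RAW_EVENTS)
    print(len(stored), "records kept in the lake")
    print("amounts readable from the lake:", lake_read_amounts(stored))

    conn, rejected = warehouse_ingest(RAW_EVENTS)
    loaded = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    print(loaded, "rows loaded into the warehouse; rejected order ids:", rejected)
```

In this sketch the lake keeps all three records, including the malformed one and the nested `coupon` field, which is why ingestion is fast but reading requires extra work. The warehouse loads only the rows that match the table definition, which costs effort up front but makes later retrieval straightforward.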