Data Warehouse Common Tech Stacks
Parameter | Data Lake | Data Warehouse |
---|---|---|
Data | Stores everything, whether or not it has an immediate use. | Stores only data relevant to business processes. |
Processing | Data is mostly raw and unprocessed. | Data is highly processed. |
Type of Data | Can be unstructured, semi-structured, or structured. | Mostly structured, in tabular form. |
Task | Shared data stewardship. | Optimized for data retrieval. |
Agility | Highly agile; can be configured and reconfigured as needed. | Less agile than a data lake; has a fixed configuration. |
Users | Mostly used by data scientists. | Widely used by business professionals. |
Storage | Designed for low-cost storage. | Uses expensive storage that gives fast response times. |
Security | Offers less control over the data. | Allows better control of the data. |
Replacement of EDW | Can be a source for the enterprise data warehouse (EDW). | Complementary to the EDW (not a replacement). |
Schema | Schema on read (no predefined schema). | Schema on write (predefined schema). |
Data Processing | Allows fast ingestion of new data. | Introducing new content is time-consuming. |
Data Granularity | Data kept at a low (fine-grained) level of detail. | Data kept at a summary or aggregated level of detail. |
Tools | Can use open-source tools such as Hadoop/MapReduce. | Mostly commercial tools. |
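
The Schema row is the easiest one to see in code. The following is a minimal sketch, not a reference implementation: it assumes a plain Python list as a stand-in for lake storage and an in-memory SQLite table as a stand-in for a warehouse. The sample records, the `sales` table, and all function names are hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Hypothetical raw events, one JSON document per line, as they might land in a lake.
RAW_EVENTS = [
    '{"customer": "ana", "amount": 12.5, "channel": "web"}',
    '{"customer": "ben", "amount": 7.0}',
    '{"customer": "cy", "note": "no amount recorded"}',
]


def lake_ingest(lines):
    """Schema on read: keep every record as-is, with no validation at write time."""
    return list(lines)


def lake_read_amounts(store):
    """Apply a schema only at read time: extract 'amount' where it exists."""
    amounts = []
    for line in store:
        record = json.loads(line)
        if "amount" in record:
            amounts.append(float(record["amount"]))
    return amounts


def warehouse_ingest(conn, lines):
    """Schema on write: the table is defined up front and bad rows are rejected."""
    conn.execute(
        "CREATE TABLE sales (customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    rejected = []
    for line in lines:
        record = json.loads(line)
        try:
            # Only the predefined columns are loaded; extra fields are dropped.
            conn.execute(
                "INSERT INTO sales (customer, amount) VALUES (?, ?)",
                (record.get("customer"), record.get("amount")),
            )
        except sqlite3.IntegrityError:
            # A missing amount violates NOT NULL, so the row never reaches the table.
            rejected.append(record)
    return rejected


if __name__ == "__main__":
    store = lake_ingest(RAW_EVENTS)
    print(len(store), lake_read_amounts(store))

    conn = sqlite3.connect(":memory:")
    rejected = warehouse_ingest(conn, RAW_EVENTS)
    print(conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0], rejected)
```

With this sample data the lake keeps all three records and only interprets them when queried, while the warehouse loads two rows and rejects the record that has no amount. This mirrors the trade-off in the table: fast ingestion on the lake side, and tighter control at higher ingestion cost on the warehouse side.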