An information management system that includes a central repository for information and the methods for analyzing it.

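A central repository for storing structured and unstructured data. Data is stored as-is, without prior processing, and can then be used for everything from dashboards and visualizations to big data processing, real-time analytics, and machine learning.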

Data Warehouse vs Data Lake

Characteristics | Data Warehouse | Data Lake
Data | Relational from transactional systems, operational databases, and line of business applications | Non-relational and relational from IoT devices, web sites, mobile apps, social media, and corporate applications
Schema | Designed prior to the DW implementation (schema-on-write) | Written at the time of analysis (schema-on-read)
Price/Performance | Fastest query results using higher cost storage | Query results getting faster using low-cost storage
Data Quality | Highly curated data that serves as the central version of the truth | Any data that may or may not be curated (i.e. raw data)
Users | Business analysts | Data scientists, Data developers, and Business analysts (using curated data)
Analytics | Batch reporting, BI and visualizations | Machine Learning, Predictive analytics, data discovery and profiling
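
The schema-on-write and schema-on-read distinction in the comparison above can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular warehouse or lake product works: the in-memory lists standing in for storage, the `WAREHOUSE_SCHEMA` dictionary, and the function names `write_to_warehouse`, `write_to_lake`, and `read_from_lake` are all assumptions made up for this example.

```python
import json

# Hypothetical warehouse schema: field name -> required Python type.
WAREHOUSE_SCHEMA = {"device_id": str, "temperature": float}


def write_to_warehouse(table, record):
    """Schema-on-write: reject the record unless it matches the schema."""
    if set(record) != set(WAREHOUSE_SCHEMA):
        raise ValueError("record fields do not match the schema")
    for field, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    table.append(record)


def write_to_lake(lake, raw_line):
    """Data lake write: store the raw text as-is, with no validation."""
    lake.append(raw_line)


def read_from_lake(lake, schema):
    """Schema-on-read: parse and cast at analysis time, skipping bad rows.

    `schema` maps field name -> a callable used to cast the raw value.
    """
    rows = []
    for raw_line in lake:
        try:
            doc = json.loads(raw_line)
            rows.append({field: cast(doc[field]) for field, cast in schema.items()})
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            # Raw data may be malformed or incomplete; this reader ignores it.
            continue
    return rows


if __name__ == "__main__":
    lake = []
    write_to_lake(lake, '{"device_id": "a1", "temperature": "21.5", "extra": 1}')
    write_to_lake(lake, "not json at all")
    print(read_from_lake(lake, {"device_id": str, "temperature": float}))
```

The warehouse path pays the validation cost up front, so every stored row is already in the agreed shape. The lake path accepts anything and defers that cost to each reader, which is why different analyses can apply different schemas to the same raw data.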
