- Data Lake Development with Big Data
- Pradeep Pasupuleti, Beulah Salome Purra
- 2021-07-30 10:24:28
What this book covers
Chapter 1, The Need for Data Lake, helps you understand what a Data Lake is, its architecture and key components, and the business contexts in which a Data Lake can be successfully deployed. You will also learn the limitations of traditional data architectures and how a Data Lake addresses some of these inadequacies and provides significant benefits.
Chapter 2, Data Intake, helps you understand the Intake Tier in detail by exploring the process of bringing huge volumes of data into the Data Lake. You will learn the technology perspective of the various External Data Sources and the Hadoop-based data transfer mechanisms used to pull or push data into the Data Lake.
Chapter 3, Data Integration, Quality, and Enrichment, explores the processes that are performed on vast quantities of data in the Management Tier. You will get a deeper understanding of the key technology aspects and of processes such as profiling, validation, integration, cleansing, standardization, and enrichment, carried out using Hadoop ecosystem components.
Chapter 4, Data Discovery and Consumption, helps you understand how data can be discovered, packaged, and provisioned so that downstream systems can consume it. You will learn the key technology aspects, architectural guidance, and tools for the data discovery and data provisioning functionalities.
Chapter 5, Data Governance, explores the details, need, and utility of data governance in a Data Lake environment. You will learn how to deal with metadata management, lineage tracking, and data lifecycle management to govern the usability, security, integrity, and availability of data through the data governance processes applied to the data in the Data Lake. This chapter also explores how the current Data Lake can evolve in the future.