Table of Contents (120 sections)
- Cover
- Copyright Information
- Credits
- About the Author
- Acknowledgements
- About the Reviewer
- www.PacktPub.com
- Preface
- Chapter 1. Getting Started with Hadoop 2.X
- Introduction
- Installing a single-node Hadoop Cluster
- Installing a multi-node Hadoop cluster
- Adding new nodes to existing Hadoop clusters
- Executing the balancer command for uniform data distribution
- Entering and exiting from the safe mode in a Hadoop cluster
- Decommissioning DataNodes
- Performing benchmarking on a Hadoop cluster
- Chapter 2. Exploring HDFS
- Introduction
- Loading data from a local machine to HDFS
- Exporting HDFS data to a local machine
- Changing the replication factor of an existing file in HDFS
- Setting the HDFS block size for all the files in a cluster
- Setting the HDFS block size for a specific file in a cluster
- Enabling transparent encryption for HDFS
- Importing data from another Hadoop cluster
- Recycling deleted data from trash to HDFS
- Saving compressed data in HDFS
- Chapter 3. Mastering Map Reduce Programs
- Introduction
- Writing the Map Reduce program in Java to analyze web log data
- Executing the Map Reduce program in a Hadoop cluster
- Adding support for a new writable data type in Hadoop
- Implementing a user-defined counter in a Map Reduce program
- Map Reduce program to find the top X
- Map Reduce program to find distinct values
- Map Reduce program to partition data using a custom partitioner
- Writing Map Reduce results to multiple output files
- Performing Reduce side Joins using Map Reduce
- Unit testing the Map Reduce code using MRUnit
- Chapter 4. Data Analysis Using Hive, Pig, and Hbase
- Introduction
- Storing and processing Hive data in a sequential file format
- Storing and processing Hive data in the RC file format
- Storing and processing Hive data in the ORC file format
- Storing and processing Hive data in the Parquet file format
- Performing FILTER By queries in Pig
- Performing Group By queries in Pig
- Performing Order By queries in Pig
- Performing JOINS in Pig
- Writing a user-defined function in Pig
- Analyzing web log data using Pig
- Performing the Hbase operation in CLI
- Performing Hbase operations in Java
- Executing a MapReduce program with an Hbase table
- Chapter 5. Advanced Data Analysis Using Hive
- Introduction
- Processing JSON data in Hive using JSON SerDe
- Processing XML data in Hive using XML SerDe
- Processing Hive data in the Avro format
- Writing a user-defined function in Hive
- Performing table joins in Hive
- Executing map side joins in Hive
- Performing context Ngram in Hive
- Call Data Record Analytics using Hive
- Twitter sentiment analysis using Hive
- Implementing Change Data Capture using Hive
- Multiple table inserting using Hive
- Chapter 6. Data Import/Export Using Sqoop and Flume
- Introduction
- Importing data from RDBMS to HDFS using Sqoop
- Exporting data from HDFS to RDBMS
- Using query operator in Sqoop import
- Importing data using Sqoop in compressed format
- Performing Atomic export using Sqoop
- Importing data into Hive tables using Sqoop
- Importing data into HDFS from Mainframes
- Incremental import using Sqoop
- Creating and executing Sqoop job
- Importing data from RDBMS to Hbase using Sqoop
- Importing Twitter data into HDFS using Flume
- Importing data from Kafka into HDFS using Flume
- Importing web logs data into HDFS using Flume
- Chapter 7. Automation of Hadoop Tasks Using Oozie
- Introduction
- Implementing a Sqoop action job using Oozie
- Implementing a Map Reduce action job using Oozie
- Implementing a Java action job using Oozie
- Implementing a Hive action job using Oozie
- Implementing a Pig action job using Oozie
- Implementing an e-mail action job using Oozie
- Executing parallel jobs using Oozie (fork)
- Scheduling a job in Oozie
- Chapter 8. Machine Learning and Predictive Analytics Using Mahout and R
- Introduction
- Setting up the Mahout development environment
- Creating an item-based recommendation engine using Mahout
- Creating a user-based recommendation engine using Mahout
- Predictive analytics on Bank Data using Mahout
- Text data clustering using K-Means using Mahout
- Population Data Analytics using R
- Twitter Sentiment Analytics using R
- Performing Predictive Analytics using R
- Chapter 9. Integration with Apache Spark
- Introduction
- Running Spark standalone
- Running Spark on YARN
- Performing Olympics Athletes analytics using the Spark Shell
- Creating Twitter trending topics using Spark Streaming
- Twitter trending topics using Spark streaming
- Analyzing Parquet files using Spark
- Analyzing JSON data using Spark
- Processing graphs using Graph X
- Conducting predictive analytics using Spark MLib
- Chapter 10. Hadoop Use Cases
- Introduction
- Call Data Record analytics
- Web log analytics
- Sensitive data masking and encryption using Hadoop
- Index