- Learning Hunk
- Dmitry Anoshin, Sergey Sheypak
The big problem
Hadoop is a distributed file system and a distributed framework designed to compute over large chunks of data. It is relatively easy to get data into Hadoop: there are plenty of tools for ingesting data in different formats, such as Apache Phoenix. However, it is actually extremely difficult to get value out of the data you put into Hadoop.
Let's look at the path from data to value. First, we have to start by collecting data. Then we spend a lot of time preparing it, making sure it is available for analysis, and that we are able to query it. This process is as follows:

Unfortunately, you may not have asked the right questions, or the answers may not be clear, so you have to repeat this cycle, perhaps transforming and reformatting your data again. In other words, it is a long and challenging process.
What you actually want is to collect the data and spend some time preparing it once; then you can ask questions and get answers repeatedly. Now you can spend most of your time asking multiple questions, and you can iterate with the data on those questions to refine the answers you are looking for. Let's look at the following diagram in order to find a new approach:

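The contrast between the two workflows above can be sketched in a few lines of Python. This is a hypothetical toy (the function names and sample records are illustrative, not part of Hunk or Hadoop): the point is that collection and preparation are paid once, after which asking a new question is cheap to repeat and refine.

```python
# Hypothetical sketch: pay the collect/prepare cost once, then iterate on questions.

def collect():
    # Stand-in for ingesting raw event data into Hadoop.
    return [{"user": "a", "clicks": 3}, {"user": "b", "clicks": 7}]

def prepare(raw):
    # Stand-in for cleaning and formatting the data so it is queryable.
    return {row["user"]: row["clicks"] for row in raw}

def ask(prepared, user):
    # One question against the prepared data; cheap to repeat and refine.
    return prepared.get(user, 0)

data = prepare(collect())                            # collect and prepare once...
answers = [ask(data, u) for u in ("a", "b", "c")]    # ...then ask many questions
```

In the slow cycle the book describes, `collect` and `prepare` would sit inside the question loop; moving them out is exactly the shift from the first diagram to the second.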