- MySQL 8 for Big Data
- Shabbir Challawala Jaydip Lakhatariya Chintan Mehta Kandarp Patel
- 2021-08-20 10:06:01
Store
In this section, we will discuss storing data that has been collected from various sources. Consider the example of crawling reviews of organizations for sentiment analysis, where each crawler gathers data from a different site and each site presents its data in its own format.
Traditionally, data was processed using the ETL (Extract, Transform, and Load) procedure, which gathers data from various sources, modifies it according to the requirements, and uploads it to the store for further processing or display. Tools commonly used for such scenarios were spreadsheets, relational databases, business intelligence tools, and so on; sometimes manual effort was also part of the process.
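The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sample sources (`SITE_A`, `SITE_B`), their field names, the `reviews` table, and the 0 to 5 rating scale are all hypothetical, and an in-memory SQLite database stands in for whatever store is actually used.

```python
import sqlite3

# Hypothetical raw reviews; each site exposes a different layout (assumption).
SITE_A = [{"org": "Acme", "stars": 4, "text": " Great service "}]
SITE_B = [{"company": "Acme", "rating_out_of_10": 6, "comment": "Slow delivery"}]


def extract():
    """Extract: yield (source, raw_record) pairs from every source."""
    for rec in SITE_A:
        yield "site_a", rec
    for rec in SITE_B:
        yield "site_b", rec


def transform(source, rec):
    """Transform: normalize to (organization, rating on a 0-5 scale, text)."""
    if source == "site_a":
        return rec["org"], float(rec["stars"]), rec["text"].strip()
    # site_b rates out of 10, so halve it to match the 0-5 scale.
    return rec["company"], rec["rating_out_of_10"] / 2, rec["comment"].strip()


def load(conn, rows):
    """Load: write the normalized rows into the store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reviews "
        "(organization TEXT, rating REAL, body TEXT)"
    )
    conn.executemany("INSERT INTO reviews VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run all three steps and return the number of rows loaded."""
    rows = [transform(source, rec) for source, rec in extract()]
    load(conn, rows)
    return len(rows)
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads both sample reviews into one table with a common shape, which is the point of the transform step: downstream analysis no longer needs to know how each site displayed its data.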
The most common storage used in Big Data platforms is HDFS. Apache Hive, which runs on top of HDFS, provides HQL (Hive Query Language), which helps us do many analytical tasks that are traditionally done in business intelligence tools. A few other options that can be considered for storing or processing data are Apache Spark, Redis, and MongoDB. Each option has its own way of working in the backend; however, many of them expose SQL or SQL-like APIs that can be used for further data analysis.
There might be cases where we need to gather data in real time and present it in real time. In such cases, the data does not need to be stored for future use; real-time analytics can run directly on the incoming data and produce results in response to requests.
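As a rough sketch of analytics that do not persist the raw data, the Python class below keeps only a count and a running mean per organization and discards each event after it is observed. The class name `RunningSentiment`, its methods, and the idea of a numeric sentiment score per review are assumptions made for illustration.

```python
from collections import defaultdict


class RunningSentiment:
    """Keeps a count and a running mean per organization, not the events."""

    def __init__(self):
        self._count = defaultdict(int)
        self._mean = defaultdict(float)

    def observe(self, organization, score):
        """Fold one incoming score into the running mean, then forget it."""
        self._count[organization] += 1
        n = self._count[organization]
        # Incremental mean update: new_mean = old_mean + (x - old_mean) / n
        self._mean[organization] += (score - self._mean[organization]) / n

    def current(self, organization):
        """Return the current mean score, or None if nothing was observed."""
        if organization not in self._count:
            return None
        return self._mean[organization]
```

A request handler could call `current("Acme")` at any time to get the latest average, while memory use stays proportional to the number of organizations rather than the number of reviews seen.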