- Learning Apache Cassandra(Second Edition)
- Sandeep Yarabarla
- 2021-07-03 00:19:21
What is big data?
Big data is a relatively new term that has been gathering steam over the past few years. It refers to datasets that are too large to be stored in a traditional database system or processed by traditional data-processing pipelines. The data can be structured, semi-structured, or unstructured, and datasets in this category usually scale to terabytes or petabytes. Big data usually involves one or more of the following:
- Velocity: Data arrives at an unprecedented speed and must be dealt with in a timely manner. Examples include online systems, sensors, social media, and web clickstreams.
- Volume: Organizations collect data from a variety of sources, including business transactions, social media, and sensor or machine-to-machine data. This can amount to terabytes or petabytes of data. In the past, storing it would have been a problem, but new technologies have eased the burden.
- Variety: Data comes in all sorts of formats, ranging from structured data that fits in traditional databases to unstructured data (blobs) such as images, audio files, and text files.
These are known as the 3Vs of big data.
In addition to these, we tend to associate another term with big data:
- Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse, and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies, and multiple data linkages, or your data can quickly spiral out of control. The data must also be able to span multiple data centers, clouds, and geographical zones.