- Learning Apache Cassandra(Second Edition)
- Sandeep Yarabarla
- 2021-07-03 00:19:21
What is big data?
Big data is a relatively new term that has been gathering steam over the past few years. It refers to datasets that are too large to be stored in a traditional database system or processed by traditional data-processing pipelines. The data can be structured, semi-structured, or unstructured, and datasets in this category usually scale to terabytes or petabytes. Big data usually involves one or more of the following:
- Velocity: Data arrives at an unprecedented speed and must be dealt with in a timely manner. Typical sources include online systems, sensors, social media, and web clickstreams.
- Volume: Organizations collect data from a variety of sources, including business transactions, social media, and sensor or machine-to-machine data. This can amount to terabytes or petabytes of data. In the past, storing it would have been a problem, but new technologies have eased the burden.
- Variety: Data comes in all sorts of formats, ranging from structured data that fits in traditional databases to unstructured data (blobs) such as images, audio files, and text files.
These are known as the 3Vs of big data.
In addition to these, we tend to associate another term with big data:
- Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse, and transform data across systems. However, it's necessary to connect and correlate relationships, hierarchies, and multiple data linkages, or your data can quickly spiral out of control. The data must also be able to move across multiple data centers, clouds, and geographical zones.