What is big data?

Big data is a relatively new term that has gained traction over the past few years. It refers to datasets that are too large to be stored in a traditional database system or processed by traditional data-processing pipelines. The data may be structured, semi-structured, or unstructured, and datasets in this category usually scale to terabytes or petabytes. Big data usually involves one or more of the following:

  • Velocity: Data arrives at an unprecedented speed and must be dealt with in a timely manner. Typical sources include online systems, sensors, social media, and web clickstreams.

  • Volume: Organizations collect data from a variety of sources, including business transactions, social media, and sensor or machine-to-machine data. This can amount to terabytes or petabytes of data. In the past, storing it would have been a problem, but new technologies have eased the burden.
  • Variety: Data comes in all sorts of formats, ranging from structured data that fits in traditional databases to unstructured data (blobs) such as images, audio files, and text files.

These are known as the 3Vs of big data.

In addition to these, we tend to associate another term with big data:

  • Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse, and transform across systems. However, it is necessary to connect and correlate relationships, hierarchies, and multiple data linkages, or the data can quickly spiral out of control. The data must also be able to move across multiple data centers, clouds, and geographical zones.