Why we are talking about big data now if data has always existed

By the early 2000s, rapid advances in computing and related technologies, such as storage, allowed users to collect and store data with unprecedented efficiency. The internet added further impetus by providing a platform with seemingly unlimited capacity to exchange information on a global scale. Technology advanced at a breathtaking pace and led to major paradigm shifts, powered by social media, connected devices such as smartphones, and the availability of broadband connections, which in turn extended user participation even to remote parts of the world.

By and large, most of this data is generated by web-based sources, such as social networks like Facebook and video-sharing sites like YouTube. In big data parlance, this is known as unstructured data: data that does not come in a fixed format such as a spreadsheet and cannot be easily stored in a traditional database system.

Simultaneous advances in computing capabilities meant that although data was being generated at a very high rate, it was still computationally feasible to analyze it. Machine learning algorithms that were once considered intractable, owing to both data volume and algorithmic complexity, could now be run using new paradigms such as cluster or multinode processing. This made far simpler the kind of analysis that would earlier have required special-purpose machines.

Chart of data generated per minute. Credit: DOMO Inc.
