
Sources and types of big data

We learned that big data is omnipresent and that it can benefit enterprises in many ways. Despite the high prevalence of big data generated by existing hardware and software, enterprises still struggle to process, store, analyze, and manage it using traditional data-mining tools and techniques. In this section, we explore the sources of this complex and dynamic data and how we can consume it.

We can separate the sources of the data into three major categories. The following diagram shows the three major sources of big data:

Let's look into the three major sources one by one:

  • Logs generated by a machine: A lot of big data is generated by machines, such as real-time sensors in industrial machinery or vehicles, environmental sensors, personal health trackers, and systems that create logs to track user behavior. Most of this machine-generated data can be grouped into the following subcategories:
    • Click-log stream data: This is the data captured every time a user clicks a link on a website. A detailed analysis of this data can reveal customer behavior, how deeply users interact with the website, and customers' buying patterns.
    • Gaming events log data: A user performs a set of tasks when playing an online game. Every move the user makes in a game can be stored. This data can be analyzed, and the results can help us understand how end users are propelled through a gaming portfolio.
    • Sensor log data: Sensor log data includes readings from radio-frequency identification (RFID) tags, smart meters, smartwatch sensors, medical sensor devices such as heart-rate monitors, and Global Positioning System (GPS) receivers. This data can be recorded and then used to analyze the actual status of the subject.
    • Weblog event data: Servers, cloud infrastructure, applications, networks, and so on are used extensively. These systems record all kinds of data about their events and operations. When stored, this data can amount to massive volumes, and it can be useful in understanding how to meet service-level agreements or in predicting security breaches.
    • Point-of-sale event-log data: Almost every product these days has a unique barcode. A cashier in a retail shop or department store scans the barcode of a product when selling it, and the data associated with the product and the sale can be captured. This data can be analyzed to understand the sales patterns of a retailer.
  • Person: People generate a lot of big data through social media, status updates, tweets, photos, and media uploads. Most of these logs are generated through a user's interactions with a network, such as the internet, and they reveal how the user communicates with that network. These interaction logs can reveal deep content-interaction models that are useful in understanding user behavior. Such an analysis can be used to train a model that presents personalized recommendations of web items, including the next news article to read or the products a user is most likely to consider buying. Similar lines of research, including sentiment analysis and topic analysis, are very active in industry today. Most of this data is unstructured, as there is no proper format or well-defined structure available. It typically arrives as plain text, Portable Document Format (PDF) files, comma-separated values (CSV) files, or JSON files.
  • Organization: We get a massive amount of data from organizations in the form of transaction information in databases and structured data stored in data warehouses. This data is highly structured. Organizations store it in some type of relational database management system (RDBMS), such as Oracle or MS Access, that is queried with SQL. The data resides in a fixed format inside the fields of a table. This organization-generated data is consumed and processed by information and communications technology (ICT) systems to support business intelligence and market analysis.
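
To make the three categories more concrete, the following is a minimal sketch of how one record from each source might be consumed in Python. It is only an illustration under stated assumptions: the click-log line format, the JSON status update, the sales table layout, and all function names are hypothetical and are not taken from any particular system.

```python
import json
import re
import sqlite3

# Machine-generated data: a hypothetical click-log line (made-up format).
CLICK_LOG_LINE = "2024-01-15T10:32:07Z user=42 clicked=/products/123"
CLICK_LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+) user=(?P<user_id>\d+) clicked=(?P<url>\S+)"
)


def parse_click_log(line):
    """Turn one raw click-log line into a dictionary of named fields."""
    match = CLICK_LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"Unrecognized click-log line: {line!r}")
    record = match.groupdict()
    record["user_id"] = int(record["user_id"])
    return record


# Person-generated data: a hypothetical status update serialized as JSON.
STATUS_UPDATE = '{"user": "alice", "text": "Loving my new headphones!", "tags": ["music"]}'


def parse_status_update(raw):
    """Decode a JSON status update; the free-text field stays unstructured."""
    return json.loads(raw)


# Organization-generated data: structured rows in a relational table.
def total_sales_by_product(rows):
    """Load (product, quantity, unit_price) rows into SQLite and aggregate."""
    conn = sqlite3.connect(":memory:")  # in-memory database for illustration
    try:
        conn.execute(
            "CREATE TABLE sales (product TEXT, quantity INTEGER, unit_price REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT product, SUM(quantity * unit_price) "
            "FROM sales GROUP BY product ORDER BY product"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(parse_click_log(CLICK_LOG_LINE))
    print(parse_status_update(STATUS_UPDATE))
    print(total_sales_by_product([("pen", 2, 1.5), ("book", 1, 10.0)]))
```

The contrast is the point of the sketch: the machine log needs a parsing rule before it becomes usable, the person-generated update carries free text that still has to be interpreted, and the organizational data can be queried directly because its schema is fixed in advance.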