
Collecting data

This should be somewhat obvious: without at least some data, we cannot perform any of the subsequent steps. One might argue that inference is an exception, but that would be inappropriate. There is no magic in data science. We, as data scientists, don't make something from nothing. Inference (which we'll define later in this chapter) requires at least some data to begin with.

Some concepts may be new when it comes to collecting data. Data can be collected from a wide variety of sources, and the number and types of data sources continue to grow daily. In addition, how data is collected might require a perspective new to a data developer. Data for data science isn't always sourced from a relational database; it may instead come from machine-generated log files, online surveys, performance statistics, and so on. Again, the list is ever evolving.
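To make the idea of non-relational sources concrete, here is a minimal sketch that reads records from a machine-generated log and from an online survey export. The log layout, the `response_ms` field, the survey columns, and the function names are all assumptions invented for illustration; real sources will differ.

```python
import csv
import io
import re

# Hypothetical sample inputs, invented for illustration only.
LOG_TEXT = """\
2024-01-05 10:15:02 INFO response_ms=120
2024-01-05 10:15:07 WARN response_ms=950
"""

SURVEY_CSV = """\
respondent_id,satisfaction
r1,4
r2,5
"""

# Assumed log layout: date, time, level, then a response_ms=<integer> field.
LOG_PATTERN = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<level>\w+) response_ms=(?P<ms>\d+)$"
)


def parse_log(text):
    """Turn each matching log line into a dict; skip lines that do not match."""
    records = []
    for line in text.splitlines():
        match = LOG_PATTERN.match(line.strip())
        if match:
            records.append({
                "source": "log",
                "level": match["level"],
                "response_ms": int(match["ms"]),
            })
    return records


def parse_survey(text):
    """Read survey rows from CSV text into dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "source": "survey",
            "respondent_id": row["respondent_id"],
            "satisfaction": int(row["satisfaction"]),
        }
        for row in reader
    ]


log_records = parse_log(LOG_TEXT)
survey_records = parse_survey(SURVEY_CSV)
print(len(log_records), len(survey_records))
```

Each source needs its own parsing step, but both end up as plain lists of records that later steps can process in the same way.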

Another point to ponder: collecting data also involves supplementation. For example, a data scientist might determine that he or she needs to add demographic data to a particular pool of application data that was previously collected, processed, and reviewed.
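Supplementation of this kind can be sketched as a left join: every previously collected record is kept, and demographic fields are attached wherever a match exists. The field names (`applicant_id`, `age_band`, `region`), the sample values, and the `supplement` helper below are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical application data, previously collected and reviewed.
applications = [
    {"applicant_id": 1, "status": "approved"},
    {"applicant_id": 2, "status": "denied"},
    {"applicant_id": 3, "status": "approved"},
]

# Hypothetical demographic data obtained later, keyed by applicant_id.
demographics = {
    1: {"age_band": "25-34", "region": "Northeast"},
    2: {"age_band": "45-54", "region": "Midwest"},
}


def supplement(records, lookup, key):
    """Left join: keep every record, add lookup fields where a match exists."""
    # Collect every field name that appears anywhere in the lookup data.
    fields = sorted({name for extra in lookup.values() for name in extra})
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        row = dict(record)  # copy so the original data stays untouched
        for name in fields:
            row[name] = extra.get(name)  # None marks a record with no match
        enriched.append(row)
    return enriched


enriched = supplement(applications, demographics, "applicant_id")
for row in enriched:
    print(row)
```

Keeping unmatched records, with `None` in the new fields, makes gaps in the supplementary data visible instead of silently dropping rows.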
