
Finding or observing data

Data can be found or observed in many places. An obvious data source is the internet. As social media usage grows and mobile phones reach more people, helped by data plans that are becoming cheaper or even unlimited, there has been an exponential rise in the data consumed by users.

Online streaming platforms have since emerged, and the following diagram shows that the hours spent consuming video data are also growing rapidly:

To get data from the internet, there are multiple options, as shown in the following list:

  • Bulk downloads from websites such as Wikipedia, IMDb, and the Million Song Dataset (which can be found here: https://labrosa.ee.columbia.edu/millionsong/).
  • Accessing the data through APIs (such as those offered by Google, Twitter, Facebook, and YouTube).
  • Scraping public, non-sensitive, and anonymized data, which is generally acceptable. Be sure to check the site's terms and conditions and to fully reference the information.
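
One practical way to act on the scraping advice above is to check a site's robots.txt rules before fetching a page. The following is a minimal sketch in Python, assuming the standard library's urllib.robotparser module. The robots.txt content, the example.com URLs, the example-bot user agent, and the helper names build_parser and may_scrape are all hypothetical and chosen only for illustration. A robots.txt check complements reading the terms and conditions; it does not replace it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
# In practice the rules would come from the target site itself.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_scrape(parser: RobotFileParser, url: str,
               user_agent: str = "example-bot") -> bool:
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
public_ok = may_scrape(parser, "https://example.com/articles/1")
private_ok = may_scrape(parser, "https://example.com/private/data")
print("public page allowed:", public_ok)
print("private page allowed:", private_ok)
```

With these illustrative rules, the article URL is allowed and the URL under /private/ is not.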

The main drawbacks of collected data are that it takes time and space to accumulate, and that it covers only what happened; for instance, intentions and internal and external motivations are not collected. Finally, such data might be noisy, incomplete, or inconsistent, and may even change over time.
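
To make the notions of incomplete and inconsistent data concrete, the following minimal Python sketch counts a few common problems in a small batch of records. The records, the field names user and minutes, the rule that negative minutes are invalid, and the quality_report helper are all hypothetical and serve only as an illustration; real checks depend on the dataset at hand.

```python
from collections import Counter

# Hypothetical collected records; field names are illustrative only.
records = [
    {"user": "a", "minutes": 12},
    {"user": "b", "minutes": None},  # incomplete: missing value
    {"user": "a", "minutes": 12},    # exact duplicate of the first record
    {"user": "c", "minutes": -5},    # inconsistent: negative duration
]


def quality_report(rows):
    """Count rows with missing values, exact duplicates, and negative minutes."""
    missing = sum(1 for r in rows if any(v is None for v in r.values()))

    # Turn each row into a hashable, key-ordered tuple so duplicates can be counted.
    keys = Counter(tuple(sorted(r.items(), key=lambda kv: kv[0])) for r in rows)
    duplicates = sum(count - 1 for count in keys.values())

    out_of_range = sum(
        1 for r in rows
        if isinstance(r.get("minutes"), (int, float)) and r["minutes"] < 0
    )
    return {"missing": missing, "duplicates": duplicates, "out_of_range": out_of_range}


report = quality_report(records)
print(report)
```

A report like this only flags candidate problems; deciding whether to drop, repair, or keep the affected records is a separate step.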

Another option is to collect measurements from sensors such as inertial and location sensors in mobile devices, environmental sensors, and software agents monitoring key performance indicators.
