
  • Machine Learning in Java
  • AshishSingh Bhatia Bostjan Kaluza
  • 2021-06-10 19:29:55

Finding or observing data

Data can be found or observed in many places. An obvious data source is the internet. As social media usage grows and mobile phones reach more people, helped by mobile data plans that are becoming cheaper or even unlimited, there has been an exponential rise in the data consumed by users.

Online streaming platforms have now emerged as well, and the following diagram shows that the hours spent consuming video data are also growing rapidly:

To get data from the internet, there are multiple options, as shown in the following list:

  • Bulk downloads from websites such as Wikipedia, IMDb, and the Million Song Dataset (which can be found here: https://labrosa.ee.columbia.edu/millionsong/).
  • Accessing the data through APIs (such as Google, Twitter, Facebook, and YouTube).
  • Web scraping, which is acceptable for public, non-sensitive, and anonymized data. Be sure to check the site's terms and conditions and to fully reference the information.
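
As a minimal sketch of the bulk-download and API options in the list above, the following uses the standard `java.net.http.HttpClient` (Java 11 or later) to save a remote resource to a local file. The class name, the `USER_AGENT` value, and the 30-second timeout are assumptions made for illustration; none of them come from the book or from any particular data provider, and a real API would typically also require authentication.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Duration;

public class BulkDownloadSketch {

    // Hypothetical identifier; many sites ask clients to identify themselves.
    static final String USER_AGENT = "data-collection-sketch/0.1";

    // Builds a plain GET request for the given URL (no network access here).
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .timeout(Duration.ofSeconds(30)) // assumed timeout, tune as needed
                .GET()
                .build();
    }

    // Sends the request and streams the response body into the target file.
    static Path download(HttpClient client, String url, Path target)
            throws IOException, InterruptedException {
        HttpResponse<Path> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofFile(target));
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: BulkDownloadSketch <url> <output-file>");
            return;
        }
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        Path saved = download(client, args[0], Path.of(args[1]));
        System.out.println("Saved to " + saved);
    }
}
```

Keeping the request construction separate from the network call makes it easy to inspect what will be sent before contacting a site, which helps when checking a request against that site's terms and conditions.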

The main drawbacks of collected data are that it takes time and space to accumulate, and that it covers only what happened; for instance, intentions and internal and external motivations are not captured. Finally, such data might be noisy, incomplete, or inconsistent, and may even change over time.
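
To make the "noisy, incomplete, inconsistent" concern concrete, here is a small sketch that counts complete, incomplete, and duplicate rows in comma-separated text. The row layout (user, date, content type) and the sample values are invented for illustration, and the rule used here (a row is incomplete if it has the wrong number of fields or a blank field) is only one possible assumption about what "incomplete" means.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DataQualitySketch {

    /** Counts of rows by quality category. */
    static final class Report {
        int complete;
        int incomplete;
        int duplicate;
    }

    // Classifies each line as incomplete, duplicate, or complete (first occurrence).
    static Report inspect(List<String> lines, int expectedFields) {
        Report report = new Report();
        Set<String> seen = new HashSet<>();
        for (String line : lines) {
            // A limit of -1 keeps trailing empty fields, so "a,b," yields 3 fields.
            String[] fields = line.split(",", -1);
            boolean hasBlank = false;
            for (String field : fields) {
                if (field.trim().isEmpty()) {
                    hasBlank = true;
                }
            }
            if (fields.length != expectedFields || hasBlank) {
                report.incomplete++;
            } else if (!seen.add(line.trim())) {
                report.duplicate++; // an identical row was already counted
            } else {
                report.complete++;
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up sample rows: user, date, content type.
        List<String> lines = List.of(
                "u1,2021-06-01,video",
                "u2,2021-06-01,",
                "u1,2021-06-01,video",
                "u3,2021-06-02");
        Report r = inspect(lines, 3);
        System.out.println(r.complete + " complete, " + r.incomplete
                + " incomplete, " + r.duplicate + " duplicate");
        // prints "1 complete, 2 incomplete, 1 duplicate"
    }
}
```

A simple split on commas does not handle quoted fields, so real CSV input would need a proper parser; the point here is only that such checks are worth running before the data is used.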

Another option is to collect measurements from sensors such as inertial and location sensors in mobile devices, environmental sensors, and software agents monitoring key performance indicators.
