
  • Machine Learning with Swift
  • Alexander Sosnovshchenko
  • 154 words
  • 2021-06-24 18:54:51

Data preprocessing

The useful information in the data is usually referred to as the signal. The pieces of data that represent errors of various kinds, along with irrelevant data, are known as noise. Errors can creep into the data during measurement or information transmission, or through human mistakes. The goal of data cleansing procedures is to increase the signal-to-noise ratio. During this stage, you will usually transform all data to one format, delete entries with missing values, and check suspicious outliers (they can be either noise or signal). It is widely believed among ML engineers that the data preprocessing stage usually consumes 90% of the time allocated for an ML project. Then, algorithm tweaking consumes another 90% of the time. This statement is only partially a joke (about 10% of it). In Chapter 13, Best Practices, we are going to discuss common problems with the data and how to fix them.
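The cleansing steps mentioned above can be sketched in a few lines of Swift. This is only an illustration under stated assumptions, not the book's method: the `Reading` type, the sample values, and the function names are hypothetical, and the outlier check (distance from the median measured in median absolute deviations, with a cutoff of `k = 5`) is one common heuristic chosen here for simplicity.

```swift
// Hypothetical record type for illustration; nil marks a missing measurement.
struct Reading {
    let id: Int
    let value: Double?
}

/// Drops entries whose value is missing and unwraps the rest.
func dropMissing(_ readings: [Reading]) -> [(id: Int, value: Double)] {
    return readings.compactMap { r in
        r.value.map { (id: r.id, value: $0) }
    }
}

/// Returns the median of a non-empty array.
func median(_ xs: [Double]) -> Double {
    let s = xs.sorted()
    let mid = s.count / 2
    return s.count % 2 == 0 ? (s[mid - 1] + s[mid]) / 2 : s[mid]
}

/// Returns the ids of values farther than `k` median absolute deviations
/// from the median. The cutoff `k` is an assumption; tune it per dataset.
func suspiciousIDs(_ rows: [(id: Int, value: Double)], k: Double = 5.0) -> [Int] {
    guard !rows.isEmpty else { return [] }
    let values = rows.map { $0.value }
    let med = median(values)
    let mad = median(values.map { abs($0 - med) })
    guard mad > 0 else { return [] }  // no spread, so nothing to flag
    return rows.filter { abs($0.value - med) > k * mad }.map { $0.id }
}

// Made-up sample data: one missing value and one extreme value.
let raw = [
    Reading(id: 1, value: 20.1),
    Reading(id: 2, value: nil),
    Reading(id: 3, value: 19.8),
    Reading(id: 4, value: 20.4),
    Reading(id: 5, value: 95.0),
    Reading(id: 6, value: 20.0),
]

let clean = dropMissing(raw)        // entry 2 is removed
let flagged = suspiciousIDs(clean)  // entry 5 is flagged for review
print(clean.count, flagged)
```

Note that the sketch only flags the suspicious entry rather than deleting it, since an outlier may turn out to be signal rather than noise and deserves a manual check.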
