Data cleaning

Most datasets require this step, in which you remove errors, noise, and redundancies. The data needs to be accurate, complete, reliable, and unbiased, because working from a poor knowledge base can cause many problems, such as:

  • Inaccurate and biased conclusions
  • Increased error
  • Reduced generalizability, which is the model's ability to perform well on unseen data that it was not trained on
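
To make the idea concrete, here is a minimal sketch of a cleaning pass in plain Python. It is an illustration under stated assumptions, not a prescribed method: the (name, age) record layout, the `clean` function, and the 0 to 120 age range are all hypothetical choices made for this example. It removes redundancies (duplicates), incomplete records (missing values), and one simple kind of error (implausible values).

```python
def clean(records, min_age=0, max_age=120):
    """Return records without duplicates, missing ages, or implausible ages.

    `records` is a list of (name, age) pairs; this layout and the age
    bounds are assumptions made purely for illustration.
    """
    seen = set()
    cleaned = []
    for name, age in records:
        # Normalize the name so near-duplicates compare equal.
        name = name.strip().lower()
        # Incomplete record: the age is missing.
        if age is None:
            continue
        # Likely an entry error: the age is outside the plausible range.
        if not (min_age <= age <= max_age):
            continue
        # Redundant record: an identical one was already kept.
        if (name, age) in seen:
            continue
        seen.add((name, age))
        cleaned.append((name, age))
    return cleaned


raw = [("Ann", 34), ("ann ", 34), ("Bob", None), ("Cy", 430), ("Dee", 51)]
cleaned = clean(raw)
print(cleaned)  # [('ann', 34), ('dee', 51)]
```

Dropping incomplete records is only one option; depending on the dataset, filling in missing values may preserve more information, so treat the rules above as placeholders for whatever your data actually requires.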