Chapter 2. Integrity and Inspection

This chapter will cover the following recipes:

  • Trimming excess whitespace
  • Ignoring punctuation and specific characters
  • Coping with unexpected or missing input
  • Validating records by matching regular expressions
  • Lexing and parsing an e-mail address
  • Deduplication of nonconflicting data items
  • Deduplication of conflicting data items
  • Implementing a frequency table using Data.List
  • Implementing a frequency table using Data.MultiSet
  • Computing the Manhattan distance
  • Computing the Euclidean distance
  • Comparing scaled data using the Pearson correlation coefficient
  • Comparing sparse data using cosine similarity