Examining, cleaning, and filtering data

After importing the data, the next steps are to examine it and check for missing or erroneous values. We then need to clean the data and apply filters and selections. Different kinds of datasets call for different approaches to these steps. R has powerful packages for this work, and some of them are as follows:

  • dplyr: dplyr is a powerful R package that provides methods to make examining, cleaning, and filtering data fast and easy.
  • tidyr: The tidyr package helps to organize messy data for easier data analysis.
  • stringr: The stringr package provides functions for working with string data efficiently.
  • forcats: Factors are widely used while doing data analysis in R. The forcats package makes it easy to work with factors.
  • lubridate: lubridate makes wrangling date-time data quick and easy.
  • hms: hms is a great package for handling datasets that include time-of-day values.
  • blob: Not all data comes stored as plain ASCII text; you sometimes have to deal with binary data formats. The blob package makes this easy.
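
To make the examine, clean, and filter steps concrete, here is a minimal sketch using dplyr and tidyr. The data frame raw, its columns, and the -999 sentinel for a bad reading are all hypothetical and were invented for this illustration; a real dataset will need its own checks.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, invented for illustration only
raw <- tibble(
  id   = c(1, 2, 3, 4, 5),
  city = c("Pune", "Delhi", NA, "Pune", "Delhi"),
  temp = c(29.5, NA, 28.0, -999, 35.2)
)

# Examine: look at the structure and count missing values per column
glimpse(raw)
missing_counts <- summarise(raw, across(everything(), ~ sum(is.na(.x))))

# Clean: treat the assumed sentinel -999 as missing, then drop incomplete rows
clean <- raw %>%
  mutate(temp = na_if(temp, -999)) %>%
  drop_na()

# Filter rows and select columns
hot <- clean %>%
  filter(temp > 30) %>%
  select(id, city)
```

Dropping every incomplete row is only one possible cleaning policy. Depending on the analysis, it may be better to impute missing values or to keep the rows and handle the NA values later.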