
Splitting a string on lines, words, or arbitrary tokens

Useful data is often interspersed between delimiters, such as commas or spaces, making string splitting vital for most data analysis tasks.

Getting ready

Create an input.txt file similar to the following one:

$ cat input.txt

first line
second line
words are split by space
comma,separated,values
or any delimiter you want

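Install the split package using Cabal as follows (with recent cabal-install versions, a library-only package may need the --lib flag, that is, cabal install --lib split):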

$ cabal install split

How to do it...

  1. The only function we need to import is splitOn; lines and words are already available from the Prelude. Import splitOn as follows:
    import Data.List.Split (splitOn)
  2. First, we split the string into lines, as shown in the following code snippet:
    main :: IO ()
    main = do
      input <- readFile "input.txt"
      let ls = lines input
      print ls
  3. The lines are printed as a single list (wrapped here for readability):
    [ "first line","second line"
    , "words are split by space"
    , "comma,separated,values"
    , "or any delimiter you want"]
    
  4. Next, we split the third line on whitespace using words, as follows:
      let ws = words $ ls !! 2
      print ws
  5. The words are printed in a list as follows:
    ["words","are","split","by","space"]
    
  6. Next, we show how to split a string on an arbitrary value using the following lines of code:
      let cs = splitOn "," $ ls !! 3
      print cs
  7. The values are split on the commas as follows:
    ["comma","separated","values"]
    
  8. Finally, we split on a delimiter that is more than one character long, as shown in the following code snippet:
      let ds = splitOn "an" $ ls !! 4
      print ds
  9. The output is as follows:
    ["or ","y delimiter you w","t"]
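
The steps above can be assembled into one program. The following is a minimal sketch, assuming input.txt has the five lines shown in the Getting ready section; splitSample is a hypothetical helper name introduced here only to keep the splitting logic separate from the file I/O.

```haskell
import Data.List.Split (splitOn)

-- | Hypothetical helper: apply each split from the recipe to the sample text.
-- Assumes the text has at least five lines; (!!) fails on a shorter input.
splitSample :: String -> ([String], [String], [String], [String])
splitSample input = (ls, ws, cs, ds)
  where
    ls = lines input             -- one element per line
    ws = words (ls !! 2)         -- third line, split on whitespace
    cs = splitOn "," (ls !! 3)   -- fourth line, split on commas
    ds = splitOn "an" (ls !! 4)  -- fifth line, split on the substring "an"

main :: IO ()
main = do
  input <- readFile "input.txt"
  let (ls, ws, cs, ds) = splitSample input
  print ls
  print ws
  print cs
  print ds
```

Keeping the splits in a pure function means they can be tried in GHCi on any string without touching the file system.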
    
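
The difference between words and splitOn shows up when delimiters repeat. The sketch below uses made-up sample strings (not part of input.txt) to illustrate that words collapses runs of whitespace, while splitOn keeps an empty string between adjacent delimiters and after a trailing one.

```haskell
import Data.List.Split (splitOn)

-- Sample string with two consecutive spaces (an illustrative value).
spaced :: String
spaced = "two  spaces"

byWords, bySplitOn, trailing :: [String]
byWords   = words spaced          -- runs of whitespace act as one separator
bySplitOn = splitOn " " spaced    -- every single space is a separator
trailing  = splitOn "," "a,,b,"   -- empty fields are preserved

main :: IO ()
main = do
  print byWords    -- ["two","spaces"]
  print bySplitOn  -- ["two","","spaces"]
  print trailing   -- ["a","","b",""]
```

If the empty fields are not wanted, filtering the result with filter (not . null) is one simple option.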