
Trimming excess whitespace

Text obtained from external sources may unintentionally include leading or trailing whitespace characters. When parsing such input, it is often wise to trim the text first. For example, when Haskell source code contains trailing whitespace, GHC ignores it during lexing: the lexer produces a sequence of tokens and discards meaningless characters such as excess whitespace.

In this recipe, we will use built-in libraries to make our own trim function.

How to do it...

Create a new file, which we will call Main.hs, and perform the following steps:

  1. Import the isSpace :: Char -> Bool function from the Data.Char module, which ships with the base package:
    import Data.Char (isSpace)
  2. Write a trim function that removes the leading and trailing whitespace:
    trim :: String -> String
    trim = f . f
      where f = reverse . dropWhile isSpace
  3. Test it out within main:
    main :: IO ()
    main = putStrLn $ trim " wahoowa! "
  4. Running the code will result in the following trimmed string:
    $ runhaskell Main.hs
    wahoowa!

How it works...

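Our trim function strips the whitespace from both ends of the string by applying the same helper twice. The helper drops whitespace characters from the beginning of the string and then reverses the result. On the first pass, this removes the leading whitespace and leaves the string reversed, so the original trailing whitespace is now at the front. The second pass drops that whitespace and reverses the string again, which restores the original order. Because reverse must walk the entire string, trim produces no output until it has traversed the whole input. Fortunately, the isSpace function from Data.Char handles any Unicode space character as well as the control characters \t, \n, \r, \f, and \v.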
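To make the two passes concrete, the following sketch prints the intermediate string after each application of the helper. It also shows an alternative, trim', which is not part of the recipe: it assumes dropWhileEnd from Data.List (in the base package) and skips the two reversals.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- One pass of the recipe's helper: drop leading whitespace, then reverse.
f :: String -> String
f = reverse . dropWhile isSpace

-- The recipe's trim: two applications of the helper.
trim :: String -> String
trim = f . f

-- Alternative sketch (not from the recipe): trim each end directly
-- instead of reversing the string twice.
trim' :: String -> String
trim' = dropWhileEnd isSpace . dropWhile isSpace

main :: IO ()
main = do
  print (f " wahoowa! ")          -- " !awoohaw"  (reversed, old trailing space in front)
  print (f (f " wahoowa! "))      -- "wahoowa!"   (second pass restores the order)
  print (trim' "\t wahoowa! \n")  -- "wahoowa!"
```

Both versions give the same result for ordinary strings. Which one reads better is largely a matter of taste.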

There's more...

Ready-made parser combinator libraries such as parsec or uu-parsinglib could be used instead of writing the function by hand. By introducing a Token type and parsing the input into it, we can elegantly ignore the whitespace. Alternatively, we can use the alex lexer generator (package name: alex) for this task. These tools are overkill for simple trimming, but they allow us to perform a more general tokenizing of text.
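As a rough illustration of the Token idea, here is a minimal sketch that uses Text.ParserCombinators.ReadP, a small parser combinator module in the base package, rather than parsec or uu-parsinglib, so that it runs without extra dependencies. The Token type and the names token, tokens, and tokenize are hypothetical and chosen only for this example.

```haskell
import Data.Char (isSpace)
import Text.ParserCombinators.ReadP
  (ReadP, eof, many, munch1, readP_to_S, skipSpaces)

-- Hypothetical token type for illustration; the recipe does not define one.
newtype Token = Token String
  deriving (Show, Eq)

-- A token is a maximal run of non-whitespace characters.
-- Any whitespace that follows it is consumed and discarded.
token :: ReadP Token
token = do
  w <- munch1 (not . isSpace)
  skipSpaces
  return (Token w)

-- Skip leading whitespace, then read tokens until the end of input.
tokens :: ReadP [Token]
tokens = do
  skipSpaces
  ts <- many token
  eof
  return ts

-- Run the parser and keep the first complete parse, if there is one.
tokenize :: String -> [Token]
tokenize s = case readP_to_S tokens s of
  ((ts, _) : _) -> ts
  []            -> []

main :: IO ()
main = print (tokenize "  wahoowa!   go hoos ")
-- prints [Token "wahoowa!",Token "go",Token "hoos"]
```

Whitespace never reaches the token list, so no separate trimming step is needed. A parsec version would follow the same shape with that library's own combinators.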
