- Haskell Data Analysis Cookbook
- Nishant Shukla
- 2021-12-08 12:43:35
Trimming excess whitespace
Text obtained from external sources may unintentionally include leading or trailing whitespace characters. When parsing such input, it is often wise to trim the text first. For example, when Haskell source code contains trailing whitespace, GHC ignores it through a process called lexing. The lexer produces a sequence of tokens, effectively discarding meaningless characters such as excess whitespace.
In this recipe, we will use built-in libraries to make our own trim function.
How to do it...
Create a new file, which we will call Main.hs, and perform the following steps:
- Import the isSpace :: Char -> Bool function from the built-in Data.Char module:
    import Data.Char (isSpace)
- Write a trim function that removes the leading and trailing whitespace:
    trim :: String -> String
    trim = f . f
      where f = reverse . dropWhile isSpace
- Test it out within main:
    main :: IO ()
    main = putStrLn $ trim " wahoowa! "
- Running the code will result in the following trimmed string:
    $ runhaskell Main.hs
    wahoowa!
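The steps above can be exercised on a few edge cases as well. The following is a minimal sketch that reuses the recipe's trim unchanged; the sample inputs are illustrative choices and do not come from the original text.

```haskell
import Data.Char (isSpace)

-- The recipe's trim function, unchanged.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace

main :: IO ()
main = do
  print (trim "")                 -- empty input stays empty
  print (trim " \t\n ")           -- whitespace-only input becomes ""
  print (trim "  hello  world ")  -- inner whitespace is preserved
  print (trim "\twahoowa!\n")     -- tabs and newlines are trimmed too
```

Only the ends of the string are affected, so the double space inside "hello  world" survives.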
How it works...
Our trim function strips the whitespace from the beginning and end of the string. It starts by dropping whitespace characters from the beginning. Then, it reverses the string and applies the same function again, which removes what was originally the trailing whitespace. Finally, the second reversal brings the string back to its original order. Note that reverse has to traverse the entire string before producing any output, so trim as a whole is not lazy and only terminates on finite strings. Fortunately, the isSpace function from Data.Char handles any Unicode space character as well as the control characters \t, \n, \r, \f, and \v.
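To make the sequence of operations concrete, here is a small sketch that records each intermediate value. The helper name trimSteps is hypothetical and introduced only for illustration; it performs the same four operations that trim composes.

```haskell
import Data.Char (isSpace)

-- Hypothetical helper: the input followed by the result of each of the
-- four operations that trim = f . f performs, in order.
trimSteps :: String -> [String]
trimSteps s = [s, step1, step2, step3, step4]
  where
    step1 = dropWhile isSpace s      -- drop leading whitespace
    step2 = reverse step1            -- trailing whitespace is now in front
    step3 = dropWhile isSpace step2  -- drop it with the same function
    step4 = reverse step3            -- restore the original order

main :: IO ()
main = mapM_ print (trimSteps " wahoowa! ")
-- Prints:
-- " wahoowa! "
-- "wahoowa! "
-- " !awoohaw"
-- "!awoohaw"
-- "wahoowa!"
```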
There's more…
Ready-made parser combinator libraries such as parsec or uu-parsinglib could be used to do this instead of reinventing the wheel. By introducing a Token type and parsing to this type, we can elegantly ignore the whitespace. Alternatively, we can use the alex lexer generator (package name, alex) for this task. These libraries are overkill for such a simple task, but they allow us to perform a more generalized tokenizing of text.
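As a rough illustration of the Token idea, the following sketch uses parsec to split input into whitespace-separated tokens. The Token type, its Tok constructor, and the names lexeme, tokenP, tokensP, and tokenize are all assumptions made for this example rather than part of the recipe, and the sketch requires the parsec package to be installed.

```haskell
import Data.Char (isSpace)
import Text.Parsec (ParseError, eof, many, many1, parse, satisfy, spaces)
import Text.Parsec.String (Parser)

-- Hypothetical token type: a run of non-whitespace characters.
newtype Token = Tok String
  deriving (Show, Eq)

-- Run a parser, then skip any whitespace that follows it.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- One token: at least one non-whitespace character.
tokenP :: Parser Token
tokenP = Tok <$> lexeme (many1 (satisfy (not . isSpace)))

-- Skip leading whitespace, then read tokens until the end of input.
tokensP :: Parser [Token]
tokensP = spaces *> many tokenP <* eof

tokenize :: String -> Either ParseError [Token]
tokenize = parse tokensP "<input>"

main :: IO ()
main = print (tokenize "  wahoowa!   again ")
-- Prints: Right [Tok "wahoowa!",Tok "again"]
```

Whitespace never appears in the resulting tokens, so no separate trimming step is needed. This is one possible design, not the only way to structure such a tokenizer.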