- Haskell Data Analysis Cookbook
- Nishant Shukla
- 2021-12-08 12:43:35
Lexing and parsing an e-mail address
An elegant way to clean data is to define a lexer that splits a string into tokens. In this recipe, we will parse an e-mail address using the attoparsec library. This naturally lets us ignore the surrounding whitespace.
Getting ready
Install the attoparsec parser combinator library:
$ cabal install attoparsec
How to do it…
Create a new file, which we will call Main.hs, and perform the following steps:
- Use the GHC OverloadedStrings language extension to use the Text data type more legibly throughout the code. Also, import the other relevant libraries:
    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import Data.Char (isSpace, isAlphaNum)
- Declare a data type for an e-mail address:
    data Email = Email
      { user :: String
      , host :: String
      } deriving Show
- Define how to parse an e-mail address. This function can be as simple or as complicated as required:
    email :: Parser Email
    email = do
      skipSpace                              -- ignore leading whitespace
      name <- many' $ satisfy isAlphaNum     -- username
      _ <- char '@'
      hostName <- many' $ satisfy isAlphaNum
      _ <- char '.'
      domain <- many' (satisfy isAlphaNum)   -- top-level domain
      return $ Email name (hostName ++ "." ++ domain)
- Parse an e-mail address to test the code:
    main :: IO ()
    main = print $ parseOnly email "nishant@shukla.io"
- Run the code to print out the parsed e-mail address:
    $ runhaskell Main.hs
    Right (Email {user = "nishant", host = "shukla.io"})
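Since the recipe is motivated by data cleaning, the steps above can be sketched as a small filter over raw input lines. This is a minimal sketch, not part of the original recipe: the parser is restated with the valid identifiers `Email` and `email` (a hyphen cannot appear in a Haskell name), and `cleanEmails` is a hypothetical helper introduced only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)
import Data.Either (rights)
import Data.Text (Text)

data Email = Email { user :: String, host :: String } deriving (Show, Eq)

-- Same shape as the recipe's parser: skip leading whitespace, then
-- read user '@' host '.' domain.
email :: Parser Email
email = do
  skipSpace
  name <- many' (satisfy isAlphaNum)
  _ <- char '@'
  hostName <- many' (satisfy isAlphaNum)
  _ <- char '.'
  domain <- many' (satisfy isAlphaNum)
  return $ Email name (hostName ++ "." ++ domain)

-- Hypothetical helper: keep only the inputs that parse successfully.
cleanEmails :: [Text] -> [Email]
cleanEmails = rights . map (parseOnly email)

main :: IO ()
main = mapM_ print $ cleanEmails
  [ "  nishant@shukla.io  "   -- surrounding spaces are tolerated
  , "no at sign here"         -- rejected: no '@' after the first word
  , "\tjane@example.org"      -- a leading tab is skipped as well
  ]
```

Leading whitespace is consumed by `skipSpace`, while trailing whitespace is simply left unread, because `parseOnly` does not require the parser to consume the whole input.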
How it works…
We build the e-mail parser by matching the input against a sequence of tests. For this recipe, an e-mail address consists of an alphanumeric username, followed by the 'at' sign (@), then an alphanumeric hostname, a period, and lastly the top-level domain.
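Note that `many'` also succeeds on zero characters, so a parser built from it would accept an input such as "@.". If each part is meant to be non-empty, one possible tightening is sketched below. It assumes we also want to reject trailing junk; `strictEmail` and `isValid` are illustrative names, not part of the recipe, while `many1` and `endOfInput` are combinators exported by Data.Attoparsec.Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)
import Data.Either (isRight)
import Data.Text (Text)

data Email = Email { user :: String, host :: String } deriving (Show, Eq)

-- Stricter variant (an assumption, not the book's parser):
-- many1 demands at least one character per component, and
-- endOfInput rejects anything left over after trailing whitespace.
strictEmail :: Parser Email
strictEmail = do
  skipSpace
  name <- many1 (satisfy isAlphaNum)
  _ <- char '@'
  hostName <- many1 (satisfy isAlphaNum)
  _ <- char '.'
  domain <- many1 (satisfy isAlphaNum)
  skipSpace
  endOfInput
  return $ Email name (hostName ++ "." ++ domain)

-- Hypothetical convenience wrapper around parseOnly.
isValid :: Text -> Bool
isValid = isRight . parseOnly strictEmail

main :: IO ()
main = mapM_ (print . isValid)
  [ " nishant@shukla.io "       -- True: whitespace on both sides is skipped
  , "@."                        -- False: every component is empty
  , "nishant@shukla.io extra"   -- False: unparsed input remains
  ]
```

This is still far looser than real address syntax (no dots, dashes, or subdomains), which matches the recipe's remark that the parser can be as simple or as complicated as required.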
The various functions used from the attoparsec library can be found in the Data.Attoparsec.Text documentation, which is available at https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html.
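As one example of what that module offers beyond the functions used here, `takeWhile1` reads a non-empty run of matching characters directly as `Text`, which avoids building intermediate `String` values. The following is a rough sketch under that assumption; the `EmailT` type and `emailT` parser are hypothetical names chosen to avoid confusion with the recipe's own definitions.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as T

-- Text-based counterpart of the recipe's record (illustrative only).
data EmailT = EmailT { userT :: Text, hostT :: Text } deriving (Show, Eq)

-- takeWhile1 consumes one or more characters satisfying the predicate
-- and returns them as a single Text chunk.
emailT :: Parser EmailT
emailT = do
  skipSpace
  name <- takeWhile1 isAlphaNum
  _ <- char '@'
  hostName <- takeWhile1 isAlphaNum
  _ <- char '.'
  domain <- takeWhile1 isAlphaNum
  return $ EmailT name (T.concat [hostName, ".", domain])

main :: IO ()
main = print $ parseOnly emailT "nishant@shukla.io"
```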