- Haskell Data Analysis Cookbook
- Nishant Shukla
- 2021-12-08 12:43:35
Lexing and parsing an e-mail address
An elegant way to clean data is to define a lexer that splits a string into tokens. In this recipe, we will parse an e-mail address using the attoparsec library. The parser skips leading whitespace explicitly, and parseOnly discards any input left over after a successful match, so whitespace surrounding the address is ignored.
Getting ready
Install the attoparsec parser combinator library:
$ cabal install attoparsec
How to do it…
Create a new file, which we will call Main.hs, and perform the following steps:
- Use the GHC OverloadedStrings language extension to use the Text data type more legibly throughout the code. Also, import the other relevant libraries:
    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import Data.Char (isSpace, isAlphaNum)
- Declare a data type for an e-mail address:
    data Email = Email
      { user :: String
      , host :: String
      } deriving Show
- Define how to parse an e-mail address. This function can be as simple or as complicated as required:
    email :: Parser Email
    email = do
      skipSpace                                -- ignore leading whitespace
      user     <- many' $ satisfy isAlphaNum   -- part before the '@'
      at       <- char '@'
      hostName <- many' $ satisfy isAlphaNum   -- part between '@' and '.'
      period   <- char '.'
      domain   <- many' (satisfy isAlphaNum)   -- top-level domain
      return $ Email user (hostName ++ "." ++ domain)
- Parse an e-mail address to test the code:
    main :: IO ()
    main = print $ parseOnly email "nishant@shukla.io"
- Run the code to print out the parsed e-mail address:
    $ runhaskell Main.hs
    Right (Email {user = "nishant", host = "shukla.io"})
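To see how the parser from the steps above behaves on less tidy input, the following sketch runs it over two sample strings. The sample inputs are made up for illustration, and the record and parser are repeated (with Eq added to the deriving clause and the unused bindings replaced by underscores) only so that the file runs on its own.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)

-- Same record as in the recipe; Eq is added here only for convenience.
data Email = Email
  { user :: String
  , host :: String
  } deriving (Show, Eq)

-- Same parsing logic as in the recipe.
email :: Parser Email
email = do
  skipSpace
  name     <- many' (satisfy isAlphaNum)
  _        <- char '@'
  hostName <- many' (satisfy isAlphaNum)
  _        <- char '.'
  domain   <- many' (satisfy isAlphaNum)
  return $ Email name (hostName ++ "." ++ domain)

main :: IO ()
main = mapM_ (print . parseOnly email)
  [ "  nishant@shukla.io  "   -- hypothetical input padded with whitespace
  , "nishant.shukla.io"       -- hypothetical input with no '@' sign
  ]
```

The padded input should still produce a Right value holding the user and host, because skipSpace consumes the leading spaces and parseOnly drops the trailing ones. The input without an '@' sign should produce a Left value carrying attoparsec's error message.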
How it works…
We create an e-mail parser by matching the string against a sequence of tests. The parser expects an alphanumeric username, followed by the 'at' sign (@), then an alphanumeric hostname, a period, and lastly the top-level domain.
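One caveat is that many' matches zero or more characters, so the recipe's parser also accepts an input such as "@.", and parseOnly does not complain about text left over after the address. A possible stricter variant is sketched below, assuming we want every part to be non-empty and the whole input to be consumed. The names alphaNums and strictEmail are hypothetical and are not part of the recipe; many1 and endOfInput come from Data.Attoparsec.Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)

data Email = Email
  { user :: String
  , host :: String
  } deriving (Show, Eq)

-- One or more alphanumeric characters; many1 fails if nothing matches.
alphaNums :: Parser String
alphaNums = many1 (satisfy isAlphaNum)

-- Hypothetical stricter variant of the recipe's parser.
strictEmail :: Parser Email
strictEmail = do
  skipSpace
  name     <- alphaNums
  _        <- char '@'
  hostName <- alphaNums
  _        <- char '.'
  domain   <- alphaNums
  skipSpace
  endOfInput   -- fail if anything other than whitespace follows the address
  return $ Email name (hostName ++ "." ++ domain)

main :: IO ()
main = do
  print $ parseOnly strictEmail " nishant@shukla.io "
  print $ parseOnly strictEmail "@."
  print $ parseOnly strictEmail "nishant@shukla.io and more"
```

Only the first input should yield a Right value here; the other two should yield Left values. This is still far looser than a real e-mail grammar (for example, it rejects dots or hyphens in the username or hostname), so treat it as an illustration of the combinators rather than a validator.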
The various functions used from the attoparsec library can be found in the Data.Attoparsec.Text documentation, which is available at https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html.
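As one illustration of other functions from that module, the sketch below uses takeWhile1, which returns a chunk of Text directly instead of building a String one character at a time, and sepBy, which parses a separated list. The EmailT record, the emailT and emailList parsers, and the sample input are hypothetical names and data introduced for this example only.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text
import Data.Char (isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical Text-based record (the recipe itself stores Strings).
data EmailT = EmailT
  { userT :: Text
  , hostT :: Text
  } deriving (Show, Eq)

-- takeWhile1 consumes one or more matching characters as a single Text.
emailT :: Parser EmailT
emailT = do
  skipSpace
  name     <- takeWhile1 isAlphaNum
  _        <- char '@'
  hostName <- takeWhile1 isAlphaNum
  _        <- char '.'
  domain   <- takeWhile1 isAlphaNum
  return $ EmailT name (T.concat [hostName, ".", domain])

-- Zero or more addresses separated by commas, with optional whitespace
-- before each comma (emailT itself skips whitespace after the comma).
emailList :: Parser [EmailT]
emailList = emailT `sepBy` (skipSpace *> char ',')

main :: IO ()
main = print $ parseOnly emailList "nishant@shukla.io, admin@example.com"
```

Running this should print a Right value containing a list of two EmailT records, one per address in the input.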